ChatGPT enters the chat
Dermatologists discuss the possibilities — and pitfalls — of ChatGPT in clinical practice.
Feature
By Emily Margosian, Assistant Editor, April 1, 2024
Design a webpage. Make a content strategy. Answer medical questions. Write an essay about trickle-down economics in the style of a Taylor Swift song. The possibilities of OpenAI’s ChatGPT are nearly as endless as the internet itself.
Since its launch in November 2022, the program has delighted users and raised alarms about the impact of its use (and no, it did not write this article).
While the introduction of generative artificial intelligence (AI) has radically shifted the stakes for creative industries, prompting calls for new regulations, medicine likewise has begun its own reckoning. Amid a flurry of new research and discussion about its potential use, could ChatGPT emerge as a strong thought partner for dermatologists?
“I think it would be a huge win if we can be thoughtful and creative in terms of how we use tools like ChatGPT in our clinics,” said Ivy Lee, MD, FAAD, chair of the Academy’s Augmented Intelligence Committee. “We just have to be very careful about keeping safety, fairness, and patient privacy in mind.” This month, experts discuss ChatGPT’s potential utility in dermatology practice — and caution about limitations regarding its use.
Short on time?
Key takeaways from this article:
ChatGPT is a large language model (LLM) released by OpenAI in November 2022. It can automatically create written text based on prompts from users, including essays, poems, lyrics for a song, travel agendas, and even an academic manuscript.
ChatGPT can potentially help improve clinical efficiency when used as a tool to draft generic clinical documents, including templates for after-visit summaries, post-procedure instructions, referrals, prior authorization appeal letters, and educational handouts.
ChatGPT can also be used to translate existing medical documents and educational content as well as adjust the reading level and length of the original material.
While current research indicates that ChatGPT may also show potential as a diagnostic partner, this use remains premature in the tool’s current iteration.
Major concerns about the use of ChatGPT in medicine involve the validity, accuracy, and biases of the information it provides.
Medical journals are currently setting standards on the use of ChatGPT in scientific publishing; most prohibit its use as an author or reviewer.
LLMs like ChatGPT are continually evolving alongside the development of large vision models (LVMs), which can incorporate images and audio input as well as text.
What is ChatGPT?
Released in late 2022 by OpenAI, ChatGPT is an AI chatbot that automatically creates text based on written prompts from users. Deriving its datasets from the general corpus of the internet, it is an example of a large language model (LLM). Other examples of LLMs include Google’s Bard and Med-PaLM 2, among others.
“Large language models are machine learning models designed to mimic human conversation and generate detailed information when prompted, based on large amounts of data,” explained Gabrielle Schwartzman, MD, and Soumya Reddy, BSA, medical student at Texas Tech University Health Sciences Center, co-authors of a recent commentary on ChatGPT in dermatology clinical practice.
Potential utility in medicine
ChatGPT can be used to generate various forms of writing, including essays, poems, lyrics for a song, travel agendas, and even an academic manuscript. This generative ability can also be potentially harnessed for different uses in clinical practice, say some physicians.
“I think LLMs can be a huge asset to alleviate some of the administrative burden that can contribute to burnout and give us more time and capacity to really enjoy the more human aspect of medicine,” said Dr. Lee.
Improving clinical efficiency
One way ChatGPT can be deployed in dermatology is by streamlining elements of clinical practice.
“There are several functions of ChatGPT that can be used to improve clinical efficiency. ChatGPT can be used to draft generic clinical documents, assist in research (finding resources for reviews, formulating hypotheses, etc.), administer patient surveys, recruit for clinical trials, and even automatically monitor high-risk medications,” said Dr. Schwartzman and Reddy. “However, even with its many potential applications, it is important for physicians to consider and review the quality, accuracy, and safety of what it produces.”
The program can also provide dermatologists with quick access to medical information, similar to a search engine, but with more contextual understanding. “I would say that this has replaced my Google search,” said Dr. Lee. “I don’t use ChatGPT, but I do use Perplexity, which is another AI LLM, because it cites the sources from which it’s generating information. Given the abundance and accessibility of information today, we have to be discerning consumers and think critically about the validity of information. As content creators and physicians, we can help our patients distinguish between valid information vs. misinformation and metabolize the content to determine whether it applies to them.”
As integration of LLMs in medicine progresses, dermatologists in the near future may have the option to delegate burdensome administrative tasks to AI, freeing up valuable time for doctors to invest in patient care. “One major way ChatGPT can be used to alleviate administrative burden in dermatology practice is through drafting generic medical documents. This includes templates for after-visit summaries, post-procedure instructions, referrals, prior authorization appeal letters, and educational handouts. This can help increase efficiency for dermatologists in and outside of the clinic, thus reducing physician burnout and leading to better patient care,” according to Dr. Schwartzman and Reddy.
“There’s a lot of experimentation in terms of looking at how we can generate text, visuals, and audio to reduce administrative burden,” agreed Dr. Lee. “Doximity is an online network used by many physicians, and it has a collection of tools called DocGPT that use an LLM for high-volume and high-frequency physician use cases, like prior authorization letters, patient letters, and letters of recommendation for our trainees and colleagues. LLMs — whether they’re OpenAI’s ChatGPT or others — are increasingly being experimented with in health care systems for documentation and patient education purposes.”
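For readers who want to experiment with this kind of document drafting themselves, a minimal sketch follows. It uses the OpenAI Python SDK; the prompt wording and model name are illustrative assumptions, not a vetted clinical template, and any output would still need physician review.

```python
# Minimal sketch of LLM-assisted document drafting using the OpenAI
# Python SDK. The prompt and model name are illustrative assumptions.
# Never include protected health information (PHI) in prompts sent to
# a third-party service.
from openai import OpenAI

client = OpenAI()  # reads the OPENAI_API_KEY environment variable

prompt = (
    "Draft a generic post-procedure instruction handout for patients "
    "who have had a skin biopsy. Write at a sixth-grade reading level, "
    "use short bulleted steps, and include when to call the office. "
    "Do not reference any specific patient."
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name; substitute whatever is available
    messages=[{"role": "user", "content": prompt}],
)

draft = response.choices[0].message.content
print(draft)  # a physician must still review and edit before use
```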
Current research indicates that ChatGPT may also show potential as a diagnostic partner. A recent study found that in its current iteration, ChatGPT may improve physicians’ diagnostic accuracy in interpreting test results. In the study, medical cases along with test results were presented to ChatGPT and physicians. “The chatbot was more accurate than physicians in assessing the likelihood of diagnoses in all five cases where test results came back negative. When diagnostic tests in the clinical scenarios were positive, ChatGPT was equal to doctors in assessing risk,” explained Craig Burkhart, MD, MS, MPH, FAAD, in a recent DermWorld column, noting that many physicians struggle with estimating pre-test and post-test probabilities of disease.
According to Dr. Burkhart, the study’s findings “reveal a future where we have the option to improve medical test interpretation by consulting with chatbots. This article is also exciting as it shows a simple way that dermatologists can experiment with integrating artificial intelligence into clinical practices right now without the need for expensive labs or specialized training.”
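For readers unfamiliar with the arithmetic Dr. Burkhart references, converting a pre-test probability into a post-test probability runs through odds and likelihood ratios. Here is a short worked example in Python; the numbers are illustrative and not drawn from the study or any real test.

```python
# Worked example of the pre-test/post-test probability arithmetic the
# study asked physicians and ChatGPT to perform. All numbers below are
# invented for illustration.

def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Convert a pre-test probability to a post-test probability via odds."""
    pre_test_odds = pre_test_prob / (1 - pre_test_prob)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)

# Suppose a clinician estimates a 30% pre-test probability of disease,
# and the test has sensitivity 0.90 and specificity 0.80.
sensitivity, specificity = 0.90, 0.80
lr_positive = sensitivity / (1 - specificity)   # LR+ = 4.5
lr_negative = (1 - sensitivity) / specificity   # LR- = 0.125

print(f"Positive result: {post_test_probability(0.30, lr_positive):.0%}")  # ~66%
print(f"Negative result: {post_test_probability(0.30, lr_negative):.0%}")  # ~5%
```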
However, while dermatologists may want to experiment with ChatGPT as a clinical support tool, they should exercise caution for now. “Although there is significant potential, the utility of ChatGPT as a clinical decision support tool is yet to be determined,” cautioned Dr. Schwartzman and Reddy. “Physicians should regard it similarly to information retrieved from a search engine. The information produced by ChatGPT should be used to support but not confirm a diagnosis or direct management. Clinical suspicion by the dermatologist is of the utmost importance and should always trump ChatGPT’s output.” Dermatologists should also remember never to enter protected health information (PHI) into these tools.
“Right now, in terms of actual clinical integration, LLM use is early in the adoption phase, but it is growing,” added Dr. Lee. “These LLMs can be an effective thought partner. If you’re practicing on your own, it’s great to kind of bounce ideas off of, brainstorm research or media campaigns, or simulate challenging patient or employee conversations. It’s an exciting time. The challenge is how do we explore and experiment with this new tool, and still hold it accountable to our frameworks for ethics and evidence?” Trust in these tools and meaningful adoption of LLMs depend on it.
Bolstering physician-patient communication
While ChatGPT’s output has sometimes been criticized for its lack of human finesse, ironically, it can be used to improve communication between people in certain aspects of medical practice.
One potential use is translation — either in place of a virtual translator during patient visits or to convert patient education materials into a different language. “I’ve used ChatGPT to translate practice materials into different languages. I do always have someone on my staff who speaks that language double check it to make sure it’s valid,” said Dr. Lee. “It can also be used to improve the accessibility of content that’s already been created. Say for example, we’re using AAD patient education on molluscum contagiosum that’s in pamphlet form. If I have a patient with a low health literacy level, I can input that information, and specify how I would like the information distilled. ChatGPT can easily summarize the key points and create an output that’s a paragraph long at a fourth-grade reading level.”
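The workflow Dr. Lee describes is straightforward to script. The sketch below shows one way it might look, again assuming the OpenAI Python SDK; the handout text is a placeholder, and, as she notes, a fluent staff member should verify any translated output before it reaches patients.

```python
# Sketch of the reading-level adjustment and translation Dr. Lee
# describes. The handout text, target language, and model name are
# placeholders; translated output should be checked by a fluent
# staff member before patient use.
from openai import OpenAI

client = OpenAI()

handout_text = "..."  # paste the existing patient-education text here

prompt = (
    "Rewrite the patient handout below as a single paragraph at a "
    "fourth-grade reading level, keeping only the key points. Then "
    "provide a Spanish translation of that paragraph.\n\n"
    f"{handout_text}"
)

response = client.chat.completions.create(
    model="gpt-4o",  # assumed model name
    messages=[{"role": "user", "content": prompt}],
)
print(response.choices[0].message.content)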
While bouncing queries off a chatbot may lack a human touch, in some instances, AI may show greater empathy in answering medical questions than a human physician. A recent JAMA Internal Medicine study compared physician and AI chatbot responses to patient questions posted on a public social media forum. “It was quite fascinating. In terms of accuracy, ChatGPT did alright, but its responses were rated significantly more empathetic than a human’s,” said Dr. Lee. “I think this is an area where we can use AI in combination with other training to improve those empathy and communication skills to really build that trust with our patients.”
For physicians struggling with the marketing and mass communication aspect of running a medical practice, ChatGPT can also be a handy ghostwriter. “In my private practice, we use it to write a lot of marketing material. On the website, newsletters, social accounts — I use generative AI for that all the time. It takes the mental load of it off,” said Dr. Lee.
Advancing medical education
ChatGPT has also recently been incorporated into medical education at select institutions, primarily used as a tool to model interactive patient encounters and help students practice taking a patient’s history and creating a differential diagnosis. “This application of ChatGPT can help medical students hone their clinical skills in a low-stress environment without the restrictions that can come from hiring and training standardized patients, especially when mimicking dermatologic clinical encounters,” said Dr. Schwartzman and Reddy.
“Stanford and Harvard are the two main schools that are really experimenting with incorporating it into medical education,” said Dr. Lee. “I think it can be a great way to better teach ourselves, our physicians in training, and our care team colleagues on how best to use AI in our clinics, whether it’s to streamline administrative tasks outside of the exam room, or in front of the patient in a clinical encounter to help improve comprehension of the diagnosis and management plan. AI has the potential to be a great thought partner that we have at bedside to help generate or expand the differential that our minds come up with, and maybe fill in some of those blind spots that we all have.”
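A simulated patient encounter of the kind described above can be built with little more than a system prompt that keeps the model in role. The sketch below is one possible setup; the case details are invented for illustration, and real curricula would supply vetted case material.

```python
# Sketch of a simulated standardized-patient encounter for trainees.
# A system prompt keeps the model in role; the case is invented for
# illustration only.
from openai import OpenAI

client = OpenAI()

system_prompt = (
    "You are playing a standardized patient: a 45-year-old presenting "
    "with a three-month history of an itchy, scaly rash on both elbows. "
    "Answer only what the student asks, one question at a time, and do "
    "not volunteer the diagnosis."
)

history = [{"role": "system", "content": system_prompt}]

while True:
    question = input("Student: ")
    if question.strip().lower() in {"quit", "exit"}:
        break
    history.append({"role": "user", "content": question})
    reply = client.chat.completions.create(model="gpt-4o", messages=history)
    answer = reply.choices[0].message.content
    history.append({"role": "assistant", "content": answer})
    print(f"Patient: {answer}")
```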
Ask ChatGPT
While much of this article has discussed dermatologists’ thoughts on ChatGPT as a potential tool, what does ChatGPT think it can do for dermatologists? When prompted, the program offers the following response:
“ChatGPT can assist dermatologists in various ways, such as:
Information retrieval: Providing information on skin conditions, diseases, treatments, and medications.
Educational support: Offering explanations and insights into dermatological concepts, procedures, and research.
Case discussion: Engaging in discussions about specific cases, symptoms, or patient scenarios to brainstorm possible diagnoses and treatment options.
Language translation: Facilitating communication with patients who speak different languages by offering translation services.
Patient education materials: Assisting in the creation of patient-friendly educational materials to help individuals better understand their skin conditions and treatment plans.
General queries: Answering general questions related to dermatology, skincare routines, and preventive measures.
Remember that while ChatGPT can provide valuable information, it is not a substitute for professional medical advice. Dermatologists should always rely on their expertise and consider consulting with colleagues or referring to medical literature for critical decisions.”
Areas of concern
While the use of ChatGPT in medicine has great potential, in its current state it also comes with a series of risks and, in some instances, the potential to do harm.
Validity and accuracy
While information obtained from ChatGPT is just a click away for both patients and physicians, safeguards do not currently exist to ensure the safety and accuracy of the information supplied. LLMs are not currently regulated in the United States, and there are no models that have been shown to be safe to diagnose or treat skin conditions. “Patients should also be cautious when seeking medical advice from ChatGPT, as it could be misleading and/or incorrect and undermine the patient-physician relationship,” said Dr. Schwartzman and Reddy.
Different LLMs also have variable and not completely understood limitations in terms of their datasets. “ChatGPT was trained on the Common Crawl dataset, which is a large corpus of text scraped from the internet,” explained Dr. Lee. “ChatGPT 3 had a knowledge cutoff of October 2021, and ChatGPT 4 has a knowledge cutoff of April 2023.”
Information validity regarding health care advice can also vary depending on how a question or task is presented. Because LLMs evolve over time, they may generate different outputs to the same query. “ChatGPT is programmed on vast amounts of data but needs to be repeatedly updated to incorporate the latest data and research available,” said Dr. Schwartzman and Reddy. “This is a cause for concern in terms of the accuracy or bias of its output because the LLM cannot access data past 2023, the year in which it was most recently updated; research and data from 2024 are not available to ChatGPT.”
Likewise, not all LLMs are created equal. “In terms of LLMs, there are lots of options. Some that are more popular among physicians are Elicit AI or Perplexity AI. These are different LLMs that are trained on more precise datasets. For example, Elicit is trained on the PubMed archive, and Perplexity is one that has been fine-tuned to cite its sources. So, if you are questioning the validity of its output, you can click on it and it will link to where it derives its information from,” said Dr. Lee.
As ChatGPT’s source data is subject to the breadth of the internet, its responses to queries can be influenced by biases found throughout the web. “ChatGPT was trained on the entire internet, for better or for worse. It reflects the accuracy, inaccuracies, the biases, and the blind spots of the internet,” explained Dr. Lee. “One of the struggles we have in our current society is misinformation. Even we as humans — with our natural neural network — have a hard time discerning what’s valid information versus misinformation. It’s the same with LLMs. They can also hallucinate and create either false or fabricated information. Therefore, if I use ChatGPT in an area that I’m not an expert in, it would be hard for me to discern whether the output it generates for me is valid and accurate.”
Perpetuation of existing biases
The broad source of ChatGPT’s datasets can also have an insidious effect on the information it produces — a metaphorical snowball effect reinforcing existing biases in medicine. “AI models like ChatGPT may perpetuate harmful stereotypes and biases that are encoded in training data,” said Dr. Schwartzman and Reddy.
A recent study found that four major commercial LLM models had instances of promoting race-based medicine and racist tropes, or repeated unsubstantiated claims around race. The study’s authors suggest that LLMs like ChatGPT require more adjustment to eradicate inaccurate racial content and are therefore not ready for clinical use or integration as a medical decision-making tool.
“Again, because ChatGPT is trained on the general corpus of the internet, it also ingests and reflects the biases that exist on the internet,” said Dr. Lee. “The paper that was just published on the racial biases of LLMs found that even though we as the medical community know that race is a social construct and not a genetic construct, ChatGPT’s response — especially in terms of renal function — reflects back biases that we know medically are inaccurate.”
Impact on peer-reviewed publishing
While the film industry recently underwent its own highly public reckoning with the ethics and regulation of AI-generated screenplays, the scientific publishing community is still working to write a set of common standards.
Several organizations within the scholarly publishing community have issued policies to govern the use of ChatGPT, which poses an unprecedented and immediate challenge, with major concerns including copyright, attribution, plagiarism, and authorship for articles generated by AI.
In early 2023, as the scientific publishing community grappled with the issue, high-profile journals such as Nature and JAMA updated their submission guidelines to prohibit, restrict, or discourage the use of this technology. The JAMA Network published an editorial that addressed author responsibilities, reproduced and re-created material, and image integrity.
The Science family of journals banned all ChatGPT-generated text, figures, images, and graphics. Meanwhile, reports began to emerge in February 2023 of journal articles that were written using ChatGPT, most notably an article in Radiology written entirely by ChatGPT, with a radiologist listed as the corresponding author.
Efforts to address these issues were also undertaken by prominent publishing organizations. In May 2023, the Committee on Publication Ethics (COPE) and the World Association of Medical Editors (WAME) issued position statements on the use of this technology. The COPE position statement addresses the key publishing issues at stake: “AI tools cannot meet the requirements for authorship as they cannot take responsibility for the submitted work. As non-legal entities, they cannot assert the presence or absence of conflicts of interest nor manage copyright and license agreements.”
WAME issued its recommendations on the use of ChatGPT in scholarly publications, calling for new tools to detect AI-generated or modified content.
Similarly, the International Committee of Medical Journal Editors updated its Recommendations for the Conduct, Reporting, Editing, and Publication of Scholarly Work in Medical Journals to “provide guidance on how work conducted with the assistance of AI technology (including ChatGPT) should and should not be acknowledged.”
Likewise, publisher Elsevier’s policy on the use of generative AI and AI-assisted technologies in writing for Elsevier allows AI to improve readability but states that ChatGPT cannot be an author. The Journal of the American Academy of Dermatology (JAAD) also adheres to this policy, according to Lara Graf, MS, AAD director of medical journals publishing. “JAAD does have a policy on use of AI in research and authorship. Artificial intelligence is not permitted to be used in an author capacity; authors should only use these technologies to improve readability and language.”
JAAD’s policy also requires authors to disclose in their manuscript the use of AI and AI-assisted technologies in the writing process, and mirrors Elsevier policy on the use of generative AI and AI-assisted tools in figures, images, and artwork. This policy puts strict limits on how AI may be used to create or alter images and requires documentation and reporting of which tools were used, and how.
Per JAAD policy, “The use of Generative AI or AI-assisted tools is NOT permitted to create or alter images in submitted manuscripts. Image forensics tools or specialized software might be applied to submitted manuscripts to identify suspected image irregularities.”
JAAD is also working to add a policy on the use of AI in the peer review process that mirrors Elsevier’s policy on publishing ethics. The Elsevier policy states, “Peer review is at the heart of the scientific ecosystem … Managing the editorial evaluation of a scientific manuscript implies responsibilities that can only be attributed to humans … the critical thinking and original assessment needed for this work is outside of the scope of this technology and there is a risk that the technology will generate incorrect, incomplete, or biased conclusions about the manuscript.” Further, submitted manuscripts are considered confidential documents, which should not be uploaded into a generative AI tool as this may violate “the authors’ confidentiality and proprietary rights and, where the paper contains personally identifiable information, may breach data privacy rights.”
Looking toward the future — responsible use of generative AI in medicine
Like them or not, LLMs likely aren’t going away. Dermatologists should be aware of the limitations and potential risks of using ChatGPT, carefully evaluating for accuracy, biases, and protection of patient privacy. “However, if used appropriately with these limitations in mind, ChatGPT has the potential to improve efficiency in clinical practice through information retrieval and document drafting. This increased efficiency can decrease physician burnout and improve patient care,” said Dr. Schwartzman and Reddy.
As LLMs continue to develop, new modalities may also emerge for use in dermatology. “Large vision models (LVMs) are another type of machine learning model, which can take input in the form of audio and images as well as text, and then produce output in the form of audio, images, or text,” explained Dr. Lee. “There’s an LVM called SkinGPT where, for example, you can snap a photo of a skin lesion, and the LVM, depending on what you’re asking, then puts out text in terms of diagnosis and management. The growth curve of those LVMs is very much paralleling the LLMs.”
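For illustration only, here is what that image-in, text-out pattern looks like in code. This sketch uses OpenAI’s vision-capable chat API as a stand-in; it is not SkinGPT’s interface, the file name and model are assumptions, and, as the experts above stress, no model has been shown to be safe for diagnosing skin conditions.

```python
# Sketch of the LVM pattern Dr. Lee describes: image in, text out.
# Uses OpenAI's vision-capable chat API as a stand-in, not SkinGPT.
# No model is validated for diagnosing skin conditions; output here
# is descriptive text only.
import base64
from openai import OpenAI

client = OpenAI()

with open("lesion.jpg", "rb") as f:  # hypothetical de-identified image
    image_b64 = base64.b64encode(f.read()).decode("utf-8")

response = client.chat.completions.create(
    model="gpt-4o",  # assumed vision-capable model
    messages=[{
        "role": "user",
        "content": [
            {"type": "text",
             "text": "Describe the morphology of this skin lesion in general terms."},
            {"type": "image_url",
             "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}},
        ],
    }],
)
print(response.choices[0].message.content)
```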
While the use of ChatGPT and other LLMs in medicine currently requires careful human oversight and awareness, the evolution of AI also offers unforeseen opportunities. “Overall, I am hopeful that the accessibility of ChatGPT will allow dermatologists and our patients to experiment with and identify the most meaningful use cases in our work — and personal — lives,” said Dr. Lee. “ChatGPT is improving dramatically and very quickly. With each iteration, the amount of knowledge and how recent and proximate the knowledge is, is increasing and improving. As physicians, we must educate ourselves on the potential risks and benefits of this technology so we can apply it responsibly and equitably. Along with other stakeholders, we need to co-create and advocate for guardrails on safety, fairness, and ethics — what exciting times we live in!”