Can computer algorithms improve dermatologists’ accuracy when diagnosing melanoma?
Clinical Applications
Dr. Schwarzenberger is the former physician editor of DermWorld. She interviews the author of a recent study each month.
By Kathryn Schwarzenberger, MD, January 1, 2021
In this month’s Clinical Applications column, Physician Editor Kathryn Schwarzenberger, MD, talks with Michael A. Marchetti, MD, about his recent JAAD article "Computer algorithms show potential for improving dermatologists' accuracy to diagnose cutaneous melanoma: Results of the International Skin Imaging Collaboration 2017."
DermWorld: Tell us about your study. Why did you choose to study the role of AI in dermatology?
Dr. Marchetti: Our study represents the work of the International Skin Imaging Collaboration (ISIC): Melanoma Project, an academia-industry partnership with the principal goal of applying digital skin imaging to reduce melanoma deaths. Given recent progress in computer vision, which can now identify faces and objects in images with an accuracy that parallels human performance, we believe that artificial intelligence has the potential to aid the detection of melanoma.
To help spur the development and validation of AI for melanoma diagnosis, ISIC has hosted annual grand challenges (2016-2020) using dermoscopy images from its burgeoning online archive of skin images. Over time, the competitions have increased in complexity and scale. This particular article published in JAAD represents the companion reader study from the disease classification portion of the 2017 challenge. Briefly, we selected 2,750 high-quality dermoscopy images from the archive: 521 were melanomas, 1,843 were nevi, and 386 were seborrheic keratoses. We provided 2,000 of these images as training examples to participating teams and used another 600 images to evaluate the performance of their submitted algorithms. The winning algorithm was developed by a Japanese team from Shinshu University and Casio Computer Co., Ltd., and achieved a balanced multiclass accuracy of 0.911.
To better understand how this AI performance compares with that of physicians, we conducted a reader study with dermatologists and residents.
Dr. Marchetti: We randomly selected 150 of the 600 images from the test set of the challenge. Fifty images were melanomas, 50 were nevi, and 50 were seborrheic keratoses. Using an online web interface, eight dermatologists and nine residents evaluated all 150 images, providing their diagnosis, management decision, and confidence level. Of note, readers did not have access to information typically used when evaluating suspicious skin lesions, such as patient risk factors and lesion symptoms, and they made choices without time restrictions. These are notable limitations to our study. To explore the feasibility of algorithms augmenting physician performance, we also examined the impact of substituting dermatologist decisions with algorithm predictions when reader confidence was low. We chose to perform this analysis as we hypothesized that this would represent the most likely circumstance that a physician would use AI when examining a patient.
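The substitution analysis described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual code: the `Reading` record, the 1-to-5 confidence scale, the cutoff of 2, and the six toy readings are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    is_melanoma: bool       # reference label for the lesion
    reader_call: bool       # True if the reader diagnosed melanoma
    reader_confidence: int  # hypothetical scale: 1 (lowest) to 5 (highest)
    algorithm_call: bool    # True if the algorithm predicted melanoma


def sensitivity_specificity(readings, calls):
    """Return (sensitivity, specificity) of `calls` against the reference labels.

    Assumes the readings contain at least one melanoma and one non-melanoma.
    """
    tp = sum(1 for r, c in zip(readings, calls) if r.is_melanoma and c)
    fn = sum(1 for r, c in zip(readings, calls) if r.is_melanoma and not c)
    tn = sum(1 for r, c in zip(readings, calls) if not r.is_melanoma and not c)
    fp = sum(1 for r, c in zip(readings, calls) if not r.is_melanoma and c)
    return tp / (tp + fn), tn / (tn + fp)


def substitute_low_confidence(readings, cutoff=2):
    """Use the algorithm's call when reader confidence is at or below `cutoff`."""
    return [
        r.algorithm_call if r.reader_confidence <= cutoff else r.reader_call
        for r in readings
    ]


# Made-up readings for illustration only; these are not study data.
toy_readings = [
    Reading(True, True, 5, True),
    Reading(True, False, 1, True),    # low confidence: algorithm fixes a miss
    Reading(False, True, 2, False),   # low confidence: algorithm fixes a false alarm
    Reading(False, False, 4, False),
    Reading(True, True, 4, False),    # high confidence: reader's call is kept
    Reading(False, False, 1, True),   # low confidence: algorithm adds a false alarm
]

reader_only = [r.reader_call for r in toy_readings]
augmented = substitute_low_confidence(toy_readings)

print("reader only:", sensitivity_specificity(toy_readings, reader_only))
print("augmented:  ", sensitivity_specificity(toy_readings, augmented))
```

The last toy reading is deliberate: substitution can also replace a correct low-confidence call with a wrong algorithm prediction, so the net effect depends on how accurate the algorithm is on exactly those lesions.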
DermWorld: What did you find, and were you surprised by any of the results?
Dr. Marchetti: The performance of the top-ranked algorithm was clearly superior to that of the dermatologists and the residents for both diagnosis and management decisions. For example, at the dermatologists’ mean sensitivity in diagnosis (76%), the AI algorithm had a higher specificity than the dermatologists (85% vs. 73%, p=0.001). Differences in performance between the AI algorithm and the residents were even more pronounced. These data were not necessarily surprising to us given recent publications that have shown human-level performance of AI in classifying skin images.
However, we also showed that replacing low-confidence reader decisions with algorithm predictions improved reader performance. For example, the dermatologists’ mean sensitivity increased from 76% to 80.8% and the specificity increased from 72.6% to 72.8%. This finding was unexpected and exciting because at the time we did not know if AI accuracy would be maintained on the lesions that are most challenging to physicians. Subsequent research has empirically demonstrated that providing AI predictions to dermatologists can lead to more accurate decisions at least in artificial reader settings, providing stronger evidence of potential clinical utility (Nat Med. 2020; 26(8): 1229-34; J Invest Dermatol. 2020; 140(9):1753-61).
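For readers unfamiliar with comparisons "at the dermatologists' mean sensitivity," the idea is to pick the algorithm's score threshold so that its sensitivity matches the readers', and then read off its specificity at that threshold. The sketch below shows one simple way to do this; the function name, the threshold-selection rule, and the toy scores are assumptions for illustration, and the paper may have used a different procedure.

```python
def specificity_at_sensitivity(labels, scores, target_sensitivity):
    """Find the highest threshold whose sensitivity reaches the target.

    labels: 1 for melanoma, 0 otherwise.
    scores: algorithm output, higher means more suspicious.
    A lesion is called melanoma when its score is >= the threshold.
    Returns (threshold, sensitivity, specificity).
    """
    positives = [s for y, s in zip(labels, scores) if y]
    negatives = [s for y, s in zip(labels, scores) if not y]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    # Walk thresholds from strictest to most lenient; sensitivity only rises.
    for threshold in sorted(set(scores), reverse=True):
        sensitivity = sum(s >= threshold for s in positives) / len(positives)
        if sensitivity >= target_sensitivity:
            specificity = sum(s < threshold for s in negatives) / len(negatives)
            return threshold, sensitivity, specificity
    raise ValueError("target sensitivity cannot be reached")


# Made-up scores for illustration only; these are not study data.
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

print(specificity_at_sensitivity(toy_labels, toy_scores, target_sensitivity=0.75))
```

Matching on sensitivity first makes the specificity figures directly comparable, since both the readers and the algorithm are then missing the same share of melanomas.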
DermWorld: Technology has drastically altered many industries and professions. With regard to dermatology, do you see artificial intelligence as a threat and/or an opportunity and why?
Dr. Marchetti: I see AI as both a real threat and a tremendous opportunity. First, we must recognize that the diagnosis of melanoma remains challenging — many are missed, and dozens of moles are removed for every melanoma found. AI could help physicians diagnose melanoma more easily and at earlier stages. It might also help prevent unnecessary biopsies. In an ideal world, the combined effect could reduce morbidity, decrease melanoma mortality, and bring down health care costs. Thus, if rigorously demonstrated to lead to higher quality care and clinical utility, who wouldn’t want AI to be used on themselves, a loved family member, or a close friend? I think we all would.
On the other hand, I worry that AI may have unintended consequences, analogous to what has happened with the application of advanced imaging diagnostics (e.g., ultrasound, computed tomography) to the detection of thyroid and kidney cancers (N Engl J Med. 2019; 381(14):1378-86). Compared to those organs, the skin and its lesions are even more amenable to being “tested.” It is not difficult to imagine a time in the near future when devices could be used (by physicians and even patients) to rapidly scan every lesion on the body and provide: (a) predictions about the probability of cancer, and (b) data on the most minute irregularities or changes in size, shape, and color. What might happen with such raw diagnostic power? Will it help or harm? Without careful consideration and implementation, I fear that AI will contribute to a self-perpetuating loop of skin cancer diagnosis (BMJ. 2015;350:h705). Not only would this accelerate pre-existing levels of melanoma “overdiagnosis” (the detection of cancer that is so indolent that it would never bother or harm the patient in their lifetime; N Engl J Med. 2019; 381(14):1378-86), but it would probably also exponentially compound the detection and resulting treatment of “dysplasias,” “proliferations,” and “cannot rule out-omas,” for which there is no demonstrated benefit.
Consequences might include increased costs, reduced value, overtreatment, increased patient anxiety and sense of vulnerability, unrealistic patient and provider expectations, and the medicalization of ordinary life conditions (BMJ. 2015;350:h705).
Most likely, AI will lead to some combination of these aforementioned effects. Dermatology as a specialty is ultimately responsible for maximizing the benefit and minimizing the harm.
DermWorld: In your paper, you conclude that, “When judiciously applied, use of computer algorithm predictions can improve dermatologist accuracy for melanoma diagnosis.” Explain what you mean by “judiciously applied.”
Dr. Marchetti: Here we specifically meant that AI predictions can improve accuracy for melanoma diagnosis when applied to low-confidence lesions, as this is what we examined in our study. The clinical settings and scenarios in which AI may provide the greatest benefit, however, remain unclear and require further investigation. Generally speaking, the decision to pursue more testing should be guided by the pretest probability of an outcome. Tests may not help and can actually lead to greater confusion or even harm when the pretest probability of a disease is either very high or very low.
As an example, the dermatologists in our study rated their confidence as “very high” for 20% of lesion evaluations. Among this subset they performed well: the sensitivity was 92% and the specificity was 90%. Thus, although we did not formally examine this, it is very possible that exposure to AI predictions for these high-confidence lesions could lead dermatologists to make worse decisions, ultimately harming patients. Supporting this hypothesis, Tschandl et al. recently found that expert dermatologists did not improve their diagnostic accuracy after exposure to AI predictions when they were confident, and suggested that they should either ignore AI results or not use them at all in this scenario (Nat Med. 2020; 26(8): 1229-34). To date, these findings have been identified through artificial reader studies and they should be interpreted cautiously. What we need are rigorous, high-quality, clinically based trials to better understand when to “reach” for AI at the bedside.
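The point about pretest probability can be made concrete with Bayes' rule. The sketch below is not from the paper: it borrows the algorithm's operating point quoted earlier in this interview (76% sensitivity, 85% specificity on the study images) purely as example inputs, and the three pretest probabilities are hypothetical values chosen to represent a very unlikely, an uncertain, and a very likely melanoma.

```python
def post_test_probability(pretest, sensitivity, specificity, positive_result):
    """Probability of disease after a test result, by Bayes' rule."""
    if positive_result:
        rate_if_diseased = sensitivity        # true-positive rate
        rate_if_healthy = 1 - specificity     # false-positive rate
    else:
        rate_if_diseased = 1 - sensitivity    # false-negative rate
        rate_if_healthy = specificity         # true-negative rate
    diseased = rate_if_diseased * pretest
    healthy = rate_if_healthy * (1 - pretest)
    return diseased / (diseased + healthy)


# Example operating point only; real-world performance may differ.
SENSITIVITY = 0.76
SPECIFICITY = 0.85

# Hypothetical pretest probabilities: very low, intermediate, very high.
for pretest in (0.01, 0.30, 0.95):
    after_positive = post_test_probability(pretest, SENSITIVITY, SPECIFICITY, True)
    after_negative = post_test_probability(pretest, SENSITIVITY, SPECIFICITY, False)
    print(
        f"pretest {pretest:.2f}: "
        f"after positive {after_positive:.3f}, after negative {after_negative:.3f}"
    )
```

Under these assumed numbers, a positive result on a lesion with a 1% pretest probability still leaves melanoma unlikely, and a negative result on a lesion with a 95% pretest probability still leaves it likely. The test result shifts the decision most in the intermediate range, which is in keeping with applying AI to low-confidence lesions.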
DermWorld: Do you think it’s important for dermatology to be on the forefront of this technological advancement? If so, why and how can physicians get out ahead of this?
Dr. Marchetti: Yes, I do think it is important for dermatology to try to apply state-of-the-art technologies to improve melanoma outcomes. However, we should not automatically believe that “high tech is better than no tech,” “new is better than old,” “advanced is better than the simple,” and “more is better than less” (BMJ. 2015;350:h705). Although the media may portray AI as ready for clinical implementation, there is a lot of progress needed before that actually takes place (and helps patients).
Michael Marchetti, MD, is a dermatologist at Memorial Sloan Kettering Cancer Center. His paper appeared in JAAD. Dr. Marchetti has no relevant financial or commercial conflicts of interest.
Disclaimer: The views and opinions expressed in this article do not necessarily reflect those of DermWorld.