Journal Club by SWISS / KNIFE

Original Paper

"Evaluating artificial intelligence in decision-making for surgical treatment of benign breast conditions." De Pellegrin L, Weinzierl A, Kappos EA, Lindenblatt N, Zucal I, Harder Y. Journal of Plastic, Reconstructive & Aesthetic Surgery 105 (2025) 189–195. doi: 10.1016/j.bjps.2025.03.057

Artificial intelligence (AI) is increasingly explored in breast surgery for its potential to assist with diagnosis, anticipate disease progression, and support surgical planning. It may aid in interpreting imaging, estimating complication risks, and contributing to personalized care approaches. However, the integration of AI into clinical workflows demands careful validation to ensure reliability, safety, and effectiveness.

This study investigated the performance of ChatGPT-4o in assessing and proposing therapeutic strategies for common benign breast conditions. Five anonymized clinical scenarios – including hypertrophy, congenital deformity, hypotrophy, postpartum atrophy, and weight-loss-induced atrophy of the breast – were submitted to the AI, each with an accompanying description and a photograph of the patient’s torso. Swiss plastic surgeons rated the AI’s outputs for accuracy, relevance, and completeness on a Likert scale. Ratings from postgraduate residents and board-certified surgeons were compared statistically, and the readability of the responses was analyzed.

Of 100 invited participants, 25 responded (9 residents, 16 surgeons). The AI demonstrated high diagnostic accuracy, though the comprehensiveness of its treatment suggestions varied. Objectivity was rated positively, while decision-making support was identified as an area for further refinement. There were no significant differences in evaluations between residents and board-certified surgeons. Readability assessments showed that the AI’s responses generally required advanced reading proficiency, and some redundancy was noted.

In summary, tools like ChatGPT-4o show potential in supporting the diagnosis and planning of benign breast surgery cases. Strengths include diagnostic precision and acknowledgment of psychological aspects of care. Yet limitations such as a lack of personalization, a limited grasp of domain-specific nuances, and complex language underscore the need for cautious and well-supervised clinical integration. Continued development should prioritize transparency, ethical oversight, and expert validation.

Interview with Dr. med. Laura De Pellegrin (Inselspital Bern)

 

What inspired you to conduct this study?

We observed a rapidly growing use of large language models such as ChatGPT in various clinical environments – not only by medical professionals but also by patients. These models were increasingly being employed for a wide range of tasks, from correcting letters and simplifying patient communication to assisting with anatomical learning and surgical dissection. While their accessibility and fluency were impressive, we began to question whether these tools could provide truly reliable and personalized support in a field as delicate and individualized as plastic and reconstructive surgery.

This surgical field often requires careful consideration of subtle psychosocial and anatomical nuances that directly impact treatment quality and patient outcomes. With that in mind, we wanted to assess whether ChatGPT – specifically in its then-latest iteration – could in the future contribute to clinical decision-making in common yet complex scenarios. The goal was not to evaluate AI as a replacement for expertise, but to better understand its current capabilities and limitations within the evolving landscape of clinical practice.

Were there any unexpected findings?

Yes, we were indeed surprised by a few aspects of our findings. One particularly unexpected observation was the model’s ability to reflect psychosocial considerations, such as patient self-image, even though these were not explicitly prompted. For instance, in one clinical scenario addressing breast hypotrophy, ChatGPT-4o showed an implicit awareness of how certain anatomical features could impact the patient’s self-perception. This level of nuance is not commonly expected from a language model and suggests a potential for AI to address not only clinical but also psychological dimensions of care.

Another unanticipated result was the lack of a statistically significant difference in the evaluation of AI-generated responses between residents and board-certified plastic surgeons. We had expected more variability between these two groups, yet both assessed the model’s output in a very similar way.

What is the direct impact on the surgeon's work?

At this point, we do not expect any immediate impact on the daily work of plastic surgeons. ChatGPT is not yet suitable for autonomous clinical decision-making in this type of clinical scenario. However, our study represents an initial step toward understanding how artificial intelligence might one day support clinical reasoning and surgical planning. We are currently developing further research in this direction, with the aim of identifying specific areas where AI integration could eventually bring tangible benefits to our field. Whether through patient communication, educational support, or preoperative planning, we believe that AI may soon play a meaningful role in improving efficiency and personalization in plastic surgery.

What is your learning point from this project?

This project reinforced the importance of critically evaluating new technologies before integrating them into clinical workflows. While we were ourselves surprised by ChatGPT’s diagnostic accuracy and even its sensitivity to psychosocial aspects like patient self-image, we also recognized its current limitations – particularly in the depth and nuances of surgical planning and personalization of surgical treatment.

Our main takeaway is that while AI can already offer valuable support in clinical reasoning and communication, it must be further refined, validated, and ethically guided to ensure it enhances rather than replaces clinical expertise. This study served as a valuable first step in understanding where AI can be integrated into plastic surgery – and where it still falls short. It ultimately encouraged us to continue exploring this evolving field, with a focus on meaningful, patient-centered applications.

Are there any subsequent projects planned?

Yes, we are currently planning additional studies that will build upon the findings of this initial work. Unfortunately, we can’t go into detail yet. Our goal is to further explore the integration of AI into clinical decision-making in plastic and reconstructive surgery in a responsible and meaningful way. We aim to investigate how AI can assist with more complex patient scenarios, potentially incorporating preoperative imaging, individualized patient data, or interdisciplinary medical data. So you should definitely stay tuned.
