Journal Club by SWISS / KNIFE

Original Paper

"Evaluating artificial intelligence in decision-making for surgical treatment of benign breast conditions" De Pellegrin L, Weinzierl A, Kappos EA, Lindenblatt N, Zucal I, Harder Y. Journal of Plastic, Reconstructive & Aesthetic Surgery 105 (2025) 189–195. doi: 10.1016/j.bjps.2025.03.057

Authors

Prof. Dr. med. Yves Harder yves.harder@chuv.ch

Dr. med. Laura De Pellegrin laura.depellegrin@insel.ch

Artificial intelligence (AI) is increasingly explored in breast surgery for its potential to assist with diagnosis, anticipate disease progression, and support surgical planning. It may aid in interpreting imaging, estimating complication risks, and contributing to personalized care approaches. However, the integration of AI into clinical workflows demands careful validation to ensure reliability, safety, and effectiveness.

This study investigated the performance of ChatGPT-4o in assessing and proposing therapeutic strategies for common benign breast conditions. Five anonymized clinical scenarios (hypertrophy, congenital deformity, hypotrophy, postpartum atrophy, and weight-loss-induced atrophy of the breast) were submitted to the AI, each with an accompanying description and a photograph of the patient’s torso. Swiss plastic surgeons evaluated the AI’s outputs for accuracy, relevance, and completeness using a Likert scale. Ratings from postgraduate residents and board-certified surgeons were statistically compared, and response readability was analyzed.

Out of 100 invited participants, 25 responded (9 residents, 16 surgeons). The AI demonstrated high diagnostic accuracy, though variability was observed in the comprehensiveness of treatment suggestions. Objectivity was rated positively, while decision-making support was identified as an area for further refinement. There were no significant differences in evaluations between residents and attending surgeons. Readability assessments showed that the AI’s responses generally required advanced reading proficiency, and some redundancy was noted.
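The resident-versus-surgeon comparison described above can be illustrated with a small sketch. The summary does not state which statistical test the authors used; a Mann-Whitney U test is assumed here only because it is a common choice for ordinal Likert ratings from two independent groups. The function names and all ratings below are hypothetical and are not data from the study.

```python
from math import erfc, sqrt


def midranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U, two-sided p) using the normal approximation with tie
    correction and no continuity correction. This is an assumed method,
    not necessarily the test used in the paper."""
    n1, n2 = len(group_a), len(group_b)
    pooled = list(group_a) + list(group_b)
    ranks = midranks(pooled)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    u = min(u1, n1 * n2 - u1)

    # Variance of U, reduced for ties (Likert data is heavily tied).
    n = n1 + n2
    counts = {}
    for value in pooled:
        counts[value] = counts.get(value, 0) + 1
    tie_term = sum(t ** 3 - t for t in counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:  # every rating identical: no evidence of a difference
        return u, 1.0
    z = (u - n1 * n2 / 2) / sqrt(variance)
    return u, erfc(abs(z) / sqrt(2))


# Hypothetical 5-point Likert ratings for one scenario (not study data).
resident_ratings = [4, 5, 4, 3, 4, 5, 4, 4, 3]
surgeon_ratings = [4, 4, 5, 3, 4, 4, 5, 4, 3, 4, 5, 4, 4, 3, 4, 5]
u_stat, p_value = mann_whitney_u(resident_ratings, surgeon_ratings)
print(f"U = {u_stat}, two-sided p = {p_value:.3f}")
```

With groups as small as 9 and 16 raters, the normal approximation is rough, and an exact test from a statistics package would usually be preferred in practice.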

In summary, tools like ChatGPT-4o show potential in supporting the diagnosis and planning of benign breast surgery cases. Strengths include diagnostic precision and acknowledgment of psychological aspects of care. Yet limitations such as a lack of personalization and domain-specific nuance, as well as complex language, underscore the need for cautious and well-supervised clinical integration. Continued development should prioritize transparency, ethical oversight, and expert validation.

Interview with Dr. med. Laura De Pellegrin (Inselspital Bern)

What inspired you to conduct this study?

We observed a rapidly growing use of large language models such as ChatGPT in various clinical environments – not only by medical professionals but also by patients. These models were increasingly being employed for a wide range of tasks, from correcting letters and simplifying patient communication to assisting with anatomical learning and surgical dissection. While their accessibility and fluency were impressive, we began to question whether these tools could provide truly reliable and personalized support in a field as delicate and individualized as plastic and reconstructive surgery.

This surgical field often requires careful consideration of subtle psychosocial and anatomical nuances that directly impact treatment quality and patient outcomes. With that in mind, we wanted to assess whether ChatGPT, specifically in its then-latest iteration, could in the future contribute to clinical decision-making in common yet complex scenarios. The goal was not to evaluate AI as a replacement for expertise, but to better understand its current capabilities and limitations within the evolving landscape of clinical practice.

Were there any unexpected findings?

Yes, we were indeed surprised by a few aspects of our findings. One particularly unexpected observation was the model’s ability to reflect psychosocial considerations, such as patient self-image, even though these were not explicitly prompted. For instance, in one clinical scenario addressing breast hypotrophy, ChatGPT-4o showed an implicit awareness of how certain anatomical features could impact the patient’s self-perception. This level of nuance is not commonly expected from a language model and suggests a potential for AI to address not only clinical but also psychological dimensions of care.

Another unanticipated result was the lack of statistically significant difference in the evaluation of AI-generated responses between residents and board-certified plastic surgeons. We had expected more variability between these two groups, yet both assessed the model’s output in a very similar way.

What is the direct impact on the surgeon's work?

At this point, we do not expect any immediate impact on the daily work of plastic surgeons. ChatGPT is not yet suitable for autonomous clinical decision-making in this type of clinical scenario. However, our study represents an initial step toward understanding how artificial intelligence might one day support clinical reasoning and surgical planning. We are currently developing further research in this direction, with the aim of identifying specific areas where AI integration could eventually bring tangible benefits to our field. Whether through patient communication, educational support, or preoperative planning, we believe that AI may soon play a meaningful role in improving efficiency and personalization in plastic surgery.

What is your learning point from this project?

This project reinforced the importance of critically evaluating new technologies before integrating them into clinical workflows. While we ourselves were surprised by ChatGPT’s diagnostic accuracy and even its sensitivity to psychosocial aspects like patient self-image, we also recognized its current limitations – particularly in the depth and nuance of surgical planning and in the personalization of surgical treatment.

Our main takeaway is that while AI can already offer valuable support in clinical reasoning and communication, it must be further refined, validated, and ethically guided to ensure it enhances rather than replaces clinical expertise. This study served as a valuable first step in understanding where AI can be integrated into plastic surgery and where it still falls short. It ultimately encouraged us to continue exploring this evolving field, with a focus on meaningful, patient-centered applications.

Are there any subsequent projects planned?

Yes, we are currently planning additional studies that will build upon the findings of this initial work. Unfortunately, we can’t go into detail yet. Our goal is to further explore the integration of AI into clinical decision-making in plastic and reconstructive surgery in a responsible and meaningful way. We aim to investigate how AI can assist with more complex patient scenarios, potentially incorporating preoperative imaging, individualized patient data, or interdisciplinary medical data. So you should definitely stay tuned.
