Yan Leyfman, Speaker at Oncology Conferences

NYP-Meyer Cancer Center, United States

Title: Real-world clinician adoption and disease-specific safety failures of large language models in oncology clinical decision support

Abstract:

Background: Large language models (LLMs) are increasingly used in oncology clinical workflows; however, real-world clinician adoption has outpaced formal institutional guidance and systematic safety validation. Existing evaluations often rely on aggregate performance metrics, which may obscure disease-specific safety risks. We sought to (1) characterize real-world AI use and verification behaviors among oncology clinicians and (2) assess disease-dependent safety failures of LLM-based clinical decision support systems.

Methods: We conducted a voluntary, anonymous cross-sectional survey of oncology clinicians assessing AI access, patterns of use, verification behaviors, perceived accountability, and responses to a standardized oncology clinical scenario. Separately, we developed 216 simulated tumor-board vignettes across five oncology domains: leukemia (n=30), breast (n=50), gastrointestinal (n=50), central nervous system metastases (n=50), and gynecologic malignancies (n=50). Each vignette was evaluated using three LLM configurations: (1) unconstrained LLM, (2) NCCN guideline-anchored retrieval-augmented generation (RAG), and (3) literature-anchored RAG. Outputs were independently scored by two board-certified oncologists using a modified Generative Performance Score (mGPS; −1 to +1), incorporating guideline concordance and hallucination penalties. Safety disparity for each output was defined conservatively as the highest severity assigned on any scoring axis.
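The conservative "highest severity across scoring axes" rule can be illustrated with a minimal sketch. The abstract does not specify the axes or the severity scale, so the three-level `Severity` scale, the axis names, and the `safety_disparity` helper below are all hypothetical and used only for illustration; this is not the authors' scoring code.

```python
from enum import IntEnum


class Severity(IntEnum):
    # Hypothetical three-level scale; the abstract reports only
    # "low-to-intermediate" and "high" categories.
    LOW = 0
    INTERMEDIATE = 1
    HIGH = 2


def safety_disparity(axis_severities: dict[str, Severity]) -> Severity:
    """Return the worst (highest) severity found on any scoring axis."""
    if not axis_severities:
        raise ValueError("at least one scoring axis is required")
    return max(axis_severities.values())


# Illustrative axis names (assumptions, not taken from the study).
example = {
    "guideline_concordance": Severity.LOW,
    "hallucination": Severity.HIGH,
    "staging": Severity.INTERMEDIATE,
}
result = safety_disparity(example)
print(result.name)  # HIGH
```

Taking the maximum rather than an average means a single severe failure on one axis cannot be offset by good performance on the others, which is what makes the definition conservative.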

Results: Thirty-one clinicians completed the survey, including fellows (45%) and attending oncologists (29%), representing academic and community practice settings. Despite limited or uncertain access to institution-approved AI tools, nearly all respondents reported independent AI use for professional tasks. Most clinicians reported routinely verifying AI outputs against guidelines or primary literature and maintaining clinician-centered accountability for AI-related errors, while formal institutional governance was frequently absent.

In vignette-based evaluation, NCCN-anchored RAG demonstrated improved guideline concordance and reduced hallucinations compared with unconstrained models; however, safety performance varied substantially by disease context. Leukemia demonstrated predominantly low-to-intermediate safety disparity (93%), whereas high disparity was observed in CNS metastases (80%) and gynecologic malignancies (70%), driven by concurrent hallucinations, staging errors, and inappropriate extrapolation. Readability scores did not correlate with safety, and highly readable outputs frequently obscured clinically significant errors.
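A small sketch of why stratifying by disease matters: a pooled rate can look moderate while one stratum fails most of the time. The records below are toy data invented purely for illustration (the labels "A" and "B" are placeholders, not study domains), and `high_disparity_rates` is a hypothetical helper, not the authors' analysis.

```python
from collections import Counter


def high_disparity_rates(records):
    """Map each disease label to the fraction of its vignettes rated 'high'."""
    totals = Counter()
    highs = Counter()
    for disease, level in records:
        totals[disease] += 1
        if level == "high":
            highs[disease] += 1
    return {disease: highs[disease] / totals[disease] for disease in totals}


# Toy data only: 10 vignettes per placeholder disease, not study results.
toy = (
    [("A", "low")] * 9 + [("A", "high")] * 1
    + [("B", "low")] * 2 + [("B", "high")] * 8
)

by_disease = high_disparity_rates(toy)
overall = sum(1 for _, level in toy if level == "high") / len(toy)

print(by_disease)  # {'A': 0.1, 'B': 0.8}
print(overall)     # 0.45
```

In this toy case the pooled rate of 0.45 describes neither group well, which is the kind of masking that a disease-stratified report avoids.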

Conclusions: Oncology clinicians are already integrating AI into clinical practice with high levels of independent verification but limited institutional oversight. LLM safety is strongly disease-dependent and inadequately captured by aggregate accuracy metrics. Disease-stratified validation frameworks incorporating guideline concordance and hallucination detection are necessary to inform responsible clinical deployment. These findings support the need for clinician-led governance and disease-specific risk stratification prior to broad adoption of AI decision support in oncology.

Biography:

Dr. Yan Leyfman is a physician-scientist in oncology recognized as a 40 Under 40 Emerging Leader in Cancer at the 2023 ASCO Annual Meeting. During the COVID-19 pandemic, he led the Immunology Division of the Global COVID-19 Taskforce and conducted early translational work on interactions between SARS-CoV-2 and cancer, which was presented at ASCO and published in peer-reviewed journals. His research spans cancer immunology, CAR T-cell therapies, and clinical AI. Dr. Leyfman is co-founder and Executive Director of MedNews Week, which advances global oncology education and equity. He is committed to mentorship, clinical innovation, and the responsible translation of emerging technologies into oncology care.