TY - JOUR
AU - C. Seifen
AU - K. Bahr-Hamm
AU - H. Gouveris
AU - J. Pordzik
AU - A. Blaikie
AU - C. Matthias
AU - S. Kuhn
AU - C. R. Buhr
AB - PURPOSE: Timely identification of comorbidities is critical in sleep medicine, where large language models (LLMs) like ChatGPT are emerging as transformative tools. Here, we investigate whether the novel LLM ChatGPT o1 preview can identify individual health risks or potentially existing comorbidities from the medical data of fictitious sleep medicine patients. METHODS: We conducted a simulation-based study using 30 fictitious patients, designed to represent realistic variations in demographic and clinical parameters commonly seen in sleep medicine. Each profile included personal data (e.g., body mass index, smoking status, drinking habits), blood pressure, and routine blood test results, along with a predefined sleep medicine diagnosis. Each patient profile was evaluated independently by the LLM and a sleep medicine specialist (SMS) for identification of potential comorbidities or individual health risks. Their recommendations were compared for concordance across lifestyle changes and further medical measures. RESULTS: The LLM achieved high concordance with the SMS for lifestyle modification recommendations, including 100% concordance on smoking cessation (κ = 1; p < 0.001), 97% on alcohol reduction (κ = 0.92; p < 0.001) and endocrinological examination (κ = 0.92; p < 0.001), and 93% on weight loss (κ = 0.86; p < 0.001). However, it exhibited a tendency to over-recommend further medical measures compared to the SMS, particularly 57% concordance for cardiological examination (κ = 0.08; p = 0.28) and 33% for gastrointestinal examination (κ = 0.1; p = 0.22). CONCLUSION: Despite the obvious limitation of using fictitious data, the findings suggest that LLMs like ChatGPT have the potential to complement clinical workflows in sleep medicine by identifying individual health risks and comorbidities. As LLMs continue to evolve, their integration into healthcare could redefine the approach to patient evaluation and risk stratification. Future research should contextualize these findings within broader clinical applications, ideally testing locally run LLMs that meet data protection requirements.
AD - Sleep Medicine Center & Department of Otolaryngology, Head and Neck Surgery, University Medical Center Mainz, Mainz, Germany.; School of Medicine, University of St Andrews, St Andrews, UK.; Institute for Digital Medicine, Philipps University Marburg, University Hospital Giessen and Marburg, Marburg, Germany.
AN - 40321662
BT - Nat Sci Sleep
C5 - HIT & Telehealth; Medically Unexplained Symptoms
DO - 10.2147/nss.S510254
DP - NLM
ET - 20250429
JF - Nat Sci Sleep
LA - eng
PY - 2025
SN - 1179-1608 (Print); 1179-1608
SP - 677
EP - 688
ST - Simulation-Based Evaluation of Large Language Models for Comorbidity Detection in Sleep Medicine - a Pilot Study on ChatGPT o1 Preview
T1 - Simulation-Based Evaluation of Large Language Models for Comorbidity Detection in Sleep Medicine - a Pilot Study on ChatGPT o1 Preview
T2 - Nat Sci Sleep
TI - Simulation-Based Evaluation of Large Language Models for Comorbidity Detection in Sleep Medicine - a Pilot Study on ChatGPT o1 Preview
U1 - HIT & Telehealth; Medically Unexplained Symptoms
U3 - 10.2147/nss.S510254
VL - 17
VO - 1179-1608 (Print); 1179-1608
Y1 - 2025
ER -