March 4, 2024

Chatbot accuracy in ophthalmology: A work in progress

ChatGPT-4 (OpenAI) had an overall “fair” performance when answering multiple-choice questions about ophthalmic cases involving multimodal imaging.

A Canadian study led by first author Andrew Mihalache, MD, from the Temerty Faculty of Medicine, University of Toronto, Toronto, Ontario, Canada, reported that ChatGPT-4 (OpenAI) had an overall “fair” performance when answering multiple-choice questions about ophthalmic cases involving multimodal imaging.¹

Correct interpretation of clinical images is central to ophthalmology and to ensuring appropriate treatment. With artificial intelligence (AI) developing rapidly worldwide, the accuracy of technologies such as chatbots is critical.

The authors commented on the importance of this technology: “Ophthalmology is reliant on effective interpretation of multimodal imaging to ensure diagnostic accuracy. Multimodal imaging enhances patient outcomes through earlier and more precise diagnoses, and more effective follow-up visits and treatments.²,³ The new release of the chatbot holds great potential in enhancing the efficiency of ophthalmic image interpretation, which may reduce the workload on clinicians, mitigate variability in interpretations and errors, and ultimately, lead to improved patient outcomes.”

They conducted a cross-sectional study to evaluate how ChatGPT-4 performed when processing imaging data. They used a publicly available dataset of ophthalmic cases from OCTCases, a medical education platform from the Department of Ophthalmology and Vision Sciences at the University of Toronto. A total of 137 cases were available, and 136 (99%) had multiple-choice questions, the authors explained. The study’s primary outcome was the accuracy of the chatbot in answering these questions related to image recognition.

Chatbot performance

Across the 136 cases that contained multiple-choice questions, the chatbot fielded 429 questions; 448 images were also included in the analysis.

“The chatbot answered 299 multiple-choice questions correctly across all cases (70%). The chatbot’s performance was better on retina questions than neuro-ophthalmology questions (77% vs 58%; difference = 18%; 95% confidence interval [CI], 7.5%-29.4%; χ²₁ = 11.4; P < 0.001),” Dr. Mihalache and colleagues reported.

They also found that the chatbot did better on non-image-based questions than on image-based questions (82% vs 65%; difference = 17%; 95% CI, 7.8%-25.1%; χ²₁ = 12.2; P < 0.001).

Finally, the chatbot showed intermediate performance on questions in ocular oncology (72% correct), pediatric ophthalmology (68% correct), uveitis (67% correct), and glaucoma (61% correct).
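For readers who want to see how figures of this kind are typically derived, the following is a minimal sketch, not the authors' analysis code. It assumes a Wald interval for the difference in proportions and a Pearson chi-square test with 1 degree of freedom and no continuity correction; the article does not state which methods the study used. The function names are hypothetical, and the only counts taken from the article are the 299 correct answers out of 429 questions.

```python
import math


def accuracy(correct: int, total: int) -> float:
    """Proportion of questions answered correctly."""
    return correct / total


def compare_proportions(x1: int, n1: int, x2: int, n2: int, z: float = 1.96):
    """Compare two accuracies given correct counts (x) and question counts (n).

    Returns (difference, (ci_low, ci_high), chi_square, p_value).
    Assumptions: Wald 95% CI for the difference; Pearson chi-square on a
    2x2 table (1 degree of freedom) without continuity correction.
    """
    p1, p2 = x1 / n1, x2 / n2
    diff = p1 - p2

    # Wald interval uses the unpooled standard error.
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    ci = (diff - z * se, diff + z * se)

    # For a 2x2 table, Pearson chi-square equals the squared z statistic
    # computed with the pooled standard error.
    pooled = (x1 + x2) / (n1 + n2)
    se_pooled = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    chi_square = (diff / se_pooled) ** 2

    # Upper-tail probability of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi_square / 2))
    return diff, ci, chi_square, p_value


# Overall accuracy reported in the article: 299 correct of 429 questions.
print(f"{accuracy(299, 429):.0%}")  # prints 70%
```

The article reports only percentages for the subgroup comparisons, not the underlying question counts, so `compare_proportions` is not called here; it would need the per-group counts from the published paper.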

The authors concluded, “In this study, the recent version of the chatbot accurately responded to most multiple-choice questions pertaining to ophthalmic cases requiring multimodal input from OCTCases, albeit performing better on questions that did not rely on ophthalmic imaging interpretation. As multimodal large language models become increasingly widespread, it remains imperative to continuously stress their appropriate use in medicine and highlight concerns surrounding confidentiality and bioethics. Future studies should continue investigating the chatbot’s ability to interpret different ophthalmic imaging modalities to gauge whether it can eventually become as accurate as specific machine learning systems in ophthalmology. Future work should also evaluate the chatbot’s ability to interpret ophthalmic images that are not publicly accessible.”

References:

1. Mihalache A, Huang RS, Popovic MM, et al. Accuracy of an artificial intelligence chatbot’s interpretation of clinical ophthalmic images. JAMA Ophthalmol. 2024; published online February 29; doi:10.1001/jamaophthalmol.2024.0017
2. Schuster AK, Wolfram C, Hudde T, et al. Impact of routinely performed optical coherence tomography examinations on quality of life in patients with retinal diseases-results from the ALBATROS data collection. J Clin Med. 2023;12(12):3881. doi:10.3390/jcm12123881
3. Huang D, Swanson EA, Lin CP, et al. Optical coherence tomography. Science. 1991;254(5035):1178-1181. doi:10.1126/science.1957169


