TY - JOUR
T1 - Assessing the utility of ChatGPT as an artificial intelligence‐based large language model for information to answer questions on myopia
AU - Biswas, Sayantan
AU - Logan, Nicola S.
AU - Davies, Leon N.
AU - Sheppard, Amy L.
AU - Wolffsohn, James S.
N1 - Copyright © 2023 The Authors. Ophthalmic and Physiological Optics published by John Wiley & Sons Ltd on behalf of College of Optometrists.
This is an open access article under the terms of the Creative Commons Attribution License, which permits use, distribution and reproduction in any medium, provided the original work is properly cited.
PY - 2023/7/21
Y1 - 2023/7/21
AB - Purpose: ChatGPT is an artificial intelligence language model that uses natural language processing to simulate human conversation. It has seen a wide range of applications, including healthcare education, research and clinical practice. This study evaluated the accuracy and quality of the information ChatGPT provides in answer to questions on myopia. Methods: A series of 11 questions (across nine categories: general summary, cause, symptom, onset, prevention, complication, natural history, treatment and prognosis) was generated for this cross-sectional study. Each question was entered five times into fresh ChatGPT sessions (free from the influence of prior questions). The responses were evaluated by a five-member team of optometry teaching and research staff. The evaluators individually rated the accuracy and quality of responses on a Likert scale, where a higher score indicated greater quality of information (1: very poor; 2: poor; 3: acceptable; 4: good; 5: very good). Median scores for each question were estimated and compared between evaluators. Agreement between the five evaluators and the reliability statistics of the questions were estimated. Results: Of the 11 questions on myopia, ChatGPT provided good quality information (median score: 4.0) for 10 questions and acceptable responses (median score: 3.0) for one question. Of 275 responses in total, 66 (24%) were rated very good, 134 (49%) good, 60 (22%) acceptable, 10 (3.6%) poor and 5 (1.8%) very poor. A Cronbach's α of 0.807 indicated a good level of internal consistency between test items. Evaluators' ratings demonstrated ‘slight agreement’ (Fleiss's κ = 0.005), with a significant difference in scoring among the evaluators (Kruskal–Wallis test, p < 0.001). Conclusion: Overall, ChatGPT generated good quality information in answer to questions on myopia.
Although ChatGPT shows great potential for rapidly providing information on myopia, the presence of inaccurate responses demonstrates that further evaluation and awareness of its limitations are crucial to avoid potential misinterpretation.
KW - ChatGPT
KW - artificial intelligence
KW - chatbot response
KW - myopia
KW - patient information
UR - https://onlinelibrary.wiley.com/doi/10.1111/opo.13207
UR - http://www.scopus.com/inward/record.url?scp=85165477915&partnerID=8YFLogxK
U2 - 10.1111/opo.13207
DO - 10.1111/opo.13207
M3 - Article
SN - 0275-5408
JO - Ophthalmic and Physiological Optics
JF - Ophthalmic and Physiological Optics
ER -