  • BACKGROUND
    • Large language model (LLM)-based artificial intelligence (AI) chatbots, such as ChatGPT and Gemini, have become widespread sources of information. Few studies have evaluated LLM responses to questions about orthopaedic conditions, especially osteochondritis dissecans (OCD).
  • HYPOTHESIS
    • ChatGPT and Gemini will generate accurate responses that align with American Academy of Orthopaedic Surgeons (AAOS) clinical practice guidelines.
  • STUDY DESIGN
    • Cohort study.
  • LEVEL OF EVIDENCE
    • Level 2.
  • METHODS
    • LLM prompts were created from the AAOS clinical practice guidelines on OCD diagnosis and treatment, and responses from ChatGPT and Gemini were collected. Seven fellowship-trained orthopaedic surgeons rated the LLM responses on a 5-point Likert scale in 6 categories: relevance, accuracy, clarity, completeness, evidence-based, and consistency.
  • RESULTS
    • ChatGPT and Gemini exhibited strong performance across all criteria. ChatGPT mean scores were highest for clarity (4.771 ± 0.141 [mean ± SD]). Gemini scored highest for relevance and accuracy (4.286 ± 0.296 and 4.286 ± 0.273, respectively). For both LLMs, the lowest scores were for evidence-based responses (ChatGPT, 3.857 ± 0.352; Gemini, 3.743 ± 0.353). For all other categories, ChatGPT mean scores were higher than Gemini scores. The consistency of responses between the 2 LLMs was rated at an overall mean of 3.486 ± 0.371. Inter-rater reliability ranged from 0.4 to 0.67 (mean, 0.59); it was highest (0.67) in the accuracy category and lowest (0.4) in the consistency category.
  • CONCLUSION
    • The performance of both LLMs highlights their potential to provide clinically relevant and accurate answers to questions about the diagnosis and treatment of OCD, and suggests that ChatGPT may be better suited to this purpose than Gemini. Further evaluation of LLM information regarding other orthopaedic procedures and conditions may be necessary before LLMs can be recommended as an accurate source of orthopaedic information.
  • CLINICAL RELEVANCE
    • Little is known about the ability of AI to provide answers regarding OCD.
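
The mean ± SD summaries reported under RESULTS could be computed along the following lines. This is only a minimal sketch: the ratings shown are hypothetical placeholders (not data from the study), the helper names `summarize` and `format_summary` are illustrative, and the use of the sample standard deviation across the 7 raters is an assumption, since the abstract does not state how the SD was calculated.

```python
from statistics import mean, stdev

# Hypothetical ratings, NOT data from the study: one list per category,
# holding one 5-point Likert score from each of 7 raters.
ratings = {
    "relevance": [5, 4, 4, 5, 4, 4, 5],
    "clarity": [5, 5, 4, 5, 5, 4, 5],
}


def summarize(scores):
    """Return (mean, sample SD) for one category's Likert scores."""
    if any(s < 1 or s > 5 for s in scores):
        raise ValueError("Likert scores must lie between 1 and 5")
    # Assumption: sample SD (n - 1 denominator) across raters.
    return mean(scores), stdev(scores)


def format_summary(scores):
    """Format one category in the 'mean ± SD' style used in RESULTS."""
    m, sd = summarize(scores)
    return f"{m:.3f} ± {sd:.3f}"


# Map each category name to its formatted summary string.
summary = {category: format_summary(scores) for category, scores in ratings.items()}
```

Each entry of `summary` is a string in the same format as the scores quoted in RESULTS. The inter-rater reliability statistic is not covered here because the abstract does not name the measure that was used.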