Performance of large language models versus clinicians and novices in veterinary theriogenology decision support


OKUR D. T., Cengiz M., Küçükaslan İ., Peker C., ÇİPLAK A. Y., TOHUMCU V., ...

JAVMA-JOURNAL OF THE AMERICAN VETERINARY MEDICAL ASSOCIATION, vol. 264, no. 5, pp. 616-623, 2026 (SCI-Expanded, Scopus)

  • Publication Type: Article / Full Article
  • Volume: 264, Issue: 5
  • Publication Date: 2026
  • DOI: 10.2460/javma.25.09.0615
  • Journal Name: JAVMA-JOURNAL OF THE AMERICAN VETERINARY MEDICAL ASSOCIATION
  • Journal Indexes: Science Citation Index Expanded (SCI-EXPANDED), Scopus, BIOSIS, EMBASE, Public Affairs Index, DIALNET
  • Pages: 616-623
  • Keywords: clinical decision support, large language model, ChatGPT, theriogenology, dystocia
  • Atatürk University Affiliated: Yes

Abstract

Objective: To compare the clinical decision-support performance of 2 large language models (LLMs), ChatGPT-5 and ChatGPT-5 Thinking, with that of experienced clinicians and novices in veterinary theriogenology.

Methods: 15 standardized obstetric and gynecologic scenarios were independently evaluated by 2 expert clinicians, 2 novice veterinarians, and both LLMs under matched, cold-start conditions. Responses were assessed with a 5-point global quality score by a blinded expert panel.

Results: ChatGPT-5 Thinking achieved the highest overall quality ratings, followed by ChatGPT-5 and the expert clinicians. Novice veterinarians received the lowest scores. Responses generated by the LLMs were generally more consistent and complete than those of human readers.

Conclusions: Within the constraints of a simulated scenario design, LLMs, particularly ChatGPT-5 Thinking, provided clinically appropriate guidance that exceeded novice performance and approached that of expert clinicians. These findings support the potential role of LLMs as adjunct decision-support tools in time-sensitive obstetric and gynecologic cases.

Clinical Relevance: LLMs may assist clinicians and trainees in managing reproductive emergencies by offering rapid, structured, guideline-aligned recommendations. Further evaluation in real clinical settings is warranted.