
Assessment of ChatGPT performance in orbital MRI reporting with multimetric evaluation of transformer based language models

Tel A.
•
Bolognesi F.
•
Michelutti L.
(others)
Robiony M.
2025
  • journal article

Journal
SCIENTIFIC REPORTS
Abstract
Transformer-based large language models (LLMs), such as ChatGPT-4, are increasingly used to streamline clinical practice, of which radiology reporting is a prominent aspect. However, their performance in interpreting complex anatomical regions from MRI data remains largely unexplored. This study investigates the capability of ChatGPT-4 to produce clinically reliable reports based on orbital MR images, applying a multimetric, quantitative evaluation framework in 25 patients with orbital lesions. Due to inherent limitations of the current version of GPT-4, the model was not fed MR volumetric data, but key 2D images only. For each case, ChatGPT-4 generated a free-text report, which was then compared to the corresponding ground-truth report authored by a board-certified radiologist. Evaluation included established NLP metrics (BLEU-4, ROUGE-L, BERTScore), clinical content recognition scores (RadGraph F1, CheXbert), and expert human judgment. Among the automated metrics, BERTScore demonstrated the highest language similarity, while RadGraph F1 best captured clinical entity recognition. Clinician assessment revealed moderate agreement with the LLM outputs, with performance decreasing in complex or infiltrative cases. The study highlights both the promise and current limitations of LLMs in radiology, particularly regarding their inability to process volumetric data and maintain spatial consistency. These findings suggest that while LLMs may assist in structured reporting, effective integration into diagnostic imaging workflows will require coupling with advanced vision models capable of full 3D interpretation.
DOI
10.1038/s41598-025-19669-1
WOS
WOS:001593346200027
Archive
https://hdl.handle.net/11390/1318229
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-105018648423
https://ricerca.unityfvg.it/handle/11390/1318229
Rights
open access
license:creative commons
license uri:http://creativecommons.org/licenses/by-nc-nd/4.0/
Subjects
  • Artificial intelligen...

  • ChatGPT

  • NLP metric

  • Orbital MRI

  • Radiology report gene...

  • Transformer-based lan...
