PILs of Knowledge: A Synthetic Benchmark for Evaluating Question Answering Systems in Healthcare

Lunardi, Riccardo • Soprano, Michael • Coppola, Paolo • (others) • Roitero, Kevin
2025
  • conference object

Abstract
Patient Information Leaflets (PILs) provide essential information about medication usage, side effects, precautions, and interactions, making them a valuable resource for Question Answering (QA) systems in healthcare. However, no dedicated benchmark currently exists to evaluate QA systems specifically on PILs, limiting progress in this domain. To address this gap, we introduce a fact-supported synthetic benchmark composed of multiple-choice questions and answers generated from real PILs. We construct the benchmark using a fully automated pipeline that leverages multiple Large Language Models (LLMs) to generate diverse, realistic, and contextually relevant question-answer pairs. The benchmark is publicly released as a standardized evaluation framework for assessing the ability of LLMs to process and reason over PIL content. To validate its effectiveness, we conduct an initial evaluation with state-of-the-art LLMs, showing that the benchmark presents a realistic and challenging task, making it a valuable resource for advancing QA research in the healthcare domain.
DOI
10.1145/3726302.3730283
Archive
https://hdl.handle.net/11390/1309164
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-105011828083
https://dl.acm.org/doi/10.1145/3726302.3730283
Rights
open access
license:creative commons
license uri:http://creativecommons.org/licenses/by/4.0/
Subjects
  • benchmarks, synthetic...
