Diagnostics for topic modelling. The dubious joys of making quantitative decisions in a qualitative environment

Sciandra, Andrea; Trevisani, Matilde; Tuzzi, Arjuna
2023
  • conference object

Abstract
Diagnostics is a crucial component of any topic modelling application. However, available measures seldom offer indisputable and consistent solutions. We analyse the score distribution of a large set of intrinsic measures by varying two model inputs: text length and topic number. The first aim is to identify an ideal text length (or range of lengths) by exploring per-length diagnostic distributions over the topic number. The second aim, once the optimal text length has been set, is to select the best model (or a set of candidate models) by comparing different specifications that include document metadata. We will also detect any conflict or ambivalence in the solutions produced by the different diagnostics.
Archive
https://hdl.handle.net/11368/3073738
https://www.paviauniversitypress.it/catalogo/proceedings-of-the-statistics-and-data-science-conference/6705
Rights
open access
license:creative commons
license uri:http://creativecommons.org/licenses/by-nc-sa/4.0/
Subjects
  • diagnostic measure

  • topic modelling

  • structural topic mode...

  • model selection
