Ranking the information content of distance measures

Glielmo, Aldo
•
Zeni, Claudio
•
Cheng, Bingqing
and others
Laio, Alessandro
2022
  • journal article

Journal
PNAS NEXUS
Abstract
Real-world data typically contain a large number of features that are often heterogeneous in nature, relevance, and also units of measure. When assessing the similarity between data points, one can build various distance measures using subsets of these features. Finding a small set of features that still retains sufficient information about the dataset is important for the successful application of many statistical learning approaches. We introduce a statistical test that can assess the relative information retained when using 2 different distance measures, and determine if they are equivalent, independent, or if one is more informative than the other. This ranking can in turn be used to identify the most informative distance measure and, therefore, the most informative set of features, out of a pool of candidates. To illustrate the general applicability of our approach, we show that it reproduces the known importance ranking of policy variables for Covid-19 control, and also identifies compact yet informative descriptors for atomic structures. We further provide initial evidence that the information asymmetry measured by the proposed test can be used to infer relationships of causality between the features of a dataset. The method is general and should be applicable to many branches of science.
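The comparison of two distance measures described in the abstract can be sketched in code. The abstract does not give the formula for the test, so the block below rests on an assumption: a rank-based statistic in which, for each point, we find its nearest neighbour under distance A and record the rank of that same neighbour under distance B, then average and scale by 2/N. Values near 0 would suggest that A predicts B well, and values near 1 that A carries little information about B. The function names `distance_ranks` and `information_imbalance`, and the use of Euclidean distances on feature subsets, are illustrative choices rather than the authors' implementation.

```python
import numpy as np


def distance_ranks(X):
    """Return ranks[i, j]: rank of point j among the neighbours of point i.

    Rank 1 is the nearest neighbour. Euclidean distance is an assumption
    made for this sketch; any distance measure could be substituted.
    """
    diff = X[:, None, :] - X[None, :, :]
    d = np.sqrt((diff ** 2).sum(axis=-1))
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbour
    order = np.argsort(d, axis=1)  # order[i, k] = index of k-th closest point
    n = X.shape[0]
    ranks = np.empty_like(order)
    ranks[np.arange(n)[:, None], order] = np.arange(1, n + 1)[None, :]
    return ranks


def information_imbalance(XA, XB):
    """Assumed statistic: (2 / N) * mean rank in B of the nearest neighbour in A."""
    ranks_a = distance_ranks(XA)
    ranks_b = distance_ranks(XB)
    n = XA.shape[0]
    nn_a = np.argmin(ranks_a, axis=1)  # index of the rank-1 neighbour in A
    return 2.0 / n * ranks_b[np.arange(n), nn_a].mean()


# Usage: compare a two-feature distance with a distance built on one feature.
rng = np.random.default_rng(0)
X = rng.random((300, 2))
full, subset = X, X[:, :1]

a_to_b = information_imbalance(full, subset)  # full space predicting the subset
b_to_a = information_imbalance(subset, full)  # subset predicting the full space
print(f"full -> subset: {a_to_b:.3f}, subset -> full: {b_to_a:.3f}")
```

Under this assumed statistic, the asymmetry between the two directions is what would allow a ranking: if one direction is small and the other large, the first distance is the more informative one; two small values suggest equivalence, and two large values suggest independence.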
DOI
10.1093/pnasnexus/pgac039
WOS
WOS:001063384200013
Archive
https://hdl.handle.net/20.500.11767/131770
info:eu-repo/semantics/altIdentifier/scopus/2-s2.0-85130149185
Rights
metadata only access
Subjects
  • causality detection

  • feature selection

  • information theory

  • Settore FIS/03 - Fisi...
