Cluster based oversampling for imbalanced classification

Di Credico, Gioia; Torelli, Nicola
2024

Abstract
Oversampling is a widespread remedy for class imbalance in classification problems. Some oversampling techniques generate new minority-class cases that are similar to the observed ones. ROSE (Random Over Sampling Examples) is an algorithm that generates new data in both the minority and majority classes using kernel density estimation and bootstrap resampling. In practical applications of ROSE, fine-tuning the smoothing parameter of the kernel density estimate is advisable, especially for the rare class. This is particularly true when the rare class is characterized by well-separated subgroups. We propose a new strategy, ROSEclust, which pairs density-based clustering methods with ROSE to deal with a strongly skewed class distribution and with grouping within the rare class. Evidence from simulation studies and real data applications shows that the new approach solves some issues that ROSE encounters with complex class data structures. The synthetic data distribution is closer to the original one, and the predictive performance of classification methods applied to the synthetic data is not compromised. The entire procedure is designed to be free from parameter tuning. Therefore, the ROSEclust strategy expands the applicability of ROSE and automates the data balancing step, leaving more room for the modelling step.
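The smoothed-bootstrap idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel, the Gaussian-reference (Silverman-type) bandwidth rule, the function names, and the `groups` argument (hypothetical cluster labels, assumed to come from some density-based clustering run beforehand) are all assumptions made for illustration. The sketch shows why a single bandwidth computed on a rare class with well-separated subgroups can scatter synthetic points far from the observed ones, while bandwidths computed within each subgroup keep them close.

```python
import math
import random
import statistics


def reference_bandwidths(points):
    """Per-feature Gaussian-reference bandwidths (an assumed default rule)."""
    n, d = len(points), len(points[0])
    factor = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    columns = list(zip(*points))
    # A single point has no spread, so fall back to zero smoothing.
    return [factor * (statistics.stdev(col) if n > 1 else 0.0) for col in columns]


def smoothed_bootstrap(points, n_new, rng):
    """Pick an observed point at random, then perturb it with Gaussian kernel noise."""
    h = reference_bandwidths(points)
    synthetic = []
    for _ in range(n_new):
        center = rng.choice(points)
        synthetic.append([rng.gauss(c, hq) for c, hq in zip(center, h)])
    return synthetic


def smoothed_bootstrap_by_group(points, groups, n_new, rng):
    """Same resampling, but each point is smoothed with its own group's bandwidths.

    `groups` holds one hypothetical cluster label per point.
    """
    members = {}
    for point, label in zip(points, groups):
        members.setdefault(label, []).append(point)
    bandwidths = {label: reference_bandwidths(pts) for label, pts in members.items()}
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(points))
        h = bandwidths[groups[i]]
        synthetic.append([rng.gauss(c, hq) for c, hq in zip(points[i], h)])
    return synthetic


def distance_to_nearest(point, centers):
    """Euclidean distance from a point to the closest of the given centers."""
    return min(math.dist(point, c) for c in centers)


# Toy rare class made of two well-separated subgroups.
minority = [[0.0, 0.0], [0.1, -0.1], [-0.1, 0.1],
            [10.0, 10.0], [10.1, 9.9], [9.9, 10.1]]
labels = [0, 0, 0, 1, 1, 1]
rng = random.Random(0)
pooled = smoothed_bootstrap(minority, 200, rng)
grouped = smoothed_bootstrap_by_group(minority, labels, 200, rng)
```

In this toy setting the pooled bandwidth is driven by the distance between the two subgroups, so `pooled` points spread into the empty region between them, whereas `grouped` points stay near the observed data. Comparing `distance_to_nearest(p, [[0, 0], [10, 10]])` across the two samples makes the difference visible.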
Archive
https://fvg.alb-1.adb.units.it/handle/123456789/456127
Subjects
  • Density-based cluster...
  • tuning parameters
  • resampling
  • ROSE
  • SMOTE
