Adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique algorithm for tackling binary imbalanced datasets in biomedical data classification

Overview of attention for article published in BioData Mining, December 2016

Citations

34 Dimensions

Readers on

33 Mendeley
Published in
BioData Mining, December 2016
DOI 10.1186/s13040-016-0117-1
Pubmed ID
Authors

Jinyan Li, Simon Fong, Yunsick Sung, Kyungeun Cho, Raymond Wong, Kelvin K. L. Wong

Abstract

An imbalanced dataset is a training dataset in which the interesting and uninteresting classes are represented in unequal proportions. In biomedical applications, samples from the interesting class, such as medical anomalies, positive clinical tests, and particular diseases, are often rare in a population. Because the target samples in the original dataset are few in number, a classification model induced from such training data predicts poorly, owing to insufficient training on the minority class. In this paper, we use a novel class-balancing method named adaptive swarm cluster-based dynamic multi-objective synthetic minority oversampling technique (ASCB_DmSMOTE) to solve this imbalanced dataset problem, which is common in biomedical applications. The proposed method combines under-sampling and over-sampling within a swarm optimisation algorithm and adaptively selects suitable parameters for the rebalancing algorithm to find the best solution. Compared with other versions of the SMOTE algorithm, ASCB_DmSMOTE shows significant improvements, including higher accuracy and credibility. The method combines the two rebalancing techniques: it re-allocates the majority class in detail and dynamically optimises the two parameters of SMOTE to synthesise a reasonable number of minority-class samples for each clustered sub-imbalanced dataset. The proposed method ultimately outperforms other conventional methods and attains higher credibility with even greater accuracy of the classification model.
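
The two rebalancing techniques mentioned in the abstract can be illustrated with a minimal sketch. This is plain SMOTE-style interpolation plus random under-sampling, not ASCB_DmSMOTE itself: the swarm optimisation, the clustering step, and the multi-objective parameter search are all omitted. The function names and the parameters `n_new`, `k`, and `n_keep` are hypothetical, and the assumption that the "two parameters of SMOTE" are the amount of oversampling and the number of nearest neighbours comes from the standard SMOTE formulation rather than from this page.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority samples by interpolating between a
    randomly chosen minority sample and one of its k nearest minority
    neighbours (assumed, simplified SMOTE; not the authors' code)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Other minority samples, ordered by Euclidean distance to base.
        others = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: math.dist(base, minority[j]),
        )
        neighbour = minority[rng.choice(others[:k])]
        gap = rng.random()  # position along the segment base -> neighbour
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


def random_undersample(majority, n_keep, seed=0):
    """Keep a random subset of the majority class (hypothetical helper)."""
    rng = random.Random(seed)
    return rng.sample(majority, min(n_keep, len(majority)))


# Toy data: 4 minority samples against 25 majority samples.
minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
majority = [[float(x), float(y)] for x in range(5, 10) for y in range(5, 10)]

balanced_minority = minority + smote(minority, n_new=6, k=2)
balanced_majority = random_undersample(majority, n_keep=10)
print(len(balanced_minority), len(balanced_majority))
```

In this sketch `n_new` and `k` are fixed by hand; the abstract describes choosing such parameters adaptively for each clustered sub-dataset, which would replace the hard-coded values here.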

Mendeley readers

The data shown below were compiled from readership statistics for 33 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
Unknown    33       100%

Demographic breakdown

Readers by professional status         Count    As %
Student > Ph. D. Student               7        21%
Student > Master                       7        21%
Lecturer                               4        12%
Professor                              2        6%
Student > Doctoral Student             1        3%
Other                                  2        6%
Unknown                                10       30%

Readers by discipline                  Count    As %
Computer Science                       13       39%
Engineering                            5        15%
Medicine and Dentistry                 2        6%
Decision Sciences                      1        3%
Economics, Econometrics and Finance    1        3%
Other                                  1        3%
Unknown                                10       30%