Transfer learning for class imbalance problems with inadequate data

Overview of attention for article published in Knowledge and Information Systems, August 2015

Citations

86 Dimensions

Readers on

99 Mendeley

Title: Transfer learning for class imbalance problems with inadequate data
Published in: Knowledge and Information Systems, August 2015
DOI: 10.1007/s10115-015-0870-3
Authors: Samir Al-Stouhi, Chandan K. Reddy

Abstract

A fundamental problem in data mining is building robust classifiers in the presence of skewed data distributions. Class imbalance classifiers are trained specifically for datasets with skewed distributions. Existing methods assume an ample supply of training examples as a fundamental prerequisite for constructing an effective classifier. However, when sufficient data is not readily available, developing a representative classification algorithm becomes even more difficult because of the unequal distribution between classes. We provide a unified framework that can potentially take advantage of auxiliary data through a transfer learning mechanism while building a robust classifier that tackles this imbalance when only a few training samples are available in a particular target domain of interest. Transfer learning methods use auxiliary data to augment learning when training examples are insufficient. In this paper, we develop a method optimized to simultaneously augment the training data and induce balance in skewed datasets. We propose a novel boosting-based instance-transfer classifier with a label-dependent update mechanism that simultaneously compensates for class imbalance and incorporates samples from an auxiliary domain to improve classification. We provide theoretical and empirical validation of our method and apply it to healthcare and text classification applications.

Mendeley readers


The data shown below were compiled from readership statistics for 99 Mendeley readers of this research output.

Geographical breakdown

Country    Count    As %
China      1        1%
Unknown    98       99%

Demographic breakdown

Readers by professional status    Count    As %
Student > Ph. D. Student          34       34%
Researcher                        7        7%
Student > Master                  6        6%
Student > Bachelor                6        6%
Student > Doctoral Student        5        5%
Other                             13       13%
Unknown                           28       28%

Readers by discipline                           Count    As %
Computer Science                                41       41%
Engineering                                     11       11%
Decision Sciences                               3        3%
Economics, Econometrics and Finance             2        2%
Biochemistry, Genetics and Molecular Biology    2        2%
Other                                           9        9%
Unknown                                         31       31%