Classification of Imbalanced Data with Multilayer Perceptrons
Sang-Hoon Oh*
*Mokwon University, Korea
Abstract / The problem of classifying imbalanced data arises in a wide range of applications. To classify imbalanced data with multilayer perceptrons, this paper proposes a new error function which intensifies weight updates for patterns in the minority class and weakens weight updates for patterns in the majority class. Through simulations on mammography and thyroid data sets, the effectiveness of the proposed method is verified.

Keywords: imbalanced data; multilayer perceptrons; error function; error back-propagation algorithm.

1.  Introduction

Conventional pattern classifiers are based on the assumption that class priors are relatively balanced and that the error costs of all classes are equal[1]. These classifiers perform poorly in a wide range of applications such as credit assessment[2], gene ontology[3], remote sensing[4], and bio-medical diagnosis[5]. This is due to the imbalance of the data; that is, the class of interest is rare within the whole data population.

There have been many attempts to resolve imbalanced data problems. In the data-level approach, the data distribution is re-balanced through under-sampling[5,6], over-sampling[7], or a combination of the two[7]. At the algorithmic level, existing classifier learning algorithms are modified to strengthen learning with respect to the minority class[4,8]. Cost-sensitive learning and threshold-moving methods can also be categorized under the algorithmic-level approach[6,9]. Besides these two approaches, ensemble schemes have shown performance superior to that of each individual classifier[4,10].

In this paper, we focus on the algorithmic approach, since a better classifier can in turn be applied within data-level or ensemble approaches[8]. More specifically, we propose an error function for the error back-propagation (EBP) algorithm of multilayer perceptrons (MLPs) which intensifies weight updates for patterns of the minority class and weakens weight updates for patterns of the majority class.

2.  Error back-propagation of multilayer perceptrons

Consider an MLP consisting of $N$ input, $H$ hidden, and $M$ output nodes. When the $p$-th training pattern $\mathbf{x}^{(p)}=[x_1^{(p)},x_2^{(p)},\ldots,x_N^{(p)}]$ is presented to the MLP, the $j$-th hidden node is given by

$$h_j^{(p)} = f\big(\hat{h}_j^{(p)}\big), \quad \text{where} \quad \hat{h}_j^{(p)} = \sum_{i=0}^{N} w_{ji}\, x_i^{(p)}. \qquad (1)$$

Here, $f(u)=\dfrac{1-e^{-u}}{1+e^{-u}}$ is the activation function, $x_0^{(p)}=1$, and $w_{ji}$ denotes the weight connecting the $i$-th input to $h_j^{(p)}$. The $k$-th output node is

$$y_k^{(p)} = f\big(\hat{y}_k^{(p)}\big), \quad \text{where} \quad \hat{y}_k^{(p)} = \sum_{j=0}^{H} v_{kj}\, h_j^{(p)}. \qquad (2)$$

Also, $h_0^{(p)}=1$ and $v_{kj}$ denotes the weight connecting $h_j^{(p)}$ to $y_k^{(p)}$.
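
As a concrete illustration, the following NumPy sketch implements the forward pass of (1) and (2); the function names and matrix shapes are our own conventions, and the bipolar sigmoid is written via the identity $(1-e^{-u})/(1+e^{-u})=\tanh(u/2)$.

```python
import numpy as np

def f(u):
    # Bipolar sigmoid (1 - e^{-u}) / (1 + e^{-u}), identical to tanh(u / 2).
    return np.tanh(u / 2.0)

def forward(x, W, V):
    # x: (N,) input pattern; W: (H, N+1) weights w_ji incl. bias column;
    # V: (M, H+1) weights v_kj incl. bias column.
    x = np.concatenate(([1.0], x))   # x_0 = 1
    h = f(W @ x)                     # Eq. (1): hidden-node outputs
    h = np.concatenate(([1.0], h))   # h_0 = 1
    y = f(V @ h)                     # Eq. (2): output-node outputs
    return h, y
```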

Let the desired output vector corresponding to the training pattern $\mathbf{x}^{(p)}$ be $\mathbf{t}^{(p)}=[t_1^{(p)},t_2^{(p)},\ldots,t_M^{(p)}]$, where the class from which $\mathbf{x}^{(p)}$ originates is coded as follows:

$$t_k^{(p)} = \begin{cases} +1, & \text{if } \mathbf{x}^{(p)} \text{ belongs to class } k,\\ -1, & \text{otherwise.} \end{cases} \qquad (3)$$

We call $y_k$ the target node of class $k$. The conventional error function for $P$ training patterns is

$$E = \frac{1}{2}\sum_{p=1}^{P}\sum_{k=1}^{M}\big(t_k^{(p)} - y_k^{(p)}\big)^2, \qquad (4)$$

and the EBP algorithm minimizes $E$ through iterative updates of the weights in the negative gradient direction of the error function[11], that is,

$$\Delta w = -\eta\,\frac{\partial E}{\partial w}, \qquad (5)$$

where $\eta>0$ is the learning rate.
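
Continuing the sketch above, a minimal rendering of the error (4) and the update (5); the function names and the learning-rate value are illustrative assumptions.

```python
def conventional_error(Y, T):
    # Eq. (4): squared error summed over P patterns and M output nodes.
    return 0.5 * np.sum((T - Y) ** 2)

def ebp_update(w, grad_E, eta=0.05):
    # Eq. (5): one gradient-descent step with learning rate eta.
    return w - eta * grad_E
```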

Let us assume that there are two classes: a minority class with $P_1$ training patterns and a majority class with $P_2$ training patterns, where $P_1 \ll P_2$ and $P_1 + P_2 = P$. Since the EBP algorithm updates weights based on the negative gradient of the error function, training patterns of the majority class dominate training, and this causes the boundary of the majority class to extend into the region of the minority class[5]. As a result, MLPs show poor performance on imbalanced data[4].

3.  Error function for imbalanced data problems

Under the assumption that targets are coded as in (3), we propose a new error function which intensifies weight updates for training patterns of the minority class and weakens weight updates for training patterns of the majority class. The proposed error function is given by

$$E = \sum_{\mathbf{x}^{(p)} \in C_1}\, \sum_{k=1}^{M} \frac{\big|t_k^{(p)} - y_k^{(p)}\big|^{n}}{n} \;+\; \sum_{\mathbf{x}^{(p)} \in C_2}\, \sum_{k=1}^{M} \frac{\big|t_k^{(p)} - y_k^{(p)}\big|^{m}}{m}, \qquad (6)$$

where $C_1$ denotes the minority class, $C_2$ the majority class, and $n$ and $m$ ($n \le m$) are positive integers. When $n = m$, the proposed error function is equal to the $n$-th order error function, which dramatically reduces the incorrect saturation of output nodes[12]. Since the targets are $\pm 1$ and $|t_k^{(p)} - y_k^{(p)}| < 1$ when an output is near its target, the larger exponent $m$ shrinks the error derivative for majority-class patterns, so their weight updates are weakened relative to those of the minority class.
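
The following sketch, continuing the example above, implements (6) and the resulting output-node delta; the exponent values n = 2 and m = 4 and the `minority` mask are illustrative assumptions, not values prescribed here.

```python
def proposed_error(Y, T, minority, n=2, m=4):
    # Eq. (6): exponent n for minority-class patterns, m (>= n) for majority.
    # Y, T: (P, M) outputs and targets; minority: (P,) boolean mask for C_1.
    e = np.abs(T - Y)
    return np.where(minority[:, None], e**n / n, e**m / m).sum()

def output_delta(Y, T, minority, n=2, m=4):
    # -dE/dy from Eq. (6): |t - y|^(q - 1) * sign(t - y), with q = n or m.
    # With +-1 targets, |t - y| < 1 near the target, so the larger exponent m
    # shrinks the delta and weakens majority-class weight updates.
    q = np.where(minority[:, None], n, m)
    e = T - Y
    return np.abs(e) ** (q - 1) * np.sign(e)
```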

4. Simulations

The “Ann-thyroid”[13] and “Mammography”[7] problems are used to verify the effectiveness of the proposed method. Since “Ann-thyroid” is a three-class problem, it is converted into the two-class problems “Ann-thyroid13” and “Ann-thyroid23”[5]. Here, class 1 (respectively class 2) is the minority class and class 3 is the majority class. For the “Mammography” data set, we used one-out-of-five cross-validation, since a separate test set is not provided.
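
As a sketch of this setup, the class conversion and the cross-validation split might be prepared as follows; the function names, label encoding, and random seed are assumptions for illustration, continuing the NumPy example above.

```python
def thyroid_to_binary(X, labels, minority_label):
    # Build "Ann-thyroid13" (minority_label=1) or "Ann-thyroid23" (=2):
    # keep the chosen minority class against majority class 3.
    keep = (labels == minority_label) | (labels == 3)
    return X[keep], np.where(labels[keep] == minority_label, 1.0, -1.0)

def five_fold_splits(P, seed=0):
    # Index partitions for the one-out-of-five cross-validation on "Mammography".
    rng = np.random.default_rng(seed)
    return np.array_split(rng.permutation(P), 5)
```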

Through simulations on these problems, we verified that the proposed method improves the classification performance of MLPs on imbalanced data.

5. Conclusion

To classify imbalanced data with MLPs, this paper proposed a new error function for the EBP algorithm which regulates weight updates according to whether a pattern belongs to the minority or the majority class. Through simulations on the “Ann-thyroid” and “Mammography” data sets, we verified that the proposed error function improves classification performance.

6. References

[1] F. Provost and T. Fawcett, “Robust classification for imprecise environments”, Machine Learning, vol.42, pp.203-231, 2001.

[2] Y.-M. Huang, C.-M. Hung, and H. C. Jiau, “Evaluation of neural networks and data mining methods on a credit assessment task for class imbalanced problem”, Nonlinear Analysis: Real World Applications, vol. 7, pp. 720-747, 2006.

[3] R. Bi, Y. Zhou, F. Lu, and W. Wang, “Predicting gene ontology functions based on support vector machines and statistical significance estimation”, Neurocomputing, vol. 70, pp. 718-725, 2007.

[4] L. Bruzzone and S. B. Serpico, “Classification of imbalanced remote-sensing data by neural networks”, Pattern Recognition Letters, vol. 18, pp. 1323-1328, 1997.

[5] P. Kang and S. Cho, “EUS SVMs: Ensemble of under-sampled SVMs for data imbalance problems”, Proc. ICONIP’06, Springer, pp. 837-846, 2006.

[6] Z.-H. Zhou and X.-Y. Liu, “Training cost-sensitive neural networks with methods addressing the class imbalance problem”, IEEE Trans. Knowledge and Data Eng., vol. 18, pp. 63-77, 2006.

[7] N. V. Chawla, K. W. Bowyer, L. O. Hall, and W. P. Kegelmeyer, “SMOTE: Synthetic minority over-sampling technique”, Journal of Artificial Intelligence Research, vol. 16, pp. 321-351, 2002.

[8] S.-H. Oh, “Error back-propagation algorithm for classification of imbalanced data”, submitted to Neurocomputing.

[9] H. Zhao, “Instance weighting versus threshold adjusting for cost-sensitive classification”, Knowl. Inf. Syst., vol. 15, pp. 321-334, 2008.

[10] Y. Sun, M. S. Kamel, A. K. C. Wong, and Y. Wang, “Cost-sensitive boosting for classification of imbalanced data”, Pattern Recognition, vol. 40, pp. 3358-3378, 2007.

[11] D. E. Rumelhart and J. L. McClelland, Parallel Distributed Processing, MIT Press, Cambridge, MA, 1986.

[12] S.-H. Oh, “Improving the error back-propagation algorithm with a modified error function”, IEEE Trans. Neural Networks, vol. 8, pp. 799-803, 1997.

[13] A. Frank and A. Asuncion, UCI Machine Learning Repository, http://archive.ics.uci.edu/ml, University of California, Irvine, School of Information and Computer Sciences, 2010.
