Tabu search approaches for solving the two-group classification problem

S. Hanafi and N. Yanev

The classification problem consists in classifying a set of observations correctly by assigning each to one of a number of mutually exclusive pre-defined groups, according to the observation’s attributes. Statistical and mathematical programming communities have extensively studied this problem. In statistical approaches, it is also called the discriminant analysis problem. When the population distributions are multivariate normal, the classification rules can be derived from Fisher's discriminant function, see Fisher (1936). Discriminant analysis has been applied in a variety of disciplines including biology, artificial intelligence, social and administrative sciences, economics, marketing. In the last decade, Tabu Search (TS) has been applied to several classification problems.

The two-group classification problem consists in determining a plane that separates the groups optimally. The objective is to minimize the number of misclassified points. In Hanafi and Yanev (2009), we develop a tabu search (TS) heuristic for solving this NP-hard problem. The tabu search approach proposed is based on a non-standard formulation using linear system infeasibility. We also propose supplementary new intensification phases based on surrogate constraints. The results of the conducted computational experiments show that our TS algorithms produce solutions very close to the optimum and require significantly less computational effort than the MIP approaches, making it a valuable alternative. The results of this comparison show that our tabu search approach is quite robust. It is able to solve very large instances of the discriminant analysis problem within a reasonable amount of CPU time and functions well regardless to the number of misclassifications. Our TS algorithms function as Scatter Search methods with adaptive memory programming where at each iteration we maintain a set of separating hyperplanes induced by the current basis. Moreover, the tabu search procedures can be extended in a natural way to the general classification problem, which consists of generating more than one separating hyperplanes.

References

Freed N. and Glover F., (1981). Simple but powerful goal programming models for discriminant problems. European Journal of Operational Research, 7, 44-60.

A. Fréville, S. Hanafi, F. Semet, N. Yanev (2010). A tabu search with an oscillation strategy for the discriminant analysis problem. Computers & OR 37(10): 1688-1696 (2010)

Glen, J.J. (2005). Mathematical programming models for piecewise-linear discriminant analysis. Journal of the Operational Research Society, Vol. 56, pp. 331-341.

Glover F. and M. Laguna, (1997). Tabu Search, (Dordrecht: Kluwer Academic Publishers.

Glover, F. (1990). Improved Linear Programming Models for Discriminant Analysis. Decision Sciences, Vol, 21, No. 4, pp. 771-785.

Glover, F. (1993). Improved Linear and Integer Programming Models for Discriminant Analysis. Creative and Innovative Approaches to the Science of Management, RGK Foundation Press, pp. 187-215.

Glover, F. (2006). Improved classification and discrimination by successive hyperplane and multi-hyperplane separation. (Working Paper), University of Colorado at Boulder.

F. Glover, (1989). Tabu Search-Part I, ORSA Journal on Computing, 1(3): 190-206.

F. Glover, (1990). Tabu Search-Part II, ORSA Journal on Computing, 2(1): 4-32.

S. Hanafi, N. Yanev (2009). Tabu search approaches for solving the two-group classification problem. Annals of Operations Research, doi 10.1007/s10479-009-0581-9.

Mangasarian O.L., (1996). Mathematical programming in data mining, Technical Report 96-5, Computer Science Department, University of Wisconsin, USA.

Marcotte P. and G. Savard, (1992). Novel approach to the discrimination problem, Operations Research, 36, 517-544.

Yanev N. and S. Balev, (1999). A combinatorial approach to the classification problem, European Journal of Operational Research 15, 339-350.