Cooperative coevolutive approach for learning ensembles of labelsets for multilabel problems with partial learning

Abstract: Multilabel classification is a supervised learning paradigm that has garnered increased attention recently. It addresses problems in which each sample simultaneously belongs to multiple binary classes, which are termed labels within this paradigm. Learning from multilabel datasets is harder than single-label classification: low label density, noisy labels, and complex relationships among labels all make the problem extremely difficult. Labelset-based approaches, where multiclass classifiers are learned for different subsets of labels, are among the top-performing methods. However, the huge search space of possible labelset combinations means that many of these models settle for suboptimal combinations. In this paper, we propose a new approach based on cooperative coevolution that aims at obtaining the best labelsets and the best combinations of them by means of two collaborating populations. The scalability of the method is achieved by partial learning of the multiclass classifiers, where only a small number of instances are used to train the models. An extensive comparison using 60 datasets and four different classification models demonstrates the advantageous performance of our approach.
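
To illustrate the labelset idea mentioned in the abstract, the following Python sketch shows how a subset of labels can be turned into a single multiclass target, and how a small random subset of instances could be drawn for partial learning. It is not the authors' implementation: the function names `encode_labelset` and `sample_indices`, the toy label matrix, and the 50% sampling fraction are all assumptions made for illustration, and the coevolutionary search itself is not shown.

```python
import random


def encode_labelset(Y, labelset):
    """Map each sample's labels, restricted to `labelset`, to a multiclass id.

    Y is a list of binary label rows; labelset is a tuple of label indices.
    Each distinct combination of values on those labels becomes one class.
    """
    classes = {}
    encoded = []
    for row in Y:
        key = tuple(row[j] for j in labelset)
        # Assign the next free class id the first time a combination is seen.
        encoded.append(classes.setdefault(key, len(classes)))
    return encoded, classes


def sample_indices(n_samples, fraction, seed=0):
    """Pick a random subset of instance indices (hypothetical partial learning)."""
    rng = random.Random(seed)
    k = max(1, int(n_samples * fraction))
    return sorted(rng.sample(range(n_samples), k))


# Toy label matrix: 4 samples, 4 labels (illustrative data only).
Y = [
    [1, 0, 1, 0],
    [0, 1, 1, 0],
    [1, 0, 1, 1],
    [1, 0, 0, 0],
]

labelset = (0, 2)  # one candidate labelset: labels 0 and 2
encoded, classes = encode_labelset(Y, labelset)
train_idx = sample_indices(len(Y), fraction=0.5)

print(encoded)    # multiclass targets for this labelset
print(train_idx)  # instances a multiclass classifier would be trained on
```

In this sketch, an ensemble would hold several such labelsets, each with its own multiclass classifier trained only on the sampled instances; how labelsets are chosen and combined is the part the paper addresses with two cooperating populations.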

Authors: N. García-Pedrajas, J. M. Cuevas-Muñoz, and A. de Haro-García.

Submitted.

Supplementary material