In the multi-label problem there is no constraint on how many of the classes an instance can be assigned to. Formally, multi-label classification is the problem of finding a model that maps inputs x to binary vectors y, assigning a value of 0 or 1 to each element (label) of y.
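As a minimal sketch of the binary-vector representation described above, the following turns sets of label names into 0/1 vectors and back. The helper names (`encode_labels`, `decode_labels`) and the example class names are assumptions made for illustration only.

```python
def encode_labels(label_sets, classes):
    """Turn each set of label names into a 0/1 vector ordered like `classes`."""
    return [[1 if c in labels else 0 for c in classes] for labels in label_sets]


def decode_labels(vectors, classes):
    """Inverse of encode_labels: recover the set of labels marked with 1."""
    return [{c for c, bit in zip(classes, row) if bit == 1} for row in vectors]


# Hypothetical example: three possible labels, three instances.
classes = ["news", "sports", "politics"]
label_sets = [{"news", "politics"}, set(), {"sports"}]

Y = encode_labels(label_sets, classes)
print(Y)  # one binary vector per instance; an instance may have no labels at all
```

Note that the second instance carries no label and the first carries two, which is exactly what a single-label multiclass encoding could not express.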

Problem transformation methods

  • Transformation into binary classification problems

    • binary relevance

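      Binary relevance amounts to independently training one binary classifier for each label. For an unseen instance, the combined model predicts every label whose classifier returns a positive result.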

      Although this method of dividing the task into multiple binary tasks may superficially resemble the one-vs.-rest (OvR) method, also called one-vs.-all (OvA), for multiclass classification, it is essentially different, because a single classifier under binary relevance deals with a single label, without any regard to other labels whatsoever.

    • classifier chain

      A classifier chain differs from binary relevance in that labels are predicted sequentially, and the outputs of all previous classifiers (i.e. positive or negative for a particular label) are fed as features to subsequent classifiers.

  • Transformation into multi-class classification problem

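    • label powerset

      Each distinct combination of labels that occurs in the training set is treated as one class of a single multi-class problem.
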
  • Ensemble methods

    A set of multi-class classifiers can be used to create a multi-label ensemble classifier. For a given example, each classifier outputs a single class (corresponding to a single label in the multi-label problem). These predictions are then combined by an ensemble method, usually a voting scheme where every class that receives a requisite percentage of votes from individual classifiers (often referred to as the discrimination threshold[8]) is predicted as a present label in the multi-label output.
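The problem transformation methods above can be sketched in a few lines of plain Python. This is an illustrative sketch, not a reference implementation: `NearestCentroid` is a toy base learner invented for the example, all function names are hypothetical, the classifier chain is trained on the true values of the earlier labels (one common choice, assumed here), and the default voting threshold of 0.5 is an arbitrary assumption.

```python
class NearestCentroid:
    """Toy base learner (hypothetical): predicts the class whose mean
    feature vector is closest to the input."""

    def fit(self, X, y):
        groups = {}
        for row, target in zip(X, y):
            groups.setdefault(target, []).append(row)
        self.centroids = {
            target: [sum(col) / len(rows) for col in zip(*rows)]
            for target, rows in groups.items()
        }
        return self

    def predict_one(self, row):
        def sq_dist(centroid):
            return sum((a - b) ** 2 for a, b in zip(row, centroid))
        return min(self.centroids, key=lambda t: sq_dist(self.centroids[t]))


def fit_binary_relevance(X, Y):
    # One independent binary classifier per label column.
    return [NearestCentroid().fit(X, [y[j] for y in Y]) for j in range(len(Y[0]))]


def predict_binary_relevance(models, row):
    return [m.predict_one(row) for m in models]


def fit_chain(X, Y):
    models = []
    for j in range(len(Y[0])):
        # Features for label j: original features plus labels 0..j-1
        # (true values at training time, an assumption of this sketch).
        X_aug = [list(x) + list(y[:j]) for x, y in zip(X, Y)]
        models.append(NearestCentroid().fit(X_aug, [y[j] for y in Y]))
    return models


def predict_chain(models, row):
    preds = []
    for m in models:
        # At prediction time, earlier predictions are appended as features.
        preds.append(m.predict_one(list(row) + preds))
    return preds


def fit_label_powerset(X, Y):
    # Each distinct label combination becomes one class of a multi-class problem.
    return NearestCentroid().fit(X, [tuple(y) for y in Y])


def predict_label_powerset(model, row):
    return list(model.predict_one(row))


def vote(single_label_votes, n_labels, threshold=0.5):
    """Each vote names one label index; keep labels whose vote share
    reaches the threshold."""
    n = len(single_label_votes)
    return [1 if single_label_votes.count(j) / n >= threshold else 0
            for j in range(n_labels)]


# Toy data: label 0 follows the first feature, label 1 follows the second.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
Y = [[0, 0], [0, 1], [1, 0], [1, 1]]
query = [0.9, 0.1]

br_models = fit_binary_relevance(X, Y)
chain_models = fit_chain(X, Y)
lp_model = fit_label_powerset(X, Y)

print(predict_binary_relevance(br_models, query))
print(predict_chain(chain_models, query))
print(predict_label_powerset(lp_model, query))
print(vote([0, 0, 1, 0], n_labels=3, threshold=0.25))
```

The three transformations share the same base learner and differ only in how the targets are presented to it: one column at a time, one column at a time with earlier labels as extra features, or the whole label vector as a single class.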

Adapted algorithms

  • ML-kNN
  • Clare
  • kernel methods for vector output
  • BP-MLL

Statistics and evaluation metrics

  • Hamming loss

    The fraction of wrongly predicted labels out of the total number of labels.

  • Precision, recall and $F_{1}$ score
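The metrics listed above can be computed directly from binary label matrices. The sketch below is a plain-Python illustration with hypothetical function names; it assumes micro-averaging for precision, recall and F1 (counts pooled over all instance-label pairs), which is only one of several averaging conventions.

```python
def hamming_loss(Y_true, Y_pred):
    """Fraction of instance-label pairs that are predicted wrongly."""
    total = sum(len(row) for row in Y_true)
    wrong = sum(t != p
                for yt, yp in zip(Y_true, Y_pred)
                for t, p in zip(yt, yp))
    return wrong / total


def micro_precision_recall_f1(Y_true, Y_pred):
    """Micro-averaged scores: pool true/false positives over all labels."""
    pairs = [(t, p) for yt, yp in zip(Y_true, Y_pred) for t, p in zip(yt, yp)]
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical predictions for two instances and three labels.
Y_true = [[1, 0, 1], [0, 1, 0]]
Y_pred = [[1, 1, 1], [0, 0, 0]]

loss = hamming_loss(Y_true, Y_pred)          # 2 wrong labels out of 6
precision, recall, f1 = micro_precision_recall_f1(Y_true, Y_pred)
print(loss, precision, recall, f1)
```

Hamming loss treats every label slot equally, so a lower value is better, whereas precision, recall and F1 focus on the positive labels only.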

Implementations and datasets

Multi-label algorithms and metrics

A list of commonly used multi-label datasets is available at the Mulan website.

Competition

Toxic Comment Classification Challenge

https://stackabuse.com/python-for-nlp-multi-label-text-classification-with-keras/

https://towardsdatascience.com/multi-label-text-classification-with-scikit-learn-30714b7819c5

Questions from Cross Validated Stack Exchange

https://blog.mimacom.com/text-classification/