# Binary problems

Binary classification is the task of predicting, for each sample, which of two categories it belongs to.

Hivemall provides several tutorials that deal with binary classification problems.

This page focuses on the evaluation of such binary classification problems. If your classifier outputs a probability rather than a 0/1 label, evaluation based on the Area Under the ROC Curve is more appropriate.

# Example

## Data

The following table shows example predictions of a binary classifier.

| truth label | predicted label | description |
|:---:|:---:|:---|
| 1 | 0 | False Negative |
| 0 | 1 | False Positive |
| 0 | 0 | True Negative |
| 1 | 1 | True Positive |
| 0 | 1 | False Positive |
| 0 | 0 | True Negative |

In this case, 1 denotes the positive label and 0 denotes the negative label. The leftmost column shows the truth labels, and the center column shows the predicted labels.

## Preliminary metrics

Some evaluation metrics are calculated based on 4 values:

• True Positive (TP): truth label is positive and predicted label is also positive
• True Negative (TN): truth label is negative and predicted label is also negative
• False Positive (FP): truth label is negative but predicted label is positive
• False Negative (FN): truth label is positive but predicted label is negative

TP and TN represent correct classifications, while FP and FN represent incorrect ones.

In this example, we can obtain those values:

• TP: 1
• TN: 2
• FP: 2
• FN: 1
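The counts above can be reproduced with a short sketch. It is plain Python rather than Hivemall code, and the helper name `confusion_counts` is hypothetical; the rows are copied from the example table.

```python
# (truth, predicted) pairs copied from the example table.
rows = [(1, 0), (0, 1), (0, 0), (1, 1), (0, 1), (0, 0)]

def confusion_counts(pairs):
    """Return (tp, tn, fp, fn), treating 1 as positive and 0 as negative."""
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

tp, tn, fp, fn = confusion_counts(rows)
print(tp, tn, fp, fn)  # 1 2 2 1
```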

If you want to know more about those metrics, Wikipedia provides more detailed information.

### Recall

Recall is the fraction of samples with a positive truth label that are also predicted as positive (the true positive rate). The value is computed by the following equation:

$\mathrm{recall} = \frac{\mathrm{\#TP}}{\mathrm{\#TP} + \mathrm{\#FN}}$

In the previous example, $\mathrm{recall} = \frac{1}{2}$.

### Precision

Precision is the fraction of samples predicted as positive whose truth label is also positive. The value is computed by the following equation:

$\mathrm{precision} = \frac{\mathrm{\#TP}}{\mathrm{\#TP} + \mathrm{\#FP}}$

In the previous example, $\mathrm{precision} = \frac{1}{3}$.
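As a quick check of the two equations, the sketch below evaluates them with exact fractions for the example counts (TP = 1, FP = 2, FN = 1). The function names are illustrative only and are not part of Hivemall.

```python
from fractions import Fraction

# Counts from the example: TP=1, FP=2, FN=1 (TN is not used by either metric).
tp, fp, fn = 1, 2, 1

def recall(tp, fn):
    # Share of truly positive samples that were predicted positive.
    return Fraction(tp, tp + fn)

def precision(tp, fp):
    # Share of predicted-positive samples that are truly positive.
    return Fraction(tp, tp + fp)

print(recall(tp, fn))     # 1/2
print(precision(tp, fp))  # 1/3
```

Both functions raise `ZeroDivisionError` when their denominator is zero, that is, when there are no positive truth labels (recall) or no positive predictions (precision).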

# Metrics

To run the metric examples below, create the following table.

    create table data as
      select 1 as truth, 0 as predicted
    union all
      select 0 as truth, 1 as predicted
    union all
      select 0 as truth, 0 as predicted
    union all
      select 1 as truth, 1 as predicted
    union all
      select 0 as truth, 1 as predicted
    union all
      select 0 as truth, 0 as predicted
    ;


## F1-score

F1-score is the harmonic mean of recall and precision. F1-score is computed by the following equation:

$\mathrm{F}_1 = 2 \frac{\mathrm{precision} * \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$

Hivemall's fmeasure function switches between micro (the default) and binary averaging according to the average argument.

### Caution

Hivemall also provides the f1score function, but it is an older way of obtaining the F1-score, and its value is based on set operations. We therefore recommend using the fmeasure function to obtain the F1-score described in this article.

### Micro average

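If micro is passed to average, recall and precision are modified to take True Negatives into account: a True Negative counts as a correct prediction of the negative label, and every misclassified sample counts as an error in both metrics. The micro F1-score is then calculated from the modified recall and precision.

$\mathrm{recall} = \frac{\mathrm{\#TP} + \mathrm{\#TN}}{\mathrm{\#TP} + \mathrm{\#FN} + \mathrm{\#TN} + \mathrm{\#FP}}$

$\mathrm{precision} = \frac{\mathrm{\#TP} + \mathrm{\#TN}}{\mathrm{\#TP} + \mathrm{\#FP} + \mathrm{\#TN} + \mathrm{\#FN}}$

In the previous example, both values equal $\frac{1 + 2}{6} = \frac{1}{2}$.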

If the average argument is omitted, fmeasure uses the default value: '-average micro'.

The following query shows an example of obtaining the F1-score. The truth and predicted values in each row must have the same type (int or boolean). If the type is int, 1 is treated as the positive label, and -1 or 0 is treated as the negative label.

    select fmeasure(truth, predicted, '-average micro') from data;


0.5

Note that, since the old f1score(truth, predicted) function simply counts the number of "matched" elements between truth and predicted, the above query is equivalent to:

    select f1score(array(truth), array(predicted)) from data;
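To illustrate the "matched" counting described above, the following plain-Python sketch treats each row as a one-element truth set and a one-element predicted set, and accumulates the counts over all rows. This is not Hivemall code: the helper name `matched_f1` and the way counts are aggregated are assumptions made for illustration.

```python
# (truth, predicted) pairs from the example table.
rows = [(1, 0), (0, 1), (0, 0), (1, 1), (0, 1), (0, 0)]

def matched_f1(pairs):
    """F1 when every row contributes one truth element and one predicted element."""
    matched = sum(1 for t, p in pairs if t == p)  # rows where the labels agree
    if matched == 0:
        return 0.0  # avoid dividing by zero when nothing matches
    total_truth = len(pairs)      # one truth element per row
    total_predicted = len(pairs)  # one predicted element per row
    precision = matched / total_predicted
    recall = matched / total_truth
    return 2 * precision * recall / (precision + recall)

print(matched_f1(rows))  # 0.5
```

Under this counting, precision and recall are both the fraction of matching rows (3 out of 6 here), so the score agrees with the 0.5 reported for the micro average.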


### Binary average

If binary is passed to average, True Negative samples are ignored when computing the F1-score, so only TP, FP and FN contribute to recall and precision.

The following query shows an example of obtaining the F1-score with binary average.

    select fmeasure(truth, predicted, '-average binary') from data;


0.4
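The 0.4 above can be checked by hand from the example counts (TP = 1, FP = 2, FN = 1). The sketch below uses exact fractions; `f1_binary` is a hypothetical helper name and the code only mirrors the equations on this page, not Hivemall's implementation.

```python
from fractions import Fraction

def f1_binary(tp, fp, fn):
    """F1 from TP, FP and FN only; true negatives are ignored."""
    precision = Fraction(tp, tp + fp)
    recall = Fraction(tp, tp + fn)
    return 2 * precision * recall / (precision + recall)

score = f1_binary(tp=1, fp=2, fn=1)
print(score, float(score))  # 2/5 0.4
```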

## F-measure

F-measure is a generalization of F1-score: the weighted harmonic mean of recall and precision. F-measure is computed by the following equation:

$\mathrm{F}_{\beta} = (1+\beta^2) \frac{\mathrm{precision} * \mathrm{recall}}{\beta^2 \mathrm{precision} + \mathrm{recall}}$

$\beta$ is a positive parameter that controls the relative weight of recall and precision. F1-score is the special case of F-measure with $\beta=1$.

As $\beta$ grows larger than 1.0, F-measure approaches recall. Conversely, as $\beta$ decreases from 1.0 toward 0, F-measure approaches precision.
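The effect of $\beta$ can be seen numerically with a small sketch of the equation above. It is plain Python, not Hivemall code; `f_beta` is an illustrative name, and the precision and recall values are those of the binary example (1/3 and 1/2).

```python
def f_beta(precision, recall, beta=1.0):
    """Weighted harmonic mean of precision and recall, as in the F-measure equation."""
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)

precision, recall = 1 / 3, 1 / 2

# Small beta pulls the score toward precision, large beta toward recall.
for beta in (0.01, 1.0, 2.0, 100.0):
    print(beta, f_beta(precision, recall, beta))
```

With these inputs, the value for a very small $\beta$ is close to the precision (1/3), and the value for a very large $\beta$ is close to the recall (1/2).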

If $\beta$ is omitted, Hivemall calculates F-measure with $\beta=1$ (equivalent to F1-score).

As with F1-score, the average argument of Hivemall's fmeasure function switches between micro (the default) and binary averaging.

The following query shows an example of obtaining the F-measure with $\beta=2$ and micro average.

    select fmeasure(truth, predicted, '-beta 2. -average micro') from data;


0.5

The following query shows an example of obtaining the F-measure with $\beta=2$ and binary average.

    select fmeasure(truth, predicted, '-beta 2. -average binary') from data;


0.45454545454545453
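As a check on this last result, the binary-average value can be worked out by hand from the example's precision ($\frac{1}{3}$) and recall ($\frac{1}{2}$), substituting $\beta=2$ into the F-measure equation:

$\mathrm{F}_{2} = (1+2^2) \frac{\frac{1}{3} \cdot \frac{1}{2}}{2^2 \cdot \frac{1}{3} + \frac{1}{2}} = 5 \cdot \frac{\frac{1}{6}}{\frac{11}{6}} = \frac{5}{11} \approx 0.4545$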