# Binary problems

Binary classification is the task of predicting, for each sample, which of two categories it belongs to.

Hivemall provides several tutorials that deal with binary classification problems.

This page focuses on the evaluation of such binary classification problems. If your classifier outputs a probability rather than a 0/1 label, evaluation based on the Area Under the ROC Curve would be more appropriate.

# Example

This page introduces toy example data and two preliminary metrics used in the explanations below.

## Data

The following table shows example predictions of a binary classifier.

truth label | predicted label | description |
---|---|---|
1 | 0 | False Negative |
0 | 1 | False Positive |
0 | 0 | True Negative |
1 | 1 | True Positive |
0 | 1 | False Positive |
0 | 0 | True Negative |

In this case, `1` means the positive label and `0` means the negative label.
The leftmost column shows truth labels, and the center column shows predicted labels.

## Preliminary metrics

Several evaluation metrics are calculated based on four values:

- True Positive (TP): truth label is positive and predicted label is also positive
- True Negative (TN): truth label is negative and predicted label is also negative
- False Positive (FP): truth label is negative but predicted label is positive
- False Negative (FN): truth label is positive but predicted label is negative

`TP` and `TN` represent correct classifications, and `FP` and `FN` represent incorrect ones.

For the example data above, these values are:

- TP: 1
- TN: 2
- FP: 2
- FN: 1

If you want to know more about these metrics, Wikipedia provides more detailed information.
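For illustration, these four counts can be obtained with a plain aggregation query. This is a sketch that assumes the example rows are loaded into the `data` table created in the Metrics section below; it uses no Hivemall function.

```
-- count TP/TN/FP/FN from (truth, predicted) pairs
select
  sum(if(truth = 1 and predicted = 1, 1, 0)) as tp, -- 1
  sum(if(truth = 0 and predicted = 0, 1, 0)) as tn, -- 2
  sum(if(truth = 0 and predicted = 1, 1, 0)) as fp, -- 2
  sum(if(truth = 1 and predicted = 0, 1, 0)) as fn  -- 1
from data;
```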

### Recall

Recall is the fraction of truly positive samples that the classifier also predicts as positive. The value is computed by the following equation:

$\mathrm{recall} = \frac{\mathrm{\#TP}}{\mathrm{\#TP} + \mathrm{\#FN}}$

In the previous example, $\mathrm{recall} = \frac{1}{2}$.

### Precision

Precision is the fraction of positively predicted samples whose truth label is also positive. The value is computed by the following equation:

$\mathrm{precision} = \frac{\mathrm{\#TP}}{\mathrm{\#TP} + \mathrm{\#FP}}$

In the previous example, $\mathrm{precision} = \frac{1}{3}$.
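As a sketch of how these two values arise from the example data, the following plain SQL query (again assuming the `data` table created in the next section, with no Hivemall function involved) computes recall and precision directly:

```
select
  -- recall = TP / (TP + FN) = 1 / 2
  sum(if(truth = 1 and predicted = 1, 1, 0)) / sum(if(truth = 1, 1, 0)) as recall,
  -- precision = TP / (TP + FP) = 1 / 3
  sum(if(truth = 1 and predicted = 1, 1, 0)) / sum(if(predicted = 1, 1, 0)) as precision
from data;
```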

# Metrics

To run the following metric examples, please create the table below.

```
create table data as
select 1 as truth, 0 as predicted
union all
select 0 as truth, 1 as predicted
union all
select 0 as truth, 0 as predicted
union all
select 1 as truth, 1 as predicted
union all
select 0 as truth, 1 as predicted
union all
select 0 as truth, 0 as predicted
;
```

## F1-score

F1-score is the harmonic mean of recall and precision, computed by the following equation:

$\mathrm{F}_1 = 2 \frac{\mathrm{precision} * \mathrm{recall}}{\mathrm{precision} + \mathrm{recall}}$

Hivemall's `fmeasure` function provides an `average` option that switches between `micro` (default) and `binary` averaging.

### Caution

Hivemall also provides an `f1score` function, but it is an older function for obtaining the F1-score whose value is based on set operations. We therefore recommend using the `fmeasure` function to get the F1-score as described in this article.


### Micro average

If `micro` is passed to `average`, recall and precision are modified to also count True Negatives, and the micro F1-score is calculated from these modified recall and precision values:

$\mathrm{recall} = \frac{\mathrm{\#TP} + \mathrm{\#TN}}{\mathrm{\#TP} + \mathrm{\#FN} + \mathrm{\#TN} + \mathrm{\#FP}}$

$\mathrm{precision} = \frac{\mathrm{\#TP} + \mathrm{\#TN}}{\mathrm{\#TP} + \mathrm{\#FP} + \mathrm{\#TN} + \mathrm{\#FN}}$
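Note that both denominators cover all samples, so micro recall and micro precision coincide with accuracy. For the example data (TP=1, TN=2, FP=2, FN=1), $\mathrm{recall} = \mathrm{precision} = \frac{1 + 2}{6} = 0.5$, and hence the micro F1-score is also $0.5$.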

If the `average` argument is omitted, `fmeasure` uses the default value `'-average micro'`.

The following query shows how to obtain the F1-score.
All row values must have the same type (`int` or `boolean`).
If the type is `int`, `1` is treated as the positive label, and `-1` or `0` as the negative label.

```
select fmeasure(truth, predicted, '-average micro') from data;
```

0.5
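Since `boolean` labels are also accepted, as noted above, an equivalent query can convert the `int` columns to `boolean` on the fly. This is a sketch based on that type support:

```
select fmeasure(truth = 1, predicted = 1, '-average micro') from data;
```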

It should be noted that, since the old `f1score(truth, predicted)` function simply counts the number of "matched" elements between `truth` and `predicted`, the above query is equivalent to:

```
select f1score(array(truth), array(predicted)) from data;
```

### Binary average

If `binary` is passed to `average`, True Negative samples are ignored when computing the F1-score.

The following query shows how to obtain the F1-score with binary average.

```
select fmeasure(truth, predicted, '-average binary') from data;
```

0.4
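This matches the hand computation from the recall and precision obtained earlier: $\mathrm{F}_1 = 2 \cdot \frac{\frac{1}{3} \cdot \frac{1}{2}}{\frac{1}{3} + \frac{1}{2}} = \frac{1/3}{5/6} = 0.4$.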

## F-measure

F-measure is a generalization of F1-score: the weighted harmonic mean of recall and precision. F-measure is computed by the following equation:

$\mathrm{F}_{\beta} = (1+\beta^2) \frac{\mathrm{precision} * \mathrm{recall}}{\beta^2 \mathrm{precision} + \mathrm{recall}}$

$\beta$ is a parameter that controls the weight of recall relative to precision, and F1-score is the special case of F-measure given $\beta=1$.

If $\beta$ is larger than `1.0`, F-measure weights recall more heavily and approaches recall as $\beta$ grows.
On the other hand,
if $\beta$ is a positive value smaller than `1.0`, F-measure weights precision more heavily and approaches precision as $\beta$ approaches zero.

If $\beta$ is omitted, Hivemall calculates F-measure with $\beta=1$, i.e., the F1-score.

As with F1-score, Hivemall's `fmeasure` function provides the `average` option to switch between `micro` (default) and `binary`.

The following query shows how to obtain F-measure with $\beta=2$ and micro average.

```
select fmeasure(truth, predicted, '-beta 2. -average micro') from data;
```

0.5
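Because micro recall and micro precision are identical (both equal accuracy, $0.5$ here), their weighted harmonic mean is the same value for any $\beta$; that is why the micro result does not change between $\beta=1$ and $\beta=2$.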

The following query shows how to obtain F-measure with $\beta=2$ and binary average.

```
select fmeasure(truth, predicted, '-beta 2. -average binary') from data;
```

0.45454545454545453
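Again, this agrees with the hand computation: $\mathrm{F}_2 = (1+2^2) \cdot \frac{\frac{1}{3} \cdot \frac{1}{2}}{2^2 \cdot \frac{1}{3} + \frac{1}{2}} = \frac{5/6}{11/6} = \frac{5}{11} \approx 0.4545$.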
