What is Multiclass Classification?
Multiclass Classification is a type of Supervised Learning used to predict which of several possible categories an observation belongs to.
Like regression and binary classification, it follows the same iterative process of training, validating, and evaluating a model, with a portion of the data held back for validation.
Examples
- Predicting email categories (spam / work email / promotional email)
- Predicting disease types (flu / cold / allergy)
- Predicting penguin species (Adelie / Chinstrap / Gentoo)
Example: Penguin Species Classification
We observe the **flipper length of the penguins (x)** and use it to predict the **species of the penguins (y)**.
Species Codes
- 0: Adelie
- 1: Chinstrap
- 2: Gentoo
Sample Data
| Flipper Length (x) | Species (y) |
|---|---|
| 167 | 0 |
| 172 | 0 |
| 225 | 2 |
| 197 | 1 |
| 189 | 1 |
| 232 | 2 |
| 158 | 0 |
Objective: Train a model to **input flipper length (x)** and predict **penguin species (y)**.
Training a Multiclass Classification Model
Types of Multiclass Classification Algorithms
- One-vs-Rest (OvR) – Train multiple binary classification models, each predicting one category vs. all other categories.
- Multinomial Classification – Train a multiclass classification model that calculates the probabilities for all categories.
One-vs-Rest (OvR) Algorithm
Concept
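- Train one binary classifier per category; each estimates the probability that an observation belongs to its category rather than to any other:
  - f0(x) = P(y=0 | x)
  - f1(x) = P(y=1 | x)
  - f2(x) = P(y=2 | x)
- The predicted category is the one whose classifier returns the highest probability.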
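The decision rule behind One-vs-Rest can be sketched in a few lines of Python. The three scorers below are hypothetical stand-ins for trained binary classifiers: the bell-shaped scoring function, the class centers (165, 193, 228), and the width are illustrative assumptions, not fitted values. Only the final step, picking the class with the highest score, reflects the OvR rule itself.

```python
import math

def make_scorer(center, width=15.0):
    """Hypothetical binary scorer: returns a value in (0, 1] that peaks at `center`."""
    def scorer(x):
        return math.exp(-((x - center) / width) ** 2)
    return scorer

# One scorer per class, standing in for f0, f1, f2 (assumed, not trained).
scorers = {
    0: make_scorer(165.0),  # Adelie vs. rest
    1: make_scorer(193.0),  # Chinstrap vs. rest
    2: make_scorer(228.0),  # Gentoo vs. rest
}

def ovr_predict(x):
    """Score x with every binary scorer and return (best_class, all_scores)."""
    scores = {label: f(x) for label, f in scorers.items()}
    best_class = max(scores, key=scores.get)
    return best_class, scores

for flipper_length in (167, 197, 232):
    label, _ = ovr_predict(flipper_length)
    print(flipper_length, "->", label)
```

In practice each scorer would be a fitted binary model; the scores of separate models need not sum to 1, which is why only their ranking is used.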
Characteristics
- Simple to build from any binary classifier, and practical when there are only a few categories.
- Requires one binary model per category, so training and prediction costs grow with the number of categories.
Multinomial Classification
Concept
- Calculate the probability for each category, then select the category with the highest probability as the predicted value.
Softmax Function
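For reference, the standard softmax definition is given below. The symbols are notational assumptions: z_k is the model's raw score for class k given input x, and K is the number of classes (3 in the penguin example).

```latex
P(y = k \mid x) = \frac{e^{z_k}}{\sum_{j=0}^{K-1} e^{z_j}}, \qquad k = 0, \dots, K-1
```

Each output lies between 0 and 1, and the K outputs sum to 1, so they can be read as class probabilities.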
Example
- Softmax output for one observation (the probabilities sum to 100%):
  - Class 0 (Adelie): 20%
  - Class 1 (Chinstrap): 30%
  - Class 2 (Gentoo): 50% (highest, so the prediction is Gentoo)
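A minimal Python sketch of softmax followed by the "pick the highest probability" step. The raw scores are hypothetical: they are chosen as log(0.2), log(0.3), and log(0.5) purely so that the output reproduces the example percentages.

```python
import math

def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores for classes 0, 1, 2 (assumed for illustration).
raw_scores = [math.log(0.2), math.log(0.3), math.log(0.5)]

probabilities = softmax(raw_scores)
predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)

print([round(p, 2) for p in probabilities], predicted_class)
# prints [0.2, 0.3, 0.5] 2
```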
Characteristics
- Computationally efficient, and well suited to problems with many categories.
- Trains a single model that outputs probabilities for all categories at once, which avoids the cost of training one model per category.
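To make "a single multiclass model" concrete, here is a rough sketch of softmax (multinomial logistic) regression trained with plain gradient descent on the seven sample rows. It uses only the standard library. The standardization step, learning rate, and step count are assumptions chosen for illustration, and a real project would more likely use a library implementation.

```python
import math

# Sample data from the table above: flipper length (x) and species code (y).
flipper_lengths = [167, 172, 225, 197, 189, 232, 158]
species = [0, 0, 2, 1, 1, 2, 0]
n_classes = 3

# Standardize x (an assumed preprocessing step) so gradient descent is stable.
mean = sum(flipper_lengths) / len(flipper_lengths)
std = math.sqrt(sum((v - mean) ** 2 for v in flipper_lengths) / len(flipper_lengths))
xs = [(v - mean) / std for v in flipper_lengths]

def softmax(scores):
    shift = max(scores)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_proba(x, weights, biases):
    """Class probabilities for one standardized flipper length."""
    return softmax([weights[k] * x + biases[k] for k in range(n_classes)])

def mean_log_loss(weights, biases):
    """Average negative log-probability assigned to the true class."""
    total = sum(-math.log(predict_proba(x, weights, biases)[y])
                for x, y in zip(xs, species))
    return total / len(xs)

weights = [0.0] * n_classes
biases = [0.0] * n_classes
initial_loss = mean_log_loss(weights, biases)

learning_rate = 0.5  # assumed value
n_steps = 2000       # assumed value
for _ in range(n_steps):
    grad_w = [0.0] * n_classes
    grad_b = [0.0] * n_classes
    for x, y in zip(xs, species):
        probs = predict_proba(x, weights, biases)
        for k in range(n_classes):
            # Gradient of the log loss w.r.t. the class-k score: p_k - 1[y == k]
            error = probs[k] - (1.0 if k == y else 0.0)
            grad_w[k] += error * x / len(xs)
            grad_b[k] += error / len(xs)
    weights = [weights[k] - learning_rate * grad_w[k] for k in range(n_classes)]
    biases = [biases[k] - learning_rate * grad_b[k] for k in range(n_classes)]

final_loss = mean_log_loss(weights, biases)
print(f"log loss: {initial_loss:.3f} -> {final_loss:.3f}")
```

All three classes share one set of parameters and one loss, so a single training loop updates every class score together; this is the contrast with fitting a separate binary model per class.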
Evaluating Multiclass Classification Models
Test Data
| Flipper Length (x) | Actual Species (y) | Predicted Species (ŷ) |
|---|---|---|
| 165 | 0 | 0 |
| 171 | 0 | 0 |
| 205 | 2 | 1 |
| 195 | 1 | 1 |
| 183 | 1 | 1 |
| 221 | 2 | 2 |
| 214 | 2 | 2 |
Confusion Matrix for Multiclass Classification
| Predicted \ Actual | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 2 | 0 | 0 |
| 1 | 0 | 2 | 1 |
| 2 | 0 | 0 | 2 |
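A confusion matrix like this can be tallied directly from the test data table. The sketch below keeps the same orientation (rows are predicted classes, columns are actual classes); the variable names are illustrative. Some libraries use the transposed layout, with actual classes as rows.

```python
# Actual and predicted species codes, in the order of the test data rows.
actual = [0, 0, 2, 1, 1, 2, 2]
predicted = [0, 0, 1, 1, 1, 2, 2]
n_classes = 3

# matrix[p][a] counts samples predicted as class p whose actual class is a.
matrix = [[0] * n_classes for _ in range(n_classes)]
for a, p in zip(actual, predicted):
    matrix[p][a] += 1

for row in matrix:
    print(row)
# prints:
# [2, 0, 0]
# [0, 2, 1]
# [0, 0, 2]
```

The diagonal holds the correct predictions; the single off-diagonal entry is the class 2 penguin (flipper length 205) that was predicted as class 1.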
Interpretation
- Class 0:
  - Correctly predicted 2 times (TP)
  - No incorrect predictions (FP = 0, FN = 0)
- Class 1:
  - Correctly predicted 2 times (TP)
  - 1 class 2 penguin was wrongly predicted as class 1 (FP)
- Class 2:
  - Correctly predicted 2 times (TP)
  - 1 class 2 penguin was wrongly predicted as class 1 (FN)
Calculating Evaluation Metrics
Metrics for Each Category
| Category | TP | TN | FP | FN | Accuracy | Recall | Precision | F1 Score |
|---|---|---|---|---|---|---|---|---|
| 0 | 2 | 5 | 0 | 0 | 1.00 | 1.00 | 1.00 | 1.00 |
| 1 | 2 | 4 | 1 | 0 | 0.86 | 1.00 | 0.67 | 0.80 |
| 2 | 2 | 4 | 0 | 1 | 0.86 | 0.67 | 1.00 | 0.80 |
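The per-category values can be recomputed from the test data by treating each category in turn as the "positive" class and everything else as "negative". This is a small standard-library sketch; the function name and the returned dictionary layout are illustrative choices.

```python
# Actual and predicted species codes, in the order of the test data rows.
actual = [0, 0, 2, 1, 1, 2, 2]
predicted = [0, 0, 1, 1, 1, 2, 2]

def per_class_metrics(actual, predicted, label):
    """One-vs-rest counts and metrics for a single class label."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == label and p == label)
    tn = sum(1 for a, p in pairs if a != label and p != label)
    fp = sum(1 for a, p in pairs if a != label and p == label)
    fn = sum(1 for a, p in pairs if a == label and p != label)
    accuracy = (tp + tn) / len(pairs)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {"TP": tp, "TN": tn, "FP": fp, "FN": fn,
            "accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}

for label in (0, 1, 2):
    m = per_class_metrics(actual, predicted, label)
    print(label, m["TP"], m["TN"], m["FP"], m["FN"],
          round(m["accuracy"], 2), round(m["recall"], 2),
          round(m["precision"], 2), round(m["f1"], 2))
```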
Overall Evaluation Metrics
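The overall metrics pool the per-category counts: TP = 2 + 2 + 2 = 6, TN = 5 + 4 + 4 = 13, FP = 0 + 1 + 0 = 1, FN = 0 + 0 + 1 = 1.
- **Overall Accuracy**
  Calculated: (TP + TN) / (TP + TN + FP + FN) = (6 + 13) / (6 + 13 + 1 + 1) = 19 / 21 ≈ 0.90
  About 90% of the pooled per-category decisions are correct. Each test sample is counted once per category here, so this differs from the plain share of correctly labeled penguins, which is 6 / 7 ≈ 0.86.
- **Overall Recall**
  Calculated: TP / (TP + FN) = 6 / (6 + 1) ≈ 0.86
  86% of actual category members were correctly identified.
- **Overall Precision**
  Calculated: TP / (TP + FP) = 6 / (6 + 1) ≈ 0.86
  86% of category predictions were correct.
- **Overall F1 Score**
  Calculated: (2 × Precision × Recall) / (Precision + Recall) = (2 × 0.86 × 0.86) / (0.86 + 0.86) ≈ 0.86
  F1 Score ≈ 0.86, which suggests the model performs reasonably well on this small test set.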
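The overall figures above can be reproduced by summing the per-category TP, TN, FP, and FN counts and applying the usual formulas to the totals (often called micro-averaging). The sketch below hard-codes the counts from the per-category table; the variable names are illustrative.

```python
# (TP, TN, FP, FN) per category, copied from the per-category metrics table.
counts = {
    0: (2, 5, 0, 0),
    1: (2, 4, 1, 0),
    2: (2, 4, 0, 1),
}

# Pool the counts across all categories.
tp = sum(c[0] for c in counts.values())
tn = sum(c[1] for c in counts.values())
fp = sum(c[2] for c in counts.values())
fn = sum(c[3] for c in counts.values())

overall_accuracy = (tp + tn) / (tp + tn + fp + fn)
overall_recall = tp / (tp + fn)
overall_precision = tp / (tp + fp)
overall_f1 = (2 * overall_precision * overall_recall
              / (overall_precision + overall_recall))

print(round(overall_accuracy, 2), round(overall_recall, 2),
      round(overall_precision, 2), round(overall_f1, 2))
# prints 0.9 0.86 0.86 0.86
```

An alternative is to average the per-category metrics instead of pooling the counts (macro-averaging), which weights every category equally regardless of its size.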
Conclusion
Multiclass classification is suitable for prediction problems involving more than two categories. **Common approaches include One-vs-Rest (OvR) and multinomial classification with a softmax function**. When evaluating a multiclass model, metrics can be calculated for each category or aggregated across all categories.
Further Learning
- Scikit-learn Multiclass Classification
- TensorFlow Classification Tutorial