Animal Classification with Regularized Models and Compact Ensembles

Built a small classifier that assigns zoo animals to biological groups from features such as hair, feathers, milk, eggs, fins, legs, and aquatic behavior. Regularized models and compact ensemble methods were compared within a simple, controlled classification pipeline.

About this project

The project uses the UCI Zoo Animal Classification dataset (101 animals across seven classes), where animals are classified into biological groups using simple descriptive features such as hair, feathers, eggs, milk, aquatic behavior, predator status, backbone, fins, legs, tail, and domestic status. Because the dataset is small, the project focuses on simple regularized models and compact ensemble methods rather than higher-capacity models that would be prone to overfitting.

The workflow included basic data checks, a train-validation-test split, preprocessing with imputation and scaling where needed, model comparison by cross-validation, and hyperparameter tuning. Models tested included Regularized Logistic Regression, Regularized Linear SVM, Shallow Decision Tree, Bagging with shallow trees, Random Forest with shallow trees, and AdaBoost with decision stumps. The main goal was to reduce overfitting through regularization, shallow trees, and ensembles with limited capacity.
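A workflow like the one described above could be sketched as follows. This is a minimal illustration, assuming scikit-learn; the write-up does not state which library, split ratios, fold count, or hyperparameter values were used, so all of those are placeholders. The data is a synthetic stand-in (noisy binary feature vectors), not the Zoo dataset.

```python
import numpy as np
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier, RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (assumption): each class has a binary prototype
# vector, and about 10% of the bits are flipped at random.
rng = np.random.default_rng(0)
n_classes, n_per_class, n_features = 4, 25, 16
prototypes = rng.integers(0, 2, size=(n_classes, n_features))
y = np.repeat(np.arange(n_classes), n_per_class)
X = prototypes[y].astype(float)
flip = rng.random(X.shape) < 0.1
X[flip] = 1 - X[flip]

# Stratified train / validation / test split (ratios are assumptions).
X_rest, X_test, y_rest, y_test = train_test_split(
    X, y, test_size=0.15, stratify=y, random_state=0
)
X_train, X_val, y_train, y_val = train_test_split(
    X_rest, y_rest, test_size=0.15, stratify=y_rest, random_state=0
)


def scaled(model):
    """Wrap a linear model with imputation and scaling."""
    return make_pipeline(SimpleImputer(strategy="most_frequent"), StandardScaler(), model)


# Candidate models; depths, estimator counts, and C values are illustrative.
candidates = {
    "logistic_regression": scaled(LogisticRegression(C=0.1, max_iter=1000)),
    "linear_svm": scaled(LinearSVC(C=0.1)),
    "shallow_tree": DecisionTreeClassifier(max_depth=3, random_state=0),
    "bagging": BaggingClassifier(
        DecisionTreeClassifier(max_depth=3), n_estimators=25, random_state=0
    ),
    "random_forest": RandomForestClassifier(n_estimators=50, max_depth=3, random_state=0),
    "adaboost": AdaBoostClassifier(
        DecisionTreeClassifier(max_depth=1), n_estimators=50, random_state=0
    ),
}

# Compare candidates by mean macro F1 under stratified cross-validation
# on the training portion only; the test set stays untouched.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scores = {
    name: cross_val_score(model, X_train, y_train, cv=cv, scoring="f1_macro").mean()
    for name, model in candidates.items()
}

for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{name:20s} macro F1 = {score:.3f}")
```

Keeping the test set out of the comparison loop means the final test metrics are measured on data that played no part in model selection.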

After hyperparameter tuning, the selected model was Regularized Logistic Regression with:

C = 0.1
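For context, a search over C could look like the sketch below. It assumes scikit-learn, where C is the inverse of the regularization strength (smaller values regularize more strongly). The candidate grid, the scoring metric, and the synthetic data are assumptions made for illustration; the write-up does not list the values that were actually searched.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, StratifiedKFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption), small like the real dataset.
X, y = make_classification(
    n_samples=100,
    n_features=16,
    n_informative=8,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

pipeline = Pipeline(
    [
        ("scale", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)

# Hypothetical grid; in scikit-learn a smaller C means a stronger penalty.
param_grid = {"clf__C": [0.01, 0.1, 1.0, 10.0]}

search = GridSearchCV(
    pipeline,
    param_grid,
    scoring="f1_macro",
    cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=0),
)
search.fit(X, y)

best_c = search.best_params_["clf__C"]
print("best C:", best_c, "mean CV macro F1:", round(search.best_score_, 3))
```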

The validation results were perfect:

Validation Accuracy: 1.00
Validation Macro F1: 1.00

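The selected model therefore made no errors on the validation set during model selection, although with a validation set this small a perfect score is only weak evidence of generalization. On the final test set, the model achieved: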

Test Accuracy: 0.9375
Test Macro F1: 0.8367
Test Precision Macro: 0.8214
Test Recall Macro: 0.8571

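The model correctly classified 15 out of 16 test samples (15/16 = 0.9375), with only one misclassified instance. The gap between the perfect validation score and the test score therefore amounts to a single sample, so the results suggest strong performance with no clear signs of overfitting or underfitting, within the limits of such a small test set. Macro F1 is lower than accuracy because it averages per-class scores with equal weight, so one error in a class with few test samples lowers it sharply; it gives a more balanced view of class-level performance across the multi-class target.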
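The reported macro scores can be reproduced with a short calculation. The confusion matrix is not given in this write-up, so the label arrangement below is hypothetical: it assumes seven classes, that the one misclassified animal was the only test sample of its class, that it was predicted as a class with three true test samples, and that precision for a class with no predictions counts as 0. Under those assumptions, plain Python gives the same figures.

```python
from collections import Counter


def macro_metrics(y_true, y_pred):
    """Return accuracy and macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    true_positives = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)

    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = true_positives[label]
        # Undefined ratios (no predictions or no true samples) count as 0.
        precision = tp / pred_counts[label] if pred_counts[label] else 0.0
        recall = tp / true_counts[label] if true_counts[label] else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)

    n = len(labels)
    return {
        "accuracy": sum(true_positives.values()) / len(y_true),
        "precision_macro": sum(precisions) / n,
        "recall_macro": sum(recalls) / n,
        "f1_macro": sum(f1s) / n,
    }


# Hypothetical test labels: 16 samples over 7 classes (class sizes are assumed).
y_true = [1] * 6 + [2] * 3 + [3] + [4] * 2 + [5] + [6] + [7] * 2
# The single class-3 sample is predicted as class 2; everything else is correct.
y_pred = [1] * 6 + [2] * 3 + [2] + [4] * 2 + [5] + [6] + [7] * 2

metrics = macro_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

In this arrangement six of seven classes have perfect recall (6/7 ≈ 0.8571), the missed class contributes 0 to every macro average, and the class that absorbed the error has precision 3/4, which is why one mistake moves macro F1 much further than accuracy.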