Ensemble (stacked): headline metrics

Weighted F1: 0.6170
Macro F1: 0.2753
Accuracy: 71.20%

Per-model comparison

Model                 Weighted F1  Macro F1  Accuracy
Ensemble (Stacked) ★  0.6170       0.2753    71.20%
Random Forest         0.6087       0.2579    67.50%
Gradient Boosting     0.6145       0.2733    70.75%
XGBoost               0.5776       0.2983    53.55%

Class distribution

Class  Owner range  Tier          Samples
0      ≤10K         Common Indie  4,500
1      ≤35K         Niche         2,200
2      ≤75K         Growing       1,500
3      ≤150K        Established   1,000
4      ≤350K        Popular       504
5      ≥750K        Breakout Hit  296
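
The class counts above are heavily skewed, which affects how the accuracy figures should be read. The sketch below computes each class's share and the accuracy a trivial "always predict the largest class" baseline would reach. It assumes the counts describe the data the models were scored on; the source does not say which split they come from, so treat the baseline as indicative only. The variable names are illustrative.

```python
# Sample counts per class, copied from the class distribution table.
samples = {0: 4500, 1: 2200, 2: 1500, 3: 1000, 4: 504, 5: 296}

total = sum(samples.values())

# Fraction of all samples that falls in each class.
shares = {cls: count / total for cls, count in samples.items()}

# A classifier that always predicts the largest class would score this accuracy
# (ASSUMPTION: the evaluation data follows the same distribution).
majority_class = max(samples, key=samples.get)
majority_baseline = shares[majority_class]

# Ratio between the largest and the smallest class.
imbalance_ratio = max(samples.values()) / min(samples.values())

for cls, share in shares.items():
    print(f"class {cls}: {share:.2%}")
print(f"majority-class baseline accuracy: {majority_baseline:.2%}")
print(f"largest/smallest class ratio: {imbalance_ratio:.1f}")
```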

Evaluation plots

Visual breakdown of model performance across all base learners and the stacked ensemble on the held-out test set.

[Figure: model evaluation plots]

Metric definitions

Weighted F1

Average of the per-class F1 scores (each the harmonic mean of that class's precision and recall), weighted by class support. Ranges 0–1; higher is better.

Macro F1

Unweighted average of the per-class F1 scores, so every class counts equally regardless of its size. Useful for evaluating performance on minority classes.

Accuracy

Percentage of games assigned to the correct class. A simple overall performance metric, but with imbalanced classes it is dominated by the largest classes.
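
To make the three definitions concrete, here is a minimal pure-Python sketch that computes them from a list of true labels and a list of predictions. The function name `f1_report` and the toy labels are made up for illustration; they are not taken from the project's code or data.

```python
from collections import Counter


def f1_report(y_true, y_pred):
    """Return (accuracy, macro F1, weighted F1) for two equal-length label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    pairs = list(zip(y_true, y_pred))

    per_class_f1 = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        # F1 = 2*TP / (2*TP + FP + FN), equivalent to the harmonic mean
        # of precision and recall; defined as 0 when the class never appears.
        denom = 2 * tp + fp + fn
        per_class_f1[c] = 2 * tp / denom if denom else 0.0

    n = len(y_true)
    accuracy = sum(1 for t, p in pairs if t == p) / n
    macro_f1 = sum(per_class_f1.values()) / len(labels)
    weighted_f1 = sum(per_class_f1[c] * support[c] for c in labels) / n
    return accuracy, macro_f1, weighted_f1


# Toy, imbalanced example (hypothetical labels): the rare class 2 is never predicted.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

accuracy, macro_f1, weighted_f1 = f1_report(y_true, y_pred)
print(f"accuracy={accuracy:.4f} macro_f1={macro_f1:.4f} weighted_f1={weighted_f1:.4f}")
# accuracy=0.8000 macro_f1=0.5524 weighted_f1=0.7543
```

In the toy example the missed minority class pulls macro F1 well below weighted F1, the same pattern visible in the reported scores.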

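About the ensemble. The stacked model combines Random Forest, Gradient Boosting, and XGBoost through an XGBoost meta-learner that is trained on the base models' predictions. In the comparison above it has the highest weighted F1 (0.6170) and accuracy (71.20%) of the four models, while XGBoost alone has the highest macro F1 (0.2983).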