Model Metrics

Ensemble (stacked) — headline

Weighted F1: 0.6170

Macro F1: 0.2753

Accuracy: 71.20%

Per-model comparison

Model	Weighted F1	Macro F1	Accuracy
Ensemble (Stacked) ★	0.6170	0.2753	71.20%
Random Forest	0.6087	0.2579	67.50%
Gradient Boosting	0.6145	0.2733	70.75%
XGBoost	0.5776	0.2983	53.55%
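
To read the comparison at a glance, the rows above can be scanned for the best model on each metric. The snippet below is only an illustrative sketch: the dictionary keys (`weighted_f1`, `macro_f1`, `accuracy`) and variable names are made up for this example, and the numbers are copied from the table.

```python
# Scores copied from the per-model comparison table (accuracy as a fraction).
results = {
    "Ensemble (Stacked)": {"weighted_f1": 0.6170, "macro_f1": 0.2753, "accuracy": 0.7120},
    "Random Forest":      {"weighted_f1": 0.6087, "macro_f1": 0.2579, "accuracy": 0.6750},
    "Gradient Boosting":  {"weighted_f1": 0.6145, "macro_f1": 0.2733, "accuracy": 0.7075},
    "XGBoost":            {"weighted_f1": 0.5776, "macro_f1": 0.2983, "accuracy": 0.5355},
}

# For each metric, pick the model with the highest score.
best_by_metric = {
    metric: max(results, key=lambda name: results[name][metric])
    for metric in ("weighted_f1", "macro_f1", "accuracy")
}

for metric, model in best_by_metric.items():
    print(f"{metric}: {model} ({results[model][metric]:.4f})")
```

On these figures the stacked ensemble leads on weighted F1 and accuracy, while XGBoost has the highest macro F1.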

Class distribution

Class	Owner range	Tier	Samples
0	≤10K	Common Indie	4,500
1	≤35K	Niche	2,200
2	≤75K	Growing	1,500
3	≤150K	Established	1,000
4	≤350K	Popular	504
5	≥750K	Breakout Hit	296
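
The imbalance in the table above can be quantified with a little arithmetic. The sketch below uses the sample counts from the table; the variable names are illustrative. The "majority baseline" is the accuracy of always predicting class 0, and it assumes the evaluation split has the same class mix as this table, which the page does not state.

```python
# Sample counts per class, copied from the class distribution table.
class_samples = {0: 4500, 1: 2200, 2: 1500, 3: 1000, 4: 504, 5: 296}

total = sum(class_samples.values())
shares = {cls: n / total for cls, n in class_samples.items()}

# Largest class and the accuracy obtained by always predicting it
# (assumption: the test split mirrors the distribution in the table).
majority_class = max(class_samples, key=class_samples.get)
majority_baseline_accuracy = shares[majority_class]

# Ratio between the largest and smallest class.
imbalance_ratio = max(class_samples.values()) / min(class_samples.values())

for cls, share in shares.items():
    print(f"class {cls}: {share:.2%}")
print(f"majority baseline accuracy: {majority_baseline_accuracy:.2%}")
print(f"imbalance ratio: {imbalance_ratio:.1f}x")
```

With these counts, class 0 holds 45% of the samples and is roughly 15 times larger than class 5, which is why weighted and macro F1 can diverge sharply.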

Evaluation plots

Visual breakdown of model performance across all base learners and the stacked ensemble on the held-out test set.

Metric definitions

Weighted F1

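Average of the per-class F1 scores (each the harmonic mean of that class's precision and recall), weighted by class support, i.e. the number of true samples in each class. Ranges 0–1; higher is better.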

Macro F1

Unweighted mean of the per-class F1 scores, so every class counts equally regardless of its size. Useful for evaluating performance on minority classes.

Accuracy

Percentage of correctly classified games. A simple overall performance metric that can be dominated by the majority class when classes are imbalanced.
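
The three definitions above can be made concrete with a small pure-Python sketch. The function name `f1_summary` and the toy labels are hypothetical and only for illustration; this is not the code that produced the reported scores. It follows the usual convention that a class with no true or predicted samples gets an F1 of 0.

```python
from collections import Counter


def f1_summary(y_true, y_pred):
    """Return macro F1, weighted F1, and accuracy for two label lists."""
    labels = sorted(set(y_true) | set(y_pred))
    support = Counter(y_true)  # number of true samples per class
    pairs = list(zip(y_true, y_pred))

    per_class_f1 = {}
    for c in labels:
        tp = sum(1 for t, p in pairs if t == c and p == c)
        fp = sum(1 for t, p in pairs if t != c and p == c)
        fn = sum(1 for t, p in pairs if t == c and p != c)
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN), the harmonic mean of precision and recall.
        per_class_f1[c] = 2 * tp / denom if denom else 0.0

    n = len(y_true)
    return {
        "macro_f1": sum(per_class_f1.values()) / len(labels),
        "weighted_f1": sum(per_class_f1[c] * support[c] for c in labels) / n,
        "accuracy": sum(1 for t, p in pairs if t == p) / n,
    }


# Toy, imbalanced example: the single class-2 sample is misclassified.
y_true = [0, 0, 0, 0, 0, 0, 1, 1, 1, 2]
y_pred = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]
scores = f1_summary(y_true, y_pred)
print({name: round(value, 4) for name, value in scores.items()})
```

On this toy data the accuracy is 0.80 and the weighted F1 is about 0.754, but the macro F1 drops to about 0.552 because the missed minority class contributes an F1 of 0 with equal weight. The same effect explains the gap between weighted and macro F1 in the headline numbers.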

About the ensemble. The stacked model feeds the predictions of Random Forest, Gradient Boosting, and XGBoost into an XGBoost meta-learner, which learns how to combine them and so can exploit the complementary strengths of the base models.
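
A stacked ensemble of this shape can be sketched with scikit-learn's `StackingClassifier`. This is a minimal illustration under several assumptions, not the project's actual pipeline: the data is synthetic, the hyperparameters are arbitrary, and scikit-learn's `GradientBoostingClassifier` stands in for XGBoost (both as a base learner and as the meta-learner) so that the example needs only scikit-learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.model_selection import train_test_split

# Synthetic 3-class data; the real features and labels are not shown in the source.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=6, n_classes=3, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Base learners. The third entry is a stand-in for XGBoost (assumption).
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=50, random_state=0)),
    ("gb", GradientBoostingClassifier(n_estimators=50, random_state=0)),
    ("xgb_stand_in", GradientBoostingClassifier(n_estimators=50, max_depth=2, random_state=1)),
]

# The meta-learner is trained on cross-validated predictions of the base learners.
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=GradientBoostingClassifier(n_estimators=50, random_state=2),
    cv=5,
)
stack.fit(X_train, y_train)

test_accuracy = stack.score(X_test, y_test)
print(f"held-out accuracy: {test_accuracy:.3f}")
```

Training the meta-learner on out-of-fold predictions (the `cv` argument) keeps it from simply trusting whichever base model overfits the training data most.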