
- Performance collapse on extreme imbalance (under 1% positive class)
- Silent degradation when data drifts (sensor drift, behavior changes, etc.)
Key Results
Imbalanced data (Credit Card Fraud – 0.2% positives):
– PKBoost: 87.8% PR-AUC
– LightGBM: 79.3% PR-AUC
– XGBoost: 74.5% PR-AUC
Under realistic drift (gradual covariate shift):
– PKBoost: 86.2% PR-AUC (−2.0% degradation)
– XGBoost: 50.8% PR-AUC (−31.8% degradation)
– LightGBM: 45.6% PR-AUC (−42.5% degradation)
What's Different
The main innovation is using Shannon entropy in the split criterion alongside gradients. Each split maximizes:
Gain = GradientGain + λ·InformationGain
where λ adapts to the degree of class imbalance. This explicitly rewards information gain about the minority class rather than only minimizing loss.
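To make the criterion concrete, here is a minimal Rust sketch of how a combined score could be computed for a candidate split. This is illustrative only, not the PKBoost source: names like `split_score` and `lambda_weight` are assumptions, and the gradient term follows the standard XGBoost-style second-order gain.

```rust
// Illustrative sketch only (not the PKBoost source): scores a candidate split
// as Gain = GradientGain + lambda * InformationGain.

/// Shannon entropy (in bits) of a two-class node with `pos`/`neg` counts.
fn shannon_entropy(pos: f64, neg: f64) -> f64 {
    let n = pos + neg;
    if n == 0.0 {
        return 0.0;
    }
    let mut h = 0.0;
    for p in [pos / n, neg / n] {
        if p > 0.0 {
            h -= p * p.log2();
        }
    }
    h
}

/// Standard second-order (XGBoost-style) gain from gradient/hessian sums.
fn gradient_gain(gl: f64, hl: f64, gr: f64, hr: f64, reg_lambda: f64) -> f64 {
    let term = |g: f64, h: f64| g * g / (h + reg_lambda);
    0.5 * (term(gl, hl) + term(gr, hr) - term(gl + gr, hl + hr))
}

/// Entropy reduction from splitting the parent node into left/right children.
fn information_gain(pl: f64, nl: f64, pr: f64, nr: f64) -> f64 {
    let (l, r) = (pl + nl, pr + nr);
    let total = l + r;
    shannon_entropy(pl + pr, nl + nr)
        - (l / total) * shannon_entropy(pl, nl)
        - (r / total) * shannon_entropy(pr, nr)
}

/// Combined criterion; `lambda_weight` would be derived from the class imbalance.
#[allow(clippy::too_many_arguments)]
fn split_score(
    gl: f64, hl: f64, gr: f64, hr: f64, // gradient/hessian sums per child
    pl: f64, nl: f64, pr: f64, nr: f64, // positive/negative counts per child
    reg_lambda: f64,
    lambda_weight: f64,
) -> f64 {
    gradient_gain(gl, hl, gr, hr, reg_lambda)
        + lambda_weight * information_gain(pl, nl, pr, nr)
}

fn main() {
    // Toy split: the left child captures most of the positives.
    let score = split_score(4.0, 5.0, -3.0, 90.0, 8.0, 2.0, 1.0, 89.0, 1.0, 0.5);
    println!("combined split score: {score:.4}");
}
```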
Combined with:
– Quantile-based binning (robust to scale shifts; see the sketch after this list)
– Conservative regularization (prevents overfitting to majority)
– PR-AUC early stopping (focuses on minority performance)
The architecture is inherently more robust to drift without needing online adaptation.
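For the quantile-based binning mentioned above, the idea is that bin edges come from the empirical distribution of each feature rather than fixed-width ranges, so splits operate on ranks instead of raw magnitudes. A rough sketch of that idea, with illustrative names rather than the PKBoost API:

```rust
// Illustrative sketch only: quantile-based feature binning.

/// Compute `n_bins - 1` cut points from the empirical quantiles of a feature.
/// A real implementation would also handle NaNs, ties, and weighted samples.
fn quantile_edges(values: &[f64], n_bins: usize) -> Vec<f64> {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    (1..n_bins)
        .map(|k| sorted[k * (sorted.len() - 1) / n_bins])
        .collect()
}

/// Map a raw feature value to its bin index given precomputed edges.
fn bin_index(x: f64, edges: &[f64]) -> usize {
    edges.iter().take_while(|&&e| x > e).count()
}

fn main() {
    let feature = [0.1, 0.4, 0.2, 5.0, 3.3, 0.9, 2.1, 7.8, 0.5, 1.2];
    let edges = quantile_edges(&feature, 4); // edges follow the data distribution
    println!("edges: {edges:?}");
    println!("bin of 1.0: {}", bin_index(1.0, &edges));
}
```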
Trade-offs
The good:
– Auto-tunes for your data (no hyperparameter search needed)
– Works out-of-the-box on extreme imbalance
– Comparable inference speed to XGBoost
The honest:
– ~2-4x slower training (45s vs 12s on 170K samples)
– Slightly behind on balanced data (use XGBoost there)
– Built in Rust, so less Python ecosystem integration
Why I'm Sharing
This started as a learning project (built from scratch in Rust), but the drift resilience results surprised me. I haven't seen many papers addressing this – most focus on online learning or explicit drift detection.
Looking for feedback on:
– Have others seen similar robustness from conservative regularization?
– Are there existing techniques that achieve this without retraining?
– Would this be useful for production systems, or is 2-4x slower training a dealbreaker?
Links
– GitHub: https://github.com/Pushp-Kharat1/pkboost
– Benchmarks include: Credit Card Fraud, Pima Diabetes, Breast Cancer, Ionosphere
– MIT licensed, ~4000 lines of Rust
Happy to answer questions about the implementation or share more detailed results. Also open to PRs if anyone wants to extend it (multi-class support would be great).
—
Edit: I built this on a 4-core Ryzen 3 laptop with 8GB RAM, so the benchmarks should be reproducible on modest hardware.
Edit: The Python library is now available. For usage details, please check the Python folder in the GitHub repo, or comment if you have any questions or issues.