[R] PKBoost: Gradient boosting that stays accurate under data drift (2% degradation vs XGBoost’s 32%)


I've been working on a gradient boosting implementation that handles two problems I kept running into with XGBoost/LightGBM in production:

  1. Performance collapse on extreme imbalance (under 1% positive class)
  2. Silent degradation when data drifts (sensor drift, behavior changes, etc.)

Key Results

Imbalanced data (Credit Card Fraud – 0.2% positives):

– PKBoost: 87.8% PR-AUC

– LightGBM: 79.3% PR-AUC

– XGBoost: 74.5% PR-AUC

Under realistic drift (gradual covariate shift):

– PKBoost: 86.2% PR-AUC (−2.0% degradation)

– XGBoost: 50.8% PR-AUC (−31.8% degradation)

– LightGBM: 45.6% PR-AUC (−42.5% degradation)
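For context, "gradual covariate shift" here means the feature distribution slowly moves away from the one the model was trained on. A minimal sketch of how such a shift can be simulated, in case anyone wants to set up a similar test (illustrative only, not the exact benchmark code in the repo):

```rust
// Illustrative sketch of simulating gradual covariate shift (not the exact
// benchmark code in the repo): each test sample gets a slowly growing offset
// and scale change, so later samples come from an increasingly shifted
// feature distribution while the labels stay untouched.
fn apply_gradual_drift(features: &mut [Vec<f64>], max_shift: f64, max_scale: f64) {
    let n = features.len() as f64;
    for (i, row) in features.iter_mut().enumerate() {
        // Drift strength grows linearly from 0 (first sample) to ~1 (last sample).
        let t = i as f64 / n;
        for x in row.iter_mut() {
            *x = *x * (1.0 + t * max_scale) + t * max_shift;
        }
    }
}
```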

What's Different

The main innovation is using Shannon entropy in the split criterion alongside gradients. Each split maximizes:

Gain = GradientGain + λ·InformationGain

where λ adapts based on class imbalance. This explicitly optimizes for information gain on the minority class instead of just minimizing loss.
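In simplified form, the blended criterion looks roughly like this (an illustrative sketch of the formula above, not PKBoost's exact internals; `reg_lambda` is the usual leaf-weight regularizer and `lambda_weight` stands in for the imbalance-adaptive λ):

```rust
// Illustrative sketch of the split criterion above, not the exact repo code.

/// Standard second-order (XGBoost-style) gain contribution of one node.
fn gradient_gain(sum_grad: f64, sum_hess: f64, reg_lambda: f64) -> f64 {
    (sum_grad * sum_grad) / (sum_hess + reg_lambda)
}

/// Shannon entropy (in bits) of a binary label distribution.
fn entropy(pos: f64, neg: f64) -> f64 {
    let total = pos + neg;
    if total == 0.0 {
        return 0.0;
    }
    let mut h = 0.0;
    for count in [pos, neg] {
        if count > 0.0 {
            let p = count / total;
            h -= p * p.log2();
        }
    }
    h
}

/// Node statistics: gradient sum, hessian sum, positive count, negative count.
struct NodeStats {
    grad: f64,
    hess: f64,
    pos: f64,
    neg: f64,
}

/// Gain = GradientGain + λ · InformationGain
fn split_gain(
    parent: &NodeStats,
    left: &NodeStats,
    right: &NodeStats,
    reg_lambda: f64,
    lambda_weight: f64, // adapts with class imbalance
) -> f64 {
    let grad_gain = gradient_gain(left.grad, left.hess, reg_lambda)
        + gradient_gain(right.grad, right.hess, reg_lambda)
        - gradient_gain(parent.grad, parent.hess, reg_lambda);

    let n_left = left.pos + left.neg;
    let n_right = right.pos + right.neg;
    let n = parent.pos + parent.neg;
    let info_gain = entropy(parent.pos, parent.neg)
        - (n_left / n) * entropy(left.pos, left.neg)
        - (n_right / n) * entropy(right.pos, right.neg);

    grad_gain + lambda_weight * info_gain
}
```

The entropy term rewards splits that isolate minority-class examples even when their contribution to the summed gradients is tiny.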

Combined with:

– Quantile-based binning (robust to scale shifts; sketched after this list)

– Conservative regularization (prevents overfitting to majority)

– PR-AUC early stopping (focuses on minority performance)

The architecture is inherently more robust to drift without needing online adaptation.
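To make the binning item concrete, here is a rough sketch of equal-frequency (quantile) binning. Because bin edges follow the ranks of the training values rather than the raw feature scale, the binning degrades more gracefully when feature scales drift (again illustrative, not the exact repo code):

```rust
// Illustrative sketch of quantile (equal-frequency) binning; not the exact
// code used in the repo. Edges are taken at evenly spaced ranks of the
// training values, so boundaries track the data distribution, not the scale.
fn quantile_bin_edges(mut values: Vec<f64>, n_bins: usize) -> Vec<f64> {
    values.sort_by(|a, b| a.partial_cmp(b).unwrap());
    (1..n_bins)
        .map(|i| values[(i * values.len() / n_bins).min(values.len() - 1)])
        .collect()
}

/// Map a raw feature value to its bin index using the precomputed edges.
fn bin_index(x: f64, edges: &[f64]) -> usize {
    edges.iter().take_while(|&&e| x > e).count()
}
```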

Trade-offs

The good:

– Auto-tunes for your data (no hyperparameter search needed)

– Works out-of-the-box on extreme imbalance

– Comparable inference speed to XGBoost

The honest:

– ~2-4x slower training (45s vs 12s on 170K samples)

– Slightly behind on balanced data (use XGBoost there)

– Built in Rust, so less Python ecosystem integration

Why I'm Sharing

This started as a learning project (built from scratch in Rust), but the drift resilience results surprised me. I haven't seen many papers addressing this – most focus on online learning or explicit drift detection.

Looking for feedback on:

– Have others seen similar robustness from conservative regularization?

– Are there existing techniques that achieve this without retraining?

– Would this be useful for production systems, or is 2-4x slower training a dealbreaker?

Links

– GitHub: https://github.com/Pushp-Kharat1/pkboost

– Benchmarks include: Credit Card Fraud, Pima Diabetes, Breast Cancer, Ionosphere

– MIT licensed, ~4000 lines of Rust

Happy to answer questions about the implementation or share more detailed results. Also open to PRs if anyone wants to extend it (multi-class support would be great).

Edit: Built this on a 4-core Ryzen 3 laptop with 8GB RAM, so the benchmarks should be reproducible on any hardware.

Edit: The Python library is now available. For usage and further details, see the Python folder in the GitHub repo, or comment here if you run into questions or issues.
