
- Performance collapse on extreme imbalance (under 1% positive class)
- Silent degradation when data drifts (sensor drift, behavior changes, etc.)
Key Results
Imbalanced data (Credit Card Fraud – 0.2% positives):
– PKBoost: 87.8% PR-AUC
– LightGBM: 79.3% PR-AUC
– XGBoost: 74.5% PR-AUC
Under realistic drift (gradual covariate shift):
– PKBoost: 86.2% PR-AUC (−2.0% degradation)
– XGBoost: 50.8% PR-AUC (−31.8% degradation)
– LightGBM: 45.6% PR-AUC (−42.5% degradation)
What's Different
The main innovation is using Shannon entropy in the split criterion alongside gradients. Each split maximizes:
Gain = GradientGain + λ·InformationGain
where λ adapts to the degree of class imbalance. This explicitly rewards information gain about the minority class rather than only minimizing loss.
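To make the criterion concrete, here is a minimal Rust sketch of how a combined score could be computed for a candidate split. This is illustrative only, not the PKBoost source: names like `split_score` and `lambda_weight` are assumptions, and the gradient term follows the standard XGBoost-style second-order gain.

```rust
// Illustrative sketch only (not the PKBoost source): scores a candidate split
// as Gain = GradientGain + lambda * InformationGain.

/// Shannon entropy (in bits) of a two-class node with `pos`/`neg` counts.
fn shannon_entropy(pos: f64, neg: f64) -> f64 {
    let n = pos + neg;
    if n == 0.0 {
        return 0.0;
    }
    let mut h = 0.0;
    for p in [pos / n, neg / n] {
        if p > 0.0 {
            h -= p * p.log2();
        }
    }
    h
}

/// Standard second-order (XGBoost-style) gain from gradient/hessian sums.
fn gradient_gain(gl: f64, hl: f64, gr: f64, hr: f64, reg_lambda: f64) -> f64 {
    let term = |g: f64, h: f64| g * g / (h + reg_lambda);
    0.5 * (term(gl, hl) + term(gr, hr) - term(gl + gr, hl + hr))
}

/// Entropy reduction from splitting the parent node into left/right children.
fn information_gain(pl: f64, nl: f64, pr: f64, nr: f64) -> f64 {
    let (l, r) = (pl + nl, pr + nr);
    let total = l + r;
    shannon_entropy(pl + pr, nl + nr)
        - (l / total) * shannon_entropy(pl, nl)
        - (r / total) * shannon_entropy(pr, nr)
}

/// Combined criterion; `lambda_weight` would be derived from the class imbalance.
#[allow(clippy::too_many_arguments)]
fn split_score(
    gl: f64, hl: f64, gr: f64, hr: f64, // gradient/hessian sums per child
    pl: f64, nl: f64, pr: f64, nr: f64, // positive/negative counts per child
    reg_lambda: f64,
    lambda_weight: f64,
) -> f64 {
    gradient_gain(gl, hl, gr, hr, reg_lambda)
        + lambda_weight * information_gain(pl, nl, pr, nr)
}

fn main() {
    // Toy split: the left child captures most of the positives.
    let score = split_score(4.0, 5.0, -3.0, 90.0, 8.0, 2.0, 1.0, 89.0, 1.0, 0.5);
    println!("combined split score: {score:.4}");
}
```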
Combined with:
– Quantile-based binning (robust to scale shifts; see the sketch after this list)
– Conservative regularization (prevents overfitting to majority)
– PR-AUC early stopping (focuses on minority performance)
The architecture is inherently more robust to drift without needing online adaptation.
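For the quantile-based binning mentioned above, the idea is that bin edges come from the empirical distribution of each feature rather than fixed-width ranges, so splits operate on ranks instead of raw magnitudes. A rough sketch of that idea, with illustrative names rather than the PKBoost API:

```rust
// Illustrative sketch only: quantile-based feature binning.

/// Compute `n_bins - 1` cut points from the empirical quantiles of a feature.
/// A real implementation would also handle NaNs, ties, and weighted samples.
fn quantile_edges(values: &[f64], n_bins: usize) -> Vec<f64> {
    let mut sorted = values.to_vec();
    sorted.sort_by(|a, b| a.partial_cmp(b).unwrap());
    (1..n_bins)
        .map(|k| sorted[k * (sorted.len() - 1) / n_bins])
        .collect()
}

/// Map a raw feature value to its bin index given precomputed edges.
fn bin_index(x: f64, edges: &[f64]) -> usize {
    edges.iter().take_while(|&&e| x > e).count()
}

fn main() {
    let feature = [0.1, 0.4, 0.2, 5.0, 3.3, 0.9, 2.1, 7.8, 0.5, 1.2];
    let edges = quantile_edges(&feature, 4); // edges follow the data distribution
    println!("edges: {edges:?}");
    println!("bin of 1.0: {}", bin_index(1.0, &edges));
}
```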
Trade-offs
The good:
– Auto-tunes for your data (no hyperparameter search needed)
– Works out-of-the-box on extreme imbalance
– Comparable inference speed to XGBoost
The honest:
– ~2-4x slower training (45s vs 12s on 170K samples)
– Slightly behind on balanced data (use XGBoost there)
– Built in Rust, so less Python ecosystem integration
Why I'm Sharing
This started as a learning project (built from scratch in Rust), but the drift resilience results surprised me. I haven't seen many papers addressing this – most focus on online learning or explicit drift detection.
Looking for feedback on:
– Have others seen similar robustness from conservative regularization?
– Are there existing techniques that achieve this without retraining?
– Would this be useful for production systems, or is 2-4x slower training a dealbreaker?
Links
– GitHub: https://github.com/Pushp-Kharat1/pkboost
– Benchmarks include: Credit Card Fraud, Pima Diabetes, Breast Cancer, Ionosphere
– MIT licensed, ~4000 lines of Rust
Happy to answer questions about the implementation or share more detailed results. Also open to PRs if anyone wants to extend it (multi-class support would be great).
—
Edit: I built this on a 4-core Ryzen 3 laptop with 8GB RAM, so the benchmarks should be reproducible on modest hardware.
Edit: The Python library is now available. For usage details, please check the Python folder in the GitHub repo, or comment if you have any questions or issues.