[D] On low quality reviews at ML conferences

Lately I've been really worried about a trend in the ML community: the overwhelming dominance of purely empirical researchers. It’s genuinely hard to be a rigorous scientist, someone who backs up arguments with theory and careful empirical validation. It’s much easier to throw together a bunch of empirical tricks, tune hyperparameters, and chase a +0.5% SOTA bump.

To be clear: I value empiricism. We absolutely need strong empirical researchers. But the problem is the imbalance. They're becoming the majority voice in spaces where rigor should matter most, especially NeurIPS and ICLR. These aren't ACL or CVPR, where incremental benchmark improvements are more culturally accepted. These are supposed to be venues for actual scientific progress, not just leaderboard shuffling.

And the review quality really reflects this imbalance.

This year I submitted to NeurIPS, ICLR, and AISTATS. The difference was extreme. My AISTATS paper was the most difficult to read and the most theory-heavy, yet 3 out of 4 reviews were excellent. They clearly understood the work. Even the one critical reviewer with the lowest score wrote something like: “I suspect I’m misunderstanding this part and am open to adjusting my score.” That's how scientific reviewing should work.

But the NeurIPS/ICLR reviews? Many reviewers seemed to have zero grasp of the underlying science, even though those papers were much simpler. The only comments they felt confident making were about missing baselines, even when those baselines were misleading or irrelevant to the theoretical contribution. It really highlighted a deeper issue: a huge portion of the reviewer pool only knows how to evaluate empirical papers, so any theoretical or conceptual work gets judged through an empirical lens it was never meant for.

I’m convinced this is happening because we now have an overwhelming number of researchers whose skill set is purely empirical experimentation. They absolutely provide value to the community, but when they dominate the reviewer pool, they unintentionally drag the entire field toward superficiality. It’s starting to make parts of ML feel toxic: papers are judged not on intellectual merit but on whether they match a template of empirical tinkering plus SOTA tables.

This community needs balance again. Otherwise, rigorous work, the kind that actually advances machine learning, will keep getting drowned out.

EDIT: I want to clarify a bit more. I still believe there are a lot of good, qualified people publishing beautiful work. It's the trend that I want to point out. From my point of view, reviewer quality is deteriorating quite fast, and it will get a lot messier in the upcoming years.