I love incremental papers
The most common reason reviewers give to (lazily) reject a paper is "not enough novelty".
As a reviewer, I've never done that. I love incremental papers, under two conditions:
1. the authors acknowledge previous work.
2. the method works.
Most papers pass 1 but fail 2; more precisely, most methods don't reproduce under non-identical conditions, especially non-incremental ones.
To illustrate this, consider the 13,024 papers accepted across ICLR, ICML, and NeurIPS in 2025. Let's assume around 5% of them propose something you could plug into the nanoGPT speedrun, i.e., architectural tweaks, optimizers, etc. That would be 651 papers per year. Yet very few novel ideas show up on the leaderboard (modulo engineering tricks), and ironically, several of those that do, such as Muon or value embeddings, weren't even published at those tier-1 conferences.
If a method doesn't have to reproduce, coming up with vastly novel, non-incremental ones is easy. We can sit down and come up with 100 exotic ideas in an afternoon because the space of things that don't work is infinitely large.
Bad incremental ideas are also easy to generate. But bad incremental work is at least grounded in components we understand, making it faster to diagnose and discard. Bad novel work can waste a lot of follow-up effort before we realize it's a dead end; as such, it harms progress because it wastes everyone's time.
During my PhD, this frustrated me so much that I decided to write a paper about training methods that (mostly) don't work: No Train No Gain. While writing this paper was fun and satisfying, it was also easy to do because, again, most methods don't survive combat with real workloads.
Can we go further and argue that the breakthroughs that did work were often incremental?
Consider how some of the biggest breakthroughs in deep learning look when you decompose them:
- Adam -> Combination of RMSProp and SGD-M
- Transformers -> RNNsearch minus RNN
- GRPO -> PPO without the critic
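The Adam decomposition is easy to make concrete: Adam keeps SGD-M's exponential moving average of the gradient (the first moment) and RMSProp's moving average of the squared gradient (the second moment), then divides one by the square root of the other. A minimal single-parameter sketch (function and variable names are illustrative, not from any library):

```python
import math

def adam_step(param, grad, state, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update on a scalar parameter: SGD-M's momentum (m)
    combined with RMSProp's squared-gradient average (v)."""
    state["t"] += 1
    # SGD-M part: exponential moving average of the gradient.
    state["m"] = beta1 * state["m"] + (1 - beta1) * grad
    # RMSProp part: exponential moving average of the squared gradient.
    state["v"] = beta2 * state["v"] + (1 - beta2) * grad ** 2
    # Bias correction is Adam's own (small) addition.
    m_hat = state["m"] / (1 - beta1 ** state["t"])
    v_hat = state["v"] / (1 - beta2 ** state["t"])
    return param - lr * m_hat / (math.sqrt(v_hat) + eps)

state = {"t": 0, "m": 0.0, "v": 0.0}
p = adam_step(1.0, grad=0.5, state=state)  # first step moves by roughly lr
```

Strip out the `v` branch and you are back to SGD with momentum; strip out the `m` branch and you are back to RMSProp. The "breakthrough" is the combination.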
"We use minimum innovation for maximum results." - Ilya Sutskever (NeurIPS 2014)