Blog

Causes of entropy collapse (May 2026)
Conditional slop (Feb 2026)
I love incremental papers (Feb 2026)
What makes a game fun (Feb 2026)
Tricks for LLM diversity (Jan 2026)
Benefit of the doubt (Jan 2026)
We can't learn from zero rewards (Oct 2025)
Advice on advice (Sep 2025)
Startup as RL problem (Sep 2025)
Bottom-up vs Top-down (Sep 2025)
AI-to-Human Knowledge Distillation (Jun 2025)
DeepSeek's open-source week (Mar 2025)
Minimal Basics of CUDA Programming (Feb 2025)
KV Cache Optimization (Jan 2025)