You Can't Learn From Zero Rewards
•1 min read
Advice on advice
•2 min read
Why advice often fails
AI-to-Human Knowledge Distillation
•1 min read
How do we turn superhuman AI capabilities into human skills?
Startup as RL problem
•4 min read
Bottom-up vs Top-down
•3 min read
DeepSeek's open-source week
•34 min read
FlashMLA, DeepEP, DeepGEMM, DualPipe, EPLB, 3FS and Smallpond
Minimal Basics of CUDA Programming
•31 min read
Basics of GPU programming with CUDA
KV Cache Optimization
•5 min read
Token selection, post-hoc compression, and architectural redesigns including MLA, GQA, and MQA