Category: reading

Yuandong Tian talks: search quality is action-space quality
TLDR: More rollouts are not enough. Search becomes powerful when the action space, representation, evaluator, and memory make good trajectories easier to find.
3min read · May 22, 2026
2026 · search · research-methods · reading · research
Compression Is All You Need: measuring mathematical progress
TLDR: A mathematical abstraction is valuable when it compresses downstream work: proofs become shorter, repeated patterns disappear, and the library becomes easier to extend.
3min read · May 21, 2026
2026 · mathematical-progress · evaluation · reading · systems
Heuristic Learning: maintaining a learning system in code
TLDR: Heuristic Learning treats iterative agent work as maintaining a verifiable software system. Feedback updates code, tests, rules, state representations, and memory rather than neural network weights.
4min read · May 21, 2026
2026 · heuristic-learning · learning-systems · reading · systems
The Selfish Gene: Chapter 3, Immortal Coils
TLDR: The durable unit is not the body but the replicating gene: bodies disappear, while genetic information keeps competing through copying and recombination.
2min read · May 19, 2026
2026 · selfish-gene · reading
AMP: automatic mixed precision as a dispatch policy
TLDR: AMP is not "turn the model into half precision." It is a runtime policy that runs safe, high-throughput ops in lower precision while protecting numerically sensitive paths.
4min read · May 18, 2026
2026 · mixed-precision · gpu-systems · reading · systems
Talk with Shunyu Yao: feedback is the center of AI research
TLDR: The conversation is useful because it frames AI research as system-driven experimental work: define verifiable problems, build feedback loops, debug carefully, and choose directions where scaling paths are still being shaped.
4min read · May 14, 2026
2026 · research-methods · agent-systems · reading · research
Anthropic Blogs: harness engineering and context engineering
TLDR: The shared lesson across these Anthropic engineering posts is that long agent tasks fail at the runtime layer: context, evaluation, sandboxing, permissions, handoff, and feedback have to be engineered.
4min read · May 13, 2026
2026 · harness-engineering · context-engineering · reading · research
Building a C compiler with agent teams
TLDR: The C compiler experiment worked because the project had the right substrate for agents: a modular architecture, objective tests, Git as shared memory, task locks, readable logs, and oracles that turned one giant goal into many local failures.
5min read · May 13, 2026
2026 · compiler-agents · multi-agent-systems · reading · agents