Blog

2026

Heuristic Learning: maintaining a learning system in code
TLDR: Heuristic Learning treats iterative agent work as maintaining a verifiable software system. Feedback updates code, tests, rules, state representations, and memory rather than neural network weights.
4min read · May 21, 2026
2026 · heuristic-learning · learning-systems · reading · systems
The Selfish Gene: Chapter 3 - Immortal Coils
TLDR: The durable unit is not the body but the replicating gene: bodies disappear, while genetic information keeps competing through copying and recombination.
2min read · May 19, 2026
2026 · selfish-gene · reading
CS336: Lecture 1 - Language Modeling as Engineering
TLDR: Modern LM work is easiest to understand by building the stack yourself, because tokenization, data, compute, and evaluation are all leaky engineering choices.
2min read · May 18, 2026
2026 · cs336 · language-modeling · learning · systems
CS336: Lecture 2 - PyTorch and resource accounting
TLDR: Lecture 2 is about making training cost concrete: tensors, dtypes, memory, FLOPs, autograd, optimizers, data loading, checkpoints, and mixed precision all have resource prices.
8min read · May 18, 2026
2026 · cs336 · resource-accounting · learning · systems
AMP: automatic mixed precision as a dispatch policy
TLDR: AMP is not "turn the model into half precision." It is a runtime policy that runs safe, high-throughput ops in lower precision while protecting numerically sensitive paths.
4min read · May 18, 2026
2026 · mixed-precision · gpu-systems · reading · systems
Autocurricula and Multi-Agent Innovation: How Social Interaction Generates New Problems
TLDR: Multi-agent intelligence should study how cooperation, competition, specialization, and shared discoveries create abilities that isolated agents would miss.
2min read · May 16, 2026
2026 · leibo · multi-agent-systems · agents · research
Social Dilemmas: Three Classic Social Dilemmas
TLDR: Social dilemmas show why individually rational actions can damage group outcomes, and why cooperation depends on payoffs, repetition, reputation, and norms.
2min read · May 16, 2026
2026 · leibo · social-dilemmas · agents · research
A Social Path to Human-Like AI: How Social Interaction Generates New Data
TLDR: Human-like AI may require populations of agents learning through social interaction, where cooperation and competition generate skills beyond single-agent training.
3min read · May 16, 2026
2026 · leibo · social-ai · agents · research