2026
-
Heuristic Learning: maintaining a learning system in code
TLDR: Heuristic Learning treats iterative agent work as maintaining a verifiable software system. Feedback updates code, tests, rules, state representations, and memory rather than neural network weights.
-
The Selfish Gene: Chapter 3, Immortal Coils
TLDR: The durable unit is not the body but the replicating gene: bodies disappear, while genetic information keeps competing through copying and recombination.
-
CS336: Lecture 1 - Language Modeling as Engineering
TLDR: Modern LM work is easiest to understand by building the stack yourself, because tokenization, data, compute, and evaluation are all leaky engineering choices.
-
CS336: Lecture 2 - PyTorch and resource accounting
TLDR: Lecture 2 makes training cost concrete: tensors, dtypes, memory, FLOPs, autograd, optimizers, data loading, checkpoints, and mixed precision all have resource prices.
-
AMP: automatic mixed precision as a dispatch policy
TLDR: AMP is not "turn the model into half precision." It is a runtime policy that runs safe, high-throughput ops in lower precision while protecting numerically sensitive paths.
-
Autocurricula and Multi-Agent Innovation: How Social Interaction Generates New Problems
TLDR: Multi-agent intelligence should study how cooperation, competition, specialization, and shared discoveries create abilities that isolated agents would miss.
-
Social Dilemmas: Three Classic Social Dilemmas
TLDR: Social dilemmas show why individually rational actions can damage group outcomes, and why cooperation depends on payoffs, repetition, reputation, and norms.
-
A Social Path to Human-Like AI: How Social Interaction Generates New Data
TLDR: Human-like AI may require populations of agents learning through social interaction, where cooperation and competition generate skills beyond single-agent training.