Posts tagged agent-systems
-
RLM: Recursive Language Model
TLDR: RLM's real insight is not recursion as a slogan. It moves long context out of the Transformer window and into an external environment that the model can inspect, slice, search, and delegate over.
-
Apodex-1.0: deep research as multi-agent verification
TLDR: Apodex-1.0 is most interesting as a verification-centric agent-system design: independent subagents explore, a shared report pool accumulates evidence, and verifier agents audit claims from outside the worker trace.
-
SkillOpt: training the procedure outside the weights
TLDR: SkillOpt treats an agent skill as an optimizable text artifact. The model stays frozen, rollouts provide evidence, an optimizer proposes edits, and a validation gate accepts only real improvements.
-
LeanMarathon: long-horizon formalization as agent engineering
TLDR: LeanMarathon turns paper-level Lean formalization into a recoverable multi-agent engineering system: blueprint, proof DAG, scoped workers, reviewer issues, and CI gates keep long tasks from drifting.
-
Beyond Individual Intelligence: the LIFE frame for multi-agent systems
TLDR: The LIFE survey is useful because it reframes LLM multi-agent systems as a lifecycle: build individual capability, integrate collaboration, attribute failures, and evolve the system.
-
Talk with Shunyu Yao: feedback is the center of AI research
TLDR: The conversation is useful because it frames AI research as system-driven experimental work: define verifiable problems, build feedback loops, debug carefully, and choose directions where scaling paths are still being shaped.