The End of Software Engineering: code is becoming a runtime artifact

The title sounds like a prediction: software engineering is ending.

My read is more conservative and more useful. The paper is not saying that engineering disappears. It is saying that the durable center of engineering may move. In traditional software, the codebase is the long-lived body of the system. In agentic software, some code becomes a runtime artifact: generated, executed, inspected, and discarded inside a larger reasoning loop.

That does not end engineering. It changes what has to be engineered.

Notion image

The shallow reading

The shallow reading is:

AI writes code, so programmers matter less.

That frame is too small. It still imagines code as the final object, only produced by a different actor.

The stronger frame is:

Some decision logic moves from static code into a runtime agent loop.

The long-lived assets then become the things that make the loop reliable: task specification, tools, memory, policies, evals, traces, permission boundaries, and human review points.

From software object to delivered outcome

The paper describes a historical movement from local software to SaaS and then to Agent-as-a-Service.

The direction is not just “more cloud.” It is a deeper outsourcing of complexity:

Era	User buys	Vendor carries
Local software	an installed object	limited updates and support
SaaS	a maintained service	infra, deployment, upgrades
Agent-as-a-Service	an outcome loop	planning, tool use, execution, validation

In that last frame, the user may not care whether a specific script survives. The user cares whether the agent can keep producing the desired result under changing conditions.

This is where the paper becomes interesting. If code is sometimes a temporary tool, the question is no longer only “Is the code elegant?” It is also “Can the loop choose the right tool, verify the result, recover from failure, and preserve useful state?”

What becomes durable

If generated code is less permanent, something else must carry continuity.

The obvious candidates are:

Memory: what the system keeps about users, projects, failures, and preferences.
Tools: the stable operations the agent can call safely.
Skills: reusable procedures and constraints outside model weights.
Evals: the standing definition of acceptable outcomes.
Observability: traces, logs, replays, and explanations.
Governance: permissioning, audit, approval, rollback, and policy.

These are not decorations around the model. They are the engineering surface.

Evidence is directional, not final

The paper’s evidence supports the direction, but it does not prove an end state.

Benchmarks where agents repair isolated issues show clear progress. At the same time, continuous software evolution remains much harder: context goes stale, errors accumulate, tests miss intent, and long-horizon work needs memory and governance.

That contrast is the real lesson. Agents are already useful on bounded tasks with strong feedback. They are less mature when the task is long-lived, cross-cutting, and dependent on ambiguous product judgment.

So I would not read the roadmap as “we are already at self-evolving ecosystems.” I would read it as a map of missing infrastructure:

tool-augmented work
  -> single-task autonomy
  -> multi-agent teams
  -> self-evolving systems

The gap between the second and third stages is not just model capability. It is role design, shared memory, evaluation, observability, and human governance.

The new engineering job

The paper’s best phrase is not the title. It is the role shift: engineers become intent architects, agent coordinators, and outcome auditors.

That sounds abstract, so I translate it into concrete work:

write goals that survive contact with messy repositories;
expose safe tools with typed boundaries;
design memory that helps without contaminating context;
build evals that catch regressions;
create traces that make failures inspectable;
decide which actions require human approval;
maintain skills that compress repeated practice.

None of this is less technical than writing code. It is technical at a different layer.

My takeaway

The useful claim is not that software engineering ends. The useful claim is that code may stop being the only durable artifact worth organizing around.

If an agent can generate code on demand, the scarce asset becomes the runtime that tells it what to generate, when to trust it, how to test it, how to remember the result, and when to ask for help.

That makes this paper a good conceptual map for agent builders. The future may contain less hand-maintained boilerplate, but it will need more careful engineering of intent, tools, skills, memory, evals, observability, and governance.

Paper | HTML | PDF