Current Position:fig. beginning " Course materials

12-Factor Agents How we got here: a brief history of software

2025-07-22

Detailed version: how we got to this point

You don't have to listen to me.

Whether you're new to intelligences or a grumpy veteran like me, I'm going to try to convince you to ditch most of your pre-existing views on AI intelligences, take a step back, and rethink them from first principles. (In case you missed the release of OpenAI's API features a couple weeks ago, here's the spoiler: pushing more intelligent body logic behind the API is not the right direction)

Intelligent bodies as software, and a brief history of them

Let's talk about how we got here.

60 years ago

We will talk a lot about directed graphs (DGs) and its acyclic friends, directed acyclic graphs (DAGs). The first thing I want to point out is that ...... well ...... software is a directed graph. There's a reason we used to use flowcharts to represent programs.

12-Factor Agents How We Got Here: A Brief History of Software-1

20 years ago

About 20 years ago, we started to see DAG orchestrators become popular. We're talking about things like Airflow,Prefect Such classic tools, as well as some predecessors and some newer ones like (dagster,inggest,windmill). They follow the same graph pattern with the added benefits of observability, modularity, retrying, and management.

12-Factor Agents How We Got Here: A Brief History of Software-1

10-15 years ago

When machine learning models start to become good enough to use, we start to see them dotted around in DAGs. You might think of steps like "summarize the text in this column into a new column" or "categorize support questions by severity or sentiment".

12-Factor Agents How We Got Here: A Brief History of Software-1

But at the end of the day, it's still essentially the same old deterministic software.

Future Prospects for Smart Bodies

I'm not the first , but the biggest thing I learned when I started learning about intelligences is that you can ditch the DAG. software engineers no longer need to write code for every step and edge case, you can give intelligences a goal and a set of transitions:

12-Factor Agents How We Got Here: A Brief History of Software-1

Then let the big language model make decisions in real time to figure out the path.

12-Factor Agents How We Got Here: A Brief History of Software-1

The prospect here is that you write less software, and that you simply give the graph of the Big Language Model "edges" and let it figure out the "nodes" on its own. You can recover from bugs, you can write less code, and you may find that the big language model finds novel solutions to problems.

Intelligentsia as a cycle

In other words, you have a loop consisting of 3 steps:

The Big Language Model determines the next step in the workflow, outputting structured JSON ("tool calls")
Deterministic code execution tool calls
The result is appended to the context window
Repeat this process until the next step is identified as "Finish".

initial_event = {"message": "..."}
context = [initial_event]
while True:
next_step = await llm.determine_next_step(context)
context.append(next_step)
if (next_step.intent === "done"):
return next_step.final_answer
result = await execute_step(next_step)
context.append(result)

Our initial context is just the startup event (maybe a user message, a cron task trigger, a webhook, etc.), and then we let the big language model choose the next step (the tool) or determine if the task is complete.

This is a multi-step example:

The resulting "materialized" DAG looks like this:

12-Factor Agents How We Got Here: A Brief History of Software-1

Problems with this "round-robin-until-solution" model

The biggest problem with this model is:

When the context window becomes too long, the intelligences get lost - they keep trying the same failed methods over and over again.
It's really just this one issue, but it's enough to make this method inch forward.

Even if you haven't built intelligences manually, you've probably seen this long context problem when using an intelligent body coding tool. They get lost in use and you need to open a new chat.

I'd even like to make a point that I've heard often and that you may have intuitively picked up on as well:

Even if the model supports longer and longer context windows, you'll always get better results with short, focused hints and contexts

Most of the developers I've talked to, when they realized that after more than 10-20 rounds of dialog, things got messy and the big language model couldn't recover from it, werePutting the "tool call loop" idea on the back burner.Even if the Intelligent Body 90% is timed correctly, this is far from being "good enough for delivery". Even if a smart body gets it right 90% of the time, that's still far from "good enough to hand over to the customer". Can you imagine a web application with a 10% page load that crashes?

2025-06-09 Update - I like it very much. @swyx of this statement:

12 - Factor Agents How we got here: a brief history of software - 8

What really works -- micro-intelligence

One approach that I do see often in real-world applications is to adopt and punctuate intelligent body patterns into a broader, more deterministic DAG.

12-Factor Agents How We Got Here: A Brief History of Software-1

You might ask -- "Why use intelligentsia in this case?" -- we'll get to that later, but basically, having a language model manage sets of well-scoped tasks makes it easy to integrate real-time feedback from real people and translate it into workflow steps without getting caught in a contextual error loop. (Element 1,Element 3,Elements 7).

Allowing language models to manage explicitly scoped task sets makes it easy to integrate real-time feedback from real people ...... without getting caught in contextual error loops

A real-life example of a micro-intelligence

This is an example of how deterministic code can run a micro-intelligence that handles the "man in the loop" step of the deployment process.

humankind Merge PRs into GitHub's main branch
Deterministic code Deploying to a staging environment
Deterministic code Run end-to-end (e2e) testing against pre-release environments
Deterministic code Give the task to the Intelligence for production environment deployment with the initial context "Deploy SHA 4af9ec0 to the production environment"
intelligent body invocations deploy_frontend_to_prod(4af9ec0)
Deterministic code Requesting human approval for this operation
humankind Reject the action with the feedback, "Can I deploy the backend first?"
intelligent body invocations deploy_backend_to_prod(4af9ec0)
Deterministic code Requesting human approval for this operation
humankind Approve the operation
Deterministic code Perform back-end deployment
intelligent body invocations deploy_frontend_to_prod(4af9ec0)
Deterministic code Requesting human approval for this operation
humankind Approve the operation
Deterministic code Perform front-end deployment
intelligent body Determine the successful completion of the task, end!
Deterministic code Run end-to-end tests against production environments
Deterministic code Task completion, or passing the task to the Rollback Intelligence to review the failure and possibly rollback

This example is based on one of our Humanlayer releases that manages our deployments. Real Open Source Intelligence -- Here's a real conversation I had with it last week:

12-Factor Agents How We Got Here: A Brief History of Software-1

We have not given this intelligence a great deal of tools or tasks. The main value of the Big Language Model is in parsing plain text feedback from humans and suggesting updated courses of action. We isolate tasks and context as much as possible to keep the Big Language Model focused on a small, 5-10 step workflow.

Here's another one. More classic support/chatbot demoThe

So, what exactly is an intelligent body?

Prompt (prompt) - Tells the big language model how to act and what "tools" are available to it. The output of the hint is a JSON object that describes the next step in the workflow ("tool call" or "function call"). (Element 2)
switch statement - Based on the JSON returned by the Big Language Model, decide what to do with it. ( Part of element 8)
Cumulative context - Stores a list of steps that have occurred and their results. (Element 3)
for loop - Before the Big Language Model issues some kind of "terminate" tool call (or plain text response), add the result of the switch statement to the context window and ask the Big Language Model to choose the next step. (Element 8)

12-Factor Agents How We Got Here: A Brief History of Software-1

In the "deploybot" example, we gained several benefits by mastering control flow and context accumulation:

in our switch statement cap (a poem) for loop in which we can hijack the control flow to pause waiting for human input or waiting for a long-running task to complete.
We can easily serialize (textual) context window for pause and resume.
in our Prompt (prompt) In this case, we can strongly optimize the way instructions and "what's happened so far" are delivered to the big language model.

Part II commander-in-chief (military) Formalizing these patternsThey can be used in any software project to add impressive AI functionality without having to adopt the traditional implementation/definition of an "AI Intelligence Body" in its entirety.

May not be reproduced without permission:Chief AI Sharing Circle " 12-Factor Agents How we got here: a brief history of software

12-Factor Agents How we got here: a brief history of software

Detailed version: how we got to this point

You don't have to listen to me.

Intelligent bodies as software, and a brief history of them

60 years ago

20 years ago

10-15 years ago

Future Prospects for Smart Bodies

Intelligentsia as a cycle

Problems with this "round-robin-until-solution" model

Even if the model supports longer and longer context windows, you'll always get better results with short, focused hints and contexts

What really works -- micro-intelligence

Allowing language models to manage explicitly scoped task sets makes it easy to integrate real-time feedback from real people ...... without getting caught in contextual error loops

A real-life example of a micro-intelligence

So, what exactly is an intelligent body?

Related articles

Recommended

Can't find AI tools? Try here!

Recommended Tools

New Releases

12-Factor Agents How we got here: a brief history of software

Detailed version: how we got to this point

You don't have to listen to me.

Intelligent bodies as software, and a brief history of them

60 years ago

20 years ago

10-15 years ago

Future Prospects for Smart Bodies

Intelligentsia as a cycle

Problems with this "round-robin-until-solution" model

Even if the model supports longer and longer context windows, you'll always get better results with short, focused hints and contexts

What really works -- micro-intelligence

Allowing language models to manage explicitly scoped task sets makes it easy to integrate real-time feedback from real people ...... without getting caught in contextual error loops

A real-life example of a micro-intelligence

So, what exactly is an intelligent body?

Related articles

Recommended

Can't find AI tools? Try here!

Recommended Tools

New Releases

Quick query station AI tool