Prompt Has Run Its Course. The Future Belongs to Loop Engineering

Recently, a catchy new term has taken the AI world by storm: Loop Engineering.

If you follow the AI space, you've likely seen it everywhere these days — trending on X, blowing up across social media, and sparking heated discussions in countless groups.

Here's how it all started.

On June 7th, Peter Steinberger, founder of OpenClaw, posted a short tweet that instantly went viral.

[Image: Peter Steinberger's viral tweet about Loop Engineering]

His words went something like this: I no longer write manual prompts for Claude. Instead, I run loops that let Claude orchestrate tasks automatically. My job now is to build these looping systems — simply put, to write loops.

Shortly after, Addy Osmani from Google published an in-depth post, formally defining and organizing the concept of Loop Engineering.

Following Prompt Engineering, Context Engineering and Harness Engineering, the AI industry now has its fourth widely recognized engineering discipline: Loop Engineering.


Personally, I've never been a big fan of coining new buzzwords. That said, I think new terms fall into two categories. Some are just hollow hype, like those so-called "xxx 4.0" releases. Others emerge because the industry evolves so quickly that people need precise language to articulate new ideas. Loop Engineering definitely belongs to the latter. What's more, it aligns closely with how I've been using AI Agents and the practices I've been recommending all along.

If you've read my previous piece on Harness Engineering, you'll likely grasp where I'm coming from. In that article, I talked about three major stages, from Prompt to Context and then to Harness. I used the analogy of reins and harnesses, and emphasized putting constraints in place first. Loop Engineering goes one step beyond Harness Engineering: it turns manual reins for guiding work into a fully automated industrial pipeline. It feels much like advancing through the eras in the game Civilization.


From One-Off Tasks to Automated Loops

Let me walk you through an example. Until recently, coding with Claude Code went like this: you assigned a task, checked the results once it was done, sent revision requests if something was off, and repeated the whole cycle. You sat in front of your device, going back and forth turn by turn. You were essentially the engine driving the entire process. Even after we moved from chatbots to AI Agents, most work still operated on a one-off task basis.

Now take Boris's workflow. He builds loops such as /loop babysit all my PRs. This single command lets Claude Code run autonomously: it automatically checks all pull requests on GitHub, fixes failed CI issues, and deploys dedicated sub-Agents to address new review comments. He also schedules some loops as recurring tasks to run overnight. While he sleeps, thousands of Agents can be working simultaneously. He even mentioned that he hasn't written a single line of code manually in 2026.

This is the power of loops. You set clear goals and enable full automation. You don't need to stay glued to your computer or even check your phone. You can head to bed, and wake up to find code revised, tests completed and pull requests submitted.

This is far more than writing a one-time prompt for an Agent to finish a single task. Instead, you define objectives, set validation rules and outline failure handling protocols within a loop framework — then step back and let the system run on its own.
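The shape described above (an objective, a validation rule and a failure protocol) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the agent and the checker are plain callables, and the names `run_loop`, `act`, `verify` and `max_rounds` are hypothetical, not part of Claude Code or any other tool.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopResult:
    done: bool      # True if the validation rule passed
    rounds: int     # number of rounds actually run
    reason: str     # why the loop stopped


def run_loop(act: Callable[[int], object],
             verify: Callable[[], bool],
             max_rounds: int = 5) -> LoopResult:
    """Run act() until verify() passes or the round budget is spent."""
    for round_no in range(1, max_rounds + 1):
        act(round_no)                 # one unit of agent work (stubbed here)
        if verify():                  # the validation rule
            return LoopResult(True, round_no, "goal met")
    # Failure handling: stop and escalate instead of looping forever.
    return LoopResult(False, max_rounds, "round budget exhausted, escalate to a human")


# Usage with stand-ins: the "agent" clears one failing check per round.
failing = ["test", "lint"]
result = run_loop(act=lambda n: failing.pop() if failing else None,
                  verify=lambda: not failing)
print(result)
```

The round budget matters as much as the success check: without it, a goal that can never be satisfied keeps the loop running indefinitely.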


The Five Core Components of a Loop

In his detailed post, Addy Osmani broke down a complete loop into five core components. I find this framework very straightforward, so I'll explain it in my own words.

[Image: The five core components of Loop Engineering]

First: Scheduled Tasks — the heartbeat of the entire loop. You need a trigger to launch the cycle automatically — it can be time-based scheduling or event-driven activation. Claude Code offers multiple options: the /loop command runs tasks at set intervals; cron handles timed jobs; hooks fire at specific stages of the Agent lifecycle (for instance, auto-running lint checks every time a file is edited). You can also deploy loops to GitHub Actions, so they keep running even after you power off your computer. Without automated scheduling, you have to manually kick off the Agent every single time — that's not a real loop.
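To make the difference between the two trigger styles concrete, here is a toy registry in Python. It is not how /loop, cron or Claude Code hooks are implemented. The class `Triggers`, its methods and the event name `file_edited` are all hypothetical, and the clock is passed in by hand so the behaviour is easy to follow.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Triggers:
    """Toy registry for the two trigger styles: intervals and events."""
    interval_jobs: List[dict] = field(default_factory=list)
    event_jobs: Dict[str, List[Callable[[], None]]] = field(default_factory=dict)

    def every(self, seconds: float, job: Callable[[], None]) -> None:
        self.interval_jobs.append({"every": seconds, "last": None, "job": job})

    def on(self, event: str, job: Callable[[], None]) -> None:
        self.event_jobs.setdefault(event, []).append(job)

    def tick(self, now: float) -> int:
        """Run every interval job that is due at time `now`; return how many ran."""
        ran = 0
        for entry in self.interval_jobs:
            if entry["last"] is None or now - entry["last"] >= entry["every"]:
                entry["job"]()
                entry["last"] = now
                ran += 1
        return ran

    def emit(self, event: str) -> int:
        """Run every job registered for `event`; return how many ran."""
        jobs = self.event_jobs.get(event, [])
        for job in jobs:
            job()
        return len(jobs)


log: List[str] = []
triggers = Triggers()
triggers.every(600, lambda: log.append("check PRs"))      # time-based
triggers.on("file_edited", lambda: log.append("lint"))    # event-driven

triggers.tick(now=0)        # first tick always runs
triggers.tick(now=300)      # not due yet, nothing runs
triggers.tick(now=600)      # due again
triggers.emit("file_edited")
print(log)
```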

Second: Worktree Isolation — when running multiple Agents concurrently, each gets its own independent workspace to operate without interfering with others, with work merged only after completion. Dealing with two Agents editing the same file is just as frustrating as two designers modifying the same layer without coordination.
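One common way to get this isolation is `git worktree`, which checks out an extra branch into its own directory. The sketch below only builds the commands and does not run them. The `agent/<name>` branch naming, the sibling-directory layout and the `/work/app` path are assumptions made for illustration.

```python
from pathlib import Path
from typing import List


def worktree_commands(repo: Path, agent: str, base: str = "main") -> List[List[str]]:
    """Build git commands giving one agent its own branch and directory."""
    branch = f"agent/{agent}"
    path = repo.parent / f"{repo.name}-{agent}"
    return [
        # Create an isolated checkout on a fresh branch off `base`.
        ["git", "-C", str(repo), "worktree", "add", "-b", branch, str(path), base],
        # After the agent finishes and the work is merged, remove the checkout.
        ["git", "-C", str(repo), "worktree", "remove", str(path)],
    ]


for agent in ("fix-ci", "review-comments"):
    add_cmd, remove_cmd = worktree_commands(Path("/work/app"), agent)
    print(" ".join(add_cmd))
```

Each agent then edits files only inside its own directory, and conflicts surface at merge time rather than in the middle of a run.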

Third: Project Knowledge Base — Addy Osmani referred to it simply as skills, but I believe that description falls short. Mere individual skills are not enough — what we truly need is a comprehensive knowledge management system.

As we all know, AI models lose all context once a new conversation starts. All the coding standards, project architectures and past pitfalls you've shared are forgotten, and every new chat begins from scratch. That's why you need a complete set of methods to preserve and refine this knowledge, so Agents fully understand your project from the moment they launch.

Over the past year of coding, I've consolidated my methodologies into what I call cleanliness.skill — this is by far the most frequently used skill for my Agents. The CLAUDE.md file stores global rules and constraints. Cross-session memory keeps track of unresolved issues and document routing. And the docs repository holds all accumulated knowledge and experience.

Why is a solid knowledge management system so critical for loops? Because loops run autonomously without human oversight. If an Agent relies on outdated information, it will make flawed decisions. If CLAUDE.md becomes bloated with hundreds of lines of historical content, core rules get buried and the Agent fails to read them properly. A loop built on messy knowledge is like an employee who reads obsolete documents every morning: the faster it works, the more mistakes it makes.
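A small hygiene check can catch a bloated rules file before a loop starts reading it. The following is a rough sketch, not a feature of any tool: `audit_rules` is a hypothetical helper, and the default budget of 200 lines is an arbitrary assumption that you would tune for your own project.

```python
from typing import List


def audit_rules(text: str, max_lines: int = 200) -> List[str]:
    """Return warnings for a rules file that has grown hard to read."""
    lines = [ln.strip() for ln in text.splitlines() if ln.strip()]
    warnings = []
    if len(lines) > max_lines:
        warnings.append(f"{len(lines)} non-empty lines exceeds budget of {max_lines}")
    seen = set()
    for ln in lines:
        key = ln.lower()
        if key in seen:
            # Repeated rules are a sign that history is piling up in the file.
            warnings.append(f"duplicate rule: {ln}")
        seen.add(key)
    return warnings


sample = "Use tabs\nNo force pushes\n\nuse tabs\n"
for warning in audit_rules(sample, max_lines=2):
    print(warning)
```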

Fourth: The Connector (MCP) — an Agent limited to only accessing the local file system is severely restricted. Once you connect it to GitHub, Lark, databases and other tools, it can operate seamlessly within your real workflow. This is a true closed loop: it identifies problems, delivers solutions and notifies humans — all in one end-to-end process.

Fifth: Sub-Agents — we separate execution from inspection. A coding Agent cannot fairly evaluate its own work, much like students grading their own exam papers — they will always go easy on themselves. For this reason, you need a dedicated second Agent, or even a different AI model, to review the output. One handles the work, the other verifies the results.
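The split between doing and checking can be sketched as two separate callables that never share a role. Everything here is a stand-in: `work_then_review`, `toy_worker` and `toy_reviewer` are hypothetical names, and a real setup would call two Agents (or two different models) instead of local functions.

```python
from typing import Callable, List, Tuple

Worker = Callable[[str, List[str]], str]      # (task, feedback so far) -> draft
Reviewer = Callable[[str], List[str]]         # draft -> list of objections


def work_then_review(task: str, worker: Worker, reviewer: Reviewer,
                     max_rounds: int = 3) -> Tuple[str, List[str]]:
    """Alternate between a worker and an independent reviewer."""
    feedback: List[str] = []
    draft = ""
    for _ in range(max_rounds):
        draft = worker(task, feedback)
        feedback = reviewer(draft)        # the worker never grades itself
        if not feedback:
            break
    return draft, feedback                # non-empty feedback means unresolved


# Stand-ins: the worker only adds tests once the reviewer asks for them.
def toy_worker(task: str, feedback: List[str]) -> str:
    return "patch+tests" if "missing tests" in feedback else "patch"


def toy_reviewer(draft: str) -> List[str]:
    return [] if "tests" in draft else ["missing tests"]


print(work_then_review("fix login bug", toy_worker, toy_reviewer))
```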

These five parts together form the complete framework of a functional loop.


The Soul of Loop Engineering: Goal Definition

Claude Code and Codex both feature a command that embodies the core logic of Loop Engineering in a lightweight, productized form, though many people fail to notice it: the /goal command. In Codex, it is named Pursue Goal. Simply set a clear completion criterion for Claude, such as "all tests pass and no lint errors remain", and it will iterate autonomously until the requirement is met.

Most articles covering Loop Engineering stop right here. They explain the five components, demonstrate /goal and /loop, and show how to configure scheduled tasks — and that's it.

To me, these are merely tech tactics. What I want to dive deeper into is the underlying philosophy.

The true core capability of Loop Engineering is not technical scripting, nor configuring hooks, nor any other hands-on engineering skill. It is the ability to define clear goals. This may sound simple, but it is incredibly difficult in practice.

Let's go back to the /goal command. On the surface, it works straightforwardly: set a completion condition, and Claude keeps working until the standard is satisfied. It sounds easy, right? But anyone who has used it knows the final outcome hinges entirely on how well you define your goal.

Let me illustrate with two contrasting examples.

Goal A: Optimize this application.

Goal B: Ensure all tests in the test/auth directory pass, tsc --noEmit returns zero errors, and npm run lint shows zero violations.

What happens with Goal A? Claude gets stuck in limbo. It has no clear definition of what "optimized" means. Only advanced models like Fable 5 can refine goals on their own. Most mainstream models, including Opus 4.8 and GPT-5.5, lack this ability. They may tweak a little code, deem the work done and stop early. Or they may keep modifying code endlessly and mess up the entire codebase, never knowing when the task is truly finished.

Goal B works perfectly, by contrast. After each round of edits, Claude runs tests, type checks and lint scans. It stops immediately once all three standards are met, and continues working otherwise. Everything is clear and unambiguous.
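What makes Goal B workable is that each clause maps to a command with an exit code. Here is a hedged sketch of that mapping in Python. `tsc --noEmit` and `npm run lint` come straight from the goal, while the test command is an assumption that depends on the project's test runner. The helper names `exit_code` and `unmet` are hypothetical.

```python
import subprocess
from typing import Callable, Dict, List

# Commands for Goal B. The "tests" entry is an assumed invocation;
# replace it with whatever runs the test/auth suite in your project.
GOAL_B: Dict[str, List[str]] = {
    "tests": ["npm", "test", "--", "test/auth"],
    "types": ["tsc", "--noEmit"],
    "lint": ["npm", "run", "lint"],
}


def exit_code(cmd: List[str]) -> int:
    """Run a command and return its exit status (0 means success)."""
    return subprocess.run(cmd, capture_output=True).returncode


def unmet(checks: Dict[str, List[str]],
          run: Callable[[List[str]], int] = exit_code) -> List[str]:
    """Return the names of checks that still fail; an empty list means done."""
    return [name for name, cmd in checks.items() if run(cmd) != 0]


# Dry run with a fake runner so the sketch works without a Node project:
# it pretends that only the lint command fails.
fake_runner = lambda cmd: 1 if "lint" in cmd else 0
print(unmet(GOAL_B, run=fake_runner))
```

Goal A has no equivalent table. There is no command whose exit code says "optimized", which is why the loop cannot tell when to stop.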

We use the same tool and the same model — the only difference is the quality of the goal definition.


Managing Agents Is Just Like Managing People

I've stuck to a personal rule for years: if you perform a task manually three times or more, automate it completely.

I apply this principle to my daily work. All repetitive workflows are fully automated: our AI trend monitoring system, data analysis pipelines, financial reconciliation and data cleaning processes.

During this journey, the biggest roadblocks I faced were never technical. They were vague goals. Early on in automation projects, I often set ambiguous objectives.

Take "automate AI industry trend monitoring" as an example. It sounds reasonable, yet it is far too vague. What exactly counts as a trend? Does it require ten thousand views, or a hundred thousand? How frequently should data be fetched — hourly or daily? How do we assess content quality? What's the sorting logic? And how should we deliver final notifications? I could list over twenty such follow-up questions right now. Without measurable standards for every step, the entire automated pipeline becomes useless.
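Answering those questions amounts to writing them down as explicit parameters. The sketch below shows one way to do that. Every field name and every value (10,000 views, hourly fetching, two items, email delivery) is a made-up example, not a description of the author's actual system.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class TrendSpec:
    min_views: int            # what counts as a trend
    fetch_every_hours: int    # how often data is fetched
    max_items: int            # how many items go into one notification
    channel: str              # where the notification is delivered


def select(posts: List[Dict], spec: TrendSpec) -> List[Dict]:
    """Keep posts at or above the view threshold, most viewed first."""
    hits = [p for p in posts if p["views"] >= spec.min_views]
    hits.sort(key=lambda p: p["views"], reverse=True)
    return hits[: spec.max_items]


spec = TrendSpec(min_views=10_000, fetch_every_hours=1, max_items=2, channel="email")
posts = [
    {"title": "A", "views": 120_000},
    {"title": "B", "views": 9_000},
    {"title": "C", "views": 15_000},
    {"title": "D", "views": 40_000},
]
print([p["title"] for p in select(posts, spec)])
```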

I learned to prioritize goal definition long before building any automation. I spend ample time clarifying what "completion" looks like, and what qualifies as quality work. This is exactly the logic behind /goal, and it is the soul of Loop Engineering.

Interestingly, I picked up this skill not from AI development or coding, but from years of entrepreneurship. Defining goals is essentially the same logic as team management. I run a small company with around thirty employees, so I know this well.

The biggest headache in management is not lazy workers or lack of talent, but unclear goals. When your team receives vague instructions, they feel confused and work aimlessly, and their final deliverables never match your expectations.

Telling an employee "make this feature better" will rarely get you the result you want — your definition of "good" will differ from theirs. But if you specify "reduce this API's response time to under 200 milliseconds, keep the error rate below 0.1%, and launch next Wednesday", the outcome will align closely with your requirements. You have given them verifiable success metrics.

This rule even applies to top-tier experts. Talented people can interpret and refine goals on their own, but they still need clear direction — just with fewer granular details.

What works for human teams works for AI Agents too.

Look back at classic management theories. Peter Drucker's Management by Objectives from the 1950s, Andy Grove's OKRs created at Intel, and all the variations adopted by modern business leaders share one core principle: turn vague intentions into measurable, verifiable completion criteria.

A good manager ensures three things: clear goals, sufficient resources and timely feedback. These three pillars match a well-designed loop perfectly:

  • Clear goals = precise completion rules
  • Sufficient resources = well-configured skills, connectors and permissions for the Agent
  • Timely feedback = independent verification mechanisms to judge performance and point out flaws

Managing Agents follows the exact same logic as managing people — yet it demands even higher standards. Humans can interpret vague requests, take initiative to ask for clarification, and flag ambiguous requirements. AI Agents cannot. They will execute confidently based on their own understanding, and then confidently report that the work is done.

That is why I always dismiss claims like "arts subjects are obsolete" or "pure science is no longer relevant" in the AI era. Disciplines such as management, psychology and organizational behavior have only grown more important.

Loop Engineering may be named an engineering discipline, but at its core, its real competitive advantage lies in management.


Beware of Goodhart's Law

When it comes to goal-setting in management, there is a classic pitfall known as Goodhart's Law:

When a measure becomes a target, it ceases to be a good measure.

In plain terms: people will only focus on hitting the stated metrics and neglect everything else. This is a long-standing issue in human management, and it is amplified a hundredfold with AI Agents, since they are far better at gaming the rules.

A common phenomenon with Loop Engineering is that Agents optimize for the verification rules, rather than your actual objectives. For instance, if your loop's requirement is "all tests pass", the Agent may simply delete the failing test cases instead of fixing the underlying bugs. Technically, all tests now pass and the task is marked complete, yet the real problem remains unsolved.

Humans do the same thing occasionally, but Agents act faster, more thoroughly, and without any moral hesitation.

Therefore, a solid goal definition needs not only completion standards, but also clear boundaries and restrictions. This is where Harness Engineering plays a vital role within Loop Engineering. A harness acts as guardrails: it allows freedom to execute work, but sets non-negotiable limits. A loop provides the driving force to keep the system moving forward. Only when combined do they form a robust, complete system.
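For the deleted-tests case, a boundary rule can be as simple as comparing the set of tests before and after the run. This is a minimal sketch with hypothetical names (`verdict` and the `test_*` identifiers). A real guardrail would collect the test names from the test runner or from version control.

```python
from typing import List, Set


def verdict(tests_pass: bool, tests_before: Set[str], tests_after: Set[str]) -> List[str]:
    """Return reasons to reject a result; an empty list means accept."""
    problems = []
    if not tests_pass:
        problems.append("tests failing")
    removed = sorted(tests_before - tests_after)
    if removed:
        # Boundary rule: a passing suite does not count if tests disappeared.
        problems.append("tests removed: " + ", ".join(removed))
    return problems


before = {"test_login", "test_logout", "test_refresh"}
after = {"test_login", "test_logout"}          # the agent deleted a failing test
print(verdict(True, before, after))
```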


A Practical Goal-Setting Framework

We've covered the structural framework, core philosophy and potential pitfalls of Loop Engineering. To wrap up, I'd like to share a practical goal-setting framework I've developed from real-world experience. It is not academically rigorous, but it works well for me:

  1. All completion standards must be machine-verifiable.
  2. Define boundary rules alongside success criteria.
  3. Build fallback plans for failures.
  4. Structure goals in multiple layers.
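The list above does not spell out how the four principles fit together, so here is one possible reading, sketched as a small data structure. The classes `Layer` and `Goal`, the status strings and the example checks are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Layer:
    name: str
    check: Callable[[], bool]          # principle 1: machine-verifiable


@dataclass
class Goal:
    layers: List[Layer]                # principle 4: ordered, must-haves first
    boundaries: List[Layer] = field(default_factory=list)   # principle 2
    max_attempts: int = 3              # principle 3: budget before falling back

    def evaluate(self, attempt: int) -> str:
        broken = [b.name for b in self.boundaries if not b.check()]
        if broken:
            return "rollback: boundary violated (" + ", ".join(broken) + ")"
        for layer in self.layers:
            if not layer.check():
                if attempt >= self.max_attempts:
                    return f"fallback: stuck on '{layer.name}', notify a human"
                return f"continue: working on '{layer.name}'"
        return "done"


state = {"builds": True, "tests": False, "diff_small": True}
goal = Goal(
    layers=[Layer("builds", lambda: state["builds"]),
            Layer("tests pass", lambda: state["tests"])],
    boundaries=[Layer("diff stays small", lambda: state["diff_small"])],
)
print(goal.evaluate(attempt=1))
print(goal.evaluate(attempt=3))
state["tests"] = True
print(goal.evaluate(attempt=3))
```

Boundaries are checked before progress, so a result that satisfies every layer by breaking a rule is rejected rather than accepted.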


The Four Stages of Engineering Evolution

Let's recap the four evolutionary stages: Prompt, Context, Harness and finally Loop Engineering. Together they tell one continuous story:

  • Prompt Engineering: Master clear language to help AI understand you. Core skill: verbal expression.
  • Context Engineering: Provide sufficient background information beyond simple prompts. Core skill: information screening and organization.
  • Harness Engineering: Set rules and boundaries for AI. Core skill: system design and rule-making.
  • Loop Engineering: Build self-running automated systems. Core skill: goal definition and management.

These four engineering branches correspond to four timeless disciplines: linguistics, information science, cybernetics and management.

It's fascinating to see how little human society truly changes, even amid technological revolutions.
