Assistants, Workflows, and Agents: Designing for the Right Level of Autonomy

"Agent" has become one of the most overloaded terms in AI. Product teams use it to describe everything from a chat interface with retrieval to a long-running process that can plan, call tools, and take actions on its own. That vocabulary drift creates a practical problem: teams start arguing about labels when the real design question is architectural.

For engineers and technical decision-makers, the useful question is not whether a system "is an agent." The useful question is: who controls the next step? In some systems, the next step is fixed in code. In others, the model can choose among tools, decide what information to gather next, and determine when to stop or escalate. Those are different control patterns, with different reliability, cost, and risk profiles.

This is why the common assistant-versus-agent binary is too shallow. It collapses several distinct design decisions into a single marketing term. A better framing is to distinguish assistants, workflows, and agents as architectural patterns, then place them on an autonomy spectrum defined by permissions, approval boundaries, and stop conditions. The focus in this post is those control boundaries, not the runtime mechanics of loops. Loop mechanics come in the next post.

Post 4 separated four information layers: conversation memory, task state, working memory, and durable knowledge. That vocabulary matters here because autonomy design depends directly on what information the system can read, write, and treat as evidence. A system that collapses those layers will make poor autonomy decisions regardless of how it is labeled.

Diagram: The right autonomy design starts by separating user-driven assistance, code-directed workflows, and model-directed agents, then placing approval boundaries before action.

What This Post Is Not

This is not a post about runtime loop mechanics, tool contracts, or failure recovery. Those operational details belong to Post 6. It is also not a governance or auditability post; the full production standard for approval gates and replayable traces comes in Post 10. This post owns the conceptual framing: what autonomy means as a design spectrum and where approval boundaries should sit.

Why Agent Terminology Gets Confusing

Part of the confusion comes from mixing product language with system design language. "Assistant" is often a user-facing metaphor. It suggests a system that helps a person complete a task, usually through conversation. That can describe many architectures: a single-turn chat application, a retrieval-backed question-answering system, a deterministic workflow wrapped in a chat UI, or a bounded agent that plans multiple steps before responding.

"Agent," by contrast, is often used as if it were a clean technical category. In practice, it is not. Different teams use it to mean:

any system that can call tools
any system that can complete a multi-step task
any system that behaves proactively
any system where the model helps decide the process rather than only generating a final answer

Those definitions overlap, but they are not identical. A system can call one tool without being meaningfully agentic. A system can execute multiple steps without deciding any of them itself. A proactive system can still be tightly constrained by code. If the terminology is loose, design reviews become loose too.

The corrective is to stop treating assistant and agent as mutually exclusive product categories. Instead, treat them as shorthand around deeper control questions:

Is the execution path predetermined or model-directed?
Can the model choose the next action, or only fill in a step?
What tools can it use?
What state can it read and write?
Where must a human approve, interrupt, or stop execution?

Those questions are much more stable than the labels.

Assistants, Workflows, and Agents as Architectural Patterns

The cleanest way to separate these ideas is to define them in terms of control logic rather than personality.

Assistant

An assistant is best understood as an interaction pattern, not a strict system class. It is a system designed to help a user perform a task, often through natural language interaction. The important point is that an assistant can be implemented with very different levels of autonomy.

An assistant might be:

a single model call that drafts a reply
a retrieval-backed interface that cites internal documents
a deterministic orchestration that always runs the same sequence of steps
a bounded agent that can gather evidence and propose next actions before asking for approval

So "assistant" tells you something about the product experience, but not enough about the internal architecture.

Workflow

A workflow is a predefined process. The control logic lives primarily in code, with the model used inside bounded steps. The model might classify, summarize, extract, rank, or draft, but the sequence of operations is largely fixed in advance.

Examples include:

retrieve documents, extract relevant tables, normalize units, then synthesize a summary
classify a request, route it to one of three approved flows, then generate a response
ingest a PDF, run layout extraction, validate required fields, then write structured output

Workflows are often the right default because they are easier to test, easier to reason about, and easier to constrain. Their weakness is flexibility. If the task is irregular or the correct sequence depends heavily on what the system discovers along the way, a rigid workflow can become brittle or expensive to maintain.

Agent

An agent is a system in which the model has meaningful control over part of the execution process. That usually means the model can decide which action to take next, which tool to call, what information to gather, whether to revise the plan, and when to escalate or stop within defined boundaries.

This does not require open-ended autonomy. It does not imply human-level judgment. It does not mean the system is free to do anything it wants. In production, useful agents are usually bounded systems:

they operate within a defined task scope
they can access only approved tools
they run under explicit policies
they face approval gates before high-impact actions
they terminate under fixed stop conditions

What makes a system more agentic is not that it feels more intelligent. It is that more of the control logic has been delegated from deterministic code to model-driven decision-making.

A More Useful Distinction: Fixed Process vs Model-Directed Process

This framing helps separate three situations that are often blurred together.

In a fixed workflow, code decides the next step. The model may perform subtasks, but it does not direct the process.

In an assistive system, the user may remain the primary driver. The system helps with retrieval, drafting, and analysis, but the user chooses each next move.

In an agentic system, the model can direct at least part of the process. It may decide to retrieve more evidence, call a parser, compare conflicting results, ask a clarifying question, or stop because confidence is too low or approval is required.

That distinction matters because reliability changes when the model controls process rather than only output. Once the model can choose actions, error can compound across steps. A mistaken retrieval choice can poison later synthesis. An unnecessary tool call can add latency and cost. A missing stop condition can turn a useful loop into a runaway one.

The architectural shift is real, but it should be described precisely. This is not a separate kind of intelligence. It is a different placement of control.

Autonomy Is a Spectrum

The next mistake teams make is treating autonomy as binary. Either the system is "just an assistant" or it is "an autonomous agent." That framing hides the decisions that actually matter.

Autonomy is better understood as a spectrum. One practical version looks like this:

1. Fixed Pipeline

The system follows a hard-coded path. It may use models internally, but the model does not choose what happens next.

Typical use cases:

document ingestion
structured extraction
fixed compliance checks
repeatable back-office processing

2. Routed Workflow

The system still uses predefined flows, but code or a classifier selects which flow to run. This adds some flexibility without giving the model broad control over execution.

Typical use cases:

routing support tickets by intent
choosing between extraction templates
selecting among approved answer modes
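A routed workflow can be sketched as a lookup from a classifier label to a predefined flow. The following is a minimal illustration, not a reference implementation: the intent labels, the keyword-based `classify_intent` stand-in, and the flow functions are all hypothetical. In a real system a model or trained classifier would likely produce the label, but the set of flows it can select from stays fixed in code.

```python
from typing import Callable, Dict

# Hypothetical predefined flows; each one is a fixed sequence owned by code.
def refund_flow(ticket: str) -> str:
    return f"refund-flow handled: {ticket}"

def booking_change_flow(ticket: str) -> str:
    return f"booking-change-flow handled: {ticket}"

def general_flow(ticket: str) -> str:
    return f"general-flow handled: {ticket}"

# The routing table is the full set of paths the system can ever take.
FLOWS: Dict[str, Callable[[str], str]] = {
    "refund": refund_flow,
    "booking_change": booking_change_flow,
    "general": general_flow,
}

def classify_intent(ticket: str) -> str:
    """Stand-in for a classifier or model call (assumption: keyword rules)."""
    text = ticket.lower()
    if "refund" in text:
        return "refund"
    if "change" in text or "reschedule" in text:
        return "booking_change"
    return "general"

def route(ticket: str) -> str:
    # The classifier only picks a label; an unknown label falls back to a default flow.
    label = classify_intent(ticket)
    flow = FLOWS.get(label, general_flow)
    return flow(ticket)
```

The design point is that the classifier influences which path runs but cannot invent a new one. That is what keeps this level below a bounded agent on the spectrum.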

3. Assistive System

The system can help with drafting, retrieval, and synthesis, but the user remains in charge of the sequence. The model is useful, but the human still controls the process.

Typical use cases:

research copilots
coding assistants that suggest changes but do not apply them
internal analytics assistants that assemble evidence for a human reviewer

4. Bounded Agent

The system can plan or revise subtasks, call tools, and choose what to do next within a constrained environment. It can complete portions of a task with limited supervision, but important actions remain gated.

Typical use cases:

investigation workflows
research memo preparation
debugging or analysis loops
multi-step data gathering under approval boundaries

5. Long-Running Autonomous System

The system can continue operating across longer horizons with limited direct supervision, potentially monitoring state and taking repeated actions. This is the highest-risk end of the spectrum and the hardest to validate.

Typical use cases:

narrow operational automation in highly controlled domains
internal process automation with extensive monitoring and rollback

Most teams do not need to jump to the far end of this spectrum. In fact, many should not. Increasing autonomy can improve flexibility, but it also raises the burden of evaluation, observability, permissions design, and failure handling.
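One way to make the spectrum usable in design reviews is to give each level an ordered value, so that policies can be written as comparisons. The sketch below assumes the five levels named above; the enum name and the `model_directs_process` helper are illustrative, and the cutoff at the bounded-agent level reflects this post's framing rather than any standard.

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """The five levels described above, ordered by delegated control."""
    FIXED_PIPELINE = 1
    ROUTED_WORKFLOW = 2
    ASSISTIVE_SYSTEM = 3
    BOUNDED_AGENT = 4
    LONG_RUNNING_AUTONOMOUS = 5

def model_directs_process(level: AutonomyLevel) -> bool:
    """True when the model, rather than code or the user, chooses some next steps."""
    return level >= AutonomyLevel.BOUNDED_AGENT

# Example: a review rule that asks for step-level evaluation only where
# the model controls part of the process.
for level in AutonomyLevel:
    needs_step_level_eval = model_directs_process(level)
    print(f"{level.name}: step-level evaluation required = {needs_step_level_eval}")
```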

Approval Boundaries Matter More Than Branding

Once you think in terms of autonomy levels, the next design question is where to place approval boundaries. This is where many "agent" conversations become concrete.

Use "approval boundary" as the broad design term: it marks where the system may prepare an output but may not advance on its own. An "approval gate" is the concrete stop point where that boundary is enforced in a workflow.

For any system with model-directed behavior, you should be able to answer four questions clearly:

What can the model decide on its own?
What tools can it call?
What state can it read or modify?
What actions require human approval?

Those boundaries are not documentation details. They are core parts of the architecture.
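Treating those answers as architecture suggests writing them down as data that the runtime can check, not as prose in a design document. The following is a rough sketch under that assumption. The `AutonomyPolicy` class and every decision, tool, state, and action name in the example are hypothetical.

```python
from dataclasses import dataclass
from typing import FrozenSet

@dataclass(frozen=True)
class AutonomyPolicy:
    """One field per question; readable and writable state are split."""
    model_decisions: FrozenSet[str]    # what the model can decide on its own
    allowed_tools: FrozenSet[str]      # what tools it can call
    readable_state: FrozenSet[str]     # what state it can read
    writable_state: FrozenSet[str]     # what state it can modify
    approval_required: FrozenSet[str]  # what actions require human approval

    def can_call(self, tool: str) -> bool:
        return tool in self.allowed_tools

    def can_write(self, state_key: str) -> bool:
        return state_key in self.writable_state

    def needs_approval(self, action: str) -> bool:
        return action in self.approval_required

# Illustrative policy; all names are made up for this example.
policy = AutonomyPolicy(
    model_decisions=frozenset({"choose_next_lookup", "ask_clarifying_question"}),
    allowed_tools=frozenset({"search_contracts", "extract_rates"}),
    readable_state=frozenset({"task_state", "durable_knowledge"}),
    writable_state=frozenset({"task_state"}),
    approval_required=frozenset({"confirm_booking", "authorize_payment"}),
)
```

Because the policy is frozen, the model-directed part of the system can read it but cannot widen its own permissions at runtime.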

Consider the difference between these two policies:

the model may draft a booking recommendation
the model may confirm and pay for the booking

The first is a bounded analysis task. The second crosses into operational control and carries a much higher consequence if the system is wrong. The difference is not subtle, and it should not be hidden behind the same agent label.

Approval gates are especially important as implementation points when the system can:

write to durable records
contact customers or external partners
trigger code changes in production
spend money
launch physical processes
change system configuration

In those cases, a human review step is often not a temporary compromise. It is the correct design.
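As a concrete illustration, an approval gate can be a small wrapper that refuses to run a high-impact action unless a human approver says yes. This is a minimal sketch: the action kinds mirror the list above, but their names, the `ProposedAction` type, and the `approver` callback are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

# Hypothetical action kinds corresponding to the high-impact cases listed above.
HIGH_IMPACT = frozenset({
    "write_durable_record",
    "contact_external_party",
    "deploy_code_change",
    "spend_money",
    "launch_physical_process",
    "change_configuration",
})

@dataclass(frozen=True)
class ProposedAction:
    kind: str
    description: str

def execute_with_gate(
    action: ProposedAction,
    run: Callable[[ProposedAction], str],
    approver: Optional[Callable[[ProposedAction], bool]] = None,
) -> str:
    """Run the action, but stop at the gate if it is high-impact and unapproved."""
    if action.kind in HIGH_IMPACT:
        # No approver configured, or the approver declined: do not execute.
        if approver is None or not approver(action):
            return f"blocked: '{action.kind}' is awaiting human approval"
    return run(action)
```

The gate sits in code, outside the model's control, so a model that proposes a payment still cannot complete it without the approver's decision.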

Stop conditions matter too, but we only need the concept here: once the model controls part of the process, the system must know when it should stop, escalate, or wait at an approval boundary. The operational details of retries, stops, and escalation paths belong in the next post.
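At the concept level, a stop condition is a decision made before each step about whether the system continues, stops, escalates, or waits for approval. The sketch below shows that decision and nothing more. The step budget, the confidence threshold, and the order of the checks are assumptions chosen for illustration, not recommendations.

```python
from enum import Enum

class Decision(Enum):
    CONTINUE = "continue"
    STOP = "stop"
    ESCALATE = "escalate"
    WAIT_FOR_APPROVAL = "wait_for_approval"

def next_decision(
    steps_taken: int,
    max_steps: int,
    goal_met: bool,
    pending_high_impact_action: bool,
    confidence: float,
    min_confidence: float = 0.5,  # hypothetical threshold
) -> Decision:
    """Decide what the system should do before taking another step."""
    if goal_met:
        return Decision.STOP
    if pending_high_impact_action:
        return Decision.WAIT_FOR_APPROVAL
    if steps_taken >= max_steps:
        # Step budget exhausted without reaching the goal: hand off to a human.
        return Decision.ESCALATE
    if confidence < min_confidence:
        return Decision.ESCALATE
    return Decision.CONTINUE
```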

Running Example: An OptiVerse Travel Copilot With Approval Gates

Consider the OptiVerse Travel copilot used throughout this series. A travel consultant asks:

Prepare the Kyoto hotel segment for trip JPN-2026-0417: find accessible rooms during cherry blossom season within the client's budget, and recommend whether Contract KYO-H12 is the best option.

That request can be implemented in several ways.

Version 1: Assistive

The system retrieves partner hotel contracts, accessibility specification sheets, seasonal rate tables, and past client reviews for Kyoto properties. It surfaces three candidate hotels with pricing and accessibility details. The consultant then decides what to ask next: narrow the search by price, inspect a specific hotel's accessibility record, or request a recommendation.

This is assistive because the system helps substantially, but the human controls each step in the process.

Version 2: Workflow

The system runs a fixed path:

search partner hotels for accessible rooms in Kyoto during the requested dates
filter results by wheelchair accessibility requirements
extract pricing from seasonal rate tables and normalize to USD
rank options by budget fit and accessibility rating
present a comparison summary for the consultant to review

This is a workflow because the sequence is predetermined. It may be reliable and efficient if the task shape is stable. But if availability has changed or a hotel's accessibility claims are inconsistent with its actual facilities, the system may have no good way to adapt except through more hard-coded branches.
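The five steps above can be sketched as a single function whose sequence never changes. This is an illustration under stated assumptions: the hotel data is made up, the exchange rate is a fixed placeholder rather than a real quote, the 1-to-5 accessibility scale is hypothetical, and "budget fit" is treated as a hard cutoff.

```python
from dataclasses import dataclass
from typing import List

@dataclass(frozen=True)
class HotelOffer:
    name: str
    wheelchair_accessible: bool
    nightly_rate_jpy: int
    accessibility_rating: int  # hypothetical 1-5 scale

JPY_PER_USD = 150.0  # illustrative fixed rate, not a real quote

def search_partner_hotels() -> List[HotelOffer]:
    """Stand-in for a partner inventory query; the data is invented."""
    return [
        HotelOffer("Hotel A", True, 30000, 4),
        HotelOffer("Hotel B", False, 18000, 2),
        HotelOffer("Hotel C", True, 45000, 5),
        HotelOffer("Hotel D", True, 27000, 3),
    ]

def run_workflow(budget_usd: float) -> List[str]:
    # Step 1: search partner hotels.
    offers = search_partner_hotels()
    # Step 2: filter by wheelchair accessibility.
    accessible = [o for o in offers if o.wheelchair_accessible]
    # Step 3: normalize pricing to USD.
    priced = [(o, o.nightly_rate_jpy / JPY_PER_USD) for o in accessible]
    # Step 4: keep options within budget, then rank by rating and price.
    within_budget = [(o, usd) for o, usd in priced if usd <= budget_usd]
    ranked = sorted(within_budget, key=lambda p: (-p[0].accessibility_rating, p[1]))
    # Step 5: present a comparison summary for the consultant to review.
    return [
        f"{o.name}: ${usd:.2f}/night, accessibility {o.accessibility_rating}/5"
        for o, usd in ranked
    ]

print(run_workflow(budget_usd=250.0))
# ['Hotel A: $200.00/night, accessibility 4/5', 'Hotel D: $180.00/night, accessibility 3/5']
```

Nothing in this function can notice that a hotel's accessibility claim conflicts with another record. Handling that case means adding another hard-coded branch, which is the rigidity described above.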

Version 3: Bounded Agent

The system receives the same goal, but now it can decide some of the process for itself. It might:

start with retrieval across partner contracts, availability feeds, and accessibility records
notice that one hotel lists a roll-in shower in its contract but the accessibility specification sheet describes only a tub with grab bars
verify JR Pass coverage for planned Shinkansen segments: the Nozomi is not included in the base pass (a supplementary ticket has been available since October 2023), so the system flags this as a cost and availability detail for the consultant to confirm
call a rate-extraction tool on Contract KYO-H12 to pull the seasonal pricing appendix
compare those rates against two alternative partner properties
identify the discrepancy between the hotel's accessibility claims and its actual room specifications
ask the consultant for approval before widening the search to non-partner hotels
produce a recommendation with the accessibility conflict clearly flagged

This is more agentic because the model can choose what to do next within a constrained task. But the boundaries remain firm:

it may gather evidence across contracts and availability feeds
it may draft the recommendation
it may suggest booking a specific hotel
it may not confirm any booking
it may not modify the client's trip file or itinerary
it may not authorize any payment or deposit

That distinction also sets up a simple governance ladder that will recur later in the series: a system may describe, it may recommend, or it may act. Those are different authority levels and should not be collapsed together.

That is the practical meaning of bounded autonomy. The system is useful precisely because it has some discretion, not because it has unlimited authority.
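The describe, recommend, act ladder maps naturally onto an ordered permission check. The sketch below encodes the six boundaries listed above for the travel copilot. The action names and the mapping of each action to an authority level are assumptions made for illustration, and unknown actions are denied by default.

```python
from enum import IntEnum

class Authority(IntEnum):
    DESCRIBE = 1
    RECOMMEND = 2
    ACT = 3

# Authority each action needs (hypothetical names based on the boundaries above).
REQUIRED_AUTHORITY = {
    "gather_evidence": Authority.DESCRIBE,
    "draft_recommendation": Authority.RECOMMEND,
    "suggest_booking": Authority.RECOMMEND,
    "confirm_booking": Authority.ACT,
    "modify_trip_file": Authority.ACT,
    "authorize_payment": Authority.ACT,
}

# The bounded copilot may describe and recommend, but never act.
COPILOT_CEILING = Authority.RECOMMEND

def is_permitted(action: str, ceiling: Authority = COPILOT_CEILING) -> bool:
    """Allow an action only if it is known and within the authority ceiling."""
    required = REQUIRED_AUTHORITY.get(action)
    if required is None:
        return False  # deny by default: unlisted actions are never allowed
    return required <= ceiling

print(is_permitted("suggest_booking"))   # True
print(is_permitted("confirm_booking"))   # False
```

Raising the ceiling to `Authority.ACT` would be a deliberate, reviewable change to one constant, which keeps the authority decision visible in code review.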

When Not to Use an Agent

One of the most important design decisions is recognizing when agentic behavior is unnecessary.

Do not use an agent when the task is stable enough to encode directly. If the sequence is known, repeatable, and easy to validate, a workflow is usually better.

Do not use an agent when errors are high-impact and the benefits of flexible planning are small. A deterministic process with explicit checks often wins in domains involving compliance, finance, safety, or production operations.

Do not use an agent when the available tools are weak, underspecified, or difficult to observe. Giving a model control over poor tools does not create useful autonomy. It creates opaque failure.

Do not use an agent when your team cannot yet evaluate multi-step behavior. Once a model directs process, single-turn prompt evaluation is not enough. You need step-level tracing, tool-call inspection, and tests for termination and escalation behavior.

Do not use an agent because the interface is conversational. A chat box does not imply that the underlying system should plan or act autonomously. Many excellent products should remain mostly deterministic behind a natural-language surface.

In practice, teams often reach for agents too early because the label sounds advanced. The better progression is usually the reverse:

start with a workflow
identify where rigidity hurts task performance
add bounded model-directed decision points only where they earn their complexity

That progression tends to produce systems that are easier to ship and easier to trust.

Common Misconceptions

Several misconceptions repeatedly distort design choices.

"Any system with a tool is an agent." No. Tool use alone is not enough. A workflow can call tools in a fixed sequence without giving the model meaningful control over process.

"Agent means fully autonomous." No. Many useful agents are tightly bounded and require approval for the most important actions.

"Assistant means simple chatbot." No. An assistant can sit on top of anything from a single model call to a complex orchestration layer.

"More autonomy means more intelligence." No. It means more delegated control. Whether that improves outcomes depends on task structure, tool quality, and control design.

"If the model can plan, humans should leave the loop." In high-consequence systems, that is often exactly the wrong conclusion. Planning ability does not remove the need for approval boundaries.

The Real Design Question

When teams say they want an agent, they are usually asking for one of three things:

the ability to handle irregular multi-step tasks
the ability to choose tools or evidence dynamically
the ability to reduce how much process control the human must supply

Those are legitimate goals. But each one should be translated into an architectural decision about control placement, permissions, and stopping rules. The most useful question in design review remains the simplest one:

Who controls the next step?

If the answer is always code, you have a workflow. If the answer is always the user, you have an assistive interaction pattern. If the answer is sometimes the model, within explicit boundaries, you are entering agentic design.
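That rule is simple enough to apply mechanically to a trace of who chose each step. The following is a small sketch, assuming each step in a design or execution trace is tagged with one of three controller labels. The labels, the function name, and the "mixed" result for traces that combine code and user control are illustrative additions, not part of the rule itself.

```python
from typing import Iterable

VALID_CONTROLLERS = {"code", "user", "model"}

def control_pattern(step_controllers: Iterable[str]) -> str:
    """Classify a system by who chose each step in a trace."""
    controllers = set(step_controllers)
    if not controllers:
        raise ValueError("at least one step is required")
    unknown = controllers - VALID_CONTROLLERS
    if unknown:
        raise ValueError(f"unknown controllers: {sorted(unknown)}")
    if "model" in controllers:
        return "agentic"    # the model chooses the next step at least sometimes
    if controllers == {"code"}:
        return "workflow"   # code always chooses the next step
    if controllers == {"user"}:
        return "assistive"  # the user always chooses the next step
    return "mixed"          # assumption: code and user share control, no model

print(control_pattern(["code", "code", "code"]))    # workflow
print(control_pattern(["user", "user"]))            # assistive
print(control_pattern(["code", "model", "user"]))   # agentic
```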

That framing is more useful than arguing over whether a product deserves an agent label.

Bridge to Agent Loops in Practice

Once you decide that a task needs bounded model-directed control, the next problem is operational: how should that loop actually run?

That is the subject of the next post. We will move from terminology to mechanics: reason-act-observe loops, tool calling, planner-executor patterns, tool failures, and the controls needed to keep a bounded agent useful instead of erratic. The goal is not to make systems look more autonomous. It is to make multi-step tool use legible, testable, and safe enough to deploy.

Source Notes

This post draws on the following primary and practitioner sources:

Anthropic. "Building effective agents." Practitioner reference for distinguishing workflows from agents, starting with simple patterns, and using bounded autonomy. anthropic.com/engineering/building-effective-agents

BAIR. "The Shift from Models to Compound AI Systems." Reference for control logic in code versus control logic delegated to the model. bair.berkeley.edu/blog/2024/02/18/compound-ai-systems

Schick, T., Dwivedi-Yu, J., Dessi, R., et al. "Toolformer: Language Models Can Teach Themselves to Use Tools." Reference for model-directed tool selection as a capability. arxiv.org/abs/2302.04761

Yao, S., Zhao, J., Yu, D., et al. "ReAct: Synergizing Reasoning and Acting in Language Models." Reference for reason-act-observe loops, used here as a bridge to the next post. arxiv.org/abs/2210.03629

National Institute of Standards and Technology. "Artificial Intelligence Risk Management Framework: Generative Artificial Intelligence Profile." Reference for oversight, guardrails, and risk framing as autonomy increases. nist.gov/publications/artificial-intelligence-risk-management-framework-generative-artificial-intelligence
