Flows: Declarative Multi-Agent Coordination
Proposal — not yet implemented
This is a design document, not a shipped feature. None of the shoal flow commands described here exist yet. For the current multi-agent coordination primitives see Agent Teams and Robo Supervisor.
Date: 2026-03-31 Status: Design (not yet implemented)
Summary
Flows are a declarative, TOML-defined way to launch and manage a group of related sessions as a single unit. A flow is to Shoal sessions what docker-compose.yml is to containers: a named topology of lanes (sessions) with optional dependency ordering, shared context, and group-level operations.
Flows compose — they don't replace — templates, modes, tags, worktrees, robo supervision, and lifecycle hooks.
1. Problem Statement
Shoal has strong primitives for individual sessions: templates, modes, tags, worktrees, MCP sharing, handoff artifacts, and lifecycle events. Multi-agent coordination works today through implicit conventions:
- Naming (feat/auth, review/auth) implies relationships
- Tags (implementer, review-ready) imply roles
- parent_id tracks forks but nothing broader
- Robo supervision discovers workers by polling status, not by team membership
- Batch MCP ops require enumerating session names manually
This means:
- No declarative topology: every multi-agent run is hand-assembled via sequential shoal new calls
- No group identity: "these 4 sessions are one effort" exists only in the operator's head
- No dependency sequencing: launching a reviewer only after the implementer finishes requires manual timing
- No group operations: killing a multi-session effort means N individual kills
- No automatic handoff flow: context from upstream sessions doesn't auto-propagate downstream
2. Design
2.1 Flow Spec (TOML)
[flow]
name = "auth-rework"
description = "Rework auth middleware for compliance"
[[flow.lanes]]
role = "planner"
template = "claude-dev"
mode = "planner"
branch = "plan/auth-rework"
prompt = "Review the auth middleware and produce a compliance rework plan"
[[flow.lanes]]
role = "impl-tokens"
template = "pi-dev"
mode = "implementer"
depends_on = "planner"
branch = "impl/auth-tokens"
prompt = "Implement token storage changes per the plan"
[[flow.lanes]]
role = "impl-sessions"
template = "pi-dev"
mode = "implementer"
depends_on = "planner"
branch = "impl/auth-sessions"
[[flow.lanes]]
role = "reviewer"
template = "claude-review"
mode = "reviewer"
depends_on = ["impl-tokens", "impl-sessions"]
branch = "review/auth-rework"
Flow-level fields:
| Field | Type | Required | Description |
|---|---|---|---|
| name | str | yes | Unique flow identifier |
| description | str | no | Human-readable purpose |
Lane fields:
| Field | Type | Required | Description |
|---|---|---|---|
| role | str | yes | Unique within flow; becomes session name suffix |
| template | str | yes | Session template to use |
| mode | str | no | Operating mode override |
| branch | str | no | Git branch name |
| prompt | str | no | Initial prompt sent after agent launch |
| depends_on | str or list[str] | no | Role name(s) that must reach idle or stopped before this lane launches |
| tags | list[str] | no | Additional tags (merged with template + mode tags) |
| env | table | no | Extra env vars for this lane |
Session naming: {flow.name}/{lane.role} — e.g., auth-rework/planner, auth-rework/impl-tokens.
2.2 Discovery & Storage
Flow specs live in:
1. ~/.config/shoal/flows/*.toml (user-global)
2. .shoal/flows/*.toml (project-local, git root)
3. Inline via shoal flow start --file path/to/flow.toml
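The directory lookup could be sketched as follows. Two things here are assumptions rather than part of the design: a spec is keyed by its file stem, and a project-local spec overrides a user-global one with the same name (the proposal does not state a precedence). The function name `discover_flow_specs` and the `docs-sweep` spec are hypothetical.

```python
import tempfile
from pathlib import Path


def discover_flow_specs(search_dirs: list[Path]) -> dict[str, Path]:
    """Map spec name (file stem) to path; later directories override earlier ones."""
    found: dict[str, Path] = {}
    for directory in search_dirs:
        if not directory.is_dir():
            continue  # a missing location is simply skipped
        for path in sorted(directory.glob("*.toml")):
            found[path.stem] = path
    return found


# Demo with throwaway directories standing in for the two real locations.
with tempfile.TemporaryDirectory() as tmp:
    user_dir = Path(tmp) / "user"        # stands in for ~/.config/shoal/flows
    project_dir = Path(tmp) / "project"  # stands in for .shoal/flows
    user_dir.mkdir()
    project_dir.mkdir()
    (user_dir / "auth-rework.toml").write_text("")
    (user_dir / "docs-sweep.toml").write_text("")
    (project_dir / "auth-rework.toml").write_text("")

    specs = discover_flow_specs([user_dir, project_dir])
    resolved = {name: path.parent.name for name, path in specs.items()}

print(resolved)  # {'auth-rework': 'project', 'docs-sweep': 'user'}
```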
Flow state is stored in SQLite:
CREATE TABLE flows (
id TEXT PRIMARY KEY,
name TEXT UNIQUE NOT NULL,
spec_path TEXT,
status TEXT NOT NULL DEFAULT 'active', -- active, completed, killed
created_at TEXT NOT NULL,
completed_at TEXT
);
Sessions gain a nullable flow_id foreign key and a flow_role column:
ALTER TABLE sessions ADD COLUMN flow_id TEXT REFERENCES flows(id);
ALTER TABLE sessions ADD COLUMN flow_role TEXT; -- lane role within the flow
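To see how the schema supports group queries, here is a small sketch using Python's `sqlite3` against an in-memory database. The three-column `sessions` table is a stand-in (the real table has more columns), and the row values are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked
conn.executescript("""
CREATE TABLE sessions (          -- stand-in for Shoal's real sessions table
    id TEXT PRIMARY KEY,
    name TEXT NOT NULL,
    status TEXT NOT NULL
);
CREATE TABLE flows (
    id TEXT PRIMARY KEY,
    name TEXT UNIQUE NOT NULL,
    spec_path TEXT,
    status TEXT NOT NULL DEFAULT 'active',
    created_at TEXT NOT NULL,
    completed_at TEXT
);
ALTER TABLE sessions ADD COLUMN flow_id TEXT REFERENCES flows(id);
ALTER TABLE sessions ADD COLUMN flow_role TEXT;
""")

conn.execute(
    "INSERT INTO flows (id, name, created_at) VALUES (?, ?, ?)",
    ("f1", "auth-rework", "2026-03-31T00:00:00Z"),
)
conn.executemany(
    "INSERT INTO sessions (id, name, status, flow_id, flow_role) VALUES (?, ?, ?, ?, ?)",
    [
        ("s1", "auth-rework/planner", "idle", "f1", "planner"),
        ("s2", "auth-rework/impl-tokens", "running", "f1", "impl-tokens"),
        ("s3", "scratch", "running", None, None),  # ad-hoc session, no flow
    ],
)

# All lanes of one flow, by flow name: the basis for status and kill operations.
lanes = conn.execute(
    "SELECT s.flow_role, s.status FROM sessions s "
    "JOIN flows f ON f.id = s.flow_id WHERE f.name = ? ORDER BY s.flow_role",
    ("auth-rework",),
).fetchall()
print(lanes)  # [('impl-tokens', 'running'), ('planner', 'idle')]

# A session pointing at a flow that does not exist is rejected.
try:
    conn.execute(
        "INSERT INTO sessions (id, name, status, flow_id) VALUES ('s4', 'x', 'idle', 'missing')"
    )
    fk_enforced = False
except sqlite3.IntegrityError:
    fk_enforced = True
```

Because `flow_id` is nullable, ad-hoc sessions created outside any flow keep working unchanged.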
2.3 CLI Surface
# Flow lifecycle
shoal flow start <name|file> # Launch a flow (create lanes per spec)
shoal flow status [name] # Show lane states + dependency progress
shoal flow kill <name> # Kill all lanes, mark flow killed
shoal flow ls # List active flows
# Operating on flow lanes
shoal flow attach <name> <role> # Attach to a specific lane
shoal flow log <name> # Aggregate journal entries across lanes
# Discovery
shoal flow list-specs # Show available flow definitions
shoal flow show <name> # Print resolved flow spec
Example shoal flow status auth-rework output:
auth-rework — Rework auth middleware for compliance
planner idle plan/auth-rework done (14m)
impl-tokens running impl/auth-tokens 12m active
impl-sessions waiting impl/auth-sessions needs approval
reviewer pending — blocked on: impl-tokens, impl-sessions
Status meanings:
- Lane statuses map directly to SessionStatus (running, waiting, idle, error, stopped)
- pending = not yet launched (waiting on depends_on)
- blocked on: X = dependency X hasn't reached target status
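The status rules above could be computed along these lines. This is a minimal sketch: `lane_states` is a hypothetical helper, and it assumes a dependency counts as satisfied once it is idle or stopped, as described for the trigger condition.

```python
READY = {"idle", "stopped"}  # statuses that satisfy a dependency


def lane_states(deps: dict[str, list[str]], launched: dict[str, str]) -> dict[str, str]:
    """Return a display state per role.

    deps maps role -> roles it depends on; launched maps role -> SessionStatus
    for lanes that already exist as sessions.
    """
    states = {}
    for role, needs in deps.items():
        if role in launched:
            states[role] = launched[role]  # real session: show its SessionStatus
            continue
        # A dependency blocks if it has not launched or is not yet idle/stopped.
        blocking = [d for d in needs if launched.get(d) not in READY]
        suffix = f" (blocked on: {', '.join(blocking)})" if blocking else ""
        states[role] = "pending" + suffix
    return states


states = lane_states(
    deps={
        "planner": [],
        "impl-tokens": ["planner"],
        "impl-sessions": ["planner"],
        "reviewer": ["impl-tokens", "impl-sessions"],
    },
    launched={"planner": "idle", "impl-tokens": "running", "impl-sessions": "waiting"},
)
print(states["reviewer"])  # pending (blocked on: impl-tokens, impl-sessions)
```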
2.4 Dependency Resolution
Dependencies form a DAG (validated at parse time — cycles are rejected).
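One way to do the parse-time check is with the standard-library `graphlib`, which raises `CycleError` when the graph is not a DAG. The `launch_order` helper below is a hypothetical name, and this is only a sketch of the validation step, not the planned implementation.

```python
from graphlib import CycleError, TopologicalSorter


def launch_order(deps: dict[str, list[str]]) -> list[str]:
    """Return roles in an order that respects depends_on; raise ValueError on a cycle."""
    try:
        # TopologicalSorter takes a mapping of node -> predecessors.
        return list(TopologicalSorter(deps).static_order())
    except CycleError as exc:
        # exc.args[1] holds the roles that form the cycle.
        raise ValueError(f"dependency cycle: {' -> '.join(exc.args[1])}") from exc


order = launch_order(
    {
        "planner": [],
        "impl-tokens": ["planner"],
        "impl-sessions": ["planner"],
        "reviewer": ["impl-tokens", "impl-sessions"],
    }
)
print(order[0], order[-1])  # planner reviewer

try:
    launch_order({"a": ["b"], "b": ["a"]})
    cycle_error = None
except ValueError as err:
    cycle_error = str(err)
```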
Trigger condition: A pending lane launches when all its depends_on lanes reach idle or stopped status. This hooks into the existing status_changed lifecycle event.
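from typing import Any

async def _on_status_changed(event: LifecycleEvent, **kwargs: Any) -> None:
    # LifecycleEvent and _check_and_launch_pending_lanes are Shoal internals.
    session = kwargs["session"]
    # Only a flow lane reaching a dependency-satisfying status can unblock others.
    if session.flow_id and kwargs["new_status"] in ("idle", "stopped"):
        await _check_and_launch_pending_lanes(session.flow_id)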
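What the pending-lane check might do can be sketched with an in-memory stand-in. The `FlowRun` class and its methods are hypothetical; the real design keeps this state in SQLite and creates actual sessions where `launch` only records the role.

```python
import asyncio

READY = {"idle", "stopped"}  # statuses that satisfy a dependency


class FlowRun:
    """In-memory stand-in for flow state (hypothetical, for illustration only)."""

    def __init__(self, deps: dict[str, list[str]]):
        self.deps = deps                    # role -> roles it depends on
        self.status: dict[str, str] = {}    # role -> status of launched lanes
        self.launched: list[str] = []       # launch order, for inspection

    async def launch(self, role: str) -> None:
        # Placeholder for creating the real session.
        self.status[role] = "running"
        self.launched.append(role)

    async def check_and_launch_pending(self) -> None:
        for role, needs in self.deps.items():
            not_launched = role not in self.status
            if not_launched and all(self.status.get(d) in READY for d in needs):
                await self.launch(role)

    async def on_status_changed(self, role: str, new_status: str) -> None:
        self.status[role] = new_status
        if new_status in READY:
            await self.check_and_launch_pending()


async def demo() -> list[str]:
    run = FlowRun(
        {
            "planner": [],
            "impl-tokens": ["planner"],
            "impl-sessions": ["planner"],
            "reviewer": ["impl-tokens", "impl-sessions"],
        }
    )
    await run.check_and_launch_pending()                     # root lane: planner
    await run.on_status_changed("planner", "idle")           # both impl lanes launch
    await run.on_status_changed("impl-tokens", "idle")       # reviewer still blocked
    await run.on_status_changed("impl-sessions", "stopped")  # reviewer launches
    return run.launched


launched = asyncio.run(demo())
print(launched)  # ['planner', 'impl-tokens', 'impl-sessions', 'reviewer']
```

Lanes that already exist are skipped on every check, so a session that cycles between running and idle does not relaunch anything downstream.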
Handoff injection: When a dependency is met, the upstream session's HandoffArtifact is appended to the downstream session's journal before launch. This gives the downstream lane the upstream context without manual copying.
2.5 MCP Integration
New orchestration tools:
| Tool | Description |
|---|---|
| flow_start | Launch a flow by name or inline spec |
| flow_status | Get lane states for a flow |
| flow_kill | Kill all lanes in a flow |
Robo supervisors and orchestrator agents can manage flows through the same MCP interface they use for individual sessions.
2.6 Robo Integration
Robo supervision works unchanged — it still polls sessions by status. But flow metadata enables smarter decisions:
- Robo can query session.flow_role to know a session's purpose
- Robo can check flow-level progress ("is the whole flow stuck or just one lane?")
- Escalation context includes flow name and role for better LLM decisions
3. What Flows Are NOT
- Not a DAG runner: no retries, conditional branches, loops, or error recovery. A flow launches lanes and tracks their status. If a lane fails, the operator decides what to do.
- Not an org chart: no hierarchy, delegation chains, or chain-of-command. depends_on is sequencing, not authority.
- Not replacing templates/modes: flows compose them. Every lane is still a template-based session.
- Not mandatory: shoal new continues to work for ad-hoc sessions. Flows are opt-in for multi-agent patterns.
- Not Paperclip: no companies, budgets, approvals FSM, or heartbeat protocol. Flows are topology, not governance.
4. Implementation Plan
Phase 1: Flow spec + start/status/kill (MVP)
Goal: Declare a multi-lane topology and launch/manage it as a unit.
- [ ] models/flow.py: FlowSpec, LaneSpec Pydantic models with TOML parsing
- [ ] core/flow.py: flow spec discovery (user-global + project-local), validation (DAG cycle check, role uniqueness, template existence)
- [ ] DB migration: flows table + flow_id/flow_role columns on sessions
- [ ] services/flow_manager.py: start_flow(), kill_flow(), flow_status()
  - start_flow creates all lanes via lifecycle.create_session_lifecycle(), sets flow_id + flow_role
  - MVP: all lanes launch immediately (no depends_on yet)
- [ ] cli/flow.py: shoal flow start, shoal flow status, shoal flow kill, shoal flow ls
- [ ] Tests: spec parsing, validation, start/kill lifecycle, status rendering
No depends_on in Phase 1 — all lanes launch in parallel. This keeps the MVP small while still delivering group identity and group operations.
Phase 2: Dependency sequencing
Goal: Lanes launch automatically when their dependencies are satisfied.
- [ ] DAG resolution in flow_manager.py: topological sort, launch root lanes immediately, hold others as pending
- [ ] Lifecycle hook on status_changed: check pending lanes, launch when ready
- [ ] pending pseudo-status for lanes not yet created as sessions
- [ ] Flow status view shows dependency state (blocked on X, Y)
- [ ] Handoff injection: auto-append upstream HandoffArtifact to downstream journal on launch
Phase 3: MCP + robo integration
Goal: Agents and robo can manage flows programmatically.
- [ ] MCP tools: flow_start, flow_status, flow_kill
- [ ] flow_role in session snapshot responses
- [ ] Robo-aware flow context in escalation prompts
- [ ] shoal flow attach <name> <role> convenience command
- [ ] shoal flow log <name> aggregated journal view
Phase 4: Ergonomics
Goal: Quality-of-life for daily flow usage.
- [ ] shoal flow create <name>: interactive flow spec scaffolding
- [ ] Flow templates (meta-templates that define common topologies: author-reviewer, planner-impl-review)
- [ ] shoal popup flow-aware grouping (flow lanes grouped visually)
- [ ] Fish completions for flow names and roles
5. Mapping to Existing Primitives
| Flow concept | Builds on | New code |
|---|---|---|
| Flow spec | Template TOML parsing patterns | models/flow.py |
| Lane | create_session_lifecycle() | flow_id/flow_role fields |
| Group operations | Batch MCP ops | flow_manager.kill_flow() |
| Dependency sequencing | status_changed lifecycle event | Hook + pending state tracking |
| Handoff injection | HandoffArtifact | Auto-append on dependency met |
| Flow status | urgency.py + session status | Aggregation view |
| MCP integration | mcp_shoal_server.py | 3 new tools |
| Robo integration | robo_supervisor.py | Flow-aware context |
6. Open Questions
- Lane restart semantics: if a lane errors out, should shoal flow restart <name> <role> re-create just that lane? Or does the operator kill + manually re-create?
- Flow-level tags: should the flow spec support tags that propagate to all lanes? (Probably yes: flow.tags = ["sprint-42"] applied to every lane session.)
- Partial start: should shoal flow start support --only <role> to launch a subset of lanes?
- Flow completion: when is a flow "done"? All lanes idle? All lanes stopped? Explicit shoal flow complete?
- Flow-scoped MCP: should flows support a shared MCP server list that all lanes inherit? (May overlap with template MCP inheritance.)
- Remote lanes: can a lane target a remote host? This would compose with shoal remote but adds significant complexity.
7. Relation to Paperclip
This design is informed by Paperclip's company/agent/issue model but deliberately simpler:
| Paperclip | Shoal Flows | Why different |
|---|---|---|
| Company (multi-tenant org) | Flow (named session group) | Shoal is single-operator |
| Agent (persistent employee) | Lane (session with a role) | Sessions are ephemeral, not permanent |
| Issue + checkout (task tracker) | depends_on + handoff | No separate task model needed: the session IS the work unit |
| Heartbeat (bounded execution) | Long-running sessions | Shoal sessions run until done, not in cycles |
| Org hierarchy (CEO → IC) | Flat lanes with sequencing | No authority model, just ordering |
| Budget enforcement | Not in scope | Could layer on later via cost tracking |
The key insight from Paperclip: group identity matters. Being able to say "these sessions are one effort" and operate on them as a unit is the 80/20 feature. The governance, budgets, and hierarchy are Paperclip-specific concerns Shoal doesn't need.