Delegation & subagents

When a task is big enough to flood the main conversation, or when several independent pieces of work could happen at once, Hermes can spawn subagents with the delegate_task tool. Each child gets a fresh, isolated context and its own terminal session. Only the child's final summary comes back to the parent, which keeps the main context lean.

Single task

python

delegate_task(
    goal="Debug why tests fail",
    context="Error: assertion in test_foo.py line 42",
    toolsets=["terminal", "file"],
)

Parallel batch

Up to 3 children run concurrently by default:

python

delegate_task(tasks=[
    {"goal": "Research topic A", "toolsets": ["web"]},
    {"goal": "Research topic B", "toolsets": ["web"]},
    {"goal": "Fix the build", "toolsets": ["terminal", "file"]},
])

The one rule that matters: subagents know nothing

WARNING

A subagent starts with a completely fresh conversation. It has zero knowledge of the parent's history or prior tool calls. Everything it needs must be in the goal and context fields.

python

# BAD - the child has no idea what "the error" is
delegate_task(goal="Fix the error")

# GOOD - all context is passed in
delegate_task(
    goal="Fix the TypeError in api/handlers.py",
    context="""api/handlers.py raises a TypeError on line 47:
    'NoneType' object has no attribute 'get'. parse_body() returns None
    when Content-Type is missing. Project is at /home/user/myproject, Python 3.11.""",
    toolsets=["terminal", "file"],
)
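One way to make this habit mechanical is to assemble the context from named facts before delegating. The sketch below is only an illustration: `build_context` is a hypothetical helper, not part of Hermes, and the labels it produces are arbitrary.

```python
def build_context(problem: str, **facts: str) -> str:
    """Join a problem statement and named facts into one context string.

    Hypothetical helper for illustration; nothing here is a Hermes API.
    """
    lines = [problem.strip()]
    for label, value in facts.items():
        if value:  # skip facts that were left empty
            lines.append(f"{label.replace('_', ' ').capitalize()}: {value}")
    return "\n".join(lines)


context = build_context(
    "api/handlers.py raises a TypeError on line 47: "
    "'NoneType' object has no attribute 'get'.",
    cause="parse_body() returns None when Content-Type is missing",
    project_root="/home/user/myproject",
    python_version="3.11",
)
print(context)
```

The resulting string would then be passed as the context field of delegate_task, so the child receives the error, its cause, and the project location in one place.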

Picking toolsets for children

Toolsets	Use case
`["terminal", "file"]`	Code work, debugging, builds
`["web"]`	Research, fact-checking
`["file"]`	Read-only analysis / code review
`["terminal"]`	System administration

Leaf subagents cannot call delegate_task, clarify, memory, send_message, or execute_code. This keeps them focused and prevents runaway recursion.

Local-model considerations

Delegation multiplies inference work: three parallel children means three concurrent generations. On a single local GPU that is slower, not faster, since they contend for the same hardware. Two patterns help:

Route children to a smaller/cheaper model so they don't fight the main model for resources:

yaml

# ~/.hermes/config.yaml
delegation:
  model: qwen3.5-coder
  base_url: http://localhost:11434/v1
  api_key: local-key
  max_concurrent_children: 2

Or keep delegation sequential by lowering max_concurrent_children to 1 when you are GPU-bound.
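To see what a concurrency cap does in general, here is a minimal sketch using a thread pool. It is not how Hermes schedules children internally. `run_children` and `fake_child` are hypothetical names, and the sleep stands in for one model generation.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


def run_children(goals, max_concurrent_children=2):
    """Run stand-in child tasks with at most `max_concurrent_children` at once.

    Returns (summaries, peak), where peak is the highest number of
    children observed running at the same time.
    """
    lock = threading.Lock()
    running = 0
    peak = 0

    def fake_child(goal):
        nonlocal running, peak
        with lock:
            running += 1
            peak = max(peak, running)
        time.sleep(0.05)  # stand-in for one model generation
        with lock:
            running -= 1
        return f"summary of: {goal}"

    # The pool size is the cap: extra goals wait until a worker is free.
    with ThreadPoolExecutor(max_workers=max_concurrent_children) as pool:
        summaries = list(pool.map(fake_child, goals))
    return summaries, peak


summaries, peak = run_children(["topic A", "topic B", "fix build"], 1)
print(peak, summaries)
```

With a cap of 1 the goals run one after another, which is the sequential behaviour described above. A higher cap lets that many children overlap, and the rest queue.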

delegate_task is synchronous, not durable

WARNING

delegate_task runs inside the parent's current turn and blocks until children finish. If the parent is interrupted (you send a new message, /stop), all children are cancelled and their work is discarded. Children do not keep running after the turn ends.

For durable, long-running work that must survive interrupts, use a cron job (cronjob create) or a backgrounded terminal command (terminal(background=True, notify_on_complete=True)) instead.
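The cancellation behaviour is similar to structured concurrency in ordinary Python. The sketch below is an analogy built on asyncio, not Hermes code: `child` and `parent_turn` are hypothetical names, and the parent "turn" is modelled as a task that waits on its children with asyncio.gather.

```python
import asyncio


async def child(name, log):
    try:
        await asyncio.sleep(10)  # stand-in for long-running child work
        log.append(f"{name} finished")
    except asyncio.CancelledError:
        log.append(f"{name} cancelled")  # the work is discarded
        raise


async def parent_turn(log):
    # The parent blocks here until every child is done.
    await asyncio.gather(child("a", log), child("b", log))


async def main():
    log = []
    turn = asyncio.create_task(parent_turn(log))
    await asyncio.sleep(0.05)  # let the children start
    turn.cancel()              # models an interrupt such as /stop
    try:
        await turn
    except asyncio.CancelledError:
        pass
    return log


log = asyncio.run(main())
print(log)
```

Cancelling the parent task cancels both children before either can finish, which is why work that must survive an interrupt belongs outside the turn.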

Delegation vs execute_code

Factor	`delegate_task`	`execute_code`
Reasoning	Full LLM loop	Just Python execution
Best for	Tasks needing judgment	Mechanical multi-step pipelines
Token cost	Higher	Lower (only stdout returns)

Use delegate_task when the subtask needs reasoning; use execute_code for scripted data processing.

Nested orchestration (advanced)

By default, delegation is flat: a parent spawns children that cannot delegate further. For multi-stage workflows, a child can be spawned with role="orchestrator" once delegation.max_spawn_depth is raised above 1. Be careful: with depth 3 and 3 children per level, the tree can reach 27 concurrent agents, which is far too much for a local box.
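The growth is easy to check with a few lines of arithmetic. This sketch assumes the top-level parent sits at level 0 and that every agent at every level spawns the maximum number of children. `agents_per_level` is a hypothetical helper name.

```python
def agents_per_level(children_per_agent: int, max_spawn_depth: int) -> list[int]:
    """Number of agents at each level below the top-level parent."""
    return [children_per_agent ** level for level in range(1, max_spawn_depth + 1)]


levels = agents_per_level(children_per_agent=3, max_spawn_depth=3)
print(levels)       # agents at levels 1, 2, 3
print(levels[-1])   # agents at the deepest level
print(sum(levels))  # all spawned agents, including waiting orchestrators
```

Under these assumptions the deepest level holds 3 × 3 × 3 = 27 agents, and the 12 orchestrators above them stay blocked while their children run, so 39 spawned agents exist at once.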

TIP

You don't usually invoke delegation yourself; the agent decides when a task benefits from it. See the official Delegation Patterns guide for hands-on examples.
