Skip to main content

Span hierarchy

Every Hermes turn produces a nested tree of spans. This page documents what each one is for and the data it carries.

The tree

agent / cron [AGENT]
├── skill.{name} [SKILL] (load → turn end; overlaps OK)
└── llm.{model} [LLM]
├── api.{model} [LLM]
│ ├── tool.{name} [TOOL]
│ ├── tool.skill_view [TOOL] (the call that *loads* skill.{name})
│ ├── tool.{name} [TOOL] (parallel tool calls — siblings)
│ ├── approval.{pattern_key} [APPROVAL] (human-in-the-loop wait; gen_ai.tool.call.id → the gated tool)
│ ├── tool.delegate_task [TOOL] (the tool call that delegates)
│ └── subagent.{role} [AGENT] (delegation span)
│ └── agent (child session root) [AGENT] (the child rejoins this trace)
│ └── llm.{model} → api.{model} → tool.{name} ... (child's own work)
└── api.{model} [LLM] (second round-trip after tool results)
  • agent / cron — the root for each turn. Present when session hooks are available in the Hermes build; absent on older versions (the llm.* span becomes the root).
  • skill.* — one per skill loaded during the turn. Spans from the load (a skill_view call or a /skills/ read) to the turn boundary, nested under the turn root. Skills overlap freely — two loaded in one turn are two concurrent siblings. See skill.* below.
  • llm.* — one per logical model turn. Wraps one or more HTTP round-trips to the provider.
  • api.* — one per HTTP round-trip. Tools run during a round-trip, so their parent is api.*, not llm.*.
  • tool.* — one per tool invocation. Parallel tool calls are siblings under the same api.*.
  • approval.* — one per dangerous-command approval prompt. Spans the time the agent blocks waiting for a human to approve/deny, correlated to the gated tool via gen_ai.tool.call.id. See approval.* below.
  • subagent.* — one per delegated child agent. The child's own root span nests under it so a multi-agent run is one connected trace. See subagent.* below.

session.* / cron

The root span. Name is derived from the Hermes session kind (session.cli, session.telegram, session.discord, cron, etc.).

Span kind: GENERAL (no OpenInference-specific kind).

Key attributes, set at start:

AttributeSource
openinference.project.nameOTEL_PROJECT_NAME / HERMES_OTEL_PROJECT_NAME
hermes.session.kindFrom Hermes (cli, telegram, cron, etc.)
hermes.session.idHermes session ID
session.idSame as above, standard OTel naming
user.idHermes user ID (when available)

Key attributes, set at end (the turn summary):

AttributeTypeMeaning
hermes.turn.tool_countintDistinct tool names invoked
hermes.turn.toolsstringSorted CSV of distinct tool names (≤500 chars)
hermes.turn.tool_targetsstring|-joined distinct file paths / URLs
hermes.turn.tool_commandsstring|-joined distinct shell commands
hermes.turn.tool_outcomesstringSorted CSV of distinct outcome statuses
hermes.turn.skill_countintDistinct skills inferred
hermes.turn.skillsstringSorted CSV of distinct skill names
hermes.turn.api_call_countintNumber of pre_api_request hooks fired
hermes.turn.final_statusstringcompleted · interrupted · incomplete · timed_out

See Turn summary for why these exist.

llm.*

One per logical model turn. Name is llm.{model} (e.g. llm.claude-sonnet-4-6).

Span kind: LLM (OpenInference).

AttributeConventionMeaning
llm.model_nameOpenInferenceModel name
llm.providerOpenInferenceProvider (anthropic, openai, etc.)
input.valueOpenInferenceUser message or full conversation history (see below)
input.mime_typeOpenInferencetext/plain or application/json
output.valueOpenInferenceFinal assistant response
output.mime_typeOpenInferencetext/plain
gen_ai.request.modelgen_aiModel name (for Langfuse / SigNoz)
gen_ai.content.promptgen_aiUser message (same content as input.value when both are strings)
gen_ai.content.completiongen_aiAssistant response
hermes.conversation.message_counthermes-specificWhen capture_conversation_history: true

By default input.value is the latest user turn only. To see the full message list the model saw, enable conversation capture.

api.*

One per HTTP round-trip to the provider. Name is api.{model}.

Span kind: LLM (OpenInference).

AttributeConventionMeaning
llm.token_count.promptOpenInferencePrompt tokens
llm.token_count.completionOpenInferenceCompletion tokens
llm.token_count.totalOpenInferenceSum of the above
llm.token_count.cache_readOpenInferencePrompt tokens read from cache (if provider reports)
llm.token_count.cache_writeOpenInferencePrompt tokens written to cache (if provider reports)
gen_ai.usage.input_tokensgen_aiPrompt tokens (for Langfuse)
gen_ai.usage.output_tokensgen_aiCompletion tokens (for Langfuse)
gen_ai.usage.cache_read_input_tokensgen_aiCache read (if provider reports)
gen_ai.usage.cache_creation_input_tokensgen_aiCache write (if provider reports)
llm.invocation_parametersOpenInferenceJSON of temperature, max_tokens, etc.
gen_ai.response.finish_reasongen_aistop, tool_use, length, etc.
http.duration_mshermes-specificWall-clock duration of the HTTP call

The api.* span is the right place to look for token counts — not the parent llm.* (which doesn't carry per-call counts, because a turn can have multiple api.* calls).

tool.*

One per tool invocation. Name is tool.{name} (e.g. tool.bash, tool.read_file).

Span kind: TOOL (OpenInference).

AttributeConventionMeaning
tool.nameOpenInferenceTool name
input.valueOpenInferenceTool args (JSON)
output.valueOpenInferenceTool result
hermes.tool.targethermes-specificInferred file path / URL (see Tool identity)
hermes.tool.commandhermes-specificInferred shell command
hermes.tool.outcomehermes-specificcompleted · error · timeout · blocked
hermes.skill.namehermes-specificSkill inferred from args paths (optional)

Errors: hermes.tool.outcome=error also maps the span's StatusCode to ERROR. Timeouts and blocked tools stay OK so dashboards don't count them as failures.

skill.*

One per skill the agent loads during a turn. A skill isn't a tool call — it's loaded once (the agent calls skill_view, or reads a /skills/<name>/ file) and then guides the rest of the turn. So the span opens at the load and closes at the turn boundary (on_session_end), nested under the turn's agent root rather than the in-flight tool/LLM span. That makes "which skills were active, and for how long" a visible part of the timeline.

Skills overlap: load two in one turn and you get two concurrent skill.* siblings — loading a skill again in the same turn keeps the first window (no duplicate span). Controlled by skill_spans (default on); the hermes.skill.name attribute on the tool span and the skill_inferred counter are emitted regardless.

AttributeConventionMeaning
hermes.skill.namehermes-specificSkill name
gen_ai.skill.namegen_ai (ext.)Skill name (GenAI-convention alias)
hermes.skill.sourcehermes-specificskill_view (canonical) or path_match
hermes.skill.pathhermes-specificConventional ~/.hermes/skills/<name> location
hermes.skill.result_statushermes-specificTurn outcome: completed · interrupted · incomplete
hermes.span_kindhermes-specificskill (for UI grouping)

Status is always OK — a skill being active is never itself an error; the turn's outcome rides on hermes.skill.result_status.

approval.*

One per dangerous-command approval prompt. When a tool trips an approval rule, the agent blocks waiting for a human to approve or deny — often the single biggest chunk of a turn's wall-clock, and previously invisible. This span makes that human-decision latency a first-class part of the trace, and records what was decided (an audit signal for allowlist tuning).

The span opens on pre_approval_request and closes on post_approval_response, nested within the turn (under the active api.*/agent span) and correlated to the tool it gates via gen_ai.tool.call.id. It's keyed off the turn_id the hook carries (which embeds the session id), so it lands in the right trace even though the approval system runs on its own executor thread. Works on both the CLI (surface=cli) and gateway surfaces (surface=gateway — Telegram, Discord, Slack, …).

AttributeConventionMeaning
hermes.approval.pattern_keyhermes-specificThe matched approval rule (e.g. rm_rf) — also the span name
hermes.approval.choicehermes-specificonce · session · always · deny · timeout
hermes.approval.grantedhermes-specifictrue for once/session/always
hermes.approval.timed_outhermes-specifictrue when the human never answered
hermes.approval.duration_mshermes-specificHuman-decision wait time (ms)
hermes.approval.surfacehermes-specificcli or gateway
hermes.approval.commandhermes-specificThe command awaiting approval (preview-clipped; suppressed when capture_previews=false)
hermes.approval.descriptionhermes-specificWhy approval is required (preview-clipped)
gen_ai.tool.call.idgen_aiThe gated tool call — correlates this approval to its tool.* span

Status is always OK — a denial or timeout is a legitimate human outcome, not an error. Metrics: hermes.approval.count (by choice / pattern_key) and hermes.approval.duration (the wait histogram). These spans are observer-only — the plugin never vetoes or alters an approval.

subagent.*

One per delegated child agent (when the parent calls the delegate_task tool). Name is subagent.{role} (e.g. subagent.leaf, subagent.researcher).

Span kind: AGENT (OpenInference).

The delegation span opens on subagent_start and nests under the parent turn's in-flight api.*/llm.* span. Crucially, the delegated child's own root span rejoins this trace: when the child runs in the same process (the default), its agent root nests directly under the subagent.* span, so a multi-agent run is a single connected tree instead of many disconnected traces.

AttributeConventionMeaning
gen_ai.operation.namegen_aiinvoke_agent
gen_ai.agent.namegen_aiThe child's role
hermes.subagent.rolehermes-specificChild role (leaf, orchestrator, …)
hermes.subagent.goalhermes-specificWhat the child was asked to do (preview)
hermes.subagent.child_session_idhermes-specificThe child's session ID (join key)
hermes.subagent.parent_session_idhermes-specificThe delegating parent's session ID
hermes.subagent.parent_turn_idhermes-specificThe parent turn that delegated
hermes.subagent.child_idhermes-specificThe child's sub-agent ID
hermes.subagent.statushermes-specificReported child_status (set on stop)
hermes.subagent.duration_mshermes-specificChild wall-clock duration (set on stop)
hermes.subagent.summaryhermes-specificThe child's result summary (set on stop)

The child's agent root carries hermes.session.is_subagent=true plus hermes.subagent.parent_session_id / hermes.subagent.role so you can filter child runs even when looking at a single span.

Status: failure-like child_status values (error, failed, cancelled, timeout) map the span to ERROR; anything else (including a missing status) stays OK.

Metrics: hermes.subagent.count{role, status} and hermes.subagent.duration{role}.

See Hooks reference for the rejoin mechanism and the cross-process span-link fallback.

Why this shape?

The tree mirrors the agent's execution structure:

  • One root per turn so you can filter "one user question worth of work" in the backend UI.
  • llm.* as a logical parent of all api.* because the conversation-with-the-model is one coherent thing even when it takes multiple HTTP calls.
  • tool.* under api.* because tools run between rounds of model inference, within a specific HTTP response's tool_calls. The api.* parent makes that explicit.

See Attribute conventions for the dual-convention mapping side-by-side.