hermes-otel

OpenTelemetry for Hermes Agent

Fan LLM traces, tool calls, API requests, and token metrics out to any OTLP-compatible observability backend — Phoenix, Langfuse, LangSmith, SigNoz, Jaeger, Grafana Tempo, or your own collector. One plugin, parallel fan-out, zero hot-path blocking.

$ hermes plugins install briancaffey/hermes-otel

Why hermes-otel?

Hermes Agent is a production agent loop — tools, skills, memory, a gateway, messaging platforms. The moment you ship it, you need to see what it's actually doing: which tools fired, how many tokens the model burned, which turns stalled, which users hit errors.

hermes-otel turns every Hermes lifecycle hook into a properly-nested OpenTelemetry span with the right attribute conventions for the backend you're sending to — no adapter code per vendor. Drop it in, point it at an OTLP endpoint, and the traces show up.

Dual-convention attributes

Emits both gen_ai.* (Langfuse / SigNoz) and llm.token_count.* (Phoenix / OpenInference) so the UI in your chosen backend just works.
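As an illustrative sketch (not the plugin's actual code), a single span's attribute dict can carry both conventions at once; the helper name `dual_convention_usage` is hypothetical, but the attribute keys are the ones named in Attribute conventions:

```python
# Illustrative only: how one span can satisfy both attribute conventions
# so either backend's UI renders token usage. Helper name is hypothetical.

def dual_convention_usage(prompt_tokens: int, completion_tokens: int) -> dict:
    """Build span attributes with both gen_ai.* and llm.token_count.* keys."""
    total = prompt_tokens + completion_tokens
    return {
        # gen_ai.* — read by Langfuse / SigNoz
        "gen_ai.usage.input_tokens": prompt_tokens,
        "gen_ai.usage.output_tokens": completion_tokens,
        "gen_ai.usage.total_tokens": total,
        # llm.token_count.* — read by Phoenix / OpenInference
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": total,
    }

attrs = dual_convention_usage(1200, 350)
```

Either backend reads its own keys and ignores the other's, so no per-vendor adapter is needed.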

Multi-backend fan-out

Send the same span to Phoenix + Langfuse + Jaeger in parallel, each on its own non-blocking worker. One slow collector can't stall the others — or the agent.
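A fan-out setup might be configured along these lines. The key names below are assumptions for illustration, not the plugin's verified schema — see the Configuration page for the real keys:

```yaml
# config.yaml — illustrative sketch; key names are assumptions.
# Each exporter gets its own non-blocking worker.
exporters:
  - name: phoenix
    endpoint: http://localhost:6006/v1/traces
  - name: langfuse
    endpoint: https://cloud.langfuse.com/api/public/otel
  - name: jaeger
    endpoint: http://localhost:4318/v1/traces
```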

Per-turn summary

Root session span gets tool count, tool names, skills used, API-call count, and final status. Dashboards don't need to JOIN across spans to answer "what happened in this turn?"

Non-blocking export

BatchSpanProcessor under the hood: span.end() is a queue push. A slow backend adds zero latency to tool calls or API requests on the hot path.
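The OpenTelemetry SDK's standard BatchSpanProcessor environment variables control the queue and batch behavior. These variable names come from the OTel specification; whether hermes-otel honors them directly is an assumption — check the Reference page:

```shell
# Standard OTel BatchSpanProcessor tuning knobs (OTel spec env vars).
export OTEL_BSP_MAX_QUEUE_SIZE=4096          # spans buffered before drops
export OTEL_BSP_SCHEDULE_DELAY=2000          # ms between export batches
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512    # spans per export call
export OTEL_BSP_EXPORT_TIMEOUT=10000         # ms before an export is abandoned
```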

Privacy mode

Flip capture_previews: false to strip every input/output preview at the source. Metadata (tool names, durations, tokens) still flows.

Orphan-span sweep

Long-abandoned sessions don't leak state: a TTL sweeper finalizes stale root spans with final_status=timed_out so your UI stays clean.

Supported backends

The span hierarchy

session.{platform} / cron      [root, GENERAL]
└── llm.{model}                [LLM — input, output, total tokens]
    ├── api.{model}            [LLM — prompt/completion tokens, duration]
    │   └── tool.{name}        [TOOL — args, result, outcome]
    └── api.{model}            [LLM — second round-trip, final response]

Each span carries the attributes both Langfuse (gen_ai.usage.input_tokens, gen_ai.content.prompt) and Phoenix (llm.token_count.prompt, input.value) expect — see Attribute conventions.

Where to go next

🚀 Quickstart: Install + Phoenix in a local Docker container, first trace in under 5 minutes
📦 Installation: Install into Hermes Agent's venv, optional langsmith extra
🧩 Concepts: Hooks, spans, fan-out, how the plugin wires into Hermes
🎯 Pick a backend: Comparison table, quick picks, decision flowchart
⚙️ Configuration: config.yaml, env vars, sampling, privacy, batch tuning
🏗️ Architecture: Span hierarchy, attribute conventions, turn summaries, orphan sweep
🛠️ Contributing: Run the test suite, add a backend, open a PR
📑 Reference: Every env var, every config key, every span attribute