hermes-otel
OpenTelemetry for Hermes Agent
Fan LLM traces, tool calls, API requests, and token metrics out to any OTLP-compatible observability backend — Phoenix, Langfuse, LangSmith, SigNoz, Jaeger, Grafana Tempo, or your own collector. One plugin, parallel fan-out, zero hot-path blocking.
Why hermes-otel?
Hermes Agent is a production agent loop — tools, skills, memory, a gateway, messaging platforms. The moment you ship it, you need to see what it's actually doing: which tools fired, how many tokens the model burned, which turns stalled, which users hit errors.
hermes-otel turns every Hermes lifecycle hook into a properly nested OpenTelemetry span with the right attribute conventions for the backend you're sending to — no adapter code per vendor. Drop it in, point it at an OTLP endpoint, and the traces show up.
Dual-convention attributes
Emits both gen_ai.* (Langfuse / SigNoz) and llm.token_count.* (Phoenix / OpenInference) so the UI in your chosen backend just works.
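The dual emission can be pictured as one attribute dict carrying both key sets. A minimal sketch — the helper name is hypothetical, and only the keys also named elsewhere in this page (`gen_ai.usage.input_tokens`, `llm.token_count.prompt`) are confirmed by the docs:

```python
def dual_convention_usage_attrs(input_tokens: int, output_tokens: int) -> dict:
    """Build token-usage attributes under both naming conventions.

    Illustrative helper: gen_ai.* keys follow the OpenTelemetry GenAI
    semantic conventions; llm.token_count.* keys follow OpenInference.
    """
    return {
        # gen_ai.* convention (Langfuse / SigNoz)
        "gen_ai.usage.input_tokens": input_tokens,
        "gen_ai.usage.output_tokens": output_tokens,
        "gen_ai.usage.total_tokens": input_tokens + output_tokens,
        # llm.token_count.* convention (Phoenix / OpenInference)
        "llm.token_count.prompt": input_tokens,
        "llm.token_count.completion": output_tokens,
        "llm.token_count.total": input_tokens + output_tokens,
    }
```

Because both key sets ride on the same span, each backend simply reads the names it knows and ignores the rest.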
Multi-backend fan-out
Send the same span to Phoenix + Langfuse + Jaeger in parallel, each on its own non-blocking worker. One slow collector can't stall the others — or the agent.
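A fan-out setup might look like the sketch below. The key names here are illustrative, not the plugin's actual schema — see the Configuration page for the real keys:

```yaml
# Hypothetical config.yaml fragment: one span, three parallel exporters.
otel:
  backends:
    - name: phoenix
      endpoint: http://localhost:6006/v1/traces
    - name: langfuse
      endpoint: https://cloud.langfuse.com/api/public/otel
    - name: jaeger
      endpoint: http://localhost:4318/v1/traces
```

Each entry gets its own background worker, so a backend that stalls only backs up its own queue.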
Per-turn summary
Root session span gets tool count, tool names, skills used, API-call count, and final status. Dashboards don't need to JOIN across spans to answer "what happened in this turn?"
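Conceptually, the summary is a handful of attributes rolled up onto the root span at the end of the turn. A sketch with illustrative attribute names (the real keys are in the Reference page):

```python
def turn_summary_attrs(tool_spans, api_call_count, skills_used, final_status):
    """Roll child-span data up onto the root session span (illustrative).

    tool_spans: list of dicts with at least a "name" key.
    """
    return {
        "turn.tool_count": len(tool_spans),
        "turn.tool_names": ",".join(sorted({s["name"] for s in tool_spans})),
        "turn.skills_used": ",".join(sorted(skills_used)),
        "turn.api_call_count": api_call_count,
        "turn.final_status": final_status,
    }
```

A dashboard can then filter or group on the root span alone instead of joining child spans per turn.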
Non-blocking export
BatchSpanProcessor under the hood: span.end() is a queue push. A slow backend adds zero latency to tool calls or API requests on the hot path.
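The core idea — end-of-span is a queue push, export happens on a background thread — can be sketched in a few lines of stdlib Python (this is a simplified model of what `BatchSpanProcessor` does, not the plugin's code):

```python
import queue
import threading

class NonBlockingExporter:
    """Minimal sketch of the BatchSpanProcessor idea."""

    def __init__(self, export_fn, max_queue=2048):
        self._q = queue.Queue(maxsize=max_queue)
        self._export = export_fn
        worker = threading.Thread(target=self._drain, daemon=True)
        worker.start()

    def on_end(self, span):
        try:
            self._q.put_nowait(span)  # O(1); never blocks the hot path
        except queue.Full:
            pass  # drop the span rather than stall the agent

    def _drain(self):
        while True:
            span = self._q.get()
            self._export(span)  # slow network call, off the hot path
```

The bounded queue is the important design choice: when a backend falls behind, spans are dropped instead of back-pressuring tool calls and API requests.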
Privacy mode
Set capture_previews: false to strip every input/output preview at the source. Metadata (tool names, durations, tokens) still flows.
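In configuration terms (the `otel:` nesting is an assumption for illustration; `capture_previews` is the documented key):

```yaml
otel:
  capture_previews: false   # drop prompt/response/tool-arg previews at the source
```

Only content previews are affected — span structure, timings, and token counts are still exported.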
Orphan-span sweep
Long-abandoned sessions don't leak state: a TTL sweeper finalizes stale root spans with final_status=timed_out so your UI stays clean.
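The sweep amounts to a TTL pass over open root spans. A self-contained sketch (function and attribute handling are illustrative; `final_status=timed_out` is the documented outcome):

```python
import time

def sweep_orphans(open_sessions, ttl_seconds, now=None):
    """Finalize root spans whose sessions have gone quiet (illustrative).

    open_sessions: dict mapping session_id -> last_activity_timestamp.
    Returns the session ids that were finalized as timed out.
    """
    now = time.time() if now is None else now
    stale = [sid for sid, last_seen in open_sessions.items()
             if now - last_seen > ttl_seconds]
    for sid in stale:
        open_sessions.pop(sid)
        # The plugin would set final_status="timed_out" and end the span here.
    return stale
```

Running this periodically keeps abandoned sessions from accumulating as forever-open spans in the backend UI.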
Supported backends
Any OTLP-compatible backend works — Phoenix, Langfuse, LangSmith, SigNoz, Jaeger, Grafana Tempo, or a custom collector — each enabled and pointed at its endpoint in config.yaml.

The span hierarchy
```
session.{platform} / cron                [root, GENERAL]
└── llm.{model}                          [LLM — input, output, total tokens]
    ├── api.{model}                      [LLM — prompt/completion tokens, duration]
    │   └── tool.{name}                  [TOOL — args, result, outcome]
    └── api.{model}                      [LLM — second round-trip, final response]
```
Each span carries the attributes both Langfuse (gen_ai.usage.input_tokens, gen_ai.content.prompt) and Phoenix (llm.token_count.prompt, input.value) expect — see Attribute conventions.
Where to go next
| Page | What's inside |
|---|---|
| 🚀 Quickstart | Install + Phoenix in a local Docker container, first trace in under 5 minutes |
| 📦 Installation | Install into Hermes Agent's venv, optional langsmith extra |
| 🧩 Concepts | Hooks, spans, fan-out, how the plugin wires into Hermes |
| 🎯 Pick a backend | Comparison table, quick picks, decision flowchart |
| ⚙️ Configuration | config.yaml, env vars, sampling, privacy, batch tuning |
| 🏗️ Architecture | Span hierarchy, attribute conventions, turn summaries, orphan sweep |
| 🛠️ Contributing | Run the test suite, add a backend, open a PR |
| 📑 Reference | Every env var, every config key, every span attribute |