hermes-otel

OpenTelemetry for Hermes Agent

Fan LLM traces, tool calls, API requests, and token metrics out to any OTLP-compatible observability backend — Phoenix, Langfuse, LangSmith, SigNoz, Jaeger, Grafana Tempo, or your own collector. One plugin, parallel fan-out, zero hot-path blocking.

$ hermes plugins install briancaffey/hermes-otel

Why hermes-otel?

Hermes Agent is a production agent loop — tools, skills, memory, a gateway, messaging platforms. The moment you ship it, you need to see what it's actually doing: which tools fired, how many tokens the model burned, which turns stalled, which users hit errors.

hermes-otel turns every Hermes lifecycle hook into a properly-nested OpenTelemetry span with the right attribute conventions for the backend you're sending to — no adapter code per vendor. Drop it in, point it at an OTLP endpoint, and the traces show up.

Dual-convention attributes

Emits both gen_ai.* (Langfuse / SigNoz) and llm.token_count.* (Phoenix / OpenInference) so the UI in your chosen backend just works.
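As an illustrative sketch (not the plugin's actual code), a single span's attribute dict can carry both conventions at once; the helper name `dual_convention_usage` is hypothetical, but the attribute keys are the ones named in Attribute conventions:

```python
# Illustrative only: how one span can satisfy both attribute conventions
# so either backend's UI renders token usage. Helper name is hypothetical.

def dual_convention_usage(prompt_tokens: int, completion_tokens: int) -> dict:
    """Build span attributes with both gen_ai.* and llm.token_count.* keys."""
    total = prompt_tokens + completion_tokens
    return {
        # gen_ai.* — read by Langfuse / SigNoz
        "gen_ai.usage.input_tokens": prompt_tokens,
        "gen_ai.usage.output_tokens": completion_tokens,
        "gen_ai.usage.total_tokens": total,
        # llm.token_count.* — read by Phoenix / OpenInference
        "llm.token_count.prompt": prompt_tokens,
        "llm.token_count.completion": completion_tokens,
        "llm.token_count.total": total,
    }

attrs = dual_convention_usage(1200, 350)
```

Either backend reads its own keys and ignores the other's, so no per-vendor adapter is needed.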

Multi-backend fan-out

Send the same span to Phoenix + Langfuse + Jaeger in parallel, each on its own non-blocking worker. One slow collector can't stall the others — or the agent.
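A fan-out setup might be configured along these lines. The key names below are assumptions for illustration, not the plugin's verified schema — see the Configuration page for the real keys:

```yaml
# config.yaml — illustrative sketch; key names are assumptions.
# Each exporter gets its own non-blocking worker.
exporters:
  - name: phoenix
    endpoint: http://localhost:6006/v1/traces
  - name: langfuse
    endpoint: https://cloud.langfuse.com/api/public/otel
  - name: jaeger
    endpoint: http://localhost:4318/v1/traces
```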

Per-turn summary

Root session span gets tool count, tool names, skills used, API-call count, and final status. Dashboards don't need to JOIN across spans to answer "what happened in this turn?"

Non-blocking export

BatchSpanProcessor under the hood: span.end() is a queue push. A slow backend adds zero latency to tool calls or API requests on the hot path.
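The OpenTelemetry SDK's standard BatchSpanProcessor environment variables control the queue and batch behavior. These variable names come from the OTel specification; whether hermes-otel honors them directly is an assumption — check the Reference page:

```shell
# Standard OTel BatchSpanProcessor tuning knobs (OTel spec env vars).
export OTEL_BSP_MAX_QUEUE_SIZE=4096          # spans buffered before drops
export OTEL_BSP_SCHEDULE_DELAY=2000          # ms between export batches
export OTEL_BSP_MAX_EXPORT_BATCH_SIZE=512    # spans per export call
export OTEL_BSP_EXPORT_TIMEOUT=10000         # ms before an export is abandoned
```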

Privacy mode

Flip capture_previews: false to strip every input/output preview at the source. Metadata (tool names, durations, tokens) still flows.

Orphan-span sweep

Long-abandoned sessions don't leak state: a TTL sweeper finalizes stale root spans with final_status=timed_out so your UI stays clean.

Supported backends

The span hierarchy

session.{platform} / cron      [root, GENERAL]
└── llm.{model}                [LLM — input, output, total tokens]
    ├── api.{model}            [LLM — prompt/completion tokens, duration]
    │   └── tool.{name}        [TOOL — args, result, outcome]
    └── api.{model}            [LLM — second round-trip, final response]

Each span carries the attributes both Langfuse (gen_ai.usage.input_tokens, gen_ai.content.prompt) and Phoenix (llm.token_count.prompt, input.value) expect — see Attribute conventions.

Where to go next

🚀 Quickstart: Install + Phoenix in a local Docker container, first trace in under 5 minutes
📦 Installation: Install into Hermes Agent's venv, optional langsmith extra
🧩 Concepts: Hooks, spans, fan-out, how the plugin wires into Hermes
🎯 Pick a backend: Comparison table, quick picks, decision flowchart
⚙️ Configuration: config.yaml, env vars, sampling, privacy, batch tuning
🏗️ Architecture: Span hierarchy, attribute conventions, turn summaries, orphan sweep
🛠️ Contributing: Run the test suite, add a backend, open a PR
📑 Reference: Every env var, every config key, every span attribute