MCP trace propagation
When the agent calls a tool on a remote MCP server, that server's spans normally land in a separate, unlinked trace — so debugging a slow or failing MCP call means correlating timestamps by hand.
This plugin can forward the active span's context as a W3C
traceparent header on every outbound
MCP HTTP request, so the MCP server's spans join the same trace as the
agent. One query in your backend then shows the whole lifecycle: user message →
LLM call → tool dispatch → MCP transport → the MCP server's own work.
How it works
Two pieces cooperate:
- hermes-otel exposes the active trace context and registers an
mcp_request_headershook that returns{"traceparent": "..."}. - hermes-agent invokes that hook on the MCP transport's HTTP client and merges the returned headers onto each outbound request.
The span registry is process-global and keyed by session_id (not bound to a
context var), so the lookup works even though MCP requests run on a separate
background event loop from the agent task.
The header injection needs the mcp_request_headers hook in hermes-agent.
hermes-otel registers the hook only when the host advertises it, so on
older builds this feature is a silent no-op — nothing to configure, nothing to
break. Trace context is propagated over HTTP/StreamableHTTP MCP transports
(stdio servers have no request headers).
Enabling it
Nothing to configure. When both sides support it, the hook is registered automatically and you'll see one extra hook in the startup banner:
[hermes-otel] Registered 9 hooks
Then point the agent at an MCP server over HTTP (~/.hermes/config.yaml):
mcp_servers:
my-server:
url: "https://my-mcp-server.example.com/mcp"
For the MCP-server spans to actually appear linked, the MCP server must be
OpenTelemetry-instrumented (extract the incoming traceparent and emit spans
to a backend). Most OTel ASGI/HTTP middlewares do this out of the box.
Public API
If you're building your own integration, call the public helper directly instead of reaching into span internals:
from hermes_plugins.hermes_otel.hooks import get_current_traceparent
tp = get_current_traceparent(session_id) # "00-<trace>-<span>-01" or None
It returns the W3C traceparent for the active span of the given session
(falling back to the current context-var parent, then to any single active
session), or None when tracing is disabled or no span is active. Safe to call
from any thread or event loop.
Verifying
With an OTel-instrumented MCP server, after a tool call your backend should show
a single trace whose spans span both services — the agent's agent / tool.*
spans and the MCP server's request/handler spans — all sharing one
trace_id. If the MCP server logs request headers, you'll also see the
traceparent arrive on the tools/call request.
See also
- Span hierarchy — how the agent's spans nest.
- Backends overview — where these traces land.