Tracing
Agenta's observability sits on top of OpenTelemetry. This page covers the span and trace model, the ag.* attribute namespace, references, the ingestion endpoints, and querying.
In OTel, a span is a unit of work and a trace is the set of spans sharing one trace_id. In Agenta, spans also carry an ag.* attribute subtree. Agenta uses ag.* to tag every trace with a type (annotation or invocation), to link the trace back to the entity that produced it through ag.references, and to surface server-computed metrics under ag.metrics.
Spans and traces
A span records one unit of work, such as a request, an LLM call, or a tool execution. Every span carries:
| Field | Description |
|---|---|
| trace_id | 32-char hex that groups all spans of one operation. |
| span_id | 16-char hex identifying this span. |
| parent_id | The parent's span_id, or null for a root span. |
| start_time, end_time | ISO-8601 timestamps. |
| span_kind | OTel kind (SPAN_KIND_SERVER, SPAN_KIND_INTERNAL, and so on). |
| status_code | STATUS_CODE_UNSET, STATUS_CODE_OK, or STATUS_CODE_ERROR. |
| attributes | Free-form dict. Agenta's data lives under the ag.* subtree. |
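The ID formats in the table can be checked and generated with a few lines of Python. This is a minimal sketch: the helper names are hypothetical and not part of the Agenta SDK, and accepting upper-case hex is an assumption (the examples on this page use lower-case).

```python
import re
import secrets

# Hypothetical helpers; the ID widths come from the table above.
TRACE_ID_RE = re.compile(r"[0-9a-fA-F]{32}")
SPAN_ID_RE = re.compile(r"[0-9a-fA-F]{16}")


def is_valid_trace_id(value: str) -> bool:
    """True when value is exactly 32 hex characters."""
    return bool(TRACE_ID_RE.fullmatch(value))


def is_valid_span_id(value: str) -> bool:
    """True when value is exactly 16 hex characters."""
    return bool(SPAN_ID_RE.fullmatch(value))


def new_trace_id() -> str:
    """16 random bytes rendered as 32 hex characters."""
    return secrets.token_hex(16)


def new_span_id() -> str:
    """8 random bytes rendered as 16 hex characters."""
    return secrets.token_hex(8)
```

A check like this is mainly useful in custom integrations that mint their own IDs before sending spans.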
When an endpoint returns a single trace, the payload wraps the trace_id with a nested SpansTree: spans keyed by span_name under each parent's spans field (a list when siblings share a name). This is the shape the UI renders as a waterfall.
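To make the nesting concrete, the sketch below rebuilds that shape from a flat list of spans. It assumes each span is a dict with span_id, parent_id, and span_name, and that the input is well formed (no cycles); build_spans_tree is a hypothetical helper, not the server's implementation.

```python
from collections import defaultdict


def build_spans_tree(trace_id, spans):
    """Nest a flat list of span dicts into a tree keyed by span_name."""
    # Index spans by parent so each level can be assembled in one lookup.
    by_parent = defaultdict(list)
    for span in spans:
        by_parent[span.get("parent_id")].append(span)

    def group(parent_id):
        # Collect the children of one parent, grouped by name.
        grouped = defaultdict(list)
        for child in by_parent.get(parent_id, []):
            node = dict(child)
            nested = group(child["span_id"])
            if nested:
                node["spans"] = nested
            grouped[child["span_name"]].append(node)
        # A lone child is stored directly; same-named siblings become a list.
        return {
            name: nodes[0] if len(nodes) == 1 else nodes
            for name, nodes in grouped.items()
        }

    # Root spans are the ones whose parent_id is None.
    return {"trace_id": trace_id, "spans": group(None)}
```

Consumers of the tree therefore need to handle both a single node and a list of nodes under any given name.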
Trace types
Every trace declares a type through ag.type.trace:
| Value | Meaning |
|---|---|
| invocation | The trace records a product event: an application run, a workflow step, an LLM call. Most traces produced by instrumented code are invocations. |
| annotation | The trace records a judgment on another trace. Evaluator runs and human feedback both produce annotations. Annotations carry a link back to the invocation they describe and a reference to the entity that produced them. |
| unknown | The type was not declared or could not be inferred. |
The distinction matters at query time. Annotation UIs filter for ag.type.trace = annotation; evaluation views walk invocations.
- An invocation has ag.type.trace = invocation. It is the original product event.
- An annotation has ag.type.trace = annotation and at least one entry under links pointing at the invocation it describes. The entity that produced it (an evaluator revision, a user, an external system) sits under ag.references.
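The two rules above translate into a small check. The sketch assumes a span dict with a top-level links field and a nested attributes.ag object, as query responses return it; both function names are hypothetical.

```python
def trace_type(span: dict) -> str:
    """Read ag.type.trace from a span, falling back to 'unknown'."""
    ag = span.get("attributes", {}).get("ag", {})
    return ag.get("type", {}).get("trace", "unknown")


def is_well_formed_annotation(span: dict) -> bool:
    """An annotation needs a link to its invocation and a reference to its producer."""
    if trace_type(span) != "annotation":
        return False
    # Assumption: links sit at the span level; len() works for a list or a dict.
    links = span.get("links") or []
    references = span.get("attributes", {}).get("ag", {}).get("references") or {}
    return len(links) > 0 and len(references) > 0
```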
Each span also declares its role through ag.type.span:
| Value | Role |
|---|---|
| workflow | The root of a product run (an application or evaluator invocation). |
| agent | An agent loop driving tool calls or sub-tasks. |
| chain | A composed sequence of calls treated as one unit. |
| task | A generic unit of work. Default when no other type fits. |
| tool | A tool call from an agent. |
| llm | A raw LLM request. |
| chat, completion | Chat-completion or completion calls. |
| embedding, query, rerank | Index and retrieval calls. |
References
A reference links a trace (or a span) back to the Agenta entity that produced or triggered it. References live under ag.references.<key> and each value names an entity revision:
```json
{"id": "019d952f-0000-0000-0000-000000000000", "slug": "support-bot", "version": "3"}
```
Any of the three fields is enough. id always resolves. slug resolves within the project. version applies to revisions and is a string.
The supported keys:
| Key | Points at |
|---|---|
| application, application_variant, application_revision | The application artifact, variant, or revision. |
| workflow, workflow_variant, workflow_revision | The workflow artifact, variant, or revision. |
| evaluator, evaluator_variant, evaluator_revision | The evaluator artifact, variant, or revision. |
| testset, testset_variant, testset_revision | The testset artifact, variant, or revision. |
| query, query_variant, query_revision | A saved query, variant, or revision. |
| environment, environment_variant, environment_revision | A deployment target. |
| snippet, snippet_variant, snippet_revision | A reusable snippet. |
| testcase | A specific testcase. |
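A client-side sanity check of a references dict might look like the sketch below. The key set is derived from the table above; validate_references is a hypothetical helper and does not reproduce whatever validation the server performs.

```python
ENTITY_FAMILIES = (
    "application", "workflow", "evaluator", "testset",
    "query", "environment", "snippet",
)
# Every family has an artifact, a variant, and a revision key; testcase stands alone.
SUPPORTED_KEYS = {
    f"{family}{suffix}"
    for family in ENTITY_FAMILIES
    for suffix in ("", "_variant", "_revision")
} | {"testcase"}


def validate_references(references: dict) -> list:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    for key, ref in references.items():
        if key not in SUPPORTED_KEYS:
            problems.append(f"{key}: unsupported key")
            continue
        # Any one of id, slug, or version is enough to name the entity.
        if not any(ref.get(field) for field in ("id", "slug", "version")):
            problems.append(f"{key}: needs id, slug, or version")
        # version is a string, even when it looks numeric.
        if "version" in ref and not isinstance(ref["version"], str):
            problems.append(f"{key}: version must be a string")
    return problems
```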
Links vs references
The two terms look similar but do different things.
| | Links | References |
|---|---|---|
| Spec | OTel SpanLink | Agenta extension |
| Points at | Another span in another trace (trace_id + span_id) | An Agenta entity (id / slug / version) |
| Used for | "This span continues from span Y in that other trace." | "This trace was produced by application revision X." |
An annotation trace typically carries both: a link to the invocation span being annotated, and a reference to the evaluator that produced the annotation.
How evaluations weave traces together
Evaluations tie traces and the rest of the API together:
- Each evaluation run invokes its subject (an application revision) over a testset.
- Every invocation produces an invocation trace.
- Every evaluator execution produces an annotation trace that links back to the invocation it scored.
- Evaluation scenarios reference both trace IDs via their links.
So a single evaluation run leaves behind one invocation trace per (testcase × subject) plus one annotation trace per (testcase × evaluator). Evaluation scenarios join them back together at query time.
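As a worked example of that arithmetic, the sketch below computes the expected trace counts for a run with a single subject. The function name and the sample sizes are illustrative only.

```python
def expected_trace_counts(testcases: int, evaluators: int) -> dict:
    """Trace counts for one evaluation run over a single subject."""
    return {
        # One invocation trace per (testcase x subject), with one subject.
        "invocation": testcases,
        # One annotation trace per (testcase x evaluator).
        "annotation": testcases * evaluators,
    }


# A 50-testcase testset scored by 3 evaluators.
counts = expected_trace_counts(testcases=50, evaluators=3)
```

Here counts holds 50 invocation traces and 150 annotation traces, 200 in total.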
Ingestion
The Agenta SDK and OTel auto-instrumentation handle ingestion for you in the common cases. See Observability for the SDK setup. Use the endpoints below only when you have a custom integration that bypasses the SDK.
There are two supported write endpoints. Pick the one that matches your payload shape.
| Endpoint | When to use |
|---|---|
| POST /otlp/v1/traces | Raw OTLP protobuf from an OTel SDK or collector. Point any OTel exporter here. |
| POST /simple/traces/ | Single-trace helper for feedback or annotations. The endpoint generates the IDs and wraps the payload. Use this for one standalone trace. |
If you need whole-trace CRUD against records you already have, POST /traces/ (create) and PUT /traces/{trace_id} (update) round-trip the canonical Trace shape.
The Agenta-native ingest endpoints POST /tracing/spans/ingest and POST /traces/ingest are deprecated. Point any OTel-style ingest at POST /otlp/v1/traces instead. For one-off feedback or annotation traces, use POST /simple/traces/. The Agenta SDK already targets the supported endpoints.
How ingest works
Every ingest endpoint returns 202 Accepted before the spans are persisted. The router parses the payload, hands the spans off to an async worker, and responds immediately. This keeps the ingest path fast, so instrumented apps do not block on telemetry.
The response shape:
```json
{
  "count": 2,
  "links": [
    {"trace_id": "f5a2efb40895881e938e2ebc070beca8", "span_id": "15f3df0731995245"},
    {"trace_id": "f5a2efb40895881e938e2ebc070beca8", "span_id": "c8ae7f12d1b3e9a4"}
  ]
}
```
count is the number of spans the router parsed and accepted into the stream. links lists the IDs that made it through.
count reflects parse-time acceptance, not persistence. If the worker rejects spans later (for example, because a field doesn't match the canonical schema), those failures are not visible in the response. Use the query endpoints to confirm what landed.
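One way to act on this is to keep the accepted links and compare them against what a later query returns. The sketch below assumes the response body shown above and a flat list of persisted spans that each carry trace_id and span_id; both helper names are hypothetical, and the query call itself is left out.

```python
import json


def summarize_ingest_response(status_code: int, body: str) -> dict:
    """Pull the accepted span IDs out of an ingest response body."""
    payload = json.loads(body)
    links = payload.get("links", [])
    return {
        # 202 means parsed and queued, not persisted.
        "accepted": status_code == 202,
        "count": payload.get("count", 0),
        "links": {(link["trace_id"], link["span_id"]) for link in links},
    }


def missing_after_query(accepted_links: set, persisted_spans: list) -> set:
    """Accepted (trace_id, span_id) pairs that a later query did not return."""
    landed = {(span["trace_id"], span["span_id"]) for span in persisted_spans}
    return accepted_links - landed
```

Because persistence is asynchronous, a non-empty result shortly after ingest may only mean the worker has not caught up yet, so a retry with a delay is reasonable before treating spans as rejected.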
Agenta's ingest adapter is lenient. When a field doesn't fit the canonical ag.* schema, the adapter parks it under ag.unsupported.* on the span instead of rejecting the span. See ag.unsupported for what lands there.
The ag.* attribute namespace
Spans carry an OTel attributes dict. Agenta owns the ag subtree.
| Namespace | Purpose |
|---|---|
| ag.type.trace, ag.type.span | Trace and span type enums. |
| ag.data.inputs, ag.data.outputs, ag.data.parameters, ag.data.internals | What the span saw and produced. |
| ag.metrics.duration, ag.metrics.errors, ag.metrics.tokens, ag.metrics.costs | Server-computed metrics. Each carries a cumulative and incremental view. |
| ag.references | Entity references (see above). |
| ag.flags, ag.tags, ag.meta | Filtering and labelling dicts. |
| ag.exception | Structured exception payload. |
| ag.session.id, ag.user.id | Optional session and user identifiers. |
Query responses return the namespace as a nested object:
```json
{
  "attributes": {
    "ag": {
      "type": {"trace": "invocation", "span": "workflow"},
      "data": {"inputs": {"country": "France"}, "outputs": "Paris"}
    }
  }
}
```
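Since the tables on this page use dotted names such as ag.type.trace while responses are nested, a small lookup helper bridges the two. This is a sketch; get_path is a hypothetical name, not an SDK function.

```python
def get_path(attributes: dict, dotted_path: str, default=None):
    """Resolve a dotted name such as 'ag.type.trace' against nested attributes."""
    node = attributes
    for key in dotted_path.split("."):
        # Stop as soon as the path leaves the nested dicts.
        if not isinstance(node, dict) or key not in node:
            return default
        node = node[key]
    return node


attributes = {
    "ag": {
        "type": {"trace": "invocation", "span": "workflow"},
        "data": {"inputs": {"country": "France"}, "outputs": "Paris"},
    }
}
trace_kind = get_path(attributes, "ag.type.trace")
session_id = get_path(attributes, "ag.session.id")
```

With the sample above, trace_kind is "invocation" and session_id is None because the optional session identifier is absent.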
ag.unsupported
When the ingest adapter can't fit a field into the canonical ag.* schema, the field lands under ag.unsupported.* on the resulting span. Look in query responses to find anything that came in but didn't match.
Common cases:
- Unknown top-level keys under ag.*.
- Values whose type doesn't match the expected schema.
- JSON strings under ag.data.{inputs, parameters, internals} that fail to parse.
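A quick way to spot parked fields is to scan queried spans for a non-empty ag.unsupported object. The sketch assumes a flat list of span dicts with nested attributes, and the sample field name in the usage is made up for illustration.

```python
def unsupported_fields(spans: list) -> dict:
    """Map span_id to whatever the adapter parked under ag.unsupported."""
    found = {}
    for span in spans:
        ag = span.get("attributes", {}).get("ag", {})
        unsupported = ag.get("unsupported")
        if unsupported:
            found[span["span_id"]] = unsupported
    return found


spans = [
    {"span_id": "15f3df0731995245", "attributes": {"ag": {"type": {"span": "llm"}}}},
    {
        "span_id": "c8ae7f12d1b3e9a4",
        # "custom_field" is a made-up example of a key the schema does not know.
        "attributes": {"ag": {"unsupported": {"custom_field": "value"}}},
    },
]
parked = unsupported_fields(spans)
```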
Querying
Two endpoints query spans, each returning a different shape:
- POST /traces/query returns spans grouped into a nested tree by trace_id.
- POST /spans/query returns a flat list of spans with cursor pagination.
Both accept the same filter and windowing parameters. See Windowing for cursor details.
The legacy /tracing/traces/query and /tracing/spans/query paths are deprecated; use the paths above.
Examples
Ingest spans from your runtime
For programmatic ingestion of nested traces, point any OTel exporter at POST /otlp/v1/traces. The Agenta SDK and the OTel auto-instrumentation set this up for you; see Observability for SDK setup, or the OpenTelemetry exporter documentation if you're configuring a raw OTel collector.
Record a single annotation
```shell
curl -X POST "$AGENTA_HOST/api/simple/traces/" \
  -H "Content-Type: application/json" \
  -H "Authorization: ApiKey $AGENTA_API_KEY" \
  -d '{
    "trace": {
      "origin": "human",
      "kind": "adhoc",
      "channel": "api",
      "data": {"outputs": {"score": 5, "comment": "Good response"}},
      "references": {"evaluator": {"slug": "user-feedback"}},
      "links": {
        "invocation": {
          "trace_id": "0af7651916cd43dd8448eb211c80319c",
          "span_id": "b7ad6b7169203331"
        }
      }
    }
  }'
```
The response returns the created trace with a server-generated trace_id and span_id. Use those IDs with GET, PATCH, or DELETE on /simple/traces/{trace_id}.
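The same request can be assembled from Python with only the standard library. This sketch mirrors the curl payload field for field; build_annotation_request is a hypothetical helper, and the function only builds the request so that sending it stays an explicit step.

```python
import json
import urllib.request


def build_annotation_request(host, api_key, trace_id, span_id, outputs, evaluator_slug):
    """Build (but do not send) a POST /simple/traces/ request for one annotation."""
    payload = {
        "trace": {
            "origin": "human",
            "kind": "adhoc",
            "channel": "api",
            "data": {"outputs": outputs},
            "references": {"evaluator": {"slug": evaluator_slug}},
            # Link back to the invocation span being annotated.
            "links": {"invocation": {"trace_id": trace_id, "span_id": span_id}},
        }
    }
    return urllib.request.Request(
        url=host.rstrip("/") + "/api/simple/traces/",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": f"ApiKey {api_key}",
        },
        method="POST",
    )
```

To send it, pass the result to urllib.request.urlopen, reading the host and API key from the same AGENTA_HOST and AGENTA_API_KEY environment variables the curl example uses.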