Tracing
Philosophy
Tracing in INTELLITHING is about observability by default. When workflows and chat engines become complex (streaming responses, triggers, multiple orchestration layers), it becomes hard to answer questions like:
- What inputs did the system receive?
- Which path was taken through the workflow?
- What was the final output sent back to the user?
We wanted a solution that provides clarity without burden. Instead of engineers scattering print statements or operators wading through opaque logs, tracing gives a structured timeline of every request and response.
The philosophy is simple: each request leaves behind a breadcrumb trail. That trail is human-readable, machine-analyzable, and always tied back to the original session.
Key Concepts
At its core, tracing provides a span-based view of activity in the system:
- Trace: A top-level record of one request (e.g., a /stream-response call).
- Span: A labeled section of work inside a trace (e.g., "prepare query", "model stream", "memory update").
- Inputs: Data captured at the start of a span.
- Outputs: Data (final or intermediate) emitted before the span closes.
- Status: Success, error, or cancelled (e.g., if a client disconnects mid-stream).
Together, traces and spans form a hierarchical timeline of what happened during a request.
Key Definitions
- Root Span: The span that represents the overall request (always one per trace).
- Child Span: Nested spans that represent sub-tasks or components.
- Context: Metadata attached to a span (endpoint name, query text, workflow path).
- Finalize: The moment a span closes, recording outputs and status.
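The concepts and definitions above can be sketched as a small data model. This is only an illustration under stated assumptions: the `Span` class, its field names, and the `child` and `finalize` methods are hypothetical and are not INTELLITHING's actual classes.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Any, Optional

# Hypothetical model of the concepts above; not INTELLITHING's real API.
VALID_STATUSES = {"success", "error", "cancelled"}


@dataclass
class Span:
    name: str
    context: dict[str, Any] = field(default_factory=dict)  # metadata, e.g. endpoint name
    inputs: dict[str, Any] = field(default_factory=dict)   # captured when the span starts
    outputs: dict[str, Any] = field(default_factory=dict)  # recorded before the span closes
    status: Optional[str] = None                            # None while the span is open
    children: list[Span] = field(default_factory=list)

    def child(self, name: str, **inputs: Any) -> Span:
        """Open a nested span for a sub-task of this span."""
        span = Span(name=name, inputs=inputs)
        self.children.append(span)
        return span

    def finalize(self, status: str = "success", **outputs: Any) -> None:
        """Close the span, recording its outputs and final status."""
        if status not in VALID_STATUSES:
            raise ValueError(f"unknown status: {status}")
        self.outputs.update(outputs)
        self.status = status


# One trace corresponds to one root span plus its child spans.
root = Span(
    "stream_response",
    context={"endpoint": "/stream-response"},
    inputs={"query": "What is RAG?"},
)
prepare = root.child("prepare query", query="What is RAG?")
prepare.finalize(prepared=True)
root.finalize(text="final answer text")
```

Keeping the status unset until `finalize` is called makes it easy to tell an open span from a closed one when inspecting a trace.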
Tracing at a Glance
One-line Form
Minimal Example
A /stream-response call automatically creates a root span:
- Inputs: {"query": "What is RAG?"}
- Child Spans: workflow selection, memory prep, model streaming
- Outputs: Final text streamed back to the client
- Status: Success
From the outside, you see a normal streaming response. Behind the scenes, you get a structured timeline of exactly what happened.
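To make the "structured timeline" concrete, the minimal example can be written as a nested record and printed as an indented outline. The record layout and the `render` helper are assumptions made for illustration; the stored trace format in INTELLITHING may differ.

```python
from __future__ import annotations

# Hypothetical trace record for the minimal example; field names are assumptions.
trace = {
    "name": "stream_response",
    "inputs": {"query": "What is RAG?"},
    "status": "success",
    "children": [
        {"name": "workflow selection", "status": "success", "children": []},
        {"name": "memory prep", "status": "success", "children": []},
        {"name": "model streaming", "status": "success", "children": []},
    ],
}


def render(span: dict, depth: int = 0) -> list[str]:
    """Return one indented line per span, parents before children."""
    lines = [f"{'  ' * depth}{span['name']} [{span['status']}]"]
    for child in span["children"]:
        lines.extend(render(child, depth + 1))
    return lines


print("\n".join(render(trace)))
```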
How Tracing Fits into INTELLITHING
Tracing is not an add-on; it is woven into the runtime:
- Every API endpoint (like /stream-response) automatically opens a root span.
- Wrappers around streaming responses ensure that spans only close after all output is delivered (or on error/disconnect).
- Input, output, and error details are captured once in the span, not scattered across logs.
- Tracing integrates with your existing observability tools, but it can also be inspected directly for debugging.
This means you can follow a single user query from input to final streamed text without digging through raw logs.
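The idea of a wrapper that closes the span only once streaming has ended can be sketched with a Python generator. This is a simplified illustration, not the runtime's implementation: `traced_stream` and the dictionary used as a span are hypothetical names, and a real server would typically use an async stream.

```python
from typing import Iterable, Iterator


def traced_stream(chunks: Iterable[str], span: dict) -> Iterator[str]:
    """Yield chunks unchanged while recording the full output on `span`."""
    parts = []
    status = "success"
    try:
        for chunk in chunks:
            parts.append(chunk)
            yield chunk
    except GeneratorExit:
        # The consumer stopped early, e.g. the client disconnected.
        status = "cancelled"
        raise
    except Exception as exc:
        # The underlying stream failed part-way through.
        status = "error"
        span["error"] = repr(exc)
        raise
    finally:
        # Runs exactly once, whichever way the stream ended.
        span["output"] = "".join(parts)
        span["status"] = status


span = {}
for delta in traced_stream(["RAG ", "stands ", "for ..."], span):
    pass  # a real server would send each delta to the client here
```

Placing the bookkeeping in `finally` is what lets a single code path cover normal completion, errors, and disconnects.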
Tracing in /stream-response
The streaming endpoint is a good illustration:
- Start: A root span stream_response is opened with attributes {endpoint: "/stream-response"}.
- Inputs Recorded: The user's query text.
- Execution:
  - If workflows are loaded, the router engine runs as a child span.
  - If no workflows are present, the model stream runs directly.
- Streaming:
  - Each chunk is logged as it passes through.
  - Wrappers accumulate the full output while sending deltas to the client.
- Finalize:
  - The full response text is recorded.
  - The span is closed (success, error, or cancelled).
For the client, the experience is unchanged. For engineers and operators, there's a traceable record of everything that happened.
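The Start, Inputs Recorded, and Execution steps can be sketched as a function that opens the root span and picks the execution path. Everything here is an assumption made for illustration: `open_stream_response`, the `router` and `model_stream` callables, and the child span name `router_engine` are hypothetical, and streaming and finalization are left to whatever wraps the returned iterator.

```python
from typing import Callable, Iterator, Optional, Tuple

# Hypothetical type for anything that turns a query into a stream of text chunks.
Stream = Callable[[str], Iterator[str]]


def open_stream_response(
    query: str,
    model_stream: Stream,
    router: Optional[Stream] = None,
) -> Tuple[dict, Iterator[str]]:
    """Open the root span for one request and choose the execution path."""
    # Start and Inputs Recorded: root span with endpoint attribute and query text.
    root = {
        "name": "stream_response",
        "attributes": {"endpoint": "/stream-response"},
        "inputs": {"query": query},
        "children": [],
        "status": None,  # set later, when the span is finalized
    }

    # Execution: with workflows loaded, the router engine runs as a child span;
    # otherwise the model stream runs directly under the root span.
    if router is not None:
        root["children"].append({"name": "router_engine", "status": None})
        source = router(query)
    else:
        source = model_stream(query)

    return root, source


# Usage with a stand-in model and no workflows loaded.
root, chunks = open_stream_response(
    "What is RAG?",
    model_stream=lambda q: iter(["Retrieval-", "augmented ", "generation"]),
)
```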
Why It Matters
- Transparency: Every request has a clear, navigable record.
- Debuggability: Engineers can replay a trace to understand failures.
- Reliability: Even in case of client disconnects, spans are finalized cleanly.
- Accountability: Inputs and outputs are tied together, avoiding ambiguity in logs.
How to Activate
To enable observability for your app:
- Open Helm and locate your project card.
- Click the plus icon under "Other settings".
- In the settings menu, toggle Observability to ON.
- Important: After toggling, you must recompile, build, and deploy your app for the change to take effect.
Observability changes are applied only after a fresh deploy.
How to Access Tracing
- In Helm
  - You can switch between Logs and Trace views.
  - Selecting the Trace tab shows the most recent trace, which belongs to the latest query.
  - As you send more queries or chat, the trace view automatically updates to the latest request.
- Historical Traces
  - For past traces, use the Trace tab (usually below the Help section).
  - This lets you browse and inspect historical traces.
Accessing traces requires the Sensitive Data permission. Admins can assign this permission to a role or directly to a user.