Tracing

🧠 Philosophy

Tracing in INTELLITHING is about observability by default. When workflows and chat engines become complex — with streaming responses, triggers, and multiple orchestration layers — it becomes hard to answer questions like:

What inputs did the system receive?
Which path was taken through the workflow?
What was the final output sent back to the user?

We wanted a solution that provides clarity without burden. Instead of engineers scattering print statements or operators wading through opaque logs, tracing gives a structured timeline of every request and response.

The philosophy is simple: each request leaves behind a breadcrumb trail. That trail is human-readable, machine-analyzable, and always tied back to the original session.

🔑 Key Concepts

At its core, tracing provides a span-based view of activity in the system:

Trace: A top-level record of one request (e.g., a /stream-response call).
Span: A labeled section of work inside a trace (e.g., “prepare query”, “model stream”, “memory update”).
Inputs: Data captured at the start of a span.
Outputs: Data (final or intermediate) emitted before the span closes.
Status: Success, error, or cancelled (e.g., if a client disconnects mid-stream).

Together, traces and spans form a hierarchical timeline of what happened during a request.
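
To make the hierarchy concrete, here is a minimal sketch of how traces and spans could be modeled. It is an illustration only, not the INTELLITHING implementation: the class names (Trace, Span, Status), the helper methods child and finalize, and the example output text are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Any, Optional


class Status(Enum):
    # The three outcomes named in the concepts above.
    SUCCESS = "success"
    ERROR = "error"
    CANCELLED = "cancelled"


@dataclass
class Span:
    # Hypothetical span record: a labeled section of work.
    name: str
    inputs: dict[str, Any] = field(default_factory=dict)
    outputs: dict[str, Any] = field(default_factory=dict)
    status: Optional[Status] = None  # None while the span is still open
    children: list["Span"] = field(default_factory=list)

    def child(self, name: str, **inputs: Any) -> "Span":
        # Open a nested span and attach it to this one.
        span = Span(name=name, inputs=inputs)
        self.children.append(span)
        return span

    def finalize(self, status: Status, **outputs: Any) -> None:
        # Close the span, recording its outputs and final status.
        self.outputs.update(outputs)
        self.status = status


@dataclass
class Trace:
    root: Span  # one root span per trace


def build_example() -> Trace:
    # Illustrative data only; the output text is made up for the sketch.
    root = Span("stream_response", inputs={"query": "What is RAG?"})
    for name in ("prepare query", "model stream", "memory update"):
        root.child(name).finalize(Status.SUCCESS)
    root.finalize(Status.SUCCESS, text="Retrieval-augmented generation.")
    return Trace(root)
```

Keeping children inside the parent span is what produces the hierarchical timeline: walking the root span visits every step of the request in order.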

📘 Key Definitions

Root Span: The span that represents the overall request (exactly one per trace).
Child Span: A nested span that represents a sub-task or component.
Context: Metadata attached to a span (endpoint name, query text, workflow path).
Finalize: The moment a span closes, recording outputs and status.

🧩 Tracing at a Glance

One-line Form

Trace = { root_span + child_spans }

Minimal Example

A /stream-response call automatically creates a root span:

Inputs: {"query": "What is RAG?"}
Child Spans: workflow selection, memory prep, model streaming
Outputs: Final text streamed back to the client
Status: Success

From the outside, you see a normal streaming response. Behind the scenes, you get a structured timeline of exactly what happened.
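
As a rough picture of that timeline, the sketch below stores the minimal example as a nested dictionary and renders it as indented text. The dictionary layout and the render function are assumptions for illustration, not the format INTELLITHING stores or displays.

```python
from typing import Any


def render(span: dict[str, Any], depth: int = 0) -> list[str]:
    """Return one indented line per span, parents before children."""
    lines = [f"{'  ' * depth}{span['name']} [{span['status']}]"]
    for child in span.get("children", []):
        lines.extend(render(child, depth + 1))
    return lines


# Hypothetical representation of the minimal example above.
example_trace = {
    "name": "stream_response",
    "status": "success",
    "inputs": {"query": "What is RAG?"},
    "children": [
        {"name": "workflow selection", "status": "success"},
        {"name": "memory prep", "status": "success"},
        {"name": "model streaming", "status": "success"},
    ],
}

print("\n".join(render(example_trace)))
# stream_response [success]
#   workflow selection [success]
#   memory prep [success]
#   model streaming [success]
```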

⚙️ How Tracing Fits into INTELLITHING

Tracing is not an add-on — it is woven into the runtime:

Every API endpoint (like /stream-response) automatically opens a root span.
Wrappers around streaming responses ensure that a span closes only after all output has been delivered, or when an error or client disconnect ends the stream early.
Input, output, and error details are captured once in the span, not scattered across logs.
Tracing integrates with your existing observability tools — but it can also be inspected directly for debugging.

This means you can follow a single user query from input to final streamed text without digging through raw logs.
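
One common way to capture inputs, outputs, and errors in a single place is a context manager that opens the root span when a handler starts and closes it when the handler exits. The sketch below assumes that approach. The names root_span and handle, the dictionary fields, and the sink list that collects finished spans are hypothetical and do not describe the actual runtime.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def root_span(endpoint: str, inputs: dict[str, Any], sink: list) -> Iterator[dict[str, Any]]:
    # Hypothetical span record; inputs are captured once, at the start.
    span: dict[str, Any] = {
        "endpoint": endpoint,
        "inputs": inputs,
        "outputs": None,
        "status": None,
        "error": None,
    }
    try:
        yield span
        span["status"] = "success"
    except Exception as exc:
        # Error details live in the span itself, not in scattered log lines.
        span["status"] = "error"
        span["error"] = repr(exc)
        raise
    finally:
        # The span is always delivered to the sink, whatever the outcome.
        sink.append(span)


def handle(query: str, sink: list) -> str:
    # Stand-in for an endpoint handler; the echo reply is illustrative.
    with root_span("/stream-response", {"query": query}, sink) as span:
        if not query:
            raise ValueError("empty query")
        span["outputs"] = f"echo: {query}"
        return span["outputs"]
```

Because the span is written in the finally clause, a failed request still produces a complete record with its inputs and the error attached.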

🧭 Tracing in `/stream-response`

The streaming endpoint is a good illustration:

Start: A root span stream_response is opened with attributes {endpoint: "/stream-response"}.
Inputs Recorded: The user’s query text.
Execution:
  If workflows are loaded, the router engine runs as a child span.
  If no workflows are present, the model stream runs directly.
Streaming:
  Each chunk is logged as it passes through.
  Wrappers accumulate the full output while sending deltas to the client.
Finalize:
  The full response text is recorded.
  The span is closed (success, error, or cancelled).

For the client, the experience is unchanged. For engineers and operators, there’s a traceable record of everything that happened.
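
The flow above can be sketched as a generator wrapper that forwards each chunk to the client, accumulates the full text, and finalizes the span exactly once. This is a simplified, synchronous illustration under stated assumptions: traced_stream, its parameters, and the span dictionary are hypothetical, and a client disconnect is modeled as the consumer closing the generator, which may differ from how the real endpoint detects it.

```python
from typing import Any, Callable, Iterable, Iterator, Optional

ChunkSource = Callable[[str], Iterable[str]]


def traced_stream(
    query: str,
    model_stream: ChunkSource,
    router: Optional[ChunkSource],
    finished: list,
) -> Iterator[str]:
    # Start: open the root span and record the inputs.
    span: dict[str, Any] = {
        "name": "stream_response",
        "attributes": {"endpoint": "/stream-response"},
        "inputs": {"query": query},
        # Execution: use the router if workflows are loaded, else the model.
        "path": "router" if router is not None else "model",
        "status": None,
    }
    source = router if router is not None else model_stream
    chunks: list[str] = []
    try:
        for chunk in source(query):
            chunks.append(chunk)  # accumulate the full output
            yield chunk           # send the delta to the client
        span["status"] = "success"
    except GeneratorExit:
        # The consumer stopped reading early (modeled client disconnect).
        span["status"] = "cancelled"
        raise
    except Exception:
        span["status"] = "error"
        raise
    finally:
        # Finalize: record whatever text was produced and close the span.
        span["outputs"] = {"text": "".join(chunks)}
        finished.append(span)
```

If the consumer reads the stream to the end, the span closes with status success and the full text. If it calls close() on the generator partway through, the span closes as cancelled and keeps only the chunks produced so far.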

🌐 Why It Matters

Transparency: Every request has a clear, navigable record.
Debuggability: Engineers can replay a trace to understand failures.
Reliability: Even when a client disconnects mid-stream, the span is finalized cleanly with a cancelled status.
Accountability: Inputs and outputs are tied together, avoiding ambiguity in logs.

🎬 How to Activate

To enable observability for your app:

Open Helm and locate your project card.
Click on the ➕ plus icon under "Other settings".
In the settings menu, toggle Observability to ON.
Important: After toggling, you must recompile, build, and deploy your app for the change to take effect.

🔁 Observability changes are applied only after a fresh deploy.

📊 How to Access Tracing

In Helm
You can switch between the Logs and Trace views.
Selecting the Trace tab shows the trace for the most recent query.
As you send more queries or chat messages, the Trace view updates automatically to the latest request.
Historical Traces
For past traces, use the Trace tab (usually below the Help section).
This lets you browse and inspect historical traces.

⛔️ Accessing traces requires the Sensitive Data permission. Admins can assign this permission to a role or directly to a user.