fltrace
fltrace is a CLI wrapper that instruments your application code to capture LLM activity. It works similarly to observability tools like ddtrace-run, but is specifically optimized for capturing the full context of AI interactions.
Usage
Language Support
Flightline provides auto-instrumentation for both Python and Node.js applications.
Setup
Instrumentation
Running with fltrace
FLIGHTLINE_RECORD=1 environment variable.
How it Works
When instrumentation is enabled, Flightline automatically records:
- Request and Response Metadata: The configuration parameters and raw completions from the model.
- Performance and Cost Signals: Detailed timing data and exact token usage counts for every call.
- Payload Capture: Optional full-content recording based on your privacy settings and testing mode.
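
To make the three categories above concrete, here is a minimal Python sketch of what a single captured record might contain. The `TraceRecord` class and every field name in it are hypothetical, chosen only for illustration; this is not Flightline's actual trace schema.

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Optional


@dataclass
class TraceRecord:
    """Hypothetical shape of one captured LLM call (not Flightline's real schema)."""

    # Request and response metadata
    model: str
    params: dict[str, Any]
    completion: str
    # Performance and cost signals
    latency_ms: float
    prompt_tokens: int
    completion_tokens: int
    # Payload capture is optional, so the field defaults to None
    prompt_payload: Optional[str] = None

    @property
    def total_tokens(self) -> int:
        """Sum of prompt and completion tokens for this call."""
        return self.prompt_tokens + self.completion_tokens


# Example values are made up for illustration.
record = TraceRecord(
    model="example-model",
    params={"temperature": 0.2},
    completion="Hello!",
    latency_ms=412.5,
    prompt_tokens=18,
    completion_tokens=3,
)

# Serialize the record fields (the computed property is not included).
serialized = json.dumps(asdict(record))
```

Keeping the payload field optional mirrors the idea that full-content recording depends on privacy settings, while metadata and cost signals are always present.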
Key Options
| Option | Description |
|---|---|
| --traces, -t | Directory to save captured traces (default: .flightline/traces). |
| --prod, -p | Production mode. Enables stricter PII handling. |
| --quiet, -q | Suppress fltrace status messages, showing only your command’s output. |
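
The options above can be combined programmatically, for example from a CI script. The sketch below only assembles an argument list; `build_fltrace_argv` is a hypothetical helper, and it assumes that options come before the wrapped command and that `--traces` takes its directory as the next argument. Check `fltrace --help` for the real syntax before relying on it.

```python
from typing import Optional, Sequence


def build_fltrace_argv(
    command: Sequence[str],
    traces_dir: Optional[str] = None,
    prod: bool = False,
    quiet: bool = False,
) -> list[str]:
    """Assemble an fltrace command line from the documented options.

    Assumption: options are placed before the wrapped command, and
    --traces is followed by the directory as a separate argument.
    """
    if not command:
        raise ValueError("a command to wrap is required")
    argv = ["fltrace"]
    if traces_dir is not None:
        argv += ["--traces", traces_dir]
    if prod:
        argv.append("--prod")
    if quiet:
        argv.append("--quiet")
    return argv + list(command)


# "app.py" and "ci-traces" are placeholder names.
argv = build_fltrace_argv(["python", "app.py"], traces_dir="ci-traces", quiet=True)
```

The resulting list can be handed to `subprocess.run(argv, check=True)`; building it as a list rather than a string avoids shell-quoting problems.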
Why Trace?
Static evaluation is limited. By capturing real execution traces, Flightline can:
- Analyze Behavior: Verify if the AI actually followed instructions in a real-world context.
- Detect Regressions: Compare traces from a new PR against a “Golden Set” of traces from main.
- Identify Latency Bottlenecks: See which specific AI calls are slowing down your users.
- Audit for Safety: Check for PII leakage or unsafe outputs in production-like environments.
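
As a rough illustration of the regression and latency checks listed above, the following Python sketch compares a candidate set of traces against a golden set. It assumes each trace is a dict with hypothetical `name` and `latency_ms` keys and uses an arbitrary 20% tolerance; it does not reflect how Flightline itself performs the comparison.

```python
from statistics import median


def median_latency(traces: list[dict]) -> float:
    """Median latency in milliseconds across a set of traces."""
    return median(t["latency_ms"] for t in traces)


def latency_regressed(golden: list[dict], candidate: list[dict],
                      tolerance: float = 1.2) -> bool:
    """True if the candidate median exceeds the golden median by more than the tolerance."""
    return median_latency(candidate) > tolerance * median_latency(golden)


def slowest_call(traces: list[dict]) -> dict:
    """Return the single trace with the highest latency."""
    return max(traces, key=lambda t: t["latency_ms"])


# Made-up example data: traces from main versus traces from a new PR.
golden = [
    {"name": "summarize", "latency_ms": 400},
    {"name": "classify", "latency_ms": 420},
    {"name": "extract", "latency_ms": 410},
]
candidate = [
    {"name": "summarize", "latency_ms": 600},
    {"name": "classify", "latency_ms": 650},
    {"name": "extract", "latency_ms": 590},
]

regressed = latency_regressed(golden, candidate)
bottleneck = slowest_call(candidate)["name"]
```

Using the median rather than the mean keeps a single slow outlier from triggering a false regression.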
Example Output
Next Steps
Captured traces are the raw evidence used by Flightline’s intelligence layer to grade your system.
Run Ship-Readiness Check
Evaluate your captured traces for ship-readiness.
