Your First Run

This guide walks through the core Flightline workflow: discovering operations, generating scenarios from code, capturing traces, evaluating behavior, and running a ship-readiness check.
Flightline is in early access. The exact CLI output may differ from what is shown here. Contact the team for the latest documentation.

Prerequisites

  • Flightline CLI installed (pip install flightline-ai)
  • OPENROUTER_API_KEY environment variable set
  • A codebase with AI operations
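
Before starting, it can help to confirm the first two prerequisites programmatically. The following is a minimal sketch, not part of Flightline: the function name missing_prerequisites is hypothetical, and the only things taken from this guide are the flightline executable name and the OPENROUTER_API_KEY variable.

```python
import os
import shutil


def missing_prerequisites(env=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means ready.

    Hypothetical helper, not part of the Flightline CLI.
    """
    env = os.environ if env is None else env
    problems = []
    # The executable name "flightline" is taken from the commands in this guide.
    if which("flightline") is None:
        problems.append("flightline CLI not found on PATH (pip install flightline-ai)")
    # An empty value is treated the same as an unset variable.
    if not env.get("OPENROUTER_API_KEY"):
        problems.append("OPENROUTER_API_KEY is not set")
    return problems


if __name__ == "__main__":
    for problem in missing_prerequisites():
        print("missing:", problem)
```

The env and which parameters exist only so the check can be exercised without touching the real environment.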

Step 1: Discover AI Operations

Map the AI surface area of your repository:
flightline discover
Flightline scans your code, finds every LLM call, and assigns each one a risk tier. The results are written to flightline.discovery.json, which Step 2 takes as input.
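
This guide does not document the schema of flightline.discovery.json, so the sketch below deliberately assumes nothing about it beyond being valid JSON. It only reports the top-level shape of the file, which is a quick way to see what the discovery step produced before passing it on. The helper names describe and summarize_discovery are hypothetical.

```python
import json
from pathlib import Path


def describe(value):
    """Return a short, schema-agnostic description of a parsed JSON value."""
    if isinstance(value, dict):
        return f"object with keys {sorted(value)}"
    if isinstance(value, list):
        return f"array of {len(value)} item(s)"
    return type(value).__name__


def summarize_discovery(path="flightline.discovery.json"):
    """Map each top-level key of the discovery file to a description of its value."""
    data = json.loads(Path(path).read_text(encoding="utf-8"))
    if isinstance(data, dict):
        return {key: describe(child) for key, child in data.items()}
    # The root might be an array or scalar; report it under a placeholder key.
    return {"<root>": describe(data)}


if __name__ == "__main__":
    for key, description in summarize_discovery().items():
        print(f"{key}: {description}")
```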

Step 2: Generate Test Scenarios

Create high-fidelity synthetic data based on your code and prompts:
flightline generate --from-discover flightline.discovery.json --count 10
The generated scenarios give you realistic inputs that cover both happy paths and edge cases, so you do not need any existing data samples.

Step 3: Record Execution Traces

Capture real behavior by wrapping your tests or application:
flightline fltrace pytest tests/
Flightline records the inputs, outputs, and performance metrics for every AI call made during the run.

Step 4: Evaluate AI Behavior

Compare actual outputs against your expected outcomes:
flightline eval scenarios
This identifies logic failures by matching captured traces against the ground truth in your generated scenarios.

Step 5: Run a Ship-Readiness Check

Perform a final assessment to see if the system is ready for production:
flightline check --traces .flightline/traces/
Flightline answers the 7 ship-blocking questions and provides a clear pass or fail verdict.
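
The five steps above can be chained into one script, for example as a CI job. The sketch below copies the commands verbatim from this guide and stops at the first step that fails. It rests on an assumption this guide does not confirm: that each flightline subcommand signals failure with a non-zero exit code. The names STEPS and run_pipeline are hypothetical.

```python
import subprocess

# Commands copied verbatim from Steps 1-5 of this guide.
STEPS = [
    ("discover", ["flightline", "discover"]),
    ("generate", ["flightline", "generate", "--from-discover",
                  "flightline.discovery.json", "--count", "10"]),
    ("trace", ["flightline", "fltrace", "pytest", "tests/"]),
    ("eval", ["flightline", "eval", "scenarios"]),
    ("check", ["flightline", "check", "--traces", ".flightline/traces/"]),
]


def run_pipeline(steps=STEPS, run=subprocess.call):
    """Run each step in order; return the name of the first failing step, or None."""
    for name, command in steps:
        # Assumption: a non-zero exit code means the step failed.
        if run(command) != 0:
            return name
    return None


if __name__ == "__main__":
    failed = run_pipeline()
    # Exit 0 on success; otherwise print the failing step and exit with status 1.
    raise SystemExit(0 if failed is None else f"step failed: {failed}")
```

Passing run as a parameter keeps the ordering and early-exit logic testable without the CLI installed. Because the CLI is in early access, verify its actual exit-code behavior before relying on this as a gate.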

Next Steps