flightline generate
The generate command mass-produces realistic test data and scenarios. It ensures you have a comprehensive set of inputs to test your AI system, covering everything from happy paths to hostile edge cases.
Usage
Generation Modes
1. From Discovery (Cold-Start)
This is the primary way to get started. It analyzes your code and prompts to infer what the inputs should look like.- Goal: Quickly bootstrap a test suite for a new feature.
- Coverage: Uses “Scenario Space” targeting to ensure scenarios cover different complexity and risk dimensions based on your implementation.
2. From Profile (Refinement)
Used when you have existing sample data and want to enhance the realism of your scenarios. It replicates the patterns and business logic found during thelearn step.
- Goal: Create large, statistically accurate datasets or refine scenarios with real-world domain nuances.
- Privacy: Automatically generates fake values that look realistic but are entirely synthetic.
Key Options
| Option | Description |
|---|---|
--from-discover | Path to your flightline.discovery.json. |
--count, -n | Number of records or scenarios to generate. |
--model, -m | The model used for synthesis (default: google/gemini-3-flash-preview). |
--output, -o | Custom path for the generated file. |
Refinement with learn
If your generated scenarios feel too generic, you can use real data samples to ground them. Run flightline learn on a sample file first, then point generate at the resulting profile:
Scenario Categories
When generating from discovery, Flightline targets specific categories:- Happy Path: Standard, well-formatted inputs.
- Edge Case: Boundary conditions and unusual but valid inputs.
- Failure Mode: Inputs designed to trigger known AI failure modes in your domain.
- Adversarial: Hostile or nonsensical inputs to test system robustness.
Next Steps
With your test data ready, the next step is to capture how your system handles these inputs.Capture Traces
Use
fltrace to record real LLM execution data.