flightline run

Run your LLM prompts against generated test data and validate the outputs.

Usage

flightline run

What It Does

The run command:

Loads synthetic test data from your configured directory
Runs your prompt against each test case
Applies the Fact-Checker to validate outputs
Reports pass/fail results
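
The four steps above can be pictured as a small loop. The sketch below is illustrative only and is not Flightline's implementation: `Case`, `Result`, `run_cases`, and `report` are hypothetical names, the prompt is stood in for by any function from a record to a string, and validation is reduced to an exact match against the expected output.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    """Hypothetical test case: a synthetic record plus the expected output."""
    name: str
    record: dict
    expected: str


@dataclass
class Result:
    """Outcome of running the prompt on one case."""
    case: Case
    received: str

    @property
    def passed(self) -> bool:
        # Simplification: exact match stands in for the real validation step.
        return self.received == self.case.expected


def run_cases(cases: list[Case], prompt_fn: Callable[[dict], str]) -> list[Result]:
    """Run the prompt against every case and record what came back."""
    return [Result(case, prompt_fn(case.record)) for case in cases]


def report(results: list[Result]) -> list[str]:
    """Build one line per failing case, followed by a summary line."""
    lines = [
        f"FAILURE: {r.case.name}. Expected: {r.case.expected!r} Received: {r.received!r}"
        for r in results
        if not r.passed
    ]
    passed = sum(r.passed for r in results)
    lines.append(f"{passed}/{len(results)} passed")
    return lines
```

Passing a stub such as `lambda record: "Applicant approved."` as `prompt_fn` makes it easy to try the loop without calling a model.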

Example

$ flightline run

> Running 'Financial Summary' prompt against 20 records...
> ❌ FAILURE: Scenario #4 (Negative Income).
> Expected: "Applicant rejected."
> Received: "Applicant approved with $0 income."

Validation Checks

The Fact-Checker applies several validation checks, including:

Numerical Consistency

Verifies that numbers in the LLM output match the source data. This catches hallucinated numbers before they reach production.
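
As a rough illustration of the idea, and not the Fact-Checker's actual logic, a numerical consistency check can extract every number from the output and flag any that do not appear in the source record. The function names and the flat-dictionary record shape are assumptions made for this sketch.

```python
import re

# Matches integers and decimals, with optional thousands separators and sign.
NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_numbers(text: str) -> set[float]:
    """Return every number found in the text, with separators removed."""
    return {float(match.replace(",", "")) for match in NUMBER.findall(text)}


def unsupported_numbers(output: str, source: dict) -> set[float]:
    """Return numbers in the output that are absent from the source record.

    Assumes a flat record whose numeric values are ints or floats.
    """
    allowed = {
        float(value)
        for value in source.values()
        if isinstance(value, (int, float)) and not isinstance(value, bool)
    }
    return extract_numbers(output) - allowed
```

An empty result means every number in the output is backed by the source. A check this simple would also flag legitimately derived values such as sums or percentages, so a real validator needs to be more permissive.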

Safety Guardrails

Ensures safety-critical responses trigger correctly. For example, a low credit score should result in a rejection, not an approval.
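
A guardrail of this kind can be expressed as a rule that ties a condition on the input to a required outcome in the output. The sketch below is hypothetical: the `credit_score` field, the threshold of 600, and the keyword matching are assumptions chosen for illustration, not Flightline's rules.

```python
def guardrail_violations(record: dict, output: str, min_credit_score: int = 600) -> list[str]:
    """Return a description of each guardrail the output breaks.

    Hypothetical rule: a credit score below the threshold must produce
    a rejection and must not produce an approval.
    """
    violations = []
    score = record.get("credit_score")
    text = output.lower()
    if score is not None and score < min_credit_score:
        if "approved" in text:
            violations.append(f"credit score {score} is below {min_credit_score} but output approves")
        elif "rejected" not in text:
            violations.append(f"credit score {score} is below {min_credit_score} but output does not reject")
    return violations
```

Returning a list rather than a boolean leaves room for several rules to report at once.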

CI/CD Integration

Add Flightline to your CI pipeline to catch regressions before merge. When evaluations fail, the pipeline fails, blocking bad prompts from reaching production.
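
One way to wire this into a pipeline is a small gate script that runs the command and passes its exit status back to the CI runner. This sketch assumes that `flightline run` exits with a non-zero status when an evaluation fails, which this page implies but does not state; `gate` is a hypothetical helper name.

```python
import subprocess
import sys


def gate(command: list[str]) -> int:
    """Run the command and return its exit code unchanged."""
    completed = subprocess.run(command)
    return completed.returncode


# In a CI step, the script would end with something like:
#     sys.exit(gate(["flightline", "run"]))
# Assumption: a failed evaluation makes `flightline run` exit non-zero.
```

Most CI systems fail a step when its process exits non-zero, so propagating the code is enough to block the merge.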

Next: The Mimic Command

Generate synthetic data from existing sample files.
