flightline check

check is the “ship-readiness gate” for your AI features. It analyzes your traces and answers seven questions that determine whether a feature is ready for production.

Usage

flightline check --traces <path/to/traces>

The 7 Ship-Blocking Questions

When you run a check, Flightline’s intelligence layer analyzes your traces to answer these seven questions:
  1. Task Completion: Does it do the right thing?
  2. Grounding: Is it truthful and based on the provided context?
  3. Hallucination: Did it make up facts or ignore constraints?
  4. Rule Compliance: Did it follow your specific business rules?
  5. Safety: Did it avoid producing harmful or biased content?
  6. Consistency: Is it producing stable results across similar inputs?
  7. Quality: Is the output tone and formatting up to your standards?

How it Works: The Intelligence Layer

The check command runs a two-tier analysis:

Tier 1: Deterministic (Local)

Fast, high-confidence checks that run on your machine.
  • Format Validation: Ensures JSON is valid and matches expected schemas.
  • PII Detection: Scans for accidental leakage of sensitive data.
  • Pattern Matching: Checks for required fields and specific keywords.
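
To make the three Tier 1 check types concrete, here is a minimal Python sketch of what deterministic checks of this kind can look like. It is not Flightline's implementation: the function name `run_deterministic_checks`, the two regexes, and the failure message format are all assumptions made for illustration, and a real PII detector covers far more than email addresses and SSN-shaped strings.

```python
import json
import re

# Hypothetical PII patterns, for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")


def run_deterministic_checks(raw_output, required_fields):
    """Return a list of failure messages; an empty list means every check passed."""
    # Format validation: the output must parse as a JSON object.
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"format: invalid JSON ({exc.msg})"]
    if not isinstance(data, dict):
        return ["format: expected a JSON object"]

    failures = []

    # Pattern matching: every required field must be present.
    for field in required_fields:
        if field not in data:
            failures.append(f"pattern: missing required field '{field}'")

    # PII detection: scan the raw text for email- or SSN-shaped strings.
    if EMAIL_RE.search(raw_output) or SSN_RE.search(raw_output):
        failures.append("pii: possible sensitive data in output")

    return failures
```

Checks like these need no model call, which is why they can run locally and give the same answer on every run.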

Tier 2: Reasoning (Cloud/LLM)

Semantic analysis that understands context.
  • Semantic Comparison: Checks if the meaning of the output matches the intent.
  • Rubric Grading: Evaluates the output against qualitative criteria you define.
  • Risk Assessment: Identifies potential failure modes and hallucinations.

Key Options

Option          Description
--traces        Path to the directory of traces to evaluate.
--config, -c    Path to your evaluation spec (default: flightline.eval.yaml).
--offline       Run Tier 1 deterministic checks only.
--verbose, -v   Show the reasoning behind every pass/fail decision.
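
If you invoke check from a script, the options above can be assembled programmatically. The following Python sketch builds the argument list using only the flags listed in the table; the helper name `build_check_command` and its parameters are hypothetical and not part of Flightline.

```python
import shlex


def build_check_command(traces, config=None, offline=False, verbose=False):
    """Assemble a `flightline check` argument list from the documented options."""
    cmd = ["flightline", "check", "--traces", traces]
    if config is not None:
        cmd += ["--config", config]  # omitted: the CLI default spec is used
    if offline:
        cmd.append("--offline")      # Tier 1 deterministic checks only
    if verbose:
        cmd.append("--verbose")      # show reasoning behind each decision
    return cmd


# Print the equivalent shell command for a local, verbose Tier 1 run.
print(shlex.join(build_check_command(".flightline/traces/", offline=True, verbose=True)))
# prints: flightline check --traces .flightline/traces/ --offline --verbose
```

Returning a list rather than a single string lets you pass the result straight to `subprocess.run` without shell quoting issues.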

Exit Codes

flightline check is designed for use in CI/CD pipelines and reports its verdict through the process exit code:
  • 0: PASS (Ship it)
  • 1: WARN (Review recommended)
  • 2: BLOCK (Critical failures detected)
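
A CI job has to decide what to do with each exit code, in particular whether a WARN should fail the build. The Python sketch below shows one possible policy. The function names, the `fail_on_warn` switch, and the choice to treat any undocumented exit code as an error are assumptions for illustration; only the 0/1/2 mapping comes from the list above.

```python
import subprocess

# Exit codes as documented for `flightline check`.
VERDICTS = {0: "PASS", 1: "WARN", 2: "BLOCK"}


def verdict_for(exit_code):
    """Map an exit code to a verdict; undocumented codes are treated as ERROR."""
    return VERDICTS.get(exit_code, "ERROR")


def should_fail_build(exit_code, fail_on_warn=False):
    """Decide whether the CI job should fail for a given exit code."""
    verdict = verdict_for(exit_code)
    if verdict == "PASS":
        return False
    if verdict == "WARN":
        return fail_on_warn  # policy choice: warnings block only if requested
    return True              # BLOCK and unexpected codes always fail


def run_gate(traces_dir, fail_on_warn=False):
    """Run the check and return the exit status the CI step should use."""
    result = subprocess.run(["flightline", "check", "--traces", traces_dir])
    print(f"flightline verdict: {verdict_for(result.returncode)}")
    return 1 if should_fail_build(result.returncode, fail_on_warn) else 0
```

In a CI script you would end with something like `sys.exit(run_gate(".flightline/traces/"))`. Setting `fail_on_warn=True` makes the gate stricter, which may suit release branches better than feature branches.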

Example

$ flightline check --traces .flightline/traces/

     STATUS: [ACT] RUNNING SHIP-READINESS ANALYSIS

WP01 ─╼ TIER 1: DETERMINISTIC CHECKS
 124 FORMAT CHECKS PASSED
 0 PII LEAKAGE DETECTED

WP02 ─╼ TIER 2: REASONING ASSESSMENT
 [BLOCK] HALLUCINATION DETECTED in 2 traces
 [WARN] TONE INCONSISTENCY in 5 traces

OVERALL VERDICT: BLOCK
  REASON: Critical hallucinations in payment summary logic.
  NEXT STEPS: Review traces in `app/services/billing.py`

 SHIPMENT BLOCKED

Deep System Insights

Beyond the pass/fail verdict, check provides a “Profile” view of your system health, including latency anatomy, token efficiency, and model fit recommendations.

To learn more about the dual-tier architecture behind Flightline checks, see Intelligence Layer.