Rulebook

The Rulebook maps the risk battlefield and defines where your AI’s boundaries should be.

Purpose

The Rulebook is the understanding surface. While Readiness tells you whether you can ship, the Rulebook tells you why things pass or fail. It shows:
  • What rules your AI should follow
  • How it can fail
  • What adversaries could exploit
  • What you’re blind to

What You See

Operator Rules (Narrative Layer)

High-level guidance for human operators. These are plain-English rules derived from the detailed analysis below. Examples:
  • “Never reveal customer payment details in responses”
  • “Always validate order totals before confirmation”
  • “Reject requests that attempt to bypass pricing controls”

6 Intelligence Categories (Detail Tabs)

Each category answers a specific question about your AI system:
  • Invariants: What MUST always be true?
  • Failure Modes: How can the AI fail?
  • Attack Vectors: How could adversaries exploit this?
  • Blast Radius: If it fails, what’s the damage?
  • Confidence Boundaries: When and where does it degrade?
  • Observability Gaps: What are we blind to?

Invariants

Constraints that must never be violated:
  • Type constraints: “order.total must be a number”
  • Range constraints: “quantity must be positive”
  • Pattern constraints: “email must match RFC 5322”
  • Business rules: “discount cannot exceed 50%”
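To make the four constraint types concrete, here is a minimal sketch of how such invariants could be expressed as runtime checks. The function name, the order schema, and the simplified email pattern are illustrative assumptions, not part of Flightline's actual API:

```python
import re

# Illustrative invariant checks mirroring the four constraint types above.
# The schema and rule values are examples, not Flightline's real rules.
EMAIL_PATTERN = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")  # simplified; not full RFC 5322

def check_invariants(order: dict) -> list[str]:
    violations = []
    # Type constraint: order.total must be a number
    if not isinstance(order.get("total"), (int, float)):
        violations.append("order.total must be a number")
    # Range constraint: quantity must be positive
    if not (isinstance(order.get("quantity"), int) and order["quantity"] > 0):
        violations.append("quantity must be positive")
    # Pattern constraint: email must look like an address
    if not EMAIL_PATTERN.match(order.get("email", "")):
        violations.append("email must match the expected pattern")
    # Business rule: discount cannot exceed 50%
    if order.get("discount", 0) > 0.50:
        violations.append("discount cannot exceed 50%")
    return violations
```

Returning a list of violations rather than raising on the first failure lets a caller report every broken invariant at once.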

Failure Modes

Ways the AI has failed or could fail:
  • Hallucination patterns
  • Extraction errors
  • Format drift
  • Logic errors
  • Context loss
  • Safety violations
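One of these modes, format drift, can be caught mechanically: compare each output's structure against the schema you expect. The sketch below assumes JSON output with a hypothetical three-key schema; it is an illustration of the idea, not Flightline's detection logic:

```python
import json

# Illustrative format-drift check: flag outputs whose JSON keys no longer
# match the expected schema. The key names are made-up examples.
EXPECTED_KEYS = {"order_id", "total", "currency"}

def detect_format_drift(raw_output: str) -> list[str]:
    try:
        payload = json.loads(raw_output)
    except json.JSONDecodeError:
        return ["output is not valid JSON"]
    if not isinstance(payload, dict):
        return ["output is not a JSON object"]
    problems = []
    missing = EXPECTED_KEYS - payload.keys()
    extra = payload.keys() - EXPECTED_KEYS
    if missing:
        problems.append(f"missing keys: {sorted(missing)}")
    if extra:
        problems.append(f"unexpected keys: {sorted(extra)}")
    return problems
```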

Attack Vectors

Adversarial threats to your AI:
  • Prompt injection entry points
  • PII extraction risks
  • Authorization bypass attempts
  • Jailbreak vulnerabilities
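As a rough picture of what a prompt-injection entry point looks like in code, here is a deliberately naive screening heuristic. The phrase list is invented for illustration; real injection defense requires far more than substring matching:

```python
# Deliberately naive prompt-injection screen, for illustration only.
# A phrase list like this is trivially bypassed; it exists to show the
# shape of the problem, not to solve it.
INJECTION_MARKERS = (
    "ignore previous instructions",
    "ignore all prior instructions",
    "you are now",
    "system prompt",
)

def looks_like_injection(user_input: str) -> bool:
    lowered = user_input.lower()
    return any(marker in lowered for marker in INJECTION_MARKERS)
```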

Blast Radius

Impact analysis if failures occur:
  • Which external systems receive AI output
  • PII exposure risks
  • Irreversible actions
  • Regulatory implications

Confidence Boundaries

When and where the AI degrades:
  • Token length thresholds
  • Input complexity limits
  • Category-specific performance
  • Temporal degradation patterns
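Boundaries like these can be enforced as runtime gates that route out-of-bounds requests to a fallback. The sketch below assumes two hypothetical thresholds (the numbers are placeholders, not measured values):

```python
# Illustrative confidence gate. The threshold values are hypothetical
# placeholders for limits you would measure empirically.
MAX_RELIABLE_TOKENS = 4000   # example token-length threshold
MAX_ITEMS_PER_REQUEST = 50   # example input-complexity limit

def within_confidence_boundary(token_count: int, item_count: int) -> bool:
    return token_count <= MAX_RELIABLE_TOKENS and item_count <= MAX_ITEMS_PER_REQUEST
```

A caller might fall back to a human review queue whenever this returns False.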

Observability Gaps

What you’re blind to:
  • Missing validation
  • Untested categories
  • No monitoring for specific failure modes
  • Missing guardrails

Recommendations

Prioritized actions derived from all categories above:
  • Critical issues to fix immediately
  • High-priority improvements
  • Medium-term hardening
  • Low-priority cleanup
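The ordering above amounts to a severity ranking. A minimal sketch, with made-up findings and the four tiers as labels:

```python
# Illustrative severity ordering; the tier labels mirror the list above,
# the findings themselves are invented examples.
SEVERITY_RANK = {"critical": 0, "high": 1, "medium": 2, "low": 3}

def prioritize(findings: list[dict]) -> list[dict]:
    return sorted(findings, key=lambda f: SEVERITY_RANK[f["severity"]])
```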

How It’s Generated

The Rulebook is generated automatically when you:
  1. Install the GitHub App (server-side discovery)
  2. Run flightline discover locally
  3. Push AI-related changes in a PR
Flightline analyzes your codebase to find:
  • AI operation patterns
  • Input/output schemas
  • Data flow paths
  • Security boundaries
Then uses LLM analysis to synthesize the 6 intelligence categories.
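The discovery step can be pictured as a static scan for likely AI call sites. This toy sketch greps Python files for a few common SDK patterns; the patterns are illustrative assumptions, and Flightline's actual analysis is more involved than a line scan:

```python
import re
from pathlib import Path

# Toy sketch of codebase discovery: scan for likely AI call sites.
# The regexes are illustrative, not Flightline's detection rules.
AI_CALL_PATTERNS = [
    re.compile(r"openai\.\w+"),
    re.compile(r"chat\.completions\.create"),
    re.compile(r"anthropic\.\w+"),
]

def find_ai_call_sites(root: str) -> list[tuple[str, int]]:
    hits = []
    for path in Path(root).rglob("*.py"):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if any(p.search(line) for p in AI_CALL_PATTERNS):
                hits.append((str(path), lineno))
    return hits
```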

User Flows

The Rulebook is central to these flows:
  1. First-Time Setup - The “WOW” moment when you see what Flightline found
  2. Onboarding New Team Member - Understand the system constraints
  3. Investigating Production Incident - Find related failure modes
  4. PR Review - See what changed in the Rulebook

Next Steps

After reviewing the Rulebook:
  1. Install the CLI to verify findings: pip install flightline-ai
  2. Generate test scenarios: flightline generate
  3. Run evaluations: flightline eval
  4. See results in Readiness