Rulebook
The Rulebook maps the risk battlefield and defines where your AI’s boundaries should be.
Purpose
The Rulebook is the understanding surface. While Readiness tells you whether you can ship, the Rulebook tells you why things pass or fail. It shows:
- What rules your AI should follow
- How it can fail
- What adversaries could exploit
- What you’re blind to
What You See
Operator Rules (Narrative Layer)
High-level guidance for human operators. These are plain-English rules derived from the detailed analysis below. Examples:
- “Never reveal customer payment details in responses”
- “Always validate order totals before confirmation”
- “Reject requests that attempt to bypass pricing controls”
6 Intelligence Categories (Detail Tabs)
Each category answers a specific question about your AI system:

| Category | Question It Answers |
|---|---|
| Invariants | What MUST always be true? |
| Failure Modes | How can the AI mess up? |
| Attack Vectors | How could adversaries exploit this? |
| Blast Radius | If it fails, what’s the damage? |
| Confidence Boundaries | When/where does it degrade? |
| Observability Gaps | What are we blind to? |
Invariants
Constraints that must never be violated:
- Type constraints: “order.total must be a number”
- Range constraints: “quantity must be positive”
- Pattern constraints: “email must match RFC 5322”
- Business rules: “discount cannot exceed 50%”
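The four constraint types above can be sketched as a single check over one AI-produced order. This is a minimal illustration, not Flightline's implementation: the function name `check_invariants`, the field names `quantity`, `email`, and `discount_pct`, and the simplified email pattern are all assumptions made for the example (a real RFC 5322 check is considerably more involved).

```python
import re
from numbers import Number

# Hypothetical, deliberately simplified email pattern (NOT full RFC 5322).
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def is_number(value) -> bool:
    """True for ints and floats, but not bools (bool is a subclass of int)."""
    return isinstance(value, Number) and not isinstance(value, bool)


def check_invariants(order: dict) -> list[str]:
    """Return the invariants violated by one order; an empty list means all hold."""
    violations = []

    # Type constraint: order.total must be a number.
    if not is_number(order.get("total")):
        violations.append("order.total must be a number")

    # Range constraint: quantity must be positive.
    quantity = order.get("quantity")
    if not is_number(quantity) or quantity <= 0:
        violations.append("quantity must be positive")

    # Pattern constraint: email must match the (simplified) pattern.
    email = order.get("email")
    if not isinstance(email, str) or not EMAIL_RE.match(email):
        violations.append("email must match the expected pattern")

    # Business rule: discount cannot exceed 50%.
    discount = order.get("discount_pct", 0)
    if not is_number(discount) or discount > 50:
        violations.append("discount cannot exceed 50%")

    return violations
```

Returning a list of violations rather than raising on the first one makes it easier to report every broken invariant for a given output at once.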
Failure Modes
Ways the AI has failed or could fail:
- Hallucination patterns
- Extraction errors
- Format drift
- Logic errors
- Context loss
- Safety violations
Attack Vectors
Adversarial threats to your AI:
- Prompt injection entry points
- PII extraction risks
- Authorization bypass attempts
- Jailbreak vulnerabilities
Blast Radius
Impact analysis if failures occur:
- Which external systems receive AI output
- PII exposure risks
- Irreversible actions
- Regulatory implications
Confidence Boundaries
When and where the AI degrades:
- Token length thresholds
- Input complexity limits
- Category-specific performance
- Temporal degradation patterns
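One way to picture a token length threshold is to bucket evaluation results by input length and look for the first bucket where the pass rate drops. The sketch below assumes you already have `(token_count, passed)` pairs from your own evaluations; the function names, the bucket size of 500, and the 0.9 cutoff are illustrative choices, not values Flightline uses.

```python
from collections import defaultdict


def accuracy_by_length(results, bucket_size=500):
    """Group (token_count, passed) pairs into length buckets; return pass rate per bucket."""
    totals = defaultdict(int)
    passes = defaultdict(int)
    for token_count, passed in results:
        # Bucket key is the lower bound of the bucket, e.g. 0, 500, 1000, ...
        bucket = (token_count // bucket_size) * bucket_size
        totals[bucket] += 1
        passes[bucket] += int(passed)
    # Sorted so callers can scan from short inputs to long ones.
    return {bucket: passes[bucket] / totals[bucket] for bucket in sorted(totals)}


def first_degraded_bucket(rates, min_rate=0.9):
    """Return the lowest bucket whose pass rate is below min_rate, or None."""
    for bucket, rate in rates.items():
        if rate < min_rate:
            return bucket
    return None
```

For example, `first_degraded_bucket(accuracy_by_length(results))` returns the lower bound of the shortest length range where quality falls below the chosen cutoff.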
Observability Gaps
What you’re blind to:
- Missing validation
- Untested categories
- No monitoring for specific failure modes
- Missing guardrails
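Conceptually, a gap is anything known to be a risk that has no corresponding test or monitor. A rough sketch of that idea as a set difference follows; the function name `find_gaps` and the failure-mode labels in the usage example are hypothetical.

```python
def find_gaps(known_failure_modes, monitored, tested):
    """Return known failure modes that lack monitoring or tests."""
    known = set(known_failure_modes)
    return {
        # Known failure modes with no monitoring in place.
        "unmonitored": sorted(known - set(monitored)),
        # Known failure modes never exercised by a test.
        "untested": sorted(known - set(tested)),
    }
```

Calling `find_gaps(["hallucination", "format_drift"], ["hallucination"], [])` would list `format_drift` as unmonitored and both modes as untested.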
Recommendations
Prioritized actions derived from all categories above:
- Critical issues to fix immediately
- High-priority improvements
- Medium-term hardening
- Low-priority cleanup
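The ordering above can be sketched as a stable sort over findings tagged with one of the four priority levels. The finding shape (a dict with `priority` and `title` keys) and the names below are assumptions for illustration only.

```python
# Lower number sorts first; the levels mirror the four tiers listed above.
PRIORITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


def prioritize(findings):
    """Return findings sorted from critical to low.

    sorted() is stable, so findings with equal priority keep their input order.
    """
    return sorted(findings, key=lambda finding: PRIORITY_ORDER[finding["priority"]])
```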
How It’s Generated
The Rulebook is generated automatically when you:
- Install the GitHub App (server-side discovery)
- Run `flightline discover` locally
- Push AI-related changes in a PR

Discovery analyzes your code for:
- AI operation patterns
- Input/output schemas
- Data flow paths
- Security boundaries
User Flows
The Rulebook is central to these flows:
- First-Time Setup - The “WOW” moment when you see what Flightline found
- Onboarding New Team Member - Understand the system constraints
- Investigating Production Incident - Find related failure modes
- PR Review - See what changed in the Rulebook
Next Steps
After reviewing the Rulebook:
- Install the CLI to verify findings: `pip install flightline-ai`
- Generate test scenarios: `flightline generate`
- Run evaluations: `flightline eval`
- See results in Readiness
Related
- Readiness - The decision surface
- Dashboard - Fleet overview
- CLI: discover - Trigger discovery locally
