Input/Output Scope

Input/Output Scope controls when detection runs: on LLM inputs, outputs, or both. This lets you optimize for your security needs and performance requirements.

Policy Modes

Mode	Scans Input	Scans Output	Use Case
`IO`	✅	✅	Full protection (recommended)
`I`	✅	❌	Input-only, lowest latency
`O`	❌	✅	Output-only, for pre-validated inputs

Configuration

{
  "policy_mode": "IO",
  "input_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L2", "action": "block" }
  ],
  "output_detectors": [
    { "detector_type": "pii/email", "threshold": "L2", "action": "block" }
  ]
}

Input Detection

When: Before the request reaches the LLM

Purpose:

Block malicious prompts (injection, jailbreaks)
Prevent submission of sensitive data
Reduce cost by blocking bad requests early

{
  "policy_mode": "I",
  "input_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L2", "action": "block" },
    { "detector_type": "pii/credit_card", "threshold": "L1", "action": "block" }
  ]
}

Input Detection Flow

flowchart LR
    A[User Request] --> B{Input Detectors}
    B -->|BLOCKED| C[Error Response]
    B -->|PASSED| D[LLM]
    D --> E[Response]

    style A fill:#1a1a2e,stroke:#00d4ff,color:#fff
    style B fill:#0d3d4d,stroke:#00d4ff,color:#fff
    style C fill:#4a1a1a,stroke:#ff4444,color:#fff
    style D fill:#1a1a2e,stroke:#00d4ff,color:#fff
    style E fill:#1a1a2e,stroke:#00d4ff,color:#fff

Best For Input Detection

Detector	Why Input?
`prompt_attack`	Block attacks before they reach the model
`pii/credit_card`	Prevent accidental submission of payment data
`pii/ssn`	Block SSN submission

Output Detection

When: After the LLM generates a response, before it reaches the user

Purpose:

Catch data the model shouldn’t leak
Filter harmful content in responses
Safety net for prompt attacks that bypassed input detection

{
  "policy_mode": "O",
  "output_detectors": [
    { "detector_type": "pii/email", "threshold": "L2", "action": "block" },
    { "detector_type": "moderated_content/hate", "threshold": "L2", "action": "block" }
  ]
}

Output Detection Flow

flowchart LR
    A[User Request] --> B[LLM]
    B --> C{Output Detectors}
    C -->|BLOCKED| D[Error Response]
    C -->|PASSED| E[User Response]

    style A fill:#1a1a2e,stroke:#00d4ff,color:#fff
    style B fill:#1a1a2e,stroke:#00d4ff,color:#fff
    style C fill:#0d3d4d,stroke:#00d4ff,color:#fff
    style D fill:#4a1a1a,stroke:#ff4444,color:#fff
    style E fill:#1a1a2e,stroke:#00d4ff,color:#fff

Best For Output Detection

Detector	Why Output?
`pii/*`	Catch model leaking training data
`moderated_content/*`	Filter harmful generated content
`unknown_links`	Validate URLs in responses

Full IO Detection (Recommended)

For most applications, scan both inputs and outputs:

{
  "policy_mode": "IO",
  "input_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L2", "action": "block" },
    { "detector_type": "pii/credit_card", "threshold": "L1", "action": "block" }
  ],
  "output_detectors": [
    { "detector_type": "pii/email", "threshold": "L2", "action": "block" },
    { "detector_type": "pii/credit_card", "threshold": "L1", "action": "block" },
    { "detector_type": "moderated_content/hate", "threshold": "L2", "action": "block" }
  ]
}

IO Detection Flow

flowchart LR
    A[User Request] --> B{Input Detectors}
    B -->|BLOCKED| C[Error Response]
    B -->|PASSED| D[LLM]
    D --> E{Output Detectors}
    E -->|BLOCKED| F[Error]
    E -->|PASSED| G[User Response]

    style A fill:#1a1a2e,stroke:#00d4ff,color:#fff
    style B fill:#0d3d4d,stroke:#00d4ff,color:#fff
    style C fill:#4a1a1a,stroke:#ff4444,color:#fff
    style D fill:#1a1a2e,stroke:#00d4ff,color:#fff
    style E fill:#0d3d4d,stroke:#00d4ff,color:#fff
    style F fill:#4a1a1a,stroke:#ff4444,color:#fff
    style G fill:#1a1a2e,stroke:#00d4ff,color:#fff

Performance Considerations

Latency: +signature time (both) + LLM detection (if enabled)

Best for:

User-facing applications
Sensitive data handling
Regulatory compliance

Input: ~11µs (signatures) + 0-100ms (LLM)
Output: ~11µs (signatures) + 0-100ms (LLM)

Latency: +signature time + LLM detection

Best for:

Internal tools with trusted outputs
High-throughput, low-latency requirements
When LLM output is post-processed anyway

Input: ~11µs (signatures) + 0-100ms (LLM)
Output: No additional latency

Latency: +signature time + LLM detection

Best for:

Pre-validated/sanitized inputs
Output-focused concerns (data leakage)
When inputs come from trusted sources

Input: No additional latency
Output: ~11µs (signatures) + 0-100ms (LLM)

Detector Placement Strategy

Same Detector, Different Stages

You can use the same detector type for both input and output with different settings:

{
  "input_detectors": [
    { "detector_type": "pii/email", "threshold": "L3", "action": "flag" }
  ],
  "output_detectors": [
    { "detector_type": "pii/email", "threshold": "L2", "action": "block" }
  ]
}

This policy:

Flags emails in input (users might legitimately mention emails)
Blocks emails in output (model shouldn’t leak them)

Different Actions by Stage

{
  "input_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L2", "action": "block" }
  ],
  "output_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L3", "action": "flag" }
  ]
}

This policy:

Blocks injection attempts on input
Flags (but allows) injection-like patterns in output for review

Recommendations by Use Case

Public Chatbot

{
  "policy_mode": "IO",
  "input_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L2", "action": "block" }
  ],
  "output_detectors": [
    { "detector_type": "moderated_content/hate", "threshold": "L2", "action": "block" },
    { "detector_type": "pii/email", "threshold": "L2", "action": "block" }
  ]
}

Internal Developer Tool

{
  "policy_mode": "I",
  "input_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L1", "action": "flag" }
  ]
}

Healthcare Application

{
  "policy_mode": "IO",
  "input_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L3", "action": "block" },
    { "detector_type": "pii/ssn", "threshold": "L1", "action": "block" }
  ],
  "output_detectors": [
    { "detector_type": "pii/ssn", "threshold": "L1", "action": "block" },
    { "detector_type": "pii/email", "threshold": "L1", "action": "block" },
    { "detector_type": "pii/phone", "threshold": "L2", "action": "flag" }
  ]
}

Content Generation Platform

{
  "policy_mode": "IO",
  "input_detectors": [
    { "detector_type": "prompt_attack", "threshold": "L2", "action": "block" }
  ],
  "output_detectors": [
    { "detector_type": "moderated_content/hate", "threshold": "L2", "action": "block" },
    { "detector_type": "moderated_content/sexual", "threshold": "L2", "action": "block" },
    { "detector_type": "moderated_content/violence", "threshold": "L2", "action": "block" }
  ]
}

Next Steps

Policies — Complete policy configuration guide
Threshold Levels — Tune detection sensitivity
Quick Start — Get running with your first policy