Skip to content

Input/Output Scope

Input/Output Scope controls when detection runs: on LLM inputs, outputs, or both. This lets you optimize for your security needs and performance requirements.

ModeScans InputScans OutputUse Case
IOFull protection (recommended)
IInput-only, lowest latency
OOutput-only, for pre-validated inputs
{
"policy_mode": "IO",
"input_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L2", "action": "block" }
],
"output_detectors": [
{ "detector_type": "pii/email", "threshold": "L2", "action": "block" }
]
}

When: Before the request reaches the LLM

Purpose:

  • Block malicious prompts (injection, jailbreaks)
  • Prevent submission of sensitive data
  • Reduce cost by blocking bad requests early
{
"policy_mode": "I",
"input_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L2", "action": "block" },
{ "detector_type": "pii/credit_card", "threshold": "L1", "action": "block" }
]
}
flowchart LR
A[User Request] --> B{Input Detectors}
B -->|BLOCKED| C[Error Response]
B -->|PASSED| D[LLM]
D --> E[Response]
style A fill:#1a1a2e,stroke:#00d4ff,color:#fff
style B fill:#0d3d4d,stroke:#00d4ff,color:#fff
style C fill:#4a1a1a,stroke:#ff4444,color:#fff
style D fill:#1a1a2e,stroke:#00d4ff,color:#fff
style E fill:#1a1a2e,stroke:#00d4ff,color:#fff
DetectorWhy Input?
prompt_attackBlock attacks before they reach the model
pii/credit_cardPrevent accidental submission of payment data
pii/ssnBlock SSN submission

When: After the LLM generates a response, before it reaches the user

Purpose:

  • Catch data the model shouldn’t leak
  • Filter harmful content in responses
  • Safety net for prompt attacks that bypassed input detection
{
"policy_mode": "O",
"output_detectors": [
{ "detector_type": "pii/email", "threshold": "L2", "action": "block" },
{ "detector_type": "moderated_content/hate", "threshold": "L2", "action": "block" }
]
}
flowchart LR
A[User Request] --> B[LLM]
B --> C{Output Detectors}
C -->|BLOCKED| D[Error Response]
C -->|PASSED| E[User Response]
style A fill:#1a1a2e,stroke:#00d4ff,color:#fff
style B fill:#1a1a2e,stroke:#00d4ff,color:#fff
style C fill:#0d3d4d,stroke:#00d4ff,color:#fff
style D fill:#4a1a1a,stroke:#ff4444,color:#fff
style E fill:#1a1a2e,stroke:#00d4ff,color:#fff
DetectorWhy Output?
pii/*Catch model leaking training data
moderated_content/*Filter harmful generated content
unknown_linksValidate URLs in responses

For most applications, scan both inputs and outputs:

{
"policy_mode": "IO",
"input_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L2", "action": "block" },
{ "detector_type": "pii/credit_card", "threshold": "L1", "action": "block" }
],
"output_detectors": [
{ "detector_type": "pii/email", "threshold": "L2", "action": "block" },
{ "detector_type": "pii/credit_card", "threshold": "L1", "action": "block" },
{ "detector_type": "moderated_content/hate", "threshold": "L2", "action": "block" }
]
}
flowchart LR
A[User Request] --> B{Input Detectors}
B -->|BLOCKED| C[Error Response]
B -->|PASSED| D[LLM]
D --> E{Output Detectors}
E -->|BLOCKED| F[Error]
E -->|PASSED| G[User Response]
style A fill:#1a1a2e,stroke:#00d4ff,color:#fff
style B fill:#0d3d4d,stroke:#00d4ff,color:#fff
style C fill:#4a1a1a,stroke:#ff4444,color:#fff
style D fill:#1a1a2e,stroke:#00d4ff,color:#fff
style E fill:#0d3d4d,stroke:#00d4ff,color:#fff
style F fill:#4a1a1a,stroke:#ff4444,color:#fff
style G fill:#1a1a2e,stroke:#00d4ff,color:#fff

Latency: +signature time (both) + LLM detection (if enabled)

Best for:

  • User-facing applications
  • Sensitive data handling
  • Regulatory compliance
Input: ~11µs (signatures) + 0-100ms (LLM)
Output: ~11µs (signatures) + 0-100ms (LLM)

You can use the same detector type for both input and output with different settings:

{
"input_detectors": [
{ "detector_type": "pii/email", "threshold": "L3", "action": "flag" }
],
"output_detectors": [
{ "detector_type": "pii/email", "threshold": "L2", "action": "block" }
]
}

This policy:

  • Flags emails in input (users might legitimately mention emails)
  • Blocks emails in output (model shouldn’t leak them)
{
"input_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L2", "action": "block" }
],
"output_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L3", "action": "flag" }
]
}

This policy:

  • Blocks injection attempts on input
  • Flags (but allows) injection-like patterns in output for review
{
"policy_mode": "IO",
"input_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L2", "action": "block" }
],
"output_detectors": [
{ "detector_type": "moderated_content/hate", "threshold": "L2", "action": "block" },
{ "detector_type": "pii/email", "threshold": "L2", "action": "block" }
]
}
{
"policy_mode": "I",
"input_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L1", "action": "flag" }
]
}
{
"policy_mode": "IO",
"input_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L3", "action": "block" },
{ "detector_type": "pii/ssn", "threshold": "L1", "action": "block" }
],
"output_detectors": [
{ "detector_type": "pii/ssn", "threshold": "L1", "action": "block" },
{ "detector_type": "pii/email", "threshold": "L1", "action": "block" },
{ "detector_type": "pii/phone", "threshold": "L2", "action": "flag" }
]
}
{
"policy_mode": "IO",
"input_detectors": [
{ "detector_type": "prompt_attack", "threshold": "L2", "action": "block" }
],
"output_detectors": [
{ "detector_type": "moderated_content/hate", "threshold": "L2", "action": "block" },
{ "detector_type": "moderated_content/sexual", "threshold": "L2", "action": "block" },
{ "detector_type": "moderated_content/violence", "threshold": "L2", "action": "block" }
]
}