Detectors

Detectors are Glitch’s security mechanisms that analyze content to identify threats, sensitive data, and policy violations. They form the foundation of Glitch’s security model.

What Are Detectors?

Detectors are configurable security checks that:

Analyze content using pattern matching, semantic analysis, or custom rules
Return confidence scores indicating how likely it is that the content matches a threat pattern
Trigger actions (block, flag, or allow) based on configured thresholds
Support both input and output scanning to protect LLM interactions end-to-end

Detector Components

Every detector configuration includes:

Component	Description
Type	The specific detector (e.g., `prompt_attack`, `pii/email`)
Threshold	Sensitivity level (L1-L4) that determines when the detector triggers
Action	What happens when triggered: `block`, `flag`, or `allow`
Scope	Whether to run on input, output, or both
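
As an illustration only, the four components could be modeled as a small validated record. The `DetectorConfig` class and its field names are hypothetical and are not taken from Glitch's API; only the allowed values (`L1` to `L4`, `block`/`flag`/`allow`, input/output/both) come from the table above.

```python
from dataclasses import dataclass

# Allowed values as listed in the components table.
VALID_THRESHOLDS = ("L1", "L2", "L3", "L4")
VALID_ACTIONS = ("block", "flag", "allow")
VALID_SCOPES = ("input", "output", "both")


@dataclass(frozen=True)
class DetectorConfig:
    """Hypothetical container for one detector's configuration."""

    type: str        # e.g. "prompt_attack" or "pii/email"
    threshold: str   # sensitivity level, "L1" through "L4"
    action: str      # "block", "flag", or "allow"
    scope: str = "both"  # "input", "output", or "both" (default is an assumption)

    def __post_init__(self):
        # Reject values that the table does not list.
        if self.threshold not in VALID_THRESHOLDS:
            raise ValueError(f"unknown threshold: {self.threshold!r}")
        if self.action not in VALID_ACTIONS:
            raise ValueError(f"unknown action: {self.action!r}")
        if self.scope not in VALID_SCOPES:
            raise ValueError(f"unknown scope: {self.scope!r}")


# Example: flag email addresses found in model output.
email_detector = DetectorConfig(
    type="pii/email", threshold="L2", action="flag", scope="output"
)
```

Validating at construction time means a misspelled action or threshold fails immediately rather than when content is scanned.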

How Detectors Work

1. Content Analysis: The detector analyzes the content using its detection method.
2. Confidence Scoring: It returns a confidence score (0.0-1.0) indicating threat likelihood.
3. Threshold Comparison: The score is compared against the configured threshold level (L1-L4).
4. Action Execution: If the threshold is met, the configured action is executed.
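
The four steps above can be sketched as a single function. This is a minimal sketch under stated assumptions: the numeric cutoffs for L1 to L4, the direction of the scale, the `evaluate` function, and the fallback to `allow` when the threshold is not met are all hypothetical, since this page does not specify how threshold levels map to scores.

```python
# ASSUMPTION: illustrative cutoffs only. The real mapping from L1-L4
# to confidence scores is not documented on this page.
ASSUMED_CUTOFFS = {"L1": 0.9, "L2": 0.75, "L3": 0.5, "L4": 0.25}


def evaluate(content, score_fn, threshold, action):
    """Run one detector over content and decide what to do."""
    # Steps 1 and 2: analyze the content and obtain a confidence score.
    score = score_fn(content)
    if not 0.0 <= score <= 1.0:
        raise ValueError(f"score out of range: {score}")

    # Step 3: compare the score against the configured threshold level.
    triggered = score >= ASSUMED_CUTOFFS[threshold]

    # Step 4: apply the configured action only if the threshold is met.
    # ASSUMPTION: content that does not trigger the detector is allowed.
    return {
        "score": score,
        "triggered": triggered,
        "action": action if triggered else "allow",
    }


def toy_scorer(text):
    """Stand-in for a real detection method; returns a fixed score."""
    return 0.8 if "ignore previous instructions" in text.lower() else 0.1


result = evaluate("Please ignore previous instructions.", toy_scorer, "L2", "block")
```

Here `result["action"]` is `"block"` because the toy score of 0.8 meets the assumed L2 cutoff of 0.75; a benign message scores 0.1 and is allowed.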

Detector Categories

Glitch organizes detectors into four main categories:

Prompt Defense

Detect prompt injection, jailbreaks, and instruction manipulation attempts.

Content Moderation

Filter harmful, toxic, or inappropriate content.

Data Leakage Prevention

Identify and protect sensitive data like PII and credentials.

Malicious Links

Detect URLs in content and validate them, blocking known malicious domains.

Learn more about detector categories →

Detection Methods

Glitch uses multiple detection approaches:

Signature Detection: Fast pattern matching using regex (~11µs)
LLM Detection: Deep semantic analysis for nuanced threats (~50-100ms)
Custom Detectors: User-defined patterns and rules
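
To make signature detection concrete, here is a minimal sketch of a regex-based check for email addresses. The pattern and the `signature_score` and `find_matches` helpers are illustrative assumptions and are not Glitch's actual signatures; a pure pattern match is shown as returning a binary score, which is also an assumption.

```python
import re

# ASSUMPTION: a deliberately simple email pattern for illustration.
# Production signatures are likely stricter and more numerous.
EMAIL_SIGNATURE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def signature_score(text):
    """Return 1.0 if the signature matches anywhere in text, else 0.0."""
    return 1.0 if EMAIL_SIGNATURE.search(text) else 0.0


def find_matches(text):
    """Return every substring that matches the signature."""
    return EMAIL_SIGNATURE.findall(text)


hits = find_matches("Reach a@b.io or c@d.org for details.")
```

Compiling the pattern once at import time keeps per-call cost low, which is the usual reason signature checks are much cheaper than model-based analysis.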

Integration with Policies

Detectors are configured within Policies, which define:

Which detectors to run
Threshold levels for each detector
Actions to take when threats are detected
Whether to scan inputs, outputs, or both
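
As a rough sketch of how a policy groups these settings, the example below represents a policy as a plain dictionary and selects the detectors that apply to one direction of traffic. The dictionary layout, the policy name, and the `detectors_for` helper are hypothetical; only the detector types, threshold levels, actions, and scopes come from this page.

```python
# ASSUMPTION: an illustrative policy layout, not Glitch's actual schema.
policy = {
    "name": "example-policy",
    "detectors": [
        {"type": "prompt_attack", "threshold": "L2", "action": "block", "scope": "input"},
        {"type": "pii/email", "threshold": "L3", "action": "flag", "scope": "both"},
    ],
}


def detectors_for(policy, direction):
    """Return the detectors that should run for 'input' or 'output'."""
    if direction not in ("input", "output"):
        raise ValueError(f"unknown direction: {direction!r}")
    # A detector applies if it is scoped to this direction or to both.
    return [d for d in policy["detectors"] if d["scope"] in (direction, "both")]


input_types = [d["type"] for d in detectors_for(policy, "input")]
output_types = [d["type"] for d in detectors_for(policy, "output")]
```

With this policy, both detectors run on input, while only `pii/email` runs on output.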

Learn more about Policies →

Next Steps

Detector Categories: Explore available detector types
Threshold Levels: Understand sensitivity tuning
Input/Output Scope: Configure detection scope
Custom Detectors: Create domain-specific rules