Enabling Agent Guard
Via Middleware Options
Via Policy YAML
Built-in Detectors
Agent Guard includes the following detectors out of the box:

| Detector | Finding Code | What It Detects |
|---|---|---|
| Prompt Injection | agent_prompt_injection | Attempts to override system instructions or jailbreak |
| PII | agent_pii | Social security numbers, emails, phone numbers, etc. |
| Secrets | agent_secret | API keys, passwords, tokens, connection strings |
| Toxic Content | agent_toxic_content | Hate speech, harassment, explicit content |
| Command Safety | agent_command_unsafe | Shell commands, system calls, dangerous operations |
| Malicious Code | agent_malicious_code | Code injection, eval patterns, exploit payloads |
| Policy Violation | agent_policy_violation | Custom policy-defined violations |
| Data Exfiltration | agent_data_exfil | Attempts to send sensitive data to external endpoints |
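
To make the table concrete, here is a minimal sketch of pattern-based detection that returns finding codes. It is not the Sec0 implementation: the `PATTERNS` map, the `detect` function, and the three regexes are illustrative assumptions, and only the finding codes come from the table above.

```typescript
// Hypothetical sketch, not the Sec0 implementation.
// Only the finding codes are taken from the table above.
type FindingCode = "agent_pii" | "agent_secret" | "agent_command_unsafe";

// Illustrative patterns (assumptions); the real built-in patterns are not shown here.
// None of them use the `g` flag, so RegExp.test() keeps no state between calls.
const PATTERNS: Record<FindingCode, RegExp[]> = {
  agent_pii: [/\b\d{3}-\d{2}-\d{4}\b/], // US SSN-shaped number
  agent_secret: [/\bAKIA[0-9A-Z]{16}\b/], // AWS access key ID shape
  agent_command_unsafe: [/\brm\s+-rf\s+\//], // recursive delete of an absolute path
};

// Returns the code of every detector with at least one matching pattern.
function detect(text: string): FindingCode[] {
  const hits: FindingCode[] = [];
  for (const code of Object.keys(PATTERNS) as FindingCode[]) {
    if (PATTERNS[code].some((re) => re.test(text))) {
      hits.push(code);
    }
  }
  return hits;
}
```

A real detector set would cover all eight codes and use far more careful patterns; the point here is only the mapping from a match to a finding code.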
Finding Structure
Each finding follows a structured format:

Severity Levels
| Level | Description |
|---|---|
| unknown | Cannot be classified |
| low | Informational finding |
| medium | Potential risk worth investigating |
| high | Likely policy violation; should be blocked in production |
| critical | Active threat; block immediately |
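
One way to picture a finding and its severity is the sketch below. The `Finding` field names and the `meetsThreshold` helper are assumptions made for illustration, since the actual finding schema is not reproduced in this section. Ranking `unknown` below `low` is also an assumption, taken from the row order of the table.

```typescript
// Hypothetical sketch: field names are assumptions, not the documented schema.
// Severity values and their order follow the table above.
const SEVERITY_ORDER = ["unknown", "low", "medium", "high", "critical"] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

interface Finding {
  code: string; // e.g. "agent_secret"
  severity: Severity;
  location: "input" | "output" | "run"; // where the content was scanned
  message: string; // human-readable description
}

// True when the finding is at or above the given severity.
function meetsThreshold(finding: Finding, threshold: Severity): boolean {
  return (
    SEVERITY_ORDER.indexOf(finding.severity) >= SEVERITY_ORDER.indexOf(threshold)
  );
}
```

Keeping the order in a single array means comparisons and the `Severity` type cannot drift apart.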
Scan Locations
| Location | When Scanned |
|---|---|
| input | Tool arguments before execution |
| output | Tool results after execution |
| run | Accumulated run context (objective + previous actions) |
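
The three locations can be sketched as a wrapper around a tool call. Everything named here (`guardedCall`, `RunContext`, `Scan`) is hypothetical, and scanning the run context after the tool returns is an assumption; the document only states what each location covers. The sketch records findings and does not block.

```typescript
// Hypothetical sketch of where each scan location fits around a tool call.
type ScanLocation = "input" | "output" | "run";
type Scan = (text: string, location: ScanLocation) => string[]; // returns finding codes

interface RunContext {
  objective: string;
  previousActions: string[];
}

interface LocatedFinding {
  location: ScanLocation;
  code: string;
}

function guardedCall(
  tool: (args: string) => string,
  args: string,
  ctx: RunContext,
  scan: Scan,
): { result: string; findings: LocatedFinding[] } {
  const findings: LocatedFinding[] = [];
  const record = (text: string, location: ScanLocation): void => {
    for (const code of scan(text, location)) {
      findings.push({ location, code });
    }
  };

  record(args, "input"); // tool arguments, before execution
  const result = tool(args);
  record(result, "output"); // tool result, after execution

  // Accumulate this action, then scan objective + all actions so far.
  ctx.previousActions.push(`${args} -> ${result}`);
  record([ctx.objective, ...ctx.previousActions].join("\n"), "run");

  return { result, findings };
}
```

Because the `run` scan sees the objective and every prior action together, it can flag a combination (for example, reading a secret in one step and calling an external URL in the next) that neither the `input` nor the `output` scan would catch alone.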
Custom Patterns
Override the built-in regex patterns for any detector:

External Scanner Adapters
Chain external scanning services for additional coverage:

Custom Adapter
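
A custom adapter might look like the following sketch. The `ScannerAdapter` interface, the `AdapterFinding` shape, and `runAdapters` are assumptions invented for illustration; the actual adapter contract is defined by Sec0 and is not reproduced here. The demo adapter uses a local regex as a stand-in for a call to an external service.

```typescript
// Hypothetical sketch: the real Sec0 adapter contract may differ.
interface AdapterFinding {
  code: string;
  severity: string;
  source: string; // which adapter produced the finding
}

interface ScannerAdapter {
  name: string;
  scan(text: string): Promise<AdapterFinding[]>;
}

// Stand-in for an external service: a local keyword check.
const keywordAdapter: ScannerAdapter = {
  name: "keyword-demo",
  async scan(text) {
    return /ignore previous instructions/i.test(text)
      ? [{ code: "agent_prompt_injection", severity: "high", source: "keyword-demo" }]
      : [];
  },
};

// Runs every adapter on the same text and merges their findings in adapter order.
async function runAdapters(
  adapters: ScannerAdapter[],
  text: string,
): Promise<AdapterFinding[]> {
  const results = await Promise.all(adapters.map((a) => a.scan(text)));
  return ([] as AdapterFinding[]).concat(...results);
}
```

This version rejects if any adapter throws. Whether a failing external scanner should fail open or fail closed is a policy decision the sketch leaves out.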
Run Context Scanning
Enable accumulated run-context scanning to detect multi-turn attacks:

Enforcement
Agent Guard integrates with Sec0’s policy enforcement:

- Observe mode (`deny_on: []`): Findings are logged in audit envelopes but requests are not blocked
- Enforce mode (`deny_on: ["agent_guard_failed"]`): Requests are blocked when findings exceed the configured threshold

The `agent_guard_findings` array is always included in the audit envelope, regardless of enforcement mode.
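
The difference between the two modes can be sketched as a small decision function. Only `deny_on`, `agent_guard_failed`, and `agent_guard_findings` come from this page. The `decide` function, the `threshold` parameter, and treating "exceed" as "at or above" are assumptions made for illustration.

```typescript
// Hypothetical sketch of observe vs. enforce mode; not the Sec0 implementation.
const LEVELS = ["unknown", "low", "medium", "high", "critical"] as const;
type Level = (typeof LEVELS)[number];

interface GuardFinding {
  code: string;
  severity: Level;
}

interface Decision {
  blocked: boolean;
  envelope: { agent_guard_findings: GuardFinding[] };
}

function decide(
  findings: GuardFinding[],
  denyOn: string[], // value of deny_on from the policy
  threshold: Level, // assumed name for the configured severity threshold
): Decision {
  // Assumption: a finding at or above the threshold counts as exceeding it.
  const exceeded = findings.some(
    (f) => LEVELS.indexOf(f.severity) >= LEVELS.indexOf(threshold),
  );
  const enforcing = denyOn.indexOf("agent_guard_failed") !== -1;
  return {
    blocked: enforcing && exceeded,
    // Findings are recorded in the envelope in both modes.
    envelope: { agent_guard_findings: findings },
  };
}
```

With `deny_on: []` the function never blocks, but the envelope still carries every finding, which matches the observe-mode behavior described above.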
For the full Agent Guard options reference, see Middleware Options Reference.