Agent Guard is Sec0’s in-device content scanning engine. It runs inline on every tool invocation and scans both inputs and outputs for prompt injection, PII leakage, secrets exposure, toxic content, and malicious code. Agent Guard is configured as part of the middleware and runs in-process, so the built-in detectors add no network round trip.

Enabling Agent Guard

Via Middleware Options

import { sec0SecurityMiddleware } from "sec0-sdk/middleware";

sec0SecurityMiddleware({
  policy,
  signer,
  otel,
  sec0: { dir: ".sec0" },
  agentGuard: {
    enabled: true,
    block_on_severity: "high",   // Block if any finding >= this severity
    block_on_count: 5,           // Block if total findings >= this count
  },
})(server);

Via Policy YAML

agent_guard:
  enabled: true
  block_on_severity: high

enforcement:
  deny_on:
    - agent_guard_failed

Built-in Detectors

Agent Guard includes the following detectors out of the box:
| Detector | Finding Code | What It Detects |
| --- | --- | --- |
| Prompt Injection | agent_prompt_injection | Attempts to override system instructions or jailbreak |
| PII | agent_pii | Social security numbers, emails, phone numbers, etc. |
| Secrets | agent_secret | API keys, passwords, tokens, connection strings |
| Toxic Content | agent_toxic_content | Hate speech, harassment, explicit content |
| Command Safety | agent_command_unsafe | Shell commands, system calls, dangerous operations |
| Malicious Code | agent_malicious_code | Code injection, eval patterns, exploit payloads |
| Policy Violation | agent_policy_violation | Custom policy-defined violations |
| Data Exfiltration | agent_data_exfil | Attempts to send sensitive data to external endpoints |

Finding Structure

Each finding follows a structured format:
{
  "code": "agent_prompt_injection",
  "severity": "high",
  "location": "input",
  "message": "Detected prompt injection attempt in tool input",
  "evidence": "Ignore all previous instructions and...",
  "path": "args.prompt",
  "tags": ["security", "injection"]
}
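For callers that consume findings in TypeScript, the shape above can be written down as a type with a small runtime check. This is a minimal sketch: the field names come from the example finding, but the type names `Severity` and `ScanLocation`, the helper `isAgentGuardFinding`, and the choice of which fields are optional are assumptions, not the SDK's published types.

```typescript
// Assumed type names; the allowed values are the ones documented on this page.
type Severity = "unknown" | "low" | "medium" | "high" | "critical";
type ScanLocation = "input" | "output" | "run";

interface AgentGuardFinding {
  code: string;
  severity: Severity;
  location: ScanLocation;
  message: string;
  evidence?: string; // assumed optional
  path?: string;     // assumed optional
  tags?: string[];   // assumed optional
}

const SEVERITIES: readonly string[] = ["unknown", "low", "medium", "high", "critical"];
const LOCATIONS: readonly string[] = ["input", "output", "run"];

// Hypothetical runtime guard: checks only the fields assumed to be required.
function isAgentGuardFinding(value: unknown): value is AgentGuardFinding {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return (
    typeof v.code === "string" &&
    typeof v.severity === "string" && SEVERITIES.includes(v.severity) &&
    typeof v.location === "string" && LOCATIONS.includes(v.location) &&
    typeof v.message === "string"
  );
}
```

A guard like this is mainly useful when findings arrive from outside the process, for example from an external scanner, and need validating before they are returned to the middleware.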

Severity Levels

| Level | Description |
| --- | --- |
| unknown | Cannot be classified |
| low | Informational finding |
| medium | Potential risk worth investigating |
| high | Likely policy violation; should be blocked in production |
| critical | Active threat; block immediately |
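To make the interaction between these levels and the `block_on_severity` and `block_on_count` options concrete, here is a rough sketch of the threshold check. It assumes the levels are ordered as listed, with `unknown` ranked lowest, and that either threshold alone is enough to block; the function name `shouldBlock` and the `Thresholds` type are illustrative, not part of the SDK.

```typescript
type Severity = "unknown" | "low" | "medium" | "high" | "critical";

interface Finding {
  code: string;
  severity: Severity;
}

interface Thresholds {
  block_on_severity?: Severity;
  block_on_count?: number;
}

// Assumed ordering: follows the severity table, with "unknown" lowest.
const RANK: Record<Severity, number> = {
  unknown: 0,
  low: 1,
  medium: 2,
  high: 3,
  critical: 4,
};

// Returns true when any finding reaches the severity threshold,
// or when the total number of findings reaches the count threshold.
function shouldBlock(findings: Finding[], t: Thresholds): boolean {
  const minSeverity = t.block_on_severity;
  const severityHit =
    minSeverity !== undefined &&
    findings.some((f) => RANK[f.severity] >= RANK[minSeverity]);
  const countHit =
    t.block_on_count !== undefined && findings.length >= t.block_on_count;
  return severityHit || countHit;
}
```

Under these assumptions, a single `critical` finding blocks when `block_on_severity` is `"high"`, while five `low` findings block only because of `block_on_count: 5`.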

Scan Locations

| Location | When Scanned |
| --- | --- |
| input | Tool arguments before execution |
| output | Tool results after execution |
| run | Accumulated run context (objective + previous actions) |

Custom Patterns

Override the built-in regex patterns for any detector:
sec0SecurityMiddleware({
  // ...
  agentGuard: {
    enabled: true,
    block_on_severity: "high",
    pii_patterns: [
      "\\b\\d{3}-\\d{2}-\\d{4}\\b",    // SSN
      "\\b[A-Z]{2}\\d{6,8}\\b",         // Passport
    ],
    secret_patterns: [
      "(?i)api[_-]?key\\s*[=:]\\s*\\S+",
      "(?i)bearer\\s+[a-zA-Z0-9._-]+",
    ],
    prompt_injection_patterns: [
      "(?i)ignore\\s+(all\\s+)?previous\\s+instructions",
      "(?i)you\\s+are\\s+now\\s+in\\s+developer\\s+mode",
    ],
    dangerous_commands: [
      "(?i)\\brm\\s+-rf\\b",
      "(?i)\\bsudo\\b",
    ],
  },
})(server);
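Several of the patterns above start with an inline `(?i)` flag. JavaScript's built-in `RegExp` does not accept a bare `(?i)` prefix, and this page does not say how Agent Guard compiles pattern strings, so it can help to try patterns against sample text before adding them to the config. The sketch below is a local test harness only; `compilePattern` and `matchesAny` are hypothetical helpers that translate a leading `(?i)` into the `i` flag, and they are not part of the SDK.

```typescript
// Hypothetical helper: turn a pattern string into a RegExp,
// mapping a leading "(?i)" onto JavaScript's case-insensitive flag.
function compilePattern(source: string): RegExp {
  const prefix = "(?i)";
  return source.startsWith(prefix)
    ? new RegExp(source.slice(prefix.length), "i")
    : new RegExp(source);
}

// Patterns copied from the configuration example above.
const piiPatterns = [
  "\\b\\d{3}-\\d{2}-\\d{4}\\b", // SSN
  "\\b[A-Z]{2}\\d{6,8}\\b",     // Passport
].map(compilePattern);

const commandPatterns = [
  "(?i)\\brm\\s+-rf\\b",
  "(?i)\\bsudo\\b",
].map(compilePattern);

// True when at least one pattern matches the text.
function matchesAny(patterns: RegExp[], text: string): boolean {
  return patterns.some((p) => p.test(text));
}

console.log(matchesAny(piiPatterns, "SSN on file: 123-45-6789"));
console.log(matchesAny(commandPatterns, "SUDO apt-get install curl"));
console.log(matchesAny(commandPatterns, "ls -la"));
```

Checking both a matching and a non-matching sample for each pattern catches over-broad expressions, such as a secret pattern that fires on ordinary prose, before they start producing findings.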

External Scanner Adapters

Chain external scanning services for additional coverage:

Custom Adapter

sec0SecurityMiddleware({
  // ...
  agentGuard: {
    enabled: true,
    block_on_severity: "high",
    adapters: [
      {
        type: "custom",
        onScanPrompt: async (text) => {
          // Call your scanner service
          const findings = await myScanner.scanInput(text);
          return findings; // Return AgentGuardFinding[]
        },
        onScanOutput: async (text) => {
          const findings = await myScanner.scanOutput(text);
          return findings;
        },
        onScanRun: async (context) => {
          const findings = await myScanner.scanRunContext(context);
          return findings;
        },
      },
    ],
  },
})(server);
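The example above delegates to a `myScanner` service that is not shown. As a stand-in, here is a minimal self-contained adapter that flags a fixed list of terms. The hook names and `type: "custom"` mirror the example; the `CustomAdapter` and `AgentGuardFinding` type declarations, the `BLOCKED_TERMS` list, and the `scanForTerms` helper are assumptions made for illustration.

```typescript
// Assumed shapes, modeled on the finding structure and adapter example on this page.
interface AgentGuardFinding {
  code: string;
  severity: "unknown" | "low" | "medium" | "high" | "critical";
  location: "input" | "output" | "run";
  message: string;
  evidence?: string;
  tags?: string[];
}

interface CustomAdapter {
  type: "custom";
  onScanPrompt?: (text: string) => Promise<AgentGuardFinding[]>;
  onScanOutput?: (text: string) => Promise<AgentGuardFinding[]>;
}

// Hypothetical term list; a real adapter would call a scanning service instead.
const BLOCKED_TERMS = ["internal-only", "confidential"];

// Produce one finding per blocked term present in the text (case-insensitive).
function scanForTerms(text: string, location: "input" | "output"): AgentGuardFinding[] {
  const lower = text.toLowerCase();
  return BLOCKED_TERMS.filter((term) => lower.includes(term)).map(
    (term): AgentGuardFinding => ({
      code: "agent_policy_violation",
      severity: "medium",
      location,
      message: `Blocked term found in tool ${location}`,
      evidence: term,
      tags: ["custom"],
    }),
  );
}

const termAdapter: CustomAdapter = {
  type: "custom",
  onScanPrompt: async (text) => scanForTerms(text, "input"),
  onScanOutput: async (text) => scanForTerms(text, "output"),
};
```

An object like `termAdapter` would then be listed under `adapters` in the `agentGuard` options, in place of the inline adapter shown above.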

Run Context Scanning

Enable accumulated run-context scanning to detect multi-turn attacks:
sec0SecurityMiddleware({
  // ...
  agentGuard: {
    enabled: true,
    run_context: {
      enabled: true,
      max_chars: 50000,        // Max total characters in context
      max_events: 100,         // Max events to accumulate
      max_event_chars: 5000,   // Max chars per event
      max_runs: 50,            // Max cached runs
      ttl_ms: 3600000,         // Cache TTL (1 hour)
      include_objective: true, // Include the agent's objective
      include_metadata: true,  // Include hop metadata
    },
  },
})(server);
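The size limits are easier to reason about with a concrete model. The sketch below shows one way a bounded run context could behave under `max_event_chars`, `max_events`, and `max_chars`. It is an illustration only: the `RunContextBuffer` class is hypothetical, the oldest-first eviction policy is an assumption, and `max_runs` and `ttl_ms` are left out.

```typescript
interface RunContextLimits {
  max_chars: number;
  max_events: number;
  max_event_chars: number;
}

// Hypothetical bounded buffer for a single run's events.
class RunContextBuffer {
  private events: string[] = [];
  private readonly limits: RunContextLimits;

  constructor(limits: RunContextLimits) {
    this.limits = limits;
  }

  add(event: string): void {
    // Truncate any single event to max_event_chars.
    this.events.push(event.slice(0, this.limits.max_event_chars));
    // Assumed policy: evict the oldest events once max_events is exceeded.
    while (this.events.length > this.limits.max_events) {
      this.events.shift();
    }
    // Evict oldest events until the total fits max_chars, keeping at least the newest.
    while (this.totalChars() > this.limits.max_chars && this.events.length > 1) {
      this.events.shift();
    }
  }

  totalChars(): number {
    return this.events.reduce((sum, e) => sum + e.length, 0);
  }

  // The text a run-level scan would see in this model.
  text(): string {
    return this.events.join("\n");
  }
}
```

In this model, lowering `max_chars` or `max_events` shortens how far back a scan can look, which trades memory for the ability to catch attacks spread across many turns.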

Enforcement

Agent Guard integrates with Sec0’s policy enforcement:
  • Observe mode (deny_on: []): Findings are logged in audit envelopes but requests are not blocked
  • Enforce mode (deny_on: ["agent_guard_failed"]): Requests are blocked when findings meet or exceed a configured threshold (block_on_severity or block_on_count)
The agent_guard_findings array is always included in the audit envelope, regardless of enforcement mode. For the full Agent Guard options reference, see Middleware Options Reference.
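The difference between the two modes can be summarized in a few lines. This sketch assumes a simplified audit envelope with only a decision and the findings array; the `AuditEnvelope` shape, the `decision` field, and the `decide` function are hypothetical, and only the `agent_guard_findings` name and the `agent_guard_failed` reason come from this page.

```typescript
interface FindingSummary {
  code: string;
  severity: string;
}

// Hypothetical, simplified envelope; the real audit envelope is not described on this page.
interface AuditEnvelope {
  decision: "allow" | "deny";
  agent_guard_findings: FindingSummary[];
}

// thresholdReached: whether the findings hit block_on_severity or block_on_count.
// denyOn: the enforcement.deny_on list from the policy.
function decide(
  findings: FindingSummary[],
  thresholdReached: boolean,
  denyOn: string[],
): AuditEnvelope {
  const enforcing = denyOn.includes("agent_guard_failed");
  return {
    decision: enforcing && thresholdReached ? "deny" : "allow",
    // Findings are recorded in both modes.
    agent_guard_findings: findings,
  };
}
```

Running a policy in observe mode first and reviewing the recorded findings is a low-risk way to tune thresholds before adding `agent_guard_failed` to `deny_on`.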