Agent Guard

Agent Guard is Sec0’s in-device content scanning engine. It runs inline on every tool invocation and scans both inputs and outputs for prompt injection, PII leakage, secrets exposure, toxic content, and malicious code. Agent Guard is configured as part of the middleware and runs in-process, so the built-in checks are enforced without a network round trip.

Enabling Agent Guard

Via Middleware Options

import { sec0SecurityMiddleware } from "sec0-sdk/middleware";

sec0SecurityMiddleware({
  policy,
  signer,
  otel,
  sec0: { dir: ".sec0" },
  agentGuard: {
    enabled: true,
    block_on_severity: "high",   // Block if any finding >= this severity
    block_on_count: 5,           // Block if total findings >= this count
  },
})(server);

Via Policy YAML

agent_guard:
  enabled: true
  block_on_severity: high

enforcement:
  deny_on:
    - agent_guard_failed

Built-in Detectors

Agent Guard includes the following detectors out of the box:

Detector	Finding Code	What It Detects
Prompt Injection	`agent_prompt_injection`	Attempts to override system instructions or jailbreak
PII	`agent_pii`	Social security numbers, emails, phone numbers, etc.
Secrets	`agent_secret`	API keys, passwords, tokens, connection strings
Toxic Content	`agent_toxic_content`	Hate speech, harassment, explicit content
Command Safety	`agent_command_unsafe`	Shell commands, system calls, dangerous operations
Malicious Code	`agent_malicious_code`	Code injection, eval patterns, exploit payloads
Policy Violation	`agent_policy_violation`	Custom policy-defined violations
Data Exfiltration	`agent_data_exfil`	Attempts to send sensitive data to external endpoints

Finding Structure

Each finding follows a structured format:

{
  "code": "agent_prompt_injection",
  "severity": "high",
  "location": "input",
  "message": "Detected prompt injection attempt in tool input",
  "evidence": "Ignore all previous instructions and...",
  "path": "args.prompt",
  "tags": ["security", "injection"]
}
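
When findings are read back from audit logs or returned by an adapter, it can help to validate their shape first. The following is a minimal sketch, not the SDK's own type definition: the field names come from the example above, the allowed `severity` and `location` values come from the tables on this page, and treating `evidence`, `path`, and `tags` as optional is an assumption.

```typescript
// Shape inferred from the example finding; optionality of `evidence`,
// `path`, and `tags` is an assumption, not confirmed by the SDK.
type Severity = "unknown" | "low" | "medium" | "high" | "critical";
type ScanLocation = "input" | "output" | "run";

interface AgentGuardFinding {
  code: string;
  severity: Severity;
  location: ScanLocation;
  message: string;
  evidence?: string;
  path?: string;
  tags?: string[];
}

const SEVERITIES: readonly string[] = ["unknown", "low", "medium", "high", "critical"];
const LOCATIONS: readonly string[] = ["input", "output", "run"];

// Runtime check for the required fields of a finding.
function isAgentGuardFinding(value: unknown): value is AgentGuardFinding {
  if (typeof value !== "object" || value === null) return false;
  const { code, severity, location, message } = value as Record<string, unknown>;
  return (
    typeof code === "string" &&
    typeof severity === "string" && SEVERITIES.includes(severity) &&
    typeof location === "string" && LOCATIONS.includes(location) &&
    typeof message === "string"
  );
}
```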

Severity Levels

Level	Description
`unknown`	Cannot be classified
`low`	Informational finding
`medium`	Potential risk worth investigating
`high`	Likely policy violation; should be blocked in production
`critical`	Active threat; block immediately
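
The severity levels feed the `block_on_severity` and `block_on_count` thresholds shown under Via Middleware Options. The sketch below illustrates one way those thresholds could be evaluated. It rests on two assumptions that are not confirmed here: severities are ordered as listed in the table, with `unknown` ranked lowest, and the two thresholds are combined with OR. `shouldBlock` is a hypothetical helper, not an SDK function.

```typescript
type Severity = "unknown" | "low" | "medium" | "high" | "critical";

// Assumed ordering: follows the table above, with "unknown" ranked lowest.
const SEVERITY_RANK: Record<Severity, number> = {
  unknown: 0,
  low: 1,
  medium: 2,
  high: 3,
  critical: 4,
};

interface BlockThresholds {
  block_on_severity?: Severity;
  block_on_count?: number;
}

// Returns true when either configured threshold is met (OR is an assumption).
function shouldBlock(findings: { severity: Severity }[], t: BlockThresholds): boolean {
  const minRank =
    t.block_on_severity === undefined ? undefined : SEVERITY_RANK[t.block_on_severity];
  const bySeverity =
    minRank !== undefined && findings.some((f) => SEVERITY_RANK[f.severity] >= minRank);
  const byCount =
    t.block_on_count !== undefined && findings.length >= t.block_on_count;
  return bySeverity || byCount;
}
```

Under these assumptions, a single `medium` finding does not trip a `high` threshold, while five `low` findings do trip `block_on_count: 5`.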

Scan Locations

Location	When Scanned
`input`	Tool arguments before execution
`output`	Tool results after execution
`run`	Accumulated run context (objective + previous actions)

Custom Patterns

Override the built-in regex patterns for any detector:

sec0SecurityMiddleware({
  // ...
  agentGuard: {
    enabled: true,
    block_on_severity: "high",
    pii_patterns: [
      "\\b\\d{3}-\\d{2}-\\d{4}\\b",    // SSN
      "\\b[A-Z]{2}\\d{6,8}\\b",         // Passport
    ],
    secret_patterns: [
      "(?i)api[_-]?key\\s*[=:]\\s*\\S+",
      "(?i)bearer\\s+[a-zA-Z0-9._-]+",
    ],
    prompt_injection_patterns: [
      "(?i)ignore\\s+(all\\s+)?previous\\s+instructions",
      "(?i)you\\s+are\\s+now\\s+in\\s+developer\\s+mode",
    ],
    dangerous_commands: [
      "(?i)\\brm\\s+-rf\\b",
      "(?i)\\bsudo\\b",
    ],
  },
})(server);
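
Custom patterns are easier to trust once they have been run against sample strings. The sketch below is a local approximation only: how Sec0 compiles these pattern strings is not described on this page. Because JavaScript's `RegExp` does not accept a bare `(?i)` prefix, the hypothetical `compilePattern` helper translates that prefix into the `i` flag before compiling.

```typescript
// Compile a pattern string into a JavaScript RegExp for local testing.
// A leading "(?i)" becomes the "i" flag, since JavaScript rejects that
// inline-flag form. This approximates, and may differ from, Sec0's behavior.
function compilePattern(pattern: string): RegExp {
  const prefix = "(?i)";
  return pattern.startsWith(prefix)
    ? new RegExp(pattern.slice(prefix.length), "i")
    : new RegExp(pattern);
}

// True if at least one pattern matches the text.
function matchesAny(patterns: string[], text: string): boolean {
  return patterns.some((p) => compilePattern(p).test(text));
}

// Patterns copied from the configuration above.
const dangerousCommands = ["(?i)\\brm\\s+-rf\\b", "(?i)\\bsudo\\b"];
const piiPatterns = ["\\b\\d{3}-\\d{2}-\\d{4}\\b"];

console.log(matchesAny(dangerousCommands, "sudo apt install curl"));
console.log(matchesAny(piiPatterns, "SSN: 123-45-6789"));
```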

External Scanner Adapters

Chain external scanning services for additional coverage:

Custom Adapter

sec0SecurityMiddleware({
  // ...
  agentGuard: {
    enabled: true,
    block_on_severity: "high",
    adapters: [
      {
        type: "custom",
        onScanPrompt: async (text) => {
          // Call your scanner service
          const findings = await myScanner.scanInput(text);
          return findings; // Return AgentGuardFinding[]
        },
        onScanOutput: async (text) => {
          const findings = await myScanner.scanOutput(text);
          return findings;
        },
        onScanRun: async (context) => {
          const findings = await myScanner.scanRunContext(context);
          return findings;
        },
      },
    ],
  },
})(server);
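
An adapter usually has to translate an external scanner's response into `AgentGuardFinding` objects and decide what happens when the scanner is slow. The sketch below shows one possible approach. Everything outside the callback name `onScanPrompt` is hypothetical: the `ScannerHit` shape, the score cut-offs, the stubbed `myScanner`, and the choice to return no findings on timeout (fail-open) are illustrative assumptions, not Sec0 or vendor behavior.

```typescript
type Severity = "unknown" | "low" | "medium" | "high" | "critical";

interface AgentGuardFinding {
  code: string;
  severity: Severity;
  location: "input" | "output" | "run";
  message: string;
  evidence?: string;
  tags?: string[];
}

// Hypothetical response shape of an external scanner.
interface ScannerHit { rule: string; score: number; snippet: string; }

// Map a score in [0, 1] to a severity. The cut-offs are illustrative.
function scoreToSeverity(score: number): Severity {
  if (score >= 0.9) return "critical";
  if (score >= 0.7) return "high";
  if (score >= 0.4) return "medium";
  return "low";
}

function toFindings(hits: ScannerHit[], location: "input" | "output" | "run"): AgentGuardFinding[] {
  return hits.map((h) => ({
    code: "agent_policy_violation",
    severity: scoreToSeverity(h.score),
    location,
    message: `External scanner rule matched: ${h.rule}`,
    evidence: h.snippet.slice(0, 200), // keep evidence short
    tags: ["external"],
  }));
}

// Resolve with `fallback` if `work` does not settle within `ms` milliseconds.
function withTimeout<T>(work: Promise<T>, ms: number, fallback: T): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => resolve(fallback), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (err) => { clearTimeout(timer); reject(err); },
    );
  });
}

// Stub standing in for a real scanner client so the sketch runs on its own.
const myScanner = {
  async scanInput(text: string): Promise<ScannerHit[]> {
    return text.toLowerCase().includes("ignore all previous instructions")
      ? [{ rule: "override-attempt", score: 0.95, snippet: text }]
      : [];
  },
};

const customAdapter = {
  type: "custom" as const,
  // Fail-open: a scan that exceeds 2 seconds yields no findings.
  onScanPrompt: (text: string) =>
    withTimeout<AgentGuardFinding[]>(
      myScanner.scanInput(text).then((hits) => toFindings(hits, "input")),
      2000,
      [],
    ),
};
```

Returning an empty array on timeout keeps tool calls flowing when the scanner is unavailable. A stricter deployment might instead return a synthetic high-severity finding so that a scanner outage blocks the request.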

Run Context Scanning

Enable accumulated run-context scanning to detect multi-turn attacks:

agentGuard: {
  enabled: true,
  run_context: {
    enabled: true,
    max_chars: 50000,        // Max total characters in context
    max_events: 100,         // Max events to accumulate
    max_event_chars: 5000,   // Max chars per event
    max_runs: 50,            // Max cached runs
    ttl_ms: 3600000,         // Cache TTL (1 hour)
    include_objective: true, // Include the agent's objective
    include_metadata: true,  // Include hop metadata
  },
}
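
To illustrate how limits such as `max_event_chars`, `max_events`, and `max_chars` might interact, here is a minimal sketch of a bounded event buffer. The truncate-then-evict-oldest strategy is an assumption; this page does not say how Agent Guard trims its cache. `appendEvent` and `contextText` are hypothetical helpers.

```typescript
interface RunContextLimits {
  max_chars: number;       // cap on total characters across events
  max_events: number;      // cap on number of events
  max_event_chars: number; // cap on characters per event
}

// Append an event truncated to max_event_chars, then drop the oldest events
// until both max_events and max_chars hold. Oldest-first eviction is assumed.
function appendEvent(events: string[], event: string, limits: RunContextLimits): string[] {
  const next = [...events, event.slice(0, limits.max_event_chars)];
  let total = next.reduce((sum, e) => sum + e.length, 0);
  while (next.length > 0 && (next.length > limits.max_events || total > limits.max_chars)) {
    const dropped = next.shift();
    total -= dropped === undefined ? 0 : dropped.length;
  }
  return next;
}

// Join accumulated events into the text that a run-level scan would inspect.
function contextText(events: string[]): string {
  return events.join("\n");
}
```

Scanning the joined text is what makes multi-turn detection possible: an instruction split across two events, such as "ignore all" followed by "previous instructions", matches a prompt-injection pattern only once the events are combined.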

Enforcement

Agent Guard integrates with Sec0’s policy enforcement:

Observe mode (`deny_on: []`): Findings are logged in audit envelopes, but requests are not blocked.
Enforce mode (`deny_on: ["agent_guard_failed"]`): Requests are blocked when findings meet or exceed a configured threshold (`block_on_severity` or `block_on_count`).

The `agent_guard_findings` array is always included in the audit envelope, regardless of enforcement mode. For the full Agent Guard options reference, see Middleware Options Reference.
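
The difference between the two modes can be sketched as a small decision function. This is an illustration under stated assumptions, not Sec0's implementation: `decide` and the `GuardDecision` shape are hypothetical, and only the `deny_on` entry `agent_guard_failed` and the `agent_guard_findings` field name come from this page.

```typescript
interface Finding { code: string; severity: string; }

// Hypothetical result shape; only `agent_guard_findings` is named on this page.
interface GuardDecision {
  blocked: boolean;
  agent_guard_findings: Finding[];
}

// `thresholdMet` is true when findings reached block_on_severity or block_on_count.
function decide(denyOn: string[], thresholdMet: boolean, findings: Finding[]): GuardDecision {
  return {
    // Block only when the threshold was met AND the policy denies on guard failure.
    blocked: thresholdMet && denyOn.includes("agent_guard_failed"),
    // Findings are recorded in both observe and enforce mode.
    agent_guard_findings: findings,
  };
}
```

Starting in observe mode and reviewing the recorded findings before adding `agent_guard_failed` to `deny_on` is one way to tune thresholds without interrupting traffic.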
