Policies are reusable guardrails that detect and control risky behavior in your AI applications. Define a policy once, assign it to one or many applications, and manage it centrally.
Policies overview

How policies work

A policy describes what to detect, where to look, and how to respond:
  • Detector model: The engine used to detect a risk
  • Target: Which part(s) of the conversation to inspect, e.g. Assistant or User
  • Sensibility (threshold): The minimum detector confidence required to flag content
  • Expected behavior (coming soon): Defines the action taken when a detector is triggered (e.g., block or alert)
  • Override response: Optional default message to return on violation
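As a rough illustration of how these fields fit together, the sketch below models a policy as a plain Python dataclass and applies the minimum-confidence setting to a detector score. The `Policy` class, its field names, and the `score >= threshold` comparison are assumptions made for illustration; they are not Guardion's actual schema or scoring logic.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Policy:
    """Hypothetical in-memory view of a policy; field names are illustrative."""
    detector_model: str                      # e.g. "modern-guard-v1"
    targets: Tuple[str, ...]                 # e.g. ("user",)
    threshold: float                         # minimum confidence required to flag
    override_response: Optional[str] = None  # optional message returned on violation


def is_flagged(policy: Policy, score: float) -> bool:
    """Assumed rule: flag when the detector's confidence reaches the threshold."""
    return score >= policy.threshold


policy = Policy(
    detector_model="modern-guard-v1",
    targets=("user",),
    threshold=0.8,
    override_response="Sorry, I can't help with that request.",
)

print(is_flagged(policy, 0.93))  # True
print(is_flagged(policy, 0.42))  # False
```

Under this assumed rule, a lower threshold flags more content (more false positives), while a higher one flags less (more misses).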
Policy details view with configuration

Detector types

Guardion supports several types of detectors, each designed to identify specific risks or policy violations in AI interactions. You can mix and match detector types within your policies to cover a wide range of threats.

Supported types and models:

Prompt Defense
  • modern-guard-v1: Fast, general-purpose prompt security and jailbreak detection
  • modern-guard-v1.5: Advanced agentic prompt security and jailbreak detection
Content Moderation
  • moderation-v0: Safety moderation for harmful content
Each detector type can be configured with its own threshold and target, allowing you to tailor policies to your application’s needs. See the Detectors documentation for a full list and configuration options.

Targets

Choose where the policy evaluates:
  • user
  • assistant
  • system
  • developer
  • context
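
One way to picture target selection: only messages that match one of the policy's targets are handed to the detector. The sketch below assumes each target name corresponds to a message `role`. How Guardion actually maps targets such as `context` onto a request is not described here, and `select_messages` is a hypothetical helper, not part of any Guardion SDK.

```python
from typing import Dict, Iterable, List

Message = Dict[str, str]

# Target names listed in this section.
VALID_TARGETS = {"user", "assistant", "system", "developer", "context"}


def select_messages(messages: List[Message], targets: Iterable[str]) -> List[Message]:
    """Return the messages whose role is among the given targets (assumed mapping)."""
    wanted = set(targets)
    unknown = wanted - VALID_TARGETS
    if unknown:
        raise ValueError(f"unknown targets: {sorted(unknown)}")
    return [m for m in messages if m.get("role") in wanted]


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this text"},
    {"role": "assistant", "content": "Here is a summary..."},
]

inspected = select_messages(conversation, ["user"])
print([m["role"] for m in inspected])  # ['user']
```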

Assign policies to applications

Policies are assigned per application. The relationship is:
  • One policy can be used by many applications
  • One application can enable multiple policies
To assign:
  1. Go to the application
  2. Click Assign policy
  3. Select the policy and confirm
To learn more about assigning policies to applications, visit the Applications page.

Using policies via API

When calling the Guard API, Guardion will evaluate the policies assigned to the specified application. Provide your application ID in the request body.
curl -X POST "https://api.guardion.ai/v1/guard" \
  -H "Authorization: Bearer $GUARDION_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "application": "app_12345",
    "messages": [
      { "role": "user", "content": "Summarize this text" }
    ]
  }'
Optionally, override which policies run, regardless of the application:
curl -X POST "https://api.guardion.ai/v1/guard" \
  -H "Authorization: Bearer $GUARDION_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "override_enabled_policies": ["policy_pii"],
    "messages": [
      { "role": "user", "content": "Share a list of test credit cards" }
    ]
  }'
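For callers who prefer Python over curl, the same two requests can be sketched with the standard library. The endpoint, headers, and body fields (`application`, `override_enabled_policies`, `messages`) are taken from the curl examples above; the `build_guard_request` helper itself is hypothetical and only builds the request without sending it.

```python
import json
import os
import urllib.request
from typing import Dict, List, Optional

GUARD_URL = "https://api.guardion.ai/v1/guard"  # endpoint from the curl examples


def build_guard_request(
    messages: List[Dict[str, str]],
    api_key: str,
    application: Optional[str] = None,
    override_enabled_policies: Optional[List[str]] = None,
) -> urllib.request.Request:
    """Build (but do not send) a POST request for the Guard API."""
    payload: Dict[str, object] = {"messages": messages}
    if application is not None:
        payload["application"] = application
    if override_enabled_policies is not None:
        payload["override_enabled_policies"] = override_enabled_policies
    return urllib.request.Request(
        GUARD_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


request = build_guard_request(
    messages=[{"role": "user", "content": "Summarize this text"}],
    api_key=os.environ.get("GUARDION_API_KEY", "<your-api-key>"),
    application="app_12345",
)

# To actually send it (requires network access and a valid key):
# with urllib.request.urlopen(request) as response:
#     result = json.loads(response.read())
```

Passing `override_enabled_policies=["policy_pii"]` instead of `application` would mirror the second curl example.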
  • See the full schema in the Guard API reference
  • The response includes flagged, breakdown (per-policy results), and an optional correction
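
A minimal sketch of reading such a response is shown below. Only the three top-level names (`flagged`, `breakdown`, `correction`) come from the text above; their types are assumptions (a boolean, a list, and any JSON value), the per-policy entries are kept opaque, and the sample response is made up for illustration. Check the Guard API schema before relying on any of this.

```python
from typing import Any, Dict, List, NamedTuple, Optional


class GuardOutcome(NamedTuple):
    flagged: bool
    breakdown: List[Any]       # per-policy results, kept opaque here
    correction: Optional[Any]  # None when the response carries no correction


def parse_guard_result(result: Dict[str, Any]) -> GuardOutcome:
    """Extract the three documented top-level fields, tolerating missing keys."""
    return GuardOutcome(
        flagged=bool(result.get("flagged", False)),
        breakdown=list(result.get("breakdown") or []),
        correction=result.get("correction"),
    )


# Made-up response; the "policy" key inside breakdown is hypothetical.
sample = {"flagged": True, "breakdown": [{"policy": "policy_pii"}]}

outcome = parse_guard_result(sample)
if outcome.flagged:
    print(f"Flagged; {len(outcome.breakdown)} per-policy result(s) returned")
```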

Reviews and datasets (optional)

Use policy-focused reviews to iterate on performance, and build a dataset from your feedback to continuously improve detection.
Policy dataset built from feedback