Policies

Policies are reusable guardrails that detect and control risky behavior in your AI applications. Define a policy once, assign it to one or many applications, and manage it centrally.

How policies work

A policy describes what to detect, where to look, and how to respond:

Detector model: The engine used to detect a risk
Target: Which part(s) of the conversation to inspect, e.g. Assistant or User
Sensibility: The minimum confidence score a detection must reach to be flagged (the detector's threshold)
Expected behavior (coming soon): Defines the action taken when a detector is triggered (e.g., block or alert)
Override response: Optional default message to return on violation
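
As a rough illustration, the fields above could be modeled as a small record. This is a sketch, not Guardion's actual schema: the class name `Policy`, its field names, the 0-to-1 confidence range, and the `should_flag` helper are all assumptions made for this example. Only the concepts (detector model, target, minimum confidence, override response) come from the list above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    """Hypothetical in-memory view of a policy (field names are illustrative)."""
    detector_model: str                      # e.g. "modern-guard-v1"
    target: str                              # e.g. "user" or "assistant"
    threshold: float                         # minimum confidence to flag (assumed 0..1)
    override_response: Optional[str] = None  # optional message returned on violation

    def should_flag(self, confidence: float) -> bool:
        # A detection counts only if it meets the minimum confidence.
        return confidence >= self.threshold


policy = Policy(
    detector_model="modern-guard-v1",
    target="user",
    threshold=0.8,
    override_response="Sorry, I can't help with that request.",
)

print(policy.should_flag(0.92))  # True
print(policy.should_flag(0.41))  # False
```

Raising the threshold in this sketch flags fewer detections, and lowering it flags more.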

Detector types

Guardion supports several types of detectors, each designed to identify specific risks or policy violations in AI interactions. You can mix and match detector types within your policies to cover a wide range of threats.

Supported types and models:

Prompt Defense

modern-guard-v1: Fast, general-purpose prompt security and jailbreak detection
modern-guard-v1.5: Advanced agentic prompt security and jailbreak detection

Content Moderation

moderation-v0: Safety moderation for harmful content

Each detector type can be configured with its own threshold and target, allowing you to tailor policies to your application’s needs. See the Detectors documentation for a full list and configuration options.

Targets

Choose where the policy evaluates:

user
assistant
system
developer
context
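
One way to picture a target is as a filter on message roles. The sketch below assumes that a target name matches the `role` field of a message, as in the request examples in "Using policies via API". The function `select_messages` is hypothetical and only illustrates the idea, not how Guardion evaluates targets internally.

```python
from typing import Dict, Iterable, List

# Target names listed above.
TARGETS = {"user", "assistant", "system", "developer", "context"}


def select_messages(
    messages: Iterable[Dict[str, str]], targets: Iterable[str]
) -> List[Dict[str, str]]:
    """Return the messages whose role is one of the requested targets."""
    wanted = set(targets)
    unknown = wanted - TARGETS
    if unknown:
        raise ValueError(f"unknown target(s): {sorted(unknown)}")
    return [m for m in messages if m.get("role") in wanted]


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Summarize this text"},
    {"role": "assistant", "content": "Here is a short summary."},
]

for message in select_messages(conversation, ["user"]):
    print(message["role"], "->", message["content"])
```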

Assign policies to applications

Policies are assigned per application. The relationship is:

One policy can be used by many applications
One application can enable one or more policies

To assign:

Go to the application
Click Assign policy
Select the policy and confirm

To learn more about assigning policies to applications, visit the Applications page.

Using policies via API

When calling the Guard API, Guardion will evaluate the policies assigned to the specified application. Provide your application ID in the request body.

curl -X POST "https://api.guardion.ai/v1/guard" \
  -H "Authorization: Bearer $GUARDION_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "application": "app_12345",
    "messages": [
      { "role": "user", "content": "Summarize this text" }
    ]
  }'

Optionally, override which policies run, regardless of the application:

curl -X POST "https://api.guardion.ai/v1/guard" \
  -H "Authorization: Bearer $GUARDION_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "override_enabled_policies": ["policy_pii"],
    "messages": [
      { "role": "user", "content": "Share a list of test credit cards" }
    ]
  }'
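
The same two requests can be built from Python. The sketch below uses only the standard library and the endpoint, headers, and body fields shown in the curl examples. The helper names `build_guard_payload` and `build_guard_request` are made up for this example and are not part of any Guardion SDK.

```python
import json
import os
import urllib.request
from typing import Any, Dict, List, Optional

GUARD_URL = "https://api.guardion.ai/v1/guard"


def build_guard_payload(
    messages: List[Dict[str, str]],
    application: Optional[str] = None,
    override_enabled_policies: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Assemble the JSON body, including only the fields that were provided."""
    payload: Dict[str, Any] = {"messages": messages}
    if application is not None:
        payload["application"] = application
    if override_enabled_policies is not None:
        payload["override_enabled_policies"] = override_enabled_policies
    return payload


def build_guard_request(payload: Dict[str, Any], api_key: str) -> urllib.request.Request:
    """Create (but do not send) the POST request for the Guard API."""
    return urllib.request.Request(
        GUARD_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = build_guard_payload(
    [{"role": "user", "content": "Summarize this text"}],
    application="app_12345",
)
request = build_guard_request(payload, os.environ.get("GUARDION_API_KEY", ""))
print(request.get_method(), request.full_url)
print(json.dumps(payload, indent=2))

# To actually send it (requires network access and a valid key):
# with urllib.request.urlopen(request) as response:
#     result = json.load(response)
```

Passing `override_enabled_policies=["policy_pii"]` instead of `application` produces the body of the second curl example.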

See the full schema in the Guard API documentation.
The response includes flagged, breakdown (per-policy results), and an optional correction.
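
A caller typically branches on these fields. The sketch below is illustrative only: the sample response is invented, and the inner layout of `breakdown` (a list of objects with `policy` and `flagged` keys) and the treatment of `correction` as a replacement message string are assumptions. Check the Guard API schema for the real structure.

```python
from typing import Any, Dict, List, Optional

# Invented sample; the layout of "breakdown" entries is an assumption.
sample_response: Dict[str, Any] = {
    "flagged": True,
    "breakdown": [
        {"policy": "policy_pii", "flagged": True},
        {"policy": "policy_prompt_defense", "flagged": False},
    ],
    "correction": "Sorry, I can't share that.",
}


def flagged_policies(response: Dict[str, Any]) -> List[str]:
    """List the policies whose breakdown entry is marked as flagged."""
    return [
        entry["policy"]
        for entry in response.get("breakdown", [])
        if entry.get("flagged")
    ]


def resolve_reply(response: Dict[str, Any], original_reply: str) -> str:
    """Pick the text to show: the original reply, the correction, or a fallback."""
    if not response.get("flagged"):
        return original_reply
    correction: Optional[str] = response.get("correction")
    # Assumption: when present, the correction replaces the original reply.
    return correction if correction else "This request was blocked by policy."


print(flagged_policies(sample_response))
print(resolve_reply(sample_response, "Here are some card numbers..."))
```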

Reviews and datasets (optional)

Use policy-focused reviews to iterate on performance, and build a dataset from your feedback to continuously improve detection.
