๐Ÿ›ก๏ธ Using GuardionAI with OpenAI Agents SDK

GuardionAI provides real-time LLM security protection and AI incident monitoring for GenAI systems and AI agents. With the OpenAI Agents SDK, you can integrate GuardionAI as both input and output guardrails in your agent pipelines.

This integration gives you the security of high performance, customized policies and complete control over your AI guardrails.

  • Multilingual support: Enhanced capabilities across multiple languages, including spanish, portuguese, and english
  • Ultra-Fast API Response: Sub-50ms API latency for real-time protection
  • AI Incident Monitoring: Real-time detection, alerting and auditing for potential threats or security incidents in AI interactions

๐Ÿงช Try Integrating GuardionAI + Agents SDK in the Google Colab

Use our interactive Colab notebook experiment:

๐Ÿ‘‰ Open in Google Colab

Example output in the Google Colab Example:

โŒ {
    "object": "eval",
    "time": 2.0, // response in 2 ms thanks to caching
    "created": 1745525108,
    "flagged": true, // threat flagged
    "breakdown": [
        {
            "policy_id": "prompt-defense-default", // or your custom policy
            "detector": "modern-guard", // our advanced and continuous updated threat detection model
            "detected": true,
            "threshold": 0.8, // the policy flagging level
            "score": 0.9129506945610046, // model output for a threat
            "result": [
                {
                    "label": "SAFE",
                    "score": 0.08704936504364014
                },
                {
                    "label": "INJECTION",
                    "score": 0.9129506945610046
                }
            ]
        }
    ]
}
โŒ Guardion flagged this input as unsafe.


What Youโ€™ll Need

pip install openai-agents-python requests

Input Guardrail Example (Using GuardionAI)

Stops unsafe/malicious/prompt attacks before they reach your AI agent.

from agents import Agent, GuardrailFunctionOutput, InputGuardrailTripwireTriggered, RunContextWrapper, input_guardrail, Runner
import requests
import asyncio

@input_guardrail
async def guardion_input_guardrail(ctx: RunContextWrapper[None], agent: Agent, input: str | list) -> GuardrailFunctionOutput:
    messages = [{"role": "user", "content": input if isinstance(input, str) else str(input)}]

    response = requests.post(
        "https://api.guardion.ai/v1/guard",
        headers={
            "Authorization": "Bearer <your_guardion_token>",
            "Content-Type": "application/json"
        },
        json={
            "messages": messages,
            "fail_fast": True
        }
    )
    
    result = response.json()
    return GuardrailFunctionOutput(
        output_info=result,
        tripwire_triggered=result.get("flagged", False)
    )

agent = Agent(
    name="Secure AI Assistant",
    instructions="You are a helpful and safe assistant.",
    input_guardrails=[guardion_input_guardrail]
)

async def main():
    try:
        user_prompt = "How can I hack a website?"
        result = await Runner.run(agent, user_prompt)
        print(result.final_output)
    except InputGuardrailTripwireTriggered:
        print("โŒ Input flagged by GuardionAI!")

asyncio.run(main())

Output Guardrail Example (using GuardionAI)

Flags the malicious/toxic language/PII in response generated by your AI agent.

from agents import Agent, GuardrailFunctionOutput, OutputGuardrailTripwireTriggered, RunContextWrapper, output_guardrail
from pydantic import BaseModel
import requests

class MessageOutput(BaseModel):
    response: str

@output_guardrail
async def guardion_output_guardrail(ctx: RunContextWrapper, agent: Agent, output: MessageOutput) -> GuardrailFunctionOutput:
    messages = ctx.context.get("messages", [])
    messages.append({"role": "assistant", "content": output.response})

    response = requests.post(
        "https://api.guardion.ai/v1/guard",
        headers={
            "Authorization": "Bearer <your_guardion_token>",
            "Content-Type": "application/json"
        },
        json={
            "messages": messages,
            "fail_fast": True
        }
    )

    result = response.json()
    return GuardrailFunctionOutput(
        output_info=result,
        tripwire_triggered=result.get("flagged", False)
    )

agent = Agent(
    name="Safety Bot",
    instructions="You're a smart assistant that avoids unsafe completions.",
    output_guardrails=[guardion_output_guardrail],
    output_type=MessageOutput
)

๐Ÿ’ก Tips & Best Practices

  • โœ… Use input guardrails to save cost by blocking bad prompts early.
  • ๐Ÿ“Š Log breakdown details for security audits.
  • ๐Ÿ’ฌ For multi-turn chat, format messages as [{role: ..., content: ...}, ...]

Guardion API Reference

  • Endpoint: POST https://api.guardion.ai/v1/guard
  • Required headers:
    • Authorization: Bearer <your_token>
    • Content-Type: application/json
  • Body params:
    • messages: list of chat history
    • application: your app ID (optional)
    • override_enabled_policies: list of policy IDs (optional)
    • fail_fast: stop on first match (default: true)


Need help? Contact us at founders@guardion.ai.