The Advanced Prompt Attack Detection

Developed by industry experts with experience building enterprise-grade AI guardrails at Apple (Siri), Nubank, and other leading companies, ModernGuard is a specialized transformer-encoder model designed to detect and prevent prompt attacks in real time. This enterprise-grade solution offers multilingual support and ultra-fast inference to protect GenAI systems across various domains.

Model Card

Modern Transformer-Encoder Architecture

Built on ModernBERT, a high-efficiency encoder
Features Rotary Positional Embeddings, Flash Attention, and memory optimizations
Supports 8K token context with low latency

⚡ Ultra-Fast Inference

Optimized for real-time streaming and in-line LLM applications
Achieves latency under roughly 50 ms in production environments

Multilingual and Domain-Aware

Trained on data in 8+ languages
Covers banking, fintech, ecommerce, healthcare, and other verticals

🔐 Threat Intelligence Training + Continuous Updates

Pretrained on 1 trillion tokens
Fine-tuned on millions of simulated and real-world prompt attacks
- Proprietary red teaming data generated by AI attackers + red team partners
- AI threat databases & state-of-the-art prompt attack vectors
- Diverse synthetic data generation for safe examples
- Continuous updates with emerging threat patterns

Benchmark Results

These results come from a benchmark that combines public and private threats collected from red teaming partners with updated threat databases from the NVIDIA Garak and PromptFoo libraries. In this evaluation, ModernGuard achieves the highest overall F1-score among the detectors tested.

The benchmark methodology includes:

Evaluation against 40+ attack classes
Cross-validation across multiple domains and languages

Overall F1-Scores

Model	Overall F1-Score
modern-guard-500M-modernBERT-v1	0.9718
modern-guard-120M-modernBERT-v1	0.9301
Lakera Guard	0.8600
protectai/deberta-v3-base-prompt-injection-v2	0.6008
deepset/deberta-v3-base-injection	0.5725
meta-llama/Prompt-Guard-86M	0.4555
jackhhao/jailbreak-classifier	0.5000

Did we miss a prompt injection detector model or solution? Please let us know, and we can add it to the evaluation.
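
For readers less familiar with the metric, the F1-score reported above is the harmonic mean of precision and recall. The sketch below shows the standard formula only; the function name and the counts are hypothetical and are not taken from the benchmark data.

```javascript
// Compute precision, recall, and F1 from confusion-matrix counts.
// Standard definitions: precision = TP / (TP + FP), recall = TP / (TP + FN),
// F1 = 2 * precision * recall / (precision + recall).
function f1Score({ truePositives, falsePositives, falseNegatives }) {
  const predictedPositive = truePositives + falsePositives;
  const actualPositive = truePositives + falseNegatives;

  // Guard against division by zero when a class has no predictions or no samples.
  const precision = predictedPositive === 0 ? 0 : truePositives / predictedPositive;
  const recall = actualPositive === 0 ? 0 : truePositives / actualPositive;
  const f1 =
    precision + recall === 0 ? 0 : (2 * precision * recall) / (precision + recall);

  return { precision, recall, f1 };
}

// Hypothetical counts, for illustration only (not benchmark results).
const example = f1Score({ truePositives: 90, falsePositives: 10, falseNegatives: 30 });
console.log(example.f1.toFixed(4)); // prints "0.8182"
```

A detector that flags everything gets perfect recall but poor precision, and one that flags nothing gets neither, which is why a single F1 number is easier to compare across detectors than either component alone.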

Threat Category Coverage

Threat Category	guardion/Modern-Guard-1	meta-llama/Prompt-Guard-86M	protectai/deberta-v3-base-prompt-injection-v2	deepset/deberta-v3-base-injection	jackhhao/jailbreak-classifier	lakera-guard
Encoding	0.972667	0.567333	0.530222	0.889556	0.000000	0.677778
Prompt Injection	0.968602	0.308043	0.755299	0.899980	0.142857	0.878889
Jailbreaking	0.981274	0.621297	0.360996	0.764824	0.000000	0.738333
Exfiltration & Leakage	0.999667	0.284000	0.587730	0.981667	0.000000	0.850000
Evasion & Obfuscation	0.994659	0.583764	0.453216	0.794332	0.000000	0.728889
Code and Command Injection	0.990200	0.474000	0.455200	0.796400	0.000000	0.808000
Hard Negatives	0.958000	0.754000	0.756000	0.014000	1.000000	0.840000
Regular Content	0.968000	0.379000	0.786000	0.222000	1.000000	0.940000

Benchmarks span 40+ attack classes including obfuscation (e.g. ANSI, ASCII), jailbreaks (e.g. DAN, Goodside), injections (e.g. SQL, shell), and real-world attacks observed in LLM deployments.

A comprehensive research paper detailing ModernGuard’s architecture, training methodology, and benchmark results will be published soon.

How to Use ModernGuard

Combine the ModernGuard detector with a guardrail policy. The policy sets the threshold levels, so you can tune detection to the specific risk level you want to manage.
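
As a rough illustration of what a threshold-based policy does, the sketch below maps a detector score to an action. It assumes the detector yields a score between 0 and 1; the `policy` shape, the `applyPolicy` function, and the threshold values are all hypothetical and do not describe Guardion's actual policy format or defaults.

```javascript
// Hypothetical policy: threshold values are illustrative, not Guardion defaults.
const policy = { blockAt: 0.9, reviewAt: 0.5 };

// Map a detector score (assumed to be in [0, 1]) to an action.
function applyPolicy(score, { blockAt, reviewAt }) {
  if (typeof score !== "number" || Number.isNaN(score) || score < 0 || score > 1) {
    throw new RangeError("score must be a number between 0 and 1");
  }
  if (score >= blockAt) return "block";   // high confidence: reject the prompt
  if (score >= reviewAt) return "review"; // uncertain: log or escalate
  return "allow";                         // low risk: pass through
}

console.log(applyPolicy(0.95, policy)); // prints "block"
console.log(applyPolicy(0.6, policy));  // prints "review"
console.log(applyPolicy(0.1, policy));  // prints "allow"
```

Lowering `blockAt` makes the guardrail stricter at the cost of more false positives, which is the trade-off a risk-specific threshold is meant to control.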

💡 Example integration using a default policy

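// Note: top-level await requires an ES module (or wrap this in an async function).

// Conversation to evaluate; replace the placeholder with the real user input.
const messages = [
  { role: "user", content: "Your user input here" }
];

// Send the messages to the Guardion guard endpoint for evaluation.
const response = await fetch("https://api.guardion.ai/v1/guard", {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    "Authorization": "Bearer YOUR_API_KEY"
  },
  body: JSON.stringify({ message: messages })
});

// Stop on HTTP errors instead of parsing an error body as a verdict.
if (!response.ok) {
  throw new Error(`Guard request failed with HTTP ${response.status}`);
}

const result = await response.json();

// Act on the verdict returned by the default policy.
if (result.flagged) {
  console.log("Threat detected:", result.reason);
} else {
  console.log("Prompt is safe to use");
}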
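
For in-line use, it may help to wrap the call in a function that enforces a time budget and decides what happens when the guard cannot be reached. The sketch below reuses the endpoint, request body, and `flagged`/`reason` fields from the example above; the `checkPrompt` name, the `timeoutMs` and `fetchImpl` options, the 2000 ms default, and the fail-closed behavior are assumptions made for illustration, not part of the Guardion API.

```javascript
// Hypothetical helper: checks one user prompt and fails closed on errors.
async function checkPrompt(
  content,
  { apiKey, timeoutMs = 2000, fetchImpl = globalThis.fetch } = {}
) {
  // Abort the request if it exceeds the time budget.
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);

  try {
    const response = await fetchImpl("https://api.guardion.ai/v1/guard", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        "Authorization": `Bearer ${apiKey}`
      },
      body: JSON.stringify({ message: [{ role: "user", content }] }),
      signal: controller.signal
    });

    if (!response.ok) {
      throw new Error(`HTTP ${response.status}`);
    }

    const result = await response.json();
    return { flagged: Boolean(result.flagged), reason: result.reason ?? null };
  } catch (error) {
    // Assumed design choice: treat an unreachable guard as a flagged prompt.
    return { flagged: true, reason: `guard unavailable: ${error.message}` };
  } finally {
    clearTimeout(timer);
  }
}
```

A caller would then write something like `const verdict = await checkPrompt(userInput, { apiKey: "YOUR_API_KEY" })` and forward the prompt to the LLM only when `verdict.flagged` is false. Whether to fail closed (as here) or fail open depends on how the application weighs availability against risk.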
