Protect your AI agents from PII leaks, prompt injections, and NSFW content using Agno's guardrails.
Guardrails are security checkpoints that validate inputs before they reach your language model, protecting against PII leaks, prompt injection, jailbreaks, and inappropriate content.
Overview
Protect your agents from:
- PII Leaks: SSNs, credit cards, emails, phone numbers
- Prompt Injection: "Ignore previous instructions..."
- Jailbreaks: "Developer mode" attempts
- NSFW Content: Hate speech, violence, harmful content
Use guardrails when your agent is exposed to real users or handles sensitive data.
Prerequisites
- Python 3.9+
- Agno installed: pip install agno
- OpenAI API key
- Basic Agno knowledge: Getting Started
With those in place, a minimal guarded agent looks like this:
from agno.agent import Agent
from agno.models.openai import OpenAIChat
from agno.guardrails import PIIDetectionGuardrail
agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[PIIDetectionGuardrail()],
)
# Safe input works
agent.print_response("What's your return policy?")
# PII is blocked
agent.print_response("My SSN is 123-45-6789") # Raises InputCheckError
Built-in Guardrails
1. PII Detection
The PII Detection Guardrail automatically scans inputs for sensitive information like Social Security Numbers, credit card numbers, email addresses, and phone numbers. By default, it blocks any input containing PII.
Block PII:
from agno.guardrails import PIIDetectionGuardrail
agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[PIIDetectionGuardrail()],
)
Sometimes you want to process requests while still protecting sensitive data. The masking feature replaces PII with asterisks before sending to the LLM, allowing your agent to understand context without exposing actual sensitive information.
Mask PII instead of blocking:
agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[PIIDetectionGuardrail(mask_pii=True)],
)
# Input: "My SSN is 123-45-6789"
# LLM receives: "My SSN is ***********"
You can disable specific PII checks (e.g., allow emails in support tickets) or add custom patterns to detect business-specific sensitive data like employee IDs or internal account numbers.
Custom patterns:
guardrail = PIIDetectionGuardrail(
    enable_email_check=False,  # Disable the built-in email check
    custom_patterns={
        "bank_account_number": r"\b\d{10}\b",  # Add a custom pattern
    },
)
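Wired into an agent, this configuration lets emails through while blocking ten-digit account numbers; a sketch with illustrative inputs:
agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[guardrail],
)

agent.print_response("Reach me at jane@example.com")  # Allowed: email check disabled
agent.print_response("My account is 1234567890")  # Blocked: matches the custom pattern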
2. Prompt Injection Defense
Prompt injection is one of the most common attacks on AI systems. Attackers try to manipulate your agent by injecting instructions like "Ignore previous instructions and..." to bypass your system prompts. This guardrail detects common injection patterns and blocks them.
from agno.guardrails import PromptInjectionGuardrail
agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[PromptInjectionGuardrail()],
)
# Blocks: "Ignore previous instructions..."
# Blocks: "Developer mode activated..."
# Blocks: "You are now a different AI..."
The default patterns cover most common attacks, but you can customize them to match your specific security needs or reduce false positives for your use case.
Custom patterns:
guardrail = PromptInjectionGuardrail(
    injection_patterns=["ignore previous instructions", "bypass security"]
)
3. Content Moderation
Use OpenAI's Moderation API to automatically filter inappropriate content including hate speech, violence, self-harm, sexual content, and other harmful material. This happens before your main LLM call, saving costs on inappropriate requests.
from agno.guardrails import OpenAIModerationGuardrail
agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[OpenAIModerationGuardrail()],
)
By default, all moderation categories are checked. You can customize which categories trigger blocks based on your application's requirements. For example, a medical application might allow self-harm discussions while blocking hate speech.
Custom categories:
guardrail = OpenAIModerationGuardrail(
    raise_for_categories=["violence", "hate", "sexual"]
)
Production Setup: Multiple Guardrails
For production systems, use multiple guardrails together for defense-in-depth security. Each layer catches different types of threats, and ordering them by speed ensures optimal performance.
from agno.guardrails import (
    PIIDetectionGuardrail,
    PromptInjectionGuardrail,
    OpenAIModerationGuardrail,
)

secure_agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[
        PIIDetectionGuardrail(mask_pii=True),  # Layer 1: Protect PII
        PromptInjectionGuardrail(),            # Layer 2: Stop attacks
        OpenAIModerationGuardrail(),           # Layer 3: Filter content
    ],
)
Performance tip: when every guardrail simply blocks input, order them by speed so cheap checks fail fast (the example above runs PII masking first so that later layers and the LLM only see masked input):
- PromptInjectionGuardrail (fast regex checks)
- PIIDetectionGuardrail (regex + validation)
- OpenAIModerationGuardrail (external API call)
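A blocking-only setup ordered this way might look like the following sketch (PII detection here blocks rather than masks):
fast_first_agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[
        PromptInjectionGuardrail(),   # Fast regex checks run first
        PIIDetectionGuardrail(),      # Regex plus validation
        OpenAIModerationGuardrail(),  # External API call runs last
    ],
)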
Custom Guardrails
Create custom guardrails for business-specific rules by extending BaseGuardrail. This example blocks any input containing URLs, useful for applications where you don't want users sharing external links.
import re

from agno.exceptions import CheckTrigger, InputCheckError
from agno.guardrails import BaseGuardrail
from agno.run.agent import RunInput


class URLGuardrail(BaseGuardrail):
    """Block inputs containing URLs."""

    def check(self, run_input: RunInput) -> None:
        if isinstance(run_input.input_content, str):
            url_pattern = r"https?://[^\s]+|www\.[^\s]+"
            if re.search(url_pattern, run_input.input_content):
                raise InputCheckError(
                    "URLs are not allowed.",
                    check_trigger=CheckTrigger.INPUT_NOT_ALLOWED,
                )

    async def async_check(self, run_input: RunInput) -> None:
        self.check(run_input)  # Reuse the sync logic for async runs
You must implement both check() (sync) and async_check() (async) methods. Agno automatically uses the right one based on whether you call .run() or .arun().
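Attach a custom guardrail exactly like the built-in ones; a minimal sketch reusing the setup from the quick-start example:
agent = Agent(
    model=OpenAIChat(id="gpt-5-mini"),
    pre_hooks=[URLGuardrail()],
)

agent.print_response("Summarize https://example.com")  # Raises InputCheckError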
Learn more: BaseGuardrail Reference
Result
Your agent now has:
- PII Protection: Detect and mask sensitive information
- Injection Defense: Block malicious prompts automatically
- Content Filtering: Stop NSFW and harmful content
- Custom Rules: Enforce business-specific validation
Next Steps
- Guardrails for Teams - Multi-agent security
- Guardrails Overview - Detailed concepts
- Cookbook - More patterns
- Deploy with AgentOS - Production deployment

