Using LangChain ZenGuard Detector for AI Safety and Security
Posted: Feb 4, 2025.
When building AI applications, security and content monitoring are crucial aspects that need to be addressed. The ZenGuard Detector class in LangChain provides a powerful set of tools to implement various safety features in your LLM applications.
What is the ZenGuard Detector?
The ZenGuard Detector is a class that helps implement security and content monitoring features in your LangChain applications. It provides several detection capabilities including prompt injection detection, PII (Personally Identifiable Information) detection, toxicity monitoring, and content filtering based on allowed or banned topics.
Reference
The Detector class includes several predefined detection types that can be used:
Detector Type | Description |
---|---|
ALLOWED_TOPICS | Checks if content matches predefined allowed subjects |
BANNED_TOPICS | Identifies content containing banned or prohibited subjects |
PROMPT_INJECTION | Detects attempts at prompt injection attacks |
KEYWORDS | Monitors for specific keywords |
PII | Identifies personally identifiable information |
SECRETS | Detects sensitive information like API keys or passwords |
TOXICITY | Monitors for toxic or inappropriate content |
How to Use the ZenGuard Detector
Setup and Authentication
Before using the Detector, you'll need to set up ZenGuard and obtain an API key:
Detecting Prompt Injection Attacks
Here's how to check for potential prompt injection attacks:
Multiple Detection Types
You can combine multiple detectors in a single check:
Content Topic Filtering
To ensure content stays within allowed topics or check for banned subjects:
Working with Response Data
The detector returns useful information in the response:
The ZenGuard Detector provides a robust solution for implementing content safety and security features in your LangChain applications. By utilizing its various detection types, you can create safer and more secure AI applications that protect against common threats and inappropriate content.
Remember to handle errors appropriately, as the API might return various status codes:
- 401 for authentication issues
- 400 for malformed requests
- 500 for internal server errors
An alternative to LangSmith
Open-source LangChain monitoring, prompt management, and magic. Get started in 2 minutes.
LangChain DocsJoin 10,000+ subscribers
Every 2 weeks, latest model releases and industry news.
An alternative to LangSmith
Open-source LangChain monitoring, prompt management, and magic. Get started in 2 minutes.