- Text Moderation: Analyzes text for harmful content categories with configurable severity thresholds
- Prompt Shields: Detects prompt injection attacks in user inputs and malicious content embedded in documents
Prerequisites
Before configuring Azure Content Safety guardrails in AI Studio, you need:- An Azure account with an active subscription
- An Azure AI Content Safety resource deployed in a supported region
- The API key and endpoint for your Content Safety resource
- Enterprise plan access to AI Studio
Create a content safety resource and get credentials
If you haven’t created an Azure AI Content Safety resource yet, follow these steps:- Sign in to the Azure portal
- Select Create a resource and search for Content Safety
- Select Azure AI Content Safety and select Create
- Configure your resource:
- Select your Subscription and Resource group
- Choose a supported region
- Enter a unique Name for your resource
- Select a Pricing tier
- Select Review + create, then Create
- After deployment completes, navigate to your resource and go to Resource Management > Keys and Endpoint
- Copy one of the Keys (either KEY 1 or KEY 2) and the Endpoint URL (for example,
https://your-resource-name.cognitiveservices.azure.com/) for use in AI Studio
Configure in AI Studio
When adding an Azure Content Safety guardrail in AI Studio, provide these parameters:| Parameter | Description |
|---|---|
| API key | Your Azure Content Safety API key (from the Azure portal) |
| API base | Your Content Safety endpoint URL (for example, https://your-resource-name.cognitiveservices.azure.com/) |
| Parameter | Description | Default |
|---|---|---|
| Categories | The content categories to evaluate. Select one or more: Hate, SelfHarm, Sexual, Violence. | All categories selected |
| Severity threshold | The minimum severity level (0–6) that triggers a block. Lower values are more strict. For example, a threshold of 2 blocks content with a severity score of 2 or higher. | 2 |
By default, a guardrail applies to all teams in your organization. You can restrict it to specific teams during configuration. See Team scoping for details.
Text Moderation
Azure Content Safety Text Moderation analyzes text for harmful content across four categories:| Category | Description |
|---|---|
| Hate | Discriminatory or prejudiced content targeting identity groups |
| Sexual | Sexually explicit or suggestive content |
| Self-Harm | Content related to self-harm or suicide |
| Violence | Violent, threatening, or graphic content |
How Text Moderation works with AI Studio
When configured as a guardrail, Text Moderation checks text content against each category and blocks the request if any category’s severity score meets or exceeds the configured threshold. Text Moderation supports the following guardrail modes:- Pre-call: Check user input before the LLM processes it. Blocks harmful prompts and saves LLM costs.
- Post-call: Check LLM output before returning it to the user. Catches harmful content the LLM might generate.
Prompt Shields
Azure Content Safety Prompt Shields detects prompt injection attacks—attempts by users or embedded documents to manipulate an AI model into bypassing its safety rules or instructions. Prompt Shields detects two types of attacks:| Attack type | Description |
|---|---|
| User Prompt attacks | Malicious instructions in user input designed to override the model’s system prompt or safety behaviors |
| Document attacks | Harmful commands embedded in documents or external content that the model processes |
How Prompt Shields works with AI Studio
When configured as a guardrail, Prompt Shields analyzes user prompts and any associated documents for injection attacks. If an attack is detected, AI Studio blocks the request. Prompt Shields supports the following guardrail modes:- Pre-call: Check user input and documents before the LLM call. Blocks prompt injection attempts and prevents the LLM from processing malicious content.
- During-call: Check input in parallel with the LLM call for lower latency.
Verify your setup
Before configuring Azure Content Safety in AI Studio, test your resource directly to confirm your API key and endpoint work.Test Text Moderation
Run this cURL command, replacing<endpoint> and <your_api_key> with your values:
Test Prompt Shields
Run this cURL command, replacing<endpoint> and <your_api_key> with your values:
Next steps
- Configure guardrails: Learn about guardrail modes, team scoping, and error handling
- Track usage and spend: Monitor guardrail activity and usage
- Azure AI Content Safety documentation: Detailed Azure documentation
- Azure Content Safety pricing: Understand Azure Content Safety costs