Using the Batch API with Azure OpenAI
Posted: Oct 22, 2024.
Dealing with massive datasets or generating content at scale can be resource-intensive and costly. Batch APIs offer a cost-efficient way to process large-scale LLM tasks.
In this guide, we'll walk through how to integrate with the Azure OpenAI Batch API, covering setup, execution, practical examples, and troubleshooting tips along the way.
What is Azure OpenAI's Batch API?
The Azure OpenAI Batch API handles large groups of requests asynchronously at a 50% cost reduction compared to standard global pricing.
Instead of sending requests one by one, you bundle very large workloads into a single JSON Lines (JSONL) file.
Batch requests have their own enqueued token quotas, which means they do not interfere with real-time workloads.
Prerequisites
Before diving into batch processing, you will need the following:
- Azure Subscription: Create one for free if you don't have one.
- Azure OpenAI Resource: Ensure you have a deployed Azure OpenAI model of the Global-Batch type (see the setup steps below).
- Python: Python 3.8 or later.
Setting up the Azure OpenAI Resource
- Visit https://ai.azure.com/ and log in using your Azure credentials.
- Search for Azure OpenAI in the Azure services menu.
- Click Create to set up a new Azure OpenAI resource. Provide the Subscription, Resource Group, and Region.
- Click Create new deployment. Configure the Model, Deployment Name, Scale Type, and Model Version.
Now the OpenAI resource is ready for use in your LLM applications.
Creating Your Batch File (.jsonl)
Batch processing in Azure requires formatting your data as JSON Lines (.jsonl). Each line represents an individual request.
Here is a basic example of a JSONL file used for batch requests, where `gpt-4o-batch` is a placeholder for your own Global-Batch deployment name:
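```jsonl
{"custom_id": "task-1", "method": "POST", "url": "/chat/completions", "body": {"model": "gpt-4o-batch", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "Summarize the benefits of batch processing."}]}}
{"custom_id": "task-2", "method": "POST", "url": "/chat/completions", "body": {"model": "gpt-4o-batch", "messages": [{"role": "system", "content": "You are a helpful assistant."}, {"role": "user", "content": "List three advantages of the JSONL format."}]}}
```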
Each request contains details such as the method, url, and body, which includes the specific model used and the information to be processed.
JSONL Batch File Parameters
| Field | Description |
|---|---|
| `custom_id` | A unique identifier for each request. |
| `method` | The HTTP method, usually "POST". |
| `url` | The API endpoint for the request. |
| `body` | The body of the request, including the model and messages with role and content. |
Submitting Your Batch File
Once your batch file is ready, you can upload it through Azure AI Studio or programmatically via the API.
Python APIs
We can use the standard OpenAI Python library. Below is a minimal sketch, assuming keyless (Microsoft Entra ID) authentication; the endpoint URL and API version are placeholders to replace with your own:
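```python
from azure.identity import DefaultAzureCredential, get_bearer_token_provider
from openai import AzureOpenAI

# Sign in with Azure credentials (Entra ID) instead of an API key
token_provider = get_bearer_token_provider(
    DefaultAzureCredential(), "https://cognitiveservices.azure.com/.default"
)

client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com/",  # placeholder: your resource endpoint
    azure_ad_token_provider=token_provider,
    api_version="2024-10-21",  # placeholder: any batch-capable API version
)

# Upload the JSONL batch file; purpose="batch" marks it for batch processing
batch_file = client.files.create(file=open("test.jsonl", "rb"), purpose="batch")
print(f"Uploaded file id: {batch_file.id}")
```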
The code first imports the necessary tools and uses `DefaultAzureCredential` to handle signing in to Azure. It then creates an `AzureOpenAI` client, which lets us communicate with Azure services from code. Finally, it opens the file "test.jsonl", which contains the batch of tasks for Azure to perform, and uploads it.
After the file has been uploaded successfully, we can submit it for batch processing. A sketch of the submission call; the `/chat/completions` endpoint matches the `url` field in the batch file, and `24h` is the completion window:
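```python
# Submit the uploaded file as an asynchronous batch job
batch_response = client.batches.create(
    input_file_id=batch_file.id,
    endpoint="/chat/completions",
    completion_window="24h",
)
print(batch_response.id, batch_response.status)
```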
Azure AI Studio
- Log in to AI Studio: Begin by signing into AI Studio.
- Access Batch Jobs: Navigate to your Azure OpenAI resource, locate the Batch jobs (TOOLS) section, and click Create Batch Job.
- Upload the JSONL File: Under the Batch data section, click Upload file and choose your prepared JSONL file.
After uploading your batch file, click Create to start the batch job.
Azure will then validate and enqueue the requests, assigning them tokens and processing them asynchronously.
Once the job is created, you can monitor its progress in AI Studio.
Monitoring Batch Job Progress
Once your batch job is underway, you can monitor its status in Azure AI Studio.
Azure provides detailed timestamps and status messages to help track each phase of the job.
If your job fails, error messages will appear to guide you through troubleshooting. You can also view the number of requests processed, those pending, and any failures that occurred.
When monitoring via code, it is recommended to wait at least 60 seconds between status calls.
Possible status values are `validating`, `failed`, `in_progress`, `finalizing`, `completed`, `expired`, `cancelling`, and `cancelled`.
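A minimal polling sketch along these lines, reusing the `client` and `batch_response` from the submission step:
```python
import time

# Poll until the job reaches a terminal state, waiting 60 seconds between calls
terminal_states = {"completed", "failed", "expired", "cancelled"}
batch = client.batches.retrieve(batch_response.id)
while batch.status not in terminal_states:
    time.sleep(60)
    batch = client.batches.retrieve(batch_response.id)
    print(f"Batch status: {batch.status}")
```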
To cancel a batch job, we can run:
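```python
# Cancellation is possible while the job is still validating or in progress
client.batches.cancel(batch_response.id)
```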
Retrieving Batch Job Results
When a job is completed, Azure generates two types of files:
- Output File: Contains successfully executed requests and their results.
- Error File: Details any issues or failures encountered during processing.
Successful jobs can still generate an `error_file_id`, but it will point to an empty file of zero bytes.
You can download these files for further review by clicking the appropriate download icon in the Batch Jobs interface.
Or, programmatically, retrieve the results using the `output_file_id` on the completed batch object. A minimal sketch, assuming the job has completed:
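```python
import json

# Each line of the output file is a JSON object tagged with the request's custom_id
output = client.files.content(batch.output_file_id)
for line in output.text.strip().split("\n"):
    result = json.loads(line)
    print(result["custom_id"], result["response"]["status_code"])
```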
Troubleshooting Common Issues
- Invalid JSON Lines: If you get an error about invalid JSON, check for missing brackets or stray characters in your JSONL file. A JSON validator tool can verify each line before submission.
- Too Many Requests: The Batch API limits each file to 100,000 items. If your dataset is larger than that, split it into smaller files (see the sketch after this list).
- Authentication Errors: Ensure your Azure credentials are correct and that you have the necessary permissions to create and manage batch jobs.
- URL Mismatch: The `url` provided in a request does not match the expected endpoint URL.
- Quota Exceeded: Your deployment does not have enough enqueued tokens left to process the batch. You may need to raise your quota or split the requests into smaller batches.
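As referenced above, a simple helper sketch for splitting an oversized JSONL file into chunks under the 100,000-request limit (the `.partN.jsonl` naming is an arbitrary choice):
```python
from typing import List

def split_jsonl(path: str, max_lines: int = 100_000) -> List[str]:
    """Split a JSONL file into parts of at most max_lines requests each."""
    parts, chunk = [], []
    with open(path, "r", encoding="utf-8") as src:
        for line in src:
            chunk.append(line)
            if len(chunk) == max_lines:
                parts.append(_write_chunk(path, len(parts), chunk))
                chunk = []
    if chunk:
        parts.append(_write_chunk(path, len(parts), chunk))
    return parts

def _write_chunk(path: str, index: int, lines: List[str]) -> str:
    # Write one chunk to a numbered sibling file and return its path
    out_path = f"{path}.part{index}.jsonl"
    with open(out_path, "w", encoding="utf-8") as dst:
        dst.writelines(lines)
    return out_path
```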
For detailed information on configuring, monitoring, and optimizing your Azure OpenAI deployments, refer to Azure's official documentation or explore Azure AI Studio's interface.
As the need for high-volume data processing continues to grow, Azure's Batch API stands out as a practical tool for modern enterprises.
Summary
| Feature | Details |
|---|---|
| Maximum Requests per File | 100,000 |
| Supported Regions | East US, West US, Sweden Central |
| Dynamic Quota | Enabled for optimal resource utilization |
| Input File Format | JSON Lines (.jsonl) |
| Cost Reduction | 50% lower than standard global pricing |
| Supported Models for Images | GPT-4o |
| Support for Embeddings Models | Not supported |
| Support for Fine-tuned Models | Not supported |
| Content Filtering | Supported |
| Error File Generation | Available for troubleshooting |