General usage
Rate limits
For custom rate limits, please contact our sales team.
To ensure optimal service performance and fairness in resource allocation, our endpoints enforce the following rate limits.
- RPM (requests per minute): 400
- TPM (tokens per minute): 25,000
Best practices
Monitor request rates
Track the number of requests and tokens your application sends, and throttle outgoing requests so that they stay within the limits above.
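One way to do this is a client-side sliding-window tracker. The following is a minimal sketch, not an official client: the class name `SlidingWindowLimiter` and its methods are hypothetical, the defaults mirror the published RPM and TPM limits, and it assumes you can estimate the token count of a request before sending it. The server may count the window differently, so treat this as a conservative local guard.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Hypothetical client-side tracker for requests and tokens over a rolling window."""

    def __init__(self, max_requests=400, max_tokens=25_000, window=60.0,
                 clock=time.monotonic):
        self.max_requests = max_requests
        self.max_tokens = max_tokens
        self.window = window          # window length in seconds
        self.clock = clock            # injectable so the limiter can be tested
        self.events = deque()         # (timestamp, tokens) per recorded request
        self.tokens_used = 0

    def _evict(self, now):
        # Drop requests that have fallen out of the rolling window.
        while self.events and now - self.events[0][0] >= self.window:
            _, tokens = self.events.popleft()
            self.tokens_used -= tokens

    def wait_time(self, tokens):
        """Return 0.0 if a request of `tokens` fits now, else seconds until
        the oldest recorded request expires (re-check after waiting)."""
        if tokens > self.max_tokens:
            raise ValueError("request exceeds the per-window token limit")
        now = self.clock()
        self._evict(now)
        fits_requests = len(self.events) < self.max_requests
        fits_tokens = self.tokens_used + tokens <= self.max_tokens
        if fits_requests and fits_tokens:
            return 0.0
        return max(0.0, self.events[0][0] + self.window - now)

    def record(self, tokens):
        """Record a request that was actually sent."""
        self.events.append((self.clock(), tokens))
        self.tokens_used += tokens
```

Before each call, loop on `wait_time(estimated_tokens)` and sleep for the returned duration until it reaches 0.0, then send the request and call `record(estimated_tokens)`.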
Adaptive retry strategies
If you exceed these limits, use an adaptive retry strategy with exponential backoff: wait longer after each failed attempt so that retries are spaced out and are less likely to breach the limit again.
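A retry loop along these lines might look like the sketch below. It is illustrative only: `RateLimitError` stands in for whatever exception your own client raises on a rate-limit rejection, and the function names, the 1-second base, the 60-second cap, and the use of random jitter are all assumptions rather than documented recommendations.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical exception raised by your client when a limit is exceeded."""


def backoff_delay(attempt, base=1.0, cap=60.0, rng=random.random):
    """Exponential backoff with jitter: a random fraction of
    min(cap, base * 2**attempt), where attempt starts at 0."""
    return rng() * min(cap, base * (2 ** attempt))


def call_with_backoff(func, max_retries=5, base=1.0, cap=60.0,
                      sleep=time.sleep, rng=random.random):
    """Call func(), retrying on RateLimitError with growing, jittered delays.
    Re-raises the error once max_retries retries have been used up."""
    for attempt in range(max_retries + 1):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(backoff_delay(attempt, base, cap, rng))
```

The jitter spreads retries from many clients over time so they do not all retry at the same instant; the cap keeps the delay bounded after many failures.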
Response to HTTP 429 status codes
Handle HTTP 429 (Too Many Requests) responses by pausing before the next request or by reducing your overall request rate.
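The sketch below shows one possible way to pause on a 429. It is transport-agnostic and makes several assumptions: `send` is a hypothetical callable you supply that performs one request and returns `(status, headers, body)`, headers are a plain mapping keyed exactly as `Retry-After`, and the server may or may not send that header (this page does not say). When the header is missing or not a number of seconds, the code falls back to a default pause.

```python
import time


def retry_after_seconds(headers, default=5.0, cap=120.0):
    """Read a Retry-After value given in seconds; fall back to `default`
    when the header is absent or not numeric (e.g. an HTTP-date)."""
    value = headers.get("Retry-After")
    try:
        return min(cap, max(0.0, float(value)))
    except (TypeError, ValueError):
        return default


def send_with_429_handling(send, max_attempts=4, default_pause=5.0,
                           sleep=time.sleep):
    """Call send() up to max_attempts times, pausing after each 429.
    Returns (status, body) of the last response received."""
    for attempt in range(1, max_attempts + 1):
        status, headers, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts:
            sleep(retry_after_seconds(headers, default_pause))
    return status, body
```

If the final attempt still returns 429, the caller receives that status and can decide whether to queue the work, lower its request rate, or surface the error.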