Rate limits

What are rate limits?

A rate limit is a restriction that an API imposes on the number of times a user or client can access the server within a specified period of time.

Note: A request is defined as a single API call made by an API user. Requests can be made to any endpoint within the API.

Why do we have rate limits?

  • Rate limiting is in place to prevent overuse of the Writer API and to ensure a fair distribution of resources.
  • Rate limiting helps Writer manage the aggregate load on its infrastructure, since a sudden, dramatic increase in API requests could cause performance issues.

What happens if I encounter a rate limit error?

If you hit a rate limit, it means you've made too many requests in a short period of time, and the API is refusing to fulfill further requests until a specified amount of time has passed.

Each API key is allowed a certain number of requests per hour. If you exceed this limit, you will receive a 429 error response.
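
One common way to handle a 429 response is to wait and retry with exponential backoff. The sketch below is illustrative only: `call_with_backoff`, its parameters, and the `send` callable (which is assumed to perform one API request and return a status code and body) are hypothetical names, not part of the Writer API.

```python
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() and retry with exponential backoff while it returns HTTP 429.

    `send` is a hypothetical callable that performs one request and returns
    (status_code, body). The last response is returned once a non-429 status
    is seen or the retries are exhausted.
    """
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_retries:
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * (2 ** attempt))
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to test without real delays. In production code, adding random jitter to each delay is a common refinement.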

What are the rate limits for the Writer API?

  1. Different categories of Writer API calls are subject to different rate limits. The Enterprise API also has its own set of limits.
  2. All client-to-server requests on Writer are associated with the authenticated developer account, that is, the user who created the developer account. Every request from a client to a server counts toward the API token rate limit.
  3. Requests from a client to a server are limited to 4,000 per hour per API token.
  4. Client-to-server requests may qualify for a higher limit of 10,000 requests per hour per API token.
  5. To qualify for the higher limit, the request must come from a developer account owned by a Writer Enterprise account or from an Auth App authorized by a Writer business partner.
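
To stay under a per-token hourly limit, a client can keep its own count of recent requests. The following is a minimal sketch of a client-side sliding-window limiter, assuming the default of 4,000 requests per hour described above. The class name `SlidingWindowLimiter` and its parameters are hypothetical, and the server may count requests differently (for example, in fixed hourly windows), so treat this as a safeguard rather than an exact mirror of server behavior.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Allow at most max_requests within any window_seconds-long period."""

    def __init__(
        self,
        max_requests: int = 4000,
        window_seconds: float = 3600.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()  # timestamps of allowed requests

    def try_acquire(self) -> bool:
        """Return True and record the request if it fits in the window."""
        now = self._clock()
        # Forget requests that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            self._sent.append(now)
            return True
        return False
```

A caller would check `limiter.try_acquire()` before each request and delay or queue the request when it returns False.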

What are the rate limits for the Recap API?

The Recap concurrency limit is the maximum number of transcripts or real-time audio sessions that a user can process simultaneously. This is also referred to as "throttling".

There is also a usage limit, which determines the number of hours of audio that a user can transcribe in a given month. This is specific to each account, and varies based on the type of transcription requested. Additionally, there is a rate limit at the API level, which is the number of API calls that a user can make within a particular time frame. This is to ensure that a single user or bad actor doesn't affect the performance of the API for other users.

Account type      Concurrency limit
Enterprise        132
Enterprise +      Custom


Note that real-time is a paid-only model. In addition to the concurrency limit, there is a rate limit for the API, which restricts users to a maximum of 20,000 requests per five minutes.
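
A client can respect a concurrency limit by capping how many jobs it runs at once. The sketch below uses an `asyncio.Semaphore` for this. The function name `run_with_concurrency_limit` is hypothetical, and each job is assumed to be a zero-argument coroutine function that submits one transcript or session; the actual limit should be taken from your account's plan.

```python
import asyncio
from typing import Any, Awaitable, Callable, List, Sequence


async def run_with_concurrency_limit(
    jobs: Sequence[Callable[[], Awaitable[Any]]],
    limit: int,
) -> List[Any]:
    """Run all jobs, with at most `limit` of them in progress at any time."""
    semaphore = asyncio.Semaphore(limit)

    async def guarded(job: Callable[[], Awaitable[Any]]) -> Any:
        # Each job waits here until one of the `limit` slots is free.
        async with semaphore:
            return await job()

    # gather preserves the order of `jobs` in the returned results.
    return list(await asyncio.gather(*(guarded(job) for job in jobs)))
```

For example, `asyncio.run(run_with_concurrency_limit(jobs, limit=10))` would process the whole list while keeping no more than ten jobs active.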

How can I check my rate limit status?

You can check your current rate limit status at any moment using the developer dashboard.

You can also check your current rate limit status by inspecting the following headers in the API response:

  • X-RateLimit-Limit: the maximum number of requests you're allowed to make per hour
  • X-RateLimit-Remaining: the number of requests you have left to make this hour
  • X-RateLimit-Reset: the time at which your request limit will be reset (in UTC epoch seconds)
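
The three headers listed above can be read into a small structure for logging or throttling decisions. This is a sketch only: it assumes the headers are available as a name-to-value mapping taken from an API response, and `RateLimitStatus` and `parse_rate_limit_headers` are hypothetical helper names.

```python
from datetime import datetime, timezone
from typing import Mapping, NamedTuple


class RateLimitStatus(NamedTuple):
    limit: int          # X-RateLimit-Limit
    remaining: int      # X-RateLimit-Remaining
    reset_at: datetime  # X-RateLimit-Reset, converted to an aware UTC datetime


def parse_rate_limit_headers(headers: Mapping[str, str]) -> RateLimitStatus:
    """Extract rate limit status from a mapping of header names to values."""
    # Header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimitStatus(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset_at=datetime.fromtimestamp(
            int(lowered["x-ratelimit-reset"]), tz=timezone.utc
        ),
    )
```

When `remaining` reaches 0, a client could pause until `reset_at` before sending further requests.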

When should I consider requesting a rate limit increase?

Our default rate limits help us maintain stability and prevent abuse of our API. We raise limits to support high-traffic applications, so the best time to request an increase is when you have traffic data that makes a strong case for it.

This rate limiting policy may change from time to time. API users will be notified of any changes through the API documentation or by email.

