Rate limits
API rate limits and how to handle throttled requests.
Limits
Authenticated API requests are rate-limited to 100 requests per minute per API key.
The limit applies to all authenticated endpoints:
- POST /v1/responses
- GET /v1/responses/{id}
- DELETE /v1/responses/{id}
- GET /v1/usage
GET /v1/engines is a public endpoint and is not rate-limited per key.
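To stay under the limit proactively, a client can track its own request timestamps. The sketch below is one way to do that, assuming a sliding 60-second window; this page does not say how the server measures the minute, so treat the windowing as an assumption. `SlidingWindowLimiter` and its methods are hypothetical names, not part of the API.

```python
import time
from collections import deque
from typing import Callable, Deque


class SlidingWindowLimiter:
    """Client-side guard for a limit of max_requests per window_seconds.

    Assumption: the server counts requests over a sliding window.
    """

    def __init__(
        self,
        max_requests: int = 100,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._sent: Deque[float] = deque()  # timestamps of recent requests

    def delay_before_next(self) -> float:
        """Seconds to wait before another request fits (0.0 if it can go now)."""
        now = self._clock()
        # Drop timestamps that have aged out of the window.
        while self._sent and now - self._sent[0] >= self._window:
            self._sent.popleft()
        if len(self._sent) < self._max:
            return 0.0
        # The oldest request must leave the window before a new one fits.
        return self._window - (now - self._sent[0])

    def record(self) -> None:
        """Call once for every request actually sent."""
        self._sent.append(self._clock())
```

A caller would sleep for `delay_before_next()` seconds, send the request, then call `record()`. Use one limiter per API key, since the limit is applied per key.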
Exceeded limit response
When you exceed the limit, the API returns:
HTTP/1.1 429 Too Many Requests

{
  "message": "Rate limit exceeded. Please try again later.",
  "data": {
    "retry_after": 42
  }
}

The retry_after field indicates the number of seconds until the limit resets.
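As an illustration, a client could extract the wait time from a throttled response as follows. This is a minimal sketch: `retry_after_seconds` and its `default` fallback are hypothetical, and the one-second fallback for a missing or malformed body is an assumption rather than documented behavior.

```python
import json
from typing import Optional


def retry_after_seconds(
    status_code: int, body: str, default: float = 1.0
) -> Optional[float]:
    """Return the wait in seconds for a 429 response, or None if not throttled."""
    if status_code != 429:
        return None
    try:
        payload = json.loads(body)
        return float(payload["data"]["retry_after"])
    except (ValueError, KeyError, TypeError):
        # Assumption: fall back to a short wait if the body is not as expected.
        return default
```

With the example body above, `retry_after_seconds(429, body)` returns `42.0`.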
Best practices
- Backoff: wait retry_after seconds before retrying.
- Reduce polling frequency: poll generation status every 2–5 seconds, not every second.
- Use webhooks: replace polling with webhook delivery for completion notifications.
- Separate keys: use different API keys for high-throughput services to isolate rate limit buckets.
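The backoff advice can be sketched as a small retry wrapper. This is an illustrative example, not an official client: `call_with_retry`, the `send` callable (which is assumed to return a status code and the parsed JSON body), the `max_attempts` cap, and the one-second fallback are all hypothetical choices.

```python
import time
from typing import Any, Callable, Dict, Tuple

Response = Tuple[int, Dict[str, Any]]  # (HTTP status code, parsed JSON body)


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 5,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call send(), waiting retry_after seconds after each 429 before retrying."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    status, body = send()
    attempts = 1
    while status == 429 and attempts < max_attempts:
        # Assumption: wait one second if retry_after is missing from the body.
        wait = (body.get("data") or {}).get("retry_after", 1)
        sleep(float(wait))
        status, body = send()
        attempts += 1
    # Either a non-429 response, or the last 429 once attempts are exhausted.
    return status, body
```

Passing `send` and `sleep` as parameters keeps the wrapper independent of any particular HTTP library and makes it easy to test without real delays.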