Rate Limit Tiers
Rate limits vary based on authentication status:

| Tier | Requests per Minute | Requests per Hour |
|---|---|---|
| Unauthenticated | 100 | 1,000 |
| Authenticated | 1,000 | 10,000 |
Enterprise customers can request higher rate limits. Contact [email protected] for custom limits.
Rate Limit Headers
Every API response includes headers indicating your current rate limit status:

| Header | Description |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the current window |
| X-RateLimit-Remaining | Number of requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp when the rate limit window resets |
429 Too Many Requests
When you exceed the rate limit, the API returns a 429 Too Many Requests response that includes:
- Retry-After header: Seconds until you can retry
- retry_after field: Same value in the JSON response body
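
As a minimal sketch of reading these two values, the helper below takes the status code, headers, and raw body of a response and returns the wait time in seconds. The function name `retry_delay_seconds` and the `default` fallback are illustrative assumptions, not part of the API or SDK.

```python
import json


def retry_delay_seconds(status, headers, body_text, default=1.0):
    """Return seconds to wait before retrying, or None if not rate limited."""
    if status != 429:
        return None
    # HTTP header names are case-insensitive, so normalize before the lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    value = lowered.get("retry-after")
    if value is None:
        # Fall back to the retry_after field in the JSON response body.
        try:
            value = json.loads(body_text or "{}").get("retry_after")
        except (ValueError, AttributeError):
            value = None
    try:
        return max(0.0, float(value))
    except (TypeError, ValueError):
        # Assumption: wait a fixed default when neither value is usable.
        return default
```

The header is checked first because it is available even when the body cannot be parsed.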
Handling Rate Limits
Automatic Retry with SDK
The Python SDK automatically handles rate limits with exponential backoff:
- Wait for the duration specified in the Retry-After header
- Retry with exponential backoff (1s, 2s, 4s, …)
- Throw BaytRateLimitError if all retries are exhausted
Manual Retry Logic
If calling the API directly, implement retry logic:
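
One possible shape for such retry logic in Python is sketched below. It assumes a hypothetical zero-argument callable `send` that performs one request with whatever HTTP client you use and returns a `(status, headers, body)` tuple. The names `call_with_retry`, `max_retries`, and `base_delay` are illustrative, and `RuntimeError` stands in for whatever exception your application prefers.

```python
import time


def call_with_retry(send, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call send() until it returns a non-429 response or retries run out.

    send is a hypothetical callable returning (status, headers, body)
    for a single API request.
    """
    for attempt in range(max_retries + 1):
        status, headers, body = send()
        if status != 429:
            return status, headers, body
        if attempt == max_retries:
            break  # no retries left, so do not sleep again
        # Prefer the server's Retry-After value; otherwise back off 1s, 2s, 4s, ...
        try:
            delay = float(headers.get("Retry-After"))
        except (TypeError, ValueError):
            delay = base_delay * (2 ** attempt)
        sleep(delay)
    raise RuntimeError("rate limited: retries exhausted")
```

Passing `sleep` as a parameter keeps the loop easy to test without real waiting.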
Best Practices
Monitor rate limit headers
Always check X-RateLimit-Remaining to know when you’re approaching the limit:
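
A small sketch of such a check follows. It reads the three rate limit headers from a response's header mapping. The function name `rate_limit_status` and the 10% warning threshold are assumptions chosen for illustration.

```python
def rate_limit_status(headers, warn_fraction=0.1):
    """Summarize the X-RateLimit-* headers from one API response."""
    limit = int(headers["X-RateLimit-Limit"])
    remaining = int(headers["X-RateLimit-Remaining"])
    reset_at = int(headers["X-RateLimit-Reset"])  # Unix timestamp
    return {
        "limit": limit,
        "remaining": remaining,
        "reset_at": reset_at,
        # True once fewer than warn_fraction of the window's requests remain.
        "near_limit": remaining < limit * warn_fraction,
    }
```

A caller could slow down or pause until `reset_at` whenever `near_limit` is true.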
Implement exponential backoff
Don’t retry immediately. Use exponential backoff to avoid hammering the API:
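
The delay schedule can be sketched as a generator. The `cap` and `jitter` parameters are additions that are not described above: capping keeps waits bounded, and optional jitter spreads out clients that would otherwise retry at the same moment.

```python
import random


def backoff_delays(max_retries=5, base=1.0, cap=60.0, jitter=0.0, rng=random.random):
    """Yield delays of base * 2**n seconds (1s, 2s, 4s, ...), capped at cap."""
    for attempt in range(max_retries):
        delay = min(cap, base * (2 ** attempt))
        # Optional jitter adds up to jitter * delay seconds of random extra wait.
        yield delay + delay * jitter * rng()
```

With the default `jitter=0.0`, the delays are exactly the doubling sequence.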
Batch requests when possible
Instead of fetching prompts one at a time, use the list endpoint:
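
The idea can be sketched as follows. The list endpoint's actual shape is not specified here, so `list_prompts` is a hypothetical callable that wraps it and is assumed to return an iterable of dicts, each with an `"id"` key. Pagination, if the endpoint uses it, would need to be handled inside that callable.

```python
def load_prompts(wanted_ids, list_prompts):
    """Fetch many prompts with one list call instead of one call per ID.

    list_prompts is a hypothetical wrapper around the list endpoint,
    assumed to return an iterable of dicts that each have an "id" key.
    """
    by_id = {prompt["id"]: prompt for prompt in list_prompts()}
    # Keep only the requested prompts, skipping IDs the API did not return.
    return {pid: by_id[pid] for pid in wanted_ids if pid in by_id}
```

This costs one request regardless of how many IDs are requested.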
Cache responses locally
Cache prompt data to reduce API calls:
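
A minimal in-memory sketch is shown below. The `TTLCache` class, its five-minute default lifetime, and the `fetch` callable are illustrative assumptions, so pick a lifetime that matches how often your prompts change.

```python
import time


class TTLCache:
    """Small in-memory cache whose entries expire after ttl seconds."""

    def __init__(self, ttl=300.0, clock=time.monotonic):
        self.ttl = ttl
        self.clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def get_or_fetch(self, key, fetch):
        """Return the cached value for key, calling fetch(key) on a miss."""
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # still fresh, so no API call is made
        value = fetch(key)
        self._entries[key] = (now + self.ttl, value)
        return value
```

Only a cache miss or an expired entry triggers `fetch`, so repeated reads of the same prompt use a single API call per lifetime.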
Spread requests over time
If processing many prompts, add delays between requests:
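
One way to pace a batch is sketched here. It derives the pause from a per-minute budget, defaulting to the unauthenticated tier's 100 requests per minute. `process_spaced` and `handle` are hypothetical names, where `handle` is whatever function issues the request for one item.

```python
import time


def process_spaced(items, handle, requests_per_minute=100, sleep=time.sleep):
    """Call handle(item) for each item, pausing to stay under the limit."""
    interval = 60.0 / requests_per_minute  # seconds between requests
    results = []
    for index, item in enumerate(items):
        if index > 0:
            sleep(interval)  # no pause is needed before the first request
        results.append(handle(item))
    return results
```

The pause ignores the time each request takes, so the real pace stays slightly below the budget.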
Use server-side caching
For production applications, implement server-side caching:
Rate Limit Response Example
Checking Headers Before Hitting Limit
When Rate Limited
Monitoring Rate Limit Usage
Track your rate limit usage in application logs:
Enterprise Rate Limits
Enterprise customers receive:
- Higher default rate limits
- Dedicated rate limit pools
- Priority support for limit adjustments
- Custom rate limit configurations per API key