Working With Rate Limits
A rate limit is like having a bucket with a limited number of tokens. Each API request uses one token from your bucket. The bucket refills with new tokens over time, but if you use all your tokens, you’ll need to wait for more to be added before making additional requests.
How Rate Limits Work
Think of rate limiting as a bucket of tokens with a refill schedule:
- Token Bucket: You start with a full bucket of 100 request tokens
- Request Consumption: Each API call uses one token from your bucket
- Bucket Refill: Your bucket refills completely every 60 seconds
- Overflow Protection: If your bucket is empty, you must wait for it to refill before making more requests
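The steps above can be sketched as a minimal token bucket. This is an illustrative model only, not the server's actual implementation; the 100-token capacity and 60-second window come from this guide's example limits:

```python
import time

class TokenBucket:
    """Minimal fixed-window token bucket: refills completely every `window` seconds."""

    def __init__(self, capacity=100, window=60):
        self.capacity = capacity
        self.window = window
        self.tokens = capacity
        self.window_start = time.monotonic()

    def try_consume(self):
        now = time.monotonic()
        # Bucket refill: reset to full capacity once the window elapses.
        if now - self.window_start >= self.window:
            self.tokens = self.capacity
            self.window_start = now
        # Overflow protection: an empty bucket rejects the request.
        if self.tokens == 0:
            return False
        # Request consumption: each call uses one token.
        self.tokens -= 1
        return True

# A bucket with capacity 3 allows three requests, then rejects the fourth.
bucket = TokenBucket(capacity=3, window=60)
results = [bucket.try_consume() for _ in range(4)]  # [True, True, True, False]
```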
The system tracks your usage in real-time and provides information about your current status through response headers.
Per-API-Key Tracking
Rate limits are tracked individually for each API key, so one application’s usage doesn’t affect another’s limits.
Tracking Your Usage
Every API response includes headers that tell you about your current rate limit status:
Response Headers
X-RateLimit-Limit
The total number of requests allowed in the current time window.
X-RateLimit-Remaining
The number of requests you have left in the current time window.
X-RateLimit-Reset
A Unix timestamp indicating when your rate limit will reset (when your bucket refills).
Retry-After (only present when rate limited)
When you exceed your rate limit, this header tells you how many seconds to wait before trying again.
Example Response Headers
Here’s what you might see in a successful response:
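For example (the values here are illustrative; your actual numbers depend on your limits and current usage):

```http
HTTP/1.1 200 OK
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 87
X-RateLimit-Reset: 1672531200
```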
When You Exceed Rate Limits
If you make too many requests and exceed your rate limit, you’ll receive a 429 Too Many Requests response:
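For example:

```http
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1672531200
Retry-After: 45
```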
This response tells you:
- You’ve used all 100 of your allowed requests
- You have 0 requests remaining
- Your limit will reset at timestamp 1672531200
- You should wait 45 seconds before making another request
Working with Rate Limits In Your App
Monitor Your Usage
Always check the rate limit headers in your responses to monitor your usage:
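One way to read these headers is sketched below. It assumes the response headers are available as a dict-like object (for example, `response.headers` from your HTTP client); the header names are the ones documented above:

```python
def rate_limit_status(headers):
    """Summarize rate limit state from a dict-like set of response headers."""
    limit = int(headers.get("X-RateLimit-Limit", 0))
    remaining = int(headers.get("X-RateLimit-Remaining", 0))
    reset = int(headers.get("X-RateLimit-Reset", 0))
    # Fraction of the window's budget already consumed.
    used_fraction = 1 - remaining / limit if limit else 0.0
    return {"limit": limit, "remaining": remaining,
            "reset": reset, "used_fraction": used_fraction}

status = rate_limit_status({"X-RateLimit-Limit": "100",
                            "X-RateLimit-Remaining": "25",
                            "X-RateLimit-Reset": "1672531200"})
# status["used_fraction"] == 0.75, a good point to start throttling
```

You might, for instance, log a warning or slow your request rate once `used_fraction` crosses a threshold like 0.8.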
Handle Rate Limit Errors
Implement proper error handling for rate limit responses:
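A sketch of that handling logic is below. It is written as a pure function over the status code and headers so it works with any HTTP client; the fallback to `X-RateLimit-Reset` is an assumption for the case where `Retry-After` is absent:

```python
import time

def retry_delay(status_code, headers, now=None):
    """Return seconds to wait before retrying, or 0 if the request was not rate limited."""
    if status_code != 429:
        return 0
    # Prefer the explicit Retry-After header when the server sends it.
    retry_after = headers.get("Retry-After")
    if retry_after is not None:
        return int(retry_after)
    # Otherwise fall back to waiting until the reset timestamp.
    reset = int(headers.get("X-RateLimit-Reset", 0))
    now = time.time() if now is None else now
    return max(0, int(reset - now))

retry_delay(429, {"Retry-After": "45"})  # 45
retry_delay(200, {})                     # 0
```

Your request loop would call `time.sleep(retry_delay(...))` before retrying a 429 response.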
Best Practices
Implement Exponential Backoff
When you hit rate limits, wait progressively longer between retry attempts.
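One common form of this is exponential backoff with jitter, sketched here; the base, factor, and cap values are illustrative defaults, not values prescribed by the API:

```python
import random

def backoff_delays(base=1.0, factor=2.0, cap=60.0, attempts=5):
    """Exponentially growing retry delays with full jitter, capped at `cap` seconds."""
    delays = []
    for attempt in range(attempts):
        ceiling = min(cap, base * factor ** attempt)  # 1s, 2s, 4s, 8s, ...
        # Random jitter keeps many clients from retrying in lockstep.
        delays.append(random.uniform(0, ceiling))
    return delays

delays = backoff_delays()  # five delays, each between 0 and 60 seconds
```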
Batch Your Requests
If possible, combine multiple operations into fewer API calls.
Cache Responses
Store API responses locally to reduce the number of requests you need to make.
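A minimal way to do this is a time-to-live (TTL) cache; the sketch below is one simple in-memory approach, with a hypothetical 300-second TTL:

```python
import time

class TTLCache:
    """Tiny time-based cache: serve stored responses until they expire."""

    def __init__(self, ttl=300):
        self.ttl = ttl
        self._store = {}

    def get(self, key, now=None):
        now = time.monotonic() if now is None else now
        entry = self._store.get(key)
        if entry is None:
            return None
        value, stored_at = entry
        if now - stored_at > self.ttl:  # expired: caller should make a fresh API call
            del self._store[key]
            return None
        return value

    def set(self, key, value, now=None):
        now = time.monotonic() if now is None else now
        self._store[key] = (value, now)

cache = TTLCache(ttl=300)
cache.set("/users/42", {"name": "Ada"}, now=0)
cache.get("/users/42", now=100)   # {"name": "Ada"} -- still fresh, no API call needed
cache.get("/users/42", now=1000)  # None -- expired, fetch again and re-cache
```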
Monitor Continuously
Keep track of your rate limit headers to avoid hitting limits unexpectedly.
Plan for Peak Usage
Consider your application’s peak usage patterns and design accordingly.