Quotas
Quotas are similar to rate limits but typically apply over longer periods or across multiple resources.
- Example: A free API tier may allow 10,000 requests per month.
- Exceeding the quota usually blocks further requests until the quota resets at the start of the next period.
Differences from Rate Limiting:
| Feature | Rate Limiting | Quotas |
|---|---|---|
| Time Window | Short-term (seconds/minutes) | Long-term (daily/monthly/yearly) |
| Purpose | Prevent server overload | Enforce usage tiers and subscriptions |
| Enforcement | Per client, IP, or API key within each short window | Per account, API key, or subscription over the whole period |
Quota in REST API
Scenario: Free users can make 10,000 requests/month.
- Client makes a request:
GET /api/products
- Server checks monthly usage:
  - Usage < 10,000 → process request, increment usage
  - Usage ≥ 10,000 → return 403 Forbidden with quota message
Response:
HTTP/1.1 403 Forbidden
Content-Type: application/json
{
"error": "Quota exceeded. Free tier allows 10,000 requests per month."
}
- Paid users might have higher quotas.
- Quotas are often combined with rate limits to control both burst traffic and overall usage.
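The check described in this scenario can be sketched in a few lines of Python. This is a minimal illustration, not a production design: the `QuotaTracker` class, its `handle_request` method, and the in-memory dictionary are hypothetical names chosen for the example, and a real service would normally keep usage counts in a shared store rather than in process memory.

```python
from dataclasses import dataclass, field

FREE_TIER_MONTHLY_QUOTA = 10_000  # free-tier limit from the scenario


@dataclass
class QuotaTracker:
    """Hypothetical in-memory usage counter keyed by (api_key, 'YYYY-MM')."""
    limit: int = FREE_TIER_MONTHLY_QUOTA
    usage: dict = field(default_factory=dict)

    def handle_request(self, api_key: str, month: str):
        """Return (status_code, json_body) for one incoming request."""
        key = (api_key, month)
        used = self.usage.get(key, 0)
        if used >= self.limit:
            # Usage >= limit: reject without incrementing the counter.
            message = (
                f"Quota exceeded. Free tier allows {self.limit:,} "
                "requests per month."
            )
            return 403, {"error": message}
        # Usage < limit: count the request and process it.
        self.usage[key] = used + 1
        return 200, {"products": []}  # placeholder payload
```

Because the month is part of the key, usage starts from zero again when a new month begins, which is one simple way to model the reset.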
Combining Rate Limiting & Quotas
Imagine an API with:
- Rate Limit: 100 requests per hour
- Monthly Quota: 10,000 requests
Client requests:
GET /api/orders
Server checks:
- Hourly rate: if the client has already made 100 requests in the current hour → 429 Too Many Requests
- Monthly quota: if the client has already made 10,000 requests this month → 403 Forbidden (quota exceeded)
Response headers may include:
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1673269200
X-Quota-Limit: 10000
X-Quota-Remaining: 8765
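One way to produce both checks and the headers shown here is sketched below. It rests on several assumptions that the example does not specify: a fixed one-hour window per client, in-memory counters, the rate limit being checked before the quota, and rejected requests not counting toward either limit. The `Limiter` class and its `check` method are hypothetical names, and the `Retry-After` header on 429 responses is an addition for illustration.

```python
import time

RATE_LIMIT = 100          # requests per hour (from the example)
WINDOW_SECONDS = 3600
MONTHLY_QUOTA = 10_000    # requests per month (from the example)


class Limiter:
    """Hypothetical fixed-window rate limit plus monthly quota, in memory."""

    def __init__(self):
        self.window_start = {}  # client -> epoch second the window began
        self.window_count = {}  # client -> requests accepted in the window
        self.month_count = {}   # (client, "YYYY-MM") -> requests accepted

    def check(self, client, now=None):
        """Return (status_code, headers) for one incoming request."""
        now = time.time() if now is None else now
        month = time.strftime("%Y-%m", time.gmtime(now))

        # Start a new hourly window if none exists or the old one expired.
        start = self.window_start.get(client)
        if start is None or now - start >= WINDOW_SECONDS:
            start = now
            self.window_start[client] = start
            self.window_count[client] = 0

        used_hour = self.window_count[client]
        used_month = self.month_count.get((client, month), 0)
        reset = int(start + WINDOW_SECONDS)

        if used_hour >= RATE_LIMIT:
            status = 429  # too many requests in this hour
        elif used_month >= MONTHLY_QUOTA:
            status = 403  # monthly quota exhausted
        else:
            status = 200
            used_hour += 1
            used_month += 1
            self.window_count[client] = used_hour
            self.month_count[(client, month)] = used_month

        headers = {
            "X-RateLimit-Limit": str(RATE_LIMIT),
            "X-RateLimit-Remaining": str(max(RATE_LIMIT - used_hour, 0)),
            "X-RateLimit-Reset": str(reset),
            "X-Quota-Limit": str(MONTHLY_QUOTA),
            "X-Quota-Remaining": str(max(MONTHLY_QUOTA - used_month, 0)),
        }
        if status == 429:
            # Seconds until the current window ends.
            headers["Retry-After"] = str(max(reset - int(now), 0))
        return status, headers
```

A fixed window is the simplest scheme to reason about, but it allows a burst at the boundary between two windows. Sliding-window or token-bucket counters are common alternatives when that matters.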
Why Rate Limiting & Quotas Improve Performance & Scalability
- Prevents server overload: Stops clients from sending too many requests at once.
- Fair usage: Ensures no client consumes disproportionate resources.
- Encourages efficient client behavior: Clients that face limits have a reason to cache responses and back off between retries.
- Supports monetization: Quotas enable tiered subscription models.
- Protects downstream services: Rate limiting reduces load on databases and third-party integrations.
Best Practices
- Return 429 for rate limit exceeded and 403 for quota exceeded.
- Send a Retry-After header with rejected requests so clients know how long to wait before retrying.
- Combine limits with caching so clients need fewer requests to get the same data.
- Log request counts for monitoring and analytics.
- Consider IP-based, user-based, or API key-based limits depending on API type.
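On the client side, honoring `Retry-After` might look like the sketch below. The `retry_delay` helper and its `default` fallback are hypothetical, not part of any particular HTTP library. The header may carry either a number of seconds or an HTTP date, so the helper handles both forms and falls back to the default when the value is missing or unparseable.

```python
import time
from email.utils import parsedate_to_datetime


def retry_delay(headers, now=None, default=1.0):
    """Return how many seconds to wait before retrying a rejected request.

    `headers` is a plain dict of response headers; `default` is an
    assumed fallback used when Retry-After is absent or invalid.
    """
    now = time.time() if now is None else now
    value = headers.get("Retry-After")
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # Delay-seconds form, e.g. "120".
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    # Never return a negative delay if the date is already in the past.
    return max(when.timestamp() - now, 0.0)
```

A caller would typically sleep for the returned delay after a 429 response and then retry, ideally with an upper bound on the number of attempts.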