Quotas

Quotas are similar to rate limits but typically apply over longer periods or across multiple resources.

  • Example: A free API tier may allow 10,000 requests per month.
  • Exceeding the quota usually blocks further requests until the quota resets at the start of the next period.

Differences from Rate Limiting:

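| Feature     | Rate Limiting                | Quotas                                |
|-------------|------------------------------|---------------------------------------|
| Time Window | Short-term (seconds/minutes) | Long-term (daily/monthly/yearly)      |
| Purpose     | Prevent server overload      | Enforce usage tiers and subscriptions |
| Enforcement | Per request                  | Per account, API key, or subscription |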

Quotas in a REST API

Scenario: Free users can make 10,000 requests/month.

  1. Client makes a request:

GET /api/products

  2. Server checks monthly usage:
     • Usage < 10,000 → process request, increment usage
     • Usage ≥ 10,000 → return 403 Forbidden with quota message

Response:

HTTP/1.1 403 Forbidden
Content-Type: application/json

{
  "error": "Quota exceeded. Free tier allows 10,000 requests per month."
}

  • Paid users might have higher quotas.
  • Quotas are often combined with rate limits to control both burst traffic and overall usage.
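
The server-side check in the scenario above could be sketched as follows. This is a minimal illustration, not a production design: the in-memory `usage` store, the `handle_request` function, and the empty `products` payload are all hypothetical names chosen for the example, and a real service would keep counters in a shared store.

```python
import json
from collections import defaultdict
from datetime import datetime, timezone

FREE_TIER_MONTHLY_QUOTA = 10_000  # from the scenario above

# Hypothetical in-memory store: (api_key, "YYYY-MM") -> request count.
# Assumption: a real service would use a shared store such as a database.
usage = defaultdict(int)


def handle_request(api_key, now=None, quota=FREE_TIER_MONTHLY_QUOTA):
    """Return (status_code, json_body) for one request made with api_key."""
    now = now or datetime.now(timezone.utc)
    # The month is part of the key, so a new month starts from zero.
    period = (api_key, now.strftime("%Y-%m"))

    if usage[period] >= quota:
        body = {"error": f"Quota exceeded. Free tier allows {quota:,} requests per month."}
        return 403, json.dumps(body)

    usage[period] += 1  # count only requests that are actually processed
    return 200, json.dumps({"products": []})  # placeholder payload
```

Keying the counter by calendar month is one simple way to get the reset behavior. A rolling 30-day window would need per-request timestamps instead.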

Combining Rate Limiting & Quotas

Imagine an API with:

  • Rate Limit: 100 requests per hour
  • Monthly Quota: 10,000 requests

Client requests:

GET /api/orders

Server checks:

  1. Hourly rate: if the client has already made 100 requests this hour → 429 Too Many Requests
  2. Monthly quota: if the client has already made 10,000 requests this month → 403 Forbidden (quota exceeded)

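Response headers may include the following. The X- names are a common convention rather than a standard, and X-RateLimit-Reset here is a Unix timestamp: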

X-RateLimit-Limit: 100
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1673269200
X-Quota-Limit: 10000
X-Quota-Remaining: 8765
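
One way to put both checks and the headers together is sketched below. It assumes a fixed one-hour window per client and leaves out the monthly reset for brevity. `ClientState` and `check` are hypothetical names, not part of any particular framework.

```python
import time

RATE_LIMIT = 100         # requests per hour (from the example above)
MONTHLY_QUOTA = 10_000   # requests per month (from the example above)
WINDOW_SECONDS = 3600


class ClientState:
    """Hypothetical per-client counters; a fixed one-hour window is assumed."""

    def __init__(self, now):
        self.window_start = now
        self.window_count = 0
        self.month_count = 0  # monthly reset is omitted to keep the sketch short


def check(state, now=None):
    """Return (status_code, headers) for one incoming request."""
    now = time.time() if now is None else now

    # Start a new hourly window once the old one has expired.
    if now - state.window_start >= WINDOW_SECONDS:
        state.window_start = now
        state.window_count = 0

    reset_at = int(state.window_start + WINDOW_SECONDS)

    if state.window_count >= RATE_LIMIT:
        status = 429  # rate limit is checked first
    elif state.month_count >= MONTHLY_QUOTA:
        status = 403  # then the monthly quota
    else:
        status = 200
        state.window_count += 1
        state.month_count += 1

    headers = {
        "X-RateLimit-Limit": str(RATE_LIMIT),
        "X-RateLimit-Remaining": str(RATE_LIMIT - state.window_count),
        "X-RateLimit-Reset": str(reset_at),
        "X-Quota-Limit": str(MONTHLY_QUOTA),
        "X-Quota-Remaining": str(MONTHLY_QUOTA - state.month_count),
    }
    if status == 429:
        # Seconds until the current window ends.
        headers["Retry-After"] = str(max(0, reset_at - int(now)))
    return status, headers
```

Rejected requests are not counted against either limit in this sketch. Some APIs do count them, so treat that as a design choice rather than a rule.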

Why Rate Limiting & Quotas Improve Performance & Scalability

  1. Prevents server overload: Stops clients from sending too many requests at once.
  2. Fair usage: Ensures no client consumes disproportionate resources.
  3. Encourages efficient client behavior: Clients implement caching or backoff strategies.
  4. Supports monetization: Quotas enable tiered subscription models.
  5. Protects downstream services: Rate limiting reduces load on databases and third-party integrations.

Best Practices

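  • Return 429 Too Many Requests when the rate limit is exceeded and 403 Forbidden when the quota is exceeded.
  • Include a Retry-After header so clients know how long to wait before retrying.
  • Combine limits with response caching to reduce the number of requests that reach the server.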
  • Log request counts for monitoring and analytics.
  • Consider IP-based, user-based, or API key-based limits depending on API type.
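
On the client side, the status codes and the Retry-After header recommended above might be handled as in the following sketch. The `retry_delay` helper and its `base` and `cap` defaults are illustrative assumptions. It treats 403 as "quota exhausted", following this article's convention, and only understands the delay-in-seconds form of Retry-After.

```python
import random


def retry_delay(status, headers, attempt, base=1.0, cap=60.0):
    """Return seconds to wait before retrying, or None if retrying is pointless.

    attempt is 0 for the first retry; base and cap are illustrative defaults.
    """
    if status == 403:
        return None  # quota exhausted: retrying soon will not help
    if status != 429:
        return 0.0   # not throttled, no wait needed

    retry_after = headers.get("Retry-After", "")
    if retry_after.isdigit():
        return float(retry_after)  # server told us how long to wait

    # No usable header (missing, or in the HTTP-date form this sketch does
    # not parse): fall back to exponential backoff with full jitter.
    return random.uniform(0, min(cap, base * 2 ** attempt))
```

For example, `retry_delay(429, {"Retry-After": "30"}, attempt=0)` returns `30.0`, while a 429 without the header yields a random delay that grows with each attempt up to the cap.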