Token-bucket rate limiting in FastAPI

A per-user token bucket throttles requests through a configurable FastAPI dependency.

import math
import time
from dataclasses import dataclass, field

from fastapi import Depends, FastAPI, HTTPException, Request, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

# resolve_user_id() and fetch_reports_for() are not defined in this snippet;
# they are assumed to be provided elsewhere in the application.

app = FastAPI()
bearer = HTTPBearer()


@dataclass
class TokenBucket:
    capacity: int  # maximum burst size, in tokens
    refill_rate: float  # tokens added per second
    tokens: float = field(init=False)
    updated_at: float = field(default_factory=time.monotonic)

    def __post_init__(self) -> None:
        # Start full so a new user can burst immediately.
        self.tokens = float(self.capacity)

    def consume(self, amount: int = 1) -> bool:
        # Refill lazily from the time elapsed since the last call, capped at capacity.
        now = time.monotonic()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


def rate_limit(capacity: int = 60, per_seconds: float = 60.0):
    # Each call to the factory gets its own bucket store, so routes with
    # different limits do not share one another's buckets.
    buckets: dict[str, TokenBucket] = {}

    async def dependency(
        request: Request,
        creds: HTTPAuthorizationCredentials = Depends(bearer),
    ) -> str:
        user_id = await resolve_user_id(creds.credentials)
        bucket = buckets.setdefault(
            user_id, TokenBucket(capacity, capacity / per_seconds)
        )
        if not bucket.consume():
            # Round up so the client does not retry before a full token exists.
            retry_after = max(1, math.ceil((1 - bucket.tokens) / bucket.refill_rate))
            raise HTTPException(
                status_code=status.HTTP_429_TOO_MANY_REQUESTS,
                detail="Rate limit exceeded",
                headers={"Retry-After": str(retry_after)},
            )
        request.state.user_id = user_id
        return user_id

    return dependency


@app.get("/reports")
async def list_reports(user_id: str = Depends(rate_limit(capacity=30))):
    return await fetch_reports_for(user_id)
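
The burst-then-refill behaviour is easier to see without waiting on a real clock. The following is a minimal sketch, not part of the original snippet: it restates the same bucket logic with an injectable clock. The names `FakeClock`, `ClockedBucket`, and the `clock` parameter are assumptions introduced purely for illustration.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


class FakeClock:
    """Hypothetical manual clock so the demo never has to sleep."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


@dataclass
class ClockedBucket:
    """Same refill logic as TokenBucket, with the time source passed in."""

    capacity: int
    refill_rate: float  # tokens per second
    clock: Callable[[], float] = time.monotonic
    tokens: float = field(init=False)
    updated_at: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = float(self.capacity)
        self.updated_at = self.clock()

    def consume(self, amount: int = 1) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


clock = FakeClock()
bucket = ClockedBucket(capacity=3, refill_rate=0.5, clock=clock)

# A full bucket absorbs a burst of 3; the 4th immediate request is rejected.
burst = [bucket.consume() for _ in range(4)]

# Two seconds at 0.5 tokens/s refills exactly one token.
clock.advance(2.0)
after_refill = [bucket.consume(), bucket.consume()]

print(burst)         # [True, True, True, False]
print(after_refill)  # [True, False]
```

Passing the clock in is only a testing convenience here; the snippet above reads `time.monotonic()` directly, which is a reasonable choice for production because it is unaffected by wall-clock adjustments.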

Three takeaways
  1. A token bucket allows bursts up to capacity while enforcing a steady long-run rate via continuous refill.
  2. Wrapping a dependency in a factory lets you parameterize per-route limits while reusing one implementation.
  3. Returning a 429 with a Retry-After header tells clients how many seconds to wait before trying again.
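
Takeaways 2 and 3 can be sketched without any web framework. In this rough illustration, `make_limiter`, `check`, `reports_limit`, and `search_limit` are hypothetical names rather than anything from the snippet or from FastAPI. It assumes each limiter produced by the factory keeps its own per-user state, and that the retry hint is rounded up to whole seconds.

```python
import math
from typing import Callable


def make_limiter(capacity: int, per_seconds: float) -> Callable[[str, float], tuple[bool, int]]:
    """Hypothetical factory: one implementation, different limits per caller."""
    refill_rate = capacity / per_seconds           # tokens per second
    state: dict[str, tuple[float, float]] = {}     # user_id -> (tokens, updated_at)

    def check(user_id: str, now: float) -> tuple[bool, int]:
        tokens, updated_at = state.get(user_id, (float(capacity), now))
        tokens = min(capacity, tokens + (now - updated_at) * refill_rate)
        if tokens >= 1:
            state[user_id] = (tokens - 1, now)
            return True, 0
        state[user_id] = (tokens, now)
        # Round up: truncating would invite a retry before a full token exists.
        return False, max(1, math.ceil((1 - tokens) / refill_rate))

    return check


reports_limit = make_limiter(capacity=2, per_seconds=8.0)  # 0.25 tokens/s
search_limit = make_limiter(capacity=1, per_seconds=1.0)   # 1 token/s

results = [
    reports_limit("alice", 0.0),   # allowed, 1 token left
    reports_limit("alice", 0.0),   # allowed, bucket now empty
    reports_limit("alice", 1.5),   # 0.375 tokens: rejected, 2.5 s rounds up to 3
]
other_route = search_limit("alice", 1.5)  # separate limiter, separate state

print(results)      # [(True, 0), (True, 0), (False, 3)]
print(other_route)  # (True, 0)
```

The second element of each result plays the role of the Retry-After value: with 0.375 tokens in hand and 0.25 tokens arriving per second, a full token is 2.5 seconds away, so the hint is 3 rather than 2.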
