Token-bucket rate limiting in FastAPI
A per-user token bucket throttles requests through a configurable FastAPI dependency.
Explained by highlit
import math
import time
from dataclasses import dataclass, field

from fastapi import Depends, FastAPI, HTTPException, Request, status
from fastapi.security import HTTPAuthorizationCredentials, HTTPBearer

app = FastAPI()
bearer = HTTPBearer()


@dataclass
class TokenBucket:
    capacity: int
    refill_rate: float  # tokens added per second
    tokens: float = field(init=False)
    updated_at: float = field(default_factory=time.monotonic)

    def __post_init__(self) -> None:
        # Start full so a new user can burst up to capacity.
        self.tokens = float(self.capacity)

    def consume(self, amount: int = 1) -> bool:
        # Refill for the time elapsed since the last call, capped at capacity.
        now = time.monotonic()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


# In-memory, per-process store keyed by user ID. Every dependency built by
# rate_limit() shares it, so a user's bucket keeps the limits it was created with.
_buckets: dict[str, TokenBucket] = {}


def rate_limit(capacity: int = 60, per_seconds: float = 60.0):
    async def dependency(
        request: Request,
        creds: HTTPAuthorizationCredentials = Depends(bearer),
    ) -> str:
        # resolve_user_id is assumed to be defined elsewhere in the application.
        user_id = await resolve_user_id(creds.credentials)
        bucket = _buckets.setdefault(
            user_id, TokenBucket(capacity, capacity / per_seconds)
        )
        if not bucket.consume():
            # Round up so the client is never told to retry before a token exists.
            retry_after = max(1, math.ceil((1 - bucket.tokens) / bucket.refill_rate))
            raise HTTPException(
                status_code=status.HTTP_429_TOO_MANY_REQUESTS,
                detail="Rate limit exceeded",
                headers={"Retry-After": str(retry_after)},
            )
        request.state.user_id = user_id
        return user_id

    return dependency


@app.get("/reports")
async def list_reports(user_id: str = Depends(rate_limit(capacity=30))):
    # fetch_reports_for is assumed to be defined elsewhere in the application.
    return await fetch_reports_for(user_id)
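The refill arithmetic in consume() is easier to follow with time under manual control. The sketch below is not part of the listing: ClockedBucket and FakeClock are hypothetical names, and the clock parameter is an assumption added only so the example is deterministic. The refill and consume logic is otherwise the same as in TokenBucket.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


class FakeClock:
    """Hypothetical manual clock: time only moves when advance() is called."""

    def __init__(self) -> None:
        self.now = 0.0

    def __call__(self) -> float:
        return self.now

    def advance(self, seconds: float) -> None:
        self.now += seconds


@dataclass
class ClockedBucket:
    """Same logic as TokenBucket, with an injectable clock (an assumption)."""

    capacity: int
    refill_rate: float  # tokens added per second
    clock: Callable[[], float] = time.monotonic
    tokens: float = field(init=False)
    updated_at: float = field(init=False)

    def __post_init__(self) -> None:
        self.tokens = float(self.capacity)
        self.updated_at = self.clock()

    def consume(self, amount: int = 1) -> bool:
        now = self.clock()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


clock = FakeClock()
bucket = ClockedBucket(capacity=3, refill_rate=0.5, clock=clock)

# A full bucket absorbs a burst of three; the fourth request is rejected.
burst = [bucket.consume() for _ in range(4)]

# Two seconds at 0.5 tokens per second refills exactly one token.
clock.advance(2.0)
after_wait = bucket.consume()
immediately_again = bucket.consume()
```

Here burst comes out as three accepted requests followed by one rejection, after_wait is accepted, and immediately_again is rejected because the single refilled token has already been spent.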
Three takeaways
- A token bucket allows bursts up to capacity while enforcing a steady long-run rate through continuous refill.
- Wrapping a dependency in a factory lets you parameterize per-route limits while reusing one implementation.
- Returning a 429 with a Retry-After header tells clients how many seconds to wait before trying again.
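One caveat when the factory is reused across several routes: the listing keeps a single module-level _buckets dict keyed by user ID, so two routes with different capacities would share whichever bucket was created first for that user. A minimal, framework-free sketch of one way around this keeps the store inside the factory's closure. It assumes per-route isolation is the desired behavior. make_limiter and check are hypothetical names standing in for rate_limit and its inner dependency, and the wait is rounded up so it is never shorter than the time needed for one token.

```python
import math
import time
from dataclasses import dataclass, field


@dataclass
class TokenBucket:
    capacity: int
    refill_rate: float  # tokens added per second
    tokens: float = field(init=False)
    updated_at: float = field(default_factory=time.monotonic)

    def __post_init__(self) -> None:
        self.tokens = float(self.capacity)

    def consume(self, amount: int = 1) -> bool:
        now = time.monotonic()
        elapsed = now - self.updated_at
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.updated_at = now
        if self.tokens >= amount:
            self.tokens -= amount
            return True
        return False


def make_limiter(capacity: int = 60, per_seconds: float = 60.0):
    """Hypothetical analogue of rate_limit() with a store private to each limiter."""
    buckets: dict[str, TokenBucket] = {}

    def check(user_id: str) -> tuple[bool, int]:
        bucket = buckets.get(user_id)
        if bucket is None:
            # Build the bucket only on first sight of this user.
            bucket = buckets[user_id] = TokenBucket(capacity, capacity / per_seconds)
        if bucket.consume():
            return True, 0
        # Seconds until one whole token is available, rounded up.
        wait = math.ceil((1 - bucket.tokens) / bucket.refill_rate)
        return False, max(1, wait)

    return check


reports_limit = make_limiter(capacity=2, per_seconds=60)
exports_limit = make_limiter(capacity=1, per_seconds=60)

reports_results = [reports_limit("alice") for _ in range(3)]
exports_result = exports_limit("alice")
```

The first two calls to reports_limit are allowed and the third is rejected with a wait of roughly 30 seconds, while exports_limit still admits the same user because it draws from its own store. The trade-off is that a user's total allowance becomes the sum across routes. If one shared budget per user is the goal, the single global dict in the listing is the simpler fit.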