python
30 lines · 7 steps
Batching an iterable for bulk indexing
A lazy chunking generator feeds fixed-size batches into a bulk index call and raises as soon as a batch reports per-item failures.
Explained by
highlit
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[list[T]]:
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    while batch := list(islice(iterator, size)):
        yield batch


def bulk_index_documents(documents: Iterable[dict], client, batch_size: int = 500) -> int:
    indexed = 0
    for batch in chunked(documents, batch_size):
        actions = [
            {"index": {"_id": doc["id"]}, "_source": doc}
            for doc in batch
        ]
        response = client.bulk(operations=actions)
        if response.get("errors"):
            failures = [
                item for item in response["items"]
                if item["index"]["status"] >= 400
            ]
            raise RuntimeError(f"{len(failures)} documents failed to index")
        indexed += len(batch)
    return indexed
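To check that the batching really is lazy, the following minimal sketch repeats `chunked` and feeds it a generator that records what it has produced. The `numbers` generator and the `pulled` list are illustrative names that do not appear in the original snippet.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def chunked(items: Iterable[T], size: int) -> Iterator[list[T]]:
    if size < 1:
        raise ValueError("size must be at least 1")
    iterator = iter(items)
    # islice pulls at most `size` items; an empty list ends the loop.
    while batch := list(islice(iterator, size)):
        yield batch


pulled: list[int] = []  # hypothetical log of every item the source has produced


def numbers(limit: int) -> Iterator[int]:
    """Illustrative source that records each item as it is consumed."""
    for n in range(limit):
        pulled.append(n)
        yield n


batches = chunked(numbers(7), 3)
first = next(batches)
# Snapshot taken after one batch: only three items have left the source.
pulled_after_first = list(pulled)
rest = list(batches)

print(first)               # [0, 1, 2]
print(pulled_after_first)  # [0, 1, 2]
print(rest)                # [[3, 4, 5], [6]]
```

The last batch is shorter than `size` when the input length is not a multiple of it, so downstream code should not assume every batch is full.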
Three takeaways
- Wrapping any iterable in a generator lets you batch streams without loading everything into memory.
- The walrus operator turns an islice-into-list loop into a clean batch-until-empty pattern.
- Bulk APIs need explicit response inspection, since an HTTP 200 response can still contain per-item failures.
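The last takeaway can be sketched with a stub client. `FakeBulkClient` and `failed_ids` are hypothetical names, not part of any real library; the response shape mirrors the `errors` and `items[...]["index"]["status"]` fields the snippet reads, and the `_id` key on each item is an added assumption.

```python
from typing import Any


class FakeBulkClient:
    """Hypothetical stand-in for a search client; not a real library API."""

    def __init__(self, reject_ids: set[str]) -> None:
        self.reject_ids = reject_ids

    def bulk(self, operations: list[dict[str, Any]]) -> dict[str, Any]:
        # The call itself "succeeds"; failures are reported per item.
        items = []
        for op in operations:
            doc_id = op["index"]["_id"]
            status = 400 if doc_id in self.reject_ids else 201
            items.append({"index": {"_id": doc_id, "status": status}})
        has_errors = any(item["index"]["status"] >= 400 for item in items)
        return {"errors": has_errors, "items": items}


def failed_ids(response: dict[str, Any]) -> list[str]:
    """Return the ids of items whose per-item status signals a failure."""
    if not response.get("errors"):
        return []
    return [
        item["index"]["_id"]
        for item in response["items"]
        if item["index"]["status"] >= 400
    ]


client = FakeBulkClient(reject_ids={"b"})
operations = [
    {"index": {"_id": doc_id}, "_source": {"id": doc_id}}
    for doc_id in ("a", "b", "c")
]
response = client.bulk(operations=operations)
failures = failed_ids(response)

print(response["errors"])  # True
print(failures)            # ['b']
```

Returning the failed ids instead of raising immediately is one possible design; it lets a caller retry or log individual documents, whereas the snippet above chooses to stop at the first batch with errors.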