python
33 lines · 7 steps
Two ways to dedupe while keeping order
Stream items past a set of seen keys to keep the first occurrence of each, or collect them in an OrderedDict to keep the last.
Explained by
highlit
from collections import OrderedDict
from typing import Callable, Hashable, Iterable, Iterator, TypeVar

T = TypeVar("T")


def dedupe(
    items: Iterable[T],
    key: Callable[[T], Hashable] | None = None,
) -> Iterator[T]:
    seen: set[Hashable] = set()
    extract = key or (lambda item: item)

    for item in items:
        marker = extract(item)
        if marker in seen:
            continue
        seen.add(marker)
        yield item


def dedupe_last_wins(
    items: Iterable[T],
    key: Callable[[T], Hashable],
) -> list[T]:
    collapsed: OrderedDict[Hashable, T] = OrderedDict()

    for item in items:
        marker = key(item)
        collapsed.pop(marker, None)
        collapsed[marker] = item

    return list(collapsed.values())
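A quick usage sketch may help show how the two functions differ on the same input. The sample rows and the choice of the first tuple element as the key are hypothetical, and both functions are repeated in condensed form (without type hints) only so the snippet runs on its own.

```python
from collections import OrderedDict


def dedupe(items, key=None):
    # Condensed copy of the first-wins generator above.
    seen = set()
    extract = key or (lambda item: item)
    for item in items:
        marker = extract(item)
        if marker not in seen:
            seen.add(marker)
            yield item


def dedupe_last_wins(items, key):
    # Condensed copy of the last-wins version above.
    collapsed = OrderedDict()
    for item in items:
        marker = key(item)
        collapsed.pop(marker, None)  # drop the old position, if any
        collapsed[marker] = item     # re-insert at the end
    return list(collapsed.values())


# Hypothetical sample data: (user_id, email) pairs with a repeated id.
rows = [(1, "a@old.example"), (2, "b@example"), (1, "a@new.example")]

first = list(dedupe(rows, key=lambda row: row[0]))
last = dedupe_last_wins(rows, key=lambda row: row[0])

print(first)  # [(1, 'a@old.example'), (2, 'b@example')]
print(last)   # [(2, 'b@example'), (1, 'a@new.example')]
```

First-wins keeps id 1 at its original position with the old email. Last-wins keeps the newer email and moves id 1 behind id 2, because the key is popped and re-inserted.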
Three takeaways
1. A set of computed keys lets you dedupe by a derived key while still yielding the original objects.
2. Generators make first-wins deduplication lazy: the input is consumed one item at a time and never materialized in full, and only the set of seen keys is held in memory.
3. Popping then re-inserting into an OrderedDict moves a key to the end, giving clean last-wins ordering.
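The laziness and reordering points can be checked with a small sketch. It assumes a condensed copy of `dedupe` (repeated so the snippet is self-contained) and a made-up endless input. It also compares pop-then-reinsert with `OrderedDict.move_to_end`, which is offered here only as an equivalent way to reorder a key, not as what the code above uses.

```python
from collections import OrderedDict
from itertools import count, islice


def dedupe(items, key=None):
    # Condensed copy of the first-wins generator shown earlier.
    seen = set()
    extract = key or (lambda item: item)
    for item in items:
        marker = extract(item)
        if marker not in seen:
            seen.add(marker)
            yield item


# Laziness: the source never ends, yet taking three results terminates
# because dedupe only pulls items as the consumer asks for them.
endless = (n % 5 for n in count())  # 0, 1, 2, 3, 4, 0, 1, ...
first_three = list(islice(dedupe(endless), 3))
print(first_three)  # [0, 1, 2]

# Reordering: pop-then-reinsert and move_to_end leave the same key order.
via_pop = OrderedDict.fromkeys("abc")
via_pop.pop("a", None)
via_pop["a"] = None

via_move = OrderedDict.fromkeys("abc")
via_move.move_to_end("a")

print(list(via_pop), list(via_move))  # ['b', 'c', 'a'] ['b', 'c', 'a']
```

One caveat with this particular input: the stream only ever produces five distinct values, so asking for a sixth result would loop forever while `dedupe` keeps skipping repeats.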