java
41 lines · 6 steps
Three ways to deduplicate a list by key in Java
A small utility class showing distinct-by-key, latest-per-key, and a custom collector that all dedupe streams without overriding equals.
Explained by
highlit
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

public final class Deduplication {

    public static <T, K> List<T> distinctByKey(List<T> items, Function<? super T, ? extends K> keyExtractor) {
        Set<K> seen = new HashSet<>();
        return items.stream()
                .filter(item -> seen.add(keyExtractor.apply(item)))
                .collect(Collectors.toList());
    }

    public static <T, K> List<T> latestPerKey(
            List<T> items,
            Function<? super T, ? extends K> keyExtractor,
            Comparator<? super T> recency) {
        return new ArrayList<>(items.stream()
                .collect(Collectors.toMap(
                        keyExtractor,
                        Function.identity(),
                        (existing, candidate) -> recency.compare(candidate, existing) >= 0 ? candidate : existing))
                .values());
    }

    public static <T> Collector<T, ?, List<T>> distinctByKeyCollector(Function<? super T, ?> keyExtractor) {
        return Collector.of(
                () -> new java.util.LinkedHashMap<Object, T>(),
                (map, item) -> map.putIfAbsent(keyExtractor.apply(item), item),
                (left, right) -> {
                    right.forEach(left::putIfAbsent);
                    return left;
                },
                map -> new ArrayList<>(map.values()));
    }
}
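A minimal usage sketch for the two first-wins approaches, assuming a hypothetical `Event` type and sample data that are not part of the original snippet. Compact copies of `distinctByKey` and `distinctByKeyCollector` are repeated so the example compiles on its own.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Set;
import java.util.function.Function;
import java.util.stream.Collector;
import java.util.stream.Collectors;

final class FirstSeenDemo {

    // Hypothetical sample type for illustration; not part of the original snippet.
    static final class Event {
        final String user;
        final long timestamp;

        Event(String user, long timestamp) {
            this.user = user;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return user + "@" + timestamp;
        }
    }

    // Compact copy of distinctByKey: Set.add returns false for a key that was already seen.
    static <T, K> List<T> distinctByKey(List<T> items, Function<? super T, ? extends K> keyExtractor) {
        Set<K> seen = new HashSet<>();
        return items.stream()
                .filter(item -> seen.add(keyExtractor.apply(item)))
                .collect(Collectors.toList());
    }

    // Compact copy of distinctByKeyCollector: putIfAbsent keeps the first item stored for each key.
    static <T> Collector<T, ?, List<T>> distinctByKeyCollector(Function<? super T, ?> keyExtractor) {
        return Collector.of(
                () -> new LinkedHashMap<Object, T>(),
                (map, item) -> map.putIfAbsent(keyExtractor.apply(item), item),
                (left, right) -> {
                    right.forEach(left::putIfAbsent);
                    return left;
                },
                map -> new ArrayList<>(map.values()));
    }

    // Hypothetical sample data: two events for "alice", one for "bob".
    static List<Event> sampleEvents() {
        return Arrays.asList(
                new Event("alice", 10),
                new Event("bob", 20),
                new Event("alice", 30));
    }

    public static void main(String[] args) {
        List<Event> events = sampleEvents();
        System.out.println(distinctByKey(events, e -> e.user));
        System.out.println(events.stream().collect(distinctByKeyCollector(e -> e.user)));
    }
}
```

Both calls print `[alice@10, bob@20]`: the first item seen for each key wins, and encounter order is kept. The `HashSet`-based filter is stateful, so this sketch assumes a sequential stream.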
Three takeaways
1. A HashSet's add return value doubles as a cheap first-seen test inside a filter on a sequential stream.
2. Collectors.toMap with a merge function lets you resolve key collisions by any rule you choose, such as recency.
3. A custom Collector backed by a LinkedHashMap can preserve encounter order while deduplicating, which toSet and the three-argument toMap do not guarantee.
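The second and third takeaways can be combined. The three-argument `Collectors.toMap` used by `latestPerKey` makes no guarantee about the type of map it returns, so the order of that method's result is unspecified. The sketch below is one possible variant, not part of the original class: it passes `LinkedHashMap::new` to the four-argument `toMap` overload so keys stay in first-encounter order. The names `latestPerKeyOrdered`, `Event`, and `BY_TIMESTAMP` are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BinaryOperator;
import java.util.function.Function;
import java.util.stream.Collectors;

final class OrderedLatestPerKey {

    // Hypothetical sample type for illustration; not part of the original snippet.
    static final class Event {
        final String user;
        final long timestamp;

        Event(String user, long timestamp) {
            this.user = user;
            this.timestamp = timestamp;
        }

        @Override
        public String toString() {
            return user + "@" + timestamp;
        }
    }

    // Hypothetical recency rule: a larger timestamp counts as newer.
    static final Comparator<Event> BY_TIMESTAMP = Comparator.comparingLong(e -> e.timestamp);

    // Hypothetical variant of latestPerKey that keeps keys in first-encounter order.
    static <T, K> List<T> latestPerKeyOrdered(
            List<T> items,
            Function<? super T, ? extends K> keyExtractor,
            Comparator<? super T> recency) {
        // On a key collision keep the item the comparator ranks as newer; ties go to the later item.
        BinaryOperator<T> keepNewer =
                (existing, candidate) -> recency.compare(candidate, existing) >= 0 ? candidate : existing;
        // LinkedHashMap::new as the map factory fixes the iteration order to key insertion order.
        Map<K, T> byKey = items.stream()
                .collect(Collectors.toMap(keyExtractor, Function.identity(), keepNewer, LinkedHashMap::new));
        return new ArrayList<>(byKey.values());
    }

    // Hypothetical sample data: "alice" appears twice, with the newer event last.
    static List<Event> sampleEvents() {
        return Arrays.asList(
                new Event("alice", 10),
                new Event("bob", 20),
                new Event("alice", 30),
                new Event("carol", 5));
    }

    public static void main(String[] args) {
        System.out.println(latestPerKeyOrdered(sampleEvents(), e -> e.user, BY_TIMESTAMP));
    }
}
```

With the sample data this prints `[alice@30, bob@20, carol@5]`: the newer `alice` event replaces the older one but stays in the position where the key was first seen.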