
Summarizing log files by date in Python

A regex parses timestamped log lines; the parsed entries are then grouped by day, and each day is rolled up into a small error report.

Explained by highlit
import re
from collections import defaultdict
from pathlib import Path

# One log line: ISO date, "T", time, an uppercase level, then the message.
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})T(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)


def group_by_date(log_path):
    """Parse the log file and bucket the matching lines by date string."""
    grouped = defaultdict(list)
    for raw in Path(log_path).read_text(encoding="utf-8").splitlines():
        match = LINE_RE.match(raw)
        if not match:
            # Skip anything that does not look like a timestamped entry.
            continue
        entry = match.groupdict()
        grouped[entry["date"]].append(entry)
    return grouped


def daily_error_summary(log_path):
    """Return per-day totals, error counts, and the first error message."""
    grouped = group_by_date(log_path)
    summary = {}
    # ISO dates sort chronologically as plain strings.
    for date in sorted(grouped):
        entries = grouped[date]
        errors = [e for e in entries if e["level"] in ("ERROR", "CRITICAL")]
        summary[date] = {
            "total": len(entries),
            "errors": len(errors),
            "first_error": errors[0]["message"] if errors else None,
        }
    return summary
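
To make the matching rules concrete, here is a minimal sketch that runs the same pattern over a few sample lines. The sample lines are invented for illustration and do not come from a real log; the pattern is repeated only so the sketch runs on its own.

```python
import re

# Same pattern as in the listing, repeated so this sketch is self-contained.
LINE_RE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2})T(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<level>[A-Z]+)\s+(?P<message>.*)$"
)

# Hypothetical sample lines, invented for illustration.
sample_lines = [
    "2024-03-01T09:15:02 INFO service started",
    "2024-03-01T09:15:07 ERROR database connection refused",
    "    at connect (db.py:41)",              # continuation line, no timestamp
    "2024-03-02T00:00:11 warning low disk",   # lowercase level, [A-Z]+ fails
]

parsed = []
skipped = []
for line in sample_lines:
    match = LINE_RE.match(line)
    if match:
        # groupdict() returns one dict keyed by the named groups.
        parsed.append(match.groupdict())
    else:
        skipped.append(line)

print(parsed[1])
print(len(skipped))
```

Two of the four lines are skipped: the indented continuation line and the one with a lowercase level. If a log uses mixed-case levels or multi-line messages, the pattern would need loosening, otherwise those lines silently drop out of the totals.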

Three takeaways
  1. Named regex groups turn a raw line into a labeled dict with match.groupdict() for free.
  2. defaultdict(list) lets you append into buckets without checking whether the key exists yet.
  3. Splitting parsing from summarizing keeps each function focused and independently testable.
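
The second takeaway can be illustrated with a small sketch that compares a plain dict with defaultdict(list). The entries are hypothetical, shaped like the dicts that match.groupdict() returns, and the caveat at the end is a general property of defaultdict rather than something the explainer itself raises.

```python
from collections import defaultdict

# Hypothetical parsed entries, invented for illustration.
entries = [
    {"date": "2024-03-01", "level": "INFO", "message": "service started"},
    {"date": "2024-03-01", "level": "ERROR", "message": "database connection refused"},
    {"date": "2024-03-02", "level": "CRITICAL", "message": "disk full"},
]

# With a plain dict, each append has to make sure the bucket exists first.
plain = {}
for entry in entries:
    plain.setdefault(entry["date"], []).append(entry)

# defaultdict(list) creates the empty list on first access to a missing key.
grouped = defaultdict(list)
for entry in entries:
    grouped[entry["date"]].append(entry)

same_result = dict(grouped) == plain

# Caveat: merely indexing a missing key also creates it, so prefer
# `in` or .get() when the goal is only to look something up.
has_day = "2024-03-03" in grouped   # membership test creates nothing
_ = grouped["2024-03-03"]           # indexing creates an empty bucket
created = "2024-03-03" in grouped

print(same_result, has_day, created)
```

Both loops build the same buckets. Because indexing adds keys, a caller who reads from the returned mapping by subscript could add empty days by accident; returning dict(grouped) from group_by_date would be one way to avoid that, if it matters for your use.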
