refactor: rewrite used car article, remove LLM confidence chat draft

2026-02-18 04:20:53 +01:00
parent 561a04fded
commit 3a25549811
3 changed files with 191 additions and 188 deletions

@@ -0,0 +1,191 @@
---
title: "2 Million Used Cars and What They Tell Us"
date: 2026-02-18
tags: ["Scraping", "Data Engineering", "Grafana", "PostgreSQL"]
summary: "Scraping ~2M used car listings, throwing them into a database, and seeing what shakes out."
code: ""
demo: ""
---
## The Question
Everyone's heard the legend: a VW Passat that just keeps going at 400,000 km.
But is it actually the only car that pulls that off? What other models quietly
rack up absurd mileage — and what do they cost? In short: what makes a car
*last*, and can you get one without overpaying?
Time to find out with data instead of hearsay.
## The Approach
A major used car platform turned out to be surprisingly cooperative when it came
to structured data. Their recommendation engine helpfully links to similar
listings — so starting from a search, you can just keep crawling through related
results.
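The "keep crawling through related results" idea is essentially a breadth-first walk over a link graph. A minimal sketch, assuming a hypothetical `fetch_related` callable that returns the IDs of similar listings for a given listing; the platform's real endpoints and ID format are not described here, so a tiny hand-written graph stands in for them:

```python
from collections import deque
from typing import Callable, Iterable


def crawl_related(
    seeds: Iterable[str],
    fetch_related: Callable[[str], list[str]],
    max_listings: int = 1000,
) -> list[str]:
    """Breadth-first walk over 'similar listing' links, visiting each ID once."""
    seen: set[str] = set()
    queue: deque[str] = deque()
    for seed in seeds:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)

    visited: list[str] = []
    while queue and len(visited) < max_listings:
        listing_id = queue.popleft()
        visited.append(listing_id)
        for related_id in fetch_related(listing_id):
            # The seen-set is what keeps the crawl from looping forever.
            if related_id not in seen:
                seen.add(related_id)
                queue.append(related_id)
    return visited


# Stand-in for the platform: a hand-written link graph with hypothetical IDs.
FAKE_LINKS = {"a": ["b", "c"], "b": ["a", "d"], "c": ["d"], "d": []}
order = crawl_related(["a"], lambda i: FAKE_LINKS.get(i, []))
print(order)
```

A real crawler would also need rate limiting and retry handling, which this sketch leaves out.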
The haul: roughly **2 million listings** from early 2026, downloaded as JSON and
loaded into a PostgreSQL database. Each listing carries make, model, fuel type,
price, mileage, registration date, location, and a handful of other fields —
enough to get interesting.
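Loading the JSON could look roughly like the sketch below. The field names and the two sample rows are made up for illustration, and `sqlite3` stands in for PostgreSQL so the example runs without a server. In PostgreSQL the equivalent of `INSERT OR IGNORE` would be `INSERT ... ON CONFLICT DO NOTHING`.

```python
import json
import sqlite3

# Two invented listings; the real JSON schema is an assumption.
RAW = """[
  {"id": "x1", "make": "Volvo", "model": "V70", "fuel": "diesel",
   "price_eur": 4500, "mileage_km": 310000, "first_registration": "2008-05"},
  {"id": "x2", "make": "Toyota", "model": "Yaris", "fuel": "petrol",
   "price_eur": 9900, "mileage_km": 62000, "first_registration": "2017-03"}
]"""

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE listings (
           id TEXT PRIMARY KEY, make TEXT, model TEXT, fuel TEXT,
           price_eur INTEGER, mileage_km INTEGER, first_registration TEXT)"""
)

rows = [
    (d["id"], d["make"], d["model"], d["fuel"],
     d["price_eur"], d["mileage_km"], d["first_registration"])
    for d in json.loads(RAW)
]

# A crawl that follows "similar" links will see the same listing many times,
# so the insert is run twice here to show that duplicates are skipped.
insert_sql = "INSERT OR IGNORE INTO listings VALUES (?, ?, ?, ?, ?, ?, ?)"
conn.executemany(insert_sql, rows)
conn.executemany(insert_sql, rows)

count = conn.execute("SELECT COUNT(*) FROM listings").fetchone()[0]
print(count)
```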
---
## Findings
### Price vs. Mileage
The obvious place to start: how does price relate to mileage?
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/0852019305114cd189aedb67dea27721" height="450" >}}
Two clusters jump out: low-mileage/high-price (garage queens) and
high-mileage/low-price (daily drivers priced to move). The vast majority sits in
modest territory on both axes. The plot caps at €1M and 1M km — beyond that, the
data is mostly typos and placeholder values.
Across the cleaned dataset, the average price lands at roughly **€28,400** and
the average mileage at **75,600 km**. That gives an overall ratio of about
**€375 per 1,000 km**. It is not a depreciation rate in the strict sense, but a
useful back-of-the-napkin metric for comparing brands.
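The cleaning step and the headline ratio can be sketched in a few lines. The €1M and 1M km caps come from the plot above; treating them as the cleaning rule, and requiring a positive price, are assumptions. The sample rows are invented.

```python
def clean(listings, max_price_eur=1_000_000, max_km=1_000_000):
    """Keep (price, km) pairs inside the plausibility window."""
    return [
        (price, km)
        for price, km in listings
        if 0 < price <= max_price_eur and 0 <= km <= max_km
    ]


def eur_per_1000_km(avg_price_eur, avg_mileage_km):
    """Ratio of the two averages, scaled to 1,000 km."""
    return avg_price_eur / avg_mileage_km * 1000


# Invented rows: the last two mimic a typo price and a placeholder mileage.
sample = [(15_000, 80_000), (42_000, 20_000), (99_999_999, 10), (5_000, 1_234_567)]
kept = clean(sample)

# Plugging in the two rounded averages quoted above.
ratio = eur_per_1000_km(28_400, 75_600)
print(kept, round(ratio, 1))
```

With the rounded averages as input, the ratio comes out slightly above 375, which matches the "about €375" figure. Note that this is a ratio of averages, not an average of per-car ratios; the two can differ noticeably.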
### Price per Kilometer by Brand
Of the **346 distinct makes**, 72 have more than 500 listings — enough for
meaningful statistics.
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/1777bb018e9b47639b93ef31d97f9c89" height="450" >}}
The ranking mirrors common perception — luxury brands dominate the top. But it
has a blind spot: **age**. BYD ranks just below Ferrari and Rolls-Royce, not
because it's a luxury vehicle, but because the average BYD listing is only **0.7
years** old (overall average: **6.7 years**). Give these newcomers time to
accumulate used inventory and their ratios will settle.
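A per-brand version of the same metric, with the listing-count threshold and the average age carried along so that young brands like BYD are easy to spot. This is a sketch under assumptions: the tuple layout `(make, price_eur, mileage_km, age_years)` is hypothetical, and the data is invented.

```python
from collections import defaultdict


def brand_ratios(listings, min_listings=500):
    """Return {make: (eur_per_1000_km, avg_age_years)} for makes above the threshold."""
    groups = defaultdict(list)
    for make, price_eur, mileage_km, age_years in listings:
        groups[make].append((price_eur, mileage_km, age_years))

    result = {}
    for make, rows in groups.items():
        # "More than 500 listings" is a strict comparison.
        if len(rows) <= min_listings:
            continue
        total_price = sum(r[0] for r in rows)
        total_km = sum(r[1] for r in rows)
        if total_km == 0:
            continue  # avoid dividing by zero for brands with only 0-km cars
        avg_age = sum(r[2] for r in rows) / len(rows)
        result[make] = (total_price / total_km * 1000, avg_age)
    return result


# Invented data; the threshold is lowered so the tiny sample passes it.
sample = [
    ("A", 30_000, 10_000, 0.5),
    ("A", 20_000, 15_000, 1.5),
    ("B", 5_000, 200_000, 12.0),
]
ratios = brand_ratios(sample, min_listings=1)
print(ratios)
```

Sorting the result by average age next to the ratio makes the blind spot visible: a brand with a very high ratio and a very low age is probably just new to the used market.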
### Depreciation
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/e999ce3c237b4cae95b3331331a26261" height="400" >}}
Average prices over mileage for Mercedes-Benz, Toyota, Volkswagen, and Volvo all
follow the same shape: steep early depreciation that gradually flattens out.
Brand-level differences show in the starting points and decay rates, but the
roughly exponential shape holds for all four.
The first 50k km wipe out **32–39%** of value across all brands: the new-car
premium evaporating. From 50k to 200k km, depreciation settles into a steadier
20–30% per bracket.
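The per-bracket figures are just percentage drops between consecutive average prices. A minimal sketch with invented bracket averages (the real per-brand values live in the dashboard, not in this text):

```python
def bracket_drops(avg_price_by_bracket):
    """Percent price drop from each bracket to the next (positive = cheaper)."""
    return [
        (prev - cur) / prev * 100
        for prev, cur in zip(avg_price_by_bracket, avg_price_by_bracket[1:])
    ]


# Invented averages for the 0-50k, 50-100k, 100-150k and 150-200k km brackets.
avg_prices = [40_000, 26_000, 19_500, 15_600]
drops = bracket_drops(avg_prices)
print([round(d, 1) for d in drops])
```

A negative value in the output means the next bracket is more expensive on average, which is exactly what shows up for some brands at extreme mileage.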
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/a801062ca49f49d0a1ef4f97ccb550c8" height="400" >}}
Grouping listings into 50k-km brackets also gives a rough "survival curve" — not
a true survival rate (it conflates production volume, usage patterns, and actual
longevity), but the *shape* tells a story. Volvo shows the gentlest decline: 39%
of listings remain at 100k km, still 6.6% at 250k. Toyota drops steeply early
but then shows a stubbornly flat tail — 51 listings at 400k, 33 at 450k. Once a
Toyota survives the initial culling, it apparently just keeps going.
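The rough "survival curve" boils down to counting listings per 50k-km bracket and expressing each count relative to a baseline. Using the first bracket as the baseline is an assumption here, and the mileages are invented:

```python
from collections import Counter


def bracket_counts(mileages_km, width_km=50_000):
    """Count listings per mileage bracket, starting at 0 km."""
    counts = Counter(km // width_km for km in mileages_km)
    if not counts:
        return []
    # Fill gaps so empty brackets show up as zero instead of being skipped.
    return [counts.get(i, 0) for i in range(max(counts) + 1)]


def survival_curve(counts):
    """Each bracket's count as a percentage of the first bracket."""
    base = counts[0]
    return [c / base * 100 for c in counts]


mileages = [10_000, 20_000, 49_999, 50_000, 120_000]
counts = bracket_counts(mileages)
curve = survival_curve(counts)
print(counts, [round(p, 1) for p in curve])
```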
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/38c5b2b61b8542b4aedc91b947253991" height="400" >}}
Past 250k km, the numbers get chaotic — small sample sizes let a handful of
collector cars or overpriced outliers distort the averages. Toyota is the most
striking case: past 250k, depreciation goes consistently *negative* — prices
*increase* from €7,775 to €13,481 at 450k. The Land Cruiser effect: the Toyotas
that survive to extreme mileage are the models with cult followings, commanding
premiums *because* they've proven their durability. Past 250k, survivor bias
takes over — the remaining cars aren't representative anymore.
### The Survivors: Cars Beyond 250k km
Out of the ~2 million listings, roughly **55,000** have a mileage of 250,000 km
or more. That's about 3% of the dataset — a small slice, but more than enough to
work with.
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/f9fcbcca31e64065a8e9970b88a8a999" height="400" >}}
#### Who Makes It This Far?
In absolute numbers, the usual suspects lead: Mercedes (8,900 listings),
Volkswagen (8,200), BMW (6,700), Audi (5,300). Together they account for over
half of all high-mileage entries. Renault's survivors average 362k km — the
highest on the list — and BMW and Porsche carry remarkably high average prices
past 250k (€24,600 and €27,300), hinting at maintained enthusiast cars rather
than beaters.
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/6c4d0d116ba746f18c908ff5d4037da8" height="400" >}}
But raw counts are misleading. More telling is what *share* of a brand's
listings is high-mileage. That flips the picture: **Saab** leads at **32.6%**,
almost one in three. No new cars since 2012 means the surviving inventory is old
by definition, but it also speaks to owners who keep these cars running well
past their expected shelf life. Iveco follows at 21% (commercial vehicles, no
surprise). Then a cluster at 5–6%: **Volvo, Honda, Subaru, and Mercedes-Benz**,
the brands enthusiasts *claim* are built to last, and the data is at least
consistent.
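Computing the share instead of the raw count is a small group-by. A sketch, assuming listings arrive as `(group, mileage_km)` pairs, where the group can be a make or a fuel type; the sample values are invented:

```python
from collections import defaultdict


def high_mileage_share(listings, threshold_km=250_000):
    """Return {group: percent of that group's listings at or above the threshold}."""
    totals = defaultdict(int)
    high = defaultdict(int)
    for group, mileage_km in listings:
        totals[group] += 1
        if mileage_km >= threshold_km:
            high[group] += 1
    return {group: high[group] / totals[group] * 100 for group in totals}


sample = [
    ("Saab", 300_000),
    ("Saab", 100_000),
    ("Saab", 90_000),
    ("Toyota", 40_000),
]
shares = high_mileage_share(sample)
print({group: round(pct, 1) for group, pct in shares.items()})
```

The denominator is the brand's listings on this platform, so a brand that stopped selling new cars gets a high share almost automatically.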
Notably absent: **Toyota**. Only about 2% of Toyota listings cross 250k — below
the dataset average. That doesn't mean Toyotas die young; they may get exported
or simply leave this platform. The data can't distinguish "the car died" from
"the car left the market."
#### Diesel vs. Petrol
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/b721ad75bf764693ba6243714899e0f0" height="400" >}}
Diesel dominates the high-mileage segment: **6.75%** of diesel listings cross
250k km, compared to just **1.58%** for petrol — a 4× overrepresentation. That's
consistent with the "diesel lasts longer" narrative, though not proof: diesel
cars also tend to be driven more (company cars, long-distance commuters).
The real outlier is **CNG at 10.52%** — higher than diesel, though from a small
sample of 5,500 listings (mostly fleet cars and taxis). Electric cars and
hybrids are virtually absent at 0.07% and 0.16% — simply too young to have
accumulated that kind of mileage.
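The overrepresentation factors follow directly from the quoted shares. The sketch below also adds a rough uncertainty estimate for the CNG share using the plain binomial standard error. That treats listings as independent draws, which is an assumption and probably optimistic for fleet cars sold in batches.

```python
import math

# Shares of listings at or above 250k km, in percent, as quoted in the text.
SHARE_PCT = {"diesel": 6.75, "petrol": 1.58, "cng": 10.52, "electric": 0.07, "hybrid": 0.16}

# Each fuel type relative to petrol.
relative = {fuel: round(pct / SHARE_PCT["petrol"], 2) for fuel, pct in SHARE_PCT.items()}


def share_std_error_pct(share_pct, n):
    """Binomial standard error of a percentage, in percentage points."""
    p = share_pct / 100
    return math.sqrt(p * (1 - p) / n) * 100


cng_se = share_std_error_pct(10.52, 5_500)
print(relative["diesel"], relative["cng"], round(cng_se, 2))
```

Under that independence assumption, the CNG share is uncertain by well under one percentage point, so the gap to diesel is unlikely to be pure sampling noise. Whether it reflects durability or just taxi usage is a separate question.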
#### What Do They Cost?
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/543b7c49e5f742468f1d90b112cec799" height="400" >}}
**Porsche** stands alone: a median of **€21,000** even past 250k km. These are
presumably maintained sports cars that hold their value regardless of mileage.
**Land Rover** (€11,000 median) and **Iveco** (€12,000) also hold surprising
value, thanks to cult status and commercial utility, respectively.
Most brands cluster in the €3,000–€8,000 range. At the bottom: **Opel** at
€2,650, **Peugeot** and **SEAT** at €3,000 — the genuine bargain-bin survivors.
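This section quotes medians, not averages, and with so few cars per brand at this mileage that choice matters. A quick illustration with invented prices, where one collector car sits among four ordinary ones:

```python
from statistics import mean, median

# Invented prices in EUR: four ordinary high-mileage cars and one outlier.
prices_eur = [2_500, 3_000, 3_200, 4_000, 45_000]

typical = median(prices_eur)  # barely moved by the outlier
average = mean(prices_eur)    # pulled far upward by the outlier
print(typical, average)
```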
---
## So What Should You Buy?
Two million listings, a few dozen queries, and a lot of scatter plots later —
what's the actual answer?
It depends on what you're optimizing for.
**If you want the car that refuses to die:** look at diesel Volvo, Mercedes, or
Subaru in the 150–200k km range. These brands show up disproportionately at
extreme mileage, and their survival curves decline more gently than most. A
diesel Mercedes with 180k km on the clock isn't halfway through its life — it
might be a third of the way.
**If you want the best deal per remaining kilometer:** the sweet spot sits around
100–150k km for most brands. The steepest depreciation is already behind you
(that 32–39% first-bracket hit), but the car still has plenty of mechanical life
left. Volkswagen and Opel are the value picks here — median prices in the low
single-digit thousands, well-understood mechanicals, and parts that are cheap
and everywhere.
**If you want something that holds its value no matter what:** Toyota and
Porsche. Toyota's prices actually *increase* past 250k km — the Land Cruiser
effect — and Porsche commands a €21k median even at quarter-million mileage.
You're not buying depreciation; you're buying into a secondary market where
demand outstrips supply.
**If you just want cheap and functional:** Opel, SEAT, or Peugeot past 200k km.
Past 250k km, their median prices sit between €2,650 and €3,000. Nobody will envy your car. Nobody will
steal it either. And it'll probably keep running.
One thing the data can't tell you: the condition of any *specific* car. Averages
are averages. A well-maintained Fiat at 300k km will outlast a neglected BMW at
100k every time. The numbers here describe the market, not the machine in front
of you. For that, you still need a mechanic — preferably one who doesn't own a
dealership.