---
title: "2 Million Used Cars and What They Tell Us"
date: 2026-02-18
tags: ["Scraping", "Data Engineering", "Grafana", "PostgreSQL"]
summary: "Scraping ~2M used car listings, throwing them into a database, and seeing what shakes out."
code: ""
demo: ""
---

## The Question

Everyone's heard the legend: a VW Passat that just keeps going at 400,000 km.
But is it actually the only car that pulls that off? What other models quietly
rack up absurd mileage — and what do they cost? In short: what makes a car
*last*, and can you get one without overpaying?

Time to find out with data instead of hearsay.

## The Approach

A major used car platform turned out to be surprisingly cooperative when it came
to structured data. Their recommendation engine helpfully links to similar
listings — so starting from a search, you can just keep crawling through related
results.

The haul: roughly **2 million listings** from early 2026, downloaded as JSON and
loaded into a PostgreSQL database. Each listing carries make, model, fuel type,
price, mileage, registration date, location, and a handful of other fields —
enough to get interesting.

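Following "similar listings" links from a starting search is, in effect, a breadth-first walk over a graph of listings. Here is a minimal sketch of that idea, not the actual scraper. `fetch_listing` and the `related_ids` field are hypothetical names, and a small in-memory dict stands in for real HTTP requests. The `seen` set is what keeps the crawl from looping when listings recommend each other.

```python
from collections import deque
from typing import Callable, Iterable


def crawl(seed_ids: Iterable[str],
          fetch_listing: Callable[[str], dict],
          max_listings: int = 1000) -> dict[str, dict]:
    """Breadth-first crawl over the 'similar listings' graph.

    fetch_listing is a hypothetical callable: it takes a listing ID and
    returns that listing as a dict containing a 'related_ids' list.
    """
    seeds = list(dict.fromkeys(seed_ids))  # drop duplicate seeds, keep order
    seen = set(seeds)
    queue = deque(seeds)
    results: dict[str, dict] = {}

    while queue and len(results) < max_listings:
        listing_id = queue.popleft()
        listing = fetch_listing(listing_id)
        results[listing_id] = listing
        for related_id in listing.get("related_ids", []):
            if related_id not in seen:  # never enqueue a listing twice
                seen.add(related_id)
                queue.append(related_id)
    return results


# Toy in-memory "site" standing in for the real platform (made-up data).
FAKE_SITE = {
    "a": {"make": "Volvo", "related_ids": ["b", "c"]},
    "b": {"make": "Toyota", "related_ids": ["a", "c"]},
    "c": {"make": "Saab", "related_ids": ["d"]},
    "d": {"make": "Opel", "related_ids": []},
}

listings = crawl(["a"], FAKE_SITE.__getitem__)
print(sorted(listings))
```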

---

## Findings

### Price vs. Mileage

The obvious place to start: how does price relate to mileage?

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/0852019305114cd189aedb67dea27721" height="450" >}}

Two clusters jump out: low-mileage/high-price (garage queens) and
high-mileage/low-price (daily drivers priced to move). The vast majority sits in
modest territory on both axes. The plot caps at €1M and 1M km — beyond that, the
data is mostly typos and placeholder values.

Across the cleaned dataset, averages land at roughly **€28,400** and **75,600
km**. That gives an overall ratio of about **€375 per 1,000 km** — not a
depreciation rate in the strict sense, but a useful back-of-the-napkin metric
for comparing brands.

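The ratio is just average price divided by average mileage, scaled to 1,000 km. A quick check of the arithmetic, using the rounded averages quoted above (so expect the result to be off by a euro or so compared with a figure computed from unrounded values):

```python
AVG_PRICE_EUR = 28_400    # rounded average price from the text
AVG_MILEAGE_KM = 75_600   # rounded average mileage from the text

# Euros of asking price per 1,000 km on the odometer.
eur_per_1000_km = AVG_PRICE_EUR * 1_000 / AVG_MILEAGE_KM
print(f"about {eur_per_1000_km:.0f} EUR per 1,000 km")
```

This is a ratio of two averages, not an average of per-listing ratios. The two can differ noticeably when a dataset contains many near-zero-mileage listings.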
### Price per Kilometer by Brand

Of the **346 distinct makes**, 72 have more than 500 listings — enough for
meaningful statistics.

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/1777bb018e9b47639b93ef31d97f9c89" height="450" >}}

The ranking mirrors common perception — luxury brands dominate the top. But it
has a blind spot: **age**. BYD ranks just below Ferrari and Rolls-Royce, not
because it's a luxury vehicle, but because the average BYD listing is only **0.7
years** old (overall average: **6.7 years**). Give these newcomers time to
accumulate used inventory and their ratios will settle.

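A per-brand version of the metric might look like the sketch below. It assumes the ratio is computed as total price over total mileage per make (equivalent to average price over average mileage), which the post does not spell out. The field names `make`, `price_eur`, and `mileage_km` are illustrative, and the toy rows are made up.

```python
from collections import defaultdict


def eur_per_1000_km_by_make(listings, min_listings=500):
    """Return {make: EUR per 1,000 km} for makes with more than min_listings rows."""
    totals = defaultdict(lambda: [0, 0.0, 0.0])  # count, price sum, mileage sum
    for row in listings:
        entry = totals[row["make"]]
        entry[0] += 1
        entry[1] += row["price_eur"]
        entry[2] += row["mileage_km"]
    return {
        make: price_sum * 1_000 / km_sum
        for make, (count, price_sum, km_sum) in totals.items()
        if count > min_listings and km_sum > 0  # skip thin or zero-mileage makes
    }


# Made-up rows; the threshold is lowered so the toy data passes the filter.
toy = [
    {"make": "A", "price_eur": 30_000, "mileage_km": 10_000},
    {"make": "A", "price_eur": 10_000, "mileage_km": 90_000},
    {"make": "B", "price_eur": 5_000, "mileage_km": 200_000},
]
ratios = eur_per_1000_km_by_make(toy, min_listings=1)
print(ratios)
```

Make "B" has only one row, so it falls below the threshold and is dropped. That mirrors how the 500-listing cutoff keeps rare makes from polluting the ranking.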
### Depreciation

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/e999ce3c237b4cae95b3331331a26261" height="400" >}}

Average prices over mileage for Mercedes-Benz, Toyota, Volkswagen, and Volvo all
follow the same shape: steep early depreciation that gradually flattens out.
Brand-level differences show in the starting points and decay rates, but the
roughly exponential form is shared by all four.

The first 50k km wipe out **32–39%** of value across these four brands — the
new-car premium evaporating. From 50k to 200k km, depreciation settles into a
steadier 20–30% per 50k-km bracket.

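Per-bracket depreciation here means the percentage drop in average price from one mileage bracket to the next. A minimal sketch of that calculation follows. The prices are made-up placeholders for three consecutive 50k-km brackets, not values from the dataset. A negative result would mean the average price went up between brackets.

```python
def bracket_depreciation(avg_prices):
    """Percentage of value lost from each bracket to the next.

    avg_prices: average price per consecutive mileage bracket, lowest first.
    Negative values mean the average price rose between brackets.
    """
    return [
        (prev - curr) / prev * 100
        for prev, curr in zip(avg_prices, avg_prices[1:])
    ]


# Made-up averages for the 0-50k, 50-100k, and 100-150k km brackets.
toy_prices = [40_000, 26_000, 19_500]
drops = [round(d, 1) for d in bracket_depreciation(toy_prices)]
print(drops)
```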

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/a801062ca49f49d0a1ef4f97ccb550c8" height="400" >}}

Grouping listings into 50k-km brackets also gives a rough "survival curve" — not
a true survival rate (it conflates production volume, usage patterns, and actual
longevity), but the *shape* tells a story. Volvo shows the gentlest decline: 39%
of listings remain at 100k km, still 6.6% at 250k. Toyota drops steeply early
but then shows a stubbornly flat tail — 51 listings at 400k, 33 at 450k. Once a
Toyota survives the initial culling, it apparently just keeps going.

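One way to build such a curve is to take, for each mileage threshold, the share of a brand's listings at or above it. That is an assumption about how the chart was produced; the post only gives the bracket width. The mileages below are made up for illustration.

```python
def survival_curve(mileages_km, thresholds_km):
    """Share of listings whose mileage is at or above each threshold."""
    if not mileages_km:
        raise ValueError("need at least one listing")
    total = len(mileages_km)
    return {
        threshold: sum(km >= threshold for km in mileages_km) / total
        for threshold in thresholds_km
    }


# Made-up mileages for a single brand; not taken from the dataset.
toy_mileages = [20_000, 60_000, 120_000, 260_000]
curve = survival_curve(toy_mileages, thresholds_km=range(0, 300_000, 50_000))
for threshold, share in curve.items():
    print(f">= {threshold:>7,} km: {share:.0%}")
```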

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/38c5b2b61b8542b4aedc91b947253991" height="400" >}}

Past 250k km, the numbers get chaotic — small sample sizes let a handful of
collector cars or overpriced outliers distort the averages. Toyota is the most
striking case: past 250k, depreciation goes consistently *negative* — prices
*increase* from €7,775 to €13,481 at 450k. The Land Cruiser effect: the Toyotas
that survive to extreme mileage are the models with cult followings, commanding
premiums *because* they've proven their durability. Past 250k, survivor bias
takes over — the remaining cars aren't representative anymore.

### The Survivors: Cars Beyond 250k km

Out of the ~2 million listings, roughly **55,000** have a mileage of 250,000 km
or more. That's about 3% of the dataset — a small slice, but more than enough to
work with.

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/f9fcbcca31e64065a8e9970b88a8a999" height="400" >}}

#### Who Makes It This Far?

In absolute numbers, the usual suspects lead: Mercedes (8,900 listings),
Volkswagen (8,200), BMW (6,700), Audi (5,300). Together they account for over
half of all high-mileage entries. Renault's survivors average 362k km — the
highest on the list — and BMW and Porsche carry remarkably high average prices
past 250k (€24,600 and €27,300), hinting at maintained enthusiast cars rather
than beaters.

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/6c4d0d116ba746f18c908ff5d4037da8" height="400" >}}

But raw counts are misleading. More telling is what *share* of a brand's
listings are high-mileage. That flips the picture: **Saab** leads at **32.6%** —
almost one in three. No new cars since 2012 means the surviving inventory is old
by definition, but it also speaks to owners who keep these cars running well
past their expected shelf life. Iveco follows at 21% (commercial vehicles, no
surprise). Then a cluster at 5–6%: **Volvo, Honda, Subaru, and Mercedes-Benz** —
the brands enthusiasts *claim* are built to last, and the data is at least
consistent.

Notably absent: **Toyota**. Only about 2% of Toyota listings cross 250k — below
the dataset average. That doesn't mean Toyotas die young; they may get exported
or simply leave this platform. The data can't distinguish "the car died" from
"the car left the market."

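The share metric divides each brand's high-mileage count by that brand's total count. Dataset-wide, the rounded figures give 55,000 / 2,000,000, or about 2.75%, which is the baseline the brand shares are compared against. A minimal sketch, assuming illustrative field names (`make`, `mileage_km`) and made-up rows:

```python
from collections import Counter

HIGH_MILEAGE_KM = 250_000  # threshold used throughout this section


def high_mileage_share(listings, key="make", threshold_km=HIGH_MILEAGE_KM):
    """Return {group: share of that group's listings at or above threshold_km}."""
    totals = Counter()
    high = Counter()
    for row in listings:
        totals[row[key]] += 1
        if row["mileage_km"] >= threshold_km:
            high[row[key]] += 1
    return {group: high[group] / totals[group] for group in totals}


# Made-up rows for illustration only.
toy = [
    {"make": "Saab", "mileage_km": 310_000},
    {"make": "Saab", "mileage_km": 180_000},
    {"make": "Saab", "mileage_km": 90_000},
    {"make": "Toyota", "mileage_km": 40_000},
    {"make": "Toyota", "mileage_km": 120_000},
]
shares = high_mileage_share(toy)
print({make: f"{share:.1%}" for make, share in shares.items()})
```

Because the grouping column is a parameter, the same function works for any categorical field, for example a fuel-type column, if the rows carry one.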
#### Diesel vs. Petrol

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/b721ad75bf764693ba6243714899e0f0" height="400" >}}

Diesel dominates the high-mileage segment: **6.75%** of diesel listings cross
250k km, compared to just **1.58%** for petrol — roughly a 4× overrepresentation
(6.75 / 1.58 ≈ 4.3). That's consistent with the "diesel lasts longer" narrative,
though not proof: diesel cars also tend to be driven more (company cars,
long-distance commuters).

The real outlier is **CNG at 10.52%** — higher than diesel, though from a small
sample of 5,500 listings (mostly fleet cars and taxis). Electric cars and
hybrids are virtually absent at 0.07% and 0.16% — simply too young to have
accumulated that kind of mileage.

#### What Do They Cost?

{{< grafana url="https://gr.eliaskohout.de/public-dashboards/543b7c49e5f742468f1d90b112cec799" height="400" >}}

**Porsche** stands alone: a median of **€21,000** even past 250k km — these look
like well-maintained sports cars that hold their value regardless of mileage.
**Land Rover** (€11,000 median) and **Iveco** (€12,000) also hold surprising
value — cult status and commercial utility, respectively.

Most brands cluster in the €3,000–€8,000 range. At the bottom: **Opel** at
€2,650, **Peugeot** and **SEAT** at €3,000 — the genuine bargain-bin survivors.

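This section reports medians, while the earlier high-mileage figures were averages. A toy example with made-up prices (not from the dataset) shows why the two can diverge: a single expensive listing drags the mean up sharply but barely moves the median, so the median is the safer summary for small, outlier-prone samples.

```python
from statistics import mean, median

# Made-up asking prices (EUR) for one brand's high-mileage listings.
# Four ordinary cars and one collector-grade outlier.
toy_prices = [3_000, 3_500, 4_000, 4_500, 60_000]

avg_price = mean(toy_prices)       # pulled upward by the outlier
median_price = median(toy_prices)  # the middle listing, unaffected by it

print(f"mean:   {avg_price:,.0f} EUR")
print(f"median: {median_price:,.0f} EUR")
```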

---

## So What Should You Buy?

Two million listings, a few dozen queries, and a lot of scatter plots later —
what's the actual answer?

It depends on what you're optimizing for.

**If you want the car that refuses to die:** look at diesel Volvo, Mercedes, or
Subaru in the 150–200k km range. These brands show up disproportionately at
extreme mileage, and their survival curves decline more gently than most. A
diesel Mercedes with 180k km on the clock isn't halfway through its life — it
might be a third of the way.

**If you want the best deal per remaining kilometer:** the sweet spot sits around
100–150k km for most brands. The steepest depreciation is already behind you
(that 32–39% first-bracket hit), but the car still has plenty of mechanical life
left. Volkswagen and Opel are the value picks here — median prices in the low
single-digit thousands, well-understood mechanicals, and parts that are cheap
and everywhere.

**If you want something that holds its value no matter what:** Toyota and
Porsche. Toyota's prices actually *increase* past 250k km — the Land Cruiser
effect — and Porsche commands a €21k median even at quarter-million mileage.
You're not buying depreciation; you're buying into a secondary market where
demand outstrips supply.

**If you just want cheap and functional:** Opel, SEAT, or Peugeot past 200k km.
Past the 250k mark, median prices sit between €2,650 and €3,000. Nobody will
envy your car. Nobody will steal it either. And it'll probably keep running.

One thing the data can't tell you: the condition of any *specific* car. Averages
are averages. A well-maintained Fiat at 300k km will outlast a neglected BMW at
100k every time. The numbers here describe the market, not the machine in front
of you. For that, you still need a mechanic — preferably one who doesn't own a
dealership.