| title | draft | date | tags | summary | code | demo |
|---|---|---|---|---|---|---|
| 2 Million Used Cars and What They Tell Us | true | 2026-02-12 | | Scraping ~2M used car listings, throwing them into a database, and seeing what shakes out. | | |
The Question
Everyone's heard the legend: a VW Passat that just keeps going at 400,000 km. But is it actually the only car that pulls that off? What other models quietly rack up absurd mileage — and what do they cost? In short: what makes a car last, and can you get one without overpaying?
Time to find out with data instead of hearsay.
The Approach
A major used car platform turned out to be surprisingly cooperative when it came to structured data. Their recommendation engine helpfully links to similar listings — so starting from a search, you can just keep crawling through related results.
The haul: roughly 2 million listings from early 2026, downloaded as JSON and loaded into a PostgreSQL database. At that point, the recommendation graph stopped surfacing new entries — a second pass would likely uncover more, since new listings appear daily. But 2M felt like a solid starting point.
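The crawl described above amounts to a breadth-first walk over the "similar listings" graph. Here is a minimal sketch of that idea, not the scraper actually used: `crawl_related`, `fetch_related`, and the toy graph are hypothetical names, and a real fetch function would do the HTTP request and JSON parsing.

```python
from collections import deque
from typing import Callable, Deque, Iterable, Optional, Set


def crawl_related(
    seed_ids: Iterable[str],
    fetch_related: Callable[[str], Iterable[str]],
    max_listings: Optional[int] = None,
) -> Set[str]:
    """Breadth-first walk: follow recommendations until nothing new turns up."""
    seen: Set[str] = set()
    queue: Deque[str] = deque()
    for seed in seed_ids:
        if seed not in seen:
            seen.add(seed)
            queue.append(seed)
    while queue:
        current = queue.popleft()
        for neighbour in fetch_related(current):
            if neighbour in seen:
                continue
            if max_listings is not None and len(seen) >= max_listings:
                return seen
            seen.add(neighbour)
            queue.append(neighbour)
    return seen


# Toy stand-in for the platform (assumption): listing id -> recommended ids.
toy_graph = {"a": ["b", "c"], "b": ["c", "d"], "c": ["a"], "d": [], "e": ["a"]}
found = crawl_related(["a"], lambda listing_id: toy_graph.get(listing_id, []))
print(sorted(found))  # ['a', 'b', 'c', 'd']
```

Listing `e` is never recommended by anything reachable from the seed, so the walk never finds it. That is the same reason the crawl eventually stops surfacing new entries even though more listings exist.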
The Data
2,046,879 listings, most of them containing the following fields (among others):
make · model · model_variant · fuel · price · mileage · power_kw ·
transmission_type · number_of_cylinders · body_type · body_color ·
first_registration_date · number_of_previous_owners · is_roadworthy ·
is_currently_damaged · usage_state · type · zip_code · country_code ·
city
That's enough to get interesting.
About 119,500 listings were missing either price or mileage — not entirely clear why, but with nearly 2 million records left, it's barely a dent.
Findings
Price vs. Mileage
The obvious place to start: how does price relate to mileage?
{{< grafana url="https://gr.eliaskohout.de/public-dashboards/0852019305114cd189aedb67dea27721" height="450" >}}
Two hot spots jump out immediately. One in the low-mileage/high-price corner, one in the high-mileage/low-price corner — exactly what you'd expect. Expensive special cars that barely leave the garage, and daily drivers with six-figure odometers priced to move.
The vast majority of listings, though, cluster in relatively low-price and low-mileage territory compared to the extremes.
A note on clipping. The plot caps at €1,000,000 and 1,000,000 km because the tails get absurd. The highest listed price was €999,999,999, obviously not real. Six listings exceeded €10 million, none of them serious. A handful between €1M and €10M could be genuine exotics. On the mileage side, the maximum was 100,000,000 km. The highest plausible reading I found was an Iveco truck at 897,000 km on the odometer. The roughly 570 listings beyond that appeared to be typos or placeholder values for "mileage unknown."
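A filter in this spirit could look like the sketch below. The thresholds are simply the plot caps; whether the cleaned dataset used exactly these bounds is an assumption, as are the function names and the rule that a price must be positive.

```python
from typing import Iterable, List, Optional, Tuple

# Assumption: reuse the plot caps as plausibility bounds.
MAX_PRICE_EUR = 1_000_000
MAX_MILEAGE_KM = 1_000_000

Row = Tuple[Optional[float], Optional[float]]  # (price in EUR, mileage in km)


def is_plausible(price: Optional[float], mileage: Optional[float]) -> bool:
    """Reject rows with missing values or values outside the caps."""
    if price is None or mileage is None:
        return False
    return 0 < price <= MAX_PRICE_EUR and 0 <= mileage <= MAX_MILEAGE_KM


def clean(rows: Iterable[Row]) -> List[Row]:
    return [row for row in rows if is_plausible(*row)]


sample = [
    (24_500, 82_000),        # ordinary listing (invented values)
    (999_999_999, 50_000),   # placeholder-style price, mileage invented
    (18_000, 897_000),       # mileage of the Iveco truck, price invented
    (3_000, 100_000_000),    # placeholder-style mileage, price invented
    (None, 120_000),         # missing price
]
kept = clean(sample)
print(kept)  # [(24500, 82000), (18000, 897000)]
```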
Across the cleaned dataset, averages land at roughly €28,400 for price and 75,600 km for mileage. The standard deviations are enormous — €731k and 109k km respectively — which tells you just how wide the spread really is.
That gives an overall average ratio of about €375 per 1,000 km. In other words: for each 1,000 km on the odometer, the average listing costs about €375. This isn't a depreciation rate in the strict sense — we're looking at a cross-sectional snapshot of listed prices, not tracking individual cars losing value over time. But it turns out to be a useful back-of-the-napkin metric for comparing brands.
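The arithmetic behind that figure, as a small sketch. The quoted averages reproduce the ratio when the average price is divided by the average mileage, so that is the reading assumed here. The function names and the two toy listings are made up for illustration.

```python
from typing import List, Tuple

Listing = Tuple[float, float]  # (price in EUR, mileage in km)

# Rounded averages quoted in the text.
avg_price_eur = 28_400
avg_mileage_km = 75_600
eur_per_1000_km = avg_price_eur / avg_mileage_km * 1000
print(f"{eur_per_1000_km:.1f}")  # 375.7


def ratio_of_averages(listings: List[Listing]) -> float:
    """Average price divided by average mileage, in EUR per 1,000 km."""
    mean_price = sum(p for p, _ in listings) / len(listings)
    mean_km = sum(m for _, m in listings) / len(listings)
    return mean_price / mean_km * 1000


def mean_of_ratios(listings: List[Listing]) -> float:
    """Average of per-listing price/mileage; needs mileage > 0."""
    return sum(p / m * 1000 for p, m in listings) / len(listings)


# Toy data (invented): one nearly new car, one high-mileage car.
toy = [(30_000, 10), (10_000, 200_000)]
print(ratio_of_averages(toy))
print(mean_of_ratios(toy))
```

The second function shows why the choice matters: a single nearly new car with 10 km on the odometer dominates the per-listing average, while the ratio of averages stays in a sensible range.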
By Brand
The dataset contains 346 distinct makes. Of those, 72 (~21%) have more than 500 listings — enough for halfway meaningful statistics. The rest are too sparse to generalize from, so brand-level analysis focuses on these 72.
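The cutoff itself is a simple count per make. A sketch with an invented helper name and toy data, using a threshold of 1 instead of 500 so the example stays small:

```python
from collections import Counter
from typing import Dict, Iterable


def makes_with_enough_listings(makes: Iterable[str], min_listings: int = 500) -> Dict[str, int]:
    """Keep only makes that have MORE than min_listings listings."""
    counts = Counter(makes)
    return {make: n for make, n in counts.items() if n > min_listings}


# Toy column of makes (invented), one entry per listing.
toy_makes = ["Volkswagen"] * 3 + ["BMW"] * 2 + ["Iveco"]
print(makes_with_enough_listings(toy_makes, min_listings=1))
# {'Volkswagen': 3, 'BMW': 2}

# Share of makes that pass the real cutoff, from the numbers in the text.
share_percent = round(72 / 346 * 100, 1)
print(share_percent)  # 20.8
```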
{{/* dashboard on price per mileage; table with make and euro/km; ordered by price per mileage */}} {{< grafana url="https://gr.eliaskohout.de/public-dashboards/1777bb018e9b47639b93ef31d97f9c89" height="450" >}}
The ranking roughly mirrors the common perception of luxury brands — makes with a reputation for being expensive also tend to show up with high price-per-km values. No surprises there.
But this metric has a blind spot: age. Brands with almost no older cars on the market look disproportionately expensive per km, simply because their listings haven't had time to depreciate. BYD, for example, ranks just below Ferrari and Rolls-Royce — not because a BYD is a luxury vehicle, but because the average BYD listing is only 0.7 years old, compared to the overall average of 6.7 years. Leapmotor is even more extreme at 0.5 years. Give these brands a few years to accumulate used inventory at higher mileages, and their ratios will settle down considerably.
Depreciation Curves
You'd expect the price-per-km ratio to fall as mileage increases — older, high-mileage cars are cheaper, and the new-car premium fades fast. You'd also expect the decline to be roughly exponential: a car loses a percentage of its current value per additional kilometer, not a fixed euro amount.
Both hold up in the data:
{{/* dashboard on price vs mileage and price/km vs mileage */}} {{< grafana url="https://gr.eliaskohout.de/public-dashboards/e999ce3c237b4cae95b3331331a26261" height="400" >}}
The curves above show average prices over mileage for BMW, Volkswagen, and Fiat. The brand-level differences are immediately visible — different starting points, different values throughout the decay — but the overall shape is the same: steep early depreciation that gradually flattens out.
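"Loses a percentage of its current value per kilometer" corresponds to the model price(m) = p0 * exp(-k * m). One common way to estimate p0 and k is a straight-line fit on the logarithm of the price. The sketch below does that with the standard library only. It is not the method behind the dashboard, and the synthetic data points are generated from the model itself rather than taken from the dataset.

```python
import math
from typing import Sequence, Tuple


def fit_exponential_decay(
    mileages_km: Sequence[float], prices_eur: Sequence[float]
) -> Tuple[float, float]:
    """Least-squares fit of ln(price) = ln(p0) - k * mileage. Returns (p0, k)."""
    n = len(mileages_km)
    logs = [math.log(p) for p in prices_eur]  # prices must be > 0
    mean_x = sum(mileages_km) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in mileages_km)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(mileages_km, logs))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return math.exp(intercept), -slope


# Synthetic curve (assumption): 40,000 EUR new, k = 1e-5 per km.
true_p0, true_k = 40_000.0, 1e-5
mileages = [float(m) for m in range(0, 200_001, 25_000)]
prices = [true_p0 * math.exp(-true_k * m) for m in mileages]

p0_hat, k_hat = fit_exponential_decay(mileages, prices)
halving_distance_km = math.log(2) / k_hat  # mileage over which the price halves
print(p0_hat, k_hat, halving_distance_km)
```

In this picture, the brand differences in the plot show up as different values of p0 (the starting point) and k (how quickly value fades).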
-
Only toward the end do the boundaries blur; 200k km can be taken as a rough threshold beyond which the data gets noticeably more chaotic.
-
That could also be because there are simply fewer data points out here, which makes the averages less reliable.
-
So let's take a closer look at the distributions at this end of the scale.
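One way to check the sparse-data suspicion is to report the listing count and the standard error of the mean next to each binned average. This is a sketch with an assumed bin width of 50,000 km, an invented helper name, and invented toy data.

```python
import math
import statistics
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def binned_price_stats(
    listings: Iterable[Tuple[float, float]], bin_width_km: int = 50_000
) -> Dict[int, Tuple[int, float, float]]:
    """Map bin start (km) -> (count, mean price, standard error of the mean)."""
    bins: Dict[int, List[float]] = defaultdict(list)
    for mileage, price in listings:
        bins[int(mileage // bin_width_km) * bin_width_km].append(price)
    stats = {}
    for start in sorted(bins):
        prices = bins[start]
        n = len(prices)
        # The standard error shrinks with sqrt(n); undefined for a single point.
        sem = statistics.stdev(prices) / math.sqrt(n) if n > 1 else float("nan")
        stats[start] = (n, statistics.fmean(prices), sem)
    return stats


# Toy data (invented): (mileage in km, price in EUR).
toy_listings = [
    (10_000, 30_000), (20_000, 28_000), (30_000, 32_000), (40_000, 30_000),
    (210_000, 4_000), (240_000, 9_000),
]
stats = binned_price_stats(toy_listings)
for start, (n, mean, sem) in stats.items():
    print(start, n, round(mean), round(sem))
```

A bin with few listings and a large standard error is one where the plotted average should not be read too literally.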
The Survivors: Cars Beyond 250k km
{{/*
PLAN: Analysis chapters to write
-
Price vs. Mileage Relationship
- Scatter/heatmap of price vs. mileage across all listings
- Depreciation curves: how fast do different makes lose value?
- The "sweet spot": best mileage-to-price ratio by model
-
The Survivors: Cars Beyond 300k km
- Which makes/models appear most often at extreme mileage?
- Fuel type breakdown (diesel vs. petrol at high mileage)
- Average price of high-mileage cars — are they dirt cheap or still holding value?
- Estimate the failure rate
-
Fuel & Drivetrain Trends
- Fuel type distribution (diesel/petrol/electric/hybrid/LPG)
- Price and mileage by fuel type
- Are EVs showing up in used markets yet? At what price?
-
Geography
- Listings by country and region (zip code clusters)
- Regional price differences for the same model
- Where are the cheap cars?
============================================== */}}