Premiums and Road Risk

Higher insurance premiums do not mean more dangerous roads

February 26, 2026

I went in expecting the boring result. States with deadlier roads should pay more for car insurance. Actuaries price risk, more crashes mean more risk, premiums climb. Then I ran the correlation and the sign was wrong.

Figure: two scatter plots side by side. Left: insurance premium against insurance losses per driver, a clear upward green fit line, r = +0.62, p < 0.001. Right: insurance premium against fatal collisions per billion miles, a flat-to-negative dashed fit, r = -0.20, not significant, with New Jersey flagged as priciest-on-safe-roads and North Dakota as deadliest-yet-cheap.

Same y-axis, two stories. Premium rises with what insurers actually pay out: losses explain about 39% of the variance in premium (r² = 0.623² ≈ 0.39). Premium is essentially blind to how many people die on the road. The right-hand panel looks like pricing is broken. The left panel shows it working exactly as designed, just not on the variable you assumed.

Across the 51 rows in FiveThirtyEight’s crash dataset (50 states plus DC, roughly 2009 vintage, via seaborn-data), the Pearson correlation between annual insurance premium and the fatal-collision rate is -0.20, and it is not significant (p = 0.16). Spearman agrees on the sign and is weaker still, at -0.118. The states with the highest fatal-crash rates are not, on average, the states that pay the most. If anything, the line tilts the other way.
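For anyone who wants to rerun the two coefficients, here is a minimal standard-library sketch of Pearson and Spearman. The vectors at the bottom are made-up placeholders, not rows from the dataset; to reproduce the figures above you would substitute the real ins_premium and total columns (if seaborn is installed, `seaborn.load_dataset("car_crashes")` is one way to get them). It does not compute p-values.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def average_ranks(values):
    """Rank from 1 (smallest); tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j across the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson applied to the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Placeholder values for illustration only, NOT taken from the dataset.
toy_premium = [700.0, 900.0, 1100.0, 1300.0, 800.0]
toy_total = [20.0, 14.0, 12.0, 16.0, 23.0]

print("pearson: ", round(pearson(toy_premium, toy_total), 3))
print("spearman:", round(spearman(toy_premium, toy_total), 3))
```

Spearman works on ranks, so it is less sensitive to a handful of extreme states than Pearson is, which is why it is worth reporting both.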

Figure: Premium vs crash rate

Look at the corners of that scatter. DC and New Jersey sit at the top on premium and near the floor on fatal crashes per billion miles. North Dakota and South Carolina are the opposite: they tie for the worst fatal-crash rate in the country at 23.9 per billion miles, and their premiums are middling-to-cheap.

So what does premium track? Payouts, the thing it is supposed to. Premium against insurance losses per driver comes in at r = 0.623 (p < 0.001). Insurers price what they expect to pay, and what they pay is plausibly dominated by dense, urban, fender-bender property damage rather than the rural single-car fatalities that drive the total column. There are two kinds of dangerous here, and the premium only knows about one of them.

That is the warning shot. The rest of this dataset is a compositional minefield, and the trap is built into the column definitions.

A share is not a rate

Only total, in fatal collisions per billion miles, is a true rate; the two dollar columns are also absolute amounts. The four behavioral columns, speeding, alcohol, not_distracted, no_previous, are shares of total. They are percentages of that state’s fatal crashes, not crashes per mile. The categories overlap (one crash can involve both speeding and alcohol), so the four do not have to sum to 100. A high percentage tells you about the mix of a state’s crashes, not how many crashes it has.

Picture a single state’s crashes as one pie. A share tells you how the pie is sliced. It says nothing about how big the pie is.

The cleanest way to see the trap is to correlate a share against the total rate and watch it lie convincingly. Take no_previous, the fraction of fatal crashes involving drivers with no prior accidents. It correlates with the total crash rate at r = 0.956. That looks like a smoking gun: clean-record drivers cause crashes. It is an artifact. In a high-crash state most crashes involve no-previous-record drivers because most drivers have no previous record, so the share rides up with the total almost mechanically. not_distracted does the same at r = 0.828.

These are not findings. They are the denominator leaking into the numerator.
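A synthetic sketch of what "mechanically" means. Nothing here comes from the real dataset: the totals and fractions are random numbers with a fixed seed, and the ranges are assumptions chosen for illustration. A fraction drawn independently of the total shows almost no correlation with it, while the same fraction multiplied by the total tracks the total closely. Comparing the two is one way to tell whether a column has the total baked into it.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

random.seed(0)  # fixed seed so the sketch is repeatable
n = 500         # synthetic "states"; far more than 51, for a stable estimate

# Assumed ranges, for illustration only.
totals = [random.uniform(6, 24) for _ in range(n)]    # crashes per billion miles
shares = [random.uniform(80, 95) for _ in range(n)]   # percent, independent of totals

# The same fraction with the total multiplied in.
scaled = [t * s / 100 for t, s in zip(totals, shares)]

r_share = pearson(shares, totals)   # fraction vs total
r_scaled = pearson(scaled, totals)  # total-scaled column vs total

print("fraction vs total:      ", round(r_share, 3))
print("total-scaled vs total:  ", round(r_scaled, 3))
```

If a column in your copy of the data correlates with total this strongly, it is worth checking its units before reading anything into the number.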

Mississippi makes the gap obvious. It reports a not_distracted share of just 1.76%, the lowest in the set, while sitting at a total rate of 17.6, above the national mean of 15.79. By share you would file it as an outlier. By rate it is an ordinary high-crash state with a reporting quirk on one category. The percentage and the rate point in opposite directions, and the rate is the one that counts.

The ranking that flips when you fix the units

So convert. Multiply each share by the total to get an absolute rate: alcohol-involved fatal crashes per billion miles equals total × alcohol% / 100. Now you are comparing like with like, and the worst-state lists reshuffle.
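The conversion is one line of arithmetic. The sketch below applies it to the speeding figures quoted in this article, assuming the column really is a percentage share as described above. Two inputs are not stated directly in the text and are backed out from the quoted results: Hawaii’s total (1.65 / 0.0945 ≈ 17.5) and South Carolina’s speeding share (2.17 / 23.9 ≈ 9.08%). The function name `absolute_rate` is mine.

```python
def absolute_rate(total_per_billion_miles, share_percent):
    """Convert a percentage share of fatal crashes into crashes per billion miles."""
    return total_per_billion_miles * share_percent / 100

# (total per billion miles, speeding share in percent)
speeding_inputs = {
    "Hawaii": (17.5, 9.45),          # total implied by the article's 1.65 figure
    "South Carolina": (23.9, 9.08),  # share implied by the article's 2.17 figure
    "Arkansas": (22.4, 4.03),        # both values quoted in the article
}

rates = {
    state: absolute_rate(total, share)
    for state, (total, share) in speeding_inputs.items()
}

for state, rate in rates.items():
    print(f"{state}: {rate:.2f} speeding-involved fatal crashes per billion miles")
```

Before running this on a full table, confirm that the share column is on a 0 to 100 scale. If it is already stored as a fraction, or already multiplied by total, the division by 100 will silently give the wrong answer.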

Speeding shows it hardest. Rank the states by speeding share and Hawaii leads the country: 9.45% of its fatal crashes involve speeding, the single highest share in the data. That sounds like a speeding problem. But Hawaii’s overall crash rate is moderate, so once you convert to an absolute rate Hawaii drops to 5th. The new number one is South Carolina, which sat at rank 3 by share but has the raw crash volume to back it up: 2.17 speeding-involved fatal crashes per billion miles against Hawaii’s 1.65.

Figure: Speeding share vs absolute rate by state

The opposite error, a state understated by its share, is worse, because that is the one you would miss. Arkansas ranks 35th by speeding share. Only 4.03% of its fatal crashes involve speeding, which reads as a careful-driving state. Convert to a rate and Arkansas climbs to 18th. The share buried it; the high total rate of 22.4, fourth-worst nationally, dragged it back up. A policymaker ranking states by that share column would deprioritize Arkansas and chase Hawaii. The rate says Hawaii deserves less of that attention and Arkansas more.
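To see a rank flip without the full table, here is a small sketch that ranks the same states twice, once by share and once by converted rate. It uses only three states, so the ranks it prints are positions within this subset, not the national ranks quoted above. Hawaii’s total and South Carolina’s share are backed out from the article’s figures rather than quoted, and the helper names are hypothetical.

```python
def rank_desc(scores):
    """Map each key to its rank, 1 = largest score. Ties keep input order."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}

def rank_shifts(totals, shares_percent):
    """Return {state: (rank by share, rank by absolute rate)}."""
    rates = {s: totals[s] * shares_percent[s] / 100 for s in totals}
    by_share = rank_desc(shares_percent)
    by_rate = rank_desc(rates)
    return {s: (by_share[s], by_rate[s]) for s in totals}

# Fatal collisions per billion miles.
totals = {"Hawaii": 17.5, "South Carolina": 23.9, "Arkansas": 22.4}
# Speeding share of fatal crashes, in percent (assumed to be a true share).
speeding_share = {"Hawaii": 9.45, "South Carolina": 9.08, "Arkansas": 4.03}

shifts = rank_shifts(totals, speeding_share)
for state, (share_rank, rate_rank) in shifts.items():
    moved = "moved" if share_rank != rate_rank else "unchanged"
    print(f"{state}: share rank {share_rank}, rate rank {rate_rank} ({moved})")
```

Run over all 51 rows, the same function gives the full list of states whose position depends on which column you sort by.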

Figure: Alcohol share vs absolute rate

The alcohol ranking survives the conversion a little better. The Spearman between share-rank and rate-rank is 0.976, because the very worst states, North Dakota, South Carolina, Montana, are bad on both share and total, so they stay pinned at the top either way. The middle still churns. Kentucky sits at rank 22 by alcohol share and jumps to rank 13 by absolute rate. Nebraska does the opposite, sliding from 17 down to 24. Whenever a state’s share and its total disagree, the conversion moves it, and the move is invisible if you only look at the percentages.

The honest caveat

This is 51 data points, one cross-sectional snapshot from around 2009, aggregated to the state level. Everything here is ecological. A correlation across states is not a statement about any driver, and the premium result especially is confounded six ways from Sunday: urban density, no-fault laws, uninsured-motorist rates, theft, weather. I am not claiming expensive-insurance states are safer. I am claiming the obvious read, that premium tracks fatal-crash risk, gets no support here: the correlation is not significant and its sign points the wrong way. And all four behavioral columns in this beloved little dataset are shares wearing the costume of rates.

Before you rank anything, check whether your column is a rate or a share. The denominator is where the lies hide.