AI methodology

Why we don't predict crime with AI.

One of the questions we are asked most often is whether UK Crime Insights predicts future crime in a postcode. It does not, and it is not going to. This article explains why — both the technical reasons and the ethical ones, because both matter and they reinforce each other. We describe the past clearly, with the limitations stated, and let the reader form their own view about what it implies. That choice is deliberate, and the reasoning behind it is worth setting out plainly so that anyone using the product, or considering it, knows what they are getting and what they are not. The short version: a forecast in this setting would be more confident than the data deserves, and the cost of being wrong is borne by the places and people the forecast names.

What predictive crime models do

The published work on predictive crime modelling falls into two broad families. Place-based models, the best-known being PredPol and its descendants, forecast hotspots from past incident data, usually by fitting a self-exciting point process to historical reports and projecting forward over short windows. Person-based systems take a different route: they score individuals against a set of features and produce a risk number, which is then used to inform the allocation of resources, supervision, or intervention. Both families have been deployed publicly in the United Kingdom and the United States over the past decade, and both have generated a substantial academic literature on what they get right and what they get wrong.

Our argument here is narrow. We are not claiming these systems have no legitimate use anywhere. We are explaining why neither family fits a consumer property-context product, where the user is a homebuyer or property professional and the question is what a postcode looks like, not where the next patrol should go.

The technical objection

Sample sizes for small geographies are too small for stable forecasts. A half-mile radius around a residential postcode generates monthly counts in the low single digits for most of the fourteen Police UK categories. Several categories will register zero in a given month and a handful in the next, with the difference reflecting nothing more than the timing of when an incident was reported and which street it was reported on.
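To make the point concrete, here is a minimal sketch of how much a single category can jump around when nothing about the area is changing. It assumes monthly counts follow a Poisson distribution with a constant rate, and the rate of 1.5 incidents a month is an illustrative figure we chose, not one taken from any real postcode.

```python
import math


def poisson_pmf(k, rate):
    """Probability of seeing exactly k incidents in a month at a constant rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)


# Assumed for illustration: one category averaging 1.5 incidents a month.
rate = 1.5

# Chance of a month with no recorded incidents at all.
p_zero = poisson_pmf(0, rate)

# Chance of a month with three or more (one minus the chance of 0, 1 or 2).
p_three_plus = 1 - sum(poisson_pmf(k, rate) for k in range(3))

print(f"P(zero incidents)   = {p_zero:.2f}")
print(f"P(three or more)    = {p_three_plus:.2f}")
```

Under these assumptions, roughly one month in five shows zero and roughly one in five shows three or more, with the underlying rate identical throughout.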

A model trained on counts that thin has very little signal to separate from noise. Random month-to-month variation will dominate any "trend" the model thinks it has found, and when the model is asked to predict the next twelve months from the past twelve months, the prediction is mostly a regression toward the local average with a confidence interval the user never sees. We can do that arithmetic in our heads — last year's mean, plus or minus a lot — and label it honestly as what it is. A model output styled as "we predict 18 incidents next year" sounds more authoritative than the underlying arithmetic supports, and the styling is doing work the data is not.
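The "plus or minus a lot" can be written down. The sketch below is a back-of-envelope calculation, not something the product runs: it assumes both years are independent Poisson counts with the same underlying rate and uses a normal approximation, so the difference between the two yearly totals has a variance of about twice the rate. The function name and the default multiplier of 1.96 are our own choices for illustration.

```python
import math


def naive_next_year_range(last_year_total, z=1.96):
    """Rough range for next year's total, given only last year's total.

    Assumes both years are Poisson with the same rate, and that last
    year's total is the only estimate of that rate we have.
    """
    # Two sources of noise: last year's count and next year's count.
    spread = z * math.sqrt(2 * last_year_total)
    low = max(0.0, last_year_total - spread)
    high = last_year_total + spread
    return low, high


low, high = naive_next_year_range(18)
print(f"Last year: 18. Plausible next year: about {low:.0f} to {high:.0f}.")
```

For a postcode that recorded 18 incidents, the range under these assumptions runs from about 6 to about 30. A headline figure of "18" hides all of that.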

The feedback loop objection

Predictions cause attention, attention causes recording, recording reinforces the prediction. This is the best-documented failure mode of place-based predictive systems in the academic literature. A model flags a location, additional patrols are sent, those patrols catch incidents that would otherwise go unrecorded, the location continues to score highly in subsequent runs, and the cycle reinforces itself. The signal the model is responding to is, in part, the signal the model itself created.
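The cycle can be shown with a toy model. Everything in the sketch below is invented for illustration: two areas with identical true incident levels, a starting gap of one recorded incident, and a rule that the area with more records gets the patrol and therefore has more of its incidents recorded. It is not a model of any deployed system.

```python
def simulate_feedback(rounds=12, recorded_if_patrolled=6, recorded_if_not=3):
    """Toy feedback loop between two areas with the same true incident level.

    Each round, the area with more recorded incidents so far is flagged
    and patrolled. Patrolled areas have more of their incidents recorded.
    All numbers are hypothetical.
    """
    # Area A starts one recorded incident ahead, purely by chance.
    recorded = {"A": 11, "B": 10}
    for _ in range(rounds):
        flagged = max(recorded, key=recorded.get)
        for area in recorded:
            if area == flagged:
                recorded[area] += recorded_if_patrolled
            else:
                recorded[area] += recorded_if_not
    return recorded


result = simulate_feedback()
print(result)
```

After twelve rounds the one-incident head start has become a gap of 37 recorded incidents, although the two areas were defined to be identical. The records now describe where the patrols went, not where the incidents were.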

A consumer product is not allocating patrols, so the loop is less direct, but it is not absent. A "predicted hot postcode" published next to a property listing changes how that postcode is perceived by buyers, by sellers, by agents writing descriptions, and by neighbours deciding whether an incident is worth reporting. The downstream effects on attention, transaction volume, and reported counts are not zero, and a tool that contributed to them while also presenting itself as a neutral observer would be doing something quietly dishonest. We would rather not be a small contribution to that loop.

The fairness objection

The published evidence on disparate impact is clear enough to take seriously. Place-based models trained on past incident data preferentially flag areas with historically higher recording densities — and recording density correlates with how much policing attention an area has received in the past, which in turn correlates with neighbourhood characteristics that have nothing to do with current crime risk. The work by Lum and Isaac on PredPol drug-related forecasts is the most cited demonstration of this in the place-based literature; the Loomis case in the United States and the discussion around Durham's HART tool in the United Kingdom raised parallel concerns about person-based scoring and the auditability of the features driving it. The details vary, but the conclusion converges: a model that "objectively" reproduces the past will reproduce its biases along with it, and the appearance of objectivity is part of what makes the bias hard to challenge.

UK Crime Insights serves homebuyers, sellers, conveyancers, and agents. A tool that quietly steered them away from areas with historically over-policed populations — under the cover of a forecast that looked technical and neutral — would be doing real harm to those areas while presenting the harm as a feature. That is not a tool we want to ship.

What we do instead

We describe the past, plainly. The headline is what was recorded in the last twelve months in a half-mile radius around the postcode. The category breakdown shows what kind of incidents the area produced, in the order their counts warrant. The trend chart shows whether the level has been broadly steady, rising, or falling, with the monthly counts visible underneath so the reader can see the shape rather than take the slope on trust. The AI summary translates the numbers into a paragraph of prose, with every claim grounded in a figure the model can see and forecasting explicitly ruled out.

None of these elements claim to predict the future, and where the underlying counts are too thin to support a confident reading even of the past, the report says so rather than smoothing the gap over. The user gets the past clearly, with the limits of what the past supports stated, and is trusted to form their own view about what it suggests for the year ahead. That last step — the inference from past to future — sits with the reader, who knows their own circumstances and risk tolerance, not with us.
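As an illustration of what "says so" can mean in practice, the sketch below labels a twelve-month series as rising, falling, or broadly steady only when the change is larger than the noise, and otherwise declines to describe a trend. This is a hypothetical rule written for this article, not the product's actual logic: the function name, the minimum total of 24, and the two-standard-deviation cut-off are all assumptions, and the noise estimate again treats the counts as Poisson with a constant rate.

```python
import math


def describe_trend(monthly_counts, min_total=24):
    """Describe a twelve-month series without overstating what it shows."""
    if len(monthly_counts) != 12:
        raise ValueError("expected exactly twelve monthly counts")

    total = sum(monthly_counts)
    if total < min_total:
        # Too thin to support any reading of direction.
        return "too few incidents to describe a trend"

    first_half = sum(monthly_counts[:6])
    second_half = sum(monthly_counts[6:])
    change = second_half - first_half

    # With a constant rate, the half-year difference has a standard
    # deviation of about sqrt(total); require twice that before
    # calling it a change.
    noise = 2 * math.sqrt(total)
    if change > noise:
        return "rising"
    if change < -noise:
        return "falling"
    return "broadly steady"


print(describe_trend([0, 1, 0, 2, 1, 0, 1, 0, 0, 1, 2, 0]))
print(describe_trend([10] * 12))
print(describe_trend([5] * 6 + [15] * 6))
```

The design choice worth noting is the first branch: when the counts are thin, the honest output is a statement that there is nothing to say, not a weak slope presented as a finding.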

What "describing well" still adds

A description-only tool is sometimes dismissed as "just statistics", as though the absence of a forecast were the absence of a feature. The case for the conservative product is the opposite. Statistics described well — with the right denominators acknowledged, the right caveats placed where they belong, the AI summary calibrated to plain language, and the categories explained where the names are misleading — already do most of the work a careful reader needs. The reader of a clear description can carry their own uncertainty into the decision, weight it against the rest of what they know about the area, and arrive somewhere honest.

Forecasting adds a pseudo-precision that a careful reader has to mentally subtract before using the figure. A predicted incident count is a number with two error bars the reader cannot see: one from the small-sample variance and one from the model's structural assumptions. Removing the forecasting step removes a step the reader would otherwise have to undo. That is the case for the deliberate non-feature: the past, well-described, is more useful than the future, badly guessed.