The Quiet Data of the Missing: What Open-Source Research Can and Cannot Tell Us About Missing People in Britain

Around 170,000 people are reported missing in the UK each year, yet most of what the public sees is a thin, skewed sliver of that total. This piece assesses what open-source research can actually offer—and where it misleads.

Nathan Tracey

Every ninety seconds or so, someone in the United Kingdom is reported missing. The figure most often quoted by the National Crime Agency and the charity Missing People is around 170,000 people each year, generating something closer to 330,000 separate incidents. The gap between those two numbers is itself the first useful fact about missing people: a great many of them go missing more than once.
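
The arithmetic behind that opening can be checked in a few lines. This is a minimal sketch in Python, assuming the rounded annual figures quoted above and a 365-day year; the variable names are illustrative and not drawn from any official source.

```python
# Rounded annual figures quoted in the text (not exact counts).
PEOPLE_PER_YEAR = 170_000
INCIDENTS_PER_YEAR = 330_000
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year

# Average gap between reports, counted per incident and per individual.
seconds_between_incidents = SECONDS_PER_YEAR / INCIDENTS_PER_YEAR
seconds_between_people = SECONDS_PER_YEAR / PEOPLE_PER_YEAR

# Average number of incidents per person who goes missing in a year.
incidents_per_person = INCIDENTS_PER_YEAR / PEOPLE_PER_YEAR

print(f"One incident every {seconds_between_incidents:.0f} seconds")
print(f"One individual every {seconds_between_people:.0f} seconds")
print(f"Average incidents per person: {incidents_per_person:.2f}")
```

On these figures the "ninety seconds" describes incidents rather than individuals: the gap per incident is a little over a minute and a half, the gap per distinct person roughly twice that, and the average is just under two incidents per person.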

This raises an appealing question for anyone interested in open-source research. If the phenomenon is this large, this patterned and this socially significant, can publicly available material—official statistics, police appeals, charity reports, local news, social media—be assembled into genuine insight? Can open-source intelligence (OSINT) reveal trends a casual observer would miss? And can proxy sources, such as the “missing: can you help us find” appeals that fill force Facebook pages, stand in for data the public never sees?

The answer is a qualified yes, but the qualifications matter more than the yes. There is real analytical value here, almost all of it at the level of patterns and systems rather than individuals. There is also a well-documented capacity for this kind of research to mislead, to invade privacy, and in a few notorious cases to cause direct harm. This piece sets out both.

What the open record actually contains

The single most important open source is the National Crime Agency’s UK Missing Persons Unit (UKMPU), which publishes an annual Missing Persons Data Report with accompanying statistical tables. It is the closest thing Britain has to a national picture, and it is genuinely open: anyone can download it.

It is also, by its own admission, incomplete. The data is requested each year from the 43 forces in England and Wales, plus Police Scotland and the Police Service of Northern Ireland “where possible.” Submission is voluntary and the forces use different recording systems. The reports warn explicitly that not all forces can supply the requested breakdowns and that year-on-year comparisons are therefore limited. In the most recent edition, for instance, PSNI returned no data at all on the duration of missing incidents. Every “national” total is best understood as a lower bound assembled from partial returns, not a census.

A few things follow that anyone citing these numbers should know. The Office for National Statistics does not hold or publish a missing-persons series; it routes such enquiries back to the NCA. Missing persons are not a Home Office recorded-crime category, and they do not appear in the police open data published at data.police.uk, which covers street-level crime, outcomes and stop-and-search only. So beyond the UKMPU report, the public route to numbers is the patient, unglamorous one: Freedom of Information requests to individual forces and local authorities. That is a real open-source method, but it is laborious and produces data that is hard to compare across the bodies that answer.

With those caveats in place, the headline shape is reasonably firm because it is corroborated across the UKMPU reports, the charity sector and academic work. Children make up roughly two-thirds of all incidents but only around 45 per cent of the individuals involved; the difference, again, is repeat episodes. Teenagers dominate: the single most likely age to be reported missing is seventeen, followed by sixteen and fifteen. Most cases resolve quickly. For children in England and Wales, recent UKMPU figures show around 83 per cent located within 24 hours and over 90 per cent within 48 hours. Fatal outcomes are rare in proportional terms, at about three incidents in every thousand, though that still amounted to roughly a thousand deaths in the latest year, the overwhelming majority of them adults.
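
To see what those shares imply about repeat episodes, the figures can be combined in a rough calculation. This is a sketch only: it assumes the rounded shares above apply to the rounded national totals of 330,000 incidents and 170,000 people, which may not hold exactly because the numbers need not come from the same reporting year or the same set of forces.

```python
# Rounded figures from the text. Combining them is an illustrative assumption,
# since the shares and the totals may come from different returns.
TOTAL_INCIDENTS = 330_000
TOTAL_PEOPLE = 170_000
CHILD_SHARE_OF_INCIDENTS = 2 / 3
CHILD_SHARE_OF_PEOPLE = 0.45
FATAL_RATE_PER_INCIDENT = 3 / 1000

child_incidents = TOTAL_INCIDENTS * CHILD_SHARE_OF_INCIDENTS
child_people = TOTAL_PEOPLE * CHILD_SHARE_OF_PEOPLE
adult_incidents = TOTAL_INCIDENTS - child_incidents
adult_people = TOTAL_PEOPLE - child_people

# Average episodes per individual in each group.
episodes_per_child = child_incidents / child_people
episodes_per_adult = adult_incidents / adult_people

# Fatal outcomes implied by the per-incident rate.
implied_fatal_outcomes = TOTAL_INCIDENTS * FATAL_RATE_PER_INCIDENT

print(f"Episodes per missing child: {episodes_per_child:.1f}")
print(f"Episodes per missing adult: {episodes_per_adult:.1f}")
print(f"Implied fatal outcomes: {implied_fatal_outcomes:.0f}")
```

Under those assumptions a missing child averages close to three episodes a year and a missing adult only slightly more than one, and a rate of three per thousand incidents is consistent with the "roughly a thousand deaths" quoted above.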

The patterns the data already reveals

This is where open-source analysis earns its keep, because the official and charity data, read carefully, surfaces patterns that are invisible from any individual case.

The most striking is concentration. Missing incidents are not spread evenly across missing people. Analyses of London data by researchers including Karen Shalev (Greene) at the University of Portsmouth and colleagues have found that a small group of repeatedly missing children—on the order of four to six per cent of the children involved—accounts for roughly a third of all incidents. Around two-thirds of missing-child reports are repeats, and more than half of those repeats occur within four weeks of the previous episode. If you want to know where harm clusters, the data points you very precisely.
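
Concentration of this kind is easy to measure once incidents can be counted per individual, for example from an anonymised FOI return. The sketch below is one possible way to do it, not the method used in the studies cited; the function name `top_share` is hypothetical and the counts are invented purely to illustrate the shape of the calculation.

```python
def top_share(episode_counts, top_fraction):
    """Share of all episodes attributable to the most frequently missing people.

    episode_counts: one integer per individual (episodes in the period).
    top_fraction: e.g. 0.05 for the top 5 per cent of individuals.
    """
    if not episode_counts:
        raise ValueError("episode_counts must not be empty")
    ranked = sorted(episode_counts, reverse=True)
    # Always include at least one individual in the "top" group.
    top_n = max(1, round(len(ranked) * top_fraction))
    return sum(ranked[:top_n]) / sum(ranked)


# Synthetic, made-up counts for 100 children. NOT real data.
synthetic_counts = [12] * 5 + [2] * 25 + [1] * 70
share = top_share(synthetic_counts, 0.05)
print(f"Top 5% of individuals account for {share:.0%} of episodes")
```

The calculation only works where a dataset links repeat episodes to the same person, which is something force-level returns do not always allow.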

It points, in particular, at the care system. Roughly one looked-after child in ten is reported missing each year, against something closer to one in two hundred children generally. Care-experienced children who go missing average several episodes a year, and the great majority of their reports are repeats. The risk concentrates further around out-of-area and unregulated placements: an All-Party Parliamentary Group inquiry documented missing-from-placement reports in Greater Manchester roughly doubling over a few years where children had been placed far from home.

Closely bound up with this is exploitation. Going missing is one of the clearest behavioural indicators of county lines drug networks and of child criminal and sexual exploitation. The Home Office’s own county lines work has estimated some 11,600 children going missing and at risk of exploitation, and charity research has found that exploited children in care go missing far more often than their peers. The disappearance of unaccompanied asylum-seeking children from Home Office contractor hotels in 2022–23—several hundred children nationally, many never found, some trafficked onward—was a safeguarding scandal surfaced largely through journalism and FOI rather than any official dashboard.

For adults, the dominant theme is vulnerability rather than crime. Mental-health crisis is the most commonly cited factor in adults going missing, a meaningful share of incidents originate from hospitals and care settings, and the people who come to fatal harm skew older, male and longer-missing. The University of Portsmouth’s work has even identified a seasonal signature: men who disappear after a night out are at greatest risk of a fatal outcome in winter, when alcohol and cold water combine. Dementia is its own distinct pattern—of the UK’s roughly 850,000 people with dementia, a majority will become lost at some point, and they are markedly more likely than others to feature in a missing incident, though most such episodes are never reported to police at all.

Finally, there is disproportionality. Charity analysis and peer-reviewed work both find that people from minority ethnic groups are over-represented among those reported missing—Black people account for around 14 per cent of missing reports against roughly 4 per cent of the population—and that they tend to stay missing longer and are less likely to be graded “at risk.” That last finding matters enormously, because risk grading determines how hard the police look.
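
The scale of that over-representation is clearer as a ratio. This is a minimal sketch using only the two rounded percentages quoted above; the helper name `representation_ratio` is illustrative.

```python
def representation_ratio(share_of_reports, share_of_population):
    """Ratio above 1 means over-representation among reports relative to population."""
    if share_of_population <= 0:
        raise ValueError("share_of_population must be positive")
    return share_of_reports / share_of_population


# Rounded figures from the text: about 14% of reports, about 4% of population.
ratio = representation_ratio(0.14, 0.04)
print(f"Representation ratio: {ratio:.1f}")
```

On the quoted figures, Black people appear in missing reports at about three and a half times the rate their population share would predict.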

None of these patterns requires covert technique. They are visible in published reports, FOI returns and academic papers. That is the genuine, defensible core of open-source missing-persons research: aggregate, system-level analysis of who goes missing, how often, and how well the institutions respond.

The temptation of proxy sources

The patterns above come from data about the whole population of cases, however imperfectly recorded. The proxy sources—public appeals, charity posts, news stories, Facebook groups, subreddits—are something quite different, and the difference is the single most important methodological point in this whole subject.

The vast majority of missing episodes are resolved within a day or two and never become public. A force issues a public appeal, or a charity amplifies one, only for a small and deliberately selected fraction of cases. National Police Chiefs’ Council and College of Policing guidance is explicit that publicity is an investigative tool, used when it serves the inquiry and weighed against the risk of exposing a vulnerable person—not a neutral record. Researchers who have studied appeal practice have found no consistent system for deciding which cases are publicised at all.

So the appeals you can see are not a sample of missing people. They are a sample of the cases that are unresolved, high-risk, long-running, or judged media-worthy—and that last filter is heavily skewed. The “missing white woman” effect is one of the better-evidenced biases in this field: work by Missing People and others has found that white people receive a substantial majority of publicity appeals while Black people, though over-represented among the missing, receive far fewer. To build a picture of missing people from public appeals is to over-count young, white, photogenic subjects and to under-count almost everyone else—adult men, older people, minority ethnic groups, the routinely-and-repeatedly missing.
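
The mechanism can be made concrete with a small calculation. The sketch below assumes two groups and per-group publicity rates that are entirely hypothetical; the numbers are chosen only to show how a selection filter reshapes what is visible, and none of them is taken from any report.

```python
def visible_composition(population_shares, publicity_rates):
    """Composition of publicised cases, given each group's share of all
    missing cases and the rate at which its cases receive a public appeal."""
    weights = {
        group: population_shares[group] * publicity_rates[group]
        for group in population_shares
    }
    total = sum(weights.values())
    if total == 0:
        raise ValueError("no cases are publicised under these rates")
    return {group: weight / total for group, weight in weights.items()}


# Hypothetical numbers for illustration only.
missing_shares = {"group_a": 0.70, "group_b": 0.30}   # share of all missing cases
publicity_rates = {"group_a": 0.04, "group_b": 0.01}  # chance a case gets an appeal

appeal_shares = visible_composition(missing_shares, publicity_rates)
for group, share in appeal_shares.items():
    print(f"{group}: {missing_shares[group]:.0%} of cases, {share:.0%} of appeals")
```

In this made-up scenario a group that accounts for 30 per cent of cases ends up at roughly 10 per cent of appeals. An analyst who sees only the appeals, and does not know the publicity rates, cannot work backwards to the true shares.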

The charity infrastructure is more reliable than raw social media because it is governed and research-minded. Missing People runs the free 116 000 helpline, the Appeal Search directory and the official UK Missing Kids site, and publishes serious research. But its appeal and helpline data still reflect who is referred for publicity or who contacts the service—self-selecting, publicity-gated populations, not the full reported total. (One tidy-up worth noting: the “Missing Persons Advocacy Network” people sometimes cite as a UK body is in fact Australian. The relevant UK specialist charity for people missing abroad is LBT Global, the former Lucie Blackman Trust.)

Online communities sit at the unreliable end. Websleuths, the missing-persons subreddits, and Facebook groups amplify cases, but they verify nothing, skew heavily towards sensational and cold cases, are dominated by US material, and almost never propagate the “found safe” updates that would correct the record. Britain also lacks a true public equivalent of the United States’ NamUs database or the volunteer-curated Charley Project. The closest official thing is the UKMPU’s own public case-search site for unidentified bodies and long-term cases, which deliberately publishes only a fraction of what the unit holds.

The lesson is not that proxy sources are useless. It is that they answer a narrow question—which cases became visible, and why—and cannot answer the broad one—who goes missing in Britain. Treating the first as if it were the second is the characteristic error of amateur missing-persons analysis.

What open-source technique realistically adds

Used with discipline, open-source methods do more than describe. The most productive UK examples are investigative rather than technological. ECPAT UK and Missing People have used FOI requests to children’s services across the country to show, in hard numbers, how many trafficked and unaccompanied children disappear from care, findings that drove parliamentary scrutiny. The asylum-hotels story was broken in much the same way, through document work and FOI alongside whistleblowers and other sources. This is OSINT in its most legitimate form: assembling the open record to hold a system to account.

There is also a place for structured, law-enforcement-coordinated volunteering. Locate International, a UK charity, runs a cold-case review project using trained volunteers in genealogy, analysis and OSINT to support police and coroners on long-term missing and unidentified cases. Trace Labs runs gamified events in which teams use strictly passive OSINT to generate leads on live cases submitted by police. Internationally, Interpol’s “Operation Identify Me” showed the power of simply publishing identifying details such as forensic facial reconstructions and distinguishing marks: one of the women identified, “the woman with the flower tattoo,” turned out to be Rita Roberts from Cardiff, recognised by a relative in Britain from the appeal.

And for the live search itself, the mature data tool is not social media but behavioural science. Search-and-rescue teams in the UK plan around “lost person behaviour” models built on databases of well over a hundred thousand past searches, which predict where particular kinds of missing person—a person with dementia, a despondent adult, a lost walker—are statistically likely to be found. That is open data put to direct operational use.

Two things are conspicuously harder in Britain than the headlines suggest. Forensic genetic genealogy, which has solved many cold cases in the United States, is not yet in routine UK use for legal and ethical reasons. And consumer facial-recognition tools such as PimEyes, sometimes pitched as OSINT aids, sit under active regulatory disapproval (the Information Commissioner’s Office has repeatedly sanctioned a comparable firm, Clearview AI) and carry a serious false-positive and stalking risk. Power and propriety are not the same thing.

The limits that are not negotiable

Which brings us to the constraints, because the gap between what is technically possible and what is lawful and ethical is wide.

The legal position is straightforward but frequently misunderstood: information about an identifiable living person remains personal data under UK GDPR and the Data Protection Act 2018 even when it is already public. Compiling public posts about a named missing person is processing personal data, and the compiler becomes a data controller with the full set of obligations. Missing-persons material routinely involves “special category” data (health, mental health, ethnicity), which attracts stricter conditions, and the convenient-sounding exemption for data “manifestly made public by the individual” is narrow: a news report or police appeal disclosing someone’s mental-health crisis does not qualify, because the individual did not make it public themselves. Children, who account for most incidents, attract heightened protection again under the ICO’s Children’s Code.

The ethical position is, if anything, more demanding than the legal one. A competent adult has a qualified right to go missing; the police will generally respect an adult’s wish not to be located unless a safeguarding duty applies. Many missing people are vulnerable in ways that make exposure actively dangerous—domestic-abuse survivors who have fled, trafficking victims, people in acute crisis. Publishing a sighting or a location-revealing detail can lead an abuser straight to someone. The risk is asymmetric: being careless, or being right but indiscreet, can be fatal.

The cautionary cases are not hypothetical. After the 2013 Boston Marathon bombing, Reddit users wrongly identified a missing student, Sunil Tripathi, as a suspect; he had already died by suicide, and his grieving family was engulfed in the false accusation. In Britain, the 2023 disappearance of Nicola Bulley drew amateur “TikTok detectives” to the village, generating hundreds of millions of views, intruders trying door handles, and a police force that said publicly it had been distracted and inundated with false information. Crowdsourced identification of the missing and at-risk has a documented record of harm and of damaging live investigations.

There is, finally, the matter of inference. Public appeals are a biased, survivorship-skewed sample; social-media data is manipulable, decaying and unrepresentative; force definitions and recording differ; and the same person is counted many times across repeat episodes. Above all there is the no-denominator problem. You can count appeals, sightings and posts, but you almost never know the true population they are drawn from, so claims such as “social media found X per cent of missing people” are simply not supportable. Bellingcat’s own checklist of bad open-source research warns against exactly these errors, and the Berkeley Protocol on Digital Open Source Investigations sets out the do-no-harm standard the field now expects.

Where the potential actually lies

So: can open-source research into missing people in the UK offer real insight? Yes—provided one is clear about which question is being asked.

If the question is who goes missing in Britain, how the harm concentrates, and how well the system responds, the potential is substantial and largely untapped. The official data, however imperfect, already reveals the concentration of incidents among repeatedly-missing children, the over-representation of the care system, the link between going missing and exploitation, the ethnic disproportionality in how seriously cases are treated, and the distinct vulnerabilities of missing adults. FOI-driven investigation has repeatedly turned that open record into accountability. Proxy sources—appeals, charity output, news—are useful here too, not as a measure of the missing population but as a measure of which cases society chooses to see, which is itself a finding worth having.

If the question is can we use open sources to find a particular missing person, the answer is mostly no, and the attempt is where the harm lives. That work belongs to the police and to the small number of properly governed, law-enforcement-coordinated bodies, working passively and within the law.

The most valuable open-source contribution an outsider can make to missing people in Britain is therefore not detection but illumination: studying the phenomenon and the institutions rather than the individuals, treating the numbers with the scepticism their gaps deserve, and remembering throughout that behind every data point is a person who may have very good reasons not to be found.

