In courts and crime labs around the world, forensic evidence is often treated as a kind of modern magic — promising certainty, closure, and justice. But beneath the confident conclusions and expert testimonies lies a troubling question: How do we actually know these methods work? Without rigorous testing and statistical grounding, many forensic practices risk becoming guesswork dressed in scientific language.
This is the uncomfortable truth the global scientific community has grappled with for over a decade. In 2009, the National Academy of Sciences (NAS) released a landmark report warning that many forensic methods lacked strong scientific foundations. Seven years later, the President’s Council of Advisors on Science and Technology (PCAST) reinforced this, emphasizing that high validity and reliability are not optional in forensic science — they are essential for credibility and justice.
In 2009, the year the NAS report was released, I began my master’s degree in forensic anthropology in the United Kingdom. As part of the first cohort trained after that report, we were immersed in statistical thinking from the start. Every method, every conclusion, had to be tested, validated, and measured. We were taught that in forensic science, it is not enough to say something is “a match.” We needed to ask: How often do similar matches occur by chance? How confident are we in this conclusion?
The problem with ‘uniqueness’
Many forensic disciplines still rely on a powerful but unproven idea: uniqueness. Bite mark analysis, for example, assumes that every person’s dental pattern is distinct and that skin can reliably capture and preserve that pattern. But this idea has repeatedly failed under scrutiny. The 1985 conviction of Robert Lee Stinson, based almost entirely on bite mark evidence, was overturned more than two decades later after DNA testing proved his innocence.
Even fingerprint analysis — often regarded as the gold standard — has its challenges. In 2004, American attorney Brandon Mayfield was wrongfully linked to the Madrid train bombings when the FBI declared a “100 percent match” based on a partial print. Spanish authorities later identified the true source. Mayfield’s case was a sobering reminder that even longstanding techniques are fallible.
Firearm identification faces similar scrutiny. It assumes that every gun leaves unique markings on bullets and casings. But recent large-scale studies show that even experienced examiners can disagree, and that false positives — though uncommon — do occur. These comparisons often rest on subjective judgment, not standardized or statistically validated criteria.
In all these examples, the issue is not whether the techniques are useful—they clearly are. The issue is whether we can quantify their accuracy. The notion of uniqueness, while persuasive in court, remains largely untested at a statistical level.
The role of statistical validation
Science is fundamentally about testing assumptions. That means measuring outcomes, calculating error rates, and expressing findings in probabilistic terms. Are forensic methods valid — do they measure what they claim to measure? Are they reliable — do they produce consistent results when repeated?
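To make the idea concrete, consider a purely hypothetical validation exercise (the numbers here are illustrative, not drawn from any actual study): examiners compare 100 pairs of samples known to come from different sources and wrongly declare a match in 3 of them. The method’s false positive rate is then

false positive rate = false declared matches / known different-source comparisons = 3 / 100 = 3 percent

Reported alongside every conclusion, a figure like this turns “we declared a match” into a measurable scientific claim; repeating the exercise with the same examiners tells us whether the method is also reliable.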
The PCAST review highlighted two urgent gaps: the need for clear scientific standards and the need for independent, peer-reviewed validation studies. It is not enough to say, “We have always done it this way.” Tradition is not a substitute for evidence.
Statistics provide the language we need to express uncertainty with honesty. They help us determine how much weight to give a piece of evidence. Phrases like “match,” “consistent with,” or “cannot be excluded” may sound definitive, but without an associated error rate or likelihood ratio, they tell us little about the strength of the evidence.
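One way to express that strength, already well established in DNA interpretation, is the likelihood ratio, which compares how probable the evidence is under two competing explanations:

likelihood ratio = P(evidence | same source) / P(evidence | different source)

With purely illustrative numbers: if the observed pattern would be expected 80 percent of the time when two samples share a source but only 1 percent of the time when they do not, the ratio is 0.80 / 0.01 = 80. The evidence is then 80 times more likely if the samples share a source, a statement a court can actually weigh.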
Statistical thinking is critical thinking
In the Philippines, where forensic science is still developing as a formal discipline, we have an opportunity to do things right from the beginning. As more universities offer forensic programs and as institutions like the proposed National Forensics Institute begin to take shape, statistical reasoning must be built into the system — not added as an afterthought.
This means training students not just in lab techniques but in data analysis and critical thinking. It means encouraging local research that evaluates the accuracy of forensic methods in Philippine conditions. While some standards apply globally, others must be adapted to local contexts — our environments, our populations, and our specific challenges. We also need a stronger culture of peer review and transparency.
We must be willing to acknowledge the limitations of our tools. That is not weakness — it is scientific honesty. It is better to admit what we do not know than to assert conclusions without proof. Pretending otherwise helps no one — not victims, not the accused, and not the justice system.
A science worth trusting
When I began my formal training in 2009, forensic science was already being challenged to become more rigorous. More than a decade later, I find that in many parts of the world — including the Philippines — statistical thinking still has not fully taken root. We still present forensic conclusions with confidence, but too often without context.
Forensic evidence is powerful. But to be worthy of trust, it must be built on science — not just tradition or expert opinion. That means measuring how well our methods work, how often they fail, and how they can be improved. Every conclusion should be treated not as a final answer but as a testable hypothesis.
Forensics without statistics is not science — it is guesswork dressed in lab coats. And when justice is on the line, we cannot afford to guess.