Benford’s Law is this counter-intuitive result from statistics that says in nature, digits 1-9 don’t occur with the same frequency. This law is usually applied to the leading digit of naturally occurring numbers, and states that numbers start with “1” more often than any other digit.
This is an established result, and has supposedly been applied to the leading digit of a ton of different data sets including “the surface areas of 335 rivers, the sizes of 3259 US populations, 104 physical constants, 1800 molecular weights, 5000 entries from a mathematical handbook, 308 numbers contained in an issue of Reader’s Digest and the street addresses of the first 342 persons listed in American Men of Science” (what the hell is this list?)
|Benford’s Law: The digits 1-9 don’t occur with the same probability (Wikipedia)|
|Value of the First Digit of a Number|Probability|
|1|30.1%|
|2|17.6%|
|3|12.5%|
|4|9.7%|
|5|7.9%|
|6|6.7%|
|7|5.8%|
|8|5.1%|
|9|4.6%|
If this is your first run-in with Benford’s Law, the result is going to surprise you. You probably don’t think you read it right. That’s how I felt too. So let’s make sure we are all on the same page.
Below is a plot showing the percentage of 237 countries that have a population starting with the digit 1, 2, 3, etc. as of 2010. Red bars are the results, while black dots indicate the distribution predicted by Benford’s law. In other words, what percentage of countries have a population starting with 1 followed by any string of other digits? That could be 198237987198247, or 159823 or 1298032509436097203503298403475 … That’s shown (as a percentage) by the first red bar.
By Melikamp – Own work, CC BY-SA 4.0, https://commons.wikimedia.org/w/index.php?curid=92013276
Shocking. More shocking is that this seems to work pretty well for a whole load of other data too.
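For anyone who wants to try this comparison on their own numbers, here is a minimal Python sketch. It assumes the standard statement of Benford’s Law, that the chance of a leading digit d is log10(1 + 1/d), and it uses the first 1,000 powers of 2 as stand-in data rather than the population figures from the plot. The function names are made up for illustration.

```python
import math
from collections import Counter


def benford_expected(d):
    """Benford's predicted probability that the leading digit is d (1-9)."""
    return math.log10(1 + 1 / d)


def leading_digit(n):
    """First digit of a positive integer."""
    return int(str(n)[0])


def leading_digit_shares(values):
    """Fraction of values that start with each digit 1-9."""
    counts = Counter(leading_digit(v) for v in values)
    total = len(values)
    return {d: counts.get(d, 0) / total for d in range(1, 10)}


# Stand-in data (an assumption, NOT the country populations from the plot):
# powers of 2 are a classic sequence whose leading digits track Benford's Law.
powers_of_two = [2 ** k for k in range(1000)]
observed = leading_digit_shares(powers_of_two)

for d in range(1, 10):
    print(f"{d}: observed {observed[d]:.1%}, Benford {benford_expected(d):.1%}")
```

Swapping `powers_of_two` for any list of positive integers (populations, river areas, street addresses) gives the same kind of side-by-side comparison as the red bars and black dots.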
I had heard of Benford’s Law before this, though I didn’t know much about it. I chalked it up to being one of those mathematical oddities that makes the universe seem more ordered than it really is, but nothing I’d stake the fate of the Free World on. It wasn’t until I spotted a quirky little study assessing the validity of COVID-19 infection rates using Benford’s Law that I really gave it any thought.
“WOW! This will make for some great reporting! I wonder if anyone has picked up on this one yet,” I thought.
A day of Googling and I’d found a few podcasts and TV shows pre-dating this study, so I set those into the “watch later” pile. I narrowed my search to Benford + COVID news. I saw a few results that hinted at some application to election results, but brushed those off as quackery. “Fantastic, let’s scrape together a quick pitch.”
The idea was to probe how valid something like Benford’s Law was on pandemic data. I had found a few other studies that looked specifically at COVID numbers, and there was naturally some debate. I remembered hearing about this law being a tool for checking whether tax numbers and scientific data had been tampered with, and it appeared that there were some important caveats that needed to be observed for this mystical law to hold. In particular, it could help people flag suspicious data, but there were plenty of extenuating circumstances that stopped it from being a foolproof truth-checking tool. The story wasn’t going to be “who’s lying about COVID data?” but instead, “where does Benford’s Law work, and where doesn’t it work?” Seemed like a home run.
Only later did I realize how attractive Benford’s Law could be to conspiracy theorists and purveyors of pseudoscience.
It turns out Benford’s Law is just one of many pieces of math or science that people have glommed on to and used to push weird agendas. For example, one thing that doesn’t conform to Benford’s Law is the 2020 US Presidential Election results. This has been a fact used by a lot of people to claim the election results were tampered with, and has since been refuted quite succinctly and in quite a few places (Physics World and Reuters are two of the first that come up when you google it and both are good reads, though my favorite so far has been from the blog of Jen Golbeck).
Unfortunately, I sent the pitch before I realized this. Now I feel like I must have been put on some sort of list. Do I have to change my name now?
So, what is it about Benford’s Law that makes it such an attractive angle to push a narrative? It’s a seemingly simple result that leads people to believe that they can use it as-stated on Wikipedia without understanding the nuances and necessary conditions for the “law” to be applicable. The most common way you’ll see Benford’s Law stated (and indeed how I started this article) is to marvel at how large a shadow it casts. The huge collection of examples used to show how ubiquitous Benford’s Law is, along with the use of the term “law”, suggests that it is an immutable fact that all things must follow. This might be one of pop-math’s favorite narratives. (I am officially coining the term “pop-math” for mathy facts that people with serious dad-energy would bring up over dinner.)
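One of those conditions, as it is commonly explained, is that the numbers should spread across several orders of magnitude. The sketch below is an illustration with made-up numbers and hypothetical helper names, not an analysis of any real data set. It compares a narrow band of values to a sequence that grows across many orders of magnitude, again assuming the log10(1 + 1/d) form of the law.

```python
import math
from collections import Counter


def benford_expected(d):
    """Benford's predicted probability that the leading digit is d (1-9)."""
    return math.log10(1 + 1 / d)


def leading_digit_shares(values):
    """Fraction of values that start with each digit 1-9."""
    counts = Counter(int(str(v)[0]) for v in values)
    return {d: counts.get(d, 0) / len(values) for d in range(1, 10)}


def total_deviation(values):
    """Sum over digits 1-9 of |observed share - Benford share| (0 = perfect match)."""
    shares = leading_digit_shares(values)
    return sum(abs(shares[d] - benford_expected(d)) for d in range(1, 10))


# Made-up example data (assumptions for illustration only)
narrow = list(range(400, 701))       # every value sits between 400 and 700
wide = [3 ** k for k in range(500)]  # values grow across hundreds of orders of magnitude

print(f"narrow band: {total_deviation(narrow):.2f}")
print(f"wide spread: {total_deviation(wide):.2f}")
```

The narrow band never produces a leading 1, 2, 3, 8 or 9, so it lands far from Benford’s prediction even though nothing about it has been tampered with. A mismatch on its own is a flag for a closer look, not proof of foul play.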
As a trained physicist, there are plenty of science tropes that get my goat. Parallel universes is one that springs to mind (and was discussed in a great piece by Elani Petrakou over on Massive). On one hand, using ideas that we know will catch the public’s imagination is a powerful strategy to make a story POP or to help get a complicated idea across without getting bogged down in details. It’s attractive because it puts real-life science into the context of what most people are far more ready to ingest: fiction.
This brings us back to Benford. It is the perfect confluence of rigor, perceived simplicity, shock value, and to be honest, boringness, that lends itself to a mystical interpretation. And that makes it a prime tool to fool people.
Above I mentioned that Benford’s law seems almost mystical. Well, what’s the difference between science and mysticism? Both try to give us insight into the workings of our world, with the hope of being able to make predictions, explain how things are, or explain how things came to be. The major differences have to do with falsifiability and transparency.
When a claim is “backed by science”, the idea is that the claim can be inspected by another person who can use the same set of facts and techniques to come to the same conclusion. This is how peer review works. Sometimes, people will disagree on how to use a certain technique, or what technique to use to interpret some observations. But that’s ok! Since science includes transparency, claims can be analyzed and the disagreeing parties can identify the details that are at the core of the disagreement.
When I think of mysticism, I think of seers, soothsayers, and prophetic advisors. These characters use their own special blend of science, logic, and nonsense to generate information about the world, like which month’s Ides you should beware of, the details relating to special lamps, and what animal entrails can tell you about the weather. Hindsight lets us make light of a lot of these things, but in many cases these theories were based on extensive observations and represented the state of the scientific arts at the time. The scientific community tends to drop these theories when enough counter-evidence builds up, or when we realize there is a fundamental mistake in the logic underpinning the theory.
At this point of our investigation it seems like mysticism is just outdated science, but I think you’ll agree that this doesn’t quite sound right. I think an important point we need to make is that outdated science falls into disuse in favor of new theories while mysticism doesn’t have to play by the same rules. Mystic explanations don’t need to make reference to largely accepted scientific theories, and they don’t need to be transparent. Mystics are in the possession of techniques and “truths” that the average person does not have access to, and which cannot be confirmed or denied by an outside party with the same facts. This is the main tool of scientific quacks, numerologists, and skeevy Benfordians.
At war with the mystics
In my post from a few weeks ago, I talked about how I think the roles of science communicator and science journalist differ. I think this is a great case study to test my theory and decide how best to tidy up the discussion around Benford’s Law.
In short, I made the claim that science journalists are beholden to the public and exist to hold science accountable to the public. That comes in the form of breaking stories about new studies and fresh results, giving readers an unbiased account of what the study does and does not do. Bonus points are awarded for an entertaining and inspiring journey, but the main goal is to keep people current with what is going on in the world of science.
Science communicators, on the other hand, have a more amorphous role. Of course keeping current with science news is a big part of our job, but so is engaging people who wouldn’t normally look to science news and providing background and explainers to get people up to speed to better understand what’s being told to them by journalists. I made the point of noting that a big part of a lot of scicommers’ work focuses on entertainment as well. (Are you not entertained?!?)
After pacing around my apartment for a few days asking what went wrong such that a weird little math fact would get misused so dramatically, this is what I’ve come up with: I think journalists did their job well. As Benford-based claims popped up, journalists reported that these claims were being made (because journalism is unbiased), then asked experts if the claims held up. The experts said “no, these claims are not valid”. That’s how things should be handled. I think the reason things got out of hand is that the brainspace Benford’s Law takes up for a lot of people is truth that borders on fantasy. We have been happy to present Benford’s Law in clickbaity “one weird trick to find fake data” articles without providing guidance on how the tool is used or how it works. We’ve handed people a toy gun that shoots real bullets.
It’s hard to explain complicated math in a way that gives people a good sense of how it works and what it can and can’t do. Hey, no one said the job of science communicator would be an easy one. At this point there are plenty of articles that go into details about Benford’s Law, especially applied to elections, so we hopefully won’t have the same problem again. But how do we prepare for the next conspiracy-in-science’s-clothing? I don’t know. But I think a good first step is for science communicators to retire tropes that border on fantasy.