Bayes’ Theorem: What’s All The Fuss About?

Bayes’ theorem has been called a powerful method for creating new knowledge, but it can also be used to advertise superstition and pseudoscience.


Why is Bayes’ theorem so popular?

Bayes’ Theorem became so popular that it was even featured on the TV show The Big Bang Theory. But, like any tool, it can be used for good or bad.

I don’t know exactly when I first heard of it. But it was only in the last ten years that I really took an interest, after some of my nerdiest students began touting it as a magical guide to life.

The students’ raving confused me, as did the explanations of the theorem on Wikipedia and other sites: they were either hopelessly dumbed down or too complicated. I decided Bayes was a passing fad and that deep research wasn’t worth the effort. But by now, Bayesian fever has become too pervasive to ignore.

According to The New York Times, Bayesian statistics “permeates everything from physics to cancer research, from ecology to psychology.” Physicists have proposed Bayesian interpretations of quantum mechanics and Bayesian defenses of string theory and multiverse theory. Philosophers argue that science as a whole can be viewed as a Bayesian process, and that Bayes distinguishes science from pseudoscience better than the falsifiability criterion popularized by Karl Popper.

Artificial intelligence researchers, including the developers of Google’s self-driving cars, use Bayesian software to help machines recognize patterns and make decisions. Bayesian programs, according to Sharon Bertsch McGrayne, author of a popular history of Bayes’ theorem, “sort email and spam, assess medical and security risks, decode DNA, and more.” The physicist John Mather worries that Bayesian machines may become so smart that they supplant humans.

Cognitive scientists suggest that Bayesian algorithms are at work in our brains when we sense, think, and make decisions. In November, scientists and philosophers explored this possibility at a conference at New York University called “Is the Brain Bayesian?”

Fanatics insist that if more people adopted Bayesian thinking (instead of the unconscious Bayesian work that supposedly goes on in the brain), the world would be much better.

What is it really about?

With all the hype, I tried to come to grips with Bayes once and for all. Among the countless explanations of the theorem on the Internet, the best I found were by Eliezer Yudkowsky, on Wikipedia, and in the writings of the philosopher Curtis Brown and the computer scientists Oscar Bonilla and Kalid Azad. Now I will try, mainly for my own benefit, to explain the essence of the theorem.

Bayes’ Theorem is a method for calculating the validity of beliefs (hypotheses, statements, suggestions) based on the available evidence (observations, data, information).

The simplest version is:

original belief + new evidence = new, improved belief

More specifically: the probability that a belief is true given new evidence equals the probability that the belief was true before the evidence, multiplied by the probability of the evidence if the belief is true, divided by the probability of the evidence regardless of whether the belief is true. Clear?

A simple mathematical formula looks like this:

P(B|E) = P(B) × P(E|B) / P(E)

Where P is a probability, B is a belief, and E is evidence. P(B) is the probability that B is true, and P(E) is the probability that E is true. P(B|E) is the probability of B given that E is true, and P(E|B) is the probability of E given that B is true.
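
The formula fits in a one-line function. Here is a minimal sketch in Python (the function and variable names are mine, chosen to mirror the notation above, not any standard library API):

```python
def posterior(p_b, p_e_given_b, p_e):
    """Bayes' theorem: P(B|E) = P(B) * P(E|B) / P(E).

    p_b        -- prior probability that the belief B is true
    p_e_given_b -- probability of the evidence E if B is true
    p_e        -- overall probability of the evidence E
    """
    return p_b * p_e_given_b / p_e

# A perfectly reliable test for a condition with 1% prevalence:
# P(B) = 0.01, P(E|B) = 1.0, P(E) = 0.01, so the posterior is 1.0
print(posterior(0.01, 1.0, 0.01))  # 1.0
```

The only subtlety is P(E) in the denominator, which must account for every way the evidence could arise; the worked example below shows how to compute it.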

Real-world example

To demonstrate how the formula works, medical testing is the standard example. Suppose you are tested for a cancer that affects 1% of people your age. If the test were 100% reliable, you wouldn’t need Bayes’ theorem to figure out what a positive result means, but let’s walk through that situation anyway as a warm-up.

To calculate P(B|E), you plug the data into the right side of the equation. P(B), the probability that you have cancer before testing, is 1%, or 0.01. So is P(E), the probability that the test comes back positive. Since they appear in the numerator and the denominator, they cancel out, leaving P(B|E) = P(E|B) = 1. If your test is positive, you have cancer, and vice versa.

In the real world, tests are rarely 100% reliable. Let’s say yours is 99% reliable: 99 out of 100 people with cancer get a positive result, and 99 out of 100 healthy people get a negative one. That is still a remarkably good test. Here is the question: if your result is positive, what is the probability that you have cancer?

Now Bayes’ theorem shows its full power. Most people will think the answer is 99% or something like that. The test is so reliable, right? But the correct answer is only 50%.

To see why, plug your data into the right side of the equation. P(B) is still 0.01. P(E|B), the probability of a positive test if you have cancer, is 0.99.

P(B) × P(E|B) = 0.01 × 0.99 = 0.0099. This is the probability of a true positive: you have cancer and the test detects it.

What about the denominator, P(E)? Here lies a subtlety. P(E) is the probability of a positive test regardless of whether you are sick. In other words, it includes false positives as well as true positives.

To calculate the probability of a false positive, multiply the false-positive rate, 1%, or 0.01, by the fraction of people who do not have cancer, 0.99. The result is 0.0099. Yes, your excellent 99%-accurate test produces as many false positives as true ones.

Let’s finish the calculation. To get P(E), add the true and false positives: 0.0099 + 0.0099 = 0.0198. Divide the numerator, 0.0099, by 0.0198 and you get 0.5 (!).

So P(B|E), the probability that you have cancer given a positive test, is 50%.
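
The whole calculation fits in a few lines. Here is a sketch in Python using the numbers from this example (the variable names are mine):

```python
prior = 0.01        # P(B): 1% of people this age have cancer
sensitivity = 0.99  # P(E|B): positive result given cancer
specificity = 0.99  # negative result given no cancer

true_pos = prior * sensitivity               # 0.01 * 0.99 = 0.0099
false_pos = (1 - prior) * (1 - specificity)  # 0.99 * 0.01 = 0.0099
p_e = true_pos + false_pos                   # P(E) = 0.0198

p_cancer_given_pos = true_pos / p_e          # P(B|E)
print(p_cancer_given_pos)  # 0.5
```

Writing it out this way makes the surprise concrete: the rare-disease prior shrinks the true-positive term until it exactly matches the false-positive term, and the ratio lands at one half.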

Taking the test again can drastically reduce the uncertainty, since your prior probability of having cancer, P(B), is now 50% rather than 1%. If the second test also comes back positive, Bayes’ theorem puts the probability that you have cancer at 99%, or 0.99. As this example shows, applying the theorem repeatedly can yield very precise answers.

But if the test is 90% reliable, which is not bad at all, the chance that you have cancer, even after two positive results, is still less than 50%.
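
Repeated testing is just repeated application of the theorem, with each posterior becoming the next prior. Here is a sketch of both scenarios, assuming (as the text does) that the false-positive and false-negative rates of a test are equal:

```python
def update(prior, accuracy):
    """One Bayesian update for a test whose sensitivity and
    specificity both equal `accuracy`."""
    true_pos = prior * accuracy
    false_pos = (1 - prior) * (1 - accuracy)
    return true_pos / (true_pos + false_pos)

p = 0.01                     # 1% prior prevalence
for _ in range(2):           # two positive results, 99%-reliable test
    p = update(p, 0.99)
print(round(p, 2))           # 0.99

p = 0.01
for _ in range(2):           # two positive results, 90%-reliable test
    p = update(p, 0.90)
print(round(p, 2))           # 0.45 -- still below 50%
```

The 90% test illustrates how stubborn a 1% prior is: the first positive only lifts the probability to about 8%, and the second to 45%.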

Most people, including doctors, have a hard time understanding these odds, which helps explain the rampant overdiagnosis and overtreatment of cancer and other diseases. This example suggests the Bayesians are right: the world would be a better place if more people, or at least more patients and doctors, adopted Bayesian reasoning.

On the other hand, Bayes’ theorem is just codified common sense. As Yudkowsky writes toward the end of his tutorial:

“By this point, Bayes’ theorem may seem quite obvious and even tautological, rather than surprising and new. If so, this introduction has achieved its purpose.”

Going back to the cancer example: Bayes’ Theorem says that the probability of having cancer if your test is positive is the probability of getting a true positive divided by the probability of all positives, true and false. In general, beware of false positives.

Here is my generalization of the principle: the credibility of your belief depends on how well it, and it alone, explains the available facts. The more alternative explanations for those facts there are, the less credible your particular belief. That, to my mind, is the essence of the theorem.

Alternative explanations

“Alternative explanations” cover a lot of ground. Your facts may be wrong: obtained with a faulty instrument, analyzed incorrectly, skewed toward a desired result, or outright fabricated. Or your facts may be sound, yet consistent with many beliefs and hypotheses besides your own.

In other words, there is no magic in Bayes’ theorem. It comes down to this: your belief is only as valid as the evidence behind it. If you have solid evidence, the theorem yields solid results. If the evidence is shaky, the theorem won’t save you. Garbage in, garbage out.

Theorem problems can start with P (B), an initial assumption about the probability of your beliefs, often called the prior probability. In the example above, we had a nice and accurate prior probability of 0.01. In the real world, experts argue over how to diagnose and treat cancer. Your prior probability is likely to be a range, not a single number.

In many cases, the prior is pure guesswork, which lets subjective factors creep into the calculation. You may be guessing at the probability of something that, unlike cancer, may not exist at all: strings, the multiverse, inflation, or God. You may cite dubious evidence for a dubious belief. In such cases, Bayes’ theorem can promote pseudoscience and superstition just as readily as common sense.

The theorem carries a moral: if you do not scrupulously seek alternative explanations for the available evidence, the evidence will merely confirm what you already believe. Scientists often overlook this, which helps explain why so many scientific claims turn out to be wrong. Bayesians claim their methods can help scientists overcome confirmation bias and produce more reliable results, but I doubt it.

As I mentioned, some string theory and multiverse enthusiasts use Bayesian analysis. Why? Because the enthusiasts are tired of hearing that string theory and multiverse theory are not falsifiable and therefore unscientific. Bayes’ Theorem allows them to present these theories in the best possible light. In these cases, the theorem does not eliminate the bias but indulges it.

And yet I admire Bayes’ theorem. It reminds me of the theory of evolution, another idea that seems tautologically simple or depressingly deep, depending on the point of view, and in the same way inspired people for all sorts of nonsense and for amazing discoveries.

Perhaps because my brain works according to Bayes, I’m starting to see allusions to this theorem everywhere. Leafing through the collected works of Edgar Allan Poe on his Kindle, I came across the following sentence from “The Narrative of Arthur Gordon Pym of Nantucket”: “By virtue of our biases or prejudices we are not able to extract lessons from even the most obvious things”.

Consider this before signing up to be a Bayesian.



Written by

Bioinformatician at Oncobox Inc. (@oncobox). Research Associate at Moscow Institute of Physics and Technology (@mipt_eng).
