The Bayesian Trap


Picture this: you wake up one morning and you feel a little bit sick. No particular symptoms, just not 100%. So you go to the doctor, and she also doesn’t know what’s going on with you, so she suggests running a battery of tests. After a week goes by, the results come back: it turns out you tested positive for a very rare disease that affects about 0.1% of the population. It’s a nasty disease with horrible consequences; you don’t want it. So you ask the doctor, “You know, how certain is it that I have this disease?” and she says, “Well, the test will correctly identify 99% of people that have the disease and only incorrectly identify 1% of people who don’t have the disease.”
So that sounds pretty bad. I mean, what are the chances that you actually have this disease? I think most people would say 99%, because that’s the accuracy of the test. But that is not actually correct! You need Bayes’ Theorem to get some perspective.
Bayes’ Theorem can give you the probability that some hypothesis (say, that you actually have the disease) is true given an event (that you tested positive for the disease). To calculate this, you take the prior probability that the hypothesis is true, that is, how likely you thought it was that you have this disease before you got the test results, and multiply it by the probability of the event given that the hypothesis is true, that is, the probability that you would test positive if you had the disease. Then you divide that by the total probability of the event occurring, that is, of testing positive. This term is a combination of your probability of having the disease and correctly testing positive, plus your probability of not having the disease and being falsely identified.
The prior probability that a hypothesis is true is often the hardest part of this equation to figure out and, sometimes, it’s no better than a guess. But in this case, a reasonable starting point is the frequency of the disease in the population, so 0.1%.
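The calculation described here can be written out as a small Python sketch. The function and parameter names (`posterior`, `prior`, `p_pos_given_disease`, `p_pos_given_healthy`) are made up for illustration and are not from the video; the numbers are the ones quoted in the transcript (0.1% prevalence, 99% of sick people test positive, 1% of healthy people test positive).

```python
def posterior(prior, p_pos_given_disease, p_pos_given_healthy):
    """Bayes' theorem: P(H|E) = P(E|H) * P(H) / P(E).

    P(E), the total probability of testing positive, is the sum of
    true positives and false positives.
    """
    true_positive = p_pos_given_disease * prior
    false_positive = p_pos_given_healthy * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Numbers from the transcript (hypothetical variable names).
p_disease_given_positive = posterior(
    prior=0.001,               # disease affects about 0.1% of the population
    p_pos_given_disease=0.99,  # test catches 99% of people who have it
    p_pos_given_healthy=0.01,  # test wrongly flags 1% of people who don't
)

print(f"P(disease | positive test) = {p_disease_given_positive:.4f}")
```

The result is a little over 0.09, which is the roughly 9% quoted in the video. It only holds under the stated assumption that the prior is the population frequency.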
And if you plug in the rest of the numbers, you find that you have a 9% chance of actually having the disease after testing positive. Which is incredibly low if you think about it.
Now, this isn’t some sort of crazy magic. It’s actually common sense applied to mathematics. Just think about a sample of 1000 people. Now, one person out of that thousand is likely to actually have the disease, and the test would likely identify them correctly as having the disease. But out of the 999 other people, 1%, or about 10 people, would falsely be identified as having the disease. So, if you’re one of those people who has a positive test result and everyone’s just selected at random, well, you’re actually part of a group of 11 where only one person has the disease. So your chances of actually having it are 1 in 11, about 9%. It just makes sense.
When Bayes first came up with this theorem, he didn’t actually think it was revolutionary. He didn’t even think it was worthy of publication. He didn’t submit it to the Royal Society, of which he was a member, and in fact it was discovered in his papers after he died; he had abandoned it for more than a decade. His relatives asked his friend, Richard Price, to dig through his papers and see if there was anything worth publishing in there. And that’s where Price discovered what we now know as the origins of Bayes’ Theorem.
Bayes originally considered a thought experiment where he was sitting with his back to a perfectly flat, perfectly square table, and then he would ask an assistant to throw a ball onto the table. Now this ball could obviously land and end up anywhere on the table, and he wanted to figure out where it was. So what he asked his assistant to do was to throw on another ball and then tell him if it landed to the left or to the right, in front of or behind the first ball, and he would note that down and then ask for more and more balls to be thrown on the table.
What he realized was that through this method he could keep updating his idea of where the first ball was. Now of course, he would never be completely certain, but with each new piece of evidence, he would get more and more accurate, and that’s how Bayes saw the world. It wasn’t that he thought the world was not determined, that reality didn’t quite exist, but it was that we couldn’t know it perfectly, and all we could hope to do was update our understanding as more and more evidence became available.
When Richard Price introduced Bayes’ Theorem, he made an analogy to a man coming out of a cave. Maybe he’d lived his whole life in there, and he saw the Sun rise for the first time and kind of thought to himself: “Is this a one-off, is this a quirk, or does the Sun always do this?” And then, every day after that, as the Sun rose again, he could get a little bit more confident that, well, that was the way the world works. So Bayes’ Theorem wasn’t really a formula intended to be used just once; it was intended to be used multiple times, each time gaining new evidence and updating your probability that something is true.
So if we go back to the first example, when you tested positive for a disease, what would happen if you went to another doctor, got a second opinion and had that test run again, but let’s say by a different lab, just to be sure that those tests are independent, and let’s say that test also comes back as positive? Now what is the probability that you actually have the disease? Well, you can use Bayes’ formula again, except this time, for your prior probability that you have the disease, you have to put in the posterior probability, the probability that we worked out before, which is 9%, because you’ve already had one positive test. If you crunch those numbers, the new probability based on two positive tests is 91%. There’s a 91% chance that you actually have the disease, which kind of makes sense.
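The repeated updating described above, where the posterior after one test becomes the prior for the next, can be sketched in Python as follows. The names (`update`, `sensitivity`, `false_positive_rate`, `history`) are illustrative assumptions, and the sketch assumes, as the transcript does, that the two tests are independent and have the same error rates.

```python
def update(prior, sensitivity=0.99, false_positive_rate=0.01):
    """One Bayesian update after a positive test result."""
    true_positive = sensitivity * prior
    false_positive = false_positive_rate * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


belief = 0.001  # start from the population frequency, 0.1%
history = []

for test_number in (1, 2):
    # Yesterday's posterior is today's prior.
    belief = update(belief)
    history.append(belief)
    print(f"after positive test {test_number}: {belief:.4f}")
```

The first pass gives roughly 0.09 and the second roughly 0.91, matching the 9% and 91% in the transcript. If the tests were not independent (for example, if the same quirk caused both false positives), the second update would not be justified in this form.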
Two positive results by different labs are unlikely to just be chance, but you’ll notice that probability is still not as high as the reported accuracy of the test.
Bayes’ Theorem has found a number of practical applications, including, notably, filtering your spam. You know, traditional spam filters actually do a kind of bad job: there are too many false positives, too much of your email ends up in spam. But using a Bayesian filter, you can look at the various words that appear in e-mails and use Bayes’ Theorem to give a probability that the email is spam, given that those words appear.
Now, Bayes’ Theorem tells us how to update our beliefs in light of new evidence, but it can’t tell us how to set our prior beliefs, and so it’s possible for some people to hold that certain things are true with 100% certainty, and other people to hold that those same things are true with 0% certainty. What Bayes’ Theorem shows us is that in those cases, there is absolutely no evidence, nothing anyone could do, to change their minds. And so, as Nate Silver points out in his book, The Signal and the Noise, we should probably not have debates between people with a 100% prior certainty and a 0% prior certainty, because, well really, they’ll never convince each other of anything.
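The point about 100% and 0% priors follows directly from the formula, and a short Python sketch makes it concrete. Everything here is a made-up illustration: the helper names (`bayes_update`, `repeat_update`) and the 0.99 / 0.01 likelihoods are assumptions chosen to mirror the test in the earlier example, not anything stated in the video.

```python
def bayes_update(prior, p_e_given_h, p_e_given_not_h):
    """Posterior probability of hypothesis H after seeing evidence E."""
    numerator = p_e_given_h * prior
    evidence = numerator + p_e_given_not_h * (1.0 - prior)
    return numerator / evidence


def repeat_update(prior, p_e_given_h, p_e_given_not_h, times):
    """Apply the same piece of evidence several times in a row."""
    belief = prior
    for _ in range(times):
        belief = bayes_update(belief, p_e_given_h, p_e_given_not_h)
    return belief


# Evidence that is 99 times more likely if H is false, seen three times.
certain_yes = repeat_update(1.0, 0.01, 0.99, times=3)
almost_sure = repeat_update(0.999, 0.01, 0.99, times=3)

# Evidence that is 99 times more likely if H is true, seen three times.
certain_no = repeat_update(0.0, 0.99, 0.01, times=3)

print(certain_yes)  # stays at 1.0: the (1 - prior) term is always zero
print(certain_no)   # stays at 0.0: the numerator is always zero
print(almost_sure)  # drops sharply, because 0.999 still leaves room to update
```

A prior of exactly 1 or 0 zeroes out one side of the formula, so no amount of evidence moves it, while a prior of 0.999 collapses after three strongly contrary observations.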
Most of the time when people talk about Bayes’ Theorem, they discuss how counterintuitive it is and how we don’t really have an inbuilt sense of it. But recently my concern has been the opposite: that maybe we’re too good at internalizing the thinking behind Bayes’ Theorem. The reason I’m worried about that is because I think in life we can get used to particular circumstances. We can get used to results, maybe getting rejected or failing at something or getting paid a low wage, and we can internalize that as though we are that man emerging from the cave: we see the Sun rise every day, and we keep updating our beliefs to a point of near certainty that that is basically the way that nature is, it’s the way the world is, and there’s nothing that we can do to change it.
You know, there’s Nelson Mandela’s quote that ‘Everything is impossible until it’s done’, and I think that is kind of a very Bayesian viewpoint on the world. If you have no instances of something happening, then what is your prior for that event? It will seem completely impossible; your prior may be 0 until it actually happens.
You know, the thing we forget in Bayes’ Theorem is that our actions play a role in determining outcomes, and in determining how true things actually are. But if we internalize that something is true, and maybe we’re 100% sure that it’s true and there’s nothing we can do to change it, well, then we’re going to keep on doing the same thing, and we’re going to keep on getting the same result. It’s a self-fulfilling prophecy. So I think a really good understanding of Bayes’ Theorem implies that experimentation is essential. If you’ve been doing the same thing for a long time and getting the same result that you’re not necessarily happy with, maybe it’s time to change. So is there something like that that you’ve been thinking about? If so, let me know in the comments.
Hey, this episode of Veritasium was supported in part by viewers like you on Patreon and by Audible. Audible is a leading provider of spoken audio information, including an unmatched selection of audiobooks, original programming, news, comedy and more. So if you’re thinking about trying something new and you haven’t tried Audible yet, you should give them a shot, and for viewers of this channel, they offer a free 30-day trial just by going to: audible.com/Veritasium
You know, the book I’ve been listening to on Audible recently is called ‘The Theory That Would Not Die’ by Sharon Bertsch McGrayne, and it is an incredible in-depth look at Bayes’ Theorem. I’ve learned a lot just listening to this book, including the crazy fact that Bayes never came up with the mathematical formulation of his rule; that was done independently by the mathematician Pierre-Simon Laplace. So, really, I think he deserves a lot of credit for this theory, but Bayes gets naming rights because he was first. If you want, you can download this book and listen to it, as I have, when I’ve just been driving in the car or going to the gym, which I’m doing again. And so if there’s a part of your day that you feel is kind of boring, then I can highly recommend trying out audiobooks from Audible. Just go to: audible.com/Veritasium
So as always, I want to thank Audible for supporting me, and I want to thank you for watching.

100 Replies to “The Bayesian Trap”

  1. Near the end of the video, it was suggested that if no instances of a certain event have yet occurred, you should assign 0 as the a priori probability for that event. That's not correct; see Good, Turing et al. on the likelihood for rare species.

  2. You seem to be assuming that the error of the blood test is random. If not, the numbers change drastically, e.g. you might have a particular rare gene sequence which will always lead to a false result.

  3. I have to reject the facile Mandela quote. If something hasn't happened before, that does NOT mean it's impossible. The rest of the talk was pretty good though.

  4. 2:18 But doesn't that presume that all 1000 people will be tested for the disease, which is not the case? Only the people who feel the possible symptoms of it will take the test, which drastically reduces the amount of false positives and thus drastically increases the probability of your positive not being false, right?

  5. I never did understand Bayes' theorem until now…and I still don't, but I'm thinking the odds are I will if I watch this about 3 more times.

  6. Well, Yes!

    And that's what moved me towards where I'm at now, Norway 😳 there's much of my upbringing I'd love to record but what's relevant to the closing remarks here is that I had to break from the ubiquitous practice of the sprawl and unnecessary consumption.
    But credit must be given to my parents for a perhaps unintentional springboard of customarily dressing me in hand me downs and my Dad's refusal to our soda requests. We rarely ate out that it was never a question of being able to suggest eating out. They relaxed as they became more comfortable w their household income/savings… By then my oldest sister had taught me to record my own meager savings from chore-completing-allowance and prior to that my Mormor established a tight loving bond with me irrespective of her tiny apartment and having to travel through town by foot or bus.

    Here in Oslo, as strange and unfair some of the folk can be here, I adore not having to rely on using a car to get across town. There's variety in things to do and sights to appreciate. I feel content, although quite short of flair compared to living in LA.
    I'm content bc I'm not stuck witnessing 10s of thousands of people at a time:
    commuting long distances,
    in long lines buying crap they shouldn't buy that's overly packaged or a long term health expense on taxes payers to come,
    constantly selling each other poor lifestyle habits (like excess air travel without concerning themselves with countering their carbon impact by means of reforestation or not eating meat as much as possible).
    I can access into nature without harming it here.

    If only all cities were as well designed.
    There are easily incredibly better designed than here, too! But people need to demand it as consumers and municipalities.
    It feels too big of a problem but I pray the powers that be are heading towards such goals. Other than do what I can for my own life what else could I do to influence others?

    Every time I make a suggestion to the urban ag community I'd like to contribute to they take the idea without asking me how to do it or without giving credit. Which might seem nice but it's slightly frustrating when they don't bother to thoroughly understand my vision and seek having me Work towards developing my concepts with others. But I'm also a little new here and whatever skills I need to develop I'd like to…….. whatever those may be 😐
    I'm sleepy 😴

  7. Did Veritasium just call for the proletariat to rise up against the bourgeoisie to change the system?

  8. I've wanted to stream and record for YouTube for years now. This video inspired me to stop just talking about trying, and to just do it.

  9. I wonder how related Bayesian statistics and machine learning are. A machine learning algorithm would also only start rationalising the sun rising on the opening of a cave after a certain amount of reoccurrences. Further, a machine learning algorithm can make estimates for non-observed scenarios, but heavily relies on the amount and quality of data (or experiences being made).
    I would like to hear your thoughts on that in a future episode.

  10. Tomorrow, atoms from a local supernova could hit earth and completely destroy our atmosphere – killing every one of us. Observations are nice, but there are always unknowns

  11. i = probability of someone having something interesting to say
    c = the probability of people having a camera
    f = the probability of someone being in a field
    t = the probability of someone talking
    f*c*i*t = the probability of someone talking in a field to a camera being interesting

  12. i think everyone understands bayes theorem and applies it with every belief they have, but most do not realize it can be expressed mathematically and therefore do not understand that they should dive headfirst into artificial intelligence and make a billion dollars in the next decade

  13. Okay, maybe someone can explain the example better because he basically disregards the 99% accuracy in favor of the .1% of the population and 1% false positive. 1% of the population having the disease really has no impact on the accuracy of the test so how does this calculation work?

  14. I actually am really glad I found this video. I read the comments and found that a little of the maths may possibly be wrong but what the hell do i know? I’m a student and I’ve been studying really hard and for some reason I can’t seem to stay focused and my results have reflected that and I thought it was because well I dunno why but I guess after watching this video I can understand that perhaps I should change my ways and not do more.

    Fantastic videos! Very interesting and educational. I think you do an amazing job and you keep up the good work.

  15. But the issue here is that it's highly likely your doctor has ordered this test due to a certain complaint, and that would render your calculations with Bayes' theorem completely useless. It may not be a 99% chance, but in real life scenarios it would probably be a lot closer to that than 9%.

  16. When I did the math, the result is not exactly 1/11 (0.0909090909); it is 0.09016393. I double checked it. But why?

  17. I recently had an experience that exemplifies how you wrapped up the video. I went to school for Computer Science, but my real passion is Aerospace Engineering. I searched for about two years to find a job, and pored through hundreds of job listings and submitted dozens of applications to find anything that would interest me more than the job I got right out of college. However, none of that statistical improbability mattered. It only took one company giving me just one chance to prove myself. Statistics mean very little when it comes to opportunities like that. Models are great and can help inform how we view the world around us, but none of our models are perfect. And if you assume that the next application, the next date, the next time you try might very well be the one that changes your life forever, you may find yourself living a one-in-a-billion life. And if it doesn't work out and you get one more rejection, you very well might still be one more attempt away from your big break.

  18. You are right that the prior probability is the most challenging to assign. That is where your assumption is wrong, too.

    The 9% is true only if you assumed the prior probability equal to the frequency of the disease in the general population.
    However, this probability is not what you assume before ordering the test.
    If a physician assumed the patient has the disease with the same chance as anyone else, there would (apart from general screening) be no reason to test for the disease.

    As soon as symptoms trigger a presentation of the patient to a doctor, we are talking about a subpopulation.
    When the doctor starts suspecting something on basis of history and clinical examination, the subpopulation is even more specific.

    In your case, the correct prior P(H) would not be a background frequency of disease but:
    "The frequency of the disease in self-presenting patient population that has been ordered the test." (Had it been possible, the granularity to the degree of symptom set would have been useful, too.)

    Otherwise one great talk, thanks!

  19. This clearly mixes personal beliefs with an unshakeable mathematical theorem. The discussion here is not in any way about the Bayes theorem as such, which is a natural consequence of the probability theorem, but rather a critique of the impossibility to build up a correct state space to model the reality. All models are wrong, but some are useful, as said some economist (and hopefully he did not die after that sentence, for it would not resonate well with its message). Best wishes.

  20. So you've taken the test that is 99% accurate and its turned out positive and that means there is a 9% chance you have the disease? After doing it twice 91% sure… guess I'm dumb because that makes no sense to me.

  21. Your comments toward the end were truly inspiring. I had never thought about in such a mathematical way.

  22. It's very important that the 91% of a true positive after two positive tests is true only if the tests are independent, which is usually not the case. Otherwise, the posterior probability for having the disease after two positive tests could be as low as 9%.

  23. Can u please explain bayes theorem as a reply to this comment to prove that extraordinary claims require extra ordinary evidence.

  24. The assumption at 2:04 is basically assuming that the clinical judgement of the doctor is worthless, that is to say there is no special reason why you were given the test in the first place. If that were true, you need to see a different doctor. This is the real trap of Bayes Theorem: flawed assumptions. It's also the trap of ideology. 😉

  25. I think the probability is 11/111 ~ 0.099%
    Edit: I forgot to multiply by 100 so it is actually 9%, in fact I came up with the same method

  26. I took a Coursera course on Bayesian Statistics last year and I have a better understanding of Bayes theorem after this video than after taking that course.

  27. Besides the great breakdown of the Bayes Theorem, you made a great philosophical point here. Reminds me of Gödel's incompleteness theorem, but in philosophy of mind.

  28. At around 3:22, the aerial shots were disturbing. I lost concentration on the verbal narrative and had to go back to watch it!

  29. Thanks for sharing this great thought. According to my limited amount of knowledge in psychology, there is a term called "learned helplessness" that could be one of the Bayesian Trap this video describes. It is a self-enhancing process of believing that one is incapable of doing something and failed again and again and again, and that failure fulfills one's image of incapability. The reason of this trap is probably because for every event happening, they are closely correlated with the previous events instead of "independent". So one will become even less willing to try hard or try a different way. The point to avoid or at least become aware of this trap is to reflect on whether I am just inheriting the old belief of incapability from the past failures or I have learned lessons from the past failures so as not to make the same mistakes again. "Independent" means trying things differently and not making the same mistakes again as best as one could.

  30. Doesn't the false positive rate for the test also apply to people who are tested yet feel perfectly fine and have no reason to go to the doctor? If so, does that mean the first prior is off since this person specifically went to the doctor because they felt unwell?

  31. Did anyone even notice that the calculation of the probability of the person having the disease at the start of the video was wrong …as he put .001% instead of .01% and that's y he got a 9% probability and not a 90%

  32. must admit I was confused af at 2:00 (when substitution happened)
    so given only
    P(H) = 0.001
    P(E|H) = 0.99
    we end up substituting 0.01 for P(E|-H) leaving to false impression that P(E|H) + P(E|-H) = 1 (like in P(E, H) + P(E, -H) case)
    so at the end of day, I must conclude that P(E|-H) was given (though its really hard to catch)
    and that's bad example of oversimplification (trying to have nice numbers that hide complexity)

  33. I can't believe how good the quality of his content is. Connecting something I learnt in school, like Bayes' theorem, to reality and how it can change our perspective on real life #MindBlown

  34. But Bayes' theorem comes to mind like an intuition without any thinking involved if you have even a fair enough Mathematical IQ. I think it getting published was more like nobody else cared to publish that water is wet so Mr. James got it published. And since it's a theorem … it's not required to have gigantic proofs that a Law requires. It's just common sense proven with example. I don't know… I don't know why we have it formalised as a theorem… I find Pythagoras theorem more non-intuitive even to a 15-year-old school student than Bayes' theorem.

  35. The comment from ‘The signal and the noise’ is interesting but problematic. People might say they’re 100% certain about x but how certain are they really about their certainty? How certain can we be about the certainty they claim? We can’t, hence we argue.

  36. With or without Bayesian reasoning, it's 100% certain that an "unexpected" catastrophe will affect the planet.
    By selecting some specific causal factor like Nuclear First Strike, or Global Warming / Climate Change etc, the preparation for defense against consequences, like moving to Mars or Moon, is ridiculously expensive and is an inappropriate distraction.
    Muddling about with Mathematical Abstractions is good for entertainment and not much else in the face of certainty, as is the topic of the video?

  37. This was a dope video for me because I just started back to school to pursue an engineering career after almost 10 years in the auto field. I was learning about electricity and came to Youtube because i was curious about how exactly we determine the number of particles in an atom. Watched a couple videos, got what I was looking for and decided to watch this from the suggested row. It correlates with my recent life changes perfectly and I said all that just to say thank you. I've been hesitant and nervous about "starting over" but this really helped me. In so many ways. If i keep typing this comment will be a book. Thank you again

  38. Oh my god, I still am not clear about Bayes' Theorem but I'm much more positive about getting to understand it. Thanks.

  39. Thanks so much for this video, your explanation really helped me understand Bayes' Theorem. Much more confident for my test/exam now!
