SDS PODCAST EPISODE 96 FIVE MINUTE FRIDAY: THE BAYES THEOREM

This is Five Minute Friday episode number 96: The Bayes Theorem.

Welcome everybody back to the SuperDataScience podcast. Super excited to have you on board, and today I would like to discuss something very interesting, the Bayes Theorem and an example of its application. So the Bayes Theorem was developed in the 18th century by Thomas Bayes as a way to include additional evidence in our probability calculation as that evidence comes to be. So it sometimes can be hard to get your head around the Bayes Theorem. There is a somewhat complex-looking formula even though the intuition behind it is very, very simple. And that's why today I wanted to just quickly give an example, which I really enjoyed learning about, of the Bayes Theorem applied in practice to just illustrate what it's all about.

So let's have a look at an example of breathalyser tests. So, for instance, in Australia here, the police are working very efficiently, and we often have these roadblocks, especially on a Friday or a Saturday in the evening, when you're driving, and the police will have blocked off the whole road, and every single car has to pull over and you have to do the breathalyser test so they know if you're drunk or not. So they're looking for drink driving just to make sure that the roads are safe.

And so let's have a look at an example where a group of police officers have set up a roadblock and they're testing every single person regardless of their driving, regardless of what they observe of the driving. They're testing every single person, they have these devices where you just breathe into it, and it says whether or not the person is drunk. This might not be exactly how it is in reality in Australia, but this is the set-up of this specific problem that we're going to look into.

The devices work as follows: they never fail to detect a drunk person. So if somebody is drunk, they will detect them 100% of the time. But at the same time, they have a false positive rate of 5%.

What does that mean? Well, that means that if a person is not drunk, the device might still say that that person is drunk, with a 5% chance. So that is the outline of the problem, and the question is, if somebody was pulled over, and they breathe into this breathalyser, then what is the probability that they are actually drunk? What is the probability that this person is actually drunk? So if indeed they are actually drunk, then the breathalyser will detect them with 100% certainty. But at the same time, if they are not drunk, there is still a 5% chance that the breathalyser will detect them as drunk. So the question is, we have this situation, a person was pulled over, they breathe into the breathalyser, it says that they're drunk. What is the likelihood that they are indeed really drunk, and that they're drink driving?

The first answer that comes to mind is obviously 95%, right? Because there's only a 5% chance of a false positive, so basically the remaining 95% should be the chance that they are drink driving. But that is actually not the case. So this is where the Bayes Theorem comes in. What we need to include is other evidence that we already have about our problem, about our situation. And that will help us refine the final probability.

And the other evidence that we're going to include is the statistics around drink driving in the population in general. So let's say that, on average, we know that about 1 out of 1000 people drink and drive. So if you take 1000 people who are driving at any given point in time, approximately one of them, so 0.1%, is actually going to be drink driving. So now if we take into account this base rate, it's called the base rate, something we knew in advance, something that we know overall about this whole situation, we know that 1 in 1000 people are drink driving at any given point in time. And now we're going to apply that in addition to the knowledge in our problem.
So if we apply that, then what we're going to do is this: let's just take 1000 people. Just visualise this. So we take 1000 people, out of them, 1 is a driver who's driving drunk, and if we breathalyse all those 1000 people, then the breathalyser will pick that person up as drunk because it is 100% accurate for people who are drink driving. So that person will be picked up as a true positive. Now, in the remaining 999 people, when they're breathalysed, 5% of them will still be picked up as if they are drunk, even though they are not, because there is a false positive rate. And 5% of 999 is 49.95. So 49.95 people, on average, will be picked up as false positive results.

And so if we add those two together, in total, the test will pick up the following number of drunk people: 1, which is our true positive, plus 49.95 false positives. And if we add them together, we get 50.95 times out of 1000, the test will say that people are indeed drunk. So now, it's very easy to calculate the probability that somebody is indeed truly drunk, it's basically that 1 person out of the 50.95 is truly drunk, so 1/50.95 gives us approximately 2%. It gives us 0.0196 or so. So that means that in those people that the test will identify as drunk and driving, only 2% of the people will actually be drunk.

How crazy is that. If you think about that for a second, the police blocked off the road, they have these super devices that have 100% accuracy of detecting a drunk person, a 95% accuracy of detecting a non-drunk person, so only a 5% false positive rate, and they're breathalysing people. But even when the device reads that the person is indeed drunk, the actual likelihood of them being drunk is only 2%. How crazy is that! For me, that was a very vivid example of the Bayes Theorem in action, and it's very counterintuitive, but this happens very often in different situations in life.
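The 1000-driver count above can be checked with a few lines of Python. This is only a minimal sketch using the numbers from the episode (base rate of 1 in 1000, 100% detection of drunk drivers, 5% false positive rate). The function name `posterior_given_positive` and the variable names are illustrative assumptions, not anything from the episode.

```python
def posterior_given_positive(prior, sensitivity, false_positive_rate):
    """Return P(drunk | positive test) using Bayes' theorem.

    prior: P(drunk) before the test (the base rate)
    sensitivity: P(positive | drunk)
    false_positive_rate: P(positive | not drunk)
    """
    true_positives = sensitivity * prior
    false_positives = false_positive_rate * (1.0 - prior)
    return true_positives / (true_positives + false_positives)


# Numbers quoted in the episode.
base_rate = 1 / 1000
sensitivity = 1.0
false_positive_rate = 0.05

# Counting version: imagine 1000 drivers stopped at the roadblock.
drivers = 1000
drunk = drivers * base_rate                           # 1 drunk driver
true_pos = drunk * sensitivity                        # 1 true positive
false_pos = (drivers - drunk) * false_positive_rate   # 49.95 false positives
by_counting = true_pos / (true_pos + false_pos)

# Formula version: the same quantity expressed with probabilities.
by_formula = posterior_given_positive(base_rate, sensitivity, false_positive_rate)

print(f"by counting: {by_counting:.4f}")  # prints "by counting: 0.0196"
print(f"by formula:  {by_formula:.4f}")   # prints "by formula:  0.0196"
```

Both routes give the same answer because the counting argument is Bayes' theorem with every probability multiplied by 1000.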
Nate Silver, for example, talks quite a lot about this in his book, "The Signal and the Noise," and if you just look up "Bayes Theorem examples" online, you'll find other examples, but this is a very, very vivid example of that happening in action.

And by the way, that is also the reason why police officers will sometimes ask the potentially drink driving person to breathe into another device, to do a second breathalyser test. Because that's additional evidence. It gives us additional evidence. When they breathe into the second device, our prior information is richer: we're no longer working only from the fact that 1 in 1000 people drink drive, because we already know about this person that they have a 2% likelihood of being a drink driver. So the likelihood went from 0.1% (1 in 1000) to 2% now. But now if we ask them to breathe into another breathalyser, then the probability (and you can do the mathematical calculations on your own, it's pretty straightforward, it's exactly the same as what we did now), the probability goes up to 29% if they breathe twice and both times they're detected as drunk. And then if they breathe in again, if they do another breathalyser test, then the third time the probability goes up to 89%. So as you can see, as we're gathering more evidence, the probability goes up higher and higher and higher, and therefore it shouldn't come as a surprise that police officers sometimes ask people to breathe into the breathalyser more than once.

So there we go, that's how the Bayes Theorem works. I hope you enjoyed the short intro. Once again, there are lots of resources online where you can find additional examples, for instance one of my favourite ones is Julia Galef's YouTube video titled "A Visual Guide to Bayesian Thinking". Very interesting, maybe check that out. And overall, don't forget about this when you're doing work in data science, or even in your day-to-day life.
Remember that sometimes we need to take into account the bigger picture, the prior evidence that we may have about the problem rather than just simply reacting to the evidence that we're presented at a single point in time. It is often the case that there is much more evidence that we can include in our probability calculations.
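The repeated-test figures quoted earlier (about 2%, 29% and 89%) can be reproduced by feeding each result back in as the next prior. The sketch below assumes that every breathalyser test is independent and has the same 100% detection rate and 5% false positive rate; the episode does not state this explicitly, and a real device with a systematic fault would break that assumption. The function name `update` is hypothetical.

```python
def update(prior, sensitivity=1.0, false_positive_rate=0.05):
    """One Bayesian update after a positive test result.

    Assumes each test is independent of the others (an assumption
    made for this sketch, not a claim from the episode).
    """
    numerator = sensitivity * prior
    return numerator / (numerator + false_positive_rate * (1.0 - prior))


belief = 1 / 1000  # start from the base rate: 1 in 1000 drivers is drunk
history = []

for test_number in range(1, 4):
    # The posterior from the previous test becomes the prior for this one.
    belief = update(belief)
    history.append(belief)
    print(f"after positive test {test_number}: {belief:.1%}")

# prints:
# after positive test 1: 2.0%
# after positive test 2: 28.6%
# after positive test 3: 88.9%
```

Each positive result multiplies the odds of being drunk by 20 (100% divided by 5%), which is why three tests are enough to move from 1 in 1000 to roughly 89%.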

So there you go. Hope you enjoyed today's quick introduction to the Bayes Theorem, and I look forward to seeing you here next time. Until then, happy analyzing.