Conditional Probability and Bayes
Chapter 2 Lecture 7
Yiren Ding
Shanghai Qibao Dwight High School
March 15, 2016
Outline
1 Bayes' Theorem
2 Application in Judicial Decisions
   Probability, misleading
   Search and Rescue
   Tea House Scam
   Gracia Murder Mystery
3 The O.J. Simpson Murder Case
   Bayesian Analysis
4 The Lost Diamond
   Intuitive Defense
Bayes' Theorem: Introduction

Suppose in a court case, the defendant is either guilty ($H$, the hypothesis is true) or not guilty ($\bar{H}$). Before any evidence is presented, we can assign the prior probabilities $P(H)$ and $P(\bar{H}) = 1 - P(H)$.

The essential question we must ask is: how do the prior probabilities change once evidence (event $E$) is revealed? For instance, the event $E$ could be the evidence that the accused's fingerprint is found at the crime scene, or that the accused has the same blood type as the perpetrator.

The updated value of the probability is called the posterior probability, denoted by $P(H \mid E)$, and can be expressed via the standard form of Bayes' theorem:

$$P(H \mid E) = \frac{P(HE)}{P(E)} = \frac{P(E \mid H)P(H)}{P(E \mid H)P(H) + P(E \mid \bar{H})P(\bar{H})}$$
Theorem 1 (Bayes' Theorem in Odds Form). The posterior probability $P(H \mid E)$ satisfies

$$\frac{P(H \mid E)}{P(\bar{H} \mid E)} = \frac{P(H)}{P(\bar{H})} \times \frac{P(E \mid H)}{P(E \mid \bar{H})}.$$

In words, Bayes' theorem in odds form states that

posterior odds = prior odds × likelihood ratio.

The theorem follows easily by applying the definition of conditional probability to $P(H \mid E)$ and $P(\bar{H} \mid E)$, but its implication is profound.

The factor $\frac{P(H)}{P(\bar{H})}$ gives the prior odds in favor of the hypothesis $H$ before the evidence has been presented.

The factor $\frac{P(E \mid H)}{P(E \mid \bar{H})}$ gives the likelihood ratio, or the Bayes factor, which represents the impact the evidence will have on the hypothesis.
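The derivation of the odds form can be written out in a few lines. This is a sketch that assumes $P(E) > 0$ and $0 < P(H) < 1$, so that every ratio below is defined:

```latex
\begin{align*}
P(H \mid E) &= \frac{P(HE)}{P(E)} = \frac{P(E \mid H)\,P(H)}{P(E)}, \\
P(\bar{H} \mid E) &= \frac{P(\bar{H}E)}{P(E)} = \frac{P(E \mid \bar{H})\,P(\bar{H})}{P(E)}, \\
\frac{P(H \mid E)}{P(\bar{H} \mid E)}
  &= \frac{P(E \mid H)\,P(H)}{P(E \mid \bar{H})\,P(\bar{H})}
   = \frac{P(H)}{P(\bar{H})} \times \frac{P(E \mid H)}{P(E \mid \bar{H})}.
\end{align*}
```

The common factor $P(E)$ cancels when the first line is divided by the second, which is why the odds form never requires computing $P(E)$.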
Do not just memorize this theorem, but understand it first:

$$\frac{P(H \mid E)}{P(\bar{H} \mid E)} = \frac{P(H)}{P(\bar{H})} \times \frac{P(E \mid H)}{P(E \mid \bar{H})}.$$

The Bayes factor is large when $P(E \mid H)$ is large compared to $P(E \mid \bar{H})$. This means that the evidence is much more likely to be observed when the hypothesis is true than when it is false.

Moreover, with any two pieces of evidence $E_1$ and $E_2$, this theorem can be used iteratively:

$$\frac{P(H \mid E_1E_2)}{P(\bar{H} \mid E_1E_2)} = \frac{P(H)}{P(\bar{H})} \times \frac{P(E_1 \mid H)}{P(E_1 \mid \bar{H})} \times \frac{P(E_2 \mid HE_1)}{P(E_2 \mid \bar{H}E_1)}.$$

Here lies the real power of Bayes' theorem. You should be crying right now at the beauty of probability theory!
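The iterative use of the odds form can be sketched in a few lines of Python. The function names `posterior_odds` and `odds_to_probability` and the numbers in the usage lines are hypothetical, chosen only for illustration; each likelihood ratio passed in is assumed to be already conditioned on the evidence that came before it, as in the formula above.

```python
from fractions import Fraction


def posterior_odds(prior_odds, *likelihood_ratios):
    """Multiply the prior odds by each likelihood ratio in turn.

    Each ratio must already be conditioned on the earlier evidence,
    e.g. the second one is P(E2 | H E1) / P(E2 | not-H E1).
    """
    odds = Fraction(prior_odds)
    for ratio in likelihood_ratios:
        odds *= Fraction(ratio)
    return odds


def odds_to_probability(odds):
    """Convert odds a : b (given as the number a/b) to the probability a/(a+b)."""
    return odds / (1 + odds)


# Hypothetical illustration: prior odds 1 : 4, then two pieces of
# evidence with likelihood ratios 3 and 2.
odds = posterior_odds(Fraction(1, 4), 3, 2)
print(odds, odds_to_probability(odds))
```

Exact fractions are used here so that odds such as 2 : 3 do not pick up floating-point rounding.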
Application in Judicial Decisions: Applying Bayes' Theorem in Court

In judicial decision making, the prior probability $P(H)$ represents the personal opinion of the court before the evidence is presented. The Bayes factor is often determined by an expert, who sometimes mistakes $P(H \mid E)$ for $P(E \mid H)$!

A classic example is the famous court case of People vs. Collins in Los Angeles in 1964. In this case, a couple matching the description of a couple that had committed an armed robbery was arrested. Based on expert testimony, the district attorney claimed that the frequency of couples matching the description was roughly 1 in 12 million.

Although this was an estimate of $P(E \mid \bar{H})$, the district attorney treated it as if it were $P(\bar{H} \mid E)$ and incorrectly concluded that the couple was guilty beyond reasonable doubt!

A low $P(E \mid \bar{H})$ does not necessarily imply a high Bayes factor. The accused is most likely guilty only if $P(E \mid H)$ is significantly larger than $P(E \mid \bar{H})$.
Probability can be misleading

Statement 1. Only 10% of traffic accidents are caused by drunk drivers. This means that 90% of traffic accidents are caused by sober drivers. Therefore, we should only allow drunk drivers on the street.
Probability can be misleading!

Statement 2. For the past 10 years, only 0.1% of drunk people in Shanghai ended up killing themselves while driving. Therefore, it is okay to drive drunk, since you still have a 99.9% probability of living.
English can trick you!

Be very careful when people say the following sentences:
1 Only ...% of X lead to Y.
2 Only ...% of X end up as Y.
3 Only ...% of X become Y.
4 Only ...% of X are caused by Y.
5 Only ...% of X are affected by Y.

These all refer to the conditional probability $P(Y \mid X)$! Often people will try to trick you by subtly distorting the sample space X in order to skew the probability, so you must be very careful when interpreting these remarks.

Remember that a small % of X leading to Y does not necessarily mean that a small % of Y comes from X: a small $P(Y \mid X)$ does not imply a small $P(X \mid Y)$!
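As a sketch of why Statement 1 is misleading, the snippet below flips the conditional probability with Bayes' rule. The 10% figure comes from Statement 1; the share of drivers on the road who are drunk (`p_drunk = 0.01`) is a purely hypothetical assumption, made only to illustrate the calculation.

```python
# P(drunk | accident), taken from Statement 1.
p_drunk_given_accident = 0.10

# ASSUMPTION (hypothetical, not from the lecture): 1% of drivers are drunk.
p_drunk = 0.01

# By Bayes' rule, P(accident | drunk) = P(drunk | accident) P(accident) / P(drunk).
# P(accident) is the same factor for drunk and sober drivers, so it cancels
# in the ratio below and never needs to be known.
risk_drunk = p_drunk_given_accident / p_drunk
risk_sober = (1 - p_drunk_given_accident) / (1 - p_drunk)

relative_risk = risk_drunk / risk_sober
print(round(relative_risk, 2))
```

Under this assumed 1% share, a drunk driver would be about 11 times as likely to cause an accident as a sober one, even though drunk drivers account for "only 10%" of accidents.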
Search and Rescue

Example 1 (Search and Rescue). It is believed that a sought-after wreck will be in a certain sea area with probability $p = 0.4$. A search in that area will detect the wreck with probability $d = 0.9$ if it is there. What is the revised probability of the wreck being in the area when the area is searched and no wreck is found?

Here our hypothesis $H$ is that the wreck is in the area, and the evidence $E$ is that the wreck has not been detected in that area.

The prior odds are $P(H) : P(\bar{H}) = 0.4 : 0.6 = 2 : 3$.

The likelihood ratio is $P(E \mid H) : P(E \mid \bar{H}) = 0.1 : 1 = 1 : 10$.

The posterior odds are $P(H \mid E) : P(\bar{H} \mid E) = \frac{2}{3} \times \frac{1}{10} = 1 : 15$.

Therefore, the updated probability is $P(H \mid E) = \frac{1}{1 + 15} = \frac{1}{16}$.
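The arithmetic of Example 1 can be checked with a short Python sketch. The variable names are illustrative. The value $P(E \mid \bar{H}) = 1$ reflects the example's ratio $0.1 : 1$: a search cannot find a wreck that is not there.

```python
from fractions import Fraction

p = Fraction(4, 10)  # prior probability that the wreck is in the area
d = Fraction(9, 10)  # probability of detecting the wreck if it is there

prior_odds = p / (1 - p)                  # 2 : 3
likelihood_ratio = (1 - d) / 1            # P(E | H) / P(E | not-H) = 0.1 / 1
post_odds = prior_odds * likelihood_ratio
post_prob = post_odds / (1 + post_odds)

# Cross-check against the standard form of Bayes' theorem.
direct = (1 - d) * p / ((1 - d) * p + 1 * (1 - p))

print(post_odds, post_prob, direct)
```

Both routes give the same answer, which is a useful sanity check whenever the odds form is used.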
Tea House Scam

Example 2 (Tea House Scam). The Seven Virtues Tea House invites students to take part in the following betting game. Three cards are placed into a hat. One card is red on both sides, one is black on both sides, and one is red on one side and black on the other side. A student is asked to pick a card out of the hat at random and reveal only one side of the card. The owner of the tea house bets the student equal odds that the other side of the card will be the same color as the one shown. (If the other side is the same color, I win \$1; otherwise you win \$1.) Do you want to play?

Without loss of generality, suppose the chosen card reveals red on the visible side. Let $H$ be the hypothesis that both sides of the chosen card are red, and let $E$ be the evidence that the visible side of the chosen card is red.

The prior odds are $1 : 2$, since there is only one card with red on both sides.

The likelihood ratio is $P(E \mid H) : P(E \mid \bar{H}) = 1 : \frac{1}{4} = 4 : 1$. (Seeing red is very likely when both sides are red!)

The posterior odds are therefore $\frac{1}{2} \times \frac{4}{1} = 2 : 1$, which means $P(H \mid E) = \frac{2}{2 + 1} = \frac{2}{3}$.
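The result of Example 2 can also be confirmed by brute-force enumeration. The sketch below assumes each card is equally likely to be drawn and either side is equally likely to be the one shown, so all six (shown, hidden) outcomes are equally likely. The variable names are illustrative.

```python
from fractions import Fraction

cards = [("R", "R"), ("B", "B"), ("R", "B")]

# List every equally likely (shown side, hidden side) outcome.
outcomes = []
for side_a, side_b in cards:
    outcomes.append((side_a, side_b))
    outcomes.append((side_b, side_a))

# Condition on the evidence E: the visible side is red.
shows_red = [(shown, hidden) for shown, hidden in outcomes if shown == "R"]
same_color = [(shown, hidden) for shown, hidden in shows_red if hidden == "R"]

p_same_given_red = Fraction(len(same_color), len(shows_red))

# Owner wins 1 when the colors match and loses 1 otherwise.
owner_expected_gain = p_same_given_red - (1 - p_same_given_red)

print(p_same_given_red, owner_expected_gain)
```

The owner expects to win a third of a dollar per game, so the "equal odds" bet is far from fair.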
Gracia Murder Mystery

Example 3 (Gracia Murder Mystery). Gracia was murdered last night. Xena and Yang are the prime suspects. Both persons are on the run, and after an initial investigation, both fugitives appear equally likely to be the murderer. Further investigation reveals that the actual perpetrator has blood type A. Ten percent of the population belongs to the group having this blood type. Additional inquiry reveals that Xena has blood type A, but offers no information concerning Yang's blood type. In light of this new information, what is the probability that Xena is the one who murdered Gracia?
Example 3 solution

Let $H$ denote the event that Xena is the murderer. Let $E$ represent the new evidence that Xena has blood type A.

The prior odds are $1 : 1$, and the likelihood ratio is $1 : \frac{1}{10} = 10 : 1$.

Hence the posterior odds are $\frac{1}{1} \times \frac{10}{1} = 10 : 1$, and the posterior probability that Xena is the murderer is $P(H \mid E) = \frac{10}{1 + 10} = \frac{10}{11}$.

The probability that Yang is the perpetrator is $1 - \frac{10}{11} = \frac{1}{11}$ and not, as many would think, $\frac{1}{10} \times \frac{1}{2} = \frac{1}{20}$.

This is because Yang is not a randomly chosen person: he has a 50% prior probability of being the perpetrator.

Bayesian analysis can sharpen our intuition!
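The numbers in this solution can be reproduced with a short sketch. It assumes, as the solution does implicitly, that if Xena is the murderer she certainly has blood type A, and that otherwise she has it with the population frequency of 10%. The variable names are illustrative.

```python
from fractions import Fraction

prior_odds = Fraction(1, 1)           # Xena and Yang equally likely a priori

p_e_given_h = Fraction(1)             # murderer has type A, so Xena must too
p_e_given_not_h = Fraction(1, 10)     # otherwise Xena is like a random person

post_odds = prior_odds * p_e_given_h / p_e_given_not_h
p_xena = post_odds / (1 + post_odds)
p_yang = 1 - p_xena

# The tempting but wrong answer for Yang: (1/10) * (1/2).
naive_yang = Fraction(1, 10) * Fraction(1, 2)

print(p_xena, p_yang, naive_yang)
```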
The O.J. Simpson Murder Case

Example 4 (The O.J. Simpson Murder Case). Nicole Brown was murdered at her home in Los Angeles on the night of June 12, 1994. The prime suspect was her ex-husband O. J. Simpson, at the time a well-known celebrity famous both as a TV actor and as a retired professional football star. This murder led to one of the most heavily publicized murder trials in the United States during the last century. The fact that the murder suspect had previously physically abused his wife played an important role in the trial. The famous defense lawyer Alan Dershowitz, a member of the team of lawyers defending the accused, tried to belittle the relevance of this fact by stating that only 0.1% of the men who physically abuse their wives actually end up murdering them. Was the fact that O. J. Simpson had previously physically abused his wife irrelevant to the case?
The O.J. Simpson Murder Case: Bayesian Analysis

The answer is no. To explain that, we define

E = the event that the husband has physically abused his wife
M = the event that the wife has been murdered
G = the event that the husband is guilty of the murder of his wife

The crucial fact in this case is that Nicole Brown was already murdered! The question, therefore, is not how likely abuse is to lead to murder, i.e., $P(M \mid E)$, but how likely the husband is to be guilty given that the wife has been murdered and that he had previously abused her: $P(G \mid EM)$.

Since the size of the event $E$ is so much larger than the size of the event $EM$, it is no surprise that the conditional probability $P(M \mid E)$ will be significantly smaller than $P(G \mid EM)$.

Nevertheless, Alan Dershowitz's argument that only 0.1% of cases of physical abuse lead to murder was deceptively convincing!
The O.J. Simpson Murder Case: Bayesian Analysis

To find the probability that Simpson is guilty, we use the Bayes formula

$$\frac{P(G \mid EM)}{P(\bar{G} \mid EM)} = \frac{P(G \mid M)}{P(\bar{G} \mid M)} \times \frac{P(E \mid GM)}{P(E \mid \bar{G}M)}.$$

According to crime statistics, in 1992, 4,936 women were murdered in the United States, of which roughly 1,430 were murdered by their (ex)husbands or boyfriends.

This gives an estimate of $\frac{1{,}430}{4{,}936} = 0.29$ for the prior probability $P(G \mid M)$, and an estimated probability of 0.71 for $P(\bar{G} \mid M)$.

Furthermore, it is also known that roughly 5% of married women in the United States have at some point been physically abused by their husbands.

If we assume that a woman who has been murdered by someone other than her husband is a randomly selected woman, then $P(E \mid \bar{G}M) = 0.05$.
The O.J. Simpson Murder Case: Bayesian Analysis

The value of $P(E \mid GM)$ will be estimated based on the reported remarks made by Simpson's famous defense attorney, Alan Dershowitz.

Dershowitz admitted in a newspaper article that a substantial percentage of the husbands who murder their wives have, previous to the murder, also physically abused their wives.

Given this statement, it is reasonable to assume that $P(E \mid GM) = 0.5$.

By substituting the various values, we find the posterior odds to be

$$\frac{P(G \mid EM)}{P(\bar{G} \mid EM)} = \frac{0.29}{0.71} \times \frac{0.5}{0.05} = 4.08.$$

This means that $P(G \mid EM) = \frac{4.08}{1 + 4.08} \approx 0.80$. In other words, there is an estimated probability of about 80% that the husband is the murderer of his wife in light of the knowledge that he had previously physically abused her.

Therefore this evidence is certainly very relevant to the case!
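The substitution above can be reproduced with a short sketch. It uses the rounded prior 0.29 and the two likelihoods from the slides; the value 0.5 for $P(E \mid GM)$ is the lecture's own assumption rather than a measured statistic. The variable names are illustrative.

```python
# Prior probability that the (ex)husband or boyfriend is the murderer,
# rounded as in the slides: 1,430 / 4,936 is about 0.29.
p_g = 0.29
prior_odds = p_g / (1 - p_g)

p_e_given_g = 0.5       # assumed in the lecture from Dershowitz's remark
p_e_given_not_g = 0.05  # abuse rate among married women in general

likelihood_ratio = p_e_given_g / p_e_given_not_g
post_odds = prior_odds * likelihood_ratio
post_prob = post_odds / (1 + post_odds)

print(round(post_odds, 2), round(post_prob, 2))
```

Because the likelihood ratio is 10, even a much more cautious choice for $P(E \mid GM)$ would leave the posterior probability far above the 0.1% figure quoted by the defense.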
[Figure: Visualizing the O.J. Simpson Murder Case]
The Lost Diamond

Example 5 (The Lost Diamond). A diamond merchant has lost a case containing a very expensive diamond somewhere in a large city in an isolated area. The case has been found again, but the diamond has vanished. However, the empty case contains DNA of the person who took the diamond. The city has 150,000 inhabitants, and each is considered a suspect in the diamond theft. An expert declares that the probability of a randomly chosen person matching the DNA profile is $10^{-6}$. The police search a database with 5,120 DNA profiles and find one person matching the DNA from the case. Apart from the DNA evidence, there is no additional background evidence related to the suspect. On the basis of the extreme infrequency of the DNA profile and the fact that the population of potential perpetrators is only 150,000 people, the prosecutor jumps to the conclusion that the odds of the suspect not being the thief are practically nil and calls for a tough sentence. What do you think of this conclusion?
The Lost Diamond: Intuitive Defense

This is a textbook example of the faulty use of probabilities.

The real issue here is that the probability that the suspect is innocent of the crime given a DNA match is quite different from the probability of a random DNA match alone!

What we should really look for is the probability that, among all persons matching the DNA, the arrested person is the perpetrator.

Therefore, the counsel for the defense could reason as follows:

Among the other $150{,}000 - 5{,}120 = 144{,}880$ individuals, the expected number of people matching the DNA profile is $144{,}880 \times 10^{-6} = 0.14488$.

So the probability that the suspect is guilty is $1/(1 + 0.14488) = 0.8735$.

It is not beyond reasonable doubt that the suspect is guilty, and thus the suspect must be released!
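The defense's arithmetic can be reproduced with the sketch below. It follows the intuitive argument from the slide exactly (one known match weighed against the expected number of other matches) and is not a full Bayesian treatment of the database search. The variable names are illustrative.

```python
population = 150_000   # inhabitants, all considered suspects
database = 5_120       # DNA profiles searched by the police
match_prob = 1e-6      # chance that a random person matches the profile

# People outside the database who were never tested.
others = population - database

# Expected number of untested people who would also match.
expected_other_matches = others * match_prob

# One known match (the suspect) versus the expected other matches.
p_guilty = 1 / (1 + expected_other_matches)

print(others, expected_other_matches, round(p_guilty, 4))
```

A guilt probability of roughly 87% leaves about a 13% chance that the thief is someone else, which is the defense's basis for claiming reasonable doubt.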