Transcript 00:00 So we're talking about conditional probability, right? 00:04 We're continuing to do conditional probability. 00:06 And so today we're gonna talk about a couple of the most 00:10 interesting, famous problems related to conditional probability. 00:14 The first one, a lot of you have seen in some form or other. 00:19 And that's called the Monty Hall problem, the three door problem. 00:24 Very, very famous problem, it's been in movies, on TV, and 00:28 there are New York Times articles about this. 00:31 There are entire books written just about this problem, 00:35 popular books on just this one problem. 00:40 So a lot of you have seen it before, but I'm not assuming you've seen it. 00:42 Even if you have seen it, though, I'm hoping I can give you some additional 00:46 ways to think about it. 00:48 And we'll approach it from more than one perspective 00:51 because it's a very subtle problem. 00:53 This is the sorta problem that almost everyone gets wrong the first time they 00:57 see it. 00:58 And then if they think about it for a while they think they understand it. 01:02 And then you ask the same question just in a slight disguise. 01:08 And then everyone gets it wrong again. 01:09 So, I'm gonna try to talk a lot about how you really think about this so 01:14 that you won't get fooled if I ask the same thing 01:17 and then just change a few things around. 01:19 What's really going on here? 01:20 But anyway I'm not assuming you've seen the problem before, 01:22 so let's start from the beginning. 01:26 So Monty Hall is a game show host. 01:32 He's retired now, but for 01:34 many years he hosted a game show on TV called Let's Make a Deal. 01:39 And so the problem is named after him. It's not exactly what he did on 01:44 his game show, but he did similar kinds of games on his show. 01:49 So the game is like this, there are three doors. 01:58 Let's just call them door 1, 2 and 3. 02:01 [COUGH]. 
02:03 And suppose that you're the contestant on the game show. 02:07 Monty Hall asks you to choose. 02:10 And right now you don't know what's behind each door. 02:12 Monty Hall asks you to choose a door. 02:17 So say you pick that one. 02:21 So, there are all assumptions in this problem. 02:24 We are supposed to assume that one door has a car behind it. 02:29 The other two doors have goats. 02:38 So you don't know which one. 02:41 One of these has a car behind it. 02:42 The other two have goats. 02:44 And we're assuming that you, 02:46 the contestant, have no idea which one has the car. 02:50 They're all equally likely, but we assume that Monty Hall knows which is which. 02:56 So Monty knows which one has the car. 03:01 Okay, that's a key assumption of this problem. 03:04 Sometimes it's kinda left implicit, and 03:06 I think it's important to state that explicitly. 03:11 It's a different problem if Monty Hall doesn't know. 03:14 So one door has a car, two doors have goats. 03:18 You pick one. 03:19 Another assumption is that you want the car, you do not want the goats. 03:24 So you pick a door. 03:26 Suppose you pick this one. 03:30 So let's say you pick this one. 03:32 Pick this one. 03:34 Just by symmetry, we can simplify things, if we assume that we, 03:39 being the contestants, we choose door one. 03:41 If you want door 3 instead, that's fine with me, we could just renumber the doors. 03:46 So let's assume that we pick this one initially, door 1, 03:49 just to simplify the notation. 03:51 Then what happens is that Monty opens up either door 2 or door 3 revealing a goat. 04:00 So for example, he could possibly opens door 2 and there's a goat there. 04:06 So then you know the car's either the one you initially chose or door 3. 04:11 Question is should you switch? 04:13 Monty Hall gives you the option of do you wanna switch to the other door that's 04:17 unopened or keep your original choice? 04:20 The question is, is it beneficial to switch? 
04:23 So that's the problem and there's one more assumption. 04:28 Which is that Monty always opens, 04:32 he's always gonna offer you that choice, so he's always gonna open a goat door. 04:37 I'm just saying goat door is a door with a goat. 04:41 That's an assumption. 04:44 But if Monty opened the door that had the car then it's a stupid game, right? 04:49 Like, it's okay. 04:50 >> [LAUGH] >> So 04:50 he's always gonna open the door with the goat. 04:52 That's why he needs to know otherwise by chance he might spoil the game. 04:57 There's one more assumption that usually doesn't usually get stated but 05:00 it's actually pretty important. 05:03 If he has a choice of which door to open. 05:08 He picks with equal probabilities. 05:12 And the case where that happens is if initially you guessed right, so 05:16 the car is here. 05:17 These ones both have goats. 05:19 So Monty could open either door 2 or door 3, assume those are equally likely. 05:24 On the strategic problems, I posted the new homework and 05:27 the new strategic practice. 05:29 On the strategic practice you'll find a problem there which is kind of 05:32 an extension to the case where, what you might call Lazy Monty Hall. 05:36 Where he's standing here and he prefers opening door 2 to door 3, 05:41 cuz he doesn't wanna walk all the way from here to here. 05:45 So suppose he opens this one, probability P this one probability one minus P. 05:49 Where maybe P is greater than one half, or less than one half, or whatever. 05:53 If he has a choice, sometimes he has no choice right? 05:57 Because he's not going to spoil the game, but sometimes he has a choice. 05:59 But for the basic problem we assumed he picks with equal probabilities. 06:05 Okay, and the question is, should you switch? 06:10 And most people when they first hear this problem they say, 06:14 okay, well suppose you open door 2. 06:16 And you're picking between door 1 and door 3. 06:18 There's two doors. 
06:20 You don't seemingly have any information about whether door 1 or 06:24 door 3 is more likely. 06:25 It looks kind of symmetrical, so it's 50/50. 06:29 And controversy has raged about this problem for years and years and years. 06:36 Part of the controversy is because people don't understand probability and 06:39 they haven't taken Stat 110. 06:40 Part of the reason for 06:41 the controversy is that sometimes some of these key assumptions are left implicit. 06:47 And so then, you know, different people might interpret it differently. 06:51 So it's important to have these assumptions clear. 06:55 Under these assumptions, the answer is that you should switch. 06:58 And if you switch, your probability of success is two-thirds, and 07:02 if you stick with your original choice, your probability of success is one-third. 07:06 So, it's better to switch. 07:08 It's not 50/50. 07:10 It's sorta like an abuse of the naive definition of probability 07:14 if you just immediately say, well, it's 50/50 cuz there's two doors and 07:18 we don't know which is which. 07:19 Well, that's the naive definition, right? 07:21 Assuming they're equally likely, but why are they equally likely? 07:24 We don't know that they're equally likely given the evidence. 07:28 We are assuming that initially (I didn't write this, but I said it), 07:33 initially it's one-third, one-third, one-third as the probabilities. 07:37 But that doesn't mean that conditionally, after we observe what happens, 07:41 it's still equally likely. 07:45 Cuz we have the information. 07:49 One key intuitive point. 07:55 Suppose that Monty opens door 2. 07:59 Just for concreteness. 08:03 Well, obviously, we know that Door 2 has a goat. 08:10 So the kind of naive approach would be, 08:13 well, let's just condition on the fact that Door 2 has a goat. 08:17 And originally, it was one-third, one-third, one-third. 08:19 And we're getting rid of this one, so then it's one-half, one-half for the other two. 
08:23 There's a subtle flaw in that reasoning. 08:27 So we know Door 2 has a goat, but a key thing with conditioning 08:33 is we wanna condition on all the evidence, okay? 08:38 The evidence is not just that Door 2 has a goat. 08:42 We need to condition on all the evidence. 08:44 There's one more key piece of information there. 08:47 That's the fact that Monty Hall opened Door 2. 08:54 So now we're gonna start thinking about, 08:56 why does that matter that he opened that door? 08:58 So we're not just conditioning on there being a goat there. 09:02 We're conditioning on the fact that Monty Hall opened Door 2. 09:05 So we have to kind of see, why is that relevant, 09:09 why does it become two-thirds? 09:13 Okay, Door 2, all right, so 09:17 I wanna approach this problem in a couple different ways. 09:22 Maybe three different ways, one intuitive, one with a tree, and 09:26 one with a probability calculation, okay? 09:29 Actually, two intuitive ways. 09:30 There are a lot of ways to think about this problem, but some of them are wrong. 09:33 So I'm gonna show you some correct ways to think about this. 09:36 So first of all, a tree diagram, I think, 09:39 is a very good way to picture this problem. 09:42 So let's draw a tree. 09:48 So you choose the door, and just to simplify not having to draw as many 09:53 branches, I'm gonna assume that the contestant chooses Door 1. 09:59 So choose Door 1, Door 1, okay? 10:04 Now it branches depending on which door Monty Hall opens, right? 10:09 Monty Hall, well, there's two things. 10:13 One is, what's the door that has the car? 10:16 And the other branching is, what door does Monty Hall open, okay? 10:22 So let's actually do the door that has the car first. 10:25 So it branches three ways, and these branches each have probability of 10:30 one-third, one-third, one-third, Door 1, Door 2, Door 3. 10:34 So this is just the car door. 10:39 By car door, I don't mean that the door of the car, but 10:42 the door that has a car behind it. 
10:44 That's either Door 1, 2, or 3, right? 10:46 And they're equally likely by assumption, that's the first branch. 10:50 Secondly, we have Monty door, that's whatever door that Monty Hall opens. 10:58 And let's just consider the cases. 11:02 We choose Door 1, the car is behind Door 1. 11:06 Then Monty Hall can open either Door 2 or Door 3. 11:09 So that's where he has two choices, so it branches two ways. 11:13 Then he opens Door 2 or Door 3. 11:17 And we're assuming those are equally likely, so I'm putting one-half, 11:20 one-half on those branches. 11:22 Now if we picked Door 1, and the car is behind 2, 11:27 then Monty Hall has no choice but to open Door 3, right? 11:33 And then he'll offer you, do you wanna switch to Door 2 or not? 11:37 So in this case, he has no choice, so this has a probability of 1. 11:41 And then lastly, we picked Door 1, car is behind Door 3. 11:45 Monty Hall has no choice but to open Door 2 to reveal a goat, right? 11:51 So that is probability 1, Door 2. 11:54 Okay, so now how do we actually use this tree diagram to get the answer? 12:03 Well, we just have to consider cases. 12:07 So suppose that Monty Hall opens, 12:11 just for example, suppose he opens Door 2, okay? 12:15 That means I'm just, basically, I'm gonna show you how do you do 12:18 conditional probability in terms of a tree, okay? 12:21 So what does it mean? 12:22 Suppose we condition on the fact that Monty opened Door 2. 12:25 That means that it must be that we took this path here, I'll circled that one. 12:30 Or we took this path here, right? 12:36 These two are now irrelevant, just like when I was drawing the pebble diagram. 12:39 And when you condition on something, 12:42 you delete from your space everything that you can now rule out, right? 12:46 You rule out everything that's inconsistent with what you observe. 12:49 And whatever's left would be these two cases. 12:53 And now let's just see what happens. 
12:55 The probability of this path from here to here, 12:58 I'm just multiplying cuz I'm assuming independent. 13:02 Well, I'm assuming that one-half is the probability of going from here to here if 13:05 you're at here. 13:06 So one-third times one-half is one-sixth. 13:08 And for this branch, one-third times 1 is one-third. 13:13 And remember what I said when we were doing conditional probabilities using 13:16 the pebble world perspective. 13:18 We deleted all the pebbles that are irrelevant, and then we renormalized, 13:22 right, to make the total mass equal to 1. 13:24 So if we wanna renormalize one-sixth and 13:29 one-third, what we should do is just multiply by 2. 13:35 Because then I'm making it two-thirds, one-third. 13:37 Now I've renormalized so that they add up to 1. 13:41 So what that says is that, conditional on Monty Hall opening Door 2, 13:46 there is a two-thirds chance now that the car is behind Door 3. 13:51 And there's a one-third chance that it's behind Door 1. 13:55 So therefore, if we switch, we have a two-thirds chance of success. 13:59 So what this thing, the circling, 14:02 just showed was that the probability of success, if switching, 14:08 Given that Monty opens Door 2, cuz I was assuming he opened Door 2. 14:16 And we just showed just by this tree diagram that this is = two-thirds. 14:23 Of course, you could do the same thing if he opens Door 3. 14:26 Just circle the other two, same thing again, and it's still two-thirds, okay? 14:32 So that's one way to think of it, 14:36 just looking at what are the different possibilities. 14:39 And let me just say, and what does this say intuitively? 14:43 What this says intuitively is one-third of the time, your initial guess is correct. 14:49 And then you would fail if you switched, right? 14:53 No matter what happened, you would fail cuz you got it right. 14:55 But that's only one-third of the time. 14:58 The other two-thirds of the time your initial guess is wrong. 
15:02 Monty opens up a goat door, and then you should switch. 15:05 So two-thirds of the time you should switch, I mean, you should always switch. 15:08 Cuz two-thirds of the time you'll get it right. 15:10 And the one-third where you get it wrong is where you were initially correct, 15:14 but that's only one-third of the time, okay? 15:17 So that's a tree diagram, which I think is useful. 15:20 But it's also useful to just see how we can do this 15:23 as a conditional probability argument. 15:26 You can do this problem using Bayes' rule. 15:28 But actually, I would prefer to just use the law of total probability here. 15:34 So let's do this by using the law of total probability, LOTP. 15:43 When we're using the law of total probability, 15:46 the key step is deciding what to condition on, okay? 15:49 And one of the really nice things about statistics and 15:53 probability that's basically unique to this subject is this: 15:57 with most mathematical subjects, if you have a problem and you're stuck, right, 16:04 you can't just say, well, I wish that I knew this and I wish that I knew that. 16:08 Well, you can say that, but it doesn't help you, right? 16:12 In probability you think, I wish I knew this, I wish I knew that. 16:16 That's giving you a hint as to what you should condition on, and 16:18 then you just condition on it. 16:19 And you act as if you did know that. 16:22 Okay? So 16:23 I didn't name the law of total probability, but 16:26 if I had I would have just called it wishful thinking, all right? 16:29 It's like, what do we wish that we knew? 16:32 Well, I don't know about you, but for me the first thing I think of if someone 16:37 asks me what I wish that I knew is, I wish I knew where the car was, right? 16:42 So it's a pretty obvious thing to think about. 16:45 That's what we wish we knew, so we wish we knew the car door. 16:54 So that's all we wish. 16:55 We wish we knew where the car is, so we're gonna condition on that. 17:01 That's all, okay? 
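Going back to the tree diagram for a moment, the "circle the paths, then renormalize" step can be sketched in code. This is only an illustrative sketch under the lecture's assumptions (the contestant picks Door 1, the car is equally likely to be behind each door, Monty never reveals the car). The function name `posterior_given_monty_opens` and the parameter `p_door2` are made up for this example; `p_door2` is the chance Monty opens Door 2 when he has a free choice, so the default of 1/2 is the basic problem and other values correspond to the "lazy Monty" variation mentioned earlier.

```python
from fractions import Fraction


def posterior_given_monty_opens(opened, p_door2=Fraction(1, 2)):
    """Contestant picks Door 1. Return {car door: P(car door | Monty opens `opened`)}.

    p_door2 (hypothetical parameter) is the probability that Monty opens
    Door 2 when the car is behind Door 1, the only case where he has a choice.
    """
    third = Fraction(1, 3)
    # Leaves of the tree: (car door, door Monty opens) -> probability of that path.
    paths = {
        (1, 2): third * p_door2,
        (1, 3): third * (1 - p_door2),
        (2, 3): third * 1,  # car behind Door 2: Monty must open Door 3
        (3, 2): third * 1,  # car behind Door 3: Monty must open Door 2
    }
    # Conditioning: keep only the paths consistent with what we observed.
    kept = {car: pr for (car, door), pr in paths.items() if door == opened}
    # Renormalize so the remaining probabilities add up to 1.
    total = sum(kept.values())
    return {car: pr / total for car, pr in kept.items()}


print(posterior_given_monty_opens(2))
```

With the default `p_door2`, the two surviving paths have probabilities 1/6 and 1/3, which renormalize to 1/3 for Door 1 and 2/3 for Door 3, matching the circled-paths argument.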
17:03 So we need to define some events. 17:07 I'll just say, let S be 17:12 the event that we succeed. 17:17 Succeed, assuming that we're using the switching strategy. 17:20 So I'm gonna assume that we're gonna switch. 17:22 And I'm gonna see what the probability is that we succeed by that strategy. 17:26 So assuming our strategy is always switch. 17:29 You know Monty Hall's gonna give us the chance to switch, and we'll say yes. 17:32 That's our strategy, and 17:33 we wanna see what's the probability of success of that strategy. 17:36 Okay, and then let's just let Dj be 17:43 the event that Door j has the car. 17:49 So I'm using D for Door, Door j has car. 17:52 Where j is just 1, 2, or 3. 17:56 Okay, so that's just some simple notation. 17:59 Now let's do a calculation of the probability of success. 18:06 The law of total probability just says, well, condition on which door has 18:11 the car, that's the thing we wish we knew. 18:14 So to do that, that's just P(S) = P(S|D1) 18:21 times 1/3 + P(S|D2) times 18:27 1/3 + P(S|D3) times 1/3. 18:33 So it's P(S|D1) times P(D1), right? 18:36 But P(D1) is the prior probability 18:39 that door 1 has the car, and we're assuming that they're equally likely to 18:43 start with, so it's just 1/3, 1/3, 1/3 for these weights. 18:47 Okay, now what's this? 18:51 Given D1, remember we're assuming we picked door 1, so 18:55 that means we got it right initially. 18:58 But we switch, so that's bad. 19:00 This is a bad case. 19:00 So this is gonna be 0. 19:01 It's never gonna work in that case. 19:03 Now in this case, we picked door 1. 19:06 The car is behind door 2. 19:08 Monty Hall will open door 3, and we should switch, right? 19:12 So, we have probability of success 1 here, 1 times 1/3. 19:15 Similarly in this case, it's gonna work. 19:18 So, that's also 1, and the total is 0 + 1/3 + 1/3 = 2/3. 19:21 So, it's actually a very easy calculation, right? 19:25 Because these probabilities are very, very easy to compute once we know where the car is. 19:30 So, it's 2/3. 
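Written out as a single line, the calculation just described looks like this. The notation follows the lecture (S is success under the always-switch strategy, D_j is the event that Door j has the car, and the contestant is assumed to pick Door 1):

```latex
P(S) = \sum_{j=1}^{3} P(S \mid D_j)\,P(D_j)
     = 0 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3} + 1 \cdot \tfrac{1}{3}
     = \tfrac{2}{3}.
```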
19:33 There's one slightly subtle point here, which is that 19:36 this is the unconditional probability that our strategy will be successful. 19:41 And you could say, well, what if we wanted the conditional probability, 19:45 given that Monty Hall opened door 2, is that gonna be different? 19:50 But in this case, I defined it symmetrically, so by symmetry we also have 19:58 (we can make up a notation for which door Monty Hall opens) 20:04 P(S | Monty opens 20:09 door 2) = 2/3. 20:16 You could write out another law of total probability calculation if you want, 20:19 conditioning on whether Monty Hall opens door 2 or door 3. 20:24 But by symmetry it couldn't be different, because door 2 and 20:29 door 3 are completely symmetrical until Monty opens one of them. 20:36 So that means both the conditional and 20:38 the unconditional probabilities of success are 2/3. 20:42 Now in the extension on the strategic practice that I mentioned, 20:45 the unconditional probability is still 2/3. 20:49 But the conditional probability changes, because in that problem I said that 20:53 Monty Hall is too lazy to walk over and open door 3 unless he has to, or 20:57 that kind of thing. 20:58 Then there's an asymmetry, but in this version it's symmetrical, 21:02 so conditional and unconditional work the same way. 21:05 So okay, so we got 2/3. 21:07 And there are nice applets for this that you can easily find online. 21:10 I put a link on the webpage for a nice one that the New York Times had, 21:13 if you wanted to try it out. 21:15 There's one more thing, 21:17 just to tell you a little bit about the controversy over this problem. 21:20 That controversy started raging when someone wrote in with this question to Marilyn vos Savant, 21:25 who writes a column in Parade Magazine. 21:33 It was not 100% explicitly specified, all of the assumptions. 21:36 But it was fairly clear that this was what was intended. 21:39 And she basically gave the correct answer. 
21:41 And then thousands of people started writing in telling her that she was wrong. 21:47 Telling her that she was stupid, including people with PhDs in math, and 21:51 just writing all these scathing, nasty letters to her, and 21:54 that controversy was raging for a long time. 21:58 And one thing that I find ridiculous about that is, 22:03 well, okay, so most people's intuitions about this problem are wrong. 22:08 But if you think about it carefully you'll get the right answer if you have some 22:11 familiarity with conditional probability. 22:14 But even for 22:15 people who had no understanding of any of the stuff I was doing here, 22:19 this is a problem that, and 22:20 this is true in statistics in general, you could just simulate, right? 22:24 It's very, very easy to try it out with a friend, 3 doors. 22:29 You can just do it with cups and props and whatever. 22:35 You can also do it on a computer, very very easily, write a little simulation. 22:40 Try it out 1,000 times, just via computer simulation, and 22:44 you'll see that if you switch, you'll win 2/3 of the time. 22:47 So I don't understand how people still continue to argue with that, when you can just 22:51 try it out, simulate it, and you'll see 2/3 of the time you succeed by switching. 22:57 I mean it's just kind of mind boggling, but 23:01 maybe some of the math PhDs didn't want to actually try 23:04 it out cuz they thought they'd proved that it was 1/2 or something. 23:09 Anyway, so that's the Monty Hall problem. 23:12 I wanted to mention one other intuition for this, that's kind of unusual. 23:17 Usually when we have a complicated problem, the suggestion would be, 23:21 look at a simpler case. 23:24 But I did emphasize the fact that it's useful to consider simple and 23:28 extreme cases. 23:30 So an extreme case here would be, what if instead of 3 doors, 23:36 we considered the Monty Hall problem with a million doors? 
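The "write a little simulation" suggestion can be sketched as follows. This is a minimal sketch, not code from the course: the function name `switch_win_rate` and its parameters are made up for illustration, and it assumes the rules stated in the lecture (Monty knows where the car is, never reveals it, and chooses uniformly when he has a choice). The `n_doors` parameter also covers the extreme case of many doors, where Monty opens every door except your pick and one other.

```python
import random


def switch_win_rate(n_doors=3, trials=10_000, seed=0):
    """Estimate the probability of winning by always switching.

    Monty opens all doors except the contestant's pick and one other door,
    and never reveals the car.
    """
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(n_doors)
        pick = rng.randrange(n_doors)
        if pick == car:
            # Monty is free to leave any goat door closed; choose one
            # uniformly among the doors other than the pick.
            remaining = rng.randrange(n_doors - 1)
            if remaining >= pick:
                remaining += 1
        else:
            # Monty must leave the car door closed.
            remaining = car
        # Switching means taking the one remaining unopened door.
        wins += remaining == car
    return wins / trials


print(switch_win_rate(n_doors=3))
print(switch_win_rate(n_doors=1_000_000))
```

With 3 doors the estimate should land close to 2/3, and with a million doors switching should win almost every time.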
23:40 So you pick one of those one million doors and then Monty Hall proceeds to open 23:46 999,998 doors leaving just one door should you switch. 23:53 In that case, I've never met anyone who would not switch, right. 24:00 I've never met anyone like that. 24:02 That because with a million doors you're extremely confident 24:06 that your initial guess is wrong and 24:08 you're extremely confident that that one remaining door has the car. 24:11 Conceptually though, there's no difference between that and this. 24:15 It's like three doors or a million doors, it's the argument for 24:21 one-half, one-half here would apply in the same way with a million doors. 24:25 It's just in that case everyone sees intuitively that it's ridiculous and 24:28 here people don't have that intuition. 24:30 So anyways, that's another intuition from the problem. 24:35 All right, so that's the Monty Hall problem. 24:39 You should look at the strategic practice problem, and on the homework three, 24:45 there's an extension with more than three doors that you can think about. 24:51 I'm talking on that problem about the case where Monty Hall Possibly leaves more than 24:56 one door still unopened. 24:58 But anyway, you can work on that later. 25:00 But this is just an introduction to the Monty Hall. 25:02 So, that's a very famous fun problem, 25:04 but I think it also illustrate a lot about how to think conditionally, right? 25:09 Like the tree is useful this perspective, 25:12 it's just the problem is really easy when you set it up this way, but if you just 25:15 try to just apply naive intuition then most people get this completely wrong. 25:21 Okay. 25:22 So let's do another example. 25:26 So this example. 25:31 Is called Simpson's paradox. 25:33 It's another kind of notorious, Problem or paradox that comes up. 25:43 It comes up a lot in everyday life. 
25:46 And it's another problem where most people see it the first time, 25:50 it seems impossible and then they think they understand it, 25:54 then change a few things around and then they fall for it again. 25:57 So, I love paradoxes in general so 26:00 we'll be seeing a lot of paradoxes in this course. 26:04 So the best way, 26:06 I mean I could start making up some abstract notation and stuff, but 26:11 the best way to start understanding Simpson's paradox is to see an examples. 26:16 So I'll write down an example, and then we can discuss that and 26:21 then we can try to write out with the more general thing that's going on. 26:26 By the way, there is no such thing as a true paradox, a real paradox. 26:30 If there were a real paradox, we'd have a contradiction, 26:33 and the universe would explode, and we would not be here. 26:36 So there are no paradoxes in the literal sense. 26:40 What it is though, is something that's deeply counter-intuitive to most people. 26:44 Which means it forces you to think harder. 26:47 And if you think hard enough, 26:48 then eventually, you understand that actually, it does make sense. 26:51 But you have to think hard, okay? 26:54 So here's the problem, one example. 26:58 Is it possible to have two doctors where the first 27:04 doctor has a higher success rate at every single 27:09 possible type of surgery imaginable than the second one? 27:16 Name any surgery, first doctor is more successful than the second doctor, 27:20 measured in terms of percentage of successful outcome. 27:24 Yet the second doctor, overall, has a higher success rate. 27:30 That's the basic question. 27:32 Okay, well Simpson's paradox says that that's possible. 27:35 And at first that sounds wrong to most people. 27:38 Because it sounds like if 27:41 person a is better than person b in every single category then when you 27:45 aggregate those categories together it's not gonna somehow flip, right? 
27:50 Except that's wrong, since Simpson's paradox says it can flip. 27:53 The signs of inequalities can flip when you aggregate data together. 27:58 So one thing seems better than another in every individual case. 28:02 You add up all those cases to get the total, and then it flips. 28:06 So that sounds really weird at first, but 28:10 I'm just gonna illustrate through a very simple example. 28:12 And if you think carefully about this simple example, 28:14 then you can see what's really going on there, why that is possible. 28:19 Okay, so I like to illustrate Simpson's paradox 28:22 through examples that I make up based on the Simpsons. 28:26 I've been watching the Simpsons since I was a kid and I like the show. 28:32 And it helps me remember the example for Simpson's paradox. 28:36 So I don't know how many of you watch the Simpsons, but it doesn't matter whether you do. 28:40 On the Simpsons there are two doctors, Dr. Hibbert and Dr. Nick, okay? 28:46 The data that I'm gonna give you are made up. 28:50 They don't actually tell you the percentage of success on the show, okay? 28:55 But anyway, Dr. Hibbert is kind of the reasonably well respected 29:01 town doctor that everyone who can afford to goes to, but he's gonna charge a lot. 29:06 Dr. Nick is like the guy with the infomercials on TV 29:10 who advertises that he'll perform any surgery for $129.99, okay? 29:13 So probably most 29:19 people would consider Dr. Hibbert to be the better doctor, and Dr. 29:23 Nick is kind of like a cheap, quack doctor, okay? 29:28 Now, I made up some numbers. 29:32 So, suppose for simplicity, to understand the paradox, 29:35 we only need to consider two types of surgeries. 29:38 Once you understand what happens with two surgeries you can easily see what would 29:42 happen if you had like 50 different types of surgery. 29:45 But to understand the crucial phenomenon, all we need to assume is 29:49 that there are two doctors and two different types of surgeries, okay? 
29:53 So I made up some numbers, and 29:55 we can summarize everything in terms of two 2-by-2 tables. 30:02 Okay? 30:03 One table for Dr. Hibbert. 30:08 And one table for Dr. Nick. 30:13 And I'm gonna fill in some numbers. 30:16 So these tables represent two types of surgeries and success or failure. 30:21 So let's just write success here. 30:25 First row is for success. 30:26 Second row is for failure. 30:28 So each doctor performs a certain number of surgeries. 30:31 And to simplify things, 30:33 I think I assumed that each doctor performs 100 surgeries total. 30:37 So there's nothing going on where you'd say one doctor is doing a lot more 30:41 surgeries total. 30:42 This is success, failure for Dr. Nick. 30:46 And let's assume that there are only two types of surgeries, 30:51 one is heart surgery and the other one is, 30:55 I just made up an example, a bandage removal. 31:02 I just tried to make up something that seems easy. 31:04 I'm not a doctor, obviously, well, a doctor of philosophy, but that doesn't count. 31:11 One's an easy surgery and one's a difficult surgery, okay? 31:15 Like bandage removal I can do. 31:16 I can't do heart surgery. 31:19 So, all right, I made up some numbers. 31:21 So suppose that Dr. Hibbert performed 90 heart surgeries and 31:27 succeeded 70 times, failed 20 times. 31:29 And suppose that he performed 10 bandage removals and succeeded all 10 times. 31:35 Now, Dr. Nick performed 10 heart surgeries. 31:43 Succeeded 2 times, failed 8 times, 31:49 that's kind of sad. 31:52 He performed 90 bandage removals, 31:56 he succeeded 81 times, he failed 9 times. 32:00 Somehow, he failed 9 times at bandage removal. 32:04 Now, how many of you would rather go to Dr. Hibbert than Dr. Nick? 32:11 Would anyone here prefer to go to Dr. Nick, based on this data? 32:16 No? 32:17 [COUGH] Okay, well, I would definitely prefer to go to Dr. Hibbert. 32:23 But on the other hand, you could argue from this that Dr. 32:27 Hibbert is successful 80% of the time. 
32:35 And Dr. Nick had 83 successes. 32:39 So on his infomercial, he can say he succeeded 83% 32:44 of the time, and that's 3% better than Dr. Hibbert. 32:49 And yet you're paying so much less for a 3% higher success rate. 32:53 So what did we do to get 80 and 83? 32:55 We were just aggregating, right? 32:57 Here I listed it out separately by surgery. 33:01 Here we aggregated. 33:03 Notice that the direction flipped, okay? 33:07 So for each individual surgery, Dr. Hibbert is better. 33:10 So that's conditional, right? 33:11 So it's the difference between conditional and unconditional. 33:14 Conditional on which type of surgery. 33:17 If you condition on performing heart surgery, you'd rather have Dr. Hibbert. 33:22 If you condition on having band-aid removal, you'd rather have Dr. 33:26 Hibbert, right? 33:28 But unconditionally, Dr. Nick has a higher success rate. 33:33 And the reason, you can see it here. 33:34 I just made this up as kind of an extreme example so that you get some intuition. 33:38 You see, what's going on is that 90% of Dr. Nick's surgeries are band-aid removals. 33:44 Well, band-aid removal is so much easier than heart surgery that it would be very 33:48 easy for him to get a higher success rate because he's doing easier surgeries. 33:52 And you can see, of course, this is just made up. 33:54 But in real examples you can imagine that for maybe the most famous and 33:59 most renowned neurosurgeons in the world, it could be that their 34:04 success rate isn't as great because they're getting the hardest cases, right? 34:10 They're getting the cases that no one else knows how to deal with. 34:13 So you refer those to the world's leading expert, and 34:16 that's gonna hurt their success rate, right? 34:19 So that's what's going on. 34:20 That's an example of Simpson's paradox. 34:23 I think when you see this example it's clear how that can happen. 
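To check the arithmetic in the two made-up tables, here is a small sketch that computes each doctor's success rate by surgery type and then aggregated over both types. The counts are the ones given in the lecture; the dictionary layout and the helper names (`rate`, `conditional_rates`, `overall_rate`) are just illustrative choices.

```python
from fractions import Fraction

# (successes, failures) for each doctor and surgery type, from the made-up tables.
data = {
    "Hibbert": {"heart": (70, 20), "bandage": (10, 0)},
    "Nick": {"heart": (2, 8), "bandage": (81, 9)},
}


def rate(successes, failures):
    """Success rate as an exact fraction."""
    return Fraction(successes, successes + failures)


def conditional_rates(doctor):
    """Success rate for each surgery type (conditioning on the type)."""
    return {surgery: rate(*counts) for surgery, counts in data[doctor].items()}


def overall_rate(doctor):
    """Success rate after aggregating over surgery types."""
    successes = sum(counts[0] for counts in data[doctor].values())
    failures = sum(counts[1] for counts in data[doctor].values())
    return rate(successes, failures)


for doctor in data:
    print(doctor, conditional_rates(doctor), overall_rate(doctor))
```

Dr. Hibbert comes out ahead within each surgery type (7/9 versus 1/5 for heart surgery, 1 versus 9/10 for bandage removal), while the aggregated rates are 4/5 for Dr. Hibbert and 83/100 for Dr. Nick, so the comparison flips.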
34:27 And yet, in problems that are completely the same, just in a different language 34:31 different setting, then it sounds like it's impossible. 34:34 How can it flip like that? 34:36 Okay? 34:37 So let me explain it a couple other ways and mention a couple other examples. 34:45 Another way to think of Simpson's paradox, well here's another example. 34:51 [COUGH] In baseball, it's possible to have two players 34:56 where the first player has a higher batting average. 35:02 A batting average is just what percentage of times they were at 35:05 bat that they got a hit. 35:07 One player has a higher batting average for the first half of the season and for 35:11 the second half of the season. 35:13 Yet, if you look at the whole season, 35:15 the second player has a higher batting average. 35:17 And again, that sounds impossible at first. 35:19 If you are better for the first half of the season and better for 35:22 the second half of the season how could you be worst for the whole season? 35:25 But you can see that it's basically the same problem is that you can easily make 35:28 up some numbers to illustrate that. 35:30 So it can flip. 35:31 What's really going on here, 35:33 another way to look at it, is just in terms of adding fractions, okay? 35:38 So if you had one-third plus two-fifths. 35:47 Well, let's see, I haven't added any fractions in a while. 35:50 But I think you're not supposed to just add the numerators and 35:53 add the denominators, right? 35:55 This is not equal to three-eighths, right? 36:00 That's our inequality for the day. 36:04 If you haven't studied adding fractions before, when you're learning for 36:08 the first time, that's kinda the obvious thing to do. 36:10 Add the numerators. 36:10 Add the denominator. 36:11 Well, that doesn't actually work, right? 36:14 However, if you add fractions this way, 36:19 that is closer to how you aggregate, right? 36:22 Because we're just adding up successes and adding up trials, right? 
36:26 So that is sort of how we do the aggregation here, right? 36:30 Add up the successes. 36:31 Add up the total number of trials, that kind of thing. 36:34 So if fractions were added this way, then Simpson's paradox would not occur. 36:40 But since that's not how we add fractions, it can happen. 36:45 Okay, there are a lot of other examples of Simpson's paradox. If you just 36:48 look at the Wikipedia entry, you can find as many as you want. 36:52 Examples that happen in real life and 36:55 actually have policy and legal implications. 36:59 There are some interesting legal cases involving Simpson's paradox. 37:04 Let me talk about how we express Simpson's Paradox in terms of 37:09 conditional probability. 37:11 So, just to kind of map this example into some events. 37:18 Suppose someone's going to have surgery, 37:24 and let A be the event that the surgery is successful. 37:33 And then we'll try to write out in probability notation what this example is 37:37 saying. 37:38 Okay? 37:39 So let A be the event that the surgery is successful. 37:42 Let B be the event that we're treated by Dr. Nick. 37:53 So, B complement would correspond to Dr. Hibbert. 37:56 So we don't need separate notation for that. 37:58 And then we need one more event, which is what type of surgery? 38:03 So let's call that C. 38:04 Let C be the event that we have heart surgery. 38:09 So C complement is band-aid removal, okay? 38:13 So those are the events that we need. 38:15 And let's just write out, 38:16 in probability notation, what are these tables telling us? 38:21 Okay? 38:21 Well, what's it saying? 38:22 It says the probability of success given that we're treated by Dr. 38:30 Nick, and we're having heart surgery, is less than the probability 38:37 that the surgery is successful given that we have Dr. Hibbert for heart surgery. 38:44 Right, that's just comparing the two heart surgeries. 38:47 Now, what about for band-aid removal?
38:50 The probability of success for band-aid removal, again, 38:55 it's less likely to have success with Dr. Nick than with Dr. Hibbert. 39:01 So this is just gonna be A given B complement, C complement. 39:06 Okay? 39:07 So those are the two statements we have. 39:08 But overall, when you aggregate, and aggregating means 39:15 don't condition on the type of surgery, then we're just looking at the overall. 39:19 So overall, the probability of A given B, that is success, 39:23 given that you're being treated by Dr. Nick, is greater than 39:27 the probability of success, given that you're treated by Dr. Hibbert. 39:32 So notice, these expressions look similar. 39:37 And it looks like if you combine these two cases, 39:41 all these are A given B on the left, and we have the two cases. 39:46 Given C, given C complement. 39:47 Looks like if you combine those cases you get this. 39:51 But if you add up or combine these inequalities somehow, 39:55 then you would guess that it would remain less than, but the inequality flipped. 40:00 So, Simpson's Paradox says that this is possible. 40:03 This is an explicit example showing that that's possible. 40:08 C is called a confounder. 40:17 I chose the letter C cuz it can stand for confounder or control. 40:21 It's something you wanna control for. 40:23 In this problem, 40:24 it seems a lot more relevant to condition on what type of surgery you have, right? 40:30 So the more relevant comparison would be the more conditional one in this case. 40:35 And everyone here agrees that Dr. 40:37 Hibbert is better, and that means we should condition on more things. 40:42 If we fail to condition on C, though, 40:44 we can get a very misleading answer here, right? 40:48 Because of the fact that if we don't condition on the type of surgery, 40:54 what happens is knowing that we got treated by Dr. 40:57 Nick gives us information about what type of surgery we had. 41:02 Which then in turn affects the probability of success.
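For reference, the three statements described in words above can be written compactly, with A the event that the surgery succeeds, B the event of being treated by Dr. Nick, and C the event of having heart surgery. The superscript c for a complement is a notational choice made for this note.

```latex
\begin{align*}
P(A \mid B, C) &< P(A \mid B^c, C), \\
P(A \mid B, C^c) &< P(A \mid B^c, C^c), \\
\text{and yet}\quad P(A \mid B) &> P(A \mid B^c).
\end{align*}
```

The first two lines compare the doctors within each surgery type; the third is the aggregated comparison, where the inequality points the other way.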
41:06 So that's Simpson's Paradox, 41:07 and this is just kinda like the general setting of Simpson's Paradox. 41:12 So basically any example of Simpson's Paradox can be written in this form. 41:17 So I would take the definition of Simpson's Paradox to be: whenever you have 41:21 these inequalities but it flips when you aggregate in this way. 41:25 That's a concrete example. 41:26 That's just the generic setting. 41:29 Okay, so you should try to think intuitively about why this is possible, 41:33 by thinking about examples. 41:35 Think about why this is possible. 41:39 And think about whether you should be conditioning on C. 41:42 And then there's one other thing that I wanted to mention about this, 41:48 just to go into a little bit more of the math of those equations, 41:52 why is that actually possible? 41:54 In the sense that the first time that I saw this, 41:58 I thought that from here and here, I thought I could probably prove this. 42:03 Because it looks like it's gonna be the law of total probability. 42:06 So I'm gonna show you what goes wrong. 42:09 Obviously it's not gonna work to prove less than here, 42:13 because we just saw an example that it could go this way. 42:16 But what if you were trying to prove that this and 42:19 this implies this with a less than, where would it go wrong? 42:21 I just wanted to show you quickly what would be wrong with that argument. 42:24 So the obvious argument to make is I wanna relate $P(A \mid B)$ to $P(A \mid B, C)$, 42:29 and that means we need to use the law of total probability. 42:36 So by the law of total probability, I would be conditioning on C. 42:41 This is just the conditional form of the law of total probability. 42:45 Conditional probabilities are probabilities, so I can do Bayes' rule, 42:48 law of total probability, anything I want, it's gonna work the same way, 42:51 just everything's given B. 42:53 So I'm conditioning on C, whether C happened or not.
42:57 But everything is still given B in the conditional form: 43:03 $P(A \mid B) = P(A \mid B, C)\,P(C \mid B) + P(A \mid B, C^c)\,P(C^c \mid B)$, okay? 43:11 So if you deleted the given Bs everywhere, 43:14 this would just be exactly the law of total probability, right? 43:19 And so therefore this is true, cuz conditional probabilities 43:22 are probabilities, so it's still true with given Bs. 43:25 Now if we compare this with what we had, 43:30 we know that P(A given B, C), this thing, is less 43:35 than P(A given B complement, C). 43:40 And this thing, P(A given B, C complement), is less than P(A given B complement, C complement). 43:48 So then you would wanna plug those in. 43:51 But at that point in the calculation, 43:53 you're not gonna be able to reduce it down to this. 43:56 It's because of this term and this term, we might call those weights. 44:04 Those are conditional on B, right? 44:07 And if we tried to write the same thing for P(A given B complement), 44:10 then we would have B complement here, and B complement there. 44:12 But we don't have any way to relate those weights for one case or 44:17 the other, so it's not gonna follow. 44:21 [COUGH] So it's important to contrast this with 44:26 the law of total probability, basically. 44:31 Intuitively, what this term is in the Dr. Nick case 44:34 is the probability of performing heart surgery for Dr. Nick, which is 44:38 very different from the probability of heart surgery for Dr. Hibbert. 44:42 So the weights change, and that's what enables Simpson's Paradox to happen. 44:48 Okay, so let me mention one or 44:52 two more examples of Simpson's Paradox. 45:01 So here's another one that's a pretty famous one that involved 45:07 a court case. So what happened was that [COUGH] there 45:12 was a lawsuit against UC Berkeley claiming sex 45:16 discrimination in admissions to their graduate programs. 45:22 That is, claiming that UC Berkeley was discriminating against women for 45:28 admissions to graduate programs.
45:31 And something interesting happened, in the sense that when 45:36 you looked at the overall admissions rates, 45:41 it looked like UC Berkeley was making it easier for 45:45 men to get in than women for grad school. 45:49 So then it seems to be a clear-cut case of discrimination. 45:52 When you look at the data more closely than that, then something different 45:57 turned up, which you see if you look at each individual department. 46:02 Cuz when you apply to grad school, you apply to a specific department, right? 46:06 For each individual department, there was not clear evidence, right? 46:11 So each individual department, I'm not saying they were all exactly fair. 46:15 But each individual department generally did not seem to be discriminating. 46:20 But when you aggregated all of the departments together, 46:24 then it seemed like it was very unfair to women. 46:27 And the reason is that certain departments are more popular for 46:32 women to apply to relative to men. 46:35 And certain departments are harder to get into than others and 46:38 somehow those things are what led to Simpson's Paradox occurring. 46:43 All right, one more example of Simpson's Paradox. 46:47 This is actually the first one. 46:48 I've loved paradoxes for a long time. 46:51 And this was the first one I ever saw, and I still find it helpful to think about it. 46:59 Let's see, how does it work? 47:01 We have two jars like that with jelly beans. 47:06 Though, if I were phrasing it now, I would use gummy bears. 47:09 But I first learned this problem with jelly beans. 47:11 There's two types of jelly beans. 47:14 Let's say open circles and closed circles, 47:18 so two flavors of jelly beans, okay? 47:23 In these two jars, and there's two more jars here, okay? 47:29 And there's more jelly beans. 47:30 I'm not gonna make up numbers. 47:31 All I'm saying is that there are two types of jelly beans in each of these four jars.
47:37 Okay, now suppose that you like one flavor better than another. 47:43 So you get to choose a random jelly bean from this jar or this jar, right? 47:47 But you just have to reach in, and 47:49 you can't actually pick which color you're getting. 47:51 So you just pick a random jelly bean from this jar or this jar. 47:54 And suppose this jar is better than this jar. 47:57 By better than, 47:58 I mean that this jar has a higher percentage of the one you like, right? 48:02 And suppose this jar... well, you can make up your own numbers. 48:05 It's actually good practice to make up your own specific numbers for this. 48:09 Suppose that this jar is better than this jar, okay? 48:13 That is, a higher percentage of the jelly bean you like. 48:16 Now suppose that someone then created a bigger jar, and then created another bigger jar. 48:23 I should've used a board where I could write this below. 48:25 These two jars, just dumped everything into one big jar. 48:29 So this one and this one all got dumped into here. 48:31 And this one and this one all got dumped into here, okay? 48:35 So I combined the two better jars and I combined the two worse jars. 48:39 Well, naively you'd think, well, I combined the better ones, 48:41 that's gonna be better, okay? 48:43 But then it could flip, and after you aggregate the jars, 48:46 suddenly this one has a higher percentage of your favorite jelly beans. 48:51 You should make up your own example, your own numbers showing that that can happen. 48:55 And that'll give you a lot of intuition for Simpson's Paradox. 48:57 Okay, so that's all for today, have a good weekend.
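Following the suggestion to make up specific numbers for the jelly bean jars, here is one possible set, written as a short Python sketch. All counts are hypothetical, and the names `better_jars`, `worse_jars`, `share`, and `dump_together` are invented for this note.

```python
from fractions import Fraction

# HYPOTHETICAL jar contents as (favorite-flavor beans, total beans).
# better_jars[i] is compared against worse_jars[i].
better_jars = [(9, 10), (30, 100)]
worse_jars = [(80, 100), (2, 10)]


def share(jar):
    """Fraction of beans in the jar that are the favorite flavor."""
    favorite, total = jar
    return Fraction(favorite, total)


def dump_together(jars):
    """Pour several jars into one big jar by adding up the counts."""
    return (sum(f for f, _ in jars), sum(t for _, t in jars))


for better, worse in zip(better_jars, worse_jars):
    print(float(share(better)), ">", float(share(worse)))

big_from_better = dump_together(better_jars)  # (39, 110)
big_from_worse = dump_together(worse_jars)    # (82, 110)
print(float(share(big_from_better)), "<", float(share(big_from_worse)))
```

Each "better" jar beats the jar it is compared with, but after dumping, the big jar made from the two "worse" jars has the higher share of the favorite flavor, since most of its beans come from a jar that was 80% favorite.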