(no subject)
Oct. 2nd, 2005 09:07 am
More probability and statistics from the brother in graduate school...
Suppose from a 40 person class you are going to select groups of 10 students at random (such that all samples are equally likely). You are interested in probability that a student, Buffy, is chosen.
A) First suppose that you are choosing students to award prizes, and you are willing to allow one student more then one prize. Consequently, you will sample 10 students from the class with replacement. This means you select one student and then put them (them is not an appropriate pronoun for "one student" no matter what the rest of the world does. I do not approve.) back in the selection pool. What is the probability Buffy with (I cut and paste these. Errors are in the originals.) be one of the students chosen?
B) Using this sampling approach, what is the probability of Buffy being chosen more then once?
C) Now suppose that instead of choosing students to award prizes, you are choosing students to staff a committee. Consequently, no student may be chosen more then once. What is the probablity of Buffy being chosen?
D) Explain why you expect the probability of Buffy being chosen to be higher when sampling is done without replacement.
A. For sampling with replacement, the selection pool does not get smaller in successive events. (If you're comparing this to dice or cards, this is more like rolling dice. Drawing cards to make a poker hand is a good model for sampling-without-replacement, which we'll get to here directly.) For our example, Buffy has a 1 in 40 chance of being chosen on any given iteration. If there were only one trial, the results would look like this:
Buffy
39 other students
Total number of outcomes: 40
Number of outcomes including at least one Buffy result: 1
Odds of a Buffy result in one trial: 1/40 (.025)
If there were two trials, the possible results would look like this:
Buffy, Buffy (1)
Buffy, 39 other students (39)
39 other students, Buffy (39)
39 other students, 39 other students (1521)
Total number of outcomes: 1 + 39 + 39 + 1521 = 1600 (40*40, which is the total number of choices we were expecting)
Total number of outcomes including at least one Buffy result: 79
Odds of a Buffy result in two trials: 79/1600 (~.0493)
If there were three trials, you'd get
Buffy, Buffy, Buffy (1)
Buffy, Buffy, 39 other students (39)
Buffy, 39 other students, Buffy (39)
Buffy, 39 other students, 39 other students (1521)
39 other students, Buffy, Buffy (39)
39 other students, Buffy, 39 other students (1521)
39 other students, 39 other students, Buffy (1521)
39 other students, 39 other students, 39 other students (59319)
Total outcomes: 64000
Total number of outcomes with at least one Buffy: 4681
Odds of a Buffy result in three trials: 4681/64000 (~.0731)
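If you don't trust the hand counts above, a brute-force check is cheap for small numbers of trials. This is just a sketch: the function name count_with_buffy is made up for illustration, and student 0 stands in for Buffy.

```python
from itertools import product

def count_with_buffy(n_students, trials):
    """Count ordered draws (with replacement) that include Buffy at least once."""
    # Student 0 plays Buffy; each tuple is one equally likely outcome.
    outcomes = product(range(n_students), repeat=trials)
    return sum(1 for draw in outcomes if 0 in draw)

for trials in (1, 2, 3):
    hits = count_with_buffy(40, trials)
    total = 40 ** trials
    print(trials, hits, total, hits / total)
```

This walks every single outcome, so it is fine for three trials (64000 outcomes) and hopeless for ten.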
The first thing we should be seeing here is that it's a pain in the fucking ass to count the Buffy is selected results.
The second thing we should be seeing here is that it's duck soup to count the Buffy isn't selected results because they are given by (N-1)^trials where N is 40 and trials is 10.
So, then, the number of outcomes where Buffy is not selected is 39^10, which is 8140406085191601
The total number of outcomes is 40^10, which is 10485760000000000
Since the Buffy isn't selected and Buffy is selected outcomes have to add up to the total, we can subtract Buffy isn't selected from the total to get the number of times that Buffy is selected, then divide by the total. That'd be .2236703 (etc.) in a tidy decimal instead of in a huge, cock-knockingly large number. Huzzah for calculators.
The probability that Buffy is one of the students chosen is .2236703 (etc.)
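For anyone who would rather not trust a calculator with sixteen-digit numbers, the same arithmetic can be sketched with exact integers and fractions. The variable names here are my own, nothing official.

```python
from fractions import Fraction

N = 40        # students in the class
TRIALS = 10   # draws, with replacement

misses = (N - 1) ** TRIALS   # ordered outcomes with no Buffy at all
total = N ** TRIALS          # all ordered outcomes
p_chosen = 1 - Fraction(misses, total)

print(misses)
print(total)
print(float(p_chosen))       # roughly 0.2237
```

Python integers don't overflow, so the big counts come out exact and only the last line gets rounded to a float.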
B. Argh. The probability that Buffy is chosen more than once is going to be lower than the probability that Buffy gets chosen at all. A lot lower. The number of outcomes with no Buffys chosen is 39^10, as we learned from the previous problem. The number of outcomes with Buffy chosen JUST ONCE (in three trials) is given by 1521 + 1521 + 1521 which is (N-1)^(trials-1)*trials, a cutesy formula that looks good for the trials=2 case as well. So, let's run some numbers, there. (39^9)*10 should give us the number of times Buffy is chosen JUST ONCE. That'd be 2087283611587590 according to my computer. Whatever. We add that to the outcomes where Buffy isn't selected at all (which we have from part A) and divide by the total number of outcomes to get .975388497 or thereabouts. Subtract from 1 to get the probability of Buffy being chosen MORE THAN ONCE: .0246115
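Here is the part B recipe as a small sketch, using the same counts: none, exactly once, and everything else. The function name p_more_than_once is invented for this example.

```python
from fractions import Fraction

def p_more_than_once(n_students, trials):
    """Probability one particular student is drawn at least twice, with replacement."""
    total = n_students ** trials
    none = (n_students - 1) ** trials                # Buffy never drawn
    once = trials * (n_students - 1) ** (trials - 1) # Buffy drawn in exactly one slot
    return 1 - Fraction(none + once, total)

print(float(p_more_than_once(40, 10)))   # roughly 0.0246
```

Running it with trials=3 gives 118/64000, which matches adding up the rows with two or more Buffys in the three-trial table from part A (39 + 39 + 39 + 1).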
C. Back in part A (You in the back, please wake up. This material WILL be on the exam.) I mentioned rolling of dice vs. drawing a poker hand. This committee thing, here, this is the poker hand example. We're picking ten students out of forty for a committee in this part of the problem.
First trial, good old Buffy has a 1 in 40 chance of being picked. After one trial, our results look like this:
Buffy (1)
39 other kids (39)
Total: 40
Second trial:
Buffy, 39 other kids (39)
39 other kids, Buffy (39)
39 other kids, 38 other kids (because we used one up in the first go round) (1482)
Total: 1560 (40 * 39 gives total number of permutations, here)
Third trial:
Buffy, 39 other kids, 38 other kids (1482)
39 other kids, Buffy, 38 other kids (1482)
39 other kids, 38 other kids, Buffy (1482)
39 other kids, 38 other kids, 37 other kids (54834)
Total: 59280 (40 * 39 * 38 gives total number of permutations)
Generalizing for probability Buffy is chosen... we can see that the number of outcomes where Buffy is NOT chosen is given by (N-1)(N-2)(N-3)...(N-trials) for however many trials there are. Since we are doing 10 trials, then we will need to do 39*38*37*36*35*34*33*32*31*30. That is how many times Buffy is not chosen. The total number of outcomes is 40*39...32*31 so we're going to (again) do the divide by and subtract from one thing. Pay no attention to the man behind the curtain. Everything from 39 down to 31 cancels, leaving 30/40. That'd be .75, which is way, way too fucking pat to be wrong. The probability, therefore, that darling Buffy is chosen for a committee is 1/4 or .25.
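The divide-and-subtract step can be sketched pick by pick, multiplying the chance that Buffy dodges each draw given she dodged all the earlier ones. The function name is hypothetical; the math is the same product as above, just rearranged as a running fraction.

```python
from fractions import Fraction

def p_chosen_without_replacement(n_students, picks):
    """Probability one particular student lands in a sample drawn without replacement."""
    p_missed = Fraction(1)
    for already_picked in range(picks):
        remaining = n_students - already_picked
        # Buffy is one of `remaining` students still in the pool for this pick.
        p_missed *= Fraction(remaining - 1, remaining)
    return 1 - p_missed

print(p_chosen_without_replacement(40, 10))   # 1/4
```

Using Fraction keeps the cancellation exact, so the answer comes out as 1/4 rather than a float that merely looks like .25.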
I'm re-quoting the question for part D because I had forgotten it by this time and I expect you had, too, those of you who are still with us after this dredge through statistics. Why, you ask yourselves, can't I just write about pr0n and horses and bad tenants and cooking and stuff? Does there HAVE to be math, too? Yes. There has to be math. If you don't like it, there are probably forty bazillion teenage angst LJs out there whinging about the terrifying lightness of puberty. Go read one of them, if you can tolerate the lime text on black background, and leave me to my numbers. Oooh, pretty!
D) Explain why you expect the probability of Buffy being chosen to be higher when sampling is done without replacement.
Well, d'oh. When you're sampling WITHOUT replacement, your sample pool gets smaller each time. (Iffn you were "selecting" a committee of 40 students from a class of 40 kids, Buffy would be chosen with a probability of 1.) When the pool gets smaller, any given fish is more likely to be caught. If you sample WITH REPLACEMENT, the pool stays the same size and the odds for being chosen never get any better. This is pretty much why being chosen last for dodgeball in grade school was so psychically scarring... as the pool of children to be picked dwindled, your not-being-chosen status grew ever more important, highlighting your social undesirability more and more. Er. Not that I'm speaking from personal experience or anything.
*sigh* Ah, hell, who am I fooling? I just did statistics for fun. Yes, yes, I was the sort of person chosen last for dodgeball.
Mrs. Souders? If you're still alive, I hate you.
(grade school gym teacher)