#### This activity uses a bag of differently-sized candies to get at the concept of bias in sampling.

*This activity comes to us from Andrew Gelman. The original activity text is below, and a link to the activity is here.*

---------------------------------------------------------------------------

"My favorite statistics demonstration is the one with the bag of candies. I’ve elaborated upon it since including it in the Teaching Statistics book and I thought these tips might be useful to some of you.

**Preparation**

Buy 100 candies of different sizes and shapes and put them in a bag (the plastic bag from the store is fine). Get something like 20 large full-sized candy bars, 20 or 30 little things like mini Snickers bars and mini Peppermint Patties, and then 50 or 60 really little things like tiny Tootsie Rolls, lollipops, and individually-wrapped Life Savers. Count and make sure it’s exactly 100.

You also need a digital kitchen scale that reads out in grams.

Also bring a sealed envelope inside of which is a note (details below). When you get into the room, unobtrusively put the note somewhere, for example between two books on a shelf or behind a window shade.

**Setup**

Hold up the bag of candy and the scale and write the following on the board:

Each pair of students should:

1. Pull 5 candies out of the bag

2. Weigh the candies

3. Write down the weight

4. Put the candies back in the bag!!

5. Pass the scale and bag to your neighbors

6. Silently multiply the weight of the 5 candies by 20.

(And, as Frank Morgan told me once, remember to read aloud everything you write on the board. Don’t write silently.)

The students should work in pairs. Explain that their goal is to estimate the total weight of all the candies in the bag. They can choose their 5 candies using any method: systematic sampling, random sampling, whatever. Whichever pair guesses closest to the true weight gets the whole bag!
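The arithmetic behind step 6 can be sketched in a few lines of Python. This is only an illustration: the function name `estimate_total_weight` and the 154 g reading are made up for the example, and the factor of 20 is simply 100 candies divided by a sample of 5.

```python
def estimate_total_weight(sample_weight_grams, sample_size=5, bag_size=100):
    """Scale the weight of a small sample up to an estimate for the whole bag."""
    # Each sampled candy stands in for bag_size / sample_size candies (100 / 5 = 20).
    scale_factor = bag_size / sample_size
    return sample_weight_grams * scale_factor


# Hypothetical reading: five candies that weigh 154 g together.
print(estimate_total_weight(154))  # 3080.0
```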

Demonstrate how to zero the scale, give the scale and the bag of candies to a pair of students in the front row, and let them go.

**Action**

The demo will proceed quietly in the background while the rest of the class goes on, so do whatever you were going to do in class. Take a look to make sure the scale and bag are moving slowly through the room. After about 30 or 40 minutes, they will reach the back and the students will be done.

At this point, ask the pairs, one at a time, to call out their estimates. Write them on the board. They will be numbers like 3080, 2400, 4340, and so forth. Once all the numbers are written, make a crude histogram (for example, bins from 2000-3000 grams, 3000-4000, 4000-5000, etc.). This represents the sampling distribution of the estimates.
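If you want to tally the board numbers afterwards, a crude histogram with 1000-gram bins could be sketched like this. The helper `crude_histogram` is a hypothetical name, and only the first three estimates come from the text above; the rest are invented for illustration.

```python
from collections import Counter


def crude_histogram(estimates, bin_width=1000):
    """Count estimates per bin; each bin is keyed by its lower edge in grams."""
    return Counter((int(x) // bin_width) * bin_width for x in estimates)


# Illustrative estimates in grams. The first three are the examples from the
# text; the remaining four are made up.
estimates = [3080, 2400, 4340, 2900, 3500, 3650, 2250]
counts = crude_histogram(estimates)

# One row per bin, one '#' per pair of students.
for lower in sorted(counts):
    print(f"{lower}-{lower + 1000} g: {'#' * counts[lower]}")
```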

Now call up two students from the class (but not from the same pair) to look at all the estimates. Ask them what their best guess is, having seen this information. Ask the class if they agree with these two students. Now give the bag to the two students in the front of the room and have them weigh it.

**Punch line**

The weight of all 100 candies will be something like 1658 grams. It’s always, always, always lower than *all* of the individual guesses on the board. Write this true weight as a vertical bar on the histogram that you’ve drawn. This is a great way to illustrate the concepts of bias and standard error of an estimator.
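To put numbers on those two concepts, one rough approach is to treat the class's estimates as repeated draws of the same estimator: the bias is how far their average sits from the true weight, and the standard error is their spread around their own average. The sketch below uses the example figures quoted above (three estimates and a true weight of 1658 g) purely as illustrative inputs; `bias_and_standard_error` is a hypothetical helper name.

```python
import statistics


def bias_and_standard_error(estimates, true_value):
    """Return (bias, standard error) for a set of repeated estimates."""
    # Bias: how far the average estimate is from the truth.
    bias = statistics.mean(estimates) - true_value
    # Standard error: spread of the estimates around their own mean,
    # not around the true value.
    standard_error = statistics.stdev(estimates)
    return bias, standard_error


estimates = [3080, 2400, 4340]   # example estimates in grams
true_weight = 1658               # example true weight in grams
bias, se = bias_and_standard_error(estimates, true_weight)
print(f"bias = {bias:.0f} g, standard error = {se:.0f} g")
```

A positive bias here corresponds to the whole histogram sitting to the right of the vertical bar.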

Now call out to the students who are sitting near where you hid the envelope: “Um, uh, what’s that over there . . . is it an envelope??? Really? What’s inside? Could you open it up?” A student opens it and reads out what’s written on the sheet inside: “Your guesses are all too high!”

**Aftermath**

Now’s the time to talk about sampling. Large candies are easy to see and to grab, while small candies fall through the gaps between the large ones and end up at the bottom of the bag. You can draw analogies to trying to get a random sample by going to the mall or by sending out an email survey and seeing who responds. Ask: how could you do a random sample? It won’t be obvious to the students that the way to do a random sample is to number each of the candies from 1 to 100 and pick numbers at random. Also, as noted above, this is an example you can use later in the semester to illustrate bias and standard error.
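A small simulation can make the contrast concrete. Everything in this sketch is an assumption made for illustration: the bag contents (20 candies at 50 g, 30 at 15 g, 50 at 5 g), and the idea of modeling "grabbing by hand" as picking candies with probability proportional to their weight. It is not a claim about how students actually choose, only a simple stand-in for the tendency to grab big candies.

```python
import random
import statistics

# Hypothetical bag: 20 large (50 g), 30 medium (15 g), 50 small (5 g) candies.
WEIGHTS = [50] * 20 + [15] * 30 + [5] * 50   # candy number i weighs WEIGHTS[i - 1]
TRUE_TOTAL = sum(WEIGHTS)                     # 1700 g for this made-up bag


def random_sample_estimate(rng):
    """Number the candies 1..100, draw 5 numbers at random, scale up by 20."""
    numbers = rng.sample(range(1, 101), 5)
    return 20 * sum(WEIGHTS[n - 1] for n in numbers)


def grab_sample_estimate(rng):
    """Crude model of grabbing by hand: heavier candies are likelier to be picked."""
    remaining = list(range(1, 101))
    picked = []
    for _ in range(5):
        # Pick one of the remaining candies with probability proportional to weight.
        n = rng.choices(remaining, weights=[WEIGHTS[i - 1] for i in remaining])[0]
        remaining.remove(n)
        picked.append(n)
    return 20 * sum(WEIGHTS[n - 1] for n in picked)


rng = random.Random(0)  # fixed seed so the run is repeatable
random_mean = statistics.mean(random_sample_estimate(rng) for _ in range(2000))
grab_mean = statistics.mean(grab_sample_estimate(rng) for _ in range(2000))
print(TRUE_TOTAL, round(random_mean), round(grab_mean))
```

Under these assumptions the numbered-candy estimates average out close to the true total, while the grab-by-hand estimates average well above it, which is the same pattern the histogram on the board shows.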

P.S. My feeling about describing these demos is the same as what Penn and Teller say about why they show audiences how they do their tricks: it’s even cooler when you know how it works.

P.P.S. Remember—it’s crucial that the candies in the bag be of varying sizes, with a few big ones and lots of little ones!"