Behind the Results: How the Survey of Candidate Preferences Was Conducted

Image: WHYY

The Mirror recently partnered with Northwood’s AP Statistics class on a study of how preference for a US presidential candidate varies across demographic groups at the school. A copy of the study results can be downloaded here. The class, which divided its responsibilities among student pairs, managed every aspect of the study, from gathering the data to analyzing it and turning it into graphics and figures for the Mirror to use.

AP Statistics teacher Mr. Bob Emery (Photo: Michael Aldridge)

Mr. Bob Emery, who teaches the AP Statistics class, spoke about the class’s process in conducting the survey. “The timing was kind of rushed with the election coming, but I think the class did a pretty good job, especially with the non-response bias. The biggest challenge in conducting it was finding a sample size that would work.” Mr. Emery explained the class’s decision to go with a 25% sample size, saying, “Running tests on samples depends on the independence of one data point from the next. The more you sample, the less independent the data becomes. So, the rule of thumb for running tests is generally not to sample more than 10% of the population, but in our case, 10% would have been around 19 students and a few staff. That’s way too small to have any conclusions, so we used 25%, knowing there’d be some non-response. It’s settled out around 20%, which is big enough to make some conclusions, but small enough to have the conclusions be legitimate.”
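The arithmetic behind that trade-off can be sketched in a few lines of Python. This is only an illustration, not the class's actual calculation. It assumes a population of about 190 students (inferred from "10% would have been around 19 students") and leaves staff out, since no staff count is given. The 80% response rate is likewise an inference from inviting 25% and ending up near 20%.

```python
# Assumption: about 190 students, inferred from the quote above.
# Staff are omitted because the article gives no staff count.
POPULATION = 190


def sample_plan(population, invite_percent, response_percent):
    """Return (number invited, number responding, responders as a share of the population)."""
    invited = round(population * invite_percent / 100)
    responded = round(invited * response_percent / 100)
    return invited, responded, responded / population


# The usual rule of thumb: sample no more than 10% of the population.
ten_percent_rule = sample_plan(POPULATION, 10, 100)

# The class's approach: invite 25% and expect some non-response.
# An 80% response rate is an assumption implied by the 25% and 20% figures.
class_plan = sample_plan(POPULATION, 25, 80)

print("10% rule:", ten_percent_rule)
print("25% invited, 80% responding:", class_plan)
```

Under these assumptions, inviting a quarter of the population and losing some of it to non-response leaves roughly twice as many responses as the 10% rule would have allowed.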

Once the data was collected, the class was divided into pairs. Each pair was tasked with producing “some sort of comparative visual” for its section: comparing political preference with sports cohort, gender, or grade/faculty. Mr. Emery then tested the data his students compiled to see whether the associations they found were stronger than chance alone would be expected to produce. He said, “All tests result in a p-value, and interpreting the p-value correctly is crucial.” A p-value is the probability of seeing an association at least as strong as the one observed if there were really no relationship and only chance were at work; thus, the lower the p-value, the stronger the evidence that the association is real. A p-value below 0.05 is conventionally considered statistically significant, because a result that extreme would turn up less than 5% of the time by chance alone. It does not mean there is a 95% certainty that the result is real.
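One way to see what a p-value measures is to simulate "chance alone" directly. The sketch below uses a permutation test, which is not necessarily the test Mr. Emery ran. It shuffles the preference labels many times and counts how often the shuffled data shows an association at least as strong as the real one. The group names, candidate labels, and counts are all made up for illustration and are not the study's data.

```python
import random
from collections import Counter


def chi_square_stat(groups, choices):
    """Chi-square statistic measuring how far the table is from 'no association'."""
    n = len(groups)
    row_totals = Counter(groups)
    col_totals = Counter(choices)
    cells = Counter(zip(groups, choices))
    stat = 0.0
    for g in row_totals:
        for c in col_totals:
            expected = row_totals[g] * col_totals[c] / n
            stat += (cells[(g, c)] - expected) ** 2 / expected
    return stat


def permutation_p_value(groups, choices, n_shuffles=2000, seed=0):
    """Share of random shuffles whose association is at least as strong as the observed one."""
    rng = random.Random(seed)
    observed = chi_square_stat(groups, choices)
    shuffled = list(choices)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)  # break any real link between group and choice
        if chi_square_stat(groups, shuffled) >= observed - 1e-12:
            hits += 1
    # Add one to both counts so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)


# Hypothetical data: 20 athletes and 20 non-athletes choosing candidate "A" or "B".
groups = ["athlete"] * 20 + ["non-athlete"] * 20
choices = ["A"] * 15 + ["B"] * 5 + ["A"] * 5 + ["B"] * 15
p_value = permutation_p_value(groups, choices)

# Hypothetical data with no association at all: both groups split evenly.
even_choices = ["A"] * 10 + ["B"] * 10 + ["A"] * 10 + ["B"] * 10
p_value_even = permutation_p_value(groups, even_choices)

print("p-value, strong association:", p_value)
print("p-value, no association:", p_value_even)
```

A small p-value here says that shuffled, chance-only data rarely looks as lopsided as the observed data. It says nothing about how large or important the difference is.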

Mr. Emery also spoke about the class’s choice to conduct a study, not a poll: “We randomly selected the people who would be a part of it instead of conducting a poll where we ask people to respond. If we did a poll, we would have a larger data set but also significantly more biased and less reliable numbers. So, we chose as a class to stick with the study process, which is to randomly select people and then hound them a little bit, so we don’t have too much non-response bias. That’s why we’re allowed to run those tests and make those conclusions.”
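The random-selection step Mr. Emery describes can be sketched as a simple random sample drawn from a roster. This is a minimal illustration, not the class's actual procedure: the roster, its size of 200, and the function name are all hypothetical.

```python
import random


def draw_simple_random_sample(roster, fraction, seed=None):
    """Pick a fraction of the roster so that every person is equally likely to be chosen."""
    rng = random.Random(seed)
    sample_size = round(len(roster) * fraction)
    return rng.sample(roster, sample_size)  # sampling without replacement


# Hypothetical roster of 200 students and staff.
roster = [f"person_{i:03d}" for i in range(1, 201)]

# Select 25% of the roster; these are the people to follow up with.
selected = draw_simple_random_sample(roster, 0.25, seed=2024)
print(len(selected), "people selected, for example:", selected[:3])
```

Because the computer chooses who is asked, rather than respondents choosing themselves, the main remaining source of bias is the selected people who never answer, which is why the follow-up matters.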