Perhaps the topic most inquired about in the field of surveys, especially customer satisfaction surveys, is response rates. What’s a good response rate? What’s the statistical accuracy with a certain response rate? But what is sometimes missed is that the response rate should not be the only concern. Bias in the response group should be an equally important concern. Lots of responses will give good statistical accuracy, but that accuracy offers a false sense of security if the sample is biased. In particular, we’ll discuss survey fatigue and nonresponse bias.
A client of mine posed this question to me on the topic of survey response rates and statistical accuracy: “We launched our survey and have about a 22% response rate so far. Using your response rate calculator, I find that we currently have an accuracy of about 95% +/- 6%, and to get to a +/- 5% accuracy, we’d need about 70 more responses. I also remember that we discussed that a survey with >30% response rate is an indicator of survey health. So how important is it to get to 30% when the accuracy factors are already met?”
I get this type of question frequently in my survey workshops, so I thought it made good fodder for an article on the difference between survey accuracy and bias. This will likely be disconcerting to many readers, but it is possible to have a high degree of statistical accuracy and yet have misleading, bad data. Why? Because you could have a biased sample of data.
What Is Bias?
Let’s define bias first—with its antonym. An unbiased survey would mean that those who completed the survey for us were representative of the population, which is our group of interest. Thus, we would feel comfortable, within some level of confidence, projecting the findings from the sample data to the population as a whole.
However, since we are not getting data from everyone, there will be some random error in our results. That is, the findings from our sample will not match exactly what we would get if we successfully got everyone in the population to complete the survey. That difference is known as sampling error, and the statistical accuracy tells us how much sampling error we may have in our data.
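The arithmetic behind a statement like "95% +/- 6%" can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard margin-of-error formula for a proportion with a finite population correction; it is not necessarily what any particular response rate calculator does. The population of 950 and the 209 responses are hypothetical figures chosen to give roughly a 22% response rate, and the function names are my own.

```python
import math

Z_95 = 1.96  # z-score for a 95% confidence level


def margin_of_error(responses, population, p=0.5, z=Z_95):
    """Margin of error for a proportion, with finite population correction.

    p=0.5 is the most conservative assumption (widest margin).
    """
    standard_error = math.sqrt(p * (1 - p) / responses)
    fpc = math.sqrt((population - responses) / (population - 1))
    return z * standard_error * fpc


def responses_needed(target_margin, population, p=0.5, z=Z_95):
    """Smallest number of responses that reaches the target margin of error."""
    n0 = (z ** 2) * p * (1 - p) / target_margin ** 2  # size for an infinite population
    return math.ceil(n0 / (1 + (n0 - 1) / population))


# Hypothetical figures for illustration only, not a real client's data.
population = 950
responses = 209  # about a 22% response rate

moe = margin_of_error(responses, population)
needed = responses_needed(0.05, population)

print(f"Response rate: {responses / population:.0%}")
print(f"Margin of error at 95% confidence: +/- {moe:.1%}")
print(f"Additional responses needed for +/- 5%: {needed - responses}")
```

With these assumed numbers the margin of error comes out near 6%, and reaching 5% takes 274 responses, 65 more than the 209 in hand. Note that nothing in this calculation asks who responded, only how many.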
But bias is different, and high accuracy does not mean that there’s no bias. I work a lot with technical support groups. They frequently survey their customers after a service transaction to gauge the quality of service delivery, but usually some questions are included to measure views on the underlying product being supported. Are the findings about the product learned from a tech support survey a good unbiased indicator of how the entire customer base feels about the product? Probably not. The data are coming from people who have had issues with the product and thus needed support. It’s biased data.
Further, many tech support groups send out a follow-up survey for every incident. While you may complete them at first, after a while most of us would get tired of the constant requests for feedback, especially if we doubt they do anything with it. When might we then take the survey? When we’re motivated by a particularly good or bad experience. Self-selection bias, also known as nonresponse bias, is now distorting the data. Those who chose not to respond are likely different from those who did.
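To see how self-selection can coexist with a tight margin of error, here is a small simulation in Python. It is only a sketch under loudly stated assumptions: the population of 10,000 customers, the true 70% satisfaction rate, and the response propensities (dissatisfied customers three times as likely to respond) are all invented for illustration, not measured from any real survey.

```python
import math
import random

random.seed(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: 7,000 satisfied (True) and 3,000 dissatisfied (False).
population = [True] * 7000 + [False] * 3000

# Assumed response propensities: dissatisfied customers are three times
# as likely to answer the survey as satisfied ones.
RESPONSE_PROB = {True: 0.10, False: 0.30}

# Each customer independently decides whether to respond.
respondents = [s for s in population if random.random() < RESPONSE_PROB[s]]

n = len(respondents)
true_share = sum(population) / len(population)
observed_share = sum(respondents) / n

# Simple 95% margin of error for the observed proportion
# (finite population correction omitted for brevity).
moe = 1.96 * math.sqrt(observed_share * (1 - observed_share) / n)

print(f"Response rate:            {n / len(population):.1%}")
print(f"True share satisfied:     {true_share:.1%}")
print(f"Observed share satisfied: {observed_share:.1%} +/- {moe:.1%}")
```

Under these assumed propensities, roughly 16% of customers respond and the margin of error is only a few points, yet the respondents' satisfaction rate lands far below the true 70%. The margin of error describes random sampling error only; it says nothing about who chose to answer.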
Survey fatigue is another factor. “‘The more people say no to surveys, the less trustworthy the data becomes,’ Judith Tanur, a retired Stony Brook University sociology professor specializing in survey methodology, told the Associated Press.” Ironically, the article’s author then avers, “Social media is also a great channel for customer satisfaction feedback.” But this is simply trading one form of sample bias for another. Is the feedback garnered through social media representative of a company’s entire customer base? It is clearly a biased sample. A huge proportion of people, including this author, have little interest in the social media world, except for business purposes.
This point was made in a recent article by Carl Bialik, “The Numbers Guy” of the Wall Street Journal. In his article, “Tweets as Poll Data? Be Careful,” he notes: “Still, there are significant arguments against extrapolating from people who tweet to those who don’t: Twitter users tend to be younger and better educated than nonusers, and most researchers get only a sample of all tweets. In addition, it isn’t always obvious to human readers—let alone computer algorithms—what opinion a tweet is expressing, if there is an opinion there at all.”
I do agree that we should use multiple research methods, including social media, to draw on each method’s strengths. But know your biases, and know that biases are just as important as statistical accuracy.
How Is Bias Introduced in the Data?
Many different ways exist to introduce bias into our data inadvertently—or intentionally if we’re trying to lie with our statistics! Yes, my Pollyannaish friends, people do intentionally skew survey data.
We might use a survey method that appeals to one class of our population more than another. Think about who in your circle of friends or business associates is a “phone person” versus an “email person.” We might survey during some religious holidays, driving down responses from a group. If we’re sending invitations to just a sample and not to the entire population, we might not have done random sampling, or not done it correctly, likely introducing a bias. Maybe account managers get to choose who in a customer site gets the survey invitation; guess who they’re going to cherry-pick? Or maybe the car salesman gives you a copy of the survey you’ll be receiving, explaining that he’ll get the $200 for his daughter’s braces if you give him all 10s.
All of these can introduce bias into the data set, but the nonresponse bias caused by survey fatigue is particularly perplexing because it’s nearly impossible to measure. How can we tell how the nonrespondents differ from the respondents if we can’t get the nonrespondents to respond?
What we can say for sure is that as the response rate drops, the likelihood of significant nonresponse bias grows. There’s no magic threshold for a “healthy survey,” and even a statistically accurate survey could have significant nonresponse bias or other administrative biases. Higher response rates lower the impact of this self-selection bias, but biases remain the hidden danger to getting meaningful results.
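The link between response rate and nonresponse bias can be written as a simple identity. The notation here is my own assumption for illustration: r is the response rate as a fraction, and the three means are the average score for the whole population, for respondents (R), and for nonrespondents (NR).

```latex
\bar{y} = r\,\bar{y}_{R} + (1 - r)\,\bar{y}_{NR}
\quad\Longrightarrow\quad
\bar{y}_{R} - \bar{y} = (1 - r)\,\bigl(\bar{y}_{R} - \bar{y}_{NR}\bigr)
```

The left side of the second equation is the bias in what we report. For a fixed gap between respondents and nonrespondents, that bias shrinks as r rises and vanishes at a 100% response rate. The catch is that the gap itself is exactly the quantity we cannot observe, which is why no response rate short of 100% guarantees an unbiased result.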
To learn how to combat nonresponse bias, we suggest you attend one of Fred’s courses. The next one, “Survey Design & Data Analysis,” runs in Dubai from Nov 19-21, 2013.
Fred Van Bennekom is Founder and President of Great Brook, a consulting and training firm specializing in survey design and customer feedback.