When randomized lists make a difference

Usually when I’m asked about random or shuffled questions and values, it’s because someone came across the feature presented as The Right Way to do research, or spotted it in a software feature list. Rarely does it come up because of a genuine problem with order biases. My general advice is to first look at all the other tweaks you can make to your survey, from sampling to usability to scale phrasing to analysis, and then decide whether randomization makes your survey’s must-have list.

For when it is an issue, or for anyone thinking “Why not use it if I have it?”, here we go...

Why random?

Randomization addresses order biases: the hypothesis (or fact) that a respondent will answer a question more positively, more negatively, or more frequently simply because of where it appears. Random doesn’t actually fix anything; it just averages out any effects by changing which item shows up first, fifth, or twentieth. Order biases can take several forms:

  • Primacy and recency
    Sometimes items appearing first (primacy) or last (recency) are more likely to be picked, rated highly, and so on. This is where randomization helps most.
  • Fatigue
    Yes, random can help here, but if respondents are getting fatigued in your survey, you have a larger issue than order biases. Look for chaff, or ways to increase involvement, to address the underlying problem first. Also, remember that adding helpful structure/order may be better than a random band-aid.
  • Routine
    Respondents skim, so with repeated or similar elements in a survey (such as answer scales), they can fall into a clicking routine. Random can be used, but more often this is addressed by “mixing things up,” i.e. flipping scale orders, changing labels, etc. Be cautious here. When done badly, as in the image below, you can end up with unreliable data (I certainly wouldn’t count on Q19). Mixing things up generally trades a small potential for routine bias for a significant risk that respondents will misread inconsistent questions, or will abandon more often as they get fed up with the extra effort “arbitrary” changes demand. Naturally, some situations, such as Semantic Differentials, still call for a bit of mixing up.

Grids with flipped scales

What to shuffle?

You can randomize any unordered set of elements, including:

  • Answer scale values
  • Rows of a question grid/table
  • Questions within a page
  • Pages within a survey

Or what to sub-set?

Sometimes you simply have too many questions or combinations to present everything to each respondent—such as with Conjoint analysis. In this case, you may present a random sub-set to each individual. Configuring these surveys gets tricky, as respondents will often answer relative to the question set they see.
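A sketch of sub-setting, assuming a hypothetical pool of 24 profiles of which each respondent sees six; `random.sample` draws a subset without repeats:

```python
import random

# Hypothetical pool of conjoint-style combinations; names are illustrative.
combinations = [f"profile_{i}" for i in range(24)]
per_respondent = 6  # how many combinations each respondent actually sees

shown = random.sample(combinations, per_respondent)  # unique sub-set, no repeats
```

Since respondents answer relative to the set they see, the analysis then has to account for which sub-set each person was shown.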

But always remember, bad random is worse than no random

There are two facets here: avoiding usability problems, and not tipping your hand to respondents about a survey’s tricks.

  • Is it truly unordered?
    While statistically anything nominal/categorical is “unordered,” watch for elements such as titles and locations that have a hierarchy or a common sort. Similarly, when shuffling questions or grid rows, you may be able to achieve a graceful progression by ordering them manually, a benefit that can outweigh any potential order bias.
  • Does it look static?
    If a respondent goes forward/back, or pauses and comes back later, do they see the same sort order, or do they have to get acquainted with a “new” list? Similarly, if the same list of products appears in two questions or grids, does it shuffle the same way, or does it create extra work?
  • Does it work with sub-heads?
    A grid or very long answer set may have sections (such as product categories), within which you want items shuffled. If this is an issue, check that the software can handle it.
  • Are all the question numbering, validation, skip, progress bar, etc. bits resolved properly?
    Make sure any ripple effects are handled and tested; this is why even “free” bells and whistles have a cost.
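The “does it look static?” and sub-head checks above can both be handled by seeding the shuffle per respondent (and per list), so the order is random across people but repeatable for any one person. A Python sketch, with hypothetical IDs and section names:

```python
import random

def stable_shuffle(items, respondent_id, salt="q12"):
    """Shuffle that differs across respondents but is repeatable for one
    respondent, so the order survives back/forward navigation and the
    same list shuffles identically wherever it reappears."""
    rng = random.Random(f"{respondent_id}:{salt}")  # seed from respondent + list identity
    shuffled = list(items)
    rng.shuffle(shuffled)
    return shuffled

def shuffle_within_sections(sections, respondent_id):
    """Keep section (sub-head) order fixed; shuffle only the items inside each."""
    return {
        head: stable_shuffle(items, respondent_id, salt=head)
        for head, items in sections.items()
    }
```

The `salt` keeps two different lists on the same page from sharing an order; whether your survey software exposes anything like this is worth checking before you commit to the design.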

Reality check

The only way to really determine the degree of bias your project faces is a split test (half random, half fixed order) with a statistically valid sample. However, you can usually estimate whether enough respondents would answer with the same bias, strongly enough, that you’d reach a different business decision with randomization than without.