Sorting out scales in performance measurement. Rating scales pros and cons
This article tries to answer the most common questions that come to a researcher’s mind when deciding what scale to use in marketing or business research: “Are there any differences in calculations when different scales are used?”, “Which advantages best suit a particular need?”, “Which is the best option for a unipolar or bipolar variable?”.
There is an ongoing disagreement among researchers about which scale best balances reliability and respondent experience.
First of all, a 5 or 7 point scale is more likely to produce higher mean scores, relative to the highest attainable score, than a 10 point scale. This means that using a 5 point rating scale may give the impression that satisfaction levels, for example, are higher overall than they are in reality.
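Part of this gap is plain arithmetic: on scales that start at 1, the lowest possible answer is already 20% of the maximum on a 5 point scale, but only 10% on a 10 point scale. The sketch below is an illustration rather than a method taken from the sources. The helper names relative_mean and rescale_to_percent are hypothetical, and the 0 to 100 rescaling is one common convention, assumed here, for comparing results collected on different scales.

```python
def relative_mean(scores, scale_max):
    """Mean score expressed as a fraction of the highest attainable score."""
    return sum(scores) / len(scores) / scale_max


def rescale_to_percent(score, scale_min, scale_max):
    """Map a score to 0-100, where 0 is the scale minimum and 100 the maximum."""
    return (score - scale_min) / (scale_max - scale_min) * 100


# A perfectly neutral respondent sits at the midpoint of each scale.
midpoint_5 = 3.0    # midpoint of a 1-5 scale
midpoint_10 = 5.5   # midpoint of a 1-10 scale

# Relative to the maximum, the 5 point midpoint looks higher...
print(relative_mean([midpoint_5], 5), relative_mean([midpoint_10], 10))

# ...but once the scale minimum is accounted for, both are equally neutral.
print(rescale_to_percent(midpoint_5, 1, 5), rescale_to_percent(midpoint_10, 1, 10))
```

Rescaling removes only this arithmetic offset. It does not correct for differences in how respondents actually behave on shorter or longer scales.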
Another common challenge comes with using bipolar or unipolar variables. Both are found in many current performance measures, such as Customer Satisfaction and Customer Experience. A bipolar variable describes attitudes that can fall on either side of a midpoint that is truly ambivalent or neutral, whereas a unipolar variable measures the degree of a single attribute, from none at all to a maximum. Whether a variable is bipolar or unipolar can influence which type of scale is best used.
If the purpose of the research calls for a bipolar variable (for example, rating your satisfaction with a certain product or service, from complete dissatisfaction to high appreciation), it is recommended to use a scale with more than 5 points, so that any skew in the answers can actually be perceived. On a 5 point scale, people commonly answer four or five, and many of them will never rate anything as a “5”. As a result, “4” ends up covering both the impressions of those who are really very satisfied and those who are only somewhat satisfied. This is otherwise known as the topping effect. A recommended practice for avoiding this unwanted outcome is using at least a 7 point scale. A 7 point or 10 point scale also makes it easier to analyze the bottom-line effects of satisfaction, given that statistical tools such as regressions work better with a more granular score, that is, one that can take more distinct values.
It is argued that the 5 point Likert Scale is too blunt to detect differences between items and to precisely measure specific opinions, as the respondent’s true opinion can lie in between the answer categories.
Although a 5 point Likert Scale can force respondents whose opinion lies between two points to round it up or down, statistically these errors tend to cancel each other out. The assumption is that the number of times one over-rates an experience or performance will be matched by an equal number of under-rates, so the responses forced into higher numbers are likely to be canceled out by those forced into lower numbers. From this point of view, a 7 point scale will likely offer only a small benefit over the 5 point version. This benefit will likely be significant only if you have few response items (fewer than 10) and very large sample sizes.
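The cancellation assumption above can be illustrated with a small simulation. This is a sketch under stated assumptions, not a result from the cited sources: true opinions are assumed to be spread uniformly along a continuum from 0 to 1, each respondent picks the nearest scale point, and the function name simulate_rounding is hypothetical. If real opinions cluster near one end of the scale, the errors need not cancel so neatly.

```python
import random


def simulate_rounding(num_points, num_respondents=10_000, seed=0):
    """Return (mean signed error, mean absolute error) caused by forcing
    continuous opinions onto a scale with `num_points` answer categories.
    Errors are expressed as a fraction of the full scale range."""
    rng = random.Random(seed)
    signed_total = 0.0
    absolute_total = 0.0
    for _ in range(num_respondents):
        latent = rng.random()                       # assumed true opinion in [0, 1)
        position = 1 + latent * (num_points - 1)    # same opinion on the 1..k scale
        answer = round(position)                    # nearest available category
        error = (answer - position) / (num_points - 1)
        signed_total += error
        absolute_total += abs(error)
    return signed_total / num_respondents, absolute_total / num_respondents


for points in (5, 7):
    bias, typical_error = simulate_rounding(points)
    print(f"{points} point scale: mean signed error {bias:+.4f}, "
          f"mean absolute error {typical_error:.4f}")
```

Under these assumptions the mean signed error stays close to zero on both scales, which is the cancellation described above, while the mean absolute error per answer is smaller on the 7 point scale. That difference matters less and less for averages as more answers are aggregated.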
In turn, the 10 point scale offers respondents a large number of choices, but answering a whole battery of questions on 10 point scales can also be very frustrating.
Another noteworthy attribute of the very popular 5 point scales is that they tend to distribute results evenly, from the positive to the negative end of a continuum, while 7 point scales tend to produce results that are skewed either positively or negatively. This means that you will have a better chance of accurately registering a generally positive or negative response to your performance if you opt for a moderate response scale with 7 levels of agreement.
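Whether a set of responses is spread evenly or leans toward one end can be checked with a skewness coefficient. The sketch below uses the standard moment-based formula (third central moment divided by the standard deviation cubed). The function name skewness and the sample responses are made up for illustration and are not data from the sources.

```python
def skewness(responses):
    """Moment-based skewness: 0 for a symmetric distribution, negative when
    answers pile up at the high end with a tail toward the low end,
    positive in the opposite case."""
    n = len(responses)
    mean = sum(responses) / n
    second_moment = sum((x - mean) ** 2 for x in responses) / n
    third_moment = sum((x - mean) ** 3 for x in responses) / n
    if second_moment == 0:
        return 0.0  # all answers identical, no skew to measure
    return third_moment / second_moment ** 1.5


# Hypothetical answers: an evenly spread 5 point item and a 7 point item
# where most respondents are positive and a few are clearly not.
evenly_spread = [1, 2, 3, 4, 5] * 4
mostly_positive = [7, 7, 7, 6, 6, 5, 4, 2]

print(skewness(evenly_spread))    # symmetric answers
print(skewness(mostly_positive))  # answers concentrated at the positive end
```

A value near zero corresponds to the even spread described above, while a clearly negative value indicates answers concentrated at the positive end of the scale.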
More than 7 levels of “agreement” cannot be considered at once, which makes scales with more than 7 points difficult to respond to. They are fatiguing, depriving the mind of the chance to embrace the scale as a whole and accurately make a selection from an array of balanced alternatives. What does the mind do with a 10 point scale? It breaks the scale down into a positive and a negative half in an attempt to establish where to place a response, which is an especially frustrating decision if the respondent’s opinion is somewhere in the middle.
Another factor that influences the respondent is the scale’s labels, which are essentially the expressions of the levels used on your scale. Studies have shown that scales marked “1 to X, with X being the highest” produce less accurate results than scales with verbal labels such as “good” or “poor”. When using numbered scales, signposts (verbal labels attached to key points on the scale) are recommended.
In conclusion, when designing a tool aimed at measuring performance, such as a survey or questionnaire, the following considerations need to be taken into account:
– scales labeled only with numbers are difficult to interpret and highly ambiguous for the person who rates them;
– more than seven points on a scale are too many, as studies show that people are not able to estimate their response on a scale with more than seven points.
Essentially, the 5 and 7 point scales have the most advantages, covering most researchers’ needs. The 10 point scale is recommended for special situations where an in-depth analysis is needed.
- Lavrakas, P.J. (2008) Encyclopedia of Survey Research Methods
- Garland, P. (2013) Yes, There is a Right and Wrong Way to Number Rating Scales