Measurement is a roadblock in psychological research. To measure a person’s abilities, attitudes, and traits, you cannot simply place them on a literal scale and record the “weight.” But wouldn’t it be nice if you could?
Features that are not directly observable, called latent variables, require a different kind of scale: one fit for psychological research. That’s why response scales were created, so that the mind can be effectively measured.
If you want to learn how to write a scale for psychometric investigation, the following tips might help you create an accurate and reliable tool.
In psychology, a scale is a series of statements or questions specific to a construct. Scales are used to measure psychological variables. Some common latent variables you can measure with a scale are:
Whether your scope is experimental or market research, the best practices of writing a scale are generally the same.
The first thing to do when writing a scale is to figure out what you want to measure.
While it may be tempting to jump right into writing items and questions, background research can save you time in the long run.
Take the latent variable of happiness, for example. Sure, humans have an intuitive idea of what happiness is, but researchers define it in slightly different ways. How you define happiness will affect the way you measure it in your scale.
A typical definition of happiness is the experience of positive emotions and satisfaction. Using this definition, you might start to construct a scale that detects the presence of positive feelings and contentment.
However, there are other scientific definitions of happiness. Some common definitions include both affective and cognitive (i.e., feeling and thinking) appraisals of satisfaction, along with the presence of positive experiences and the absence of negative ones (Garaigordobil, 2015). Now your definition of happiness could include new items relating to thoughts and negative experiences.
It is a best practice to form a definition based on existing scientific literature (Kyriazos and Stalikas, 2018).
Not only will preliminary research help you define the topic, it will also clarify the population you wish to study and, with it, the representative sample you need to draw. However, if that sample is not large enough, your study could lose statistical power.
That’s why some researchers suggest that you consult a statistician during the early stages of scale writing (Jones et al., 2013). Plus, as you’ll find out later on, fostering a relationship with a statistician (or psychometrician) will be extremely beneficial when you need to validate your scale.
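To make the link between sample size and statistical power concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that your planned analysis is a two-sided test of a correlation, and it uses the standard Fisher z approximation. The function name and default values are hypothetical choices, not something prescribed by the sources cited here, and a statistician may well recommend a different calculation for your design.

```python
import math
from statistics import NormalDist


def sample_size_for_correlation(r, alpha=0.05, power=0.80):
    """Approximate sample size needed to detect a correlation of size r.

    Uses the Fisher z approximation for a two-sided test. The defaults
    (alpha = 0.05, power = 0.80) are common conventions, assumed here.
    """
    if not 0 < abs(r) < 1:
        raise ValueError("r must be nonzero and strictly between -1 and 1")
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for the test
    z_power = NormalDist().inv_cdf(power)          # quantile for the target power
    fisher_z = math.atanh(abs(r))                  # Fisher z transform of r
    return math.ceil(((z_alpha + z_power) / fisher_z) ** 2 + 3)


# Smaller expected effects call for larger samples.
for expected_r in (0.5, 0.3, 0.1):
    print(expected_r, sample_size_for_correlation(expected_r))
```

The point of the sketch is the trend rather than the exact numbers: the weaker the relationship you expect to find, the more respondents you need before a study has a reasonable chance of detecting it.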
Once you’ve identified and defined your topic and population, you can start generating the items that will make up your scale. As a rule of thumb, you will need more than a single item, especially if you want to use statistical measures to assess reliability (Boateng et al., 2018).
There are a few ways to generate items, ranging from seeking expert advice to conducting in-depth interviews. First, consult with experts in your field of study. It is a best practice to ask experts to generate a list of topics or items they feel are relevant to the construct of interest.
You can also conduct another literature review. You may find that there is an existing scale (or several) already created for your topic of study. This will give you an idea of how many items/questions are in validated scales, which can be extremely helpful in developing an item pool for your scale.
If an existing scale matches your operational definition of the topic, use it! Validated scales are a great way to ensure accuracy.
For instance, two common happiness scales are the Subjective Happiness Scale, a four-item scale that gives a general measure of happiness, and the Oxford Happiness Questionnaire, a 29-item scale that gives a measure of psychological well-being (Lyubomirsky and Lepper, 1999; Hills and Argyle, 2002).
If you decide to adapt or modify an existing scale, it’s a best practice to contact the researchers who created the scale and ask for permission. Keep in mind that alterations can change the psychometric properties of the scale.
But depending on your research needs, you may need to create your own scale. One way to create your own items is by conducting in-depth interviews with a focus group, or with individuals, from the population that your scale targets. These interviews can reveal themes common to your topic.
The purpose of conducting a literature review and in-depth interviews is to generate a large pool of items. Researchers recommend that your initial pool of items be at least twice the size of the final scale, so it’s okay to include items that are not a perfect fit (Boateng et al., 2018). You will filter those out later in the validation process.
This step, in practice, is conducted in parallel with step three. While you collect and generate items for your pool, think about how you will format the items in terms of measurement scales, response types, and item wording.
Measurement: There are four main types of measurement scales: nominal, ordinal, interval, and ratio.
Response Options: Once you’ve decided how to measure your data, it’s time to select a response type. Here are four popular options:
Response Labeling: Researchers have found that labeling all of the response options makes items clearer than labeling just the endpoints (Kyriazos and Stalikas, 2018). Below are two examples of fully labeled response options: a five-point Likert scale and a seven-point semantic differential scale.
__Strongly Disagree __Disagree __Neutral __Agree __Strongly Agree
__Very Unhappy __Unhappy __Somewhat Unhappy __Neutral __Somewhat Happy __Happy __Very Happy
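When it comes time to analyze responses, fully labeled options like the two examples above are usually converted to numbers. The sketch below shows one way this might be done in Python. Coding each label by its position (1 for the first option up to 5 or 7 for the last) is a common convention assumed here for illustration, and the function names are hypothetical.

```python
# Response labels copied from the two examples above, in order.
LIKERT_5 = ["Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree"]
SEMANTIC_7 = [
    "Very Unhappy", "Unhappy", "Somewhat Unhappy", "Neutral",
    "Somewhat Happy", "Happy", "Very Happy",
]


def make_coder(labels):
    """Map each label to its 1-based position (an assumed coding convention)."""
    return {label: position for position, label in enumerate(labels, start=1)}


def code_responses(responses, labels):
    """Convert a list of label responses to numeric codes.

    Raises KeyError if a response is not one of the labeled options,
    which helps catch typos in the raw data.
    """
    coder = make_coder(labels)
    return [coder[response] for response in responses]


# Hypothetical answers from one respondent to three Likert items.
print(code_responses(["Agree", "Neutral", "Strongly Disagree"], LIKERT_5))
```

Whether the resulting numbers can be treated as more than ranked categories depends on the measurement scale you chose, so it is worth settling that question before computing averages.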
Item Wording: When you write and phrase the items in your scale, it is important to make sure that they are easily understood. Here are three things to consider when writing your items.
Once your item pool is ready, it is time to get an outside opinion.
This step involves evaluating your item set based on expert opinion and cognitive interviewing.
First, you can consult experts in your field of study to evaluate the item list. This is a critical step in aligning your items with the topic you wish to measure. It can help you:
Next, you can administer drafts of your items to participants in cognitive interviews. During these interviews, you can ask the participants to rephrase the items in their own words and talk you through their answering process. This invaluable approach allows you to:
It is a best practice to identify clear trends from several cognitive interviews before making item changes (Gehlbach and Brinkworth, 2011).
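One simple way to look for trends across interviews is to tally how many participants had trouble with each item. The Python sketch below does this with made-up data. The item names, the function name, and the 50% cutoff are all hypothetical, and what counts as a "clear trend" is a judgment call for you and your team.

```python
from collections import Counter


def items_with_clear_trends(interview_flags, min_share=0.5):
    """Return items flagged as problematic in at least min_share of interviews.

    interview_flags: one list per interview, naming the items that
    participant found confusing. min_share is an assumed cutoff.
    """
    if not interview_flags:
        return []
    # Count each item at most once per interview.
    counts = Counter(item for flags in interview_flags for item in set(flags))
    total = len(interview_flags)
    return sorted(item for item, n in counts.items() if n / total >= min_share)


# Hypothetical notes from four cognitive interviews.
interviews = [
    ["item_2", "item_5"],
    ["item_2"],
    ["item_2", "item_7"],
    [],
]
print(items_with_clear_trends(interviews))
```

In this made-up example, only the item that confused most participants is returned, while items flagged by a single person are left alone until more interviews suggest a pattern.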
The final step is the most important, and the most complicated. For simplicity’s sake, this final step can be distilled down to pilot-testing, analysis, and validation.
Pilot-testing involves administering your scale to a representative sample from your target population. Then, once the results are obtained, you can statistically analyze them.
While complicated, statistical analysis methods, including those aimed at reliability and validity, help you confirm that your target population is responding to your scale in a way that is congruent with the underlying theory. In short, you will be able to empirically explore how people respond to your scale, how accurately your scale measures what you want it to measure, whether people respond to it consistently, and much more.
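As one illustration of what a reliability analysis can look like, the sketch below computes Cronbach’s alpha, a widely used estimate of internal consistency, in plain Python. This article does not single out a specific statistic, so treat the choice of alpha as an assumption made for the example. The pilot data are invented.

```python
from statistics import variance


def cronbach_alpha(scores):
    """Estimate internal consistency from pilot data.

    scores: one row per respondent, each row a list of numeric item
    scores (every row must have the same number of items, at least 2).
    """
    k = len(scores[0])
    if k < 2:
        raise ValueError("Cronbach's alpha needs at least two items")
    if any(len(row) != k for row in scores):
        raise ValueError("every respondent must answer the same items")
    items = list(zip(*scores))                          # one tuple per item
    sum_item_variances = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in scores])
    return (k / (k - 1)) * (1 - sum_item_variances / total_variance)


# Invented pilot data: six respondents, four items coded 1 to 5.
pilot = [
    [4, 5, 4, 4],
    [2, 3, 2, 3],
    [5, 5, 4, 5],
    [3, 3, 3, 2],
    [1, 2, 2, 1],
    [4, 4, 5, 4],
]
print(round(cronbach_alpha(pilot), 2))
```

Values closer to 1 suggest that the items move together, but a real pilot needs far more than six respondents, and alpha is only one of several statistics a statistician or psychometrician might recommend.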
In order to perform these statistical operations, background knowledge on the concepts of reliability and validity becomes essential. If you want to familiarize yourself with reliability and validity in psychological testing, check out our posts on the same subjects.
References:
Allen, M. (2017). The SAGE encyclopedia of communication research methods (Vols. 1–4). Thousand Oaks, CA: SAGE Publications.
Garaigordobil, M. (2015). Predictor variables of happiness and its connection with risk and protective factors for health. Frontiers in Psychology, 6, 1176.
Gehlbach, H., & Brinkworth, M. E. (2011). Measure twice, cut down error: A process for enhancing the validity of survey scales. Review of General Psychology, 15(4), 380–387.
Hills, P., & Argyle, M. (2002). The Oxford Happiness Questionnaire: A compact scale for the measurement of psychological well-being. Personality and Individual Differences, 33, 1073–1082.
Jones, T. L., Baxter, M. A., & Khanduja, V. (2013). A quick guide to survey research. Annals of the Royal College of Surgeons of England, 95(1), 5–7.
Kyriazos, T. A., & Stalikas, A. (2018). Applied psychometrics: The steps of scale development and standardization process. Psychology, 9, 2531–2560.
Lyubomirsky, S., & Lepper, H. S. (1999). A measure of subjective happiness: Preliminary reliability and construct validation. Social Indicators Research, 46, 137–155.
Wu, H., & Leung, S. (2017). Can Likert scales be treated as interval scales? A simulation study. Journal of Social Service Research, 43(4), 527–532.