- Define reliability, including the different types and how they are assessed.
- Define validity, including the different types and how they are assessed.
- Describe the kinds of evidence that would be relevant to assessing the reliability and validity of a particular measure.

Again, measurement involves assigning scores to individuals so that they represent some characteristic of the individuals. But how do researchers know that the scores actually represent the characteristic, especially when it is a construct like intelligence, self-esteem, depression, or working memory capacity? The answer is that they conduct research using the measure to confirm that the scores make sense based on their understanding of the construct being measured. Psychologists do not simply assume that their measures work. Instead, they collect data to demonstrate that they work. If their research does not demonstrate that a measure works, they stop using it.

As an informal example, imagine that you have been dieting for a month. Your clothes seem to be fitting more loosely, and several friends have asked if you have lost weight. If at this point your bathroom scale indicated that you had lost 10 pounds, this would make sense and you would continue to use the scale. But if it indicated that you had gained 10 pounds, you would rightly conclude that it was broken and either fix it or get rid of it. In evaluating a measurement method, psychologists consider two general dimensions: reliability and validity.

Reliability refers to the consistency of a measure. Psychologists consider three types of consistency: over time (test-retest reliability), across items (internal consistency), and across different researchers (inter-rater reliability).

When researchers measure a construct that they assume to be consistent across time, then the scores they obtain should also be consistent across time. Test-retest reliability is the extent to which this is actually the case. For example, intelligence is generally thought to be consistent across time. A person who is highly intelligent today will be highly intelligent next week. This means that any good measure of intelligence should produce roughly the same scores for this individual next week as it does today. Clearly, a measure that produces highly inconsistent scores over time cannot be a very good measure of a construct that is supposed to be consistent.

Assessing test-retest reliability requires using the measure on a group of people at one time, using it again on the same group of people at a later time, and then looking at the test-retest correlation between the two sets of scores. This is typically done by graphing the data in a scatterplot and computing the correlation coefficient. Figure 4.2 shows the correlation between two sets of scores of several university students on the Rosenberg Self-Esteem Scale, administered two times, a week apart. The correlation coefficient for these data is +.95. In general, a test-retest correlation of +.80 or greater is considered to indicate good reliability.

Figure 4.2 Test-Retest Correlation Between Two Sets of Scores of Several College Students on the Rosenberg Self-Esteem Scale, Given Two Times a Week Apart

Again, high test-retest correlations make sense when the construct being measured is assumed to be consistent over time, which is the case for intelligence, self-esteem, and the Big Five personality dimensions. But other constructs are not assumed to be stable over time. The very nature of mood, for example, is that it changes. So a measure of mood that produced a low test-retest correlation over a period of a month would not be a cause for concern.

Internal Consistency

Another kind of reliability is internal consistency, which is the consistency of people’s responses across the items on a multiple-item measure.
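
The test-retest assessment described earlier (give the measure to the same group twice, then correlate the two sets of scores) can be illustrated with a short calculation. The following is only a minimal sketch in Python: the scores are invented for illustration and are not the Figure 4.2 data, and the names `pearson_r`, `time1`, `time2`, and `GOOD_RELIABILITY` are hypothetical. It computes the Pearson correlation coefficient by hand and compares it with the +.80 guideline mentioned in the text.

```python
import math


def pearson_r(x, y):
    """Return the Pearson correlation between two equal-length score lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    # Sum of cross-products of deviations from each mean
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    # Sums of squared deviations for each administration
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when all scores are identical")
    return cov / math.sqrt(ss_x * ss_y)


# Hypothetical scores for the same eight people, tested one week apart.
# These numbers are invented for illustration only.
time1 = [22, 25, 18, 27, 15, 24, 20, 29]
time2 = [23, 24, 17, 28, 16, 22, 21, 29]

# Guideline from the text: +.80 or greater is considered good reliability.
GOOD_RELIABILITY = 0.80

r = pearson_r(time1, time2)
print(f"test-retest r = {r:+.2f}")
if r >= GOOD_RELIABILITY:
    print("meets the +.80 guideline for good test-retest reliability")
else:
    print("falls below the +.80 guideline")
```

In practice the same scores would also be plotted as a scatterplot, with the first administration on one axis and the second on the other. Whether a low correlation is a problem still depends on the construct: as the text notes, a low value would not be a concern for something like mood.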