Criterion validity
Criterion validity asks whether the measure is consistent with what is already known. It is established by comparison with a criterion measure that is itself already accepted as valid. In essence, the operationalisation is checked against an established indicator. For example, where people live and their annual income could be used to indicate their political affiliations. One could test this against the actual voting pattern of individuals to see whether geography and income are valid indicators. Of course, the already-known criterion you use as a measure may not be as valid as you think.
Is voting pattern an indicator of political affiliation? A Labour supporter in a safe Conservative seat may vote Liberal for tactical reasons.
Lamers-Winkelman and Heemstede (1998), for example, showed that an instrument measuring child-reported sexual abuse did not correlate well with actual cases of sexual abuse and thus was not valid for judging the truthfulness of allegations of child sexual abuse.
One method of exploring criterion validity is to ask the same question in different ways, or to repeat it at a later stage in the questionnaire, and test for consistency in the responses. Criterion validity is sometimes broken down into concurrent validity and predictive validity.
Concurrent validity
Concurrent validity has two slightly different meanings in the literature.
One view says that concurrent validity occurs when the criterion measures are obtained at the same time as the test scores. This indicates the extent to which the test scores accurately estimate an individual's current state with regard to the criterion. For example, a test that measures anxiety would be said to have concurrent validity if it measured the current level of anxiety of the person taking the test.
An alternative view says that concurrent validity is demonstrated where a test correlates well with a measure that has previously been validated. The two measures may be for the same construct, or for different but related constructs. For example, if a test of anxiety gives similar results to those gathered using a test of anxiety that has been validated in past investigations, then the new measurement has concurrent validity.
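The second view can be illustrated with a short calculation. The following is a minimal sketch, not a prescribed procedure: it assumes that both tests yield numeric scores for the same respondents and that a Pearson correlation is an acceptable index of agreement. The scores and the names `new_test` and `validated_test` are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least two scores")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical anxiety scores for the same eight respondents
new_test = [12, 18, 9, 22, 15, 30, 7, 25]
validated_test = [14, 20, 10, 24, 13, 33, 9, 27]

r = pearson_r(new_test, validated_test)
print(f"correlation with the validated measure: r = {r:.2f}")
```

A high positive correlation would be taken as evidence of concurrent validity; how high is high enough is a judgement the researcher has to justify.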
Although these views differ slightly, the key is that they differ from predictive validity (see below) as they rely on current comparisons rather than future predictions.
Predictive validity
Predictive validity is another way of looking at criterion validity.
Predictive validity occurs when the criterion measures are obtained at a time after the test. A test or scale has predictive validity if it is demonstrated to be effective in predicting outcomes. Likewise, if an operationalisation of a concept is able to predict something it should theoretically be able to predict then it has predictive validity.
For example, if a health screening programme is able to predict the health of people not only when administered but also at some later time, then it would be said to have predictive validity. Similarly, an aptitude test used by a careers counsellor has predictive validity if it shows who is likely to succeed or fail in certain subjects or occupations.
Another view on this is that predictive validity is about the ability of the study instrument to predict something that is already known. If, for example, there is an established test for a blood disorder, does the new test being devised also accurately predict instances of the disorder?
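The blood-disorder example can be sketched as a comparison of the new test's results with those of the established test. This is an illustrative sketch only: it assumes both tests give a yes/no result for each patient and treats the established test as the criterion. The function name `agreement_with_established` and the patient data are hypothetical.

```python
def agreement_with_established(new_results, established_results):
    """Summarise how far a new yes/no test agrees with an established one."""
    pairs = list(zip(new_results, established_results))
    true_pos = sum(1 for new, est in pairs if new and est)
    true_neg = sum(1 for new, est in pairs if not new and not est)
    false_pos = sum(1 for new, est in pairs if new and not est)
    false_neg = sum(1 for new, est in pairs if not new and est)
    # Assumes the established test reports at least one positive and one
    # negative case; otherwise these ratios are undefined (division by zero).
    return {
        "sensitivity": true_pos / (true_pos + false_neg),
        "specificity": true_neg / (true_neg + false_pos),
    }

# Hypothetical results for ten patients (True = disorder indicated)
established = [True, True, True, True, False, False, False, False, False, False]
new_test = [True, True, True, False, False, False, False, False, True, False]

summary = agreement_with_established(new_test, established)
print(summary)
```

Sensitivity is the proportion of established positive cases that the new test also identifies; specificity is the proportion of established negative cases that it also clears.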
Construct validity
Construct validity is about the ability of the constructed concept (test, operationalisation) to represent the theoretical concept. Construct validity has two aspects.
First, a fundamental one: is the construct being used a valid conceptualisation? For example, a researcher might want to measure health status and decide to use how long it takes someone to run a kilometre as an indicator. Ability to run a kilometre may be linked to health (and fitness) but is a questionable indicator of health status.
Second, a rather weaker version of construct validity is to examine whether a measure relates meaningfully to other similar measures used before or to other variables as required by theory. This is somewhat similar to criterion validity.
Low validity comes when a construct lacks theoretical agreement or the operationalisation is such that its indicators may mean one thing to one researcher and something different to another researcher. A construct is reckoned to be more valid if it is used in a range of settings in the same way with consistent outcomes that mesh with theory.
Convergent validity
Convergent validity examines the extent to which a measure or operationalisation is similar to (converges on) other measures or conceptualisations to which it is theoretically similar. For example, a test of anxiety might be compared with analyses of sleeplessness.
The literature also refers to 'internal consistency' as an aspect of convergent validity. However, this just measures the correlation between the items that are used as indicators of a concept. A scale that shows student satisfaction with teaching, for example, may have several questions related to teaching that are combined into a single measure (index). Internal consistency would expect there to be reasonable correlation between scores on each of these separate items.
Cronbach's alpha is a statistical technique commonly used to establish internal consistency as an aspect of construct validity (a score of 0.60 is considered acceptable for exploratory purposes and a score of 0.80 is considered good for confirmatory purposes (Cronbach and Meehl, 1955)) (see Section 8).
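As an illustration of how alpha is obtained, the sketch below uses the standard formula: alpha = (k / (k - 1)) × (1 − sum of the item variances / variance of the total scores), where k is the number of items. It assumes every respondent has answered every item. The ratings and the function name `cronbach_alpha` are hypothetical.

```python
from statistics import variance

def cronbach_alpha(item_scores):
    """Cronbach's alpha; item_scores holds one list of respondent scores per item."""
    k = len(item_scores)
    if k < 2:
        raise ValueError("need at least two items")
    # Total score for each respondent, summed across the items
    totals = [sum(scores) for scores in zip(*item_scores)]
    sum_item_variances = sum(variance(item) for item in item_scores)
    return (k / (k - 1)) * (1 - sum_item_variances / variance(totals))

# Hypothetical 1-5 ratings from six students on three teaching-satisfaction items
items = [
    [4, 5, 3, 2, 4, 5],
    [4, 4, 3, 2, 5, 5],
    [5, 5, 2, 1, 4, 4],
]

alpha = cronbach_alpha(items)
print(f"Cronbach's alpha = {alpha:.2f}")
```

Against the thresholds quoted above, these made-up ratings would count as internally consistent; real scales are rarely so tidy.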
Discriminant validity
Discriminant validity is the ability of a measure or operationalisation to make distinctions. It should be able to distinguish in practice between groups that theoretically it should be able to distinguish between.
For example, an operationalisation of anxiety should be able to distinguish between people who are anxious and those who are depressed or suicidal.
Some writers also refer to this ability to discriminate as a feature of criterion validity, which is a little confusing.
Nomological validity
Nomological validity assesses whether a construct relates to a set of other related concepts in the way that is expected. For example, the online Design and Marketing Dictionary states:
A type of validity in which a measure correlates positively in the theoretically predicted way with measures of different but related constructs. For example, the tendency to purchase prestige brands should show a high correlation with a person’s need for status and materialism and a negative correlation with price sensitivity. (accessed 18 July 2011)
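The prestige-brand example can be turned into a simple check of whether each correlation has the sign that theory predicts. The sketch below is illustrative only: the scores are invented, the variable names are hypothetical, and checking the sign alone is a deliberately crude test that ignores the strength of each relationship.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length lists of scores."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical scores for six respondents
prestige_buying = [2, 5, 3, 6, 7, 4]

# Each related construct: (scores, sign of the correlation that theory predicts)
related = {
    "need_for_status": ([1, 5, 3, 6, 6, 4], +1),
    "materialism": ([2, 4, 3, 5, 7, 3], +1),
    "price_sensitivity": ([7, 3, 5, 2, 1, 4], -1),
}

checks = {}
for name, (scores, expected_sign) in related.items():
    r = pearson_r(prestige_buying, scores)
    # True when the observed sign matches the theoretically expected sign
    checks[name] = (r > 0) == (expected_sign > 0)
    print(f"{name}: r = {r:.2f}, as expected: {checks[name]}")
```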
Representation validity
Representation validity (sometimes called translation validity) assesses whether the construct translates into observable measures. In essence, this is asking whether the operationalisation results in appropriate dimensions and indicators (see Section 8).
Content validity (face validity)
Content validity is a demonstration that the research is covering the full scope of the conceptual or practical area being explored.
If one is exploring satisfaction at work, one needs to ensure that all aspects are covered: wages, conditions, management, staff development and so on. When operationalising a concept, content validity would be established if all the appropriate dimensions of the concept are included (see Section 8).
However, content validity is about scope: it does not guarantee that, for example, a test or scale actually measures phenomena in that domain, only that the domain is covered. (It should be noted, however, that some writers state that content validity is about items measuring what they claim to measure, which is similar to construct validity as defined above.)
One way of ensuring that the domain is covered is to consult key people, expert witnesses or focus groups, or to pilot the instrument or approach on an appropriate sub-sample. Some people also refer to this as face validity, although others restrict the notion of face validity to a simpler approach that asks whether, on the face of it, the measure or test looks like it is going to measure what it is supposed to measure.
Another angle on this is to see construct validity as the ability of your analysis of a concept to be generalised: that is, does the concept you have developed have a wider applicability than the confines of the research? (See Section 1.10.1 on the issue of generalisation.)
For some commentators, again rather confusingly, construct validity overlaps with predictive validity. For example, it is claimed that a test has construct validity if it demonstrates an association between the test scores and the prediction of a theoretical trait.
Internal validity
Internal validity is about whether the research was conducted with care and rigour.
Is the design appropriate? Has care been taken to avoid bias in the research design? Was care taken in collecting data and measuring phenomena? This is to do with the conduct of the research.
Internal validity is also construed as being about the conceptualisation of the research. Can it be demonstrated that the identified cause precedes the effect in time? Have the researchers taken account of alternative explanations of any inferred causal relationship? Have they considered other preceding, intervening or specificatory variables (see Section 8)?
In studies that do not explore causal relationships, only the first of these definitions should be considered when assessing internal validity.
Internal validity can be affected by changes in circumstances during a research study: for example, interviewers may change and the new ones may be less experienced than the old ones. If the sample is re-interviewed or tested in some way, then the respondents may adopt a cynical attitude to the second event. If the study is a longitudinal study, those who drop out may not be representative of the sample overall, resulting in a biased sub-sample. This concern also applies, for example, to non-respondents to a survey questionnaire.
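One simple way to look for this kind of drop-out bias is to compare those who stayed in the study with those who left, using a measure collected from everyone at the first wave. The sketch below is a minimal illustration with invented data; the record layout and the names `wave_one`, `attrition_rate` and `baseline_gap` are hypothetical, and a real analysis would also ask whether any difference is larger than chance would produce.

```python
from statistics import mean

# Hypothetical wave-1 records: (respondent id, baseline score, completed wave 2?)
wave_one = [
    (1, 62, True), (2, 48, False), (3, 71, True), (4, 55, True),
    (5, 40, False), (6, 67, True), (7, 45, False), (8, 59, True),
]

completers = [score for _, score, stayed in wave_one if stayed]
dropouts = [score for _, score, stayed in wave_one if not stayed]

# Share of the original sample lost between waves
attrition_rate = len(dropouts) / len(wave_one)
# Difference in mean baseline score between those who stayed and those who left
baseline_gap = mean(completers) - mean(dropouts)

print(f"attrition rate: {attrition_rate:.1%}")
print(f"baseline gap (completers minus drop-outs): {baseline_gap:.1f}")
```

A large gap suggests that the remaining sub-sample differs systematically from the original sample on that measure.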
External validity
External validity refers to the extent to which the results of a study are generalisable (see Section 1.10.1).
Thus, for example, whether or not a sample is representative of the population is an external validity issue, especially when this leads to demonstrable bias (see section 220.127.116.11).
Bias may also occur if the sample is somehow tainted by having been the subject of observation, survey or experiment (this is known as a reactive effect). For example, a sample of workers in a factory may be aware that their work patterns are being observed and thus act in a way that is different from how they would normally act. While this may be a valid finding in its own right (that is, people respond to being observed by acting differently from normal), the results of the observation of work patterns cannot be validly generalised to the factory as a whole.
[Hawthorne effect (experimenter expectation). Do the expectations or actions of the investigator contaminate the outcomes? (Named after famous studies at Western Electric's Hawthorne plant, where work productivity improvements were found to reflect researcher attention, not interventions like better lighting.)]
Note that some writers regard the 'Hawthorne Effect' as a problem of internal validity.
It does not really matter whether it is labelled internal or external; the important thing is to be aware of the effect of doing a study on the behaviour of respondents. Most field research has relatively poor external validity since the researcher can rarely be sure that there were no extraneous factors at play that influenced the study's outcomes.
Only in experimental settings could variables possibly be isolated sufficiently to test their impact on a single dependent variable.
Ecological validity
Ecological validity is about the authenticity of a research setting. For an experiment or evaluation setting to possess ecological validity, for example, the methods and the setting of the experiment must approximate the real-life situation that is being studied.
An example would be a mock-jury deliberation designed to evaluate how people would act in a trial setting. If the results from such mock-jury studies generalise to real trials, then the research is externally valid. However, the setting is artificial, being in a classroom rather than a courtroom and lacking a real defendant, and so on. Such hypothetical jury situations are regarded as lacking ecological validity.
Ecological validity is sometimes confused with external validity, but external validity assesses generalisability, not authenticity. The closer the setting is to reality, the higher the ecological validity, and it is supposed that ecological validity would be reflected in external validity.
This is a position that qualitative naturalistic researchers take up (as will be discussed in Section 1.8.3, phenomenological approach to validity).