Social Research Glossary
Citation reference: Harvey, L., 2012-17, Social Research Glossary, Quality Research International, http://www.qualityresearchinternational.com/socialresearch/
This is a dynamic glossary and the author would welcome any e-mail suggestions for additions or amendments. Page updated 20 October, 2017, © Lee Harvey 2012–2017.
Association is a general term applied when there appears to be some kind of connection between two concepts when expressed as operationalised variables and measured.
Such an observed connection between two variables may take several forms. The two variables may vary in direct relation to one another, or they may vary inversely. Further, certain categories in one variable may be related to categories of another variable.
For example, in a group of people height may vary directly with weight: taller people tend to weigh more. Exercise may vary inversely with weight: heavier people tend to exercise less. Looking at related categories, political party preference may be associated with social class, such that in the United Kingdom middle-class people may tend to vote Conservative and working-class people tend towards Labour.
Association is an attempt to show the extent of such tendencies. It is unlikely, in social science, that perfect relationships between variables will emerge, i.e. perfect association. A less than perfect relationship is likely and there are several ways of measuring association in an attempt to reveal the extent to which two variables are related.
It is important to remember that association does not necessarily imply causation.
In cognitive psychology association also refers to the link between two different thoughts or ideas, such as 'shoot' being linked to 'goal' if you are an association football fan or to 'gun' if you are a rifle owner! In behaviourist psychology association might be construed as a stimulus-response mechanism.
The following discussion focuses on association as a correlative concept.
Measures of Association
There are a large number of statistical measures of association. The choice of measure is determined by the scale of the data, the number of categories, the number of variables being related at one time, the type of partial analysis required, and the sample size.
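Conventionally, measures of association are designed with a scale of 0 to 1. A score of zero means that there is no observed relationship at all between the variables. A score of 1 means that there is a perfect relationship between them. Measures that also indicate the direction of the relationship (such as Pearson's r, Gamma and Tau-b) range from -1 to +1, with the sign showing whether the relationship is direct or inverse.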
Measures for crosstabulated data
There are many different measures that will provide information on the degree of association apparent between crosstabulated data. Different situations call for different tests and some alternatives (generated by SPSS) are indicated below.
In the special situation of a two-by-two table the scale of the data is irrelevant as both variables are simply dichotomised. The most widely used measures of association in this case are Phi (i.e. the square root of Chi-square/Total sample size) and Tau-b. Phi is specifically for two-by-two tables, whereas Tau-b is more general but widely regarded as the best test in these circumstances as it is statistically the most powerful and can be tested for statistical significance.
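As a minimal sketch of how Phi can be obtained for a two-by-two table, the following takes Phi as the square root of Chi-square divided by the total sample size (with no continuity correction). The function name and the counts are hypothetical and chosen only for illustration.

```python
import math

def phi_coefficient(table):
    """Phi for a 2x2 table [[a, b], [c, d]] of observed counts."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    # Chi-square shortcut formula for a 2x2 table (no continuity correction).
    chi_square = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    return math.sqrt(chi_square / n)

# Hypothetical counts: rows = class (middle, working), columns = vote (Conservative, Labour).
table = [[30, 10], [10, 30]]
phi = phi_coefficient(table)
print(round(phi, 3))
```

The sketch assumes every row and column total is non-zero; an empty row or column would make the denominator zero.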
In situations other than two-by-two tables the scale of the data is important in the selection of the measure of association. Where both X and Y are nominal the following are used: Lambda, Uncertainty Coefficient, Cramér's V, and Contingency Coefficient for square tables.
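To illustrate one of the nominal measures, here is a hedged sketch of Cramér's V, taken as the square root of Chi-square divided by N times (k - 1), where k is the smaller of the number of rows and columns. The helper names and the example table are assumptions made for illustration and are not SPSS output.

```python
import math

def chi_square(table):
    """Return (chi-square statistic, total n) for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were unrelated.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat, n

def cramers_v(table):
    """Cramér's V: 0 means no observed association, 1 a perfect one."""
    stat, n = chi_square(table)
    k = min(len(table), len(table[0]))
    return math.sqrt(stat / (n * (k - 1)))

# Hypothetical 3x3 table in which each row category maps to one column category.
perfect = [[20, 0, 0], [0, 20, 0], [0, 0, 20]]
v_perfect = cramers_v(perfect)
print(round(v_perfect, 3))
```

The sketch assumes no row or column is entirely empty, since an expected count of zero cannot be divided by.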
Where Y and X are ordinal the following are frequently used: Tau-b for square tables, Tau-c for rectangular tables, Gamma, and Somers' D.
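The ordinal measures listed here are built from counts of concordant pairs (cases ordered the same way on both variables) and discordant pairs (cases ordered in opposite ways). A minimal sketch of Gamma, computed as (C - D)/(C + D), might look like the following; the function names and the table are hypothetical.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordinal-by-ordinal crosstab."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            # Compare cell (i, j) with every cell in a later row.
            for k in range(i + 1, n_rows):
                for l in range(n_cols):
                    if l > j:
                        concordant += table[i][j] * table[k][l]
                    elif l < j:
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant

def gamma(table):
    """Gamma ignores tied pairs and ranges from -1 to +1."""
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)

# Hypothetical counts with categories ordered low to high on both variables.
table = [[20, 10], [10, 20]]
g = gamma(table)
print(round(g, 3))
```

Gamma is undefined when there are no untied pairs at all, a case this sketch does not guard against.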
Where Y is interval and X is nominal, Eta is often used.
Measures for non-crosstabulated data
Non-crosstabulated data are analysed for association using correlation and regression techniques.
Correlation and regression procedures are used to measure the relationship between two or more variables, usually when the data is of an interval scale. They are based (in the main) on analysing the variance (and co-variance) of the data. The measures are more ‘powerful’ than those applied to crosstabulated data. The techniques attempt to show to what extent one variable is a function of (or dependent upon) one or more other variables, and to provide a specification of that relationship.
Correlation techniques measure the degree of association between variables; regression techniques specify the nature of the relationship.
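The distinction can be sketched with a small example: Pearson's r summarises the degree of association, while a least-squares line specifies the relationship as a slope and an intercept. The height and weight figures below are invented for illustration, and the function names are assumptions.

```python
import math

def pearson_r(x, y):
    """Degree of linear association between two interval-scale variables."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariation / math.sqrt(var_x * var_y)

def least_squares_line(x, y):
    """Slope and intercept of the line of best fit for y regressed on x."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    slope = covariation / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical sample: heights in cm, weights in kg.
heights = [150, 160, 170, 180, 190]
weights = [52, 61, 68, 79, 85]

r = pearson_r(heights, weights)
slope, intercept = least_squares_line(heights, weights)
print(round(r, 3), round(slope, 2), round(intercept, 1))
```

Swapping the two arguments leaves r unchanged but gives a different regression line, which is one way of seeing that correlation is symmetric while regression treats one variable as dependent.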
Non-parametric measures of association
When data is not of an interval scale, regression analysis is problematic. However, there are a large number of non-parametric techniques that attempt to measure the degree of association between variables that are not of an interval scale of measurement.
Rather than attempt to define the variation (using the standard deviation) around an ‘average’ line of best fit, non-parametric techniques attempt to show the relationship between non-interval scale variables by various attempts to match ‘patterns’ of two or more variables. For example, an assessment of introversion and conservatism scores (both ordinal scale data) could be attempted by ranking each individual in a sample on the basis of the scores attained on tests of introversion and conservatism and seeing to what extent the ranks correspond.
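One common way of carrying out the rank comparison just described is Spearman's rank correlation, which is not named in the text above and is used here only as an example technique. The sketch below ranks each set of scores (tied scores share an average rank) and then correlates the ranks; the scores are invented.

```python
import math

def average_ranks(values):
    """Rank values from 1 (lowest); tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks

def spearman_rho(x, y):
    """Pearson correlation applied to the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return covariation / math.sqrt(var_x * var_y)

# Hypothetical test scores for five individuals.
introversion = [12, 18, 25, 31, 40]
conservatism = [20, 22, 35, 30, 50]
rho = spearman_rho(introversion, conservatism)
print(round(rho, 3))
```

Only the order of the scores matters here, so the result says how closely the two rankings correspond without specifying the form of the relationship.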
A similar approach is adopted in some analyses of association of crosstabulated data. In such circumstances the exact nature of the relationship itself is not specified.
An asymmetric relation is a relation between variables in which there is a clear indication of which is the dependent variable and which is the independent variable, i.e. it is a one-way relationship. E.g. to say ‘smoking is related to lung cancer’ does not imply that ‘lung cancer causes smoking’.
Some measures of association give different numerical results depending on which variable is taken as the independent variable. These are known as asymmetric measures of association.
An example of an asymmetric measure is the d-statistic. A symmetric measure of association (e.g. Pearson’s r) gives the same numerical result no matter which variable is chosen as the independent variable.
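The following sketch shows the asymmetry numerically, on the assumption that the d-statistic referred to is Somers' D. With Y as the dependent variable, D is taken as (C - D)/(C + D + pairs tied on Y only); with X as the dependent variable, the pairs tied on X only are used instead. The table and function names are hypothetical.

```python
from itertools import combinations

def pair_counts(table):
    """Classify all pairs of cases in a crosstab (rows = X, columns = Y, both ordinal)."""
    cells = [(i, j, count) for i, row in enumerate(table) for j, count in enumerate(row)]
    concordant = discordant = tied_x_only = tied_y_only = 0
    for (i1, j1, n1), (i2, j2, n2) in combinations(cells, 2):
        pairs = n1 * n2
        if i1 == i2:
            tied_x_only += pairs      # same X category, different Y category
        elif j1 == j2:
            tied_y_only += pairs      # same Y category, different X category
        elif (i1 - i2) * (j1 - j2) > 0:
            concordant += pairs
        else:
            discordant += pairs
    return concordant, discordant, tied_x_only, tied_y_only

def somers_d(table):
    """Return (d with Y dependent, d with X dependent)."""
    c, d, tied_x, tied_y = pair_counts(table)
    d_y_given_x = (c - d) / (c + d + tied_y)
    d_x_given_y = (c - d) / (c + d + tied_x)
    return d_y_given_x, d_x_given_y

# Hypothetical 2x3 table: two X categories, three Y categories.
table = [[10, 5, 0], [0, 5, 10]]
d_yx, d_xy = somers_d(table)
print(round(d_yx, 3), round(d_xy, 3))
```

The two values differ because the table has more ties on X than on Y, so the choice of dependent variable changes the denominator.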
Of course, any implied causal relationship must be construed in a one-way direction, otherwise a basic aspect of the notion of causality is violated: an effect cannot be the cause of its own cause.
Partial association is the association that remains between two variables when the effects of other variables have been controlled by statistical techniques.
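For interval-scale data, one such statistical technique is the first-order partial correlation, which adjusts the correlation between X and Y for a single control variable Z. This is only one of several possible approaches, and the function name and input values below are assumptions for illustration.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between X and Y with the linear effect of Z removed from both."""
    numerator = r_xy - r_xz * r_yz
    denominator = math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))
    return numerator / denominator

# Hypothetical correlations: X and Y are each correlated 0.5 with Z and with each other.
r_partial = partial_correlation(0.5, 0.5, 0.5)
print(round(r_partial, 3))
```

In this invented case the association between X and Y drops from 0.5 to about 0.33 once Z is controlled, which suggests that part of the original association was shared with Z.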
Association and causality
Statistical measures of association simply indicate the extent to which operationalised concepts covary. This may be indicative of causal relationships, given a suitable theoretical model, an establishable time priority and the inclusion of control variables. However, measures of association do not prove or establish causal relationships. At best they suggest possible or likely factors. Crucially, causality implies an invariant relationship and this is virtually impossible to establish through procedures designed to measure statistical association.
Association also refers to interactions or relationships between people.