





© Lee Harvey 2012–2020

Page updated 29 April, 2020

Citation reference: Harvey, L., 2012–2020, Researching the Real World, available at
All rights belong to author.


A Guide to Methodology

7. Secondary data

7.1 Introduction to secondary analysis
7.2 Extent of re-analysis of secondary data

7.3 Nature of the data

7.3.1 Rigidity and longevity of the data
7.3.2 Nature of the enquiry
7.3.3 Ingenuity of the social researcher

7.4 Data sources
7.5 Examining data sources
7.6 Methodological approaches

7.7 Summary and conclusion

7.3 Nature of the data
The extent to which you can undertake secondary data analysis depends on three things: the rigidity of the data; the nature of the enquiry; and the ingenuity of the social researcher.


7.3.1 Rigidity and longevity of the data
The form in which the original data is available restricts what you can do with it. If the data is only available in already-published tables, then the researcher can only undertake secondary analysis within the framework set up by the table.

For example, a table may list the employment rates for males and for females and for different ethnic minorities, but unless the table crosstabulates gender and ethnicity it is impossible to work out, from the simple rates for each category alone, the employment rate for a combined group such as Asian women.
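The point about missing crosstabulations can be illustrated with a minimal sketch. The two populations below are entirely hypothetical (the figures, the category labels and the function names are invented for illustration): they have identical employment rates by gender and by ethnicity, yet different rates for Asian women, so the published marginal rates alone cannot identify the combined rate.

```python
from collections import defaultdict

def marginal_rates(cells):
    """cells maps (gender, ethnicity) -> (employed, total).
    Returns the employment rate for each single category."""
    employed = defaultdict(int)
    total = defaultdict(int)
    for (gender, ethnicity), (emp, n) in cells.items():
        for category in (gender, ethnicity):
            employed[category] += emp
            total[category] += n
    return {c: employed[c] / total[c] for c in total}

def cell_rate(cells, gender, ethnicity):
    """Employment rate for one gender-by-ethnicity combination."""
    emp, n = cells[(gender, ethnicity)]
    return emp / n

# Hypothetical population 1: (employed, total) per combination.
population_1 = {
    ("male", "Asian"): (80, 100),
    ("male", "Other"): (60, 100),
    ("female", "Asian"): (40, 100),
    ("female", "Other"): (60, 100),
}

# Hypothetical population 2: different cells, same single-category rates.
population_2 = {
    ("male", "Asian"): (70, 100),
    ("male", "Other"): (70, 100),
    ("female", "Asian"): (50, 100),
    ("female", "Other"): (50, 100),
}

print(marginal_rates(population_1) == marginal_rates(population_2))
print(cell_rate(population_1, "female", "Asian"))
print(cell_rate(population_2, "female", "Asian"))
```

Both populations would produce the same published table of simple rates (0.7 for males, 0.5 for females, 0.6 for each ethnic group), but the rate for Asian women is 0.4 in one and 0.5 in the other.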

Individual case data is available in some data archives. Such data is by far the most flexible as it allows researchers the freedom to reanalyse it in any way that they think suitable, as they might if they had collected the data themselves. (See Section 7.4 Data Sources.)
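As a rough sketch of why case-level data is so flexible, the snippet below computes an employment rate for any combination of variables the researcher chooses. The records, the variable names and the function `employment_rate` are all hypothetical; real archive data would need cleaning and weighting that is not shown here.

```python
from collections import defaultdict

def employment_rate(records, by):
    """Return the employment rate for each combination of the
    variables named in `by` (a tuple of record keys)."""
    employed = defaultdict(int)
    total = defaultdict(int)
    for record in records:
        key = tuple(record[variable] for variable in by)
        total[key] += 1
        employed[key] += record["employed"]  # True counts as 1
    return {key: employed[key] / total[key] for key in total}

# Hypothetical individual cases, one dict per respondent.
cases = [
    {"gender": "female", "ethnicity": "Asian", "employed": True},
    {"gender": "female", "ethnicity": "Asian", "employed": False},
    {"gender": "female", "ethnicity": "Other", "employed": True},
    {"gender": "male", "ethnicity": "Asian", "employed": True},
    {"gender": "male", "ethnicity": "Other", "employed": True},
    {"gender": "male", "ethnicity": "Other", "employed": False},
    {"gender": "male", "ethnicity": "Other", "employed": False},
]

# The same cases can be tabulated by one variable or crosstabulated by two.
by_gender = employment_rate(cases, ("gender",))
by_gender_and_ethnicity = employment_rate(cases, ("gender", "ethnicity"))
print(by_gender)
print(by_gender_and_ethnicity)
```

Because the grouping variables are passed in as an argument, the researcher is not tied to whichever breakdown the original publisher chose to tabulate.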

The longevity of the data is another issue, especially when undertaking time series analysis or comparisons over time. Section 7.5.1 discusses the problems that arise when changes are made to longitudinal data as a result of political pressure. However, sometimes the collection of key data for a time series is dropped altogether. For example, as Erzsébet Bukodi et al. (2015) explain in relation to the tracking of social mobility in the United Kingdom over time:

the data available for the determination of levels of and trends in social mobility are less adequate today than two or three decades ago. The data most appropriate in this regard are those provided by repeated sample surveys of the population at large. From 1972 to 1992, such data were available from the General Household Survey (GHS) carried out annually. Data were collected on survey respondents' employment status and occupation and on respondents' fathers' 'usual' employment status and occupation, and were then in both cases coded to the Registrar General's Socio-Economic Groups (SEGs), from which an approximation to the Goldthorpe social class schema...could be obtained.... While subject to various limitations, these data did allow reasonably informative tables of intergenerational class mobility to be constructed, and thus gave a basis for the analysis of mobility trends (Goldthorpe and Mills 2004). However, from 1993 the GHS ceased to collect information on respondents' fathers, or indeed on any other aspect of respondents' social origins. The one exception occurred in 2005 when, within an EU-SILC module incorporated into the GHS, information was again obtained on fathers' occupations but together with only limited information on their employment status—so that no close comparison with earlier GHS data was possible.


7.3.2 Nature of the enquiry
The scope for secondary analysis depends on the nature of the sociological enquiry. It may be that the specific hypotheses under consideration cannot be addressed through secondary analysis because no available survey has collected data on all aspects of the hypothesis. Alternatively, the definition of a theoretical concept used in available studies may in no way match the theoretical concerns of the researcher (such as the use in official statistics of socio-economic group to stand for social class).

However, although government definitions may differ from those of sociologists, there is scope to overcome these problems, according to Catherine Hakim (1982, p. 141), who suggested that many official surveys could 'yield new or additional results if re-analysed within the framework of social science theory'.


7.3.3 Ingenuity of the social researcher
The third factor that influences secondary data analysis is the ingenuity of the sociologist. Secondary data analysis depends on the researcher's imagination. This can be summed up as 'looking for a different angle on the data'. While this is most easily undertaken when data sets of individual cases are available from archives, it can also be done using composites of already-published tables.

For example, in State of the Nation, Stephen Fothergill and Jill Vincent (1985) brought together a large number of official statistics to demonstrate how the North–South divide had widened over the decade up to 1984.

Similarly, in their study of poverty, Brian Abel-Smith and Peter Townsend (1965) developed procedures to overcome data discontinuities, such as changes in income banding, in patterns of household composition and in definitions, across the 20-year span of Family Expenditure Survey data used in their study.
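One simple way of coping with a change in income banding is to collapse each year's bands into coarser bands whose boundaries appear in every year. The sketch below illustrates that general idea only; it is not the procedure Abel-Smith and Townsend used, and the bandings, counts and function names are hypothetical. It assumes each banding is given as a mapping from a band's lower bound to the number of households in that band, that all bandings share the same lowest bound, and it ignores changes in the value of money over time.

```python
from bisect import bisect_right

def shared_lower_bounds(*bandings):
    """Lower bounds that appear in every banding, in ascending order."""
    shared = set(bandings[0])
    for banding in bandings[1:]:
        shared &= set(banding)
    return sorted(shared)

def collapse(banding, bounds):
    """Re-aggregate one banding into the coarser bands defined by `bounds`."""
    collapsed = {bound: 0 for bound in bounds}
    for lower, count in banding.items():
        index = bisect_right(bounds, lower) - 1
        if index < 0:
            raise ValueError(f"band starting at {lower} lies below all shared bounds")
        collapsed[bounds[index]] += count
    return collapsed

# Hypothetical household counts by income band (lower bound -> count).
early_survey = {0: 120, 5: 200, 10: 150, 20: 30}
later_survey = {0: 90, 4: 110, 10: 140, 15: 100, 20: 60}

bounds = shared_lower_bounds(early_survey, later_survey)
early_collapsed = collapse(early_survey, bounds)
later_collapsed = collapse(later_survey, bounds)
print(bounds)
print(early_collapsed)
print(later_collapsed)
```

The collapsed tables are coarser than either original, but they use the same three bands in both years and so can be compared directly. The price is a loss of detail, which is the usual trade-off when reworking published tables.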

Reworking data to produce new findings requires some data analytic skills (see Section 8). However, a lot can be done with some simple and straightforward reworking and comparison of available data.
