





© Lee Harvey 2012–2020

Page updated 29 April, 2020

Citation reference: Harvey, L., 2012–2020, Researching the Real World, available at
All rights belong to author.


A Guide to Methodology

5. Document analysis and semiology

5.1 Introduction
5.2 Document analysis for what?
5.3 Establishing the nature of documents and categorising them (external analysis)
5.4 Approaches to document analysis
5.5 Evidence of occurrence
5.6 Content analysis
5.7 Qualitative document analysis
5.8 Historical research
5.9 Hermeneutics
5.10 Semiology

5.11 Critical media analysis
5.12 Aesthetics, art criticism, art history

5.7 Qualitative document analysis
The approach to analysing the content of documents that is referred to as 'qualitative document analysis' is in principle the same as the analysis of any qualitative data, such as in-depth interview transcripts, participant observation notes or non-participant observation recordings.

It involves reading through the observations and developing themes and coding content. See Section 3.6 on analysing observational or ethnographic data and Section 4.5 on analysing in-depth interview data.

The approach attempts to explore the document to provide an interpretation of the actions, motivations and intentions of actors identified in the documentation.

Elise Wach, Richard Ward and Ruzica Jacimovic (2013) report undertaking a qualitative document analysis as part of the Triple-S (Sustainable Services at Scale) initiative, which is a six-year, multi-country action research and learning initiative that aims to promote long-term sustainable approaches to the funding and implementation of water services in the rural water sector in parts of Africa.

The authors claimed that qualitative document analysis enables rigorous and systematic analysis of the contents of written documents. In political science research, the approach facilitates impartial and consistent analysis of written policies.

The process adopted by Wach et al. had six steps. The first was setting inclusion criteria for documents. This involved deciding which organisations would be included, the types of documents to be reviewed, and the time of publication and release of those documents.
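To illustrate how inclusion criteria of this kind might be applied in practice, the following is a minimal sketch in Python. It is not taken from Wach et al.; the organisation names, document types, date range and the function meets_criteria are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    organisation: str
    doc_type: str
    year: int


def meets_criteria(doc, organisations, doc_types, first_year, last_year):
    """Return True if the document satisfies all three inclusion criteria."""
    return (
        doc.organisation in organisations
        and doc.doc_type in doc_types
        and first_year <= doc.year <= last_year
    )


# Hypothetical candidate documents (not from the Triple-S study).
candidates = [
    Document("Agency A", "policy", 2008),
    Document("Agency A", "press release", 2008),
    Document("Agency B", "policy", 2001),
    Document("Agency C", "programme report", 2009),
]

# Hypothetical criteria: which organisations, which document types, which years.
included = [
    d for d in candidates
    if meets_criteria(
        d,
        organisations={"Agency A", "Agency C"},
        doc_types={"policy", "programme report"},
        first_year=2005,
        last_year=2010,
    )
]
```

Writing the criteria down explicitly in this way, whether in code or on paper, makes it possible for another researcher to see why each document was included or excluded.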

The second step was the collection of documents. Most documents were in the public domain, although not all appropriate documents were available online. The team therefore requested such documents from its contacts within the organisations, which presented various complications that affected the scope and limits of the approach. Only documents that were willingly disclosed, either publicly or through direct contact, were analysed; the team did not have unrestricted access to partners' documents in order to apply a sampling method.

Third, they articulated key areas for analysis. Initially they identified 21 themes that the Triple-S initiative had deemed to be important. However, it became apparent that some of these themes were not appropriate for application to practice documents and that the team needed to be more explicit about some of the themes. What, for example, was meant by a commitment to 'accountability and transparency' or 'monitoring for sustainability'? One of the themes for analysis was the extent to which an organisation ensured that its approach was 'country-specific' (adapting lessons from other places to local contexts). 'While the team found it possible to analyse policy documents for this theme, with regard to the practice documents, nearly any programme set in any given country could be argued to employ a country-specific approach and this aspect was therefore difficult to meaningfully score' (Wach et al., 2013, p. 4).

The fourth stage was document coding. Each document was analysed to determine the extent to which the policy or programme it described addressed or considered each of the identified 'themes' for sustainable services. Text relevant to each theme was highlighted and coded using qualitative data analysis software. Each document was subsequently assessed, based on the meaning, relevance and context of the text for each theme, as 'good', 'limited', 'none' or 'unclear', with clear criteria of what each score signified.

The authors argued that qualitative analysis of content, meaning and relevance in context distinguishes the methodology from a search for key words as used, for example, in standard content analysis.

For example, one of the themes was 'monitoring for sustainability'. Rather than conduct a search for references to monitoring in general, the team aimed to assess whether the policy or programme monitored not just functionality, but other indicators that make a service sustainable. These would be context-specific, but would include issues such as adequate management capacity, tariff recovery, and technical backstopping (Wach et al., 2013, p. 3).

The fifth stage was verification, which involved a second person verifying the analysis of every document. In addition, a third person provided ad hoc verification and also served as an arbiter for any inconsistencies between the two primary coders. This went beyond what is normally deemed sufficient for coding and 'ensured robust interpretative analysis and conclusions' (Wach et al., 2013, p. 3).
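The logic of this kind of verification can be sketched as follows. This is an illustrative Python example only, not the authors' procedure: the document identifiers, the scores and the functions reconcile and arbiter are hypothetical. It assumes that both coders scored the same set of document and theme pairs, and that the third person's rulings are simply recorded and looked up.

```python
def reconcile(first_coder, second_coder, arbiter):
    """Accept a score where both coders agree; otherwise ask the arbiter."""
    final = {}
    for key, score in first_coder.items():
        other = second_coder[key]
        if score == other:
            final[key] = score
        else:
            final[key] = arbiter(key, score, other)
    return final


# Hypothetical scores, keyed by (document, theme).
first = {
    ("doc1", "monitoring for sustainability"): "good",
    ("doc1", "country-specific"): "limited",
}
second = {
    ("doc1", "monitoring for sustainability"): "good",
    ("doc1", "country-specific"): "unclear",
}

# Hypothetical record of the third person's decisions on disputed items.
rulings = {("doc1", "country-specific"): "unclear"}


def arbiter(key, score_a, score_b):
    # In a real study this would be a human judgement, not a lookup.
    return rulings[key]


final = reconcile(first, second, arbiter)
```

The point of the sketch is that only disagreements reach the arbiter; agreed scores pass through unchanged.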

The sixth stage was the analysis of the data to compare policy and practice. Scores of good, okay, limited, none and unclear were assigned numerical values (0 to 3) to assist in aggregation and data presentation.
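As a rough illustration of how such numerical values might assist aggregation, the sketch below averages scores across a set of documents. The mapping used here (good = 3, okay = 2, limited = 1, none = 0, with 'unclear' left out of the average) is an assumption made for this example; the account above does not state which label received which value or how 'unclear' was treated. The score lists are invented.

```python
from statistics import mean

# Assumed mapping, for illustration only; 'unclear' is deliberately absent.
SCORE_VALUES = {"good": 3, "okay": 2, "limited": 1, "none": 0}


def mean_score(labels):
    """Average the numeric values of the labels, skipping any unscored label."""
    values = [SCORE_VALUES[label] for label in labels if label in SCORE_VALUES]
    return mean(values) if values else None


# Hypothetical scores for one theme across three policy and three practice documents.
policy_scores = ["good", "okay", "unclear"]
practice_scores = ["limited", "none", "limited"]

policy_mean = mean_score(policy_scores)      # averages 3 and 2
practice_mean = mean_score(practice_scores)  # averages 1, 0 and 1
```

Comparing the two averages for each theme gives a simple way of presenting any gap between stated policy and documented practice, subject to the assumptions noted above.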

The documents served as the sole source of information for the scoring, which facilitated objectivity but proved challenging on some occasions. For example, in the policy review, it was difficult to give high scores for certain themes or organisations when the team knew that these stated policies were not being effectively applied in practice. This situation served as motivation to undertake a 'practice' QDA to see how things stacked up when it came to practices on the ground. Equally, however, in the practice review it was sometimes difficult for some researchers to give a low score when the team knew that practices had improved since the time of writing, or where certain text could be reasonably assumed to include or indicate other practices or measures that were not explicitly stated in the document. We had to remind ourselves of the original intention for this first round to: (a) serve as a baseline, reflecting policy and practice from 2008, rather than current practice; and (b) for it to be treated as just one of many sources of information about policies and practices in the water sector, rather than an evaluation of an organisation or programme. (Wach et al., 2013, pp. 3–4)

