





© Lee Harvey 2012–2020

Page updated 29 April, 2020

Citation reference: Harvey, L., 2012–2020, Researching the Real World, available at
All rights belong to author.


A Guide to Methodology

5. Document analysis and semiology

5.1 Introduction
5.2 Document analysis for what?
5.3 Establishing the nature of documents and categorising them (external analysis)
5.4 Approaches to document analysis
5.5 Evidence of occurrence
5.6 Content analysis

5.6.1 What is content analysis? (Qualitative content analysis)

5.6.2 What is content analysis used for?
5.6.3 How to do content analysis (The research topic; Selecting content for analysis; Developing content categories; Coding the data in the document; Inter-rater reliability; Analysis of content; Strengths and limitations of content analysis)

5.7 Qualitative document analysis
5.8 Historical research
5.9 Hermeneutics
5.10 Semiology

5.11 Critical media analysis
5.12 Aesthetics, art criticism, art history

5.6 Content analysis

5.6.1 What is content analysis?
Content analysis is a quantitative technique for systematically describing written, spoken or visual content of documents (where document refers to any physical form of communication from books to films to musical recordings and recorded conversation).

Thus content analysis can be undertaken of the following: letters, diaries, books, short stories, printed publications, newspaper content, magazine articles, catalogues, advertisements (printed and filmed), posters, graffiti, photographs, drawings, radio broadcasts, television programmes, news reports, weather reports, videos, films, plays, Web pages, recorded oral testimonies, transcripts of conversations or depositions, speeches, interviews, folk songs, popular songs, products in shops.

It involves specifying characteristics of a communication, coding them, counting occurrences of the coded categories and subsequently using statistical techniques to analyse the data. In effect, the coded categories are treated like variables.
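The steps just described (specify characteristics, code them, count occurrences) can be sketched in a few lines of Python. This is only an illustrative sketch, not a procedure taken from any of the studies cited here: the codebook, its indicator words and the three units of analysis are all invented for the example, and real coding is normally done by trained coders working from a much fuller codebook.

```python
from collections import Counter

# Hypothetical codebook: each category is defined by a set of indicator words.
# Both the categories and the word lists are illustrative assumptions.
CODEBOOK = {
    "violence": {"attack", "fight", "weapon"},
    "family": {"mother", "father", "child"},
}


def code_unit(text, codebook=CODEBOOK):
    """Return the set of categories whose indicator words appear in one unit."""
    words = {w.strip(".,;:!?'\"").lower() for w in text.split()}
    return {cat for cat, indicators in codebook.items() if words & indicators}


def count_categories(units, codebook=CODEBOOK):
    """Count how many units were coded into each category."""
    counts = Counter({cat: 0 for cat in codebook})
    for unit in units:
        counts.update(code_unit(unit, codebook))
    return counts


# Made-up units of analysis (for example, one-line programme summaries).
units = [
    "The father and child watch the fight.",
    "A weapon is found after the attack.",
    "The mother reads a story.",
]
counts = count_categories(units)
print(counts)
```

Each category count behaves like a variable: it can be compared across documents, time periods or sources before any statistical technique is applied.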

Examples include counting the number of aged people represented on 'prime time' television programmes and comparing them to real life; analysing the percentage of radio phone-in calls that criticise state agencies on different radio stations (Verwey, 1990); comparing the numbers and roles of males and females in soap operas (Pingree, 1983); analysing the portrayal of gender roles in pre-school books (Weitzman, 1974). (See, for example, CASE STUDY Press Reporting of Rape, which mixes content analysis with a critical approach to discourse analysis.)

Content analysis is thus a systematic way of identifying the content of documents by counting various aspects of the content. This provides accurate enumeration of the frequency of characteristics. Kerlinger (1986), for example, defined content analysis as a method of studying and analysing communication in a systematic, objective, and quantitative manner for the purpose of measuring variables.

Content analysis is one of the traditional ways of dealing with media content, where it has been used extensively. As Stuart Hall (1980, p. 117) noted, content analysis is part of an older American-based behaviourist orientation to media studies, which concerns itself with quantitative approaches including audience surveys. This tradition 'attempts to trace the relationship between mass communication and mass society in a kind of "stimulus-response approach"'. A recurring preoccupation of this approach has been the concern with how the mass media have debased cultural standards through trivialisation. This is 'pinpointed in the issue of the media and violence'.

Content analysis studies often assume a relationship between content and effect. This rests on a common-sense notion that media content (at least some kinds of it, such as 'violence') must have some effect. Often, however, the media are not seen as causing new behaviour or attitudes but as reinforcing existing attitudes. In The People's Choice, for example, Paul Lazarsfeld et al. (1944) argued against the idea of the media as a stimulus and showed that the media seemed to act to reinforce people's preconceptions and prejudices rather than to change them (Klapper, 1960). Similarly, cultivation theorists see the media as having a gradual effect. However, when analysing content there is always the problem of whether the content affects the viewers or whether the content reflects the society in which the viewers already live. In other words, do the media affect society or are the media affected by society (McQuail, 1983)?

Most studies of media effects tend to adopt quantitative approaches of one sort or another in attempts to show how the media are linked to social phenomena or attitudes. For example, Alexis Tan (1979) constructed scores for an experimental and a control group based on how the subjects rated a list of characteristics; the scores were then compared statistically. Phillips (1986) used suicide statistics in his study of the effects of fictional suicides. Pingree (1983) counted the numbers and roles of males and females in soap operas.

However, content analysis is also intended to identify what meanings, contexts and intentions are contained in messages as well as how they are communicated and with what effect.

Content analysis began in the 1930s and was developed during World War II, when the United States government sponsored a project, led by Harold Lasswell, to evaluate enemy propaganda. Lasswell's approach was to discover who said what to whom with what effect, by statistically analysing content (Lasswell, 1940).

More recent researchers have been less vociferous about cause and effect when undertaking content analysis. Philip Stone et al. (1966), for example, stated that content analysis refers to any procedure for assessing the relative extent to which specified references, attitudes, or themes permeate a given message or document. Similarly, Robert Weber (1985) saw content analysis as a methodology that utilises a set of procedures to make inferences about the sender(s) of the message, the message itself, or the audience of the message.

Even though content analysis often analyses written words, it is a quantitative method, the results of which are numbers, percentages, crosstabulations and multivariate analyses. An impressionistic summary of a television programme, or a review of a film or book, does not constitute content analysis, despite being about the content of documents.
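To make the point about numerical results concrete, the following Python sketch turns coded units into a crosstabulation with row percentages. It is a hedged illustration only: the station names and the coded calls are invented and are not data from any study cited in this section, and the helper names `crosstab` and `row_percentages` are hypothetical rather than part of any library.

```python
from collections import Counter

# Made-up coded records: (station, whether the call criticised a state agency).
coded_calls = [
    ("Station A", True), ("Station A", False),
    ("Station A", True), ("Station A", True),
    ("Station B", False), ("Station B", False),
    ("Station B", True), ("Station B", False),
]


def crosstab(records):
    """Build a {row: Counter({column: count})} table from (row, column) pairs."""
    table = {}
    for row, col in records:
        table.setdefault(row, Counter())[col] += 1
    return table


def row_percentages(table):
    """Convert each row of counts into percentages of that row's total."""
    result = {}
    for row, counts in table.items():
        total = sum(counts.values())
        result[row] = {col: 100.0 * n / total for col, n in counts.items()}
    return result


table = crosstab(coded_calls)
percentages = row_percentages(table)
for station, pct in percentages.items():
    print(f"{station}: {pct.get(True, 0.0):.0f}% of calls coded as critical")
```

A table of this kind is the usual starting point for the statistical comparison of sources; whether a difference between rows is meaningful still has to be tested and interpreted.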

Some commentators claim that content analysis is somehow more objective than a qualitative evaluation. Bernard Berelson (1952, p. 17), for example, maintained that content analysis is a 'research technique for the objective, systematic, and quantitative description of the manifest content of communication'. Holsti (1968) described content analysis as a technique for making inferences by systematically and objectively identifying specified characteristics of messages and, more recently, Devi Prasad (2008) claimed that content analysis is all about making valid, replicable and objective inferences about the message on the basis of explicit rules.

Of course, content analysis is no more objective than any other process of analysing documents. Just because something is quantified does not make it objective, as someone subjectively determines both what it is that has to be quantified and also the method used to provide the quantification. On top of that, the outcomes of the quantification have to be interpreted or used to accept or reject a hypothesis: both processes depend on relating observation to existing theory.

Qualitative content analysis

Some researchers and commentators refer to 'qualitative content analysis', which is slightly confusing. In general, such approaches are not undertaking qualitative analysis per se but are coding segments or phrases to get a feeling for the dominant outcomes, intentions or approaches specified in the documents. These are often augmented by specific examples.

For example, Kajaste (2018) undertook a qualitative content analysis of audit reports of Finnish universities of applied sciences. The research explored the quality assurance of research and innovation in the institutions. This involved an initial reading of the appropriate sections for all 15 institutions in the sample. 'A coding framework was then constructed for themes emerging from the data. Coding was done using Nvivo 11 qualitative analysis software.... Finally a synthesis of each theme was formed.' (Kajaste, 2018, p. 6). The results of the analysis are reported in a loose quantitative manner without any attempt at close enumeration. Thus:

Universities of applied sciences have generally selected three to five overall strategic goals for RDI. The intentions are often to foster the integration between RDI and education and to update teachers' working-life knowledge through participation in RDI activities with working-life partners. Enlarging the volume of RDI activities is closely tied to external, competitive funding.... At times, the audit teams have also mentioned if they have seen evidence of RDI projects being actively directed to the focus areas. MAMK was the only example, where the audit team lauded the UAS's activities in directing RDI. (Kajaste, 2018, pp. 9–10)

Qualitative content analysis is very similar to qualitative document analysis (Section 5.7), the distinction being whether the analysis attempts to interpret meanings or focuses on identifying dominant tendencies.

