





© Lee Harvey 2012–2018

Page updated 9 February, 2018

Citation reference: Harvey, L., 2012–2018, Researching the Real World, available at
All rights belong to author.


A Guide to Methodology

5. Document analysis and semiology


5.6 Content analysis

5.6.1 What is content analysis?
Content analysis is a quantitative technique for systematically describing the written, spoken or visual content of documents (where 'document' refers to any physical form of communication, from books and films to musical recordings and recorded conversations).

Thus content analysis can be applied to any of the following: letters, diaries, books, short stories, printed publications, newspaper content, magazine articles, catalogues, advertisements (printed and filmed), posters, graffiti, photographs, drawings, radio broadcasts, television programmes, news reports, weather reports, videos, films, plays, Web pages, recorded oral testimonies, transcripts of conversations or depositions, speeches, interviews, folk songs, popular songs, products in shops.

It involves specifying characteristics of a communication, coding them, counting occurrences of the coded categories and subsequently using statistical techniques to analyse the data. In effect, the coded categories are treated like variables.
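The sequence of specifying characteristics, coding them and counting occurrences can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the coding scheme, the indicator words and the sample headlines are all hypothetical, and real coding schemes are usually far richer than simple word matching.

```python
import re
from collections import Counter

# Hypothetical coding scheme: each category is defined by a set of indicator words.
CODING_SCHEME = {
    "violence": {"attack", "fight", "murder"},
    "politics": {"election", "minister", "vote"},
}


def code_document(text, scheme=CODING_SCHEME):
    """Count, for each category, the words in the text that match its indicators."""
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for word in words:
        for category, indicators in scheme.items():
            if word in indicators:
                counts[category] += 1
    # Report every category, including those with zero occurrences,
    # so that each category behaves like a variable with a value per document.
    return {category: counts[category] for category in scheme}


# Hypothetical documents (invented headlines, not real data).
documents = {
    "headline_1": "Minister condemns attack after election fight",
    "headline_2": "Vote delayed as minister resigns",
}

coded = {name: code_document(text) for name, text in documents.items()}

for name, counts in coded.items():
    print(name, counts)
```

Each document ends up with one count per category, which is what allows the categories to be treated like variables in later statistical analysis.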

Content analysis is thus a systematic way of identifying the content of documents by counting various aspects of the content. This provides accurate enumeration of the frequency of characteristics. Kerlinger (1986), for example, defined content analysis as a method of studying and analysing communication in a systematic, objective, and quantitative manner for the purpose of measuring variables.

Content analysis is one of the traditional ways of dealing with media content, where it has been used extensively. As Stuart Hall (1980, p. 117) noted, content analysis is part of an older American-based behaviourist orientation to media studies, which concerns itself with quantitative approaches including audience surveys. This tradition 'attempts to trace the relationship between mass communication and mass society in a kind of "stimulus-response approach"'. A recurring preoccupation of this approach has been the concern with how the mass media have debased cultural standards through trivialisation. This is 'pinpointed in the issue of the media and violence'.

However, content analysis is also intended to identify what meanings, contexts and intentions are contained in messages as well as how they are communicated and with what effect.

Content analysis began in the 1930s and was developed during World War II, when the United States government sponsored a project, led by Harold Lasswell, to evaluate enemy propaganda. Lasswell's approach was to discover who said what to whom with what effect, by statistically analysing content (Lasswell, 1940).

More recent researchers have been less vociferous about cause and effect when undertaking content analysis. Philip Stone et al. (1966), for example, stated that content analysis refers to any procedure for assessing the relative extent to which specified references, attitudes, or themes permeate a given message or document. Similarly, Robert Weber (1985) saw content analysis as a methodology that utilises a set of procedures to make inferences about the sender(s) of a message, the message itself, or the audience of the message.

Even though content analysis often analyses written words, it is a quantitative method, the results of which are numbers, percentages, crosstabulations and multivariate analyses. An impressionistic summary of a television programme, or a review of a film or book, does not constitute content analysis, despite being about the content of documents.
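To illustrate what such numerical results look like, the following is a minimal sketch of a crosstabulation with row percentages, written in plain Python. The newspaper types, themes and coded articles are hypothetical and serve only to show the form of the output.

```python
from collections import Counter

# Hypothetical coded units: (newspaper type, coded theme) for each article.
coded_articles = [
    ("tabloid", "crime"), ("tabloid", "crime"),
    ("tabloid", "sport"), ("tabloid", "politics"),
    ("broadsheet", "politics"), ("broadsheet", "politics"),
    ("broadsheet", "crime"), ("broadsheet", "sport"),
]


def crosstab(pairs):
    """Return row percentages: for each row value, the share of each column value."""
    cell_counts = Counter(pairs)
    row_totals = Counter(row for row, _ in pairs)
    columns = sorted({col for _, col in pairs})
    return {
        row: {col: 100 * cell_counts[(row, col)] / row_totals[row] for col in columns}
        for row in sorted(row_totals)
    }


table = crosstab(coded_articles)

for row, percentages in table.items():
    print(row, {col: f"{pct:.0f}%" for col, pct in percentages.items()})
```

Row percentages are used here so that sources with different numbers of articles can be compared directly; column percentages or raw counts could be reported instead, depending on the research question.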

Some commentators claim that content analysis is somehow more objective than a qualitative evaluation. Bernard Berelson (1952, p. 17), for example, maintained that content analysis is a 'research technique for the objective, systematic, and quantitative description of the manifest content of communication'. Holsti (1968) described content analysis as a technique for making inferences by systematically and objectively identifying specified characteristics of messages. More recently, Devi Prasad (2008) claimed that content analysis is all about making valid, replicable and objective inferences about the message on the basis of explicit rules.

Of course, content analysis is no more objective than any other process of analysing documents. Just because something is quantified does not make it objective, as someone subjectively determines both what it is that has to be quantified and also the method used to provide the quantification. On top of that, the outcomes of the quantification have to be interpreted or used to accept or reject a hypothesis: both processes depend on relating observation to existing theory.

