9.1 The classic experiment
Experimentation is regarded as the method of science (Easthope, 1974; Mayntz et al., 1976). It provides a way of measuring the impact that one factor has on another one. This is done, in the standard scientific experiment, by controlling all the other factors that may also have an effect.
The experiment involves testing variables under controlled circumstances. There are two types of variable, independent and dependent. The independent variable is presumed to cause changes in the dependent variable. In the experiment, the independent variable is manipulated to see what happens to the dependent variable. There are, thus, two basic requirements of an experiment.
First, that the researcher is able to manipulate the independent variable.
Second, that the researcher is able to control all other factors.
The laboratory experiment in the chemical and physical sciences is the ideal type of experiment. In such settings, the researcher sets up a situation in which all theoretically likely factors are controlled and then changes one of these factors to see what effect it has on a dependent factor.
For example, an electric circuit might be set up which includes a possible resistor (that is, something in the circuit that interferes with the flow of the electricity). The electrical supply is turned on to see whether the resistor indeed reduces the flow of electricity.
The way the experiment is set up depends on existing theory and the result is expected to confirm or elaborate the theory. In the example above, it may be assumed that the resistor will work but its level of resistance may be unknown. The design of the experiment is based on existing electromagnetic theory in order to test the resistance. In this case, of course, the researcher will not bother to control for the colour of the insulator on the wires that make up the circuit as this is irrelevant according to electromagnetic theory. The colour of the wires will not have any causal effect according to the theory and thus there is no need to control for them.
Thus, an experiment is set up to confirm, or to elaborate, empirically an existing theory by testing an experimental hypothesis. It is not a means of collecting data. The stages of conducting an experiment are as follows:
1. Construct an experimental hypothesis that specifies the independent variable, the dependent variable and the relationship between the two.
2. Decide on a suitable experimental design.
3. Decide on the control variables.
4. Work out how you are going to control these variables during the experiment.
5. Work out how to measure changes in the dependent and independent variables.
6. Apply the experimental stimulus.
7. Measure changes in the dependent variable.
8. Analyse the results and accept or reject the experimental hypothesis.
An experimental hypothesis is a specific statement about the relationship between two variables (dependent and independent variables) that can be tested by setting up a situation in which all other relevant factors are controlled. The hypothesis will be guided by theory.
For example, the ‘risky shift’ phenomenon was suggested by Filby and Harvey (1988) to explain, in part, why betting-shop punters were less successful during the busiest times in the betting shop. The theory suggests that people make riskier decisions when in a group than they would when on their own. Hence, when the betting shop is at its busiest there is more scope for punters to meet and discuss betting options, leading to riskier selections.
The risky shift phenomenon could be tested experimentally using the following hypothesis:
Punters are more likely to make selections with longer odds when making choices in a group than when making selections individually.
In the social sciences two types of experimental design are normally used. The first is the test–stimulus–retest approach. In this situation a group is tested to measure the extent of the dependent variable. An independent variable is introduced as a stimulus and then the group is retested to see whether the dependent variable has changed.
A simple example would be to test a group to see what knowledge they had about the effects of drinking alcohol, to show them a film about alcohol consumption, then to retest their knowledge. The assumption is that any change in the level of knowledge is caused by the film.
This, however, raises problems of control. What if other events took place between the test and the retest that affected the knowledge of the sample group? It is also possible that simply testing the group in the first place is sufficient to get them thinking and thus lead to an increase in knowledge irrespective of the film.
The second type of experimental design is to use a ‘control group’. In this approach, two samples are selected, an experimental sample and a control sample. These samples should be as similar as possible to minimise the possibility of other factors affecting the result. This approach could have been applied in the experiment with the film. The experimental group could have been shown the film but not the control group. Both groups could then have been tested to assess their knowledge of the effects of alcohol.
If the experimental and control groups are well matched, then any difference in knowledge is presumed to be caused by the film. However, this does assume that the two groups start from the same level of knowledge.
Alexis Tan (1979) used the control group design in a study of the effects of beauty advertisements on perceptions of young women.
The study focused on television advertisements that used sex appeal, beauty or youth as selling points. Tan referred to these as ‘beauty commercials’: for example, an advertisement that suggested using a particular toothpaste increased ‘sex appeal’.
The main concern was to determine whether exposure to TV beauty commercials affected a viewer’s perception of the importance of beauty, sex appeal and youth in various ‘real life’ roles.
The 56 subjects in the study were female high school students aged between 16 and 18.
The general hypothesis tested was ‘that all subjects exposed to the TV beauty commercials will rate sex appeal, youth and beauty characteristics more important in four defined role relationships than subjects not exposed to the beauty commercials’. The four roles were: success in a career or job; success as a wife; to be popular with (or liked by) men; and ‘for you personally to be desirable as a woman’.
The subjects were divided into two groups at random. The first group (called the experimental group) were shown 15 network TV ‘beauty commercials’ (called the treatment). The second group (called the control group) were shown 15 network commercials, such as Alpo dog food, all devoid of these ‘beauty’ features. In other respects the time and place in which the commercials were viewed and the length of the sequence of 15 commercials were identical.
To measure how important the two groups rated sex appeal, beauty and youth, the subjects were asked to identify what they thought were the five most important characteristics, ranked in order, for success in the four roles mentioned above. The subjects had to choose their five most important characteristics from a list that contained five beauty traits (a pretty face; sex appeal; a youthful appearance; a healthy, slim body; glamour) and five non-beauty traits (intelligence; hard-working; articulate (good) talker; a good education; competence). The rankings were awarded points from 5 (most important) down to 1.
The total points for beauty items were recorded for each respondent and could range from 15 (if all five chosen items were beauty items) to 0 (if none were selected). The averages for the two groups for each role were as follows:
| Role | Experimental group (n=23) | Control group (n=33) |
| To be successful in career | | |
| To be a successful wife | | |
| To be popular with men | | |
| Personally desirable characteristic | | |
The results indicated that exposure to beauty commercials (the treatment) had a marked effect on attitudes towards two of the four roles. However, the sample is small, and there is no way of knowing whether the effect would be long-term or what the prior disposition of the subjects was towards beauty.
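The scoring scheme used in the study can be sketched in a few lines. This is a minimal illustration, assuming that the first-ranked characteristic earns 5 points and the fifth earns 1; the function name `beauty_score` and the example respondent are hypothetical and are not taken from Tan's paper.

```python
# Beauty traits as listed in the description of the study.
BEAUTY_TRAITS = {
    "a pretty face",
    "sex appeal",
    "a youthful appearance",
    "a healthy, slim body",
    "glamour",
}


def beauty_score(ranked_traits):
    """Return the total points given to beauty traits in one ranked list.

    Assumption: the first-ranked trait earns 5 points and the fifth earns 1.
    """
    if len(ranked_traits) != 5 or len(set(ranked_traits)) != 5:
        raise ValueError("expected five distinct ranked traits")
    return sum(
        points
        for points, trait in zip(range(5, 0, -1), ranked_traits)
        if trait in BEAUTY_TRAITS
    )


# Hypothetical respondent, invented for illustration only.
example = [
    "intelligence",      # 5 points, non-beauty
    "a pretty face",     # 4 points, beauty
    "competence",        # 3 points, non-beauty
    "sex appeal",        # 2 points, beauty
    "a good education",  # 1 point, non-beauty
]
print(beauty_score(example))  # 6
```

A respondent who chose only beauty traits would score 5 + 4 + 3 + 2 + 1 = 15, and one who chose none would score 0, which matches the range given above.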
One way to take account of this assumption would be to combine the test–stimulus–retest design with the control group design. This is a common design in social science experimental research. Thus, in the example on the effects of drinking alcohol, the experimental group and the control group would be given the two tests but only the experimental group would be shown the film. The difference between the first test and the retest would be compared for the two groups. If there is a larger increase in knowledge for the experimental group than for the control group then it would be assumed that the increase in knowledge was caused by the film (see Figure 9.1 Experimental design).
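The comparison of gains in the combined design can be illustrated with a short sketch. The scores below are hypothetical, invented purely to show the arithmetic, and the helper `mean_gain` is an assumed name rather than part of any established procedure.

```python
from statistics import mean


def mean_gain(before, after):
    """Average change from test to retest for one group."""
    if len(before) != len(after):
        raise ValueError("each subject needs a test and a retest score")
    return mean(a - b for b, a in zip(before, after))


# Hypothetical knowledge scores, invented for illustration only.
experimental_before = [4, 5, 3, 6]
experimental_after = [7, 8, 6, 8]   # retested after seeing the film
control_before = [4, 6, 3, 5]
control_after = [5, 6, 4, 6]        # retested without seeing the film

gain_experimental = mean_gain(experimental_before, experimental_after)
gain_control = mean_gain(control_before, control_after)

# The gain in the control group estimates what testing alone (and any
# outside events) contributed; the remainder is attributed to the film.
effect_estimate = gain_experimental - gain_control
print(gain_experimental, gain_control, effect_estimate)  # 2.75 0.75 2.0
```

In a real study the difference in gains would still need a significance test before any causal claim was made.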
What experimental design would you use to test the following hypotheses:
1. First-year students are more likely to use the library if they have an introductory talk from the librarian.
2. Males would be more likely to take up a career in nursing if they spent some time on work experience in a hospital.
3. The risky shift hypothesis: that people make riskier decisions when in a group than they would when on their own.
(A suggested answer to part 3 is given at the end of this section.)
REMEMBER to say which is the independent variable and which is the dependent variable. Putting your hypothesis in diagrammatic form can be a useful technique when you are trying to sort out your research design.
Control is a key issue in experimentation. It is necessary to identify the control variables and then devise some way of controlling them. The control variables will be those factors that, in theory, are likely to change the effect that the independent variable, which is being manipulated, will have on the dependent variable.
For example, an experiment to see how exposure to propaganda affected support for the Gulf War would need to take account of the gender of the respondent. Opinion polls suggested that at the outbreak of the war, a much larger proportion of men than women supported the war. Gender is thus a control variable in this case. It is often difficult to identify all the relevant control variables.
Once the control variables have been identified some way of controlling for them has to be devised. One way of doing this is, in effect, to exclude control variables. In the previous example, one way of controlling for gender differences is to have all the experimental subjects the same gender. This is known as a homogeneous group.
If the study group is all male then there may be a need to test the outcomes with an all female group to see whether the experimental results obtained are repeated.
In practice, however, it is difficult to identify all the important control variables and to be able to assemble such homogeneous experimental groups.
Allocating subjects to experimental and control groups can be done in two ways. The easiest way is ‘randomisation’.
In this process a single group is allocated into two subgroups at random. The assumption is that the control variables will also be randomised. In other words, the effect of the control variables will be purely random. If the two samples show a statistically significant difference (see Section 22.214.171.124) in the dependent variable then it is assumed that this is due to the experimental stimulus and not the randomised control variables.
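Random allocation of this kind can be sketched as follows. This is only an illustration: the function name `randomise`, the subject labels and the fixed seed are assumptions made so that the example is reproducible.

```python
import random


def randomise(subjects, seed=None):
    """Split subjects into two subgroups of (near) equal size at random."""
    rng = random.Random(seed)
    shuffled = list(subjects)  # copy, so the original list is untouched
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]


# Hypothetical initial sample of one hundred people.
subjects = [f"subject_{i:03d}" for i in range(1, 101)]
experimental, control = randomise(subjects, seed=1)
print(len(experimental), len(control))  # 50 50
```

Because every subject has the same chance of ending up in either subgroup, characteristics that were never measured are, on average, spread evenly across both.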
There are two major problems with this approach. First, it is not always possible to allocate a sample into two random subgroups. Second, randomisation of control variables may not be achieved if the sample groups are small; and this procedure should only be used when there is an initial sample of about one hundred people.
The other way of getting a control group is by using matched pairs. In this procedure relevant control factors are identified. Then for every person allocated to the experimental group, a person with the same set of values of the control variable is allocated to the control group.
For example, gender, age and ethnicity might be important control factors in an experiment. So, if an 18-year-old white woman was allocated to the experimental group then a matching 18-year-old white woman would also need to be allocated to the control group.
There are several problems with this procedure. First, you have to decide on the important control variables. The more you have, the more difficult it will be to match people. Second, this approach is likely to be wasteful because you are unlikely to be able to match all the people in your original sample. Third, if someone drops out of one group during a test-retest design, then you have to disregard the results of the matching person in the other group.
In general, it is preferable to match the control and experimental groups but also to allocate the pairs into the two groups at random. This way, control variables that you had not considered are likely to be randomly allocated.
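Matching followed by random allocation within each pair can be sketched as below. The subject records, the choice of matching variables and the function name `matched_random_allocation` are all hypothetical; the sketch pairs only people with identical values on every control variable and sets aside anyone who cannot be matched.

```python
import random
from collections import defaultdict


def matched_random_allocation(subjects, keys, seed=None):
    """Pair subjects with identical control-variable values, then assign
    one member of each pair to each group at random.

    Returns (experimental, control, unmatched).
    """
    rng = random.Random(seed)
    strata = defaultdict(list)
    for subject in subjects:
        strata[tuple(subject[k] for k in keys)].append(subject)

    experimental, control, unmatched = [], [], []
    for members in strata.values():
        # Shuffling makes both the pairing and the group assignment random.
        rng.shuffle(members)
        if len(members) % 2:
            # An odd one out has no match and is lost to the study.
            unmatched.append(members.pop())
        for first, second in zip(members[0::2], members[1::2]):
            experimental.append(first)
            control.append(second)
    return experimental, control, unmatched


# Hypothetical subjects, invented for illustration only.
subjects = [
    {"id": 1, "gender": "F", "age": 18, "ethnicity": "white"},
    {"id": 2, "gender": "F", "age": 18, "ethnicity": "white"},
    {"id": 3, "gender": "M", "age": 19, "ethnicity": "black"},
    {"id": 4, "gender": "M", "age": 19, "ethnicity": "black"},
    {"id": 5, "gender": "F", "age": 20, "ethnicity": "white"},
]
experimental, control, unmatched = matched_random_allocation(
    subjects, keys=("gender", "age", "ethnicity"), seed=1
)
print(len(experimental), len(control), [s["id"] for s in unmatched])  # 2 2 [5]
```

The unmatched list makes the wastefulness of matching visible: the more control variables are used as keys, the more subjects end up without a partner.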
In the suggested risky shift experiment (see answer to Activity 9.1, part 3) a randomised allocation was suggested. As the ‘risky shift’ notion does not specify any other factors that are likely to have a bearing, it is difficult to determine other control variables for purposes of matching.
An experiment reported by Dixon et al. (1987) illustrated the steps in the experimental approach (CASE STUDY Healthy snacks).
Experiments are used to establish causal relations. Readings taken during an experiment are used to confirm or deny a very precise hypothesis that has been specified in advance. Experimentation is, thus, a research design rather than a data collection procedure.
In effect, an experiment occurs at a late stage of the research and is a practical test of a causal theory.
The researcher, by using the experiment, hopes to prove or disprove a hypothesis. Proof refers to the evidence that is collected and presented to establish a causal relationship rather than just a chance happening. The idea of proof contains the notion that there are ‘objective’ facts that can be verified using scientific methods borrowed from the natural sciences. Thus, the experiment is the ultimate positivist research design.
Survey research that adopts multivariate analysis attempts to reproduce the experimental approach. It does this not by physically controlling factors, as in the experiment, but by taking account of control variables when showing relationships between independent and dependent variables.
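One simple way of taking account of a control variable in survey data is to compare the groups separately within each category of that variable. The sketch below does this for the propaganda example, using gender as the control variable. The records, field names and the function `effect_within_strata` are hypothetical, and full multivariate analysis would normally use more elaborate statistical models than this.

```python
from collections import defaultdict
from statistics import mean


def effect_within_strata(records, control_key):
    """Difference in mean outcome (exposed minus unexposed) per stratum."""
    groups = defaultdict(lambda: {True: [], False: []})
    for record in records:
        groups[record[control_key]][record["exposed"]].append(record["support"])
    return {
        stratum: mean(cells[True]) - mean(cells[False])
        for stratum, cells in groups.items()
        if cells[True] and cells[False]  # need both groups to compare
    }


def make(gender, exposed, supports):
    """Build hypothetical survey records (1 = supports the war, 0 = does not)."""
    return [{"gender": gender, "exposed": exposed, "support": s} for s in supports]


# Hypothetical survey responses, invented for illustration only.
records = (
    make("M", True, [1, 1, 1, 0])
    + make("M", False, [1, 1, 0, 0])
    + make("F", True, [1, 0, 0, 0])
    + make("F", False, [0, 0, 0, 0])
)
print(effect_within_strata(records, "gender"))  # {'M': 0.25, 'F': 0.25}
```

In this invented data men are more supportive than women overall, yet the difference associated with exposure is the same within each gender, which is the kind of pattern that controlling for gender is meant to reveal.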
Although the experiment is the ‘classic’ positivistic approach, it has limited application in the social sciences, even for positivists. The aim of experiments is to allow a researcher to examine the effect that a change to one variable has upon another. In other words, the aim is to put the researcher in a position where she or he can control all the variables and measure and study them and their effects. The problem in the social sciences is that the researcher is unlikely either to be able to manipulate the independent variable or to control for all other relevant factors.
In practice, psychologists are more likely to use experiments than are other social scientists. Laboratory-type experiments are often used to simulate a situation to assess subjects’ psychological reactions.
For example, Haney et al. (1973) looked at the effect on individuals when they were given the role of either prisoner or guard in a simulated prison. Similarly, in 1961, Albert Bandura carried out a psychological experiment. He showed children an adult beating up an inflatable doll, then left each child with the doll to see what he or she would do. The children also threw punches at it. He concluded that we are inclined to copy violent behaviour, rather than find it cathartic.
In an experiment in 1986, designed to test reactions to pornography, Neil Malamuth and Joseph Ceniti recruited 42 male college students and assessed them on the ‘likelihood of rape’ scale. They then divided them randomly into three groups. The first was given a selection of sexually explicit materials containing scenes of rape and sadomasochism. The second was given non-violent pornography. The third group, the control group, was given non-pornographic material. Participants were studied over a 4-week period and were exposed to the media on ten separate occasions. In what the subjects thought was an unrelated experiment a week later, each of the men was paired up with a woman and told that she was not attracted to him. They then had to play a guessing game, with the man having an option to punish the woman each time she got the answer wrong.
The authors stated there was no difference in proclivity to rape between those exposed to violent pornography, non-violent pornography or non-pornographic media. Nor were any significant effects found on the desire to hurt females. In conclusion, neither violent nor non-violent pornography, viewed over time, affected aggressiveness toward females or the likelihood of rape.
From this and many other similar experiments, Malamuth and Ceniti (1986) concluded that if a man is already sexually aggressive and consumes a lot of sexually aggressive pornography, there is a greater likelihood that he will commit a sexually aggressive act. The authors thus argued against the simplistic thesis that pornography leads to rape, suggesting instead that, for some people, it may be viewed as a positive aspect of their life and does not lead them to engage in any form of anti-social behaviour, while for others who have several other risk factors it can be a contributing factor towards violent activity.
Reliability refers to the ability of a research instrument to measure phenomena in a consistent manner (see Section 1.9). In some circumstances the reliability of experiments in the social sciences has been called into question because the very fact that people are taking part in an experiment can affect their behaviour. This is known as the ‘Hawthorne effect’, named after studies done at the Hawthorne Works (Roethlisberger and Dickson, 1939).
The Hawthorne studies showed that people under observation do not behave normally but respond to experimental conditions, not the experimental stimulus. In the study at the Hawthorne Works, the workers increased productivity even when their work conditions had worsened. They worked harder because of what they saw as the interest shown in them by the researcher and because they thought the experiment would somehow lead to better working conditions in the factory. Thus they ignored the experimental stimulus and simply went on producing more because they were in an experimental situation.
This reaction to the experimental situation undermines the measurements made. The problem is not an unreliable measuring device: the measurements are unreliable because the context in which they are made keeps changing. In short, the meaning of measurements keeps changing even though the ‘objective’ measure is consistent. This calls into question the reliability of the outcomes of experiments involving human subjects.
Theoretically, the experiment is internally valid (see Section 126.96.36.199). In other words, if the idea of experimental logic is accepted, then, when all other possible factors have been controlled, changes in the dependent factor must be the result of manipulating the independent factor.
The internal validity of the research can only be called into question if the design is faulty, or the means of controlling other factors is inadequate. In practice, of course, design and control are a problem, which is the main reason why experiments are not often used by sociologists.
The unreliability (see Section 1.9) of some experiments is due to the problem of controlling for the impact of the experiment itself.
Experiments are regarded as having poor external validity (see Section 188.8.131.52). External validity refers to the applicability of the results in different circumstances. Experimental results in the social sciences are often regarded as having very little relevance for the natural setting outside the experiment (ecological validity (see Section 184.108.40.206)).
To what extent would the experimental sample of betting-shop punters make the same decisions if they were wagering their own money in a betting shop? Would Haney’s experimental prison guards still like the role if it wasn’t a simulated prison? Similarly, are the effects noted as the result of an experiment long term or short term? Experiments that attempt to show the effects on subjects of televised violence have been criticised for assuming that immediate reactions are the same as long-term effects.
External validity also relates to the extent to which experimental results can be generalised to a broader population (population validity). An experiment might work on a group of psychology students but does this mean that the results can be generalised to all students, or the whole population? If you are going to make claims about the population then you must conduct the experiment on a representative sample.
Christopher Ferguson and Richard Hartley (2009, p. 326) sum up the problems in relation to studies of the effect of pornography:
Results of these experimental research studies reveal that effects appear negligible, temporary and difficult to generalize to the real world. Studies such as these are also fraught with many limitations, some of which include validity issues with “aggression” measures, brief exposure times, complexities of correlating attitudes with behavior, and difficulties in generalizing results from college students to actual sexual offenders and rapists (Mould, 1988).
Suggested answer to Activity 9.1 part 3: A suitable design for the ‘risky shift’ experiment would be to ask a sample of punters to select the winners from a racecard of six races. The sample would be divided into two groups at random. Members of the first group would make their betting selections individually; the second group would be divided into subgroups of five individuals, each asked to make its decisions as a group. Riskiness would be judged by the combined odds of the selections made by each individual or subgroup. The average odds for the two samples would be tested to see if they were significantly different.
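The final comparison of average odds could be carried out in several ways. The sketch below uses a permutation test, which is one possible choice and is not specified in the suggested answer; the combined odds shown are hypothetical, and the function name `permutation_test` is an assumption made for illustration.

```python
import random
from statistics import mean


def permutation_test(sample_a, sample_b, n_permutations=5000, seed=None):
    """Two-sided permutation test for a difference in means.

    Returns the observed difference (mean of b minus mean of a) and the
    share of random relabellings that give a difference at least as large.
    """
    rng = random.Random(seed)
    observed = mean(sample_b) - mean(sample_a)
    pooled = list(sample_a) + list(sample_b)
    n_a = len(sample_a)
    extreme = 0
    for _ in range(n_permutations):
        # Relabel the units at random, as if group membership did not matter.
        rng.shuffle(pooled)
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if abs(diff) >= abs(observed):
            extreme += 1
    return observed, extreme / n_permutations


# Hypothetical combined odds, invented for illustration only.
individual_odds = [12.0, 8.5, 20.0, 15.0, 9.0, 11.0, 14.5, 10.0]  # one per punter
subgroup_odds = [30.0, 45.0, 28.0, 52.0, 38.0, 41.0]              # one per subgroup

observed, p_value = permutation_test(individual_odds, subgroup_odds, seed=1)
print(observed, p_value)
```

Note that the unit of analysis differs between the two samples: each individual contributes one value in the first, whereas each subgroup of five contributes one value in the second. A small p-value would suggest that the longer average odds chosen by the subgroups are unlikely to be a chance result.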