# Australian Bureau of Statistics

 Understanding statistics

Module 3: Interpreting Data

6. What about causation in observational studies? Beware the confounding variable!

6.1 Confusion of scales

There is a tendency to treat qualitative variables that are measured using an ordinal scale of measurement as though they are measured on an interval/ratio scale. The reason is that relationships involving them can be examined using correlation coefficients.

An early observational study compared the results from three studies to examine the relationship between smoking habits and death rates (Cochrane 1968). These data are presented in the scenario below.

 Scenario

Test your knowledge

Question: After you have examined these data on death rates and smoking habits, what recommendations would you make to cigar and pipe smokers?

Answer:
• These data are very convincing. You need to give up smoking.
• If you can't give up smoking, you need to switch to cigarettes.
• I can't give you any advice right now. I need to know more about how these data were collected.

Test your knowledge

Question: Is there a variable that is strongly related to dying? What do you think this variable is?

Answer:
• sex
• educational attainment
• age
• cultural group

Confounding variables tend to be more of a problem in observational studies than they are in experimental studies because in experiments it is possible to control for confounding variables. In general, the presence of a confounding variable can lead to misinterpretation when data are aggregated over the different values of the confounding variable.

Sometimes data from several groups, like separate age groups in the previous example, are combined to form a single group as in Figure 6.1.

Figure 6.1

This can happen with categorical and quantitative variables and it can lead to faulty reasoning. For example, the data in Table 6.1 simulate those that may be obtained from a university Admissions Centre (these data are artificial but reflect research in this area (Bickel & O'Connell 1975)). The data are acceptance rates for post-graduate study for males and females.

| | Males | Females |
|---|---|---|
| Admitted to study | 40 | 15 |
| Not admitted | 50 | 35 |
| Total | 90 | 50 |

Table 6.1

If you convert these data to percentages, what do you observe?

| | Males % | Females % |
|---|---|---|
| Admitted to study | 44 | 30 |
| Not admitted | 56 | 70 |
| Total | 100 | 100 |

Table 6.2
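The conversion from counts to column percentages can be sketched in Python. This is only an illustration using the counts from Table 6.1; the names `counts` and `column_percentages` are hypothetical and are not part of the original module.

```python
# Counts from Table 6.1, keyed by sex and then by admission outcome.
counts = {
    "Males": {"Admitted to study": 40, "Not admitted": 50},
    "Females": {"Admitted to study": 15, "Not admitted": 35},
}


def column_percentages(table):
    """Express each cell as a percentage of its column (sex) total."""
    result = {}
    for group, cells in table.items():
        total = sum(cells.values())
        result[group] = {
            outcome: 100 * n / total for outcome, n in cells.items()
        }
    return result


percentages = column_percentages(counts)
for group, cells in percentages.items():
    for outcome, pct in cells.items():
        print(f"{group:8s} {outcome:18s} {pct:5.1f}%")
```

Rounding the printed values to whole numbers gives the percentages reported in Table 6.2.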

These percentages appear to indicate that the university admits a higher proportion of males than females. This would seem to indicate that females do not get a fair chance when they are seeking admission to the university. However, the university says that its figures do not indicate that it is acting unfairly towards females. Table 6.3 presents the data for enrolments within programs of study.

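| Program | Outcome | Males | Females |
|---|---|---|---|
| Science/Engineering | Admitted | 35 (50%) | 5 (50%) |
| Science/Engineering | Not admitted | 35 (50%) | 5 (50%) |
| Science/Engineering | Total | 70 | 10 |
| English/History | Admitted | 5 (25%) | 10 (25%) |
| English/History | Not admitted | 15 (75%) | 30 (75%) |
| English/History | Total | 20 | 40 |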

Table 6.3

Note that, within each program, the university admitted the same proportion of males and females. By dividing the data into programs of study, the university was able to show that the Science/Engineering program admitted half of its applicants, both male and female, and the English/History program admitted a quarter of both females and males. The difference in the combined figures arises because most females (40 of 50) applied to English/History, which admitted only a quarter of its applicants, whereas most males (70 of 90) applied to Science/Engineering, which admitted half.
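The contrast between the combined figures and the program-level figures can be checked with a short Python sketch. It uses the counts from Table 6.3; the variable and function names (`applications`, `admission_rate`, `within_program`, `pooled`, `share`) are hypothetical and chosen only for this illustration.

```python
# Counts from Table 6.3 as (admitted, not admitted),
# keyed by program of study and then by sex.
applications = {
    "Science/Engineering": {"Males": (35, 35), "Females": (5, 5)},
    "English/History": {"Males": (5, 15), "Females": (10, 30)},
}
SEXES = ("Males", "Females")


def admission_rate(admitted, not_admitted):
    """Proportion of applicants who were admitted."""
    return admitted / (admitted + not_admitted)


# Stratified view: admission rate for each sex within each program.
within_program = {
    program: {sex: admission_rate(*by_sex[sex]) for sex in SEXES}
    for program, by_sex in applications.items()
}

# Aggregated view: pool the two programs before computing the rate.
pooled = {}
for sex in SEXES:
    admitted = sum(by_sex[sex][0] for by_sex in applications.values())
    not_admitted = sum(by_sex[sex][1] for by_sex in applications.values())
    pooled[sex] = admission_rate(admitted, not_admitted)

# Where each sex applied: share of its applications going to each program.
share = {}
for sex in SEXES:
    total = sum(sum(by_sex[sex]) for by_sex in applications.values())
    share[sex] = {
        program: sum(by_sex[sex]) / total
        for program, by_sex in applications.items()
    }

for program, rates in within_program.items():
    print(program, {sex: f"{rate:.0%}" for sex, rate in rates.items()})
print("Pooled", {sex: f"{rate:.0%}" for sex, rate in pooled.items()})
print("Share of applications", share)
```

The rates within each program are identical for males and females, while the pooled rates differ. The `share` dictionary shows that the two sexes applied to the programs in very different proportions, and it is this difference that the pooled figures reflect.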

Test your knowledge

Question: Do these data indicate any relationship between sex and university admission decisions? Examine both tables and indicate which variable was reported in Table 6.3 but not in Table 6.2.

Answer:
• Sex
• University admission
• Age
• Program of study

This example is a classic case of Simpson's Paradox, named after E. H. Simpson, who identified this phenomenon as part of a wider discussion of conditional probability. Simpson's paradox occurs when the aggregate effect is in the opposite direction to the parts.

In some states of the USA convicted murderers can be sentenced to death. People have argued that whether a convicted murderer is sentenced to death or not depends on their race. Table 6.4 shows the results of one study on race and the death sentence (Radelat 1981).

| | Sentenced to Death | Not Sentenced to Death | Total |
|---|---|---|---|
| White Defendant | 19 (11.9%) | 141 | 160 |
| Black Defendant | 17 (10.2%) | 149 | 166 |

Table 6.4 Defendant's Race and the Death Sentence

The data in Table 6.4 indicate that 19/160 = 11.9% of white defendants were sentenced to death and 17/166 = 10.2% of black defendants were sentenced to death. On these data, there is little difference between the percentages of black and white defendants being sentenced to death. This seems counterintuitive given what is known about racial discrimination in the USA. When more data are available, such as the race of the victim, the data appear as in Table 6.5:

| Race of victim | Race of defendant | Sentenced to death | Not sentenced to death |
|---|---|---|---|
| White | White | 19 | 132 |
| White | Black | 11 | 52 |
| Black | White | 0 | 9 |
| Black | Black | 6 | 97 |

Table 6.5 Death Sentence in Reference to Race of Victim

Let's consider the 'sentencing to death' data in Table 6.5.

First, if the victim is white.

• When a victim is white, a white defendant has a 19/151 = 12.6% chance of being sentenced to death
• When a victim is white, a black defendant has a 11/63 = 17.5% chance of being sentenced to death

Second, if the victim is black.

• When a victim is black, a white defendant has a 0/9 = 0% chance of being sentenced to death
• When a victim is black, a black defendant has a 6/103 = 5.8% chance of being sentenced to death
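The rates listed above can be reproduced with a short Python sketch. It uses the counts from Table 6.5 and takes each rate as the number sentenced to death divided by all defendants in that group (sentenced plus not sentenced). The names `cases`, `death_sentence_rate`, `by_victim` and `overall` are hypothetical and used only for this illustration.

```python
# Counts from Table 6.5 as (sentenced to death, not sentenced to death),
# keyed by race of victim and then by race of defendant.
cases = {
    "White victim": {"White defendant": (19, 132), "Black defendant": (11, 52)},
    "Black victim": {"White defendant": (0, 9), "Black defendant": (6, 97)},
}
DEFENDANTS = ("White defendant", "Black defendant")


def death_sentence_rate(sentenced, not_sentenced):
    """Proportion of defendants in a group who were sentenced to death."""
    return sentenced / (sentenced + not_sentenced)


# Stratified view: rate for each defendant group, given the victim's race.
by_victim = {
    victim: {d: death_sentence_rate(*by_defendant[d]) for d in DEFENDANTS}
    for victim, by_defendant in cases.items()
}

# Aggregated view: ignore the victim's race, as in Table 6.4.
overall = {}
for d in DEFENDANTS:
    sentenced = sum(by_defendant[d][0] for by_defendant in cases.values())
    not_sentenced = sum(by_defendant[d][1] for by_defendant in cases.values())
    overall[d] = death_sentence_rate(sentenced, not_sentenced)

for victim, rates in by_victim.items():
    print(victim, {d: f"{rate:.1%}" for d, rate in rates.items()})
print("All victims", {d: f"{rate:.1%}" for d, rate in overall.items()})
```

Within each victim group the rate for black defendants is the higher one, while the rates computed over all victims point slightly the other way.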

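Clearly, taking the variable “race of victim” into account reveals differences in the chance of being sentenced to death that were not visible in Table 6.4: whatever the race of the victim, the black defendant has a higher chance of being sentenced to death. The combined figures hid this because death sentences were far more common when the victim was white (30 of 214 cases) than when the victim was black (6 of 112 cases), and almost all white defendants (151 of 160) had white victims. The race (ethnicity) of the victim is therefore a confounding variable.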

Tables 6.4 and 6.5 indicate that there existed a confounding variable that was hidden when all crimes for which the death penalty was invoked were combined and then compared against the race of the defendant.

Test your knowledge

Question: Can you identify the confounding variable in this example?

Answer:
• whether the defendant was sentenced to death
• the age of the defendant
• the race of the defendant
• the race of the victim

There may be other factors that are instrumental in the sentencing decision. If the circumstances of the crime are taken into account, it is found that black homicides tend to occur during disputes between people who know each other, whereas white homicides are more likely to occur when the defendant is committing another type of crime, such as robbery. The latter circumstances are considered 'aggravating' and thus increase the likelihood that the defendant, regardless of race, will be sentenced to death. Although a more recent study (Seligman 1994) has demonstrated that a higher proportion of white defendants is sentenced to death, other researchers (Rothman & Powers 1994) have argued that currently there is no racial bias in the death sentence.

Test your knowledge

Question: In the above paragraph, which variable is a possible confounding variable in the proportion of white and black defendants sentenced to death?

Answer:
• Race of victim
• Race of defendant
• Circumstances of the crime
• The type of crime