1304.5 - Stats Talk WA, Sep 2010  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 29/09/2010   

A testing relationship

Chai latte or chi-square? I guess that depends upon whether you want a new age frou frou refreshing drink or a statistic.

Assuming you would like the statistic, you may be asking what’s chi-square all about?

A chi-square (pronounced ‘kai’) is a statistical test which even comes with a groovy symbol, χ². A chi-square test checks whether two variables are related or independent of each other. These variables might be sex and beverage choice, or income and suburb choice.

It is about checking whether the results are what we might expect if there were no relationship. One could ask whether sex is related to beverage preference. As sex and beverage preference are categorical variables (they can only take a set number of values), we can test this by using a chi-square test. Some may argue, however, that there is a definite link between alcohol and sex, but I digress.

Firstly, we calculate a chi-square test statistic. This is a measure of how different the data we observe are from what we would expect to observe if the variables were truly independent. The higher the test statistic, the stronger the evidence that the data we observe did not come from independent variables.
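For the curious, here is a minimal sketch of that calculation in Python. It assumes the usual Pearson form of the statistic (sum over all cells of (observed - expected)² / expected), and the counts are made up purely for illustration; the function name is ours, not anything official.

```python
def chi_square_statistic(observed):
    """Return (statistic, expected counts) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    grand_total = sum(row_totals)

    # What we would expect in each cell if the two variables were independent:
    # row total * column total / grand total
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Add up the squared differences, each scaled by its expected count
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(observed, expected)
        for o, e in zip(obs_row, exp_row)
    )
    return statistic, expected


# Made-up counts: rows are men and women, columns are chai latte and flat white
observed = [[20, 30], [30, 20]]
statistic, expected = chi_square_statistic(observed)
print(statistic)  # 4.0
```

With these invented counts every cell would be expected to hold 25 people under independence, and each cell is 5 away from that, which gives a statistic of 4.0.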


That’s where the chi-square distribution comes in. We may observe data that give us a high test statistic just by chance, but the chi-square distribution shows us how likely that is. The chi-square distribution takes slightly different shapes depending on its degrees of freedom, which are set by how many categories our variables have. (I often wonder if a teenage girl’s degree of freedom is related to the age of her boyfriend and whether he has a car.)
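As a rough sketch of "how likely", the Python below works out the chance of seeing a test statistic at least as large as a given value when the variables really are independent. It assumes a table of counts, for which the degrees of freedom are (rows - 1) × (columns - 1), and it uses only the standard library; the function name and the example value of 4.0 are ours, chosen for illustration.

```python
import math


def chi_square_p_value(statistic, df):
    """Chance of a chi-square value at least this large, for whole-number df."""
    if df < 1 or statistic < 0:
        raise ValueError("df must be at least 1 and the statistic non-negative")
    half = statistic / 2

    # Known closed forms for the two smallest cases
    if df % 2 == 1:
        k, p = 1, math.erfc(math.sqrt(half))
    else:
        k, p = 2, math.exp(-half)

    # Step up two degrees of freedom at a time using the
    # incomplete-gamma recurrence until we reach the df we want
    while k < df:
        p += half ** (k / 2) * math.exp(-half) / math.gamma(k / 2 + 1)
        k += 2
    return p


# A 2 x 2 table has (2 - 1) * (2 - 1) = 1 degree of freedom
print(round(chi_square_p_value(4.0, 1), 4))  # 0.0455
```

In other words, with one degree of freedom a statistic of 4.0 or more would turn up by chance alone only about 4.6% of the time.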

Interestingly, when the degrees of freedom get very large, the shape begins to look like the bell curve we know and love. This is a property shared by the tea-distribution... ahem... t-distribution.

So if the difference between what we observe and what we expect from independent variables is large (that is, the chi-square distribution tells us it is unlikely to be that large just by chance), then we reject the null hypothesis that the two variables are independent. What a shame we can’t do a chi-square test on couples before they get married; I know of a lot of null hypotheses that would have been rejected.

Instead, we favour the alternative that there is a relationship between the variables. So, like a mother who is always looking into your relationships, chi-square can help us discover that there is a relationship, but it cannot look too deeply into what that relationship is (unlike your mother).
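To make the decision rule concrete, here is a small Python sketch. It assumes we are happy to be wrong 5% of the time (a common convention, but our assumption here, not a rule from above), and it compares the test statistic with the standard 5% critical values from a chi-square table. The function name and example statistics are invented for illustration.

```python
# Standard 5% critical values of the chi-square distribution,
# keyed by degrees of freedom (only the first few are listed here)
CRITICAL_VALUES_5_PERCENT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488}


def reject_independence(statistic, rows, columns):
    """True if the statistic is too large to put down to chance at the 5% level."""
    df = (rows - 1) * (columns - 1)  # degrees of freedom for a table of counts
    return statistic > CRITICAL_VALUES_5_PERCENT[df]


print(reject_independence(4.0, 2, 2))  # True: favour a relationship
print(reject_independence(1.2, 2, 2))  # False: no evidence against independence
```

Note that a True here only says a relationship probably exists; it says nothing about what kind of relationship it is.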

So who does these sorts of tests? The ABS does. In fact, if you are interested, there are some papers you could read, such as Measuring and Correcting for Information Loss in Confidentialised Census Counts (cat. no. 1352.0.55.083) or Patterns of Innovation in Australian Businesses, 2005 (cat. no. 8163.0).

We are not the only ones: universities and health authorities do chi-square tests too, for example to test the relationship between smoking and low birth weight in babies.

Anything else you need to know? I guess that depends on what you are looking to find out! Have a look at the ABS website or contact your (or our) friendly methodologist for more information. Personally I am off to make a chai latte.

Brian Pink Australian Statistician

Naomi Summers
Who actually prefers a flat white.