Module 2: Describing, Clarifying and Presenting Data
4. Summarising data
Let’s review the data of student marks used at the beginning of section 3 in this module – namely:
52, 64, 16, 48, 35, 52, 85, 96, 90, 87, 77, 78, 37, 68, 62, 60, 51, 55, 57, 64, 54, 51, 62, 43, 68, 71, 76, 68, 65, 83, 47, 44
This set of data represents univariate observations, that is, observations of a single variable.
Knowing the context of these data allows you to make some assumptions about them. For example, because these data are subject marks and the highest is 96, you might assume that the possible range for these marks is 0 to 100.
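An assumption like this can be checked against the observed values. The following is a minimal Python sketch, not part of the original module; the variable names are illustrative, and the 0 to 100 bounds are the assumption stated above rather than a known fact about the marking scheme.

```python
# Student marks from the start of this section.
marks = [52, 64, 16, 48, 35, 52, 85, 96, 90, 87, 77, 78, 37, 68, 62, 60,
         51, 55, 57, 64, 54, 51, 62, 43, 68, 71, 76, 68, 65, 83, 47, 44]

n = len(marks)        # number of observations
lowest = min(marks)   # smallest observed mark
highest = max(marks)  # largest observed mark

# Assumed bounds (0 to 100): consistent with the data, but not proven by it.
within_assumed_range = all(0 <= m <= 100 for m in marks)

print(n, lowest, highest, within_assumed_range)
```

Observed values that fit inside 0 to 100 are consistent with the assumption, but they do not confirm it; only the context (for example, the marking scheme) can do that.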
You can use sample statistics to examine the distribution of the data. In Module 1, a statistic was defined as a numerical characteristic (measure) of a sample. A parameter is a numerical characteristic (measure) of a population. Because the values of parameters are usually not known, a statistic is used to make inferences about the population. For this inference step (from sample to population) to be valid, the sample must be representative of the population.
What statistics can you generate that tell you something about the distribution of data?
Using statistics you can say something about:
1. the centre of the distribution of data
2. the spread of the distribution
3. the shape of the distribution
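As a rough illustration of the three aspects listed above, the sketch below computes one or more candidate statistics for each, using Python's standard `statistics` module. The choice of statistics (mean, median and mode for centre; range and sample standard deviation for spread; a moment-based skewness coefficient for shape) is an assumption made for this example, not a list prescribed by the module.

```python
import statistics

# Student marks from the start of this section.
marks = [52, 64, 16, 48, 35, 52, 85, 96, 90, 87, 77, 78, 37, 68, 62, 60,
         51, 55, 57, 64, 54, 51, 62, 43, 68, 71, 76, 68, 65, 83, 47, 44]
n = len(marks)

# 1. Centre: where the data tend to sit.
centre = {
    "mean": statistics.mean(marks),
    "median": statistics.median(marks),
    "mode": statistics.mode(marks),
}

# 2. Spread: how far the data are scattered.
spread = {
    "range": max(marks) - min(marks),
    "sample_sd": statistics.stdev(marks),  # divides by n - 1
}

# 3. Shape: one possible measure is the moment coefficient of skewness,
#    the average cubed deviation divided by the cubed population SD.
mean = centre["mean"]
sd_pop = statistics.pstdev(marks)  # divides by n
skewness = sum((x - mean) ** 3 for x in marks) / (n * sd_pop ** 3)

print(centre)
print(spread)
print(skewness)
```

A skewness value near zero suggests a roughly symmetric distribution, a negative value a longer tail towards low marks, and a positive value a longer tail towards high marks.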