Australian Bureau of Statistics


Teacher Statistical Literacy


Concepts and definitions

The table below is a list of the concepts covered in each section.

    Statistics

    Statistics are numerical data that have been organised to serve a useful purpose. A major role of the ABS is to provide the Australian community with statistics that will help them make informed decisions. Statistical information provided by the ABS is used widely in Australia by governments, business people, researchers, members of the public, teachers and students.

    Data
    Data are observations or facts which, when collected, organised and evaluated, become information or knowledge.

    Data item
    A data item is the smallest piece of information that can be obtained from a survey or census.

    Dataset
    A dataset is data collected for a particular study. A dataset represents a collection of elements; and for each element, information on one or more characteristics is included.

    Outliers
    An outlier is an extreme value of the data. It is an observation value that is significantly different from the rest of the data. There may be more than one outlier in a set of data.
    Sometimes, outliers are significant pieces of data and should not be ignored. In other instances, they occur as a result of an error or misinformation and should be ignored. The decision to include or exclude an outlier needs to be clearly justified when discussing results.

    Example:
    The weights (in kilograms) of 30 students were measured and recorded in the stem and leaf plot shown in Figure 1. In this plot, the stem is the whole-number part of each weight and the leaf is the decimal digit. The outliers are 56.3 and 65.7.






















    Stem | Leaf
    56   | 3
    57   |
    58   | 4 4 9
    59   | 0 0 2 3 8
    60   | 0 2 4 5 7 8 9
    61   | 1 2 4 4 5 6 7 9 9
    62   | 1 2 3 7
    63   |
    64   |
    65   | 7

    Fig 1 Stem and leaf plot
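
    The effect of including or excluding outliers can be checked directly. The following is a minimal sketch, assuming the weights are read from the stem and leaf plot in Fig 1 and that the two outliers are the smallest and largest observations. The helper name describe is illustrative only, and the mean and range are used here simply as two convenient summaries.

```python
from statistics import mean

# Weights (kg) read from the stem and leaf plot in Fig 1:
# each stem is the whole-kilogram part, each leaf is one decimal digit.
plot = {
    56: [3],
    57: [],
    58: [4, 4, 9],
    59: [0, 0, 2, 3, 8],
    60: [0, 2, 4, 5, 7, 8, 9],
    61: [1, 2, 4, 4, 5, 6, 7, 9, 9],
    62: [1, 2, 3, 7],
    63: [],
    64: [],
    65: [7],
}

# Rebuild each weight from integer tenths, e.g. stem 58, leaf 4 -> 58.4.
weights = sorted(
    (stem * 10 + leaf) / 10
    for stem, leaf_list in plot.items()
    for leaf in leaf_list
)

# Assumption: the two outliers are the smallest and largest observations.
trimmed = weights[1:-1]


def describe(values):
    """Return (count, mean, range) for a list of numbers."""
    return len(values), mean(values), max(values) - min(values)


for label, values in (("all data", weights), ("outliers excluded", trimmed)):
    n, avg, spread = describe(values)
    print(f"{label:18s} n={n:2d}  mean={avg:.2f} kg  range={spread:.1f} kg")
```

    For these data, excluding the two extreme values barely changes the mean but more than halves the range. This is the kind of effect to report when justifying the decision.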

    Frequency and distribution

    The frequency (f) of a particular observation is the number of times the observation occurs in the data.

    Cumulative frequency
    Cumulative frequency is the total of a frequency and all frequencies below it in a frequency distribution. It is the running total of frequencies.

    Relative frequency
    Relative frequency is another term for proportion. It is the number of times a particular observation occurs divided by the total number of observations.
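
    As a rough illustration of the three definitions above, the sketch below starts from the frequencies in the car-registration example (Fig 2 under Graphs and displays) and derives the cumulative and relative frequencies. The variable names are illustrative only.

```python
from itertools import accumulate

# Frequencies from the car-registration example (Fig 2):
# number of cars -> number of households.
frequency = {0: 4, 1: 6, 2: 5, 3: 3, 4: 2}

values = sorted(frequency)
counts = [frequency[v] for v in values]
total = sum(counts)

# Cumulative frequency: the running total of the frequencies.
cumulative = list(accumulate(counts))

# Relative frequency: each frequency divided by the total number of observations.
relative = [c / total for c in counts]

print(f"{'x':>2} {'f':>3} {'cum f':>6} {'rel f':>6}")
for x, f, cf, rf in zip(values, counts, cumulative, relative):
    print(f"{x:>2} {f:>3} {cf:>6} {rf:>6.2f}")
```

    The last cumulative frequency equals the total number of observations (20), and the relative frequencies sum to 1.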

    Distribution
    The distribution of a variable is the pattern of values of the observations.








    Graphs and displays

    Graph
    A graph is a diagram representing a system of connections or interrelations among two or more variables by a number of distinctive dots, lines, bars, etc.

    Chart
    A chart is a visual representation of data. Bar, line and pie charts are common examples.

    Box and whisker plots (often called ‘box plots’) can be used to show the interquartile range. Figure 1 shows a box and whisker plot of student ages.
    Notice that a scale is drawn underneath. Box plots can be drawn horizontally or vertically.
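
    In its simplest form, a box plot is drawn from five values: the minimum, the lower quartile, the median, the upper quartile and the maximum. The sketch below computes them for the book counts used in the stem and leaf example later in this section, since the student ages behind Fig 1 are not listed. It assumes Python's default "exclusive" quartile method. Other quartile conventions exist and can give slightly different values.

```python
from statistics import quantiles

# Books read by ten students (from the stem and leaf example in this section).
books = [12, 23, 19, 6, 10, 7, 15, 25, 21, 12]

# Assumption: quartiles use Python's default "exclusive" method;
# other conventions give slightly different values.
q1, median, q3 = quantiles(books, n=4)

five_number_summary = {
    "minimum": min(books),
    "lower quartile (Q1)": q1,
    "median": median,
    "upper quartile (Q3)": q3,
    "maximum": max(books),
}

# The box spans Q1 to Q3, so its length is the interquartile range.
interquartile_range = q3 - q1

for name, value in five_number_summary.items():
    print(f"{name:20s} {value}")
print(f"{'interquartile range':20s} {interquartile_range}")
```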

    Frequency distribution tables can be used for nominal and numeric variables.

    Example:
    Twenty people were asked how many cars were registered to their households. The results were recorded as follows: 1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0. This data can be presented in a frequency distribution table – see Figure 2.

    Stem and leaf plots are a convenient way to organise data. Each observation value is considered to consist of two parts - a stem and a leaf.

    • the stem is the first digit or digits
    • the leaf is the final digit

    Example:
    The numbers of books read by ten students in one year were as follows: 12, 23, 19, 6, 10, 7, 15, 25, 21, 12.
    In ascending order, these are: 6, 7, 10, 12, 12, 15, 19, 21, 23, 25. Figure 3 is a stem and leaf plot of this data.

    In the stem and leaf plot (Fig 3):

    • the stem '0' represents the class interval 0-9
    • the stem '1' represents the class interval 10-19
    • the stem '2' represents the class interval 20-29.

    If there are a large number of observations for each stem, the stem can be split in two. For example the interval 0-9 could be split into intervals 0-4 and 5-9. The stem would then be written as 0(0) and 0(5).
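
    The construction described above can be sketched in a few lines. This is an illustrative helper, not an ABS tool. The function name stem_and_leaf is hypothetical, and the sketch assumes non-negative whole numbers and unsplit stems.

```python
from collections import defaultdict


def stem_and_leaf(values):
    """Return {stem: [leaves]}, where the leaf is the final digit of each value."""
    plot = defaultdict(list)
    for value in sorted(values):
        stem, leaf = divmod(value, 10)  # e.g. 23 -> stem 2, leaf 3
        plot[stem].append(leaf)
    # Include empty stems so that gaps in the data stay visible.
    return {stem: plot.get(stem, []) for stem in range(min(plot), max(plot) + 1)}


books = [12, 23, 19, 6, 10, 7, 15, 25, 21, 12]
plot = stem_and_leaf(books)

print("Stem | Leaf")
for stem, leaves in plot.items():
    print(f"{stem:>4} | {' '.join(str(leaf) for leaf in leaves)}")
```

    For the book data this gives stems 0, 1 and 2 with leaves 6 7, 0 2 2 5 9 and 1 3 5, matching Fig 3.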

    Time series
    A time series is a collection of observations of well-defined data items obtained through repeated measurements over time. For example, measuring the value of retail sales each month of the year would comprise a time series.

    Trend
    The ABS defines a trend as the long-term movement in a time series without calendar-related and irregular effects. It reflects the underlying change in that measure and is the result of influences such as population growth, price inflation and general economic changes.

    Fig 1 Box and whisker plot






    Number of cars (x)   Tally      Frequency (f)
    0                    ||||       4
    1                    ||||| |    6
    2                    |||||      5
    3                    |||        3
    4                    ||         2

    Fig 2 Frequency distribution table
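
    A frequency distribution table such as Fig 2 can also be produced from the raw observations. The sketch below counts the twenty responses from the car-registration example. The tally helper is hypothetical and simply draws marks in groups of five.

```python
from collections import Counter

# Number of cars registered to each of the twenty households.
cars = [1, 2, 1, 0, 3, 4, 0, 1, 1, 1, 2, 2, 3, 2, 3, 2, 1, 4, 0, 0]

# Count how many times each value occurs.
frequency = Counter(cars)


def tally(count):
    """Render a count as tally marks in groups of five."""
    groups, remainder = divmod(count, 5)
    return " ".join(["|||||"] * groups + (["|" * remainder] if remainder else []))


print(f"{'Number of cars (x)':<20}{'Tally':<10}{'Frequency (f)'}")
for x in sorted(frequency):
    print(f"{x:<20}{tally(frequency[x]):<10}{frequency[x]}")
print(f"{'Total':<30}{sum(frequency.values())}")
```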




    Stem | Leaf
    0    | 6 7
    1    | 0 2 2 5 9
    2    | 1 3 5

    Fig 3 Stem and leaf plot



    List of items in each category


    Commonwealth of Australia 2008

    Unless otherwise noted, content on this website is licensed under a Creative Commons Attribution 2.5 Australia Licence together with any terms, conditions and exclusions as set out in the website Copyright notice. For permission to do anything beyond the scope of this licence and copyright terms contact us.