Statistical Language - Frequency Distribution
What is a frequency distribution?
Frequency distributions are visual displays that organise and present frequency counts so that the information can be interpreted more easily.
Frequency distributions can show absolute frequencies or relative frequencies, such as proportions or percentages.
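As a minimal sketch of the difference between absolute and relative frequencies, the Python below converts counts into proportions and percentages. The function name `relative_frequencies`, the labels and the counts are invented for illustration; only the arithmetic (each count divided by the total) follows from the definition above.

```python
def relative_frequencies(counts):
    """Convert absolute counts into (proportion, percentage) pairs."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("cannot compute relative frequencies of an empty dataset")
    return {
        label: (count / total, 100 * count / total)
        for label, count in counts.items()
    }


# Hypothetical absolute frequencies (not taken from any real dataset).
counts = {"A": 7, "B": 13, "C": 30}
relative = relative_frequencies(counts)

for label, (proportion, percent) in relative.items():
    print(f"{label}: count={counts[label]}, "
          f"proportion={proportion:.2f}, percentage={percent:.0f}%")
```

The proportions always sum to 1 and the percentages to 100, which is a quick check that no observation has been dropped or counted twice.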
How do we show a frequency distribution?
A frequency distribution of data can be shown in a table or graph. Some common methods of showing frequency distributions include frequency tables, histograms and bar charts.
A frequency table is a simple way to display the number of occurrences of a particular value or characteristic.
For example, if we have collected data about height from a sample of 50 children, we could present our findings as:
Height of Children
|Height (cm) of children|
|120 – less than 130|
|130 – less than 140|
|140 – less than 150|
|150 – less than 160|
|160 – less than 170|
From this frequency table we can quickly see that 7 children (14% of all children) are in the 160 to less than 170 cm height range, and that more children have heights in the 140 to less than 150 cm range (26% of all children) than in any other height range.
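A frequency table of this kind could be built from raw measurements roughly as follows. This is a sketch under stated assumptions: the ten heights are invented (they are not the 50 children described above), and `frequency_table` is a hypothetical helper, not part of any library.

```python
def frequency_table(values, start, stop, width):
    """Count values in intervals 'lo to less than lo + width' from start to stop."""
    if not values:
        raise ValueError("no observations to tabulate")
    n_bins = (stop - start) // width
    counts = [0] * n_bins
    for v in values:
        if not start <= v < stop:
            raise ValueError(f"{v} is outside {start} to less than {stop}")
        # Integer division picks the interval; a value equal to a lower
        # bound (e.g. 140.0) falls in the interval that starts there.
        counts[int((v - start) // width)] += 1
    total = len(values)
    return [
        (start + i * width, start + (i + 1) * width, c, 100 * c / total)
        for i, c in enumerate(counts)
    ]


# Invented heights in cm, for illustration only.
heights_cm = [123.5, 131.0, 138.2, 140.0, 144.9, 147.3, 149.9, 152.4, 158.0, 165.1]
table = frequency_table(heights_cm, 120, 170, 10)

for lo, hi, count, percent in table:
    print(f"{lo} - less than {hi}: {count} ({percent:.0f}%)")
```

Each row carries both the absolute frequency (the count) and the relative frequency (the percentage), mirroring the two kinds of frequency described earlier.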
Data can also be presented in graphical form.
Histograms and bar charts are both visual displays of frequencies using columns plotted on a graph. The Y-axis (vertical axis) generally represents the frequency count, while the X-axis (horizontal axis) generally represents the variable being measured.
A histogram is a type of graph in which each column represents a range of values of a numeric variable, typically one that is continuous and/or grouped.
A histogram shows the distribution of all observations in a quantitative dataset. It is useful for describing the shape, centre and spread to better understand the distribution of the dataset.
Features of a histogram:
The height of the column shows the frequency for a specific range of values.
Columns are usually of equal width; however, a histogram may show data using unequal ranges (intervals) and therefore have columns of unequal width.
The values represented by each column must be mutually exclusive and exhaustive. Therefore, there are no spaces between columns and each observation can only ever belong in one column.
It is important that there is no ambiguity in the labelling of the intervals on the X-axis for continuous or grouped data (e.g. 0 to less than 10, 10 to less than 20, 20 to less than 30).
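The features listed above can be illustrated with a short Python sketch that assigns each observation to exactly one "lo to less than hi" interval, including intervals of unequal width. The edges, the values and the helper name `bin_counts` are assumptions made for this example. The final "frequency per unit width" figure is one common way of comparing columns of unequal width; it is an addition here, not something the features above prescribe.

```python
from bisect import bisect_right


def bin_counts(values, edges):
    """Count values in intervals [edges[i], edges[i + 1]); edges must be sorted."""
    counts = [0] * (len(edges) - 1)
    for v in values:
        # bisect_right places a value equal to an edge in the interval
        # that starts at that edge, so 10 belongs to "10 to less than 20".
        i = bisect_right(edges, v) - 1
        if i < 0 or i >= len(counts):
            raise ValueError(f"{v} is not covered by any interval")
        counts[i] += 1
    return counts


# Hypothetical interval edges (unequal widths) and observations.
edges = [0, 10, 20, 50]
values = [0, 9.9, 10, 15, 20, 35, 49]

counts = bin_counts(values, edges)
labels = [f"{lo} to less than {hi}" for lo, hi in zip(edges, edges[1:])]
widths = [hi - lo for lo, hi in zip(edges, edges[1:])]

for label, count, width in zip(labels, counts, widths):
    print(f"{label}: frequency={count}, per unit width={count / width:.2f}")
```

Because every value lands in one and only one interval, the counts add up to the number of observations, which is what "mutually exclusive and exhaustive" requires.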
The histogram below shows the same information as the frequency table.
A bar chart is a type of graph in which each column (plotted either vertically or horizontally) represents one category of a categorical variable or one value of a discrete ungrouped numeric variable.
It is used to compare the frequency (count) for one category or characteristic with that for another.
Features of a bar chart:
In a bar chart, the bar height (if vertical) or length (if horizontal) shows the frequency for each category or characteristic.
The distribution of the dataset is not important because the columns each represent an individual category or characteristic rather than intervals for a continuous measurement. Therefore, gaps are included between the bars, and the bars can be arranged in any order without affecting the data.
If data had been collected for 'country of birth' from a sample of children, a bar chart could be used to plot the data, as 'country of birth' is a categorical variable.
Birthplace of Children
|Country of Birth|
|United States of America|
The bar chart below shows us that 'Australia' is the most commonly observed country of birth of the 50 children sampled, while 'Fiji' is the least common country of birth.
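Separately from the chart described above, the counting behind a bar chart can be sketched in Python with `collections.Counter`. The sample of six responses is invented for illustration and is not the survey data; the text bars simply stand in for a plotted chart.

```python
from collections import Counter

# Invented 'country of birth' responses, for illustration only.
responses = ["Australia", "Fiji", "Australia", "India", "Australia", "India"]

# Counter tallies how often each category occurs.
counts = Counter(responses)

# Bars ordered from most to least frequent.
by_frequency = counts.most_common()

# The same bars in alphabetical order; the counts themselves do not change.
alphabetical = sorted(counts.items())

for country, n in by_frequency:
    print(f"{country:<10} {'#' * n} ({n})")
```

Reordering the categories changes only the layout of the chart, not the frequencies, which is why bar order is a presentation choice for categorical data.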