TECHNICAL NOTE
RELIABILITY OF THE ESTIMATES
1 The estimates in this publication are based on information obtained from a sample survey. Any data collection may encounter factors, known as non-sampling error, which can impact on the reliability of the resulting statistics. In addition, the reliability of estimates based on sample surveys are also subject to sampling variability. That is, the estimates may differ from those that would have been produced had all persons in the population been included in the survey. This is known as sampling error.
Non-sampling error
2 Non-sampling error may occur in any collection, whether it is based on a sample or a full count such as a census. Sources of non-sampling error include non-response, errors in reporting by respondents or recording of answers by interviewers and errors in coding and processing data. Every effort is made to reduce non-sampling error by careful design and testing of questionnaires, training and supervision of interviewers, and extensive editing and quality control procedures at all stages of data processing. It is not possible to quantify the non-sampling error.
Sampling error
3 One measure of sampling error is given by the standard error (SE), which indicates the extent to which an estimate might have varied by chance because only a sample of persons was included. There are about two chances in three (67%) that a sample estimate will differ by less than one SE from the number that would have been obtained if all persons had been surveyed, and about 19 chances in 20 (95%) that the difference will be less than two SEs.
4 Another measure of the likely difference is the relative standard error (RSE), which is obtained by expressing the SE as a percentage of the estimate. The RSE is a useful measure in that it provides an immediate indication of the percentage error likely to have occurred due to sampling and therefore avoids the need to also refer to the size of the estimate.
5 Only estimates (numbers or percentages) with RSEs less than 25% are considered sufficiently reliable for most analytical purposes. However, estimates with larger RSEs have been included. Estimates with an RSE in the range 25% to 50% should be used with caution while estimates with RSEs greater than 50% are considered too unreliable for general use. All cells in the Excel spreadsheets with RSEs greater than 25% have been annotated and footnoted.
6 Another measure of sampling error is the Margin of Error (MOE), which describes the distance from the population value that the sample estimate is likely to be within, and is specified at a given level of confidence. Confidence levels typically used are 90%, 95% and 99%. For example, at the 95% confidence level the MOE indicates that there are about 19 chances in 20 that the estimate will differ by less than the specified MOE from the population value (the figure obtained if all dwellings had been enumerated). The 95% MOE is calculated as 1.96 multiplied by the SE.
7 The 95% MOE can also be calculated from the RSE by:
8 The MOEs in this publication are calculated at the 95% confidence level. This can easily be converted to a 90% confidence level by multiplying the MOE by:
or to a 99% confidence level by multiplying by a factor of:
9 A confidence interval expresses the sampling error as a range in which the population value is expected to lie at a given level of confidence. The confidence interval can easily be constructed from the MOE of the same level of confidence by taking the estimate plus or minus the MOE of the estimate.
10 Estimates of proportions with an MOE more than 10% are annotated to indicate they are subject to high sample variability and particular consideration should be given to the MOE when using these estimates. Depending on how the estimate is to be used, an MOE greater than 10% may be considered too large to inform decisions. In addition, estimates with a corresponding standard 95% confidence interval that includes 0% or 100% are annotated to indicate they are usually considered unreliable for most purposes. |
11 The Excel spreadsheets in the Downloads tab contain all the tables produced for this release and the calculated RSEs and/or MOEs for each of the estimates.
Calculations of standard errors
12 Standard errors can be calculated using the estimates (counts or percentages) and the corresponding RSEs. See
What is a Standard Error and Relative Standard Error, Reliability of estimates for Labour Force data for more details.
Standard errors of proportions and estimates
13 Proportions and percentages formed from the ratio of two estimates are also subject to sampling errors. The size of the error depends on the accuracy of both the numerator and the denominator. A formula to approximate the RSE of a proportion is given below. This formula is only valid when x is a subset of y:
Comparisons of estimates
14 The difference between two survey estimates (counts or percentages) can also be calculated from published estimates. Such an estimate is also subject to sampling error. The sampling error of the difference between two estimates depends on their SEs and the relationship (correlation) between them. An approximate SE of the difference between two estimates (x-y) may be calculated by the following formula:
15 While this formula will only be exact for differences between separate and uncorrelated characteristics or sub populations, it provides a good approximation for the differences likely to be of interest in this publication.
Significance testing
16 A statistical significance test for a comparison between estimates can be performed to determine whether it is likely that there is a difference between the corresponding population characteristics. The standard error of the difference between two corresponding estimates (x and y) can be calculated using the formula shown above in the Comparison of estimates section. This standard error is then used to calculate the following test statistic:
where
17 If the value of this test statistic is greater than 1.96 then there is evidence, with a 95% level of confidence, of a statistically significant difference in the two populations with respect to that characteristic. Otherwise, it cannot be stated with confidence that there is a real difference between the populations with respect to that characteristic.