Standard errors and replicate weights
2 Sampling error occurs because only a small proportion of the total population is used to produce estimates that represent the whole population. Sampling error can be reliably measured, as it is calculated based on the scientific methods used to design surveys.
3 Non-sampling error may occur in any data collection, whether it is based on a sample or a full-count (i.e. Census). Non-sampling error may occur at any stage throughout the survey process. Examples include:
4 More detailed information on sample survey errors, including sampling error, non-sampling error and response rates, is provided in the Data quality chapter.
5 Sampling error is the expected difference that could occur between the published estimates, derived from repeated random samples of persons, and the value that would have been produced if all persons in scope of the survey had been included. The magnitude of the sampling error associated with an estimate depends on the sample design, sample size and population variability.
Measures of sampling error
6 A measure of the sampling error for a given estimate is provided by the Standard Error (SE), which is the extent to which an estimate might have varied by chance because only a sample of persons was obtained.
7 Another measure is the Relative Standard Error (RSE), which is the SE expressed as a percentage of the estimate. This measure provides an indication of the percentage errors likely to have occurred due to sampling.
8 Another measure is the Margin of Error (MoE), which describes the distance from the population value that the sample estimate is likely to be within, and is specified at a given level of confidence. Confidence levels typically used are 90%, 95% and 99%. For example, at the 95% confidence level the MoE indicates that there are about 19 chances in 20 that the estimate will differ by less than the specified MoE from the population value (the figure obtained if all dwellings had been enumerated). The 95% MoE is calculated as 1.96 multiplied by the SE.
9 The 95% MoE can also be calculated from the RSE by: MoE(y) = RSE(y)*y/100*1.96.
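The relationships between the SE, RSE and 95% MoE described above can be sketched in a few lines of Python. This is a minimal illustration only; the function names are hypothetical and are not part of any survey product.

```python
# Multiplier for the 95% confidence level, as stated in the text.
Z_95 = 1.96


def rse_from_se(se, estimate):
    """RSE: the SE expressed as a percentage of the estimate."""
    return se / estimate * 100.0


def moe_from_se(se, z=Z_95):
    """MoE at the confidence level implied by z (95% by default)."""
    return z * se


def moe_from_rse(rse_percent, estimate, z=Z_95):
    """MoE derived from an RSE given in per cent: first recover the SE, then scale by z."""
    se = rse_percent / 100.0 * estimate
    return moe_from_se(se, z)
```

For example, `moe_from_rse(4.5, 17.0)` gives a 95% MoE of roughly 1.5 for an estimate of 17.0 with an RSE of 4.5%.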
10 The MoEs published in the National Health Survey: First Results, 2017-18 (cat. no. 4364.0.55.001) are calculated at the 95% confidence level. These can be converted to a 90% confidence level by multiplying the MoE by 1.645/1.96, or to a 99% confidence level by multiplying by 2.576/1.96.
11 A confidence interval expresses the sampling error as a range in which the population value is expected to lie at a given level of confidence. The confidence interval can easily be constructed from the MoE of the same level of confidence by taking the estimate plus or minus the MoE of the estimate.
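As a rough sketch, the conversion between confidence levels and the construction of a confidence interval could be written as follows. The multipliers are taken from the standard normal distribution (about 1.645, 1.96 and 2.576 for 90%, 95% and 99%); the function names are hypothetical.

```python
from statistics import NormalDist


def z_value(confidence):
    """Two-sided standard normal multiplier, e.g. about 1.96 for 0.95."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


def convert_moe(moe, from_conf=0.95, to_conf=0.90):
    """Rescale an MoE from one confidence level to another."""
    return moe * z_value(to_conf) / z_value(from_conf)


def confidence_interval(estimate, moe):
    """Estimate plus or minus the MoE at the same confidence level."""
    return estimate - moe, estimate + moe
```

For an estimate of 17.0% with a 95% MoE of 1.5 percentage points, `confidence_interval(17.0, 1.5)` gives the interval 15.5% to 18.5%.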
Examples of interpretation of sampling error
12 Standard errors can be calculated using the estimates and the corresponding RSEs. For example, in the 2017-18 NHS the estimated proportion of males aged 18 years and over in New South Wales who are current daily smokers is 17.0%. The RSE for this estimate is 4.5%, and therefore the SE is 0.8 percentage points (0.045 x 17.0), using the formula: SE(y) = RSE(y)*y/100.
13 Standard errors can also be calculated using the MoE. For example, the MoE for the estimate of the proportion of males aged 18 years and over in New South Wales who are current daily smokers is +/- 1.5 percentage points. The SE is calculated as 1.5/1.96 = 0.8 using the formula: SE(y) = MoE(y)/1.96.
14 Note that, due to rounding, the SE calculated from the RSE may differ slightly from the SE calculated from the MoE for the same estimate.
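The worked example above can be reproduced with the short script below, which uses only the figures quoted in the text (estimate 17.0%, RSE 4.5%, 95% MoE 1.5 percentage points). The variable names are illustrative.

```python
estimate = 17.0  # estimated proportion, per cent
rse = 4.5        # relative standard error, per cent of the estimate
moe_95 = 1.5     # 95% margin of error, percentage points

# SE recovered from the RSE: the RSE is the SE as a percentage of the estimate.
se_from_rse = rse / 100.0 * estimate

# SE recovered from the MoE: the 95% MoE is 1.96 times the SE.
se_from_moe = moe_95 / 1.96

# The unrounded values differ slightly because the published RSE and MoE
# are themselves rounded; both agree once rounded to one decimal place.
print(se_from_rse, se_from_moe)
print(round(se_from_rse, 1), round(se_from_moe, 1))
```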
25 The approximate 95% MoE for proportions, differences and sums can be calculated by:
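The formulas themselves are not reproduced in this text. The sketch below uses widely quoted approximations and should be read as an assumption rather than as the published method: for a difference or sum, the estimates x and y are assumed to be uncorrelated; for a proportion x/y, x is assumed to be a subset of y. The function names are hypothetical.

```python
import math


def moe_difference_or_sum(moe_x, moe_y):
    """Approximate MoE of x - y or x + y.

    ASSUMPTION: x and y are uncorrelated, so their variances add.
    """
    return math.sqrt(moe_x ** 2 + moe_y ** 2)


def moe_proportion(x, y, rse_x, rse_y, z=1.96):
    """Approximate MoE, in percentage points, of x / y expressed as a percentage.

    ASSUMPTION: x is a subset of y, with RSE(x/y) ~ sqrt(RSE(x)^2 - RSE(y)^2).
    Raises ValueError if rse_x is smaller than rse_y.
    """
    rse_prop = math.sqrt(rse_x ** 2 - rse_y ** 2)
    proportion = x / y * 100.0
    return z * rse_prop / 100.0 * proportion
```

Where two estimates are strongly correlated, the independence assumption can over- or under-state the MoE of their difference, so replicate weights are the more reliable route when unit record data are available.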
Derivation of replicate weights
30 Under the delete-a-group Jackknife method of replicate weighting, weights were derived as follows:
Application of replicate weights
31 As noted above, replicate weights enable variances of estimates to be calculated relatively simply. They also enable unit record analyses such as chi-square and logistic regression to be conducted, which take into account the sample design.
32 Estimates for any variable of interest can be calculated using each of the 60 sets of replicate weights, giving 60 replicate estimates. The distribution of this set of replicate estimates, in conjunction with the full sample estimate, is then used to approximate the variance of the full sample estimate.
33 The formulae for calculating the standard error (SE), relative standard error (RSE) and 95% Margin of Error (MoE) of an estimate using this method are shown below:
35 The RSE(y) = SE(y)/y*100.
36 The 95% MoE(y)=SE(y)*1.96.
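The calculation can be sketched in Python as follows. The SE formula is not reproduced in this text, so the sketch assumes the standard delete-a-group Jackknife estimator, SE(y) = sqrt((G-1)/G * sum over g of (y(g) - y)^2), where G is the number of replicate groups (60 here), y(g) is the estimate from the g-th set of replicate weights and y is the full sample estimate. The function names and input layout are hypothetical.

```python
import math


def weighted_total(values, weights):
    """Weighted estimate of a total from unit record values."""
    return sum(v * w for v, w in zip(values, weights))


def jackknife_se(full_estimate, replicate_estimates):
    """Delete-a-group Jackknife SE (ASSUMED standard form, factor (G-1)/G)."""
    g = len(replicate_estimates)
    squared_diffs = sum((r - full_estimate) ** 2 for r in replicate_estimates)
    return math.sqrt((g - 1) / g * squared_diffs)


def sampling_error_measures(values, full_weights, replicate_weight_sets):
    """Return (estimate, SE, RSE in per cent, 95% MoE) for a weighted total."""
    y = weighted_total(values, full_weights)
    replicates = [weighted_total(values, w) for w in replicate_weight_sets]
    se = jackknife_se(y, replicates)
    rse = se / y * 100.0   # RSE(y) = SE(y)/y*100
    moe = se * 1.96        # 95% MoE(y) = SE(y)*1.96
    return y, se, rse, moe
```

With the survey data, `replicate_weight_sets` would hold the 60 replicate weight columns, one list per replicate group.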
37 This method can also be used when modelling relationships from unit record data, regardless of the modelling technique used. In modelling, the full sample would be used to estimate the parameter being studied (such as a regression coefficient), and the 60 replicate groups would then be used to provide 60 replicate estimates of that parameter. The variance of the estimate of the parameter from the full sample is then approximated, as above, by the variability of the replicate estimates.
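As an illustration of this approach, the sketch below fits a weighted simple linear regression once with the full sample weights and once with each set of replicate weights, then applies the same Jackknife calculation to the slope. It assumes the standard delete-a-group Jackknife factor (G-1)/G; the model, function names and data layout are hypothetical and chosen only to keep the example short.

```python
import math


def weighted_slope(x, y, w):
    """Slope of a weighted least-squares line of y on x."""
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    return sxy / sxx


def slope_with_jackknife_se(x, y, full_weights, replicate_weight_sets):
    """Full sample slope and its SE from the replicate slopes."""
    full = weighted_slope(x, y, full_weights)
    replicates = [weighted_slope(x, y, w) for w in replicate_weight_sets]
    g = len(replicates)
    # ASSUMED standard delete-a-group Jackknife variance, factor (G-1)/G.
    variance = (g - 1) / g * sum((r - full) ** 2 for r in replicates)
    return full, math.sqrt(variance)
```

The same pattern applies to other modelling techniques: only the function that produces the parameter estimate from a given set of weights changes.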
Availability of RSEs calculated using replicate weights
38 Actual RSEs for all estimates have been calculated in the publications released for the 2017-18 NHS. The RSEs for estimates are available in spreadsheet format (datacubes) accessed by clicking on the downloads tab of the 2017-18 NHS survey products. The RSEs in the spreadsheets were calculated using the replicate weights methodology.
Availability of MoEs calculated using replicate weights
39 Actual MoEs for proportion estimates have been calculated for 2017-18 NHS publications and are available in spreadsheet format (datacubes) accessed by clicking on the downloads tab of the publications. The MoEs in the spreadsheets were calculated using the replicate weights methodology.