TECHNICAL NOTE: DATA QUALITY
3 The data input editing process was supported by a post-enumeration survey of data providers around Australia, which was used to identify any quality problems with the reported data. The problem areas identified were then targeted closely in the output editing stage of the survey to help minimise survey bias due to non-sampling error.
4 Any errors detected were followed up directly with the data providers or, in the case of registered collective agreements, checked against available listings of agreements. The more significant units, which collectively contributed more than 45% of the survey estimates, were also contacted directly by telephone to verify their responses to the questions on how pay is set for all selected employees. These providers were asked background questions on the pay setting methods used in their organisation, and their answers were then used to validate the reported data.
RELIABILITY OF ESTIMATES
5 The sampling error associated with any estimate can be estimated from the sample results. One measure of sampling error is given by the standard error, which indicates the degree to which an estimate may vary from the value that would have been obtained from a full enumeration (the 'true value'). There are about two chances in three that a sample estimate differs from the true value by less than one standard error, and about nineteen chances in twenty that the difference will be less than two standard errors.
6 An example of the use of a standard error is as follows. The estimated average weekly total earnings for all male employees in Australia is $838.80, with a standard error of $8.20. Then there would be about two chances in three that a full enumeration would have given an estimate in the range $830.60 to $847.00 and about nineteen chances in twenty that it would be in the range $822.40 to $855.20.
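The ranges quoted in this example can be reproduced with a few lines of Python. This is only an illustrative sketch: the function name sampling_range and its parameters are invented here and are not part of the survey methodology.

```python
def sampling_range(estimate, standard_error, multiples=1):
    """Return (low, high): the estimate plus or minus a number of standard errors."""
    margin = multiples * standard_error
    # Round to cents, since the estimates are dollar amounts.
    return (round(estimate - margin, 2), round(estimate + margin, 2))


# Average weekly total earnings for all male employees, from the example above.
print(sampling_range(838.80, 8.20, multiples=1))  # (830.6, 847.0), about 2 chances in 3
print(sampling_range(838.80, 8.20, multiples=2))  # (822.4, 855.2), about 19 chances in 20
```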
7 The difference between two survey estimates is also an estimate and it is therefore subject to sampling variability. The standard error on the difference between two survey estimates depends on the standard errors of the original estimates and on the relationship (correlation) between these two estimates. An approximate standard error on the difference between two survey estimates (x-y) may be obtained by the following formula:
SE(x-y) = sqrt(SE(x)² + SE(y)²)
8 This formula will overestimate the standard error where there is a positive correlation between the two estimates (e.g. male and female school teachers). While this formula will only be accurate where there is no correlation between the two estimates (e.g. estimates from different states), it is expected to provide a reasonable approximation for the differences likely to be of interest.
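The effect of correlation can be illustrated with the general expression for the standard error of a difference, SE(x-y) = sqrt(SE(x)² + SE(y)² - 2 r SE(x) SE(y)), where r is the correlation between the two estimates. The sketch below is a hedged illustration only: the standard errors of 3.0 and 4.0 and the correlation of 0.5 are hypothetical values chosen for easy arithmetic, and the survey does not publish correlations between estimates.

```python
import math


def se_difference(se_x, se_y, correlation=0.0):
    """Standard error of (x - y) for a given correlation between x and y.

    With correlation=0.0 this reduces to the approximate formula
    sqrt(SE(x)^2 + SE(y)^2) used in this note.
    """
    variance = se_x ** 2 + se_y ** 2 - 2 * correlation * se_x * se_y
    return math.sqrt(variance)


# Hypothetical standard errors, not survey results.
approximate = se_difference(3.0, 4.0)                   # assumes no correlation
correlated = se_difference(3.0, 4.0, correlation=0.5)   # hypothetical positive correlation

print(round(approximate, 2))  # 5.0
print(round(correlated, 2))   # 3.61
```

With a positive correlation the general expression gives a smaller value than the approximate formula, which is why the approximate formula is described as an overestimate in that case.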
9 The estimated average weekly total earnings for all female employees in Australia is $554.70, with a standard error of $5.70. The difference between the earnings of male and female employees is $284.10. The estimate of the standard error of the difference between the average weekly total earnings for male and female employees in Australia is:
SE($838.80 - $554.70) = sqrt(($8.20)² + ($5.70)²) = $9.99
10 There are about two chances in three that the true figure for the difference between male and female average weekly earnings lies in the range $274.11 to $294.09, and about 19 chances in 20 that the figure is in the range $264.12 to $304.08.
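The worked example for the male and female earnings difference can be checked as follows. This is a minimal sketch; the variable and function names are illustrative, and the standard error is rounded to cents before the ranges are formed, which is assumed here to match how the published figures were derived.

```python
import math

# Estimates and standard errors quoted in the example above.
male_mean, male_se = 838.80, 8.20
female_mean, female_se = 554.70, 5.70

difference = round(male_mean - female_mean, 2)
# Approximate formula: assumes the two estimates are uncorrelated.
se_diff = round(math.sqrt(male_se ** 2 + female_se ** 2), 2)


def interval(centre, standard_error, multiples):
    """Range of centre plus or minus a number of standard errors, rounded to cents."""
    margin = multiples * standard_error
    return (round(centre - margin, 2), round(centre + margin, 2))


print(difference, se_diff)               # 284.1 9.99
print(interval(difference, se_diff, 1))  # (274.11, 294.09)
print(interval(difference, se_diff, 2))  # (264.12, 304.08)
```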
11 The formula above can be used to estimate the standard error on a difference between estimated averages in two different years. (The movement standard error will be approximately 1.4 times the standard error on the level estimate, if the standard errors on the two level estimates are similar.)
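The factor of 1.4 follows directly from the formula for the standard error of a difference. As a short derivation, assume the two level estimates x and y have roughly the same standard error, written here as s (a symbol introduced only for this illustration):

```latex
\mathrm{SE}(x - y) \approx \sqrt{\mathrm{SE}(x)^2 + \mathrm{SE}(y)^2}
                  \approx \sqrt{s^2 + s^2}
                  = \sqrt{2}\, s
                  \approx 1.4\, s
```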
12 Another measure of the sampling error is the relative standard error, which is obtained by expressing the standard error as a percentage of the estimate to which it refers. The relative standard error is a useful measure in that it provides an immediate indication of the percentage errors likely to have occurred due to sampling, and thus avoids the need to refer also to the size of the estimate.
13 Relative standard errors can be calculated using the actual standard error and the survey estimate (referred to as x) in the following manner:
RSE%(x)= (SE(x)/x) * 100
14 For example, the average weekly total earnings for all male employees in Australia is $838.80, and for all female employees it is $554.70. The estimated standard error is $8.20 for the male estimate and $5.70 for the female estimate.
15 Applying the above RSE%(x) formula yields:
Males: RSE%(838.80) = (8.20/838.80) * 100 = 0.98%
Females: RSE%(554.70) = (5.70/554.70) * 100 = 1.03%
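The same calculation can be expressed as a small function. This is an illustrative sketch only: the name rse_percent is invented here, and the estimate is assumed to be non-zero.

```python
def rse_percent(estimate, standard_error):
    """Relative standard error: the standard error as a percentage of the estimate."""
    # Assumes a non-zero estimate; the RSE is undefined when the estimate is 0.
    return standard_error / estimate * 100


print(f"Males:   {rse_percent(838.80, 8.20):.2f}%")  # Males:   0.98%
print(f"Females: {rse_percent(554.70, 5.70):.2f}%")  # Females: 1.03%
```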
16 An asterisk appears against an estimate in the publication where the sampling variability is considered high. This occurs when the relative standard error is 25% or greater, that is, when the standard error of the estimate is equal to or greater than 25% of the estimate. In these cases, the estimate should be used with caution. A double asterisk appears against an estimate with a relative standard error greater than 50%. In these cases the estimate is considered too unreliable for general use.
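The flagging rule can be sketched as a simple function. The function name and the test values of 100 with standard errors of 30 and 60 are hypothetical; only the 25% and 50% thresholds come from the text, and a positive estimate is assumed.

```python
def reliability_flag(estimate, standard_error):
    """Return '', '*' or '**' according to the relative standard error (RSE)."""
    # Assumes a positive, non-zero estimate.
    rse = standard_error / estimate * 100
    if rse > 50:
        return "**"  # considered too unreliable for general use
    if rse >= 25:
        return "*"   # use with caution
    return ""        # no flag


print(repr(reliability_flag(838.80, 8.20)))  # '' (RSE of about 0.98%)
print(repr(reliability_flag(100.0, 30.0)))   # '*' (hypothetical estimate, RSE of 30%)
print(repr(reliability_flag(100.0, 60.0)))   # '**' (hypothetical estimate, RSE of 60%)
```

Under the rule as stated, an estimate with a relative standard error of exactly 50% receives a single asterisk, because the double asterisk applies only above 50%.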
Pay Setting Method - Proportions Data
17 Standard errors can be used to construct confidence intervals around the estimated proportions. There are about two chances in three that the 'true' value is within the interval that ranges from the sample estimate minus one standard error (estimate - 1xSE) to the sample estimate plus one standard error (estimate + 1xSE). There are approximately 19 chances in 20 that the 'true' value lies within the interval from the estimate minus two standard errors (estimate - 2xSE) to the estimate plus two standard errors (estimate + 2xSE).
18 The above rule gives a symmetric confidence interval that is reasonably accurate when the estimated proportion is not too near 0.00 or 1.00. Where the estimated proportion is close to 0.00 or 1.00 it would be more accurate to use a confidence interval that was not symmetric around the sample estimate. If an estimate is close to 1.00, then the upper boundary of the confidence interval should be closer to the sample estimate than suggested above, while the lower boundary should be further from the sample estimate. Similarly, if an estimate is close to 0.00, then the lower boundary of the confidence interval should be closer to the sample estimate than suggested above, while the upper boundary should be further from the sample estimate. In particular, the symmetric confidence interval could include values that are not between 0.00 and 1.00. In such a case a good rule of thumb is to use a confidence interval of the same size as the symmetric one, but with the lower (or upper) boundary set to 0.00 (or 1.00).
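The rule of thumb for proportions near 0.00 or 1.00 can be sketched as follows. This is a hedged illustration: the function name and the example proportions and standard errors are hypothetical, and the code implements only the simple shift described above, not a formal asymmetric confidence interval.

```python
def proportion_interval(proportion, standard_error, multiples=2):
    """Interval of the same width as the symmetric one, shifted to stay within [0, 1]."""
    width = 2 * multiples * standard_error
    low = proportion - multiples * standard_error
    high = proportion + multiples * standard_error
    if low < 0.0:
        # Set the lower boundary to 0.00 and keep the symmetric width.
        low, high = 0.0, min(width, 1.0)
    elif high > 1.0:
        # Set the upper boundary to 1.00 and keep the symmetric width.
        low, high = max(1.0 - width, 0.0), 1.0
    return (round(low, 4), round(high, 4))


# Hypothetical proportions and standard errors, not survey results.
print(proportion_interval(0.40, 0.02))  # (0.36, 0.44), symmetric interval is fine
print(proportion_interval(0.03, 0.02))  # (0.0, 0.08), lower boundary set to 0.00
print(proportion_interval(0.97, 0.02))  # (0.92, 1.0), upper boundary set to 1.00
```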