8155.0 - Data Reliability, 2003-04  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 09/08/2006   



TECHNICAL NOTE 2 DATA RELIABILITY


SAMPLE ERROR

1 The Economic Activity Survey is, in part, a sample survey designed primarily to deliver national estimates for all industry divisions within the scope of the collection. Experimental estimates at the national level for industry classes and at the state and territory level for industry divisions are also produced, but the survey was not specifically designed for these purposes.


2 The majority of data contained in this publication have been obtained from a sample of businesses. As such, these data are subject to sampling variability; that is, they may differ from the figures that would have been produced if the data had been obtained from all businesses in the population. The measure of the likely difference as used by the ABS is given by the standard error, which indicates the extent to which an estimate might have varied by chance because the data were obtained from only a sample of units. There are about two chances in three that a sample estimate will differ by less than one standard error from the figure that would have been obtained if the data had been obtained from all units, and about 19 chances in 20 that the difference will be less than two standard errors.
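As an illustration of the rule of thumb in paragraph 2, the following minimal sketch computes the ranges implied by one and two standard errors. The function name and the figures used are hypothetical and are not taken from this publication.

```python
def error_bands(estimate: float, standard_error: float) -> dict:
    """Return the ranges of one and two standard errors around an estimate."""
    if standard_error < 0:
        raise ValueError("standard error must be non-negative")
    return {
        # about two chances in three that the full-enumeration figure lies here
        "about_2_in_3": (estimate - standard_error, estimate + standard_error),
        # about 19 chances in 20 that the full-enumeration figure lies here
        "about_19_in_20": (estimate - 2 * standard_error,
                           estimate + 2 * standard_error),
    }


# Hypothetical estimate of 1,000 with a standard error of 50
bands = error_bands(1000.0, 50.0)
print(bands["about_2_in_3"])    # (950.0, 1050.0)
print(bands["about_19_in_20"])  # (900.0, 1100.0)
```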


3 The standard error can also be expressed as a percentage of the estimate, and this is known as the relative standard error (RSE). RSEs at the industry division level for Australia for selected data items representing the full range of data contained in this publication are shown in Technical Note 3. Detailed relative standard errors can be made available on request.
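The relationship between the standard error and the RSE described in paragraph 3 can be sketched as follows. The function names and figures are hypothetical; the only assumption is the definition given above, that the RSE is the standard error expressed as a percentage of the estimate.

```python
def rse_percent(estimate: float, standard_error: float) -> float:
    """Standard error expressed as a percentage of the estimate."""
    if estimate == 0:
        raise ValueError("RSE is undefined for a zero estimate")
    return 100.0 * standard_error / abs(estimate)


def standard_error_from_rse(estimate: float, rse_pct: float) -> float:
    """Recover the standard error from a published RSE (in per cent)."""
    return abs(estimate) * rse_pct / 100.0


# Hypothetical estimate of 2,000 with a standard error of 100
print(rse_percent(2000.0, 100.0))             # 5.0
print(standard_error_from_rse(2000.0, 5.0))   # 100.0
```

The second function is the form a reader would typically need: a published RSE can be converted back to a standard error and then used to form the ranges described in paragraph 2.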


4 The size of the RSE may be a misleading indicator of the reliability of some of the estimates for industry value added (IVA) and operating profit before tax (OPBT). This situation may occur where an estimate may legitimately include positive and negative values, reflecting the financial performance of individual businesses. In these cases, the aggregated estimate can be small relative to the contribution of individual businesses, resulting in a standard error which is large relative to the estimate.
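The effect described in paragraph 4 can be seen in a small worked example. All figures below are hypothetical, the standard error is simply assumed, and the comparison with the gross size of the contributions is for illustration only; it is not a measure published by the ABS.

```python
def rse_percent(estimate: float, standard_error: float) -> float:
    """Standard error expressed as a percentage of the estimate."""
    return 100.0 * standard_error / abs(estimate)


# Hypothetical OPBT contributions ($m) of three businesses, one making a loss
profits = [500.0, -480.0, 30.0]
standard_error = 40.0  # assumed value, for illustration only

net_estimate = sum(profits)                   # profits and losses offset: 50.0
gross_size = sum(abs(p) for p in profits)     # size of contributions: 1010.0

# The same standard error looks large against the small net estimate ...
print(rse_percent(net_estimate, standard_error))           # 80.0
# ... but small against the size of the underlying contributions.
print(round(rse_percent(gross_size, standard_error), 1))   # 4.0
```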


5 Some estimates presented in this publication rely on techniques in which proportions and relationships from data collected by the Australian Bureau of Statistics (ABS) are applied to business income tax (BIT) data sourced from the Australian Taxation Office (ATO), in order to provide estimates of items not available from the ATO BIT files. This technique, known as proration, has implications for the reliability of the relevant RSEs as a measure of quality. Items appearing in this publication that are derived by proration are:

      Average industry value added

      Average sales and service income

      Cost of sales

      Gross fixed capital formation

      Income from services

      Industry value added

      Interest income

      Investment rate value added

      Other operating expenses

      Other selected income

      Rent, leasing and hiring income

      Sales and service income

      Sales of goods.
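The general idea of proration described in paragraph 5 can be sketched as follows. This is a simplified illustration under stated assumptions, not the ABS methodology: it assumes a single ratio taken from survey data is applied to a single total taken from BIT data, and the function name and all figures are hypothetical.

```python
def prorate(bit_total: float, survey_item: float, survey_total: float) -> float:
    """Apply the survey ratio (item / total) to a total taken from tax data."""
    if survey_total == 0:
        raise ValueError("survey total must be non-zero")
    return bit_total * (survey_item / survey_total)


# Hypothetical survey data for one industry: sales of goods make up
# 600 of 800 in total income, a ratio of 0.75.
# Hypothetical BIT data for the same industry: total income of 12,000,
# with no separate figure for sales of goods.
estimated_sales_of_goods = prorate(12000.0, 600.0, 800.0)
print(estimated_sales_of_goods)  # 9000.0
```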

6 In general, if RSEs of data items derived from proration are calculated in the same way as for items that are not prorated (i.e. items directly collected in the Economic Activity Survey (EAS) or available from BIT files), they will be less reliable as quality measures. Specifically, RSEs calculated for prorated items will tend to understate the level of sampling variability in the estimates to which they relate.


7 The RSEs presented or annotated in this publication are based on calculations that do not distinguish between prorated and non-prorated items. The ABS is investigating methodologies that will allow more reliable RSEs to be derived for prorated items for future editions of this publication. This work is examining the effects on RSEs for four key variables: sales of goods, income from services, sales and service income, and IVA. Indications to date are that the effects are greatest on sales of goods and IVA. In other words, for some industries shown, a proration-based RSE methodology would be more likely to produce higher RSEs for estimates of these two variables than for estimates of income from services and sales and service income. Please note that this alternative methodology is not suitable for some industries (for 2003-04, Mining, Manufacturing, and Electricity, gas and water supply), because of the design of the surveys that relate to them.


ANZSIC class experimental estimates

8 Experimental estimates at the ANZSIC class level are shown in Chapter 3 of this publication. This is the finest level of classification in the ANZSIC. It is only the incorporation of ATO BIT data that has made it feasible to produce estimates to this degree of industry detail, as the relatively small size of the directly collected EAS sample does not allow for the compilation of reliable estimates generally below the ANZSIC subdivision level. A broad general indication of the reliability of estimates at the ANZSIC class level is provided by the RSEs shown in Technical Note 3 for the industry division to which the class belongs.


9 Approximately 96% of the ANZSIC class level estimates for total income have RSEs of less than 25%. As annotated in table 3.1, some of the RSEs are relatively large and, therefore, the estimates to which they relate should be used with extreme caution.
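A user working with class level RSEs might screen them against the 25% level mentioned in paragraph 9 along the following lines. The class names, RSE values and function names are hypothetical; only the 25% threshold comes from the text above.

```python
def share_below(rses: dict, threshold: float = 25.0) -> float:
    """Percentage of estimates whose RSE is below the threshold."""
    below = sum(1 for rse in rses.values() if rse < threshold)
    return 100.0 * below / len(rses)


def flag_high(rses: dict, threshold: float = 25.0) -> list:
    """Names of estimates whose RSE is at or above the threshold."""
    return sorted(name for name, rse in rses.items() if rse >= threshold)


# Hypothetical RSEs (per cent) for total income in four classes
class_rses = {"Class A": 4.2, "Class B": 12.0, "Class C": 31.5, "Class D": 8.9}

print(share_below(class_rses))  # 75.0
print(flag_high(class_rses))    # ['Class C']
```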


State/territory experimental estimates

10 The design of the EAS sample does not take into account state/territory, and this could affect the size of the sample error at the state/territory level. To some extent, this is offset by the use of BIT data, which effectively increases the sample size, resulting in a broader coverage of units for each state/territory.



NON-SAMPLE ERROR

11 The imprecision due to sampling variability, which is measured by the standard error, should not be confused with inaccuracies that may occur because of inadequacies in available sources from which the population frame was compiled, imperfections in reporting by providers, errors made in collection such as in recording and coding data, and errors made in processing data. Inaccuracies of this kind are referred to collectively as non-sampling error and they may occur in any enumeration, whether a full census or a sample.


12 Although it is not possible to quantify non-sampling error, every effort is made to reduce it to a minimum. Collection forms are designed to be easy to complete and assist businesses to report accurately. Efficient and effective operating procedures and systems are used to compile the statistics. The ABS compares data from different ABS (and non-ABS) sources relating to the one industry, to ensure consistency and coherence.


13 Differences in accounting policy and practices across businesses and industries can also lead to some inconsistencies in the data used to compile the estimates. Although much of the accounting process is subject to standards, there remains a great deal of flexibility available to individual businesses in the accounting policies and practices that they adopt.


14 The class level estimates in this publication can sometimes differ from those produced by the ABS's Service Industries program of surveys, which deliver detailed data of industry structure and performance for individual ANZSIC classes. For details, see the Appendix.


15 Because direct collection has not been used to apportion EAS estimates to states and territories, some non-sample error will result from the techniques used to produce state/territory experimental estimates. For full details of the methodology used to allocate estimates to states and territories, please refer to Technical Note 1 paragraphs 22-28.


16 The above limitations are not meant to imply that analysis based on these data should be avoided, only that the limitations should be borne in mind when interpreting the data presented in this publication. This publication presents a wide range of data that can be used to analyse business and industry performance. It is important that any analysis be based upon the range of data presented rather than focusing on one variable.