1522.0 - Quality Management of Statistical Outputs Produced From Administrative Data, Mar 2011
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 22/03/2011
USES OF ADMINISTRATIVE DATA
Administrative data can be used to help validate existing data sources at an aggregate or unit level. Validation can take the form of checking whether an estimate obtained from one collection is reasonable when compared with an existing collection that reports the same or a similar estimate.
Editing is the activity aimed at detecting, resolving and treating anomalies in data to help make the data ‘fit for purpose’ (ABS 2009c). Once again, administrative data may be used to provide the answers to questions raised about anomalies in a particular data source, effectively replacing the original data or providing guidance as to the treatment of the data being investigated.
Imputation and substitution
Administrative data can be used for imputation purposes. Imputation is the process used to determine and assign replacement values for missing, invalid or inconsistent data (SDMX 2009). For example, where a survey has data that need to be imputed, an administrative data source could be used to provide the information required.
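The imputation described above can be sketched in a few lines of Python. This is a minimal illustration only, assuming that survey and administrative records share a unit identifier and a field with the same definition; the function name `impute_from_admin`, the identifiers and the values are hypothetical and do not describe any particular agency's method.

```python
def impute_from_admin(survey, admin, field):
    """Fill missing survey values for `field` from an administrative source.

    Both inputs are dicts keyed by a shared unit identifier (an assumption).
    Returns the completed records and the set of unit ids that were imputed.
    """
    completed = {}
    imputed_units = set()
    for unit_id, record in survey.items():
        new_record = dict(record)  # copy so the original survey data is untouched
        if new_record.get(field) is None:
            admin_record = admin.get(unit_id)
            # Impute only when the administrative source actually holds a value.
            if admin_record is not None and admin_record.get(field) is not None:
                new_record[field] = admin_record[field]
                imputed_units.add(unit_id)
        completed[unit_id] = new_record
    return completed, imputed_units


# Hypothetical example data: A2 and A3 did not report turnover.
survey = {
    "A1": {"turnover": 120.0},
    "A2": {"turnover": None},
    "A3": {"turnover": None},
}
admin = {"A1": {"turnover": 118.0}, "A2": {"turnover": 75.0}}

completed, imputed_units = impute_from_admin(survey, admin, "turnover")
```

Returning the set of imputed units alongside the data keeps a record of which values came from the administrative source, which is useful when reporting on the quality of the output. Unit A3 stays missing because the administrative source holds no record for it.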
Data substitution occurs when data for the same unit are taken from a different source. For instance, a survey may ask a reporting unit a question and, when the answers are being validated, the response may not appear plausible given the information known about that reporting unit from an administrative source. The administrative value may then be substituted for the survey response if it is considered more accurate.
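One way such a plausibility check might be expressed is sketched below. The relative-difference rule and the 50 per cent tolerance are assumptions made for illustration; in practice the decision about which source is more accurate depends on the quality of both sources, and the function name `substitute_implausible` is hypothetical.

```python
def substitute_implausible(survey_value, admin_value, tolerance=0.5):
    """Return (value_to_use, substituted) for a single reporting unit.

    The survey response is replaced by the administrative value when the two
    differ by more than `tolerance` as a proportion of the administrative
    value. This rule is an illustrative assumption, not a prescribed method.
    """
    if admin_value is None or admin_value == 0:
        # Nothing usable to compare against, so keep the survey response.
        return survey_value, False
    relative_gap = abs(survey_value - admin_value) / abs(admin_value)
    if relative_gap > tolerance:
        return admin_value, True
    return survey_value, False


# A response ten times the administrative value is treated as implausible.
large_gap = substitute_implausible(1000.0, 100.0)
# A response within tolerance is kept as reported.
small_gap = substitute_implausible(110.0, 100.0)
```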
Data substitution also includes the use of administrative data to replace a subgroup of a population. This may occur where the administrative data are of sufficient quality and adequately match the concepts being measured by the direct survey collection. This in turn creates efficiencies by reducing the size of the survey sample required, and it also reduces respondent burden.
Reducing respondent burden
"Response burden is an unavoidable part of survey research, but efforts to limit it can help maximise response rates. Shorter questionnaires can improve response rates, particularly if interviewers inform respondents that the interview will be short" (Public Works and Government Services Canada 2007). When an agency conducts a survey, respondents are often overwhelmed by the extensive questions asked of them, as well as by the number and frequency of the surveys in which they have to participate. Statistical agencies generally need this level of detail in order to meet the requirements of key users of the statistical output.
As mentioned, data substitution is one way of reducing respondent burden. Another is to link administrative data about the target population with other data sources, whether administrative or survey, to create richer data sets. More detailed information then becomes available about the population without the need for a new or expanded survey to collect information that already exists. This reduces the number of questions that respondents are required to complete and hence reduces respondent burden.
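The linking step can be illustrated with a simple exact match on a shared identifier. This sketch assumes that both sources carry the same unit identifier, which is often not the case in practice, where matching on names, addresses or other characteristics may be needed. The function name `link_records` and the field names are hypothetical.

```python
def link_records(survey, admin):
    """Join survey and administrative records on a shared unit identifier.

    Only units present in both sources are returned. Where both sources hold
    a field of the same name, the survey value is kept (an arbitrary choice
    made for this illustration).
    """
    linked = []
    for unit_id, survey_record in survey.items():
        admin_record = admin.get(unit_id)
        if admin_record is None:
            continue  # no match in the administrative source
        merged = {"unit_id": unit_id}
        merged.update(admin_record)
        merged.update(survey_record)  # survey values win on name clashes
        linked.append(merged)
    return linked


# Hypothetical example: the survey no longer needs to ask about industry or
# state because the administrative source already holds them.
survey = {"B1": {"training_spend": 12.5}, "B2": {"training_spend": 3.0}}
admin = {"B1": {"industry": "Retail", "state": "NSW"}}

linked = link_records(survey, admin)
```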
Register maintenance and frame creation
The creation and maintenance of registers and creation of frames (for sample selections) can be achieved through the use of administrative data. A frame can be a list, map or other specification of the units which define a population to be completely enumerated or sampled (SDMX 2009). The ABS uses some Australian Taxation Office data to maintain a register of all the businesses in Australia from which frames and samples for various business surveys are then created.
Creating a frame from administrative data also allows auxiliary information to be used for stratification or other sampling purposes, provided the administrative source holds sufficiently detailed information. The ABS uses information such as industry, state, sector (public or private), the number of payees in a business (number of employees) and turnover to help stratify the business frame in order to collect information from a representative sample of businesses.
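Stratified selection from a frame can be sketched as follows. This is an illustration under simple assumptions: the same sampling fraction is applied in every stratum and units are drawn by simple random sampling within each stratum. It is not a description of the design the ABS uses, and the function name `stratified_sample` and the frame contents are hypothetical.

```python
import random
from collections import defaultdict


def stratified_sample(frame, strata_keys, fraction, seed=0):
    """Draw a simple random sample within each stratum of the frame.

    `strata_keys` names the auxiliary variables that define the strata.
    `fraction` is a sampling fraction between 0 and 1 applied to every
    stratum, with at least one unit selected per stratum.
    """
    rng = random.Random(seed)  # seeded so the selection is reproducible
    strata = defaultdict(list)
    for unit in frame:
        strata[tuple(unit[key] for key in strata_keys)].append(unit)

    sample = []
    for stratum in sorted(strata):
        units = strata[stratum]
        sample_size = max(1, round(len(units) * fraction))
        sample.extend(rng.sample(units, sample_size))
    return sample


# Hypothetical frame: 10 retail businesses in NSW and 4 mining businesses in WA.
frame = (
    [{"id": f"R{i}", "industry": "Retail", "state": "NSW"} for i in range(10)]
    + [{"id": f"M{i}", "industry": "Mining", "state": "WA"} for i in range(4)]
)

sample = stratified_sample(frame, ["industry", "state"], fraction=0.5)
```

Because the strata are formed from auxiliary variables already on the frame, each stratum is guaranteed representation in the sample, which a simple random sample of the whole frame would not ensure.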
Using administrative data in the creation of a frame and its stratification can assist in improving sampling efficiency for output purposes. Because more information is known about the population, units to be included in or excluded from the sample can be identified more easily. It also improves the ability to produce small area statistics (e.g. for particular geographic areas or industries), and enables the prioritising of reporting units (e.g. prioritisation of non-response follow up by size of business).
Weighting and estimation
Using the administrative data as a frame allows benchmarks to be created for use in weighting or calibrating estimates. This allows a sample taken from a population frame to be weighted to represent the population without having to enumerate the entire population. This can improve sampling and estimation efficiency by providing estimates of greater precision or by reducing the sample size required.
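A simple form of weighting to benchmarks is sketched below. It assumes that the benchmarks are counts of units per stratum taken from the frame and that every sampled unit in a stratum receives the same weight, namely the benchmark count divided by the number of sampled units. Calibration methods used in practice are generally more elaborate. The function name `benchmark_weights` and all figures are hypothetical.

```python
from collections import Counter


def benchmark_weights(sample, benchmarks, stratum_key):
    """Return one weight per sampled unit so that the weights in each
    stratum sum to that stratum's benchmark population count."""
    sample_counts = Counter(unit[stratum_key] for unit in sample)
    return [
        benchmarks[unit[stratum_key]] / sample_counts[unit[stratum_key]]
        for unit in sample
    ]


# Hypothetical sample of three businesses and frame counts per stratum.
sample = [
    {"stratum": "small", "employees": 4},
    {"stratum": "small", "employees": 6},
    {"stratum": "large", "employees": 200},
]
benchmarks = {"small": 100, "large": 10}

weights = benchmark_weights(sample, benchmarks, "stratum")

# Weighted estimate of total employees in the population.
estimated_total = sum(w * unit["employees"] for w, unit in zip(weights, sample))
```

In this example each sampled small business represents 50 businesses and the sampled large business represents 10, so the weights sum to the 110 businesses on the frame.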
The ability to show longitudinal effects
If the custodians of the administrative data continue to collect information about their population(s) of interest, the data can provide receiving agencies (including researchers) with the ability to analyse structural or situational changes over time for a specific population.