DATA COLLECTION AND PROCESSING
For SIH (only) interviews, the interviewer:
For HES (including SIH) interviews the following process was followed:
In the initial contact interview, the interviewer:
The Diary Assistance Visit occurred two to four days after the diaries had been placed. The purpose of this visit was to answer respondents' questions about the expenditure diary. The diaries were checked at this visit and any discrepancies clarified with the respondent(s).
The Diary Exchange Visit occurred eight days after the first diaries were distributed. The purpose of this visit was to collect the first week’s diaries and distribute the second week’s diaries.
The Group Interview was conducted with all usual residents aged 15 years and over in the household present.
The Individual Interviews were conducted with each usual resident in scope, that is, all persons aged 15 years or over who were not members of a foreign defence force (stationed in Australia) or diplomatic personnel of overseas governments.
The Diary Collection Visit was made to collect the diaries for the second week. During this visit the diaries were checked and collected. If interviewers were unable to collect the completed diaries from respondents during the relevant visits, respondents were offered a reply-paid envelope to mail back their diaries.
Data collection instruments
A representation of the computer-assisted interview (CAI) questionnaires used in the SIH and HES and the HES expenditure diary can be downloaded as separate PDF files available from the 'Downloads' tab of this publication.
For details of the data items available from the 2015–16 SIH and HES see the Excel spreadsheet available as a data cube from the 'Downloads' tab of this publication in late 2017.
DATA PROCESSING AND DERIVATIONS
Data processing methods
The data from the 2015–16 SIH and HES were collected and processed with computer-based systems using a software program known as BLAISE. A variety of methods were employed to process and edit the data, reflecting the different questionnaires used to collect data from the household and individual components of the surveys. These processes are outlined below:
Coding and input editing of household and individual questionnaires
Internal system edits were applied in the CAI questionnaires to ensure the completeness and consistency of the responses being provided. The interviewer could not proceed from one section of the interview to the next until responses had been appropriately completed.
A number of range and consistency edits were programmed into the CAI questionnaire. Edit messages automatically appeared on the screen if the information entered was either outside the permitted range for a particular question, or contradicted information already recorded. These edit queries were resolved on the spot with respondents.
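The range and consistency edits described above can be illustrated with a minimal sketch. The field names, permitted ranges and the single consistency rule below are hypothetical and chosen only for illustration; the actual edits programmed into the CAI questionnaire are not listed in this section.

```python
# Hypothetical permitted ranges; the real CAI edit rules are not given in the text.
PERMITTED_RANGES = {
    "age": (0, 120),
    "hours_worked": (0, 168),  # there are 168 hours in a week
}


def check_response(record):
    """Return a list of edit messages for one respondent record."""
    messages = []

    # Range edits: flag any value outside its permitted range.
    for field, (low, high) in PERMITTED_RANGES.items():
        value = record.get(field)
        if value is not None and not (low <= value <= high):
            messages.append(
                f"{field}={value} is outside the permitted range {low}-{high}"
            )

    # Consistency edit (illustrative assumption): hours of work should not be
    # recorded for a person whose recorded age is under 15.
    age = record.get("age")
    hours = record.get("hours_worked") or 0
    if age is not None and age < 15 and hours > 0:
        messages.append("hours_worked reported for a person aged under 15")

    return messages
```

An empty list means the record passes; any message would correspond to an edit query that the interviewer resolves with the respondent before proceeding.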
Data from the CAI questionnaires were electronically loaded to the processing database on receipt in the ABS office in each state or territory. Office checks were made to ensure data for all relevant questions were fully accounted for and that returns for each household and respondent were obtained. Problems identified by interviewers were resolved by office staff, where possible, based on other information contained in the schedule, or on the comments provided by interviewers.
Computer-assisted coding was performed on responses to questions on country of birth, occupation and industry of employment and language to ensure completeness. Data on relationships between household members were used to delineate families and income units within the household, and to classify households and income units by type.
Data capture and coding of individual HES diaries
For HES households, diaries were collected by interviewers and dispatched to the ABS office, where all reported expenditures in the diaries were entered into the BLAISE Diary Processing System by a diary coding team. The BLAISE system helped operators to code diary items to Household Expenditure Code (HEC) codes. A trigram coder enabled operators to select the appropriate goods or services from an alphabetically ordered pick list of options. The system also deleted expenditure recorded in the diaries on items covered by the household questionnaire. For example, the household questionnaire collected information on mains gas payments, so any payments coded to HEC code 0201010201 (Mains gas - (selected dwelling)) were automatically deleted to prevent double counting.
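The following sketch shows one plausible way a trigram coder and the double-counting deletion could work. It is not the BLAISE implementation: the similarity measure (Jaccard overlap of character trigrams), the threshold, the function names and the small coding list extract are all assumptions made for illustration. Only the two HEC codes and the product names are taken from this publication.

```python
# Hypothetical extract of a coding list: description -> HEC code.
# The codes and product names are those quoted in this publication.
CODING_LIST = {
    "Burger rings": "0309030101",
    "Cheezels": "0309030101",
    "corn chips": "0309030101",
    "pretzels": "0309030101",
    "Twisties": "0309030101",
    "Mains gas": "0201010201",
}

# Codes covered by the household questionnaire (only this one is named in the text).
QUESTIONNAIRE_CODES = {"0201010201"}


def trigrams(text):
    """Return the set of three-character substrings of the padded, lower-cased text."""
    padded = f"  {text.strip().lower()} "
    return {padded[i:i + 3] for i in range(len(padded) - 2)}


def similarity(a, b):
    """Jaccard overlap of the two trigram sets (an assumed measure)."""
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)


def pick_list(typed, coding_list=CODING_LIST, min_similarity=0.2):
    """Return an alphabetically ordered list of (description, code) candidates."""
    matches = [
        (description, code)
        for description, code in coding_list.items()
        if similarity(typed, description) >= min_similarity
    ]
    return sorted(matches, key=lambda pair: pair[0].lower())


def drop_double_counted(entries, covered=QUESTIONNAIRE_CODES):
    """Remove diary entries whose code is already collected in the questionnaire."""
    return [entry for entry in entries if entry["hec"] not in covered]
```

Because matching is on overlapping letter groups rather than exact spelling, a misspelt diary entry such as "twistys" still brings up "Twisties" in the pick list.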
The HEC coding list is a complete list of items classified to each expenditure code and is available for researchers who require a detailed knowledge of the content of each expenditure code (see the HEC coding list in Appendix 7). For example, a researcher may need to know the contents of HEC code 0309030101 (Potato crisps and other savoury confectionery) which the HEC coding list shows to contain Burger rings, Cheezels, chips (crisps), corn chips, Le Snak, pretzels, Twisties and many other products. During coding of the diary data, goods not already listed in the coding list and variant wording of reported expenditures were able to be added to the coding list. Complex queries during the coding process were sent to a query resolution team to determine how the expenditure should be coded and to ensure consistency across the coding.
Respondents were encouraged to attach dockets where possible when completing their diary. Store lists were obtained from several major retailers and pre-loaded into the diary coding system. These store lists provided information about the types of items sold by the retailer and were used to improve efficiency and quality of diary coding. These store lists were used to automatically populate some coded fields such as the description of the item and the HEC code where a match between the docket item and the store list could be found.
A range of edits was also applied to the diary, household and individual information to double check that logical sequences had been followed in the questionnaires; that specific values lay within expected ranges; and that relationships between items were consistent. Unusually high values (termed statistical outliers) were investigated to determine whether there had been errors in entering the data and corrections were made where necessary.
Imputation for missing records and values
Some households did not supply all the required information but supplied sufficient principal information to be retained in the sample. Such partial responses occur when:
In the first two cases of partial response above, the data provided are retained and the missing data are imputed by replacing each missing value with a value reported by another person with similar characteristics, referred to as the 'donor'.
Donor records are selected at random from fully responding persons who match the person with missing information on multiple characteristics, such as state, sex, age, labour force status and income. As far as possible, the imputed information is an appropriate proxy for the information that is missing. Depending on which values are to be imputed, donors are randomly chosen from the pool of individual records with complete information for the block of questions where the missing information occurs.
In the third case, for HES households, non-significant respondents who did not sufficiently complete either a week one diary or a week two diary had diary data imputed from a fully responding donor. If all significant persons within the household failed to supply either diary, the household was converted to a SIH household for sample retention. In other cases, if the first week of diary entries was provided but not the second week, the first week of expenditure was used to represent expenditure for the second week.
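Donor imputation of this kind can be sketched as follows. This is a simplified illustration, not the ABS procedure: the matching keys, the field names and the handling of the case where no donor is found are assumptions, and income is left out of the matching keys here because it is often the item being imputed.

```python
import random

# Hypothetical matching characteristics; the text names state, sex, age,
# labour force status and income as examples.
MATCH_KEYS = ("state", "sex", "age_group", "labour_force_status")


def impute_from_donor(recipient, pool, items, rng):
    """Fill the missing `items` of `recipient` from one randomly chosen donor.

    A donor must match the recipient on every key in MATCH_KEYS and must have
    complete information for the whole block of `items`.
    """
    donors = [
        person
        for person in pool
        if all(person[key] == recipient[key] for key in MATCH_KEYS)
        and all(person.get(item) is not None for item in items)
    ]
    if not donors:
        # Assumption: leave the record unchanged when no donor qualifies.
        return dict(recipient)

    donor = rng.choice(donors)
    filled = dict(recipient)  # copy, so the original record is not modified
    for item in items:
        if filled.get(item) is None:
            filled[item] = donor[item]
    return filled
```

Passing a seeded generator such as `random.Random(0)` makes the random donor selection reproducible.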
In the SIH sample 388 partially responding households were retained in the final sample after full record imputation of person(s) in the household who were not the main income earners. For these households, any missing values were imputed by replacing each missing value with a value reported by another person (referred to as the donor).
The final SIH sample includes 5,117 households (29% of households) and 8,079 person records (24% of persons aged 15 years or over) which had at least one imputed value. Of all the relevant items (continuous variables), 3.5% of values were imputed. This is slightly lower than in SIH 2013-14 (3.9%) and slightly higher than in the last SIH cycle where HES was jointly collected (2.6%). 244 full records were imputed (0.5% of all SIH person level records).
In the HES sample 237 partially responding households were retained in the final sample after full record imputation of person(s) in the household who were not the main income earners. For these households, any missing values were imputed by replacing each missing value with a value reported by another person (referred to as the donor).
The final HES sample includes 3,487 households (35% of households) and 4,762 person records (25% of persons aged 15 years or over) which had at least one imputed value. 154 full records were imputed (0.5% of all HES person level records).
Modelled data items
Some data items of interest cannot reliably be collected from respondents, and some cannot be collected at all. However, it is sometimes possible to utilise other information provided by respondents as a basis for estimating the data items of interest. This process is referred to as modelling.
Childcare subsidies assist families with dependent children with the costs of childcare. Two subsidies are collected and modelled in the 2015–16 SIH. These are the Child Care Benefit and Child Care Rebate.
Child Care Benefit (CCB) is a payment from the Australian Government that assists families with the costs of registered or approved child care. The scheme is means-tested and allocates an hourly amount that can either be provided to child care consumers after child care has been paid, or directly paid to child care providers, thereby reducing the upfront child care fees payable by the consumer.
Child Care Rebate (CCR) is also an Australian Government payment that, like CCB, assists families with the cost of child care. Each childcare consumer is entitled to CCR, which is 50% of their net childcare costs. That is, a childcare consumer is entitled to 50% of their childcare costs after CCB has been deducted from the cost if they receive it, or else 50% of the whole cost. CCR payments accrue up to a per child, per year limit ($7,500 per child per year in 2015–16). CCR, like CCB, may be paid either to the consumer in a lump sum or directly to childcare providers, thereby further reducing the upfront cost of childcare.
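The CCR entitlement rule described above can be expressed as a short calculation. This sketch covers only the 50% rate and the per child annual limit stated in the text; the function and parameter names are illustrative, and it does not represent the model the ABS used to produce CCB and CCR estimates.

```python
CCR_RATE = 0.50           # 50% of net childcare costs, as stated in the text
CCR_ANNUAL_CAP = 7500.00  # per child, per year limit in 2015-16


def child_care_rebate(annual_cost, annual_ccb=0.0):
    """Return the annual CCR for one child.

    annual_cost: childcare fees for the child over the year
    annual_ccb:  Child Care Benefit received for the child (0 if none)
    """
    net_cost = max(annual_cost - annual_ccb, 0.0)  # cost after CCB is deducted
    return min(CCR_RATE * net_cost, CCR_ANNUAL_CAP)
```

For example, a family with $10,000 in annual fees for one child and $2,000 of CCB would be entitled to 50% of the remaining $8,000, or $4,000, while a family with $20,000 in fees and no CCB would reach the $7,500 limit.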
Estimates of CCB and CCR are collected from the child care questions. However, both the reported number of households receiving childcare subsidies and the reported total value of that assistance have differed substantially from administrative records. CCB and CCR have therefore been modelled to improve the accuracy of estimates of these payments. The output data are made up of both reported and modelled data. Child care assistance is conceptually treated as social transfers in kind, including administrative overhead as part of the value of the transfer.
The modelled amounts of CCB and CCR are available at the household and income unit level.
Income tax and the Medicare levy and levy surcharge
Disposable income is calculated by deducting income tax, including the Medicare levy, from gross income.
The income tax model is based on the liability rules described in the Tax Pack from the Australian Taxation Office for the year concerned, the income reported by respondents, and other characteristics of household members reported in the survey.
Estimates of income tax are modelled, rather than collected from respondents, for a number of reasons including:
The Medicare levy surcharge and the Budget Repair Levy were also modelled and deducted from gross income in the calculation of disposable income.
For more information see the 'Income' section of this publication.
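As a simple illustration of the deductions described in this section, disposable income can be written as gross income less the modelled tax components. The function and parameter names are illustrative, and the amounts are taken as inputs; the liability rules that produce them are not reproduced here.

```python
def disposable_income(gross_income, income_tax, medicare_levy,
                      medicare_levy_surcharge=0.0, budget_repair_levy=0.0):
    """Return disposable income after deducting the modelled tax components."""
    deductions = (
        income_tax
        + medicare_levy
        + medicare_levy_surcharge
        + budget_repair_levy
    )
    return gross_income - deductions
```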
Hours usually worked in second job
In SIH and HES 2015-16, an isolated issue with the survey instrument occurred where respondents working more than one job were not asked to specify the hours they usually worked in their second job. As a result, data for hours usually worked in the second job for employees were modelled based on the respondent's reported data, including their usual hours of work in their main job and the income reported for their main and second jobs. This model was developed based on the relationships found between these variables in previous cycle data. For people working in their own business, the hours usually worked were randomly imputed, as previous cycle data did not show the same relationships that were found for employee hours. The items impacted by this issue are:
This issue has been resolved in the survey instrument for the 2017-18 SIH collection.
Government payments modelling
The accuracy of the microediting of income from government payments has been improved. The ABS has introduced an eligibility-based model designed by the Department of Social Services. This model produces a value for every person aged 15 years or over for all government payments and allowances that are collected in the survey. The modelled amount is then compared with the reported government payment values in order to: identify and edit values that are significantly higher than the maximum eligible amount; impute missing values; and impute values for payments that follow as a consequence of other reported payments (e.g. a value for Utilities Allowance is allocated to all recipients of Partner Allowance, even those who did not separately report it). All other government payment values remain as reported. Some payment values are entirely modelled based on eligibility, as in previous cycles of SIH. More information about the new model will be available in Appendix 9 of the User Guide later in 2017. Microdata products (e.g. the Basic CURF) will include both the reported and modelled values for comparison (except where the reported payment values were out of the possible range or missing).
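The comparison of reported and modelled payment values can be sketched as a simple decision rule. This is an illustration only: the 10% tolerance used to decide what counts as "significantly higher" is a hypothetical figure, the modelled value is treated here as the maximum eligible amount, and the allocation of consequential payments is not shown.

```python
def edit_payment(reported, modelled, tolerance=1.10):
    """Return (value, source) for one government payment of one person.

    reported:  amount reported by the respondent, or None if missing
    modelled:  eligibility-based amount, treated here as the maximum eligible
    tolerance: hypothetical multiplier above which a report is edited
    """
    if reported is None:
        return modelled, "imputed"   # missing value filled from the model
    if reported > modelled * tolerance:
        return modelled, "edited"    # implausibly high report replaced
    return reported, "reported"      # all other values remain as reported
```

Returning the source alongside the value mirrors the approach of releasing both reported and modelled values so that users can compare them.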
Receipts of Family Tax Benefit are treated as income, regardless of whether they are received fortnightly or as a lump sum. The Newborn Supplement and Newborn Upfront Payment replaced the Baby Bonus on 1 March 2014 and those eligible receive it as part of their Family Tax Benefit Part A payments for a period of 13 weeks or with their lump sum. The Paid Parental Leave payment has also been included as income.
The Energy Supplement is included in income from government pensions or allowances. This tax-exempt, indexed payment is paid to pensioners, other income support recipients, families receiving Family Tax Benefit payments and Commonwealth Seniors Health Card holders, provided they meet eligibility requirements.
The twice-yearly Schoolkids Bonus payment that was paid to eligible families, carers and students from January 2013 to July 2016 has been included in income from government pensions and allowances. This payment, paid in January and July, was made payable to families receiving Family Tax Benefit Part A. Young people enrolled in school who were receiving Youth Allowance or other specific income support, or receiving an education allowance from the Department of Veterans' Affairs, were also entitled to this payment, provided that they met the age and education requirements.