DATA PROCESSING AND CODING
Data processing procedures are primarily designed to check the data provided and to correct, where possible, any inconsistencies in it.
Input coding is the process by which certain data items were categorised during the interview. In the 2016 PSS, computer-assisted input coding was performed on the following questions:
- Country of birth of respondent and, if applicable, their current partner who they live with and other household members
- Country of birth of respondent’s mother and father
- First language spoken as a child and main language spoken at home of respondent and, if applicable, their current partner who they live with
- Educational qualification of respondent and, if applicable, their current partner who they live with
- Relationships within a household.
Interviewers were able to code from a list of commonly used options (for example, 10 common languages spoken at home) or from a more comprehensive list contained within a 'trigram coder' (which allowed the interviewer to enter the first three letters of a response, then select the appropriate response from a pick list of options). Trigram coders are used to aid the interviewer with the collection of data for which there are detailed lists of output – primarily those associated with Standard Classifications – to eliminate the need for significant office coding. The trigram coders are complemented by manual coding of text fields where interviewers could not find an appropriate response in commonly used options or the trigram coder.
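The trigram lookup described above can be illustrated with a minimal sketch. It assumes the coder matches the first three letters entered against the start of each option name; the actual matching rule used in the PSS instrument is not documented here, and the function name and the short country list are hypothetical.

```python
# Hypothetical pick list. The real coder draws on a full Standard
# Classification (for example, SACC for countries).
COUNTRIES = [
    "Germany", "Georgia", "Ghana", "Greece",
    "Netherlands", "New Caledonia", "New Zealand",
]


def trigram_pick_list(entered, options):
    """Return the options that start with the first three letters entered.

    An empty list means no match was found, in which case the response
    would be recorded in a text field for later manual coding.
    """
    key = entered.strip().lower()[:3]
    if len(key) < 3:
        return []  # fewer than three letters: nothing to look up yet
    return sorted(o for o in options if o.lower().startswith(key))


print(trigram_pick_list("new", COUNTRIES))
print(trigram_pick_list("xyz", COUNTRIES))
```

In this sketch the interviewer would choose the appropriate entry from the returned list; an empty list corresponds to the case where the response is left for manual office coding.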
The following coders were utilised in the processing of the survey:
- Country of birth of respondent, their current partner who they live with, and mother/father of respondent - interviewers selected from a list of 10 frequently reported countries or from a trigram coder. Countries contained within the trigram coder were classified according to the Standard Australian Classification of Countries (SACC), 2016 (cat. no. 1269.0).
- First language spoken as a child and main language spoken at home of respondent and their current partner who they live with - interviewers selected from a list of 10 frequently reported languages spoken at home or from a trigram coder. Languages contained within the trigram coder were classified according to the Australian Standard Classification of Languages (ASCL), 2016 (cat. no. 1267.0).
- Educational qualification - level and field of highest non-school educational qualification of respondent and their current partner who they live with were coded to the Australian Standard Classification of Education (ASCED), 2001 (cat. no. 1272.0). The 2016 PSS collected level of highest qualification using a trigram coder.
- Area data (Capital city, Balance of state/territory; Remoteness areas) are classified according to the Australian Statistical Geography Standard (ASGS): Volume 1 - Main Structure and Greater Capital City Areas, July 2011 (cat. no. 1270.0.55.001).
- Relationship within a household - collected from a responsible adult in the household at commencement of the interviews, who provided basic information about all persons who live in the household. Household composition, family composition and other relationship variables are then produced either via derivations within the instrument, or via office coding.
For more details on the ABS Standard Classifications used in the PSS, refer to Appendix 2 in this User Guide.
Further information about the categories available for each classification can be found in the data item list available in Excel spreadsheet format from the Downloads page of this User Guide.
CODING OF FREE-FORM TEXT RESPONSES
A small number of questions in the 2016 PSS included ‘Other’ as an option in a pick list. Selecting ‘Other’ sequenced to a free-form text field in which further details of the response were recorded. These fields include:
- Other known perpetrator types (violence/stalking)
- Other reason for separating from current/previous partner
- Other places stayed during separations from current/previous partner
- Other place stayed on first night separated from current/previous partner
- Other places stayed when relationship ended with previous partner
- Other reaction experienced as a result of unwanted contact or attention (stalking)
For the coding of these categories, office staff assessed whether or not it was possible to re-code the stated response into an existing response category from the feed-in question. Where this was the case, responses were manually re-coded. Otherwise they were left as ‘other’ responses.
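As an illustration only, the outcome of this manual assessment can be thought of as a lookup from a free-form response to an existing category, with everything else remaining as ‘Other’. The sketch below assumes such a table of office decisions; the response texts, category labels and function name are all hypothetical and are not taken from the PSS.

```python
# Hypothetical office decisions: free-form response -> existing category.
# Both the response texts and the category labels are invented examples.
OFFICE_RECODES = {
    "stayed at my sister's place": "Friend or relative's house",
    "slept in the car": "Slept rough",
}


def recode_other(free_text, recodes=OFFICE_RECODES):
    """Return the existing category chosen by office staff, else 'Other'."""
    # Normalise case and whitespace so trivially different entries match.
    key = " ".join(free_text.lower().split())
    return recodes.get(key, "Other")


print(recode_other("  Slept in the CAR "))
print(recode_other("none of the above"))
```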
During office processing of the data, checks were performed on records to ensure that specific values lay within valid ranges and that relationships between items were within limits deemed acceptable for the purposes of the survey. This was in addition to edits which were triggered in the instrument at the time of the interview. These checks were also designed to detect errors which may have occurred during response entry and processing, and to identify cases which, although not necessarily errors, were sufficiently unusual or close enough to agreed limits to warrant examination.
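The two kinds of check described above (valid ranges for a single item, and acceptable relationships between items) can be sketched as follows. The field names, the age limits and the specific rules are assumptions made for illustration; they are not the edits actually applied to the PSS.

```python
# Assumed limits for illustration only.
MIN_AGE, MAX_AGE = 18, 120


def run_edit_checks(record):
    """Return a list of edit flags for one record (empty if none apply)."""
    flags = []

    # Range check: a single value must lie within a valid range.
    age = record.get("age")
    if age is None or not MIN_AGE <= age <= MAX_AGE:
        flags.append("age outside valid range")

    # Relationship check: two items must be consistent with each other.
    first = record.get("age_at_first_incident")
    if age is not None and first is not None and first > age:
        flags.append("age at first incident exceeds current age")

    return flags


print(run_edit_checks({"age": 30, "age_at_first_incident": 35}))
print(run_edit_checks({"age": 45, "age_at_first_incident": 20}))
```

Flagged records in a scheme like this would be referred for examination rather than changed automatically, which is consistent with the data remaining essentially ‘as reported’.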
Data available from the survey are essentially ‘as reported’ by respondents. In some cases it was possible to correct errors or inconsistencies in the data which were originally recorded through reference to other data in the record, including interviewer comments; in other cases this was not possible and some errors and inconsistencies remain on the data file. Wherever possible, known inconsistencies and potential areas for misinterpretation are identified in the interpretation section of the relevant topic pages in this User Guide.
The output data file was extensively validated through item-by-item examination of input and output frequencies, checking populations through derivations, internal consistency of items and across levels of the file, data confrontation, etc. Despite these checks, it is possible that some small errors remain undetected on the file.
Information from the survey is stored electronically in the form of data items. In some cases, items were formed directly from individual survey questions while in others, items were derived from answers to several questions.
The output datasets from the 2016 PSS are hierarchical in nature and contain six different levels. A hierarchical file is an efficient means of storing and retrieving information which describes one to many, or many to many, relationships.
The structure of the 2016 PSS output datasets is as follows:
At the top levels there are:
- Household level – contains compositional and geographic information about the household, and household income.
- Person level (linked to household level) - contains socio-demographic information about the respondent and (if applicable) their current partner (who they are living with) including income, education, labour force and language information, as well as information about the respondent's general feelings of safety, self-assessed health status, disability status, social connectedness, and experiences of sexual harassment, sexual or physical abuse before the age of 15, witnessing violence before the age of 15, and stalking. The person level also contains a significant number of aggregated data items produced from data contained on the levels outlined below. These aggregated data items provide only summary experience data, with detailed information remaining on the applicable levels. For more details on these topics, refer to the relevant topic pages contained within this User Guide.
Beneath the person level, there are 4 further levels (linked to the person level):
- Violence prevalence level - contains information about a respondent's experience of violence since the age of 15. The timeframe of the most recent incident experienced by broad groupings of perpetrator type is available on this level for each of the 8 violence types collected, as well as for aggregated violence types (for example, the identification of the most recent experience out of the combined Physical assault and Sexual assault experiences – identified as Assault). In addition, a detailed perpetrator type data item is available for use with the violence type data. However, the timeframe data cannot be used in association with this detailed perpetrator type data item, as the timeframe of the most recent incident for each individual category has not been collected. For more details on this level, refer to the Violence: Prevalence topic page in this User Guide.
- Violence most recent incident level - contains detailed characteristics about a respondent's most recent incident of a total of 8 types of violence (physical/sexual assault/threat by a male/female perpetrator). For more details on this level, refer to the Violence: Most Recent Incident topic page in this User Guide.
- Violence partner level - contains detailed information about a respondent's experience of violence by a current partner and/or most recently violent previous partner. For more details on this level, refer to the Partner violence topic page in this User Guide.
- Emotional abuse by a partner level - contains information about the respondent's experience of emotional abuse by current and/or most recently emotionally abusive male/female previous partner since the age of 15. For more details on this level, refer to the Partner emotional abuse topic page in this User Guide.
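The six levels described above can be pictured as nested records, with each household holding person records and each person holding records on the four lower levels. The sketch below is one possible representation only; the class names, attribute names and identifiers are hypothetical and do not correspond to data items on the PSS file.

```python
from dataclasses import dataclass, field

# Attribute names for the four levels linked to the person level.
CHILD_LEVELS = [
    "violence_prevalence",
    "violence_most_recent_incident",
    "violence_partner",
    "partner_emotional_abuse",
]


@dataclass
class Person:
    person_id: str
    violence_prevalence: list = field(default_factory=list)
    violence_most_recent_incident: list = field(default_factory=list)
    violence_partner: list = field(default_factory=list)
    partner_emotional_abuse: list = field(default_factory=list)


@dataclass
class Household:
    household_id: str
    persons: list = field(default_factory=list)


def count_records(household):
    """Count the records held at each level for one household."""
    counts = {"household": 1, "person": len(household.persons)}
    for level in CHILD_LEVELS:
        counts[level] = sum(len(getattr(p, level)) for p in household.persons)
    return counts


# One person with two prevalence records and one partner violence record.
person = Person("P1")
person.violence_prevalence += [{"type": "Physical assault"}, {"type": "Sexual threat"}]
person.violence_partner.append({"partner": "current"})
household = Household("H1", persons=[person])
print(count_records(household))
```

The one-to-many links are what make the file hierarchical: a single person record can relate to several records on a lower level, so counts on the lower levels need not equal the number of persons.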
A comprehensive list of data items available on each level from the survey is available in the Downloads tab of this User Guide.