DATA PROCESSING
Data processing procedures and checks are primarily designed to verify the data provided and to correct, where possible, any inconsistencies in the data.
INPUT CODING
Input coding is the process by which certain data items were categorised during the interview. In the 2012 PSS, computer-assisted coding was performed on the following questions:
- country of birth of respondent and, if applicable, their partner
- first language spoken as a child and main language spoken at home of respondent and, if applicable, their partner
- educational qualification of respondent and, if applicable, their partner
- relationships within a household.
Interviewers were able to code from a list of commonly used options (for example, 10 common languages spoken at home) or from a more comprehensive list contained within a 'trigram coder' (which allowed the interviewer to enter the first three letters of a response, then select the appropriate response from a pick list of options).
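The trigram lookup described above can be sketched as follows. This is a minimal illustration only, assuming the coder matches on the first three letters of each option name; the function name and the pick list are hypothetical and are not taken from the survey instrument.

```python
def trigram_lookup(typed, options):
    """Return options starting with the first three letters typed (case-insensitive)."""
    prefix = typed.strip().lower()[:3]
    if len(prefix) < 3:
        # Fewer than three letters entered: no pick list is offered in this sketch.
        return []
    return sorted(o for o in options if o.lower().startswith(prefix))


# Hypothetical pick list; a real coder would draw on the full classification.
LANGUAGES = ["Mandarin", "Macedonian", "Maltese", "Malay", "Italian", "Greek"]

matches = trigram_lookup("mal", LANGUAGES)
print(matches)
```

The interviewer would then select the appropriate response from the returned pick list.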
The following coders were utilised in the processing of the survey:
- Country of birth of respondent and their partner - interviewers selected from a list of 10 frequently reported countries or from a trigram coder. Countries contained within the trigram coder were classified according to the Standard Australian Classification of Countries (SACC), Second Edition (cat. no. 1269.0).
- First language spoken as a child and main language spoken at home of respondent and their partner - interviewers selected from a list of 10 frequently reported languages spoken at home or from a trigram coder. Languages contained within the trigram coder were classified according to the Australian Standard Classification of Languages (ASCL), 2005-06 (cat. no. 1267.0).
- Educational qualification - level of highest non-school educational qualification and field of study of that qualification were classified according to the Australian Standard Classification of Education (ASCED), 2001 (cat. no. 1272.0). In the PSS, level and field of current study were also coded to the ASCED 2001.
- Area data (Capital city, Balance of state/territory; Remoteness areas) are classified according to the Australian Standard Geographical Classification (ASGC), 2005 (cat. no. 1216.0).
- Relationship within a household - collected from a responsible adult in the household at the commencement of the interview, who provided basic information about all persons living in the household. Household composition, family composition and other relationship variables were then produced either via derivations within the instrument, or via office coding.
Coding of open-ended questions
The survey contained a small number of open-ended questions, for which there were no predetermined responses. Information was recorded as stated, using a free-form text field. These responses were coded manually by the ABS for:
- Other emotional abuse behaviours experienced;
- Other places stayed during separations from current partner;
- Other places stayed during separations from previous partner; and
- Other places stayed when left previous partner.
For the coding of 'Other emotional abuse behaviours', office staff assessed whether the stated response could be re-coded into an existing response category. Where this was possible, responses were manually re-coded.
For the coding of 'Other places stayed', office staff assessed whether the response could be re-coded into an existing response category. Responses that could not be re-coded were coded to a specific 'other' category.
EDIT CHECKS
During office processing of the data, checks were performed on records to ensure that specific values lay within valid ranges and that relationships between items were within limits deemed acceptable for the purposes of the survey. This was in addition to edits which were triggered in the instrument at the time of the interview. These checks were also designed to detect errors which may have occurred during response entry and processing and to identify cases which, although not necessarily errors, were sufficiently unusual or close to agreed limits to warrant examination.
Data available from the survey are essentially ‘as reported’ by respondents. In some cases it was possible to correct errors or inconsistencies in the originally recorded data by reference to other data in the record, including interviewer comments; in other cases this was not possible and some errors and inconsistencies remain on the data file.
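The kinds of edit described in this section (range checks, relationship checks, and queries on unusual but possibly valid values) can be sketched as below. This is an illustrative sketch only: the field names, limits and messages are assumptions and do not reproduce the edits actually applied to the survey.

```python
def run_edit_checks(record):
    """Return a list of edit messages for one record (empty if it passes)."""
    messages = []
    age = record["age"]
    age_first = record.get("age_at_first_incident")  # hypothetical item

    # Range edit: the value must lie within a valid range (limits are illustrative).
    if not 18 <= age <= 120:
        messages.append("FATAL: age outside valid range")

    # Relationship edit: two items on the record must be mutually consistent.
    if age_first is not None and age_first > age:
        messages.append("FATAL: incident age exceeds current age")

    # Query edit: not necessarily an error, but unusual enough to examine.
    if age >= 100:
        messages.append("QUERY: unusually high age, check interviewer comments")

    return messages


print(run_edit_checks({"age": 30, "age_at_first_incident": 40}))
```

A record returning a query message would be examined, for example against interviewer comments, rather than changed automatically.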
Validation checks
The output data file was extensively validated through item-by-item examination of input and output frequencies, checking populations through derivations, internal consistency of items and across levels of the file, data confrontation, etc. Despite these checks, it is possible that some small errors remain undetected on the file.
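One of the validation steps named above, the comparison of input and output frequencies for an item, can be sketched as follows. The function name and sample values are hypothetical; this shows the general idea rather than the procedure actually used.

```python
from collections import Counter


def frequency_differences(input_values, output_values):
    """Return {category: (input_count, output_count)} for categories whose counts differ."""
    before, after = Counter(input_values), Counter(output_values)
    return {
        category: (before[category], after[category])
        for category in sorted(set(before) | set(after))
        if before[category] != after[category]
    }


# Illustrative item values before and after processing.
diffs = frequency_differences(["a", "b", "b"], ["a", "b", "c"])
print(diffs)
```

An empty result indicates that the item's category counts are unchanged between input and output; any entry returned would need to be explained by a known derivation or correction.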
OUTPUT DATASETS
Information from the survey is stored electronically in the form of data items. In some cases, items were formed directly from individual survey questions while in others, items were derived from answers to several questions.
The output datasets from the 2012 PSS are hierarchical in nature and contain 9 different levels. A hierarchical file is an efficient means of storing and retrieving information which describes one to many, or many to many, relationships.
The structure of the PSS output datasets is as follows:
At the top level there is a:
- Person level - contains socio-demographic information about the respondent and (if applicable) their partner including income, education, labour force and language information as well as information about the respondent's general feelings of safety, self assessed health status, disability status and social connectedness.
Beneath the person level, there are 8 further levels (linked to the person level):
- Violence prevalence level - contains information about a respondent's experience of violence since the age of 15. The time frame of the most recent incident experienced is available on this level for each of the 8 types of violence.
- Violence prevalence since age 15 level - contains information about a respondent's experience of violence since the age of 15. More detailed perpetrator type information is available on this level, however the time frame of the most recent incident experienced is not available on this level for each perpetrator type.
- Violence most recent incident level - contains detailed characteristics about a respondent's most recent incident of a total of 8 types of violence (physical/sexual assault/threat by a male/female perpetrator).
- Violence partner level - contains detailed information about a respondent's experience of violence by a current partner and/or most recently violent previous partner.
- Emotional abuse by a partner level - contains information about the respondent's experience of emotional abuse by current and/or most recently emotionally abusive male/female previous partner since the age of 15.
- Abuse before the age of 15 level - contains information about the respondent's experience of physical and/or sexual abuse before age 15.
- Sexual harassment level - contains information about the respondent's experience of selected types of sexual harassment.
- Stalking level - contains detailed information about the respondent's experience of stalking by a male and/or female perpetrator during their lifetime.
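The one-to-many link between the person level and a lower level can be sketched as below. This is a minimal illustration assuming each lower-level record carries the identifier of the person it belongs to; the identifier name, field names and records are hypothetical and are not actual data items from the file.

```python
from collections import defaultdict

# Hypothetical records: field names and values are illustrative only.
persons = [{"person_id": 1, "state": "NSW"}, {"person_id": 2, "state": "VIC"}]
lower_level = [
    {"person_id": 1, "item": "record A"},
    {"person_id": 1, "item": "record B"},
]


def link_level(parents, children, key="person_id"):
    """Group lower-level records under the identifier of their person-level record."""
    grouped = defaultdict(list)
    for child in children:
        grouped[child[key]].append(child)
    # Persons with no records on the lower level map to an empty list.
    return {parent[key]: grouped.get(parent[key], []) for parent in parents}


linked = link_level(persons, lower_level)
print(linked)
```

Here person 1 has two records on the lower level and person 2 has none, which is the kind of one-to-many relationship a hierarchical file stores without repeating the person-level items on every lower-level record.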
Data about households and families are contained as individual characteristics on the person level. A comprehensive list of data items available on each level from the survey is available in the Downloads Tab.