Latest release

Index of Household Advantage and Disadvantage (IHAD): Technical Paper

IHAD provides a summary measure of relative socio-economic advantage and disadvantage at the household level, using Census data

Reference period
2021
Released
11/02/2025

What is IHAD?

The Index of Household Advantage and Disadvantage (IHAD) is an analytical product developed by the Australian Bureau of Statistics (ABS) that summarises relative socio-economic advantage and disadvantage for households.

A household is defined as one or more persons, at least one of whom is at least 15 years of age, usually resident in the same private dwelling. All occupants of a dwelling form a household. For Census purposes, the total number of households is equal to the total number of occupied private dwellings (Census of Population and Housing: Census Dictionary, 2021). This report will refer to ‘household’ rather than dwelling, as the index is designed to represent advantage and disadvantage at household level. 

IHAD uses information from the 2021 Census of Population and Housing (2021 Census) on the characteristics of each household and the people living within them.

In comparison, the ABS's Socio-Economic Indexes for Areas (SEIFA) provides index values calculated for each geographic area, not for each household. Within any area there are likely to be households whose characteristics differ from those generally measured across the overall population of that area. For example, a relatively advantaged area is likely to contain households that are relatively advantaged; however, the same area is also likely to contain some households that are relatively disadvantaged.

The IHAD data complement the area level rankings given by the SEIFA Index of Relative Advantage and Disadvantage (IRSAD), by allowing the relationship between area level disadvantage/advantage and household level disadvantage/advantage to be explored. IHAD quantiles summarise the diversity of area level advantage and disadvantage at household level, adding value to the use of SEIFA IRSAD for research and planning at the area level.

Because IHAD is derived at household level it can be cross classified with other Census variables not included in the index. This could be used to assist in exploring advantage and disadvantage for different population groups. 

Some common uses of IHAD include:

  • assisting research into the relationship between socio-economic disadvantage and various social outcomes, for different demographic groups, and
  • assisting decision making in funding and services to areas, alongside other relevant data sources. 

Purpose of technical paper

This paper provides information on the concepts, data, and methods used to create IHAD 2021. The paper also contains discussion of the recommended methods for interpretation and use of the index.

This paper is intended to be a comprehensive reference for IHAD 2021. Refer to the Methodology paper for basic information that has been prepared for a general audience.

This paper uses terminology relating to the 2021 Census data. The Census of Population and Housing: Census dictionary, 2021 provides a glossary and explanations of the 2021 Census variables. 

Historic context

The Index of Household Advantage and Disadvantage (IHAD) was first produced as an experimental index in 2018, based on 2016 Census data and funded by the ACT government. IHAD 2021 is the second release of this household level index and is funded by the Australian Government Department of Education. Future releases of an IHAD product by the ABS are contingent on interest and funding from stakeholders. IHAD is not part of the ABS ongoing work program.

Features of IHAD 2021

This section highlights some important features of IHAD 2021, and how they compare with the Experimental IHAD 2016.

The ABS has aimed to maintain consistency between IHAD 2021 and the previous release. However, some changes have been made and are described below.

Updated geography standard

IHAD 2021 uses the Australian Statistical Geography Standard (ASGS) Edition 3 (2021). The structure of the ASGS Edition 3 is similar to the structure of ASGS Edition 2 (2016), though there have been updates to SA1, SA2, and other ASGS boundaries in some areas. For more information about the ASGS, refer to Changes from the previous edition of the ASGS.

Variables underpinning IHAD

Occupation variables for the Experimental IHAD 2016 were based on the Australian and New Zealand Classification of Occupations, 2013 (ANZSCO), version 1.2A. For 2021, the updated version, ANZSCO version 1.3, was used, resulting in some changes to skill level and some title changes. Variables using cut-off values in their definitions, such as high and low income, were updated to use new cut-off values. For more information about how the cut-off values were selected, refer to the Description of candidate IHAD variables.

The ‘Does any member of this household access the internet from this dwelling?’ question was not asked in the 2021 Census, so the NONET variable (derived from the output NEDD - Dwelling Internet Connection variable) could not be used when calculating the IHAD 2021 scores.

Output

IHAD output includes a general introduction to IHAD 2021, a basic Methodology, this Technical Paper, and data which can be sourced from:

Output includes:

  • proportions of households in each IHAD quartile,
  • usual resident population,
  • SEIFA IRSAD quartile (where applicable),
  • occupied private dwelling count (where applicable), and
  • dwelling count (where applicable).

Where applicable, outputs are consistent with the release of these items in the SEIFA product (Socio-Economic Indexes for Areas (SEIFA), Australia, 2021) and in Census of Population and Housing: Mesh Block Counts, 2021. For SA1 and SA2 outputs, areas without a SEIFA score will also not be assigned IHAD quantile proportions.

Sums of IHAD quartile 1 through 4 proportions for each area (i.e. row) may equal more than 100% due to rounding and random adjustments made to the data. When calculating proportions, percentages, or ratios from tables of cross-classified or small geographic areas, the random error introduced can be ignored except when cells containing very small numbers of households are involved, in which case the impact on percentages and ratios can be significant. For more information see Introduced random error / perturbation.

Geography Available

Data cubes are available from the Data downloads section:

  • Statistical Areas Level 1: Household level; the percentage of households within each IHAD quartile;
  • Statistical Areas Level 2: Household level; the percentage of households within each IHAD quartile;
  • State and territory: Person level; the percentage of persons within each IHAD quartile.

Census TableBuilder Pro is an online data tool that enables you to cross classify indexes and household indicators with all the levels of geography available.

Interpretation of IHAD

To set some context for the rest of this paper, it is worth briefly touching on some important characteristics of the index.

The IHAD measures are assigned to households, not to individuals. They indicate the collective socio-economic characteristics of the people living in a household.

As a measure of socio-economic conditions, the index can be used to understand the distribution of these conditions across different households. Index scores are on an arbitrary scale. The scores do not represent some quantity of advantage or disadvantage. For example, we cannot infer that a household with an index score of 1000 is twice as advantaged as a household with an index score of 500.

IHAD is constructed as a weighted combination of selected variables. The index is dependent on the set of variables chosen for the analysis. A different set of underlying variables would result in a different index.

The index is primarily designed to compare the relative socio-economic characteristics of households at a given point in time. It can be very difficult to perform useful longitudinal or time series analysis, and this sort of analysis should be undertaken with considerable care.

There is more discussion of these points in Using and Interpreting IHAD.

Conceptual framework

The concept of relative socio-economic advantage and disadvantage

For IHAD 2021, the concept of household level socio-economic advantage and disadvantage is the same as that used in the Experimental IHAD 2016. That is, the ABS broadly defines relative household socio-economic advantage and disadvantage in terms of the individual access to resources of people living within households and their ability to collectively share these resources in order to participate in society (Wise and Williamson, 2013). This concept is described as ‘broadly defined’ in recognition of the many concepts that have emerged in the literature to describe advantage and disadvantage.

In most households, members pool their income and resources and share similar living characteristics. A household can still be advantaged overall as a unit even if it contains some less advantaged members. This support isn’t restricted to economic aspects; for example, children within households containing people with higher levels of education may be more advantaged in educational outcomes than children in households containing people with lower levels of education.

Since relative socio-economic advantage and disadvantage is a complex and multidimensional concept, it is difficult to condense into a single index with a manageable, accessible framework. The limitations of the data collected in the Census also place restrictions on the scope of the notion of advantage and disadvantage available to be used. The most important elements covered by SEIFA, the IHAD, and other socio-economic indexes developed around the world, include income, education, employment, occupation, housing, and family structure. Variables have been selected from these dimensions and are discussed further in the Description of candidate IHAD variables.

It is important to consider both what a socio-economic index measures and what it does not measure. This helps clarify the ability of the index to accurately measure household relative socio-economic advantage and disadvantage (Wise and Mathews, 2011). Defining the concept behind IHAD provides more information on the index.

Defining the concept behind IHAD

This section gives a description of the concept behind IHAD. For a list of the variables included, refer to the Technical details for IHAD: variables and loadings.

The IHAD

The IHAD summarises information about the economic and social conditions of people within households, including both relative advantage and disadvantage measures.

Scores

Household scores are created by adding together the weighted measures of the characteristics of that household. The scores for all households are then standardised to a distribution where the mean equals 1,000 and the standard deviation is 100.
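The standardisation step can be illustrated with a minimal sketch. The raw scores below are made-up values, the function name is hypothetical, and the use of the population standard deviation is an assumption (the paper does not state which form was used).

```python
from statistics import fmean, pstdev

def standardise_scores(raw_scores, target_mean=1000.0, target_sd=100.0):
    """Rescale raw scores linearly to the target mean and standard deviation."""
    mean = fmean(raw_scores)
    sd = pstdev(raw_scores)  # population SD: an assumption for this sketch
    if sd == 0:
        raise ValueError("scores have no variation")
    return [target_mean + target_sd * (x - mean) / sd for x in raw_scores]

# Hypothetical raw weighted sums for five households.
raw = [-2.1, -0.4, 0.0, 0.7, 1.8]
index_scores = standardise_scores(raw)
```

Because the rescaling is linear, the ordering of households is unchanged; only the scale of the scores changes.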

A low score indicates relatively greater disadvantage and a lack of advantage in general. A high score indicates a relative lack of disadvantage and greater advantage in general. 

Scores are an ordinal measure on an arbitrary scale and do not represent the quantity of advantage or disadvantage. Therefore, it is not accurate to say a household with a score of 1,000 is twice as advantaged as a household with a score of 500.

The individual household level IHAD scores are not available as outputs. Quantiles (deciles and quartiles) have been provided for analytical purposes.

Deciles

Every household is ordered from lowest to highest score, with the lowest 10 per cent of households given a decile number of one, the next lowest 10 per cent of households given a decile number of two and so on, up to the highest 10 per cent of households which are given a decile number of 10. This means that households are divided up into 10 equal sized groups, based on their score, with decile 1 representing the most disadvantaged households and decile 10 representing the most advantaged households. In practice these groups won’t each contain exactly 10% of households as it depends on the distribution of the IHAD scores. Note that the groups will have an approximately equal number of households, not an approximately equal number of persons.

Quartiles

Every household is ordered from lowest to highest score, with the lowest 25 per cent of households given a quartile number of one, the next lowest 25 per cent is given a quartile number two and so on, up to the highest 25 per cent of households which are given a quartile number of 4. This means that households are divided up into four equal sized groups, based on their score, with quartile 1 representing the most disadvantaged households and quartile 4 representing the most advantaged households. In practice these groups won’t each contain exactly 25% of households as it depends on the distribution of the IHAD scores. Note that the groups will have an approximately equal number of households, not an approximately equal number of persons.
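The assignment of deciles and quartiles described above can be sketched as follows. This is an illustration only: the scores are made up, the function name is hypothetical, and the rule that tied scores share a quantile number is an assumption chosen to show why groups need not contain exactly equal shares of households.

```python
def assign_quantiles(scores, n_groups):
    """Return a quantile number (1..n_groups) for each score, lowest scores first.

    Tied scores share a quantile number, so group sizes are only
    approximately equal.
    """
    n = len(scores)
    # For each distinct score, count the households with a strictly lower score.
    first_rank = {}
    for rank, score in enumerate(sorted(scores)):
        first_rank.setdefault(score, rank)
    return [first_rank[s] * n_groups // n + 1 for s in scores]

# Hypothetical household scores containing ties.
scores = [900, 900, 900, 1000, 1100, 1100, 1200, 1300]
quartiles = assign_quantiles(scores, 4)
deciles = assign_quantiles(scores, 10)
```

In this toy input the three tied households at 900 all fall in quartile 1, so quartile 1 holds three of the eight households rather than exactly two.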

The data underpinning IHAD

This chapter looks at the data used to construct IHAD 2021. All data is from the 2021 Census of Population and Housing.

The candidate list of variables

Variables from the Census were included in the initial candidate variable list for IHAD if they were deemed to be related to the definition of advantage and disadvantage that the IHAD is intending to capture. The same candidate variable list from the Experimental IHAD 2016 was used for IHAD 2021, excluding the dwelling internet connection variable as noted in Variables underpinning IHAD. The candidate variables fall into a multi-dimensional framework. The dimensions are: 

  • housing,
  • family,
  • education,
  • occupation, and
  • miscellaneous.

Constructing the variables

Specifications

IHAD is constructed from 2021 Census data, with variables derived as binary indicators. Variables typically relate to persons but also relate to families or dwellings. Family and person level variables have been derived at the household level.

For example, for the candidate variable ‘households where the person with the highest educational attainment has a Bachelor Degree or above’, the highest qualification for all in scope persons in the household is considered and if one person has a Bachelor Degree or higher, the derived variable has a value of 1. If no people in the household have a Bachelor Degree or higher, the value is 0. For the candidate variable ‘households where all people aged 15 years and over are unemployed’, if all in scope people aged 15 years and over in the household have labour force status unemployed, the derived variable will have a value of 1, otherwise it will have a value of 0.

In most cases, the indicator specifications were based on the Experimental IHAD 2016 specifications. Some minor changes were made to reflect updates to the Census 2021 variable coding. The Appendix contains detailed descriptions of the indicator specifications used for all the IHAD variables.
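As a rough sketch of how person level records could be rolled up into household level binary indicators, consider the two examples above. The record layout, field names, and category codes are hypothetical and do not reflect actual Census coding.

```python
# Hypothetical qualification codes treated as 'Bachelor Degree or above'.
BACHELOR_OR_ABOVE = {"bachelor", "graduate_diploma", "postgraduate"}

def degree_indicator(persons):
    """1 if at least one in-scope person holds a Bachelor Degree or above, else 0."""
    return int(any(p["qualification"] in BACHELOR_OR_ABOVE for p in persons))

def all_unemployed_indicator(persons):
    """1 if every in-scope person aged 15 and over is unemployed, else 0."""
    adults = [p for p in persons if p["age"] >= 15]
    return int(bool(adults) and all(p["labour_force"] == "unemployed" for p in adults))

# One made-up household: two adults and one child.
household = [
    {"age": 42, "qualification": "bachelor", "labour_force": "unemployed"},
    {"age": 40, "qualification": "certificate_iii", "labour_force": "unemployed"},
    {"age": 9, "qualification": None, "labour_force": None},
]
indicators = {
    "DEGREE": degree_indicator(household),
    "ALL_UNEMPLOYED": all_unemployed_indicator(household),
}
```

Here both indicators take the value 1: one adult holds a degree, and both people aged 15 and over are unemployed. The child is ignored by the labour force indicator.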

Scope

The scope of the IHAD is private dwellings that were occupied on Census Night. Non-classifiable occupied private dwellings (e.g. dwellings that only contained visitors) and unoccupied private dwellings were excluded. This accounted for approximately 1.6 million dwellings or 14.5% of all private dwellings. Approximately 1.0 million (9.6%) were unoccupied private dwellings; 0.5 million (4.9%) were non-classifiable occupied private dwellings. Non-private dwellings, offshore, migratory, and shipping were also excluded. Note that residents in boarding houses and hostels are not included as these are classified as non-private dwellings.

Census (all private dwellings - in scope dwellings)  Excluded from index     %
10.85 - 9.28 million dwellings                       1.58 million dwellings  14.5

Persons temporarily away from home

The IHAD is calculated based on the characteristics of persons who are both usually resident in a household and enumerated in that household on Census Night. If all usual residents of a household aged 15 or more were away from home on Census Night, that dwelling would be out of scope of the IHAD.

Persons temporarily overseas on Census Night are out of scope of the Census, and thus Census data is not available for those persons. Persons staying elsewhere in Australia are in scope of the Census, but they are not able to be associated back to their dwelling of usual residence, and therefore their characteristics as measured in the Census are not able to be used in the derivation of the household level variables used in the index.

If one or more usual residents were away, but at least one person was at home on Census Night, then that dwelling remains in scope of the IHAD and an index value would be calculated for that household. However, the persons temporarily away from that dwelling would not have their characteristics contributing to the index value for that household; only those persons present will. This may result in a different level of advantage or disadvantage being calculated for that dwelling than would have been the case had all persons usually resident in that dwelling been at home on Census Night. Around 4.8% (0.44 million dwellings) of in scope dwellings had one or more usual residents away from home on Census Night.

An example of this situation would be a one family couple household with two adults usually resident. One member of the couple is unemployed and was at home on Census Night, while the other person was travelling for work-related reasons. The person characteristics used in the calculation of the IHAD are based only on the characteristics of the (unemployed) person that was home on Census Night.

Exclusions

Rules for the minimum number of persons and dwellings required for an area to receive an index score have been a feature of SEIFA since its inception following the 1986 Census. In the Statistical Areas Level 1 (SA1) and Statistical Areas Level 2 (SA2) data cubes available in the Data downloads section, IHAD quartile percentages will not be provided for SA1s or SA2s that do not have a SEIFA score. Refer to Areas without a SEIFA score for more information about these excluded areas.

Missing responses

Data quality considerations for construction of an index at household level centre on the level of non-response to Census questions. Overall non-response was relatively low (around 0-6%) and fairly consistent across candidate variables, with the exception of equivalised total household income (HIED) and level of highest educational attainment (HEAP) (around 7-11%). Please refer to the Census of Population and Housing: Census Dictionary, 2021 for details about these variables.

Due to partial non-response from some Census respondents, some households could not be included in the IHAD construction without some action to account for missing values within candidate variables. For example, for the candidate variable ‘Households where all people aged 15 years and over are unemployed’, if someone within the household does not indicate their labour force status, it may not be possible to assign a value for this candidate variable.

Two actions to deal with missing data have been applied:

  • removal of households with high numbers of non-response
  • imputation of missing values

Removal of households with high numbers of non-response

Households with 10 or more missing candidate variable responses have been removed. This number is consistent with the Experimental IHAD 2016 and was chosen because it tended to correspond to dwellings where most person-based variables were coded as 'not stated' (Wise and Williamson, 2013). Approximately 0.5% of in scope households (43,726) were removed; the proportion of households with 3 or fewer missing candidate variable responses was approximately 97.5%.
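The removal rule amounts to a simple filter on the count of missing responses per household. The sketch below uses hypothetical household identifiers and variable names, with None standing in for a 'not stated' response.

```python
MISSING = None    # stand-in for a 'not stated' response
MAX_MISSING = 10  # households with this many or more missing responses are removed

def count_missing(record):
    """Number of candidate variables with a missing response."""
    return sum(1 for value in record.values() if value is MISSING)

def drop_high_nonresponse(households, threshold=MAX_MISSING):
    """Keep only households with fewer than `threshold` missing responses."""
    return {hid: rec for hid, rec in households.items()
            if count_missing(rec) < threshold}

# Twelve made-up candidate variables and three made-up households.
variables = [f"VAR_{i}" for i in range(12)]
households = {
    "H1": {v: 1 for v in variables},                                                  # 0 missing
    "H2": {**{v: 0 for v in variables[:9]}, **{v: MISSING for v in variables[9:]}},   # 3 missing
    "H3": {**{v: 0 for v in variables[:2]}, **{v: MISSING for v in variables[2:]}},   # 10 missing
}
kept = drop_high_nonresponse(households)
```

Only H3 is dropped; H2 stays in the dataset and its three missing responses would be handled by imputation.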

Imputation of missing values

Wise and Williamson (2013) noted that if ‘not stated’ responses are grouped with records that do not have a particular disadvantaging characteristic, then there is an implicit advantage being assigned to those individuals. They recommended that imputation should be performed where appropriate.

Missing values for household, family, and person level Census variables that were required have been imputed. The method used randomly assigned missing responses for a given variable to one of the allowed responses, based on the frequency proportions for the variable at the national level. As a result, the distribution of the imputed responses for most of the variables being treated aligned within reasonable bounds with the original distribution of non-missing responses.
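A minimal sketch of this kind of proportional random imputation is shown below. The response categories, counts, and seed are made up, and the function name is hypothetical; the sketch simply draws each missing value from the observed categories with probability proportional to their frequency.

```python
import random
from collections import Counter

def impute_from_observed_distribution(values, rng):
    """Replace None entries with draws weighted by observed response frequencies."""
    counts = Counter(v for v in values if v is not None)
    categories = list(counts)
    weights = [counts[c] for c in categories]
    return [v if v is not None else rng.choices(categories, weights=weights, k=1)[0]
            for v in values]

rng = random.Random(2021)  # fixed seed so the sketch is repeatable
# Made-up responses for one variable, treated here as the national distribution.
responses = ["owned"] * 30 + ["purchased"] * 35 + ["rented"] * 30 + [None] * 5
imputed = impute_from_observed_distribution(responses, rng)
```

Observed responses are left untouched, and with enough missing records the imputed values would be expected to follow roughly the same proportions as the observed ones.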

This was also true for HIED (equivalised total household income) at the national level. However, within household composition categories the distribution of imputed values was more variable. By state, the rate of non-response for HIED was between 5% and 11%; across household compositions it ranged from 3% to 19%, with group and multiple family households having the higher values within that range. Similar results were observed for HEAP (level of highest educational attainment) by state and by ten-year age groups.

Due to this variability and consistent with the Experimental IHAD 2016, hot-deck imputation was applied to HIED and HEAP. This involved assigning a value for the missing HIED/HEAP variable from a donor that matched the recipient’s values for a selection of other Census variables. For HIED the matching variables were household composition, state or territory, SA2, and number of adults and employed people in the household; for HEAP the matching variables were age in 10 year groups, state or territory, SA2, level of non-school qualification, highest year of school completed, total personal weekly income and labour force status. The distribution of the imputed responses for each of the variables improved from the initial imputation approach and was deemed to be sufficient for the purpose of constructing the IHAD.
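The idea of hot-deck imputation can be sketched as below. This is a simplified illustration, not the ABS implementation: the records, field names, and the choice to leave a value missing when no donor matches are all assumptions, and only two matching variables are used for brevity.

```python
import random
from collections import defaultdict

def hot_deck_impute(records, target, match_keys, rng):
    """Fill missing `target` values from a random donor that shares all `match_keys`."""
    donors = defaultdict(list)
    for rec in records:
        if rec[target] is not None:
            donors[tuple(rec[k] for k in match_keys)].append(rec[target])
    result = []
    for rec in records:
        filled = dict(rec)  # copy so the input records are not modified
        if filled[target] is None:
            pool = donors.get(tuple(rec[k] for k in match_keys))
            if pool:  # assumption: leave the value missing when no donor matches
                filled[target] = rng.choice(pool)
        result.append(filled)
    return result

rng = random.Random(0)
# Made-up household records; 'HIED' holds an income band or None when not stated.
households = [
    {"composition": "couple", "state": "NSW", "HIED": "high"},
    {"composition": "couple", "state": "NSW", "HIED": None},
    {"composition": "group", "state": "VIC", "HIED": "low"},
    {"composition": "group", "state": "VIC", "HIED": None},
    {"composition": "lone", "state": "TAS", "HIED": None},
]
filled = hot_deck_impute(households, "HIED", ["composition", "state"], rng)
```

Matching on more variables makes donors more similar to recipients but shrinks each donor pool, which is why a fallback for unmatched recipients would be needed in practice.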

Description of candidate IHAD variables

This section contains a description of each variable on the candidate variable list. The tables containing the variable descriptions also state whether the variable is an indicator of relative advantage (adv) or relative disadvantage (dis). Each subsection corresponds to one of the socio-economic dimensions listed in The candidate list of variables. The candidate list includes all variables considered for inclusion in IHAD before the principal component analysis (PCA) stage. The final list of variables included in IHAD can be found in the Technical details for IHAD: variables and loadings.

Housing variables

List of household variables

Variable      Variable description
NOCAR         Households with no car (dis)
HIGHCAR       Households with three or more cars (adv)
FEWBED        Households with one or no bedrooms (dis)
HIGHBED       Households with four or more bedrooms (adv)
OVERCROWD     Households requiring one or more extra bedrooms (based on Canadian National Occupancy Standard) (dis)
SPAREBED      Households with one or more bedrooms spare (based on Canadian National Occupancy Standard) (adv)
OTHER_HHLD    Households with a structure classified as "other" (e.g. caravan, tent) (dis)
MULTI_FAMILY  Multi-family households (adv)
LOWRENT       Households where rent payments are less than $250 per week, excluding employer landlords (excludes $0) (dis)
HIGHRENT      Households where rent payments are more than $500 per week (adv)
PUBLIC_RENT   Households being rented from a state or territory housing authority, or a housing co-operative/community/church group (dis)
OWNED         Households owned outright (adv)
PURCHASED     Households being purchased (adv)
HIGHMORTGAGE  Households where mortgage repayments are greater than or equal to $2,900 per month (adv)
AREA_RVR      Households in remote/very remote area (dis)
AREA_MC       Households in major cities (adv)

The cut-off values that are used to determine which dwellings are considered to have high or low income, mortgage repayments, and rent, mostly align with those used for 2021 SEIFA. These were updated, based on the most recent Census, to reflect real-world changes. For the mortgage and rent variables, the high value cut-off captures the 9th and 10th deciles while the low value cut-off captures the 1st and 2nd deciles.

Family variables

List of family variables

Variable      Variable description
ONEPARENT     Households with a one-parent family, with dependent children only (dis)
CHILDJOBLESS  Households with children aged under 15 years and parent(s) not employed (dis)

Education variables

List of person variables - education

Variable             Variable description
NOYEAR11_OR_HIGHER   Households where the person with the highest educational attainment left school at year 10 or below, including those who did not go to school and with Certificate level I or II (excludes those currently studying secondary education) (dis)
YEAR11               Households where the person with the highest educational attainment left school at year 11 (excludes those currently studying secondary education) (dis)
CERTIFICATE          Households where the person with the highest educational attainment has a Certificate III or IV (adv)
DIPLOMA              Households where the person with the highest educational attainment has an Advanced Diploma or Diploma (adv)
DEGREE               Households where the person with the highest educational attainment has a Bachelor Degree or above (adv)
DEGREE_DEPENDENT*    Households with at least one dependent child and the person with the highest educational attainment has a Bachelor Degree or above (adv)
NOYEAR12_DEPENDENT*  Households with at least one dependent child and the person with the highest educational attainment left school at year 11 or below, including those who did not go to school and with Certificate level I or II (excludes those currently studying secondary education) (dis)

* Combining education level with dependent children represents the concept of household level advantage/disadvantage to children from having or not having educated parents. Dependent children are derived using CDCF (Counts the number of dependent children in the family). A dependent child is a person who is either a child under 15 years of age, or a dependent student aged 15-24 years. 

Occupation variables

List of person variables - occupation

Variable            Variable description
INC_LOW             Households with low annual equivalised income (between $1 and $25,999) (dis)
INC_HIGH            Households with high annual equivalised income (greater than $90,999) (adv)
ALL_UNEMPLOYED      Households where all people aged 15 years and over are unemployed (dis)
HIGH_SKILL          Households where the highest skilled employed adult works in a skill level 1 occupation (adv)
SKILL_LVL_2         Households where the highest skilled employed adult works in a skill level 2 occupation (adv)
SKILL_LVL_4         Households where the highest skilled employed adult works in a skill level 4 occupation (dis)
LOW_SKILL           Households where the highest skilled employed adult works in a skill level 5 occupation (dis)
ALL_SHORT_DISTANCE  Households where all people aged 15 years and over who are employed, travel 0 to less than 2.5 km to work (adv)
ALL_LONG_DISTANCE   Households where all people aged 15 years and over who are employed, travel 50 to less than 250 km to work (dis)
ALL_VLONG_DISTANCE  Households where all people aged 15 years and over who are employed, travel 250 or more km to work (dis)

Miscellaneous variables

List of person variables - miscellaneous

Variable            Variable description
SEP_DIVORCED        Households with one or more people aged 15 years and over separated or divorced (dis)
ENGPOOR             Households with one or more people aged 15 years and over who do not speak English well (dis)
ROM                 Households with one or more people aged 15 years and over who arrived in Australia in the last 10 years (dis)
UNENGAGED_YOUTH     Households with one or more people aged between 15 and 24 years who are not working or studying (dis)
DISABILITY_UNDER70  Households with one or more people aged under 70 years who need assistance with core activities (dis)
DISABILITY_HH_PROP  Households where more than 50% of people need assistance with core activities (dis)
CARER               Households with one or more people aged 15 years and over who provide unpaid assistance to a person with a disability (dis)
VOLUNTEER           Households with one or more people aged 15 years and over who do voluntary work for an organisation or group (adv)
RETIRED_NOT_OWNED   Households with a person aged 65 years and over who does not own the home, or occupy it under a life tenure scheme (dis)

Construction of IHAD

This chapter describes the methods used to construct IHAD, some important technical specifications and basic outputs.

Principal Component Analysis

Principal component analysis (PCA) has been used since the first release of SEIFA to summarise Census variables related to socio-economic advantage and disadvantage. The same methodology is used to create the IHAD, modified where necessary to use binary variables. The aim of PCA is to reduce a large number of correlated variables into a smaller set of transformed variables, called "principal components". Each component is a weighted linear combination of the original variables. It is possible to extract as many components as there are variables. If the original variables are highly correlated, much of the variation can be summarised by a single principal component.

The first principal component is the weighted linear combination of variables that captures the maximum amount of variation present in the original dataset. This is calculated using the correlations between the variables. In general, variables that are strongly correlated with many others in the list will receive high weights. The first principal component is used to create the IHAD index.

The PCA used the binary candidate variables and the correlation matrix of these variables to give an indication of how significantly each variable contributes to the measurement of the unobserved latent variable of interest, namely socio-economic advantage and disadvantage. Each variable receives a loading that indicates the correlation of that variable with the index. A positive loading indicates an advantaging variable whereas a negative loading indicates a disadvantaging variable. The variables with the highest loadings are the ones that have the highest correlation with the index value.

Polychoric correlations were used instead of the standard Pearson correlations for the correlation matrix; this is appropriate for binary variables to ensure the correlation coefficients used in the PCA are unbiased. Using polychoric correlations is considered to be more accurate when running a PCA on discrete data such as the binary variables used in the IHAD.

The candidate variables listed in Description of candidate IHAD variables were used in the PCA for the IHAD and removed if their loading was less than or equal to 0.3 in absolute value, on the grounds that they were not particularly strong indicators of advantage or disadvantage. This process was performed iteratively, until all of the variables had a loading above 0.3 in absolute value. This is the same procedure used to create the SEIFA. The final variables and their loadings following this process are presented in the Technical details for IHAD: variables and loadings.

The first principal component scores were derived by taking the product of each standardised variable with its respective weight, then taking the sum. For convenience and consistency with the approach taken for SEIFA, these raw component scores were then standardised to a mean of 1,000 and a standard deviation of 100 to produce the index.

The sign of the PCA weights is arbitrary, but intuitively we want more disadvantaged households to have lower scores. For example, NOCAR is a disadvantage variable and so should have a negative weight. The weights were multiplied by -1 to give advantage indicators positive weights and loadings, and disadvantage indicators negative weights and loadings. Accordingly, high index scores indicate relative advantage, and low index scores indicate relative disadvantage.
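The sketch below walks through the scoring steps on a tiny made-up dataset. It makes several simplifying assumptions that differ from the method described above: Pearson correlations stand in for polychoric correlations, power iteration stands in for a full eigendecomposition, the unit-length eigenvector is used directly as the weights, and the sign is flipped only if the reference disadvantage variable comes out positive. All function names are hypothetical.

```python
from math import sqrt
from statistics import fmean, pstdev

def standardise_columns(rows):
    """Convert each column to zero mean and unit (population) standard deviation."""
    cols = list(zip(*rows))
    means = [fmean(c) for c in cols]
    sds = [pstdev(c) for c in cols]
    return [[(x - m) / s for x, m, s in zip(row, means, sds)] for row in rows]

def correlation_matrix(z_rows):
    """Pearson correlations of already-standardised columns (polychoric stand-in)."""
    n, p = len(z_rows), len(z_rows[0])
    return [[sum(r[i] * r[j] for r in z_rows) / n for j in range(p)] for i in range(p)]

def first_component(corr, iterations=500):
    """Dominant unit-length eigenvector and its eigenvalue, by power iteration."""
    p = len(corr)
    v = [1.0 / (j + 1) for j in range(p)]  # uneven start vector
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    return v, eigenvalue

# Made-up binary indicators for eight households.
names = ["NOCAR", "DEGREE", "INC_HIGH"]
data = [
    [1, 0, 0], [1, 0, 0], [1, 0, 0], [0, 0, 0],
    [0, 1, 0], [0, 1, 1], [0, 1, 1], [0, 1, 1],
]

z = standardise_columns(data)
weights, eigenvalue = first_component(correlation_matrix(z))

# Orient the component so the disadvantage variable NOCAR has a negative weight.
if weights[names.index("NOCAR")] > 0:
    weights = [-w for w in weights]

# Loading = correlation between each variable and the first component.
loadings = {n: w * sqrt(eigenvalue) for n, w in zip(names, weights)}

# Raw score = sum of standardised variable x weight, then rescale to 1,000 / 100.
raw = [sum(w * x for w, x in zip(weights, row)) for row in z]
index_scores = [1000 + 100 * (r - fmean(raw)) / pstdev(raw) for r in raw]
```

In this toy data the households with no car and no advantaging characteristics receive the lowest scores, and the households with a degree and high income receive the highest.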

Step-by-step process

With the preceding two sections providing context, a step-by-step process for constructing IHAD is presented below:

1: Creating the initial variable list

Given the data available, we created a list of variables related to the definition of relative household socio-economic advantage and disadvantage.

2: Removing households with 10+ missing responses and imputing missing responses

We applied the IHAD scope to the dataset, and then identified households with 10 or more applicable missing responses. We removed these households from the dataset, imputed missing responses for most of the required variables, and then applied Hotdeck imputation for HIED and HEAP to create the dataset we used to construct the candidate variables.

3: Constructing the variables

We created binary indicators from household, family, and person level variables. These indicators take a value of 1 if the characteristic is present, and 0 if it isn’t.
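
As a minimal sketch of this step, the snippet below derives three of the housing indicators from household-level records, following the specifications in the appendix (NOCAR: VEHRD = 0, HIGHCAR: VEHRD = 3-4, FEWBED: BEDRD = 0-1). The record layout, the household identifiers, and the helper names `indicator` and `build_indicators` are assumptions made for illustration; they are not part of the ABS processing system.

```python
# Hypothetical household records. VEHRD and BEDRD are Census mnemonics from
# the appendix; the values and the "id" field are invented for illustration.
households = [
    {"id": "A", "VEHRD": 0, "BEDRD": 1},
    {"id": "B", "VEHRD": 3, "BEDRD": 4},
    {"id": "C", "VEHRD": 2, "BEDRD": 3},
]


def indicator(condition: bool) -> int:
    """Return 1 if the characteristic is present, otherwise 0."""
    return 1 if condition else 0


def build_indicators(hh: dict) -> dict:
    """Build binary indicators for one household record."""
    return {
        "NOCAR": indicator(hh["VEHRD"] == 0),         # no car
        "HIGHCAR": indicator(hh["VEHRD"] in (3, 4)),  # three or more cars
        "FEWBED": indicator(hh["BEDRD"] in (0, 1)),   # one or no bedrooms
    }


indicators = {hh["id"]: build_indicators(hh) for hh in households}
for hh_id, values in indicators.items():
    print(hh_id, values)
```

Household C has none of the three characteristics, so all of its indicators are 0.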

4: Removing very highly correlated variables

We removed highly correlated variables to avoid over-representing any specific socio-economic characteristic. When two variables had a correlation coefficient greater than 0.8 in absolute value and were measuring conceptually similar aspects of advantage or disadvantage, we generally removed one of them. However, we applied some discretion, depending on the variables in question and the size of the correlation.
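
The screening rule can be sketched as follows. Note the simplification: this example uses the ordinary Pearson correlation between binary columns, not the polychoric correlations described earlier, so the coefficients are only indicative. The data are invented, and `pearson` and `flag_pairs` are hypothetical helper names. The sketch only flags candidate pairs; deciding which variable to remove remains a judgement call, as described above.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def flag_pairs(columns, threshold=0.8):
    """List variable pairs whose correlation exceeds the threshold in absolute value."""
    flagged = []
    for a, b in combinations(sorted(columns), 2):
        r = pearson(columns[a], columns[b])
        if abs(r) > threshold:
            flagged.append((a, b, round(r, 2)))
    return flagged


# Invented indicator values for 12 households (one list entry per household).
columns = {
    "UNEMPLOYED":     [1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0],
    "ALL_UNEMPLOYED": [1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0],
    "NOCAR":          [1, 0, 0, 1, 0, 1, 0, 0, 1, 0, 0, 0],
}

flagged = flag_pairs(columns)
print(flagged)
```

In this toy data only the UNEMPLOYED and ALL_UNEMPLOYED pair is flagged for review.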

5: Conducting the initial PCA

We conducted principal component analysis (PCA) using the binary candidate variables and the correlation matrix of these variables, to obtain the loading for each variable on the first principal component.

6: Removing low loading variables

We excluded variables with loadings less than 0.3 in absolute value, on the grounds that they were not strong indicators of relative advantage or disadvantage. This limit is an accepted level in the PCA literature and has been used in past releases of SEIFA and IHAD. We removed variables one at a time, starting with the lowest loading variable.

7: Conducting PCA on the reduced list of variables

We conducted a PCA on the reduced variable list, and if any other variables loaded below 0.3, we repeated steps six and seven.
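
Steps five to seven form a loop, which the sketch below illustrates on a small, invented correlation matrix. Several assumptions apply: the matrix values are hypothetical, the first principal component is found by power iteration (a simple dependency-free stand-in for a full eigendecomposition), and the function names are illustrative only. Loadings are taken as the eigenvector entries multiplied by the square root of the eigenvalue, which matches the weight definition given in step eight.

```python
from math import sqrt


def first_component(corr, iterations=500):
    """Leading eigenvalue and unit eigenvector of a symmetric matrix (power iteration)."""
    p = len(corr)
    v = [1.0 / sqrt(p)] * p
    for _ in range(iterations):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    # Rayleigh quotient of the converged unit vector gives the eigenvalue.
    eigenvalue = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigenvalue, v


def loadings(corr, names):
    """Loading of each variable on the first principal component."""
    eigenvalue, v = first_component(corr)
    return {name: x * sqrt(eigenvalue) for name, x in zip(names, v)}


def prune(corr, names, cutoff=0.3):
    """Drop the weakest variable and re-run the PCA until all |loadings| >= cutoff."""
    names = list(names)
    corr = [row[:] for row in corr]
    while True:
        load = loadings(corr, names)
        weakest = min(names, key=lambda n: abs(load[n]))
        if abs(load[weakest]) >= cutoff:
            return names, load
        k = names.index(weakest)
        names.pop(k)
        # Remove row k and column k from the correlation matrix.
        corr = [[c for j, c in enumerate(row) if j != k]
                for i, row in enumerate(corr) if i != k]


# Hypothetical correlation matrix; VOLUNTEER is only weakly related to the rest.
names = ["INC_LOW", "NOCAR", "DEGREE", "VOLUNTEER"]
corr = [
    [1.00, 0.60, -0.50, 0.05],
    [0.60, 1.00, -0.40, -0.05],
    [-0.50, -0.40, 1.00, 0.10],
    [0.05, -0.05, 0.10, 1.00],
]

kept, load = prune(corr, names)
print(kept)
```

The sign of the resulting loadings is arbitrary at this stage; orienting them so that disadvantage indicators are negative is handled in step eight.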

8: Calculating and standardising component/index scores

We derived the first principal component scores for each household by taking the product of each variable with its respective weight, then taking the sum across all variables. Note that the weight for each variable was calculated by dividing the loading by the square root of the eigenvalue.

\(Z_{SA1} = \sum\limits_{j = 1}^{p} \frac{L_j}{\sqrt{\lambda}} \times X_{j,SA1}\)

where,

\(Z_{SA1}\) = raw score for the SA1

\(X_{j,SA1}\) = standardised value of the j-th variable for the SA1

\({{L_j}}\) = loading for the j-th variable

\(\lambda\) = eigenvalue of the principal component

\(p\) = total number of variables in the index

For convenience of presentation, we then rescaled the raw scores to a mean of 1,000 and standard deviation of 100 to create a new set of scores that are the household index scores in IHAD.

Note that the principal components are arbitrary with respect to their sign (positive or negative), so we set the sign of the weights and loadings so that they make intuitive sense. That is, we gave advantage indicators positive weights and loadings, and disadvantage indicators negative weights and loadings. Accordingly, high scores indicate relative advantage, and low scores indicate relative disadvantage. This is consistent with previous editions of SEIFA and IHAD.
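
Step eight can be sketched as below. The loadings, eigenvalue, and indicator data are invented for illustration, and `index_scores` is a hypothetical helper. Two further assumptions are made: variables and raw scores are standardised with the population standard deviation, and the sign of the component is oriented by checking a single known disadvantage variable (here NOCAR).

```python
from math import sqrt
from statistics import mean, pstdev


def standardise(values):
    """Rescale values to mean 0 and (population) standard deviation 1."""
    m, s = mean(values), pstdev(values)
    return [(v - m) / s for v in values]


def index_scores(columns, loadings, eigenvalue, disadvantage_var):
    """Weighted sum of standardised indicators, rescaled to mean 1,000 and SD 100."""
    # Flip the component if a known disadvantage variable has a positive loading.
    sign = -1.0 if loadings[disadvantage_var] > 0 else 1.0
    weights = {k: sign * l / sqrt(eigenvalue) for k, l in loadings.items()}
    z = {k: standardise(col) for k, col in columns.items()}
    n = len(next(iter(columns.values())))
    raw = [sum(weights[k] * z[k][i] for k in columns) for i in range(n)]
    m, s = mean(raw), pstdev(raw)
    return [1000 + 100 * (r - m) / s for r in raw]


# Invented indicators for five households (one list entry per household).
columns = {
    "NOCAR":   [1, 0, 0, 1, 0],
    "INC_LOW": [1, 1, 0, 0, 0],
    "DEGREE":  [0, 0, 1, 0, 1],
}
# Hypothetical PCA output with an arbitrary sign (disadvantage loads positively).
loadings = {"NOCAR": 0.82, "INC_LOW": 0.87, "DEGREE": -0.76}
eigenvalue = 2.0

scores = index_scores(columns, loadings, eigenvalue, disadvantage_var="NOCAR")
print([round(s) for s in scores])
```

After the sign flip, the first household (no car and low income) receives the lowest score, and the households whose only indicator is DEGREE score above 1,000.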

Technical details for IHAD: variables and loadings

This section gives the results of the principal component analysis carried out for IHAD, including variable loadings and percentage of variance explained. A list of variables initially considered for inclusion but removed due to high correlations with other variables or weak loadings is also provided.

IHAD summarises variables that indicate either relative socio-economic advantage or disadvantage, according to the concept described in Defining the concept behind IHAD. The final IHAD variables and loadings are listed below.

IHAD variables and loadings

IHAD indicators of disadvantage

The following variables are indicators of disadvantage. PUBLIC_RENT is the strongest indicator of disadvantage in the index. 

Variable | Description | Loading
PUBLIC_RENT | Households being rented from a state or territory housing authority, or a housing co-operative/community/church group (disadvantage) | -0.84
LOWRENT | Households where rent payments are less than $250 per week, excluding employer landlords (excludes $0) (disadvantage) | -0.81
INC_LOW | Households with low annual equivalised income (between $1 and $25,999) (disadvantage) | -0.71
NOYEAR11_OR_HIGHER | Households where the person with the highest educational attainment left school at year 10 or below, including those who did not go to school and with Certificate level I or II (excludes those currently studying secondary education) (disadvantage) | -0.69
NOCAR | Households with no car (disadvantage) | -0.61
RETIRED_NOT_OWNED | Households with a person aged 65 years and over who does not own the home, or occupy it under a life tenure scheme (disadvantage) | -0.59
DISABILITY_HH_PROP | Households where more than 50% of people need assistance with core activities (disadvantage) | -0.55
NOYEAR12_DEPENDENT | Households with at least one dependent child and the person with the highest educational attainment left school at year 11 or below, including those who did not go to school and with Certificate level I or II (excludes those currently studying secondary education) (disadvantage) | -0.54
FEWBED | Households with one or no bedrooms (disadvantage) | -0.45
ALL_UNEMPLOYED | Households where all people aged 15 years and over are unemployed (disadvantage) | -0.44
YEAR11 | Households where the person with the highest educational attainment left school at year 11 (excludes those currently studying secondary education) (disadvantage) | -0.41
CHILDJOBLESS | Households with children aged under 15 years and parent(s) not employed (disadvantage) | -0.35

IHAD indicators of advantage

The following variables are indicators of advantage. DEGREE_DEPENDENT is the strongest indicator of advantage in the index.

Variable | Description | Loading
HIGHCAR | Households with three or more cars (advantage) | 0.43
HIGHBED | Households with four or more bedrooms (advantage) | 0.50
INC_HIGH | Households with high annual equivalised income (greater than $90,999) (advantage) | 0.68
PURCHASED | Households being purchased (advantage) | 0.75
DEGREE | Households where the person with the highest educational attainment has a Bachelor Degree or above (advantage) | 0.76
HIGH_SKILL | Households where the highest skilled employed adult works in a skill level 1 occupation (advantage) | 0.78
HIGHMORTGAGE | Households where mortgage repayments are greater than or equal to $2,900 per month (advantage) | 0.79
DEGREE_DEPENDENT | Households with at least one dependent child and the person with the highest educational attainment has a Bachelor Degree or above (advantage) | 0.81

The 2021 IHAD index explains 41.4% of the total variance of the variables in the final variable list. The Experimental IHAD 2016 explained 43.2% of this total variance.
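
The variance figure can be cross-checked against the published loadings. This is a sketch under one assumption: that the loadings are the first eigenvector scaled by the square root of the eigenvalue, as implied by the weight definition in step eight. For a PCA on a correlation matrix, the squared loadings then sum to the eigenvalue, and dividing by the number of variables gives the share of total variance explained.

```python
# Loadings copied from the two tables above (rounded to two decimals there).
disadvantage = [-0.84, -0.81, -0.71, -0.69, -0.61, -0.59,
                -0.55, -0.54, -0.45, -0.44, -0.41, -0.35]
advantage = [0.43, 0.50, 0.68, 0.75, 0.76, 0.78, 0.79, 0.81]
loadings = disadvantage + advantage

# Sum of squared loadings approximates the eigenvalue of the first component.
eigenvalue = sum(l * l for l in loadings)

# Each standardised variable contributes variance 1, so total variance = p.
share = eigenvalue / len(loadings)

print(f"variables: {len(loadings)}")
print(f"approximate eigenvalue: {eigenvalue:.2f}")
print(f"approximate variance explained: {share:.1%}")
```

The result is about 41%, which agrees with the published 41.4% to within the rounding of the loadings.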

Removal of highly correlated variables

In most cases, highly correlated variables were removed from the initial candidate list. This was done to prevent instability in the variable weights and over-representation of any specific socio-economic characteristic. When two variables had a correlation coefficient of size greater than 0.8 in absolute value, one of them was generally removed. However, if they were deemed to be measuring different socio-economic characteristics (e.g. education and occupation), both were retained.

Variable description | Reason for exclusion
Households with one or more people aged 15 years and over who are unemployed (UNEMPLOYED) (disadvantage) | Highly correlated with ALL_UNEMPLOYED, which highlights disadvantaged households better
Households with one or more people aged 70 years and over who need assistance with core activities (DISABILITY_OVER70) (disadvantage) | Highly correlated with DISABILITY_HH_PROP (0.83) and not as representative of the total population
Households where all people aged 15 years and over have no educational attainment (NOEDU) (disadvantage) | Small prevalence and highly correlated with NOYEAR11_OR_HIGHER

Removal of low loading variables

The following variables were initially considered for the index but were excluded when the analysis showed that they were weak indicators of relative advantage or disadvantage.

Variable | Variable description
OVERCROWD | Households requiring one or more extra bedrooms (based on Canadian National Occupancy Standard) (disadvantage)
SPAREBED | Households with one or more bedrooms spare (based on Canadian National Occupancy Standard) (advantage)
OTHER_HHLD | Households with a structure classified as "other" (e.g. caravan, tent) (disadvantage)
MULTI_FAMILY | Multi-family households (advantage)
HIGHRENT | Households where rent payments are more than $500 per week (advantage)
OWNED | Households owned outright (advantage)
ONEPARENT | Households with a one-parent family, with dependent children only (disadvantage)
CERTIFICATE | Households where the person with the highest educational attainment has a Certificate III or IV (advantage)
DIPLOMA | Households where the person with the highest educational attainment has an Advanced Diploma or Diploma (advantage)
SKILL_LVL_2 | Households where the highest skilled employed adult works in a skill level 2 occupation (advantage)
SKILL_LVL_4 | Households where the highest skilled employed adult works in a skill level 4 occupation (disadvantage)
LOW_SKILL | Households where the highest skilled employed adult works in a skill level 5 occupation (disadvantage)
ALL_SHORT_DISTANCE | Households where all people aged 15 years and over who are employed travel 0 to less than 2.5 km to work (advantage)
ALL_LONG_DISTANCE | Households where all people aged 15 years and over who are employed travel 50 to less than 250 km to work (disadvantage)
ALL_VLONG_DISTANCE | Households where all people aged 15 years and over who are employed travel 250 or more km to work (disadvantage)
SEP_DIVORCED | Households with one or more people aged 15 years and over separated or divorced (disadvantage)
ENGPOOR | Households with one or more people aged 15 years and over who do not speak English well (disadvantage)
ROM | Households with one or more people aged 15 years and over who arrived in Australia in the last 10 years (disadvantage)
UNENGAGED_YOUTH | Households with one or more people aged between 15 and 24 years who are not working or studying (disadvantage)
CARER | Households with one or more people aged 15 years and over who provide unpaid assistance to a person with a disability (disadvantage)
VOLUNTEER | Households with one or more people aged 15 years and over who do voluntary work for an organisation or group (advantage)

Distribution of the IHAD

This section presents the frequency histogram of IHAD scores. The IHAD distributions have generally similar shapes to those from Experimental IHAD 2016.

The scores for IHAD range from 613 to 1,246; the table presents maximum and minimum scores of each IHAD quartile. These show that there is sufficient variation in the IHAD scores to allow for the formation of these groups.

Some households will not have any indicators of advantage or disadvantage (i.e. their values for the final binary candidate variables are all 0). They will still receive an IHAD score reflecting the middle of the IHAD score distribution, which places them in quartile 2.

Frequency distribution of ranked households index group

Household index group | Number of households*: frequency | Number of households*: percentage | Household index score: minimum | Household index score: maximum
1 | 2,307,765 | 25.0 | 613 | 943
2 | 2,308,638 | 25.0 | 943 | 992
3 | 2,338,786 | 25.3 | 992 | 1,070
4 | 2,275,872 | 24.7 | 1,070 | 1,246

* The total number of in-scope households assigned an IHAD score is 9,231,061

Validation of IHAD

Once the index was calculated, it was checked to ensure that it measured the desired concept and that the results make intuitive sense. This validation is important to establish the credibility of the index and identify any issues that may have been missed in the construction of the index. 

A range of validation checks were applied to the 2021 IHAD, including:

  • comparison with the experimental IHAD 2016,
  • comparison with 2021 SEIFA IRSAD, 
  • sense checking of the SA1s with the highest and lowest proportions in quartile 1 or quartile 4, and 
  • sense checking of SA1s where SEIFA IRSAD and IHAD differed the most. 

Results showed that IHAD 2021 is consistent with both the 2021 SEIFA IRSAD and the experimental IHAD 2016. The top and bottom SA1s were sensible and SA1s where IRSAD and IHAD differed the most were reasonable.  These checks confirmed that IHAD was performing as intended, in adding value to SEIFA IRSAD by showing the distribution of advantage and disadvantage at household level.

Using and interpreting IHAD

Broad guidelines on appropriate use

Household level index

The IHAD is constructed at the household level, based on the assumption that economic and other resources are generally shared within households, and therefore persons within households will share similar levels of socio-economic advantage and disadvantage. However, this may not always be the case, particularly for multi-family households, group households, and households containing lodgers or boarders. It is possible for a relatively advantaged person to be a resident in a relatively disadvantaged household or a relatively disadvantaged person to reside in a relatively advantaged household.

Quantiles are created based on assigning, as near as is practicable, equal numbers of households into each quantile (rather than equal numbers of persons). As larger households tend to have higher index values, more advantaged quantiles tend to contain larger numbers of persons.
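
One way to picture the household-based grouping is the sketch below, which ranks households by score and splits them into four groups. The scores are invented and `assign_quantiles` is a hypothetical helper. The treatment of tied scores is an assumption: here tied households are kept in the same group, which is one reason group sizes can be near-equal but not exactly equal. The paper does not describe how ties are actually resolved.

```python
from bisect import bisect_left


def assign_quantiles(scores, groups=4):
    """Assign each household to a group (1 = lowest scores) by its rank.

    Assumption: a household's rank is the number of households with a strictly
    lower score, so households with identical scores share a group.
    """
    ordered = sorted(scores)
    n = len(scores)
    return [bisect_left(ordered, s) * groups // n + 1 for s in scores]


# Invented index scores for eight households; two households tie at 950.
scores = [650, 800, 900, 950, 950, 1000, 1050, 1100]
quartiles = assign_quantiles(scores)
print(quartiles)
```

Counting persons instead of households in each group would generally give different group boundaries, which is why the choice of unit matters when interpreting the quantiles.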

Importance of the underlying variables

IHAD is constructed using a weighted combination of selected variables. The index is dependent on the set of variables chosen for the analysis. A different set of underlying variables would result in a different index. However, due to the large number of variables in IHAD, removing or altering a single variable will usually not have a large effect.

Users should consider the aspect of socio-economic advantage and disadvantage in which they are interested and examine the underlying set of variables for IHAD. This will allow them to make an informed decision on whether IHAD is appropriate for their particular purpose. 

Relationship with census variables

As the IHAD is constructed using Census variables, when undertaking analyses involving cross-tabulation of the IHAD with other Census variables, users should examine the variables contained within the index to aid in the interpretation of those results. Refer to Technical details for IHAD: variables and loadings for details of variables that were included in the IHAD.

Time series

The index is designed to compare socio-economic characteristics of a household at a point in time, not to compare households over time. There are several issues that make longitudinal or time series analysis of IHAD difficult to interpret.

  • The constituent indicators and indicator weights for the index are likely to have changed.
  • The distribution of the standardised index values will have changed (e.g., a score of 800 does not represent the same level of disadvantage in different years).
  • Census variable changes will affect the variables used to calculate the IHAD scores.

Topics not represented in the index

Topics represented in IHAD are limited by the variables collected in the Census.

Measures relating to wealth, access to services, and social/community participation may provide more information about the relative advantage or disadvantage within a household, but these measures are not collected in the Census.

Long-term health conditions, asked for the first time in the 2021 Census, were not included in IHAD. This is to allow health researchers to analyse the relationship between health outcomes and socio-economic advantage/disadvantage. Adding health variables to IHAD would make these relationships less clear.

Other topics that could be associated with advantage and disadvantage but are not captured in the Census include crime and the environment. If data were available on these topics, they might provide additional information about the level of advantage and disadvantage within households, which could result in households being assigned a different index value.

Mapping IHAD

Mapping IHAD provides an excellent way of observing the spatial distribution of relative socio-economic advantage and disadvantage. Interactive maps at SA2 level are available for 2021 IHAD.

References

Australian Bureau of Statistics (2013) ANZSCO – Australian and New Zealand Standard Classification of Occupations, Version 1.3, cat. no. 1220.0, ABS, Canberra.

Australian Bureau of Statistics (Jul 2021-Jun 2026), Australian Statistical Geography Standard (ASGS) Edition 3, ABS Website, accessed 20 April 2023.

Australian Bureau of Statistics (2016) Census of Population and Housing: Census dictionary, ABS Website, accessed 23 August 2016.

Australian Bureau of Statistics (2021), Census of Population and Housing: Census dictionary, ABS Website, accessed 20 April 2023.

Australian Bureau of Statistics (Jun 2011), Information Paper: Measures of Socioeconomic Status, New Issue for June 2011, ABS Website, accessed 20 April 2023.

Jolliffe, I.T. (1986) Principal Component Analysis, Springer-Verlag, New York.

Wise, P. and Mathews, R. (2011) Socio-Economic Indexes For Areas: Getting a Handle on Individual Diversity Within Areas, Methodology Research Papers, cat. no. 1351.0.55.036, Australian Bureau of Statistics, Canberra.

Wise, P. and Williamson, C. (2013) Building on SEIFA: Finer Levels of Socio-Economic Summary Measures, cat. no. 1352.0.55.135, Australian Bureau of Statistics, Canberra.

Appendix: Variable specifications

This appendix gives descriptions of each variable considered for inclusion for IHAD 2021. The square brackets contain specifications for creating the indicators from Census data items, according to the mnemonics used in the Census of Population and Housing: Census Dictionary, 2021. The variables are arranged by socio-economic dimension.

Notes:

  • The Skill Level for each occupation can be found in ANZSCO – Australian and New Zealand Standard Classification of Occupations, Version 1.3
  • Household composition was ‘not classifiable’ if the household: contained only visitors or persons aged under 15 years on Census night; or was determined to be occupied on Census Night but the collector could not make contact; or could not be classified because there was insufficient information on the Census form.
  • The Canadian National Occupancy Standard determines housing appropriateness, using the number of bedrooms and the number, age, sex and relationships of household members. For more information refer to Housing Occupancy and Costs, Australia, 2019–20.
  • Housing, Family, and Equivalised total household income variables listed below include only classifiable private dwellings [DWTD = 1 and HHCD = 1, 2 or 3]. Education, Occupation, and Miscellaneous variables listed below include only people living in classifiable private dwellings who were at home on Census night [DWTD = 1 and HHCD = 1, 2 or 3 and UAICP = 1].

Housing variables

Housing variables - specification

Variable | Variable description
NOCAR | Households with no car [VEHRD = 0]
HIGHCAR | Households with three or more cars [VEHRD = 3-4]
FEWBED | Households with one or no bedrooms [BEDRD = 0-1]
HIGHBED | Households with four or more bedrooms [BEDD = 4-6]
OVERCROWD | Households requiring one or more extra bedrooms (based on Canadian National Occupancy Standard) [HOSD = 01-04]
SPAREBED | Households with one or more bedrooms spare (based on Canadian National Occupancy Standard) [HOSD = 06-09]
OTHER_HHLD | Households with a structure classified as "other" (e.g. caravan, tent) [STRD = 91-94]
MULTI_FAMILY | Multi-family households [HHCD = 2]
LOWRENT | Households where rent payments are less than $250 per week, excluding employer landlords (excludes $0) [RNTD = 1-250 and LLDD = 10-40]
HIGHRENT | Households where rent payments are more than $500 per week [RNTD = 501-9999]
PUBLIC_RENT | Households being rented from a state or territory housing authority, or a housing co-operative/community/church group [LLDD = 20-21]
OWNED | Households owned outright [TEND = 1]
PURCHASED | Households being purchased [TEND = 2,3,6]
HIGHMORTGAGE | Households where mortgage repayments are greater than or equal to $2,900 per month [MRED = 2900-9999]
AREA_RVR | Households in remote/very remote areas [Remoteness area category = 3-4, based on SA1 Geography]
AREA_MC | Households in major cities [Remoteness area category = 0, based on SA1 Geography]

Family variables

Family variables - specification

Variable | Variable description
ONEPARENT | Households with a one-parent family, with dependent children only [FMCF = 3112, 3122, 3212]
CHILDJOBLESS | Households with children aged under 15 years and parent(s) not employed [FMCF = 21, 31 and LFSF = 16-17, 19, 25-26]

Education variables

Education variables - specification

Variable | Variable description
NOYEAR11_OR_HIGHER | Households where the person with the highest educational attainment left school at year 10 or below, including those who did not go to school and with Certificate level I or II (excludes those currently studying secondary education) [HEAP = 621, 811, 812 and TYPP not = 31-39, or HEAP = 720-724]
YEAR11 | Households where the person with the highest educational attainment left school at year 11 (excludes those currently studying secondary education) [HEAP = 613, and TYPP not = 31-39]
CERTIFICATE | Households where the person with the highest educational attainment has a Certificate III or IV [HEAP = 5]
DIPLOMA | Households where the person with the highest educational attainment has an Advanced Diploma or Diploma [HEAP = 4]
DEGREE | Households where the person with the highest educational attainment has a Bachelor Degree or above [HEAP = 1-3]
DEGREE_DEPENDENT* | Households with at least one dependent child and the person with the highest educational attainment has a Bachelor Degree or above
NOYEAR12_DEPENDENT* | Households with at least one dependent child and the person with the highest educational attainment left school at year 11 or below, including those who did not go to school and with Certificate level I or II (excludes those currently studying secondary education)

* Combining education level with dependent children represents the concept of household level advantage/disadvantage to children from having or not having educated parents. Dependent children are derived using CDCF (Counts the number of dependent children in the family). A dependent child is a person who is either a child under 15 years of age, or a dependent student aged 15-24 years.

Occupation variables

Occupation variables - specification

Variable | Variable description
INC_LOW | Households with low annual equivalised income (between $1 and $25,999) [HIED = 2-5]
INC_HIGH | Households with high annual equivalised income (greater than $90,999) [HIED = 12-16]
ALL_UNEMPLOYED | Households where all people aged 15 years and over are unemployed [AGEP > 14 and LFSP = 4,5]
HIGH_SKILL | Households where the highest skilled employed adult works in a skill level 1 occupation [OCSKP = 1]
SKILL_LVL_2 | Households where the highest skilled employed adult works in a skill level 2 occupation [OCSKP = 2]
SKILL_LVL_4 | Households where the highest skilled employed adult works in a skill level 4 occupation [OCSKP = 4]
LOW_SKILL | Households where the highest skilled employed adult works in a skill level 5 occupation [OCSKP = 5]
ALL_SHORT_DISTANCE | Households where all people aged 15 years and over who are employed travel 0 to less than 2.5 km to work [DTWR1P = 1,2]
ALL_LONG_DISTANCE | Households where all people aged 15 years and over who are employed travel 50 to less than 250 km to work [DTWR1P = 6]
ALL_VLONG_DISTANCE | Households where all people aged 15 years and over who are employed travel 250 or more km to work [DTWR1P = 7]
Miscellaneous variables

Miscellaneous variables - specification

Variable | Variable description
SEP_DIVORCED | Households with one or more people aged 15 years and over separated or divorced [MSTP = 3,4]
ENGPOOR | Households with one or more people aged 15 years and over who do not speak English well [AGEP > 14 and ENGLP = 4,5]
ROM | Households with one or more people aged 15 years and over who arrived in Australia in the last 10 years [AGEP > 14 and YARP = 2021-2021]
UNENGAGED_YOUTH | Households with one or more people aged between 15 and 24 years who are not working or studying [AGEP = 15-24 and EETP = 41]
DISABILITY_UNDER70 | Households with one or more people aged under 70 years who need assistance with core activities [AGEP < 70 and ASSNP = 1]
DISABILITY_HH_PROP | Households where more than 50% of people need assistance with core activities [number of people where ASSNP=1 / number of people where UAICP = 1 > 0.5]
CARER | Households with one or more people aged 15 years and over who provide unpaid assistance to a person with a disability [UNCAREP = 2]
VOLUNTEER | Households with one or more people aged 15 years and over who do voluntary work for an organisation or group [VOLWP = 2]
RETIRED_NOT_OWNED | Households with a person aged 65 years and over who does not own the home, or occupy it under a life tenure scheme [AGEP > 64 and TEND = 3-5,7]
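
Some of the indicators above are built by aggregating person-level data items to the household. As a hedged illustration, the sketch below derives DISABILITY_HH_PROP from person records using the specification in the table (people with ASSNP = 1 divided by people with UAICP = 1, greater than 0.5). The record layout, the example values, and the function name are assumptions. The numerator is restricted to people at home on Census night, in line with the note at the start of this appendix.

```python
def disability_hh_prop(persons):
    """Return 1 if more than 50% of people at home need assistance, else 0."""
    # Only people at home on Census night (UAICP = 1) are counted.
    present = [p for p in persons if p["UAICP"] == 1]
    if not present:
        return 0  # assumption: no countable people means the indicator is 0
    needing = sum(1 for p in present if p["ASSNP"] == 1)
    return 1 if needing / len(present) > 0.5 else 0


# Invented person records grouped by household.
household_a = [
    {"UAICP": 1, "ASSNP": 1},
    {"UAICP": 1, "ASSNP": 2},
]
household_b = [
    {"UAICP": 1, "ASSNP": 1},
    {"UAICP": 1, "ASSNP": 1},
    {"UAICP": 1, "ASSNP": 2},
]

print(disability_hh_prop(household_a))  # exactly half, so not "more than 50%"
print(disability_hh_prop(household_b))  # two of three people need assistance
```

Because the threshold is strictly greater than 50%, a two-person household where one person needs assistance is not flagged.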