OVERVIEW OF METHODOLOGY
People living in very remote communities and in non-private dwellings were out of scope of the micro surveys. These populations were excluded from the ASNA estimates and distributed separately using data from the 2006 and 2011 ABS Census of Population and Housing. These distributions were then added to the ASNA distributions based on the micro surveys to obtain the final distribution of the ASNA household income, consumption and wealth estimates. For detailed information regarding the methodology described above, please refer to Information Paper: Australian National Accounts, Distribution of Household Income, Consumption and Wealth, 2009-10 (cat. no. 5204.0.55.009), Chapter 4 - Data Sources and Methodology: Distribution of the Household National Accounts Estimates.
IMPROVEMENTS TO THE ORIGINAL METHODOLOGY
The following improvements were made to the original methodology described above:
Household Quintiles
There are two options for quintile boundaries when sorting households into equivalised income and net worth quintiles: either an equal number of people is allocated to each quintile (person weighted quintiles), or an equal number of households is allocated to each quintile (household weighted quintiles). For national accounts purposes, household weighted quintiles are preferred because the preferred unit of analysis is the household. However, due to resource and time limitations, person weighted quintiles were used in the compilation of Information Paper: Australian National Accounts, Distribution of Household Income, Consumption and Wealth, 2009-10 (cat. no. 5204.0.55.009). Because the out of scope population was added to the micro defined equivalised income and net worth quintiles, the final income and net worth quintiles published in this release do not contain equal proportions of households. From the lowest to the highest quintile, the proportions of households are 20.9%, 20.1%, 19.7%, 19.6% and 19.7% respectively.
This is still an improvement on the quintiles generated for the original methodology.
Distribution of non-indigenous population living in very remote communities
Previously, due to limited access to the 2006 Census data set, the non-indigenous population living in very remote communities was distributed based on the distribution of the in scope micro population. For this release, we were able to access a larger scope of the 2006 Census data set, so non-indigenous households living in very remote communities were distributed using the demographic information from the Census data set. This has led to greater conceptual consistency across the distributions of the out of scope population.
IMPLEMENTATION OF THE TIME SERIES
Periodicity
The time series presented in this release for the distribution of household income, consumption and wealth is biennial from 2003-04 to 2011-12. The decision to start the time series in 2003-04, and to compile the estimates biennially, was based on:
Model used to estimate for data gaps in distributional source data
Two options were investigated to model distributional household indicators for the years in which the source micro data (SIH, HES, Census and STiK) were not available. The first option was to use the nearest available source data for the missing years (nearest year method); the second was to linearly interpolate (or extrapolate) the data for the missing years. The second option was chosen.
Option one: nearest year method
Table 3.1 below provides the source data that would be applied for the ASNA time series points. For example, to compile the estimates for 2007-08, the HES data from the 2009-10 survey would have been used, as no HES data exists for 2007-08.
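As a rough illustration of the nearest year method, the sketch below assigns each time series point to its nearest HES collection. The function name `nearest_source_year`, the labelling of each financial year by the calendar year in which it ends (2003-04 as 2004), and the tie-breaking rule are assumptions made for this example, not part of the ABS methodology.

```python
# Illustrative sketch only. Financial years are labelled by the calendar
# year in which they end (2003-04 -> 2004); this labelling is an assumption.

def nearest_source_year(target_year, survey_years):
    """Return the survey year closest to target_year.

    Ties are resolved in favour of the earlier survey (an assumption;
    no ties occur for the HES years used below).
    """
    return min(sorted(survey_years), key=lambda s: abs(s - target_year))


hes_years = [2004, 2010]                        # HES 2003-04 and 2009-10
series_points = [2004, 2006, 2008, 2010, 2012]  # biennial series points

# Which HES collection each series point would draw on
source_for = {t: nearest_source_year(t, hes_years) for t in series_points}

# Years of survey change absorbed between consecutive series points
source_gap = {
    (prev, curr): source_for[curr] - source_for[prev]
    for prev, curr in zip(series_points, series_points[1:])
}

for (prev, curr), gap in source_gap.items():
    print(f"{prev} -> {curr}: source data moves {gap} years")
```

Under these assumptions, 2005-06 draws on the 2003-04 HES and 2007-08 on the 2009-10 HES, so the source data moves six years across a single two year step in the series, while the steps on either side show no change at all.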
When this model was applied to distribute the ASNA data for the time series and the growth patterns for components were analysed, some major flaws were revealed. These stem from the following assumptions of the model:
(1) For any two years that share the same distributional source data, the distribution pattern does not change. For example, the 2003-04 HES was used for both 2003-04 and 2005-06, so the assumption is that the household consumption pattern did not change between these two years.
(2) Where two consecutive time points have no source data and their nearest source data come from two different iterations of a survey, all of the change between the two iterations is captured between those two time points, regardless of the distance between the source data estimates. For example, for the time point 2005-06 the 2003-04 HES data was applied, and for the time point 2007-08 the 2009-10 HES data was applied. Although these two HES surveys are six years apart, the changes in consumption pattern between them are captured in the two year period between 2005-06 and 2007-08. This compression makes the change between the data points bigger than it is in reality.
The aim of implementing a time series of the distributed ASNA household income, consumption and wealth data is to capture changes over time in these household aggregates and their components. The nearest year method did not adequately do this.
Option two: linear interpolation and extrapolation
As mentioned above, linear interpolation (and extrapolation) was applied to generate the ASNA distributional household groups for the years in which the source micro data (SIH, HES, Census and STiK) were not available. A detailed description of this model is provided below.
Table 3.2 provides the ASNA household final consumption expenditure (HFCE) distributed to equivalised income quintiles using available source data from the 2003-04 and 2009-10 HES. Linear interpolation and extrapolation are applied to generate the 2005-06, 2007-08 and 2011-12 ASNA HFCE equivalised income quintiles from the 2003-04 and 2009-10 HFCE quintile estimates in Table 3.2.
The following example explains the methodology used to linearly interpolate and extrapolate missing values. Let Point A be a known value (vA) at a known point in time (tA), and let Point B be another known value (vB) at a known point in time (tB), where tB > tA. Let Point X be an unknown value (vX) at a known point in time (tX) that we want to estimate. This information is summarised in Figure 3.1.
Figure 3.1: Graphical display of the variables used for Formulas 3.1 and 3.2
As triangles APX and AQB are similar triangles, Formula 3.1 can be constructed, in which the only unknown variable is vX.
Formula 3.1 (to find the value of vX)
\[ \frac{v_X - v_A}{t_X - t_A} = \frac{v_B - v_A}{t_B - t_A} \]
Isolating vX and simplifying then gives Formula 3.2.
Formula 3.2 (to find the value of vX, simplified)
\[ v_X = v_A + (v_B - v_A) \times \frac{t_X - t_A}{t_B - t_A} \]
Using the 2003-04 and 2009-10 data for quintile 1, Point A is (2004.5, 1333.43), Point B is (2010.5, 1870.97) and Point X is (2008.5, vX). Substituting these numbers into Formula 3.2 gives Formula 3.3, which yields a value for vX.
Formula 3.3 (to find the value of vX in the given example)
\[ v_X = 1333.43 + (1870.97 - 1333.43) \times \frac{2008.5 - 2004.5}{2010.5 - 2004.5} \approx 1691.79 \]
The above process is repeated to generate data for all quintiles and for all missing years.
Table 3.5 contains the time series, equivalised income quintiles, ASNA HFCE, clothing and footwear, 2003-04 to 2011-12.
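The linear interpolation and extrapolation described above can be sketched in a few lines. This is a minimal illustration rather than the ABS implementation: the function name `interpolate` is hypothetical, and the time points 2006.5 and 2012.5 for 2005-06 and 2011-12 are assumed by analogy with the 2008.5 used for 2007-08 in the quintile 1 example.

```python
def interpolate(t_a, v_a, t_b, v_b, t_x):
    """Estimate the value at t_x on the straight line through A and B.

    Interpolates when t_a < t_x < t_b and extrapolates when t_x lies
    outside that range; the same formula covers both cases.
    """
    if t_a == t_b:
        raise ValueError("the two known time points must differ")
    return v_a + (v_b - v_a) * (t_x - t_a) / (t_b - t_a)


# Known points for quintile 1, taken from the example in the text
point_a = (2004.5, 1333.43)
point_b = (2010.5, 1870.97)

# Assumed time points for the years without source data
missing_points = (2006.5, 2008.5, 2012.5)

estimates = {t: interpolate(*point_a, *point_b, t) for t in missing_points}

v_2007_08 = estimates[2008.5]
print(round(v_2007_08, 2))  # 1691.79
```

Because the same straight line is used for every missing year, the estimate for 2011-12 simply continues the 2003-04 to 2009-10 trend beyond the last known point.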
The data in Table 3.3 shows the equivalised income quintile data, linearly interpolated for 2005-06 and 2007-08 and extrapolated for 2011-12. The simulated data has the shortcoming that it produces strictly linear changes in distribution; however, it does lead to a more realistic growth pattern than the nearest year model. This ‘linear interpolation’ method was used to simulate Census, SIH and HES values, using the two nearest available source data points as the known points. The following should be noted in applying this method across all missing values:
Formula 3.4: Calculation of number of households in each Equivalised Net Worth Quintile for 2007-08.
METHODOLOGY FOR ESTIMATING MISSING SURVEY OF INCOME AND HOUSING DATA ITEMS
The 2003-04 and 2005-06 SIH did not include some income and wealth data items that were included in the 2009-10 SIH. Where these missing data items form part of aggregates in the SIH, the aggregates were underestimated in 2003-04 and 2005-06 compared with the 2009-10 SIH. This was particularly problematic for some of the income items, as these items affected a household's main source of income (MSI) and the derivation of household equivalised income quintiles. Also, if the SIH data set had not been made consistent with the 2007-08 data, a large series break between 2005-06 and 2007-08 would have appeared in the ASNA distributional household data set.
The missing SIH data items for 2003-04 and 2005-06 were estimated by applying factors to the 2003-04 and 2005-06 SIH data. The factors were calculated from the items (now being reported) in the 2007-08, 2009-10 and 2011-12 SIH. The missing SIH income items affected only the MSI categories Property Income and Superannuation and Other. Once these MSI categories were estimated, new equivalised income quintiles were derived for the 2003-04 and 2005-06 SIH data.
CHANGING DEMOGRAPHICS OVER TIME
Once the time series for the distribution of household income, consumption and wealth was produced, the ABS considered how this data set could be analysed as a time series. When distributional data are compared across different years, it is important to note that the change in an estimate is affected by demographic changes over time, such as an increase in the number of households in a particular household distributional group.
For example, when analysing final consumption expenditure on food for households whose reference person is aged over 65 years, a change in the estimate from 2009-10 to 2011-12 needs to be divided into (a) change due to changes in consumption habits and (b) change due to there being more households in 2011-12 where the reference person was over 65. To separate the changes due to (a) and (b), we applied a number of different methods to control for the demographic shift, that is, (b).
(i) dollars per household
The data was analysed in ‘dollars per household’ terms. For this measure, the total level of an item in each group was divided by the number of households in that group, giving the amount of income, consumption or wealth for that item per household in the group. While this analysis removed the effect of the demographic shift, it also removed the sense of some groups simply being bigger than others by virtue of having a greater number of households. For example, in 2011-12 consumption of food by households with a reference person aged 15-24 years was $2,483m, while consumption of food by households with a reference person aged 25-34 years was $11,418m. Expressed in dollars per household terms, annual consumption per household was $6,784 and $7,995 respectively. The similarity of these two numbers hid the fact that less money in total was being spent on food by households with a reference person aged 15-24 years.
(ii) standardisation to a reference year
The idea of the standardisation method was to remove the demographic shift without removing a sense of total spending. The approach standardises the household distribution to a reference year, in effect answering the question "What would income/consumption/wealth distributions look like if there were the same number of households as in the reference year and they were distributed the same way?"
This was achieved by converting the distributed ASNA items into dollars per household terms and multiplying these numbers by the number of households in that group in the reference year. An issue with this analysis was that while it included a sense of the different sizes of the groups, it removed the fact that the total number of households between all the groups was increasing and, as such, items on this basis understated total growth.
(iii) standardisation to the current year
This analysis was undertaken to try and get a sense of total growth back into the data. This was achieved by dividing the standardised items found in (ii) by the total number of households in the reference year and then multiplying the results by the total number of households in the actual year. This was like answering the question of "What would the income, consumption and wealth distributions look like this year if the households this year were distributed in the same way as the reference year?" This analysis returned a sense of total growth back into the data and maintained the sense of difference in the size of groups. However, the total figures for each item were no longer the total figures from the ASNA. In fact, the total figures were different depending upon what groups you were distributing the totals into.
After weighing up the advantages and disadvantages of the three methods described above to account for demographic shifts, method (i) dollars per household was applied to the data. While it may be useful or interesting to further explore other methods of adjusting for demographic shift, they are beyond the scope of this publication.
TIME SERIES ANALYSIS INCLUDED IN THE RELEASE
The release includes the following tables to:
STRENGTHS AND WEAKNESSES OF TIME SERIES
To enable users to interpret the data from the time series, the strengths and limitations of the time series are presented below.
Strengths
The time series of the distributed household income, consumption and wealth data:
Weaknesses
The major weakness of this data set is the relatively long intervals between collection years for the Household Expenditure Survey (HES) and, to a lesser extent, the Census of Population and Housing (Census). As a result:
While improvements were made to equalise the size of the equivalised income and net worth quintiles, the share of households in each quintile still ranges from 19.6% to 20.9%. This should be taken into account when comparing the quintiles. Finally, users are encouraged to read Chapter 4, Methodological Issues, in the Information Paper: Australian National Accounts, Distribution of Household Income, Consumption and Wealth, 2009-10 (cat. no. 5204.0.55.009), to understand the issues in constructing a household distributional data set for a single time point.