6461.0 - Consumer Price Index: Concepts, Sources and Methods, 2016  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 21/02/2017   



15.1 The launch of barcode scanner technology in Australia during the 1970s, and its growth in the 20th century, has enabled retailers to capture detailed information on transactions at the point of sale. Transactions data are high in volume and contain detailed information about transactions, including date, quantities, product descriptions, and value of sales. As such, they are a rich data source that National Statistical Offices (NSOs) can use to enhance their statistics, reduce provider burden, and reduce the costs associated with physically collecting data.

15.2 From March quarter 2014 the ABS significantly increased its use of transactions data to compile the Australian CPI; these data now account for approximately 25 per cent of the weight of the Australian CPI. The approach adopted was a 'direct replacement' of observed point-in-time prices with a unit value calculated from the transactions data (footnote 1).

15.3 While this has enhanced the Australian CPI, it is acknowledged that more can be done with transactions data to compile official statistics than the traditional approaches allow.

15.4 This chapter provides a summary of the current and future use of transactions data in compiling the Australian CPI, as outlined in the Information Paper: Making Greater Use of Transactions Data to compile the Consumer Price Index, Australia, 2016 (cat. no. 6401.0.60.003).


15.5 The current ABS approach to compiling the CPI using transactions data is consistent with International Labour Organization guidance (ILO 2004), and replaces directly observed (point-in-time) prices with a unit value calculated from the transactions data. The unit value approach takes expenditure and quantity data by product over the period of interest (e.g. quarter) to calculate an average unit price. It also allows for better outlet coverage, as unit values are calculated over all of a respondent's outlets rather than just a sample. The major benefit of this approach compared to traditional point-in-time pricing is that unit values provide a more accurate summary of an average transaction price than an isolated price quotation (Diewert 1995).
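
As a minimal sketch of the unit value calculation described above, the following Python sums expenditure and quantity for each product across all of a respondent's outlets over the period and divides one by the other. The record layout, SKU labels and figures are hypothetical and are not drawn from ABS systems.

```python
from collections import defaultdict


def unit_values(transactions):
    """Return {sku: total expenditure / total quantity} for one period.

    `transactions` is an iterable of (sku, outlet, expenditure, quantity)
    tuples; this layout is an assumption made for illustration.
    """
    expenditure = defaultdict(float)
    quantity = defaultdict(float)
    for sku, _outlet, value, qty in transactions:
        expenditure[sku] += value
        quantity[sku] += qty
    # Skip products with no recorded quantity to avoid dividing by zero.
    return {sku: expenditure[sku] / quantity[sku]
            for sku in expenditure if quantity[sku] > 0}


# Hypothetical quarter of sales for two products across two outlets.
quarter = [
    ("SKU-A", "outlet-1", 200.0, 100),
    ("SKU-A", "outlet-2", 90.0, 50),
    ("SKU-B", "outlet-1", 60.0, 20),
]
uv = unit_values(quarter)
```

Here the unit value for SKU-A reflects both outlets (290 dollars over 150 units), rather than a single price quotation from one outlet.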

15.6 The chosen time aggregation is important when using transactions data, with more time aggregation providing more stable estimates of price change as price and quantity bouncing (footnote 2) behaviour is smoothed out. Unit prices are derived at both monthly and quarterly frequencies from transactions data to ensure elementary aggregation is consistent across other modes of price collection. The ABS aggregates both transactions data and field collected prices at the elementary aggregate (EA) level using the geometric mean formula.
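
The geometric mean aggregation can be illustrated with a short sketch. It assumes an unweighted geometric mean of price relatives over matched items (often called the Jevons formula); the function name, item labels and prices are hypothetical.

```python
import math


def geometric_mean_index(prices_prev, prices_curr):
    """Unweighted geometric mean of price relatives over matched items.

    Both arguments map an item label to its price (a unit value or a
    field collected price) in the respective period.
    """
    matched = sorted(set(prices_prev) & set(prices_curr))
    if not matched:
        raise ValueError("no items are priced in both periods")
    log_sum = sum(math.log(prices_curr[k] / prices_prev[k]) for k in matched)
    return math.exp(log_sum / len(matched))


# Hypothetical elementary aggregate with two items.
previous = {"item-1": 2.00, "item-2": 4.00}
current = {"item-1": 2.20, "item-2": 4.00}
ea_movement = geometric_mean_index(previous, current)
```

Working in logarithms keeps the calculation numerically stable when an elementary aggregate contains many items.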

15.7 When using the unit value approach, all of the difference between successive unit values for the same item is attributed to price change (i.e. it assumes quality differences are zero). Items must therefore be tightly defined at a fine level of aggregation to maximise homogeneity and prevent quality differences from affecting the unit values. This can be challenging when using transactions data as these datasets tend to exhibit a high level of churn. The ABS defines items as homogeneous by using product classifications provided by Australian proprietors known as the stock keeping unit (SKU).

15.8 The ABS has developed an approach to make explicit quality adjustments to items that have changed using the detailed descriptions provided in the transactions data. There are broadly three main scenarios which initiate the quality adjustment process for prices obtained from transactions data.

      i) A new SKU is brought into the sample as a replacement for another SKU to which it is not directly comparable: this requires calculating a previous period price for the new item.
      ii) A SKU in the sample has a quantity change (e.g. packet size): this requires calculating a quality adjustment factor (i.e. ratio of packet sizes) by extracting information from the item description. This quality adjustment factor is applied to product expenditures prior to calculating a unit price.
      iii) A SKU is replaced with a similar SKU, possibly with a quantity change: this also requires calculating a quality adjustment factor by extracting information from the item descriptions. To capture price change due to products of the same quality with different SKUs (e.g. product re-launches), the ABS has developed a method to link comparable new and disappearing products. The linking process uses information on the item description, price, expenditures, timing (when products appear or disappear on sales listings) and quantity sold.
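
The packet size adjustment in scenario ii) can be sketched as follows. This is an illustration only: the regular expression, the unit conversion table, the direction of the ratio (old size over new size) and the example descriptions are all assumptions, not the ABS implementation.

```python
import re

# Hypothetical pattern for sizes such as "250g", "1.5 L" or "500ml".
SIZE_PATTERN = re.compile(r"(\d+(?:\.\d+)?)\s*(kg|g|ml|l)\b", re.IGNORECASE)
# Convert to grams or millilitres so that sizes are comparable.
TO_BASE = {"g": 1.0, "kg": 1000.0, "ml": 1.0, "l": 1000.0}


def pack_size(description):
    """Extract the pack size from an item description."""
    match = SIZE_PATTERN.search(description)
    if match is None:
        raise ValueError(f"no pack size found in {description!r}")
    amount, unit = match.groups()
    return float(amount) * TO_BASE[unit.lower()]


def adjusted_unit_value(expenditure, quantity, old_description, new_description):
    """Unit value expressed per old-size pack (assumed convention)."""
    factor = pack_size(old_description) / pack_size(new_description)
    # The factor is applied to expenditure before the unit value is formed.
    return expenditure * factor / quantity


# Hypothetical example: a 250g packet is downsized to 200g.
price = adjusted_unit_value(300.0, 100, "Choc biscuits 250g", "Choc biscuits 200g")
```

In this example the raw unit value of 3.00 per 200g packet becomes 3.75 per 250g equivalent, so the reduction in packet size is measured as a price rise rather than being hidden.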

15.9 To ensure the sample derived from transactions data remains relevant, the ABS has incorporated an approach to review the sample each period, identify items that have become less relevant based on expenditure shares, and recommend suitable replacements. Existing SKUs sampled in the CPI are assessed in each period against several relevance tests to identify items that have shown large drops in expenditure. The main principle behind these tests is that sampled items should have a stable expenditure share within the CPI product group; items which fail the tests are recommended for replacement, and an output file listing the recommended replacements is produced for analysis. The current approach to determining item replacements is manually driven, a strategy that is intentional and manageable with the current sample sizes.
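
The idea of an expenditure share test can be illustrated with a simple sketch. The specific test, the 50 per cent threshold and the replacement rule (highest expenditure unsampled SKU) are hypothetical; the actual ABS relevance tests are not specified here, and recommendations are reviewed manually.

```python
def flag_for_replacement(prev_exp, curr_exp, sampled, max_share_drop=0.5):
    """Flag sampled SKUs whose expenditure share fell by more than
    `max_share_drop` (a hypothetical threshold) between two periods."""
    prev_total = sum(prev_exp.values())
    curr_total = sum(curr_exp.values())
    flagged = []
    for sku in sampled:
        prev_share = prev_exp.get(sku, 0.0) / prev_total
        curr_share = curr_exp.get(sku, 0.0) / curr_total
        if prev_share > 0 and curr_share < (1 - max_share_drop) * prev_share:
            flagged.append(sku)
    return flagged


def recommend_replacement(curr_exp, sampled):
    """Suggest the highest expenditure SKU not already in the sample."""
    candidates = {k: v for k, v in curr_exp.items() if k not in sampled}
    return max(candidates, key=candidates.get) if candidates else None


# Hypothetical product group with three SKUs, two of them sampled.
previous = {"SKU-A": 500.0, "SKU-B": 300.0, "SKU-C": 200.0}
current = {"SKU-A": 100.0, "SKU-B": 450.0, "SKU-C": 450.0}
sample = ["SKU-A", "SKU-B"]
flagged = flag_for_replacement(previous, current, sample)
suggestion = recommend_replacement(current, sample)
```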

15.10 The relevance tests described above are different for seasonal items (exclusively seasonal fruits), where certain varieties are popular in one part of the year but not available at all in other periods. For these seasonal items, the ABS selects the highest expenditure product variety each period. This approach ensures the most representative item is selected for that period. To avoid selecting clearance or unsuitable items, the selection is subject to a minimum monthly expenditure threshold. This is a continuation of the field collection practice, where field officers collect the price of the most representative variety each period.
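
A minimal sketch of the seasonal selection rule follows. The threshold value, variety names and expenditure figures are hypothetical.

```python
def select_seasonal_variety(expenditure_by_variety, min_expenditure):
    """Return the variety with the highest expenditure in the period,
    or None if no variety meets the minimum expenditure threshold."""
    eligible = {variety: value
                for variety, value in expenditure_by_variety.items()
                if value >= min_expenditure}
    if not eligible:
        return None
    return max(eligible, key=eligible.get)


# Hypothetical monthly expenditure by variety of one seasonal fruit.
in_season = {"variety-1": 12000.0, "variety-2": 800.0}
out_of_season = {"variety-1": 150.0, "variety-2": 90.0}

chosen = select_seasonal_variety(in_season, min_expenditure=1000.0)
nothing = select_seasonal_variety(out_of_season, min_expenditure=1000.0)
```

Returning no selection when every variety falls below the threshold is a design choice in this sketch; how such periods are treated in practice is not described here.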

15.11 Reference prices are calculated for any new items that are introduced to the sample via the replacement process. To ensure the new items are introduced into the sample at a normal selling price, the ABS calculates a 'normal' unit value price (which has an average level of specialling) as the reference unit value price. This is done by first calculating an average discount over the previous 12 months and then applying this average discount to the highest price in the previous month (or the last month of the previous quarter). This reflects a 'normal level of discounting' and ensures the index is not distorted by unusual price activity in the previous period.
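
The reference price calculation can be sketched as below. How the discount is measured is an assumption here: each month's discount is taken as one minus the ratio of the monthly unit value to the highest price observed in that month. The prices are hypothetical.

```python
def reference_unit_value(monthly_unit_values, monthly_highest_prices):
    """'Normal' reference price for a new item.

    Both lists cover the previous 12 months in chronological order,
    so the last element refers to the previous month.
    """
    if not monthly_unit_values:
        raise ValueError("no history supplied")
    if len(monthly_unit_values) != len(monthly_highest_prices):
        raise ValueError("inputs must cover the same months")
    # Assumed definition: discount = shortfall of the unit value
    # relative to the highest price in the same month.
    discounts = [1.0 - uv / high
                 for uv, high in zip(monthly_unit_values, monthly_highest_prices)]
    average_discount = sum(discounts) / len(discounts)
    return monthly_highest_prices[-1] * (1.0 - average_discount)


# Hypothetical item on special (10 per cent off) every second month.
highest = [4.00] * 12
unit_vals = [4.00, 3.60] * 6
reference = reference_unit_value(unit_vals, highest)
```

With an average discount of 5 per cent, the reference price sits between the full price and the discounted price, so the item does not enter the sample at an unusually high or low level.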

15.12 Summarising the current approach, the calculation of average unit prices over both time and respondent outlets provides a more representative price paid by consumers over the reference period (e.g. month, quarter) than a point-in-time price from a specific outlet. The replacement strategy also informs sampling decisions, where high expenditure items dictate the products sampled in the CPI. While the implementation of the direct replacement method using unit values has provided the Australian CPI with a significant enhancement, it is recognised that more can be done with transactions data.


15.13 The benefits of using a superlative index formula, such as the Fisher Ideal index, to reduce bias in the CPI are discussed in Price index theory of this manual. The availability of timely expenditure information (for weighting purposes) enables the calculation of weighted bilateral indexes (such as the Fisher and Törnqvist) that account for consumer substitution across time. However, traditional methods have known fundamental weaknesses when chaining price indexes at a high frequency (Ivancic, Fox and Diewert 2011).
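
For reference, the weighted bilateral indexes named above can be written as follows, in standard notation that is not defined elsewhere in this chapter: p and q denote the price and quantity of item i, 0 is the base period, t is the comparison period, and s is an expenditure share.

```latex
P_L^{0,t} = \frac{\sum_i p_i^t q_i^0}{\sum_i p_i^0 q_i^0}, \qquad
P_P^{0,t} = \frac{\sum_i p_i^t q_i^t}{\sum_i p_i^0 q_i^t}, \qquad
P_F^{0,t} = \sqrt{P_L^{0,t} \, P_P^{0,t}}

P_T^{0,t} = \prod_i \left( \frac{p_i^t}{p_i^0} \right)^{\frac{1}{2}\left(s_i^0 + s_i^t\right)}, \qquad
s_i^t = \frac{p_i^t q_i^t}{\sum_j p_j^t q_j^t}
```

The Fisher index is the geometric mean of the Laspeyres and Paasche indexes, while the Törnqvist index weights each price relative by the average of its expenditure shares in the two periods.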

15.14 New methods and processes are required to maximise the use of transactions data. Typically, multilateral index methods have been used in the spatial context to compare price levels across different regions; however, academics and NSOs are proposing they be used to make price comparisons across multiple (three or more) time periods. Temporal multilateral methods produce weighted price indexes and have the property of transitivity (footnote 3).

15.15 In recent years there has been an increase in the range of multilateral methods proposed for use in CPI aggregation when using transactions data. Earlier research conducted by the ABS assessed well-known multilateral methods in a temporal context, including the Gini, Eltetö, Köves and Szulc (GEKS) method and the Time Product Dummy (TPD) method, which were proposed by Ivancic, Fox and Diewert (2009). Since then, however, a number of new methods have been proposed for producing multilateral temporal indexes, such as the multilateral method recently implemented by Statistics Netherlands for mobile phones, adapted from the Geary-Khamis (GK) method (Chessa 2016).
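
To make the GEKS idea concrete, the sketch below computes a GEKS index as the geometric mean, over every link period in the window, of the Fisher comparison that runs from the base period through that link period to the current period. It is a simplified illustration: items are matched pairwise between periods, the data are hypothetical, and no extension method is applied.

```python
import math


def fisher(p0, q0, p1, q1):
    """Bilateral Fisher index from period 0 to period 1 on matched items."""
    items = sorted(set(p0) & set(p1))
    laspeyres = (sum(p1[i] * q0[i] for i in items)
                 / sum(p0[i] * q0[i] for i in items))
    paasche = (sum(p1[i] * q1[i] for i in items)
               / sum(p0[i] * q1[i] for i in items))
    return math.sqrt(laspeyres * paasche)


def geks(prices, quantities):
    """GEKS index for each period in the window, with period 0 as base."""
    periods = range(len(prices))
    index = []
    for t in periods:
        log_sum = 0.0
        for link in periods:
            via_link = (fisher(prices[0], quantities[0], prices[link], quantities[link])
                        * fisher(prices[link], quantities[link], prices[t], quantities[t]))
            log_sum += math.log(via_link)
        index.append(math.exp(log_sum / len(prices)))
    return index


# Hypothetical window: item A is discounted in period 1, then prices and
# quantities return to their period 0 values in period 2.
prices = [{"A": 1.0, "B": 1.0}, {"A": 0.5, "B": 1.0}, {"A": 1.0, "B": 1.0}]
quantities = [{"A": 10, "B": 10}, {"A": 40, "B": 8}, {"A": 10, "B": 10}]
geks_index = geks(prices, quantities)
```

Because the resulting index is transitive, it returns to parity in period 2 when prices and quantities revert to their original values.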

15.16 Practical challenges exist when applying these methods in the CPI. When a multilateral method is extended by an additional period (e.g. quarter), previous price movements are revised, which is unacceptable for NSOs. Additionally, the length of time the multilateral method uses price and expenditure information for index compilation is important in order to account for seasonal availability of product prices without impairing the index. There are a number of approaches available to NSOs to deal with these challenges, and these have been discussed in Multilateral extension methods in the ABS Information Paper: Making Greater Use of Transactions Data to compile the Consumer Price Index, Australia, 2016 (cat. no. 6401.0.60.003).
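
One extension approach described in the literature is often called a movement splice: the multilateral index is recomputed over the latest window, and only its most recent period-on-period movement is linked onto the published series, so earlier published values are never revised. The sketch below illustrates the idea with hypothetical figures; it is not a statement of the approach the ABS has adopted.

```python
def movement_splice(published, window_index):
    """Extend a published series by one period without revising it.

    `window_index` is the multilateral index recomputed over the latest
    window, ending in the new period; only its final movement is used.
    """
    if len(window_index) < 2:
        raise ValueError("the window must contain at least two periods")
    latest_movement = window_index[-1] / window_index[-2]
    return published + [published[-1] * latest_movement]


# Hypothetical published series and recomputed window index.
published = [100.0, 101.0, 102.5]
window_index = [1.0, 1.012, 1.020, 1.0302]
extended = movement_splice(published, window_index)
```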

15.17 While multilateral methods have strong theoretical and practical properties, only a handful of NSOs have actually implemented them (footnote 4), displaying caution in implementation and a divergence in methods and practices. The choice of index formula is contextual: the best elementary aggregation method depends, in part, on the transactions datasets that are available to the NSO. Other considerations specific to the NSO, such as the environment in which it operates and the methods it uses to compile the CPI more broadly, must also be taken into account.

Benefits and challenges in using multilateral methods

15.18 Big data are becoming available in new ways, providing NSOs with opportunities to deliver statistical output in a more efficient and innovative manner. Traditional price aggregation methods based on weighted price index formulae are suitable for use when the basket of items remains fixed over time. With transactions data, however, items are dynamic, appearing and disappearing on a regular basis over time. In order to accommodate a dynamic universe of items, these techniques try to match as many items as possible and form a continuous price index where price movements are chained together. Frequent chaining of weighted bilateral price indexes causes a chain drift (footnote 5) problem due to the nature of price and quantity bouncing. This breakdown of bilateral price index formulae has led researchers and NSOs to investigate alternative methods for price index aggregation.
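
The chain drift problem can be demonstrated with a small hypothetical example. One item goes on special in period 1 and consumers stock up; in period 2 the price returns to normal but purchases are low; in period 3 both prices and quantities are back at their period 0 values. A chained Törnqvist index is used here as one example of a weighted bilateral formula.

```python
import math


def tornqvist(p0, q0, p1, q1):
    """Bilateral Törnqvist index from period 0 to period 1."""
    items = sorted(set(p0) & set(p1))
    value0 = sum(p0[i] * q0[i] for i in items)
    value1 = sum(p1[i] * q1[i] for i in items)
    log_index = 0.0
    for i in items:
        share0 = p0[i] * q0[i] / value0
        share1 = p1[i] * q1[i] / value1
        log_index += 0.5 * (share0 + share1) * math.log(p1[i] / p0[i])
    return math.exp(log_index)


# Hypothetical price and quantity bouncing for item A; item B is stable.
prices = [{"A": 1.0, "B": 1.0}, {"A": 0.5, "B": 1.0},
          {"A": 1.0, "B": 1.0}, {"A": 1.0, "B": 1.0}]
quantities = [{"A": 10, "B": 10}, {"A": 40, "B": 10},
              {"A": 2, "B": 10}, {"A": 10, "B": 10}]

# Chain the period-on-period movements together.
chained = 1.0
for t in range(1, len(prices)):
    chained *= tornqvist(prices[t - 1], quantities[t - 1], prices[t], quantities[t])

# Compare period 3 with period 0 directly.
direct = tornqvist(prices[0], quantities[0], prices[3], quantities[3])
```

The direct comparison equals 1, as expected when nothing has changed between period 0 and period 3, whereas the chained index finishes below 1: the fall in price receives a larger expenditure weight than the subsequent rise, so the index drifts downward.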

15.19 The consideration of multilateral methods is motivated by a number of studies showing that bilateral methods can suffer from drift when chained at high frequencies (Ivancic, Fox and Diewert 2011; van der Grient and de Haan 2011). While a chained bilateral index measures short term movements accurately, a comparison between published index levels several periods apart may not accurately reflect the price change over that time. Multilateral methods are designed to preserve transitivity; they are independent of the choice of base period. However, as new data become available over time, multilateral methods revise previously estimated index numbers, which is troublesome in the context of producing a non-revised CPI. There are several strategies available to NSOs to extend a multilateral index which are discussed later in this publication.

15.20 Since multilateral methods use data from several periods, an unavoidable consequence is that the price comparison between any two periods depends on prices and weights in other periods as well, which could affect the relevance of the index. A tension between 'characteristicity' - the relevance of a price comparison to the periods under consideration - and 'chain drift' - loss of transitivity - must be considered when assessing multilateral methods (Ivancic, Fox and Diewert 2011). Empirical analysis should assist with this decision, with particular interest focusing on differences in the direction of the index movement, or the timing of turning points, that result from using different extension methods.

15.21 Making greater use of the price observations available in the dynamic transactions dataset would increase sample representativeness in the product dimension, reduce the probability of sampling error and eliminate biases associated with purposive sampling (e.g. item substitution, and new and disappearing items), thereby enhancing the accuracy of the index. This is in contrast to the traditional approach, which is based on purposive sampling, where a sample of prices is taken across several dimensions, including products, respondents, locations and time (Mackin, Oehm and Gow 2012). The incorporation of transactions data into the Australian CPI has increased the representativeness of the sample in both the location and time dimensions. However, making greater use of transactions data would further enhance the CPI. In addition, using expenditure data to weight the dynamic universe of prices based on their economic importance would make the price index movement a more accurate reflection of the concept that the CPI measures: the price change facing Australian households for a basket of goods and services (ABS 2011).

15.22 Increasing the number of price observations from transactions data and aggregating by way of a multilateral method would reduce the resources required at various stages of the statistical cycle (i.e. sampling, collection and processing of prices). The opportunity to automate processes to reduce costs associated with producing a CPI is attractive to all NSOs. An added advantage for the Australian CPI is that efficient methods also facilitate more timely outputs - processes that require less manual intervention can be completed sooner after the reference period, and make the production of higher-frequency outputs more feasible.

15.23 Utilising transactions data could also require an increase in resources at new points of the statistical cycle since new data usually requires new processes to be followed. The mapping of items to CPI classifications is currently a labour intensive process at the ABS as each respondent has unique classification structures. This process of mapping is required every time new transactions data are secured, or when there is a high amount of product churn in the datasets. The ABS plans to further investigate automated mapping in the future to facilitate increased use of transactions data in a less resource intensive manner.

15.24 While multilateral methods have several attractive properties, they are more complex than bilateral methods. It is important that NSOs are able to explain the properties of these methods to stakeholders, so that stakeholders can assess the methods' suitability for their purposes. Additionally, it is important that NSOs can explain published movements produced using these methods. Tools that can decompose price movements at the lower levels of aggregation are required to enable this explanation to occur.


15.25 The Australian CPI currently uses transactions data by calculating an average unit value by product, by taking the quantity and expenditure information over the period of interest and replacing the directly observed price with this unit value. While this has enhanced the Australian CPI, it is recognised that more can be done with big data.

15.26 The availability of timely expenditure data in transactions datasets allows weighted bilateral indexes to be calculated, accounting for consumer substitution. However, traditional bilateral methods break down when using transactions data, so new methods and processes are required. Temporal multilateral methods have been proposed as they preserve transitivity and make greater use of price and expenditure information.

15.27 Multilateral methods allow NSOs to use the dynamic universe of transactions data to enhance the accuracy of their price indexes. They make greater use of automated processes, providing NSOs with an opportunity to reduce costs across the statistical cycle. Automated processes also provide an opportunity for more timely output - less manual intervention facilitates completion sooner after the reference period, rendering higher frequency price indexes more feasible.

15.28 ABS research demonstrates support for the use of multilateral methods as the most opportune way to make greater use of transactions data to enhance the CPI. Whilst having many benefits, multilateral methods present NSOs with challenges which must be considered in a local context. The ABS will continue to assess aspects of multilateral methods. The ABS will also consult with peers and experts in order to develop a best practice approach.

REFERENCES(footnote 6)

Australian Bureau of Statistics (ABS) Sep 2013. Feature Article: The Use of Transactions Data to Compile the Australian Consumer Price Index. cat. no. 6401.0, Canberra.

Australian Bureau of Statistics (ABS) 2016. Information Paper: Making Greater Use of Transactions Data to compile the Consumer Price Index. cat. no. 6401.0.60.003, Canberra.

1 For a list of the CPI expenditure classes where some prices are derived from transactions data see the Feature Article: The use of transactions data to compile the Australian Consumer Price Index, Sep 2013 (cat. no. 6401.0).
2 Price bouncing is where there is considerable volatility in prices; for example, due to seasonal factors or sales competition. Quantity bouncing is when consumers frequently increase or decrease their volume of purchases in response to the item's change in price.
3 Transitivity is where the price change measured between two time periods is independent of whether they are compared directly or via some other period. Chained weighted price indexes do not share this property.
4 For a list of NSOs using transactions data in the compilation of the CPI see Appendix 1 of the Information Paper: Making Greater Use of Transactions Data to compile the Consumer Price Index, Australia, 2016 (cat. no. 6401.0.60.003).
5 Chain drift is the failure of an index to return to parity after prices and quantities revert back to their original values.
6 For a full list of references see the Bibliography of the Information Paper: Making Greater Use of Transactions Data to compile the Consumer Price Index, Australia, 2016 (cat. no. 6401.0.60.003).