**APPENDIX 1** REVIEW BY PROFESSOR JAN DE HAAN

**TRANSACTIONS DATA IN THE AUSTRALIAN CPI: RECOMMENDATION ON THE CHOICE OF MULTILATERAL METHOD AND EXTENSION METHOD**

By Jan de Haan (footnote 1), 5 May 2017

**SUMMARY**

The ABS has decided to implement a multilateral price index method for the treatment of transactions data in the CPI. Based on theoretical arguments and empirical research conducted at the ABS and elsewhere, this paper recommends GEKS-Törnqvist as the main multilateral method, preferably applied at the elementary (i.e. lowest) aggregation level. Aggregating up the GEKS-Törnqvist price indexes at the elementary aggregation level to the expenditure class level should be done using the Törnqvist formula as fixing the weights may introduce upper level substitution bias. To extend the time series when new data becomes available, a "mean splice" with a 9 quarter window is recommended. The paper also suggests some further improvements, including the improved treatment of comparable items with different Stock Keeping Units.

**ABOUT THE AUTHOR**

Jan de Haan is a senior methodologist at Statistics Netherlands and a Professor at Delft University of Technology (currently on leave). He works primarily in the field of price index theory and practice. He is a member of the Steering Committee of the Ottawa Group on price indexes, a member of Statistics Canada's Prices Methodology Advisory Committee, and an elected member of the U.S. Conference on Research in Income and Wealth.

Over the last two decades, he has conducted extensive research on the use of transactions data in the CPI and has been involved in the implementation of scanner data in the Dutch CPI.

He has collaborated with the ABS in a number of ways, through visits, secondments and the 2006-2008 Australian Research Council Grant "Scanner Data in the Consumer Price Index: How to Expand and Improve Their Use", led by Prof. Kevin Fox (UNSW), with the ABS and Statistics Netherlands as industry partners. From February to December 2017 he is seconded to the ABS as a principal advisor.

**BACKGROUND**

The ABS has been using transactions data for several years to compile price indexes for the CPI. Essentially, the prices collected by visiting stores were replaced by unit values from transactions data. The methodology to construct price indexes at the Elementary Aggregation (EA) level, in particular the samples of items and the method to aggregate prices into price indexes, was left unchanged. Apart from cost benefits, the advantage is that unit values are the appropriate measures of prices actually paid by consumers.

Transactions data, also known as scanner data, provides the ABS with the opportunity to use a census of products and compile weighted rather than unweighted price indexes. This would further enhance the Australian CPI. Weighting by expenditure at the product level would be an important improvement. Transactions data also enables more accurate weighting of the price indexes at the EA level to form price indexes at the Expenditure Class (EC) level.

A distinct feature of transactions data is product turnover, which can be significant for some EAs. To maximize the number of matches in the data, chaining quarter-on-quarter price changes would seem useful. However, when sales occur, chain-linking weighted price indexes can lead to downward drift; see for example Ivancic (2007). Multilateral price index methods have been proposed to deal with this issue.

The ABS decided to implement a multilateral price index method for the treatment of scanner data in the December quarter 2017. I strongly support this decision. It will lead to weighted price indexes, based on all the matched products available in the data sets, which are free of chain drift. Experiences at Statistics New Zealand and Statistics Netherlands, which have already implemented multilateral price index methods, have shown that these methods perform as expected. There are also cost benefits since multilateral methods can be almost fully automated.

**RECOMMENDATIONS**

Several multilateral index number methods are available. The ABS previously excluded the Geary-Khamis method; the choice is between the GEKS and weighted Time Product Dummy (TPD) methods. The GEKS method transitivizes superlative bilateral indexes, such as Fisher or Törnqvist indexes. GEKS-Törnqvist has an advantage over GEKS-Fisher in that it facilitates decompositions. Weighted TPD is a regression-based method where the logarithm of price is regressed on time and product specific dummy variables in a multi-period context.

My recommendation is to use GEKS-Törnqvist as the main multilateral price index method. Where appropriate, or where the ABS previously identified issues with GEKS, TPD can be used instead. As a final check on their performance, I suggest running the two methods in parallel during the second half of 2017.

The ABS will construct multilateral price indexes for each data provider separately. The preferred level of aggregation for estimating multilateral indexes is the EA level. The EA-indexes should be aggregated to the EC level using the Törnqvist formula. At this stage, I would suggest aggregating the EC-indexes across data providers using annually fixed weights from scanner data.

The ABS has been investigating different methods to extend multilateral price index series when new data becomes available. I recommend the use of what is now known as a mean splice with a window length of 9 quarters.

**JUSTIFICATION OF THE GEKS-TÖRNQVIST MULTILATERAL METHOD**

The recommendation to use GEKS-Törnqvist as the main multilateral method is based on theoretical arguments, empirical evidence and practical issues, taking into account the environment of the ABS, such as key stakeholders' and academics' views.

**Theory**

The economic approach to index number theory suggests that, when comparing two periods (months or quarters), a superlative index number formula, such as the Fisher or Törnqvist, should be used. Superlative indexes treat the two periods in a symmetric fashion and account for substitution effects (ILO et al., 2004).

Item turnover in scanner data can be high, and to maximize the number of matches in the data, high frequency chaining is required. It turns out, however, that when sales occur in scanner data, some assumptions underlying the standard economic approach are no longer valid. In particular, the quantities of storable goods purchased after the sales period do not immediately return to their "normal" level, which produces drift in period-on-period chained superlative price indexes. The drift resulting from sales is typically downward.
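The drift mechanism described above can be illustrated with a small numerical sketch. The data are hypothetical and chosen only to mimic a sale: item A is discounted in quarter 1, households stock up, purchases dip in quarter 2, and everything is back to its starting position in quarter 3. The function name `tornqvist` and the data layout (one dictionary of prices and one of quantities per quarter) are assumptions made for this illustration.

```python
import math

def tornqvist(p0, q0, p1, q1):
    """Bilateral Törnqvist price index over items present in both periods."""
    matched = sorted(set(p0) & set(p1))
    e0 = {i: p0[i] * q0[i] for i in matched}
    e1 = {i: p1[i] * q1[i] for i in matched}
    tot0, tot1 = sum(e0.values()), sum(e1.values())
    log_index = sum(
        0.5 * (e0[i] / tot0 + e1[i] / tot1) * math.log(p1[i] / p0[i])
        for i in matched
    )
    return math.exp(log_index)

# Hypothetical data: A is on sale in quarter 1, demand dips in quarter 2,
# and quarter 3 repeats quarter 0 exactly.
prices = [
    {"A": 1.0, "B": 1.0},
    {"A": 0.5, "B": 1.0},
    {"A": 1.0, "B": 1.0},
    {"A": 1.0, "B": 1.0},
]
quantities = [
    {"A": 10, "B": 10},
    {"A": 50, "B": 10},
    {"A": 2, "B": 10},
    {"A": 10, "B": 10},
]

# Quarter-on-quarter chained index from quarter 0 to quarter 3.
chained = 1.0
for t in range(1, len(prices)):
    chained *= tornqvist(prices[t - 1], quantities[t - 1], prices[t], quantities[t])

# Direct comparison of quarter 3 with quarter 0.
direct = tornqvist(prices[0], quantities[0], prices[3], quantities[3])
print(chained, direct)
```

In this example the direct index equals 1, because prices and quantities have returned to their initial values, whereas the chained index ends at roughly 0.89. The gap is the downward drift caused by the asymmetric expenditure shares around the sale.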

The solution is to use a multilateral index method. A multilateral index is transitive, hence independent of the choice of base period and free from chain drift. The GEKS method is grounded in standard index number theory: it takes the geometric mean of all possible (matched-item) superlative price indexes across the estimation period, where each period in turn serves as the base. Particularly when item turnover is low, this method is best from a theoretical point of view. Ivancic, Diewert and Fox (2011) proposed the use of GEKS with a particular extension method (a movement splice) for the treatment of scanner data in the CPI (footnote 2).
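A minimal sketch of the GEKS-Törnqvist calculation on a single estimation window may help to fix ideas. It is an illustration under stated assumptions, not the ABS production method: the function names, the data layout (one dictionary of prices and one of quantities per period) and the example data are hypothetical, and every pair of periods is assumed to share at least one matched item.

```python
import math

def tornqvist(p0, q0, p1, q1):
    """Bilateral matched-item Törnqvist price index between two periods."""
    matched = sorted(set(p0) & set(p1))
    e0 = {i: p0[i] * q0[i] for i in matched}
    e1 = {i: p1[i] * q1[i] for i in matched}
    tot0, tot1 = sum(e0.values()), sum(e1.values())
    log_index = sum(
        0.5 * (e0[i] / tot0 + e1[i] / tot1) * math.log(p1[i] / p0[i])
        for i in matched
    )
    return math.exp(log_index)

def geks_tornqvist(prices, quantities):
    """GEKS-Törnqvist index levels for each period, relative to period 0."""
    n = len(prices)
    # Log bilateral index for every ordered pair of periods (a, b).
    log_p = [
        [math.log(tornqvist(prices[a], quantities[a], prices[b], quantities[b]))
         for b in range(n)]
        for a in range(n)
    ]
    index = []
    for t in range(n):
        # Geometric mean, over all link periods l, of P(0, l) * P(l, t).
        log_geks = sum(log_p[0][l] + log_p[l][t] for l in range(n)) / n
        index.append(math.exp(log_geks))
    return index

# Hypothetical sale pattern: quarter 3 repeats quarter 0 exactly.
prices = [
    {"A": 1.0, "B": 1.0},
    {"A": 0.5, "B": 1.0},
    {"A": 1.0, "B": 1.0},
    {"A": 1.0, "B": 1.0},
]
quantities = [
    {"A": 10, "B": 10},
    {"A": 50, "B": 10},
    {"A": 2, "B": 10},
    {"A": 10, "B": 10},
]
geks = geks_tornqvist(prices, quantities)
print(geks)
```

Because quarter 3 repeats quarter 0, the GEKS index returns to 1 in quarter 3, which is the sense in which a transitive index is free of chain drift on this example.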

To fully account for substitution effects, implementing GEKS at the EC level would be preferable. On the other hand, there are good reasons to compile price indexes at the EA level; see below. Since superlative indexes are approximately consistent in aggregation, applying GEKS at the EA level will be fine, provided that the indexes are aggregated up to the EC level using a superlative index number formula. The use of Törnqvist bilateral price indexes in the GEKS procedure facilitates decomposition analysis. For reasons of consistency, the Törnqvist indexes at the EA level should preferably be aggregated up to the EC level using Törnqvist weighting.
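The Törnqvist aggregation of EA-level movements to the EC level can be sketched as follows. The EA names, shares and index movements are hypothetical, and the sketch covers a single comparison between two periods; how the resulting EC movements are linked over time is not specified here and is left open.

```python
import math

def tornqvist_aggregate(ea_relatives, shares_prev, shares_curr):
    """Aggregate EA index movements into one EC movement with Törnqvist weights.

    ea_relatives: EA index in the current period divided by the EA index
                  in the comparison period, keyed by EA.
    shares_prev, shares_curr: EA expenditure shares within the EC in the
                  comparison and current periods (each set sums to 1).
    """
    log_movement = sum(
        0.5 * (shares_prev[ea] + shares_curr[ea]) * math.log(ea_relatives[ea])
        for ea in ea_relatives
    )
    return math.exp(log_movement)

# Hypothetical EC with two EAs.
relatives = {"bread": 1.10, "cakes": 1.00}
shares_prev = {"bread": 0.5, "cakes": 0.5}
shares_curr = {"bread": 0.5, "cakes": 0.5}
ec_movement = tornqvist_aggregate(relatives, shares_prev, shares_curr)
print(ec_movement)
```

Using the average of the two periods' shares, rather than shares fixed in one period, is what allows the upper-level aggregate to reflect substitution between EAs.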

When item turnover is relatively high, there may be some issues with GEKS. One of the reasons is that the "missing prices" for the new and disappearing items are not imputed; GEKS is a strictly matched-item approach. A model-based approach that imputes these prices might perform better, in particular when a hedonic model is used. The ABS does not systematically observe characteristics information, and so hedonic regression cannot be implemented on a large scale.

An alternative method is the regression-based weighted TPD method, which imputes the "missing prices" also. Krsinich (2016) proposed this method, in combination with another extension method (a window splice). Nevertheless, TPD is still a matched-item approach in that it utilizes longitudinal price information to estimate the parameters and therefore requires at least two observations for an item to be included. Consequently, new items will be introduced with a one period lag, similar to the GEKS method. It can be shown that the weighted TPD index is approximately equal to a standardized unit value index where the standardization factors - which can be interpreted as quality-adjustment factors - are based on the estimated item-specific parameters in the model. This raises two issues.

Firstly, and most importantly, after standardization, TPD is an approximately additive method. That is, the TPD method is appropriate for product categories consisting of items which, after standardization/quality adjustment, are perfectly substitutable (footnote 3). Many ECs are too heterogeneous to apply the TPD method; standardization can only work for broadly comparable items. It is difficult to determine exactly when a product category consists of broadly comparable items, but some EAs may be sufficiently homogeneous to allow the use of a single TPD model.

Secondly, because information on item characteristics is not used, there is no guarantee that the standardization factors are good approximations to the true quality-adjustment factors (de Haan, Hendriks and Scholz, 2016).
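For readers who prefer to see the weighted TPD regression spelled out, the following is a minimal sketch under stated assumptions. The model is ln p_it = delta_t + gamma_i + error, fitted by weighted least squares with each observation weighted by the item's expenditure share in its period. Instead of calling a regression routine, the sketch solves the least squares problem by alternating between the item effects and the time effects, which is a choice made here to keep the example free of external libraries. The function name, data layout and example data are hypothetical.

```python
import math

def weighted_tpd(prices, quantities, tol=1e-12, max_iter=10_000):
    """Weighted time-product dummy index levels relative to period 0."""
    n = len(prices)
    obs = []  # (period, item, log price, expenditure share within period)
    for t in range(n):
        total = sum(prices[t][i] * quantities[t][i] for i in prices[t])
        for i in prices[t]:
            share = prices[t][i] * quantities[t][i] / total
            obs.append((t, i, math.log(prices[t][i]), share))
    items = {i for _, i, _, _ in obs}
    delta = [0.0] * n  # time effects (log price level per period)
    for _ in range(max_iter):
        # Step 1: item effects given the current time effects.
        num = {i: 0.0 for i in items}
        den = {i: 0.0 for i in items}
        for t, i, lp, w in obs:
            num[i] += w * (lp - delta[t])
            den[i] += w
        gamma = {i: num[i] / den[i] for i in items}
        # Step 2: time effects given the item effects.
        num_t = [0.0] * n
        den_t = [0.0] * n
        for t, i, lp, w in obs:
            num_t[t] += w * (lp - gamma[i])
            den_t[t] += w
        new_delta = [num_t[t] / den_t[t] for t in range(n)]
        change = max(abs(a - b) for a, b in zip(new_delta, delta))
        delta = new_delta
        if change < tol:
            break
    # Normalise so that the index equals 1 in period 0.
    return [math.exp(d - delta[0]) for d in delta]

# Hypothetical data in which all prices move proportionally (+10% per period).
prices = [
    {"A": 2.0, "B": 5.0},
    {"A": 2.2, "B": 5.5},
    {"A": 2.42, "B": 6.05},
]
quantities = [
    {"A": 10, "B": 4},
    {"A": 12, "B": 3},
    {"A": 8, "B": 5},
]
tpd = weighted_tpd(prices, quantities)
print(tpd)
```

When all prices move proportionally, as in this example, the model fits exactly and the index reproduces the common price change. An item observed in only one period is fitted perfectly by its own item effect and therefore contributes no information to the time effects, which is consistent with the remark above that at least two observations are needed for an item to matter.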

**Empirical evidence**

The ABS has been conducting extensive empirical research on multilateral price index methods (ABS, 2016). The empirical evidence is not conclusive, however. Surprisingly, the evidence does not, in general, point to upward bias in weighted TPD indexes at the EC level as the economic approach to index number theory predicts. It is not entirely clear why this is the case. One reason could be that standard theory assumes a fixed universe of items, i.e. no item turnover, whereas real data is typically characterized by significant churn. However, even for product categories, both at the EA and EC level, with limited item turnover, the ABS did not find systematic differences between GEKS and TPD indexes. Similar findings were reported by Statistics Netherlands on scanner data from supermarkets and department stores (Chessa, 2016).

Unpublished work by the ABS showed that the GEKS-Törnqvist index can be written in a form which is mathematically very similar to the weighted TPD index. In addition, de Haan (2015) showed that the weighted TPD index can be interpreted as an imputation Törnqvist-type price index. These results provide some insight as to why the differences between the two methods may be small, but more work remains to be done to come up with a completely convincing answer.

In any case, it is reassuring that GEKS and weighted TPD often lead to similar price trends; in practice the choice of multilateral method appears to be less important than economic theory suggests.

**Current practices and the environment**

As was mentioned above, there are several reasons to prefer calculating GEKS indexes at the EA rather than EC level, data permitting. First, this leaves the current structure of the CPI more or less unchanged, which may be useful when explaining new methods to users as well as practitioners. Second, stratification of an EC into EAs provides some flexibility. For example, when an unexpected problem occurs in an EA, the remaining EAs would not be affected. Third, it enables the ABS to use different methods - GEKS-Törnqvist or weighted TPD - for different EAs belonging to the same EC, if deemed necessary.

In their comments on the ABS Information Paper (ABS, 2016), the Reserve Bank of Australia stressed the importance of the economic approach to index number theory when choosing a multilateral method. In other words, they prefer GEKS over weighted TPD. Diewert and Fox (2017) recently recommended the GEKS-Törnqvist method for the treatment of scanner data (footnote 4).

**JUSTIFICATION OF THE MEAN SPLICE EXTENSION METHOD**

When new data becomes available, previously estimated multilateral indexes change, which is problematic because the CPI cannot be (continuously) revised. A number of methods are available to extend a multilateral time series without revising the published price index numbers. We distinguish between rolling window methods and an annually-chained direct method.

Rolling window methods estimate multilateral price indexes on a window with fixed length, say *T* quarters (for the quarterly Australian CPI), which is shifted forwards each quarter. The results of the latest window are then spliced onto the existing time series. For example, the most recently estimated quarter-on-quarter GEKS index movement can be spliced onto the index level of the previous quarter. This "movement splice" was used by Ivancic, Diewert and Fox (2011). Another option is to splice the most recently estimated movement across the whole window onto the index level of *T*-1 quarters ago. Krsinich (2016) proposed this "window splice". These are extreme choices, and de Haan (2015a) proposed a "half splice", where the most recently estimated second half of the index movement is spliced onto the index level of (*T*-1)/2 quarters ago (assuming *T* is an odd number).
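The three single-link splices described above differ only in the choice of link quarter, which the following sketch makes explicit. It assumes that the multilateral index on the latest window has already been computed and is supplied as a list of *T* index levels; the function names, arguments and example figures are hypothetical.

```python
def splice(published, window_index, link):
    """Extend a published series by one quarter using a single link quarter.

    published:    published index levels up to and including the previous
                  quarter (at least T-1 values).
    window_index: index levels on the latest window of T quarters, in any
                  base; the last entry is the new quarter.
    link:         position of the link quarter in the window (0 .. T-2).
    """
    T = len(window_index)
    if len(published) < T - 1:
        raise ValueError("published series is shorter than the window overlap")
    if not 0 <= link <= T - 2:
        raise ValueError("link must lie between 0 and T-2")
    # Window position j corresponds to published[j - (T - 1)].
    link_level = published[link - (T - 1)]
    return link_level * window_index[-1] / window_index[link]

def movement_splice(published, window_index):
    # Link on the previous quarter.
    return splice(published, window_index, len(window_index) - 2)

def window_splice(published, window_index):
    # Link on the quarter T-1 quarters ago (the first quarter of the window).
    return splice(published, window_index, 0)

def half_splice(published, window_index):
    # Link on the quarter (T-1)/2 quarters ago; requires an odd window length.
    T = len(window_index)
    if T % 2 == 0:
        raise ValueError("half splice assumes an odd window length")
    return splice(published, window_index, (T - 1) // 2)

# Hypothetical example with T = 5.
published = [100.0, 103.0, 101.0, 104.0]
window_index = [1.00, 1.02, 1.01, 1.05, 1.06]
print(movement_splice(published, window_index))
print(window_splice(published, window_index))
print(half_splice(published, window_index))
```

When the published series and the latest window agree on all past movements, the three functions return the same value. They diverge only to the extent that the new window revises earlier movements.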

The above extension methods splice price movements onto a single link quarter. With a volatile price series, as we often observe in scanner data, different link quarters yield different results. Moreover, there are more possible index movements and link quarters than the three mentioned above. As all link quarters are equally valid, Diewert and Fox (2017) proposed a "mean splice" (footnote 5) by taking the geometric mean of all the price indexes that are obtained by using every possible link quarter, given the window length. This method makes the result independent of the choice of link period, which is a useful property.
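A minimal sketch of the mean splice follows. As with any single-link splice, it assumes that the multilateral index on the latest window of *T* quarters is already available as a list of index levels. The function name, arguments and example figures are hypothetical.

```python
import math

def mean_splice(published, window_index):
    """Extend a published series by one quarter using the mean splice.

    published:    published index levels up to and including the previous
                  quarter (at least T-1 values).
    window_index: index levels on the latest window of T quarters, in any
                  base; the last entry is the new quarter.
    """
    T = len(window_index)
    if len(published) < T - 1:
        raise ValueError("published series is shorter than the window overlap")
    log_sum = 0.0
    for link in range(T - 1):
        # Candidate level for the new quarter when linking on this quarter.
        candidate = published[link - (T - 1)] * window_index[-1] / window_index[link]
        log_sum += math.log(candidate)
    # Geometric mean over all T-1 possible link quarters.
    return math.exp(log_sum / (T - 1))

# Hypothetical example with T = 5.
published = [100.0, 103.0, 101.0, 104.0]
window_index = [1.00, 1.02, 1.01, 1.05, 1.06]
new_level = mean_splice(published, window_index)
print(new_level)
```

The result always lies between the smallest and largest of the single-link candidates, so no individual link quarter, however unusual, fully determines the published figure.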

The ABS previously identified some issues with movement and window splicing; the movement splice can yield downward drift due to disappearing items with unusually low prices (clearance sales) whereas the window splice can yield downward drift due to new items entering with unusually high prices. Recent decomposition analysis at the ABS indicated that the mean splice acts more like a movement splice near the start of the window (thus mitigating problems with disappearing items) and more like a window splice near the end (mitigating problems with new items). Further empirical exercises showed that the mean splice indeed works as expected, further strengthening the case for using it.

The choice of length of the estimation window is largely an empirical matter. Ivancic, Diewert and Fox (2011) argued that it should be at least 5 quarters (or 13 months) to be able to include strongly seasonal goods. Since seasonal patterns may shift slightly over time, it would be safer to choose a somewhat longer window. On the other hand, the window should not be too long because this will incur a loss of characteristicity: past prices and price changes will disproportionately affect the estimated price movements for recent periods (footnote 6). Empirical research at the ABS revealed that, at least for many products sold in supermarkets, the trend in rolling window GEKS indexes becomes flatter when increasing the window length. All in all, I recommend a window length of 9 quarters (or 25 months).

Chessa (2016) proposed an annually chained direct extension method, in part to comply with recommendations for the European Union Harmonized Index of Consumer Prices. For quarterly data, the idea is to construct short-term multilateral index series, starting in the December quarter and ending in the December quarter of the next year, i.e. with a length of 5 quarters, and chain link them in the December quarter of each year to obtain a long-term time series (footnote 7). The length of the estimation window for the short-term indexes is extended each quarter, without publishing the revised index numbers - the index for the March quarter in the short-term series is estimated on two quarters of data (a bilateral rather than multilateral comparison), and so forth, until in the December quarter five quarters of data are used.

A potential drawback of Chessa's (2016) proposal is that the price indexes for the first quarters of each year are based on sparse data and expected to be quite volatile. This was confirmed by empirical research at the ABS. Also, the December quarter acts as the short-term index reference period and is thus given special importance. If, for some reason, the December quarter is "unusual", the results may be adversely affected. So, although annual chaining can be useful because this will probably not lead to drift, I would not recommend Chessa's (2016) extension method (footnote 8).

**POTENTIAL IMPROVEMENTS**

To keep the production system relatively simple, I suggest using annually fixed weights from the scanner data to aggregate the EC GEKS price indexes across the different data providers. Annually updating the weights alleviates potential chain substitution bias. In the future, the ABS could decide to change over to Törnqvist weighting, but I do not consider this a priority.
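As a rough sketch of the fixed-weight aggregation across data providers, the following assumes a weighted arithmetic mean of each provider's EC index movement since the annual link quarter. The arithmetic form, the function name and the figures are assumptions made for illustration; the text above specifies only that the weights are fixed annually and taken from the scanner data.

```python
def aggregate_across_providers(relatives, weights):
    """Weighted arithmetic mean of provider-level EC index movements.

    relatives: provider EC index in the current quarter divided by its
               value in the annual link quarter, keyed by provider.
    weights:   annual expenditure per provider (any scale); the weights
               stay fixed until the next annual update.
    """
    total = sum(weights[r] for r in relatives)
    return sum(weights[r] / total * relatives[r] for r in relatives)

# Hypothetical figures for two data providers.
weights = {"provider_X": 60.0, "provider_Y": 40.0}
relatives = {"provider_X": 1.02, "provider_Y": 1.05}
link_level = 100.0  # aggregate EC index in the link quarter
ec_level = link_level * aggregate_across_providers(relatives, weights)
print(ec_level)
```

At each annual update the weights would be replaced and the current aggregate level would become the new link level, so that the long-term series is a chain of annually weighted segments.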

An important aspect when constructing matched-items (multilateral) price indexes is the choice of item identifier. Items in scanner data are typically identified by barcode. Some products with different barcodes, however, are similar from the consumers' point of view. Also, barcodes often change if unimportant characteristics change, such as type of packaging. In that case, matching at the barcode level overestimates item churn, and disguised price changes due to re-launches of comparable items with different barcodes will not be observed; see also Dalén (2017). Such disguised price changes are typically upward, and missing them produces downward bias in the index.

Fortunately, the ABS receives Stock Keeping Units (SKU) from the data providers and calculates unit values across SKUs rather than individual barcodes. This mitigates the above issues. In some instances, for example for the Personal Care EC, even SKU may be too detailed so that matched-item methods, including GEKS, can yield biased results. Characteristics information extracted from item descriptions in the scanner data sets could potentially be used to identify items from their characteristics rather than SKU and apply a multilateral method to these broadly defined items, which is the approach followed by Statistics Netherlands (Chessa, 2016), or to estimate hedonic multilateral price indexes, for example as described by de Haan and Krsinich (2014, 2017).
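The calculation of unit values at the SKU level can be sketched as follows: expenditure and quantity are summed over all barcodes that share a SKU, and the unit value is their ratio. The record layout, identifiers and figures are hypothetical.

```python
from collections import defaultdict

def unit_values_by_sku(transactions):
    """Unit value (total expenditure / total quantity) per SKU for one period.

    transactions: iterable of (sku, barcode, expenditure, quantity) records.
    """
    expenditure = defaultdict(float)
    quantity = defaultdict(float)
    for sku, _barcode, value, qty in transactions:
        expenditure[sku] += value
        quantity[sku] += qty
    # SKUs with zero recorded quantity have no defined unit value and are skipped.
    return {sku: expenditure[sku] / quantity[sku]
            for sku in expenditure if quantity[sku] > 0}

# Hypothetical quarter: two barcodes (e.g. old and new packaging) share one SKU.
records = [
    ("SKU1", "barcode_a", 10.0, 5),
    ("SKU1", "barcode_b", 6.0, 3),
    ("SKU2", "barcode_c", 9.0, 2),
]
unit_values = unit_values_by_sku(records)
print(unit_values)
```

Because both barcodes feed into a single SKU-level price, a re-launch under a new barcode does not break the item's price history in a matched-item index.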

As mentioned before, the ABS is currently not extracting characteristics information on a large scale, and it may therefore be sensible to exclude EAs or ECs for the time being where this problem occurs and continue with the current unweighted, sample based geometric means index.

The ABS previously identified another issue with GEKS: the GEKS index seems to be more sensitive to tiny quantities than weighted TPD, for example when clearance sales occur. Increased volatility of the results is not such a big issue - price indexes from scanner data are often volatile anyway - but bias obviously is. Perhaps the modification of GEKS proposed by Lamboray and Krsinich (2015) could help resolve the clearance sales issue. Another possibility would be to use some form of filtering. The drawback of these approaches is their ad hoc and sometimes arbitrary nature.

Another potential issue with GEKS is the loss of characteristicity within the estimation window, say the recommended 9 quarters. A weighted GEKS approach, as proposed by Melser (2016), could be considered to reduce the loss of characteristicity. The bilateral indexes are down-weighted the further away the link period is from the most recent period. The aggregate matched-items expenditure shares seem to be a useful choice for the weights.

**REFERENCES**

Australian Bureau of Statistics (ABS) (2016), "Making Greater Use of Transactions Data to Compile the Consumer Price Index", Information Paper 6401.0.60.003, November 29, Canberra: ABS.

Chessa, A.G. (2016), "A New Methodology for Processing Scanner Data in the Dutch CPI", Eurona 1, 49-69.

Dalén, J. (2017), "Unit Values in Scanner Data - Some Operational Issues", Paper to be presented at the 15th Meeting of the Ottawa Group, 10-12 May 2017, Eltville, Germany.

Diewert, W.E. and K.J. Fox (2017), "Substitution Bias in Multilateral Methods for CPI Construction Using Scanner Data", Discussion Paper 17-02, Vancouver School of Economics, The University of British Columbia, Vancouver, Canada.

de Haan, J. (2015a), "A Framework for Large Scale Use of Scanner Data in the Dutch CPI", Paper presented at the 14th meeting of the Ottawa Group, 20-22 May 2015, Tokyo, Japan.

de Haan, J. (2015b), "Rolling Year Time Dummy Indexes and the Choice of Splicing Method", Research paper, Statistics Netherlands, The Hague, The Netherlands.

de Haan, J. and H.A. van der Grient (2011), "Eliminating Chain Drift in Price Indexes Based on Scanner Data", Journal of Econometrics 161, 36-46.

de Haan, J. and F. Krsinich (2014), "Scanner Data and the Treatment of Quality Change in Non-Revisable Price Indexes", Journal of Business & Economic Statistics 32, 341-358.

de Haan, J., R. Hendriks, and M. Scholz (2016), "A Comparison of Weighted Time-Product Dummy and Time Dummy Hedonic Indexes", Graz Economics Papers 2016-13, Department of Economics, University of Graz, Austria.

de Haan, J. and F. Krsinich (2017), "Time Dummy Hedonic and Quality-Adjusted Unit Value Indexes: Do They Really Differ?", forthcoming in Review of Income and Wealth.

ILO/IMF/OECD/UNECE/Eurostat/The World Bank (2004), Consumer Price Index Manual: Theory and Practice. ILO Publications, Geneva.

Ivancic, L. (2007), Scanner Data and the Construction of Price Indices, PhD thesis, University of New South Wales, Sydney, Australia.

Ivancic, L., W.E. Diewert and K.J. Fox (2011), "Scanner Data, Time Aggregation and the Construction of Price Indexes", Journal of Econometrics 161, 24-35.

Krsinich, F. (2016), "The FEWS Index: Fixed Effects with a Window Splice", Journal of Official Statistics 32, 375-404.

Lamboray, C. (2017), "The Geary Khamis Index and the Lehr Index: How Much Do They Differ?", Paper to be presented at the 15th meeting of the Ottawa Group, 10-12 May 2017, Eltville, Germany.

Lamboray, C. and F. Krsinich (2015), "A Modification of the GEKS Index when Product Turnover Is High", Paper presented at the 14th meeting of the Ottawa Group, 20-22 May 2015, Tokyo, Japan.

Melser, D. (2016), "Scanner Data Price Indexes: Addressing Some Unresolved Issues", Journal of Business & Economic Statistics, online version, DOI: 10.1080/07350015.2016.1218339.

1 Division of Corporate Services, IT and Methodology, Statistics Netherlands and Delft University of Technology; email: j.dehaan@cbs.nl. The views expressed in this paper are those of the author and do not necessarily reflect the views of Statistics Netherlands.

2 They used GEKS-Fisher price indexes. In section 4 of the present paper, various extension methods are discussed. For rolling-year GEKS-Törnqvist indexes from Dutch scanner data, see de Haan and van der Grient (2011).

3 Diewert and Fox (2017) showed that TPD is also consistent with an elasticity of substitution equal to one. This result holds prior to standardization, but it is a bit unclear what the impact of standardization on their finding is.

4 Note that they refer to GEKS-Törnqvist as CCDI.

5 Ivancic, Diewert and Fox (2011) already alluded to the use of a mean splice.

6 In terms of the TPD method, a long window means that the item-specific parameters are constrained to be fixed over time for a longer period, and this is generally not good practice. Note that a rolling window approach would continually update the estimated parameters. De Haan (2015b) discussed the choice of splicing method for rolling year TPD and time dummy hedonic indexes.

7 Chessa (2016) used the Geary-Khamis multilateral method in combination with defining items by their characteristics.

8 Rolling window approaches can reintroduce some drift because they impair the transitivity property of multilateral price indexes. Lamboray (2017) suggested combining Chessa's (2016) annually chained direct method with a rolling window approach.