1351.0.55.056 - Research Paper: A Statistical Framework for Analysing Big Data, Jun 2015  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 30/06/2015  First Issue

This paper contends that three threshold challenges must be adequately addressed before Big Data sources can be used to produce official statistics: the business case, the validity of statistical inference, and data ownership and access.
Using statistical modelling, the paper outlines necessary conditions for addressing the biases inherent in Big Data sources when estimating parameters of a finite population or super-population model.
To illustrate the proposed statistical framework, the paper describes a method, based on State Space modelling, for utilising satellite imagery data to predict crop types and crop yields. The paper also outlines methods to address related statistical computing issues, and proposes strategies for extending the model to provide a better fit to the observed data.
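As a rough illustration of what State Space modelling involves, the sketch below filters a simple local level model with a Kalman filter. It is not the paper's method. The model form, the function name `kalman_filter_local_level`, the variance parameters, and the treatment of missing observations (for example, a satellite image obscured by cloud) are all assumptions chosen for illustration.

```python
def kalman_filter_local_level(observations, obs_var, state_var,
                              init_mean=0.0, init_var=1e6):
    """Kalman filter for a hypothetical local level model.

    Observation equation: y_t = x_t + e_t,      e_t ~ N(0, obs_var)
    State equation:       x_t = x_{t-1} + w_t,  w_t ~ N(0, state_var)

    `observations` may contain None for missing values.
    Returns a list of (filtered_mean, filtered_variance) pairs.
    """
    mean, var = init_mean, init_var
    filtered = []
    for y in observations:
        # Predict step: the state follows a random walk, so the mean
        # carries over and the uncertainty grows by state_var.
        pred_mean = mean
        pred_var = var + state_var
        if y is None:
            # No observation available: keep the prediction unchanged.
            mean, var = pred_mean, pred_var
        else:
            # Update step: weight the new observation by the Kalman gain.
            gain = pred_var / (pred_var + obs_var)
            mean = pred_mean + gain * (y - pred_mean)
            var = (1.0 - gain) * pred_var
        filtered.append((mean, var))
    return filtered


# Illustrative values only; they are not taken from the paper.
result = kalman_filter_local_level([2.0, None, 2.0], obs_var=1.0, state_var=0.5)
```

In this sketch the filtered variance rises when an observation is missing and falls again once a new observation arrives, which is the basic mechanism that lets a state space model carry a prediction across gaps in the data.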