The Semantic Web and Official Statistics
On 22 October 2013, Dr Siu-Ming Tam, Chief Methodologist and First Assistant Statistician at the ABS, was invited to deliver a keynote address to the First International Workshop on the Semantic Web and Official Statistics.
This was the first time that Semantic Web specialists and official statisticians had gathered to discuss how work on the Semantic Web may advance official statistics and vice versa.
The Semantic Web is also broadly described as Web 3.0. However, asked “What is the difference between Web 2.0 and Web 3.0?”, most official statisticians would find it difficult to answer, and, not surprisingly, a tongue-in-cheek answer would be “1.0”.
Dr Tam believes that the Semantic Web, Big Data, and the active international collaboration to reform and re-engineer statistical production processes are three key developments that will significantly change the landscape of official statistics in the next five years.
Putting it simply, the Semantic Web is a certain way to organise, describe and annotate rich content (statistical data, text, imagery, etc) – supported by a framework of international standards – that will facilitate the discovery, exchange, retrieval, processing and analysis of information from many disparate sources.
The Semantic Web aspires to support a global web of data for consumption by people, machines or people assisted by machines, just like the web of documents supported by Web 1.0 and enriched by Web 2.0.
An example may help to illustrate this. In another talk presented in Abu Dhabi earlier this year, Dr Tam posed the following question to a statistical audience: “How many Web 3.0 companies are there in Abu Dhabi?” Googling “Abu Dhabi Web 3.0 companies”, Dr Tam said, returned 65 million hits, yet the Abu Dhabi Statistics Centre reports only about 110,000 companies registered in Abu Dhabi.
So what is not working? The answer lies in the fact that the web-based descriptions of the activities of the companies do not allow for this type of research to take place accurately.
The solution, offered by Semantic Web technologies, lies in two key enhancements. Firstly, structuring the data in “triples” (subject, predicate, object) and relating objects to standards through Uniform Resource Identifiers creates a network of web-accessible linked information. Secondly, the meaning of this data – its “semantics” – is described along with its structure and format in a machine-interpretable way.
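To make the idea of triples concrete, the following is a minimal sketch in plain Python rather than a real Semantic Web toolkit. It assumes a tiny, made-up data set: every URI under example.org, the property names locatedIn and activity, and the three companies are hypothetical placeholders, not data from the ABS or the Abu Dhabi Statistics Centre. It shows how a question such as the Abu Dhabi one could, in principle, be answered by matching identifiers instead of searching free text.

```python
# Each fact is a (subject, predicate, object) triple whose parts are URIs.
# All URIs below are hypothetical placeholders under example.org.
EX = "http://example.org/"

TRIPLES = [
    (EX + "company/A", EX + "prop/locatedIn", EX + "place/AbuDhabi"),
    (EX + "company/A", EX + "prop/activity", EX + "activity/SemanticWebServices"),
    (EX + "company/B", EX + "prop/locatedIn", EX + "place/AbuDhabi"),
    (EX + "company/B", EX + "prop/activity", EX + "activity/Retail"),
    (EX + "company/C", EX + "prop/locatedIn", EX + "place/Dubai"),
    (EX + "company/C", EX + "prop/activity", EX + "activity/SemanticWebServices"),
]


def subjects(triples, predicate, obj):
    """Return the set of subjects that have the given predicate and object."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def count_companies(triples, place, activity):
    """Count subjects located in `place` that also report `activity`."""
    in_place = subjects(triples, EX + "prop/locatedIn", place)
    with_activity = subjects(triples, EX + "prop/activity", activity)
    return len(in_place & with_activity)


result = count_companies(
    TRIPLES,
    EX + "place/AbuDhabi",
    EX + "activity/SemanticWebServices",
)
print(result)
```

Because the place and the activity are identified by URIs rather than by keywords, the count depends only on exact matches against shared identifiers, which is the property that a keyword search lacks. In practice such data would be published in a standard format and queried with dedicated tools; the list of tuples here only stands in for that.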
The challenge facing data scientists is to better articulate the value proposition of the Semantic Web and to raise awareness of it; the challenge for official statistics is to harness the opportunities the Semantic Web provides to improve the business of official statistics.