1331.0 - Statistics - A Powerful Edge!, 1996  
ARCHIVED ISSUE Released at 11:30 AM (CANBERRA TIME) 31/07/1998   

DATA PROCESSING

Data are raw facts. When organised and presented properly, they become information. Turning data into information involves several steps. These steps are known as data processing. This section looks at data processing and the use of computers to do it easily and quickly.

The diagram below shows a simplified view of the procedure for turning data into information. Data, in a range of forms and from various sources, may be entered into a computer where it can be manipulated to produce useful information (output).


Image: Procedure for turning data into information



Data processing includes the following steps:

  • data coding,
  • data input,
  • data editing, and
  • data manipulation.


DATA CODING

Before raw data are entered into a computer, they may need to be coded. Coding involves labelling the responses in a unique and abbreviated way (often by simple numerical codes). Raw data are coded because coding makes data entry and data manipulation easier. Coding can be done by interviewers in the field or by people in an office.

A closed question allows only a fixed number of predetermined responses, and these responses can have codes printed on the form. An open question allows any response, which makes subsequent coding more difficult. One approach is to select a sample of responses and design a code structure that captures and categorises most of them.
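As a minimal sketch of coding a closed question, the Python snippet below replaces each written response with a short numerical code. The code values, the name MARITAL_STATUS_CODES and the "uncodable" code 9 are assumptions made for illustration; they are not an ABS code list.

```python
# Hypothetical code list for a closed question (illustrative values only).
MARITAL_STATUS_CODES = {
    "never married": 1,
    "married": 2,
    "separated": 3,
    "divorced": 4,
    "widowed": 5,
}

UNCODABLE = 9  # assumed code for responses that match no category


def code_response(response: str) -> int:
    """Return the numeric code for a response, ignoring case and outer spaces."""
    return MARITAL_STATUS_CODES.get(response.strip().lower(), UNCODABLE)


responses = ["Married", " divorced ", "Never married", "de facto"]
coded = [code_response(r) for r in responses]
print(coded)
```

A response that fits none of the categories, as an open question might produce, falls through to the uncodable code. In practice it would be reviewed and the code structure extended if such responses were common.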


DATA INPUT

The keyboard of a computer is one of the more commonly known input, or data entry, devices in current use. In the past, punched cards or paper tapes were used.

Other input devices in current use include light pens, trackballs, scanners, mice, optical mark readers and bar code readers. Some common everyday examples of data input devices are:
  • bar code readers used in shops, supermarkets or libraries, and
  • scanners used in desktop publishing.

The ABS gathers data from censuses and surveys. The method of data entry varies, depending on the type or method of collection.
  • The 1996 Census of Population and Housing used optical mark readers to read the forms.
  • Data from surveys using Computer-Aided Telephone Interviews (CATI) are entered by keyboard while survey staff interview the respondent by telephone. The ABS’s Retail Trade Survey uses CATI.
  • Computer-assisted personal interviewing (CAPI) is being tested for household surveys. Interviewing staff enter data directly into notebook computers during the interview at a respondent’s house. Laptop computers are considered too bulky for this type of work, but hand-held computers that use light pens for data entry are also being tested.


DATA EDITING

Before being presented as information, data should be put through a process called editing. This process checks for accuracy and eliminates problems that can produce disorganised or incorrect information. Data editing may be performed by clerical staff, computer software, or a combination of both, depending on the medium in which the data are submitted.

Some editing processes are:

Validity check: ensures that data fall within set limits. For example, it confirms that alphabetic characters do not appear in a field that should contain only numerical characters, or that the month of the year is not greater than 12.

Verification check: checks the accuracy of entered data by entering it again and comparing the two results.

Consistency check: checks the logical consistency of answers. For example, an answer of "never married" should not be followed by an answer of "divorced".
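The three checks above can be sketched in Python as follows. This is an illustration only: the field names (age, month, ever_married, marital_status) and the sample records are hypothetical and are not taken from any ABS form or editing system.

```python
def validity_errors(record: dict) -> list[str]:
    """Validity check: values must be of the right kind and within set limits."""
    errors = []
    if not str(record["age"]).isdigit():
        errors.append("age must contain only numerical characters")
    if not 1 <= record["month"] <= 12:
        errors.append("month must be between 1 and 12")
    return errors


def verification_errors(first_entry: dict, second_entry: dict) -> list[str]:
    """Verification check: list the fields where two keyings of one form differ."""
    return [
        field
        for field in first_entry
        if first_entry[field] != second_entry.get(field)
    ]


def consistency_errors(record: dict) -> list[str]:
    """Consistency check: answers must not contradict each other."""
    errors = []
    if record["ever_married"] == "no" and record["marital_status"] == "divorced":
        errors.append("never married but marital status is divorced")
    return errors


# The same (hypothetical) form keyed twice; the first keying mistypes the age.
first = {"age": "3x", "month": 13, "ever_married": "no", "marital_status": "divorced"}
second = {"age": "34", "month": 13, "ever_married": "no", "marital_status": "divorced"}

print(validity_errors(first))
print(verification_errors(first, second))
print(consistency_errors(first))
```

Each function returns a list of problems rather than stopping at the first one, so a single pass over a record can report everything that needs follow-up.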

Data editing should detect and minimise errors such as:
  • questions not asked by interviewers,
  • answers not recorded, and
  • inaccurate responses.

Inaccuracy in responses may result from carelessness or a deliberate effort to give misleading answers. Answers that require mental calculation may also contain errors, for example when converting days into hours or annual income into weekly income.
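One way an edit might catch a conversion error is sketched below in Python. It assumes a hypothetical form that collects both an annual and a weekly income figure, treats a year as 52 weeks, and uses an arbitrary 10 per cent tolerance. All names and values are illustrative.

```python
WEEKS_PER_YEAR = 52  # approximation; a year is slightly longer than 52 weeks


def weekly_from_annual(annual_income: float) -> float:
    """Convert an annual income into the equivalent weekly income."""
    return annual_income / WEEKS_PER_YEAR


def conversion_looks_wrong(
    annual_income: float, reported_weekly: float, tolerance: float = 0.10
) -> bool:
    """Flag a weekly figure that differs from annual / 52 by more than the tolerance."""
    expected = weekly_from_annual(annual_income)
    return abs(reported_weekly - expected) > tolerance * expected


# A respondent earning 26,000 a year who divides by 12 instead of 52.
print(weekly_from_annual(26_000))
print(conversion_looks_wrong(26_000, 500))
print(conversion_looks_wrong(26_000, 2_167))
```

The tolerance allows for rounding by the respondent. A flagged record would be queried with the respondent, not corrected automatically.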


EXAMPLE

1. This example of data to be edited shows an inaccurate response. By carefully reading this section of the ABS Labour Force Survey form you should be able to detect a very inaccurate response!
Image: Example of data to be edited.



Question 34A shows that Person 1 said they worked on every day of the previous week. Question 34B shows there was no time off, and Question 34C says some overtime was also worked. However, Question 34D says that all of this amounted to less than one hour of time worked!

The answers to individual questions look acceptable. It is only by comparing them with each other that you can see whether one or more are wrong.

This cross-checking is only one type of edit. It could be performed either by clerical staff or editing software. It indicates that further action should be taken. In the previous example, the interviewer will get in touch with the household and re-check how many days and hours were worked by Person 1.
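As a rough sketch of how editing software might perform this kind of cross-check, the Python function below compares the number of days worked with the number of hours worked. The parameter names and the two rules are assumptions for illustration; they do not reproduce the actual Labour Force Survey edits.

```python
def hours_edit(days_worked: int, hours_worked: float) -> list[str]:
    """Cross-check days worked against hours worked and return any queries raised."""
    messages = []
    if days_worked > 0 and hours_worked < 1:
        messages.append("days worked reported but less than one hour worked")
    if hours_worked > days_worked * 24:
        messages.append("more hours reported than the days worked contain")
    return messages


# Person 1 in the example: worked on all 7 days, yet less than one hour in total.
print(hours_edit(days_worked=7, hours_worked=0))

# A plausible response raises no query.
print(hours_edit(days_worked=5, hours_worked=40))
```

A non-empty result does not say which answer is wrong. It only signals that the record needs follow-up, such as the interviewer contacting the household again.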


DATA MANIPULATION

After editing, data may be manipulated by computer to produce the desired output. The software used to manipulate data will depend on the form of output required.

Software applications such as word processing, desktop publishing, graphics (including graphing and drawing), databases and spreadsheets are commonly used. Following are some ways that software can manipulate data:
  • Spreadsheets are used to create formulas that automatically add columns or rows of figures, calculate means and perform statistical analyses. They can be used to create financial worksheets such as budgets or expenditure forecasts, balance accounts and analyse costs.
  • Databases are electronic filing cabinets that store data systematically for easy access, so that summaries, stocktakes or reports can be produced. A database program should be able to store, retrieve, sort, and analyse data.
  • Charts can be created from a table of numbers and displayed in a number of ways, to show the significance of a selection of data. Bar, line, pie and other types of charts can be generated and manipulated to advantage.
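The spreadsheet and database operations described above can be imitated in a few lines of Python. This is a minimal sketch with made-up figures (not ABS data), using only the standard library.

```python
from statistics import mean

# Illustrative figures only.
rows = [
    {"region": "North", "sales": 120},
    {"region": "South", "sales": 95},
    {"region": "East", "sales": 140},
    {"region": "West", "sales": 105},
]

sales = [row["sales"] for row in rows]

total = sum(sales)       # like a spreadsheet formula that adds a column
average = mean(sales)    # like a spreadsheet formula that calculates a mean

# Database-style sort: regions ordered from highest to lowest sales.
ranked = [row["region"] for row in sorted(rows, key=lambda r: r["sales"], reverse=True)]

# Database-style retrieval: only the regions with above-average sales.
above_average = [row["region"] for row in rows if row["sales"] > average]

print(total, average)
print(ranked)
print(above_average)
```

The same table of numbers could then be passed to a charting tool to produce a bar, line or pie chart.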

Processing data provides useful information called output. Computer output may be used in a variety of ways. It may be saved in storage for later retrieval and use. It may be laser printed on paper as tables or charts, put on a transparent slide for overhead projector use, saved on floppy disk for portable use in other computers, or sent as an electronic file via the internet to others.

Types of output are limited only by the available output devices, but their form is usually governed by the need to communicate information to someone. For whom is output being produced? How will they best understand it? The answers to these questions help determine one’s output type.


EXERCISES
1. Place the following in correct logical order:
PROCESSING — COLLECTION — INFORMATION — DATA
2. List the steps involved in data processing and write a brief description of each.
3. Investigate the different types of data input devices that are present in your school. Do any require special skills to be able to use them?
4. If data editing did not take place, what effect might this have on information produced from the data?
5. The following responses to sample survey questionnaires contain inaccuracies or errors. Can you list what they are?

a) Lambing during year ended 31 March 1995

                                         Number
   Lambs marked                          1,054
   Ewes mated to produce above lambs     yes

b) What is your present marital status?

   (X) Never married
   ( ) Married
   ( ) Separated
   (X) Divorced
   ( ) Widowed

c) How did you get to work on 5 April? (If you used more than one method, mark all relevant boxes)

   ( ) Train
   ( ) Bus
   ( ) Ferry or tram
   ( ) Car
   (X) Motorbike
   ( ) Bicycle
   ( ) Walked
   (X) Worked at home
   (X) Did not go to work

