Coding service formats

WoAG Occupation Coding Service User Guide

Request formats and recommended text inputs.

Released
30/06/2025

Request formats

The coding service uses JSON format for the following services:

  • Real-time (synchronous) public coding service
  • Real-time partner coding service
  • Real-time small batch coding service

It uses JSONL format for the large batch/bulk (asynchronous) partner coding service. 
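As an illustration of the difference between the two formats, the sketch below serialises the same records once as a single JSON document and once as JSONL (one JSON object per line). The field names `occupation_title` and `occupation_tasks` are hypothetical placeholders, since this guide does not define the request schema; check the service's API specification for the actual field names.

```python
import json
from typing import Dict, List

# Hypothetical field names, used for illustration only.
records: List[Dict[str, str]] = [
    {"occupation_title": "Registered nurse",
     "occupation_tasks": "Caring for hospital patients"},
    {"occupation_title": "Barista",
     "occupation_tasks": "Making and serving coffee"},
]


def to_json_body(record: Dict[str, str]) -> str:
    """Serialise one record as a single JSON document (real-time style)."""
    return json.dumps(record)


def to_jsonl(rows: List[Dict[str, str]]) -> str:
    """Serialise many records as JSONL: one JSON object per line."""
    return "\n".join(json.dumps(row) for row in rows)


json_body = to_json_body(records[0])
jsonl_body = to_jsonl(records)
```

Each line of the JSONL output can be parsed independently with `json.loads`, which is what makes the format convenient for large asynchronous batches.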

For occupation coding, the GET Data method returns the following for each service:

Service / Returns
Real-time (synchronous) public coding service
  • One or more classification codes and titles for the free text supplied
Real-time partner coding service
  • One or more classification codes and titles for the free text supplied
Real-time small batch coding service (up to 300 records)
  • The best match 1-digit to 6-digit codes and titles (moving up the classification hierarchy from 6-digit to 1-digit level) for the free text supplied
  • If the coder cannot code the free text supplied, it will provide 3 suggestions
Large batch/bulk (asynchronous) partner coding service
  • The best match 1-digit to 6-digit codes and titles (moving up the classification hierarchy from 6-digit to 1-digit level) for the free text supplied
  • If the coder cannot code the free text supplied, it will provide 3 suggestions
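The small batch service accepts up to 300 records per request. A caller holding more records than that, and wanting to stay on the small batch service rather than move to the large batch/bulk service, could split the records client-side. This is a minimal sketch of such a split; the helper name `chunk_records` is an assumption and not part of the service.

```python
from typing import Dict, Iterator, List, Sequence

MAX_SMALL_BATCH = 300  # limit stated in this guide for the small batch service


def chunk_records(records: Sequence[Dict[str, str]],
                  size: int = MAX_SMALL_BATCH) -> Iterator[List[Dict[str, str]]]:
    """Yield consecutive slices of at most `size` records."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(records), size):
        yield list(records[start:start + size])


# Illustrative data only: 650 placeholder records.
sample = [{"id": str(i)} for i in range(650)]
batch_sizes = [len(batch) for batch in chunk_records(sample)]
```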

Recommended text input for coding

  • The occupation coder performs best when given both a job title and tasks as free text inputs, because this is how the machine learning model was trained.
  • The coder will not perform as well with only one text field entered (i.e. only the job title or only the task text). If the results are unsatisfactory, entering more information will help the Coding Service make better predictions.
  • Input text is limited to 100 characters in total, counted across the occupation and task text entries combined.
  • The coding service API will not accept custom data queries or query string parameters. 
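The 100-character limit can be checked before a request is sent. The sketch below assumes that the limit is simply the sum of the lengths of the two text entries, with no separator counted; the guide does not say exactly how the service counts, so treat this as an approximation.

```python
MAX_COMBINED_CHARS = 100  # combined limit stated in this guide


def combined_length(title: str, tasks: str) -> int:
    """Total characters across both text entries (assumed counting rule)."""
    return len(title) + len(tasks)


def fits_limit(title: str, tasks: str) -> bool:
    """True if the two entries together stay within the combined limit."""
    return combined_length(title, tasks) <= MAX_COMBINED_CHARS


ok = fits_limit("Registered nurse", "Caring for hospital patients")
```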

The coding service has been trained on English inputs only. The service accepts printable ASCII characters, which include all English letters and connectives, but exclude certain accents, foreign currency symbols and control characters like file endings or backspace. Including an unsupported character may result in an 'Invalid request body' error.
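One way to avoid an 'Invalid request body' error is to scan the text for characters outside the printable ASCII range before submitting it. The sketch below treats printable ASCII as code points 0x20 (space) to 0x7E (tilde), which is the standard definition; whether the service applies exactly this range is an assumption.

```python
from typing import List


def unsupported_characters(text: str) -> List[str]:
    """Return the characters in `text` that fall outside printable ASCII."""
    return [ch for ch in text if not (0x20 <= ord(ch) <= 0x7E)]


def is_printable_ascii(text: str) -> bool:
    """True if every character is in the printable ASCII range."""
    return not unsupported_characters(text)


clean = unsupported_characters("Cafe manager")
flagged = unsupported_characters("Café manager\n")
```

Note that typographic (curly) quotation marks and accented letters are flagged by this check, so text pasted from word processors may need cleaning first.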

The coder assumes that the input text describes a person’s job. It recognises a very broad vocabulary and will attempt to code all input text, regardless of context, so users need to ensure a contextual fit between their input data and the coding task being undertaken.

For example, if a person describes their job as ‘prisoner’, the Coding Service assumes that the person works with prisoners in some way and codes the text to ‘Correctional Officer’. Likewise, the input text ‘baby’ codes to ‘Nanny’.

Multiple occupation entries

The service is designed to provide a single occupation code and title for a single occupation record. If multiple occupations per record are entered in the occupation title text input, the coder will attempt to code the provided text to a single best-fit occupation code at the most detailed level.

The output will reflect the training data and will depend on how many times the two jobs were present together in the training data. The Coding Service will default to whatever is most commonly found in the training data.

  • If multiple occupations are present, you will need to format each job as a separate request. 
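A record that lists several jobs can be split into one request body per job before submission. The sketch below shows one possible shape for that step; the field names `occupation_title` and `occupation_tasks` and the helper `build_requests` are hypothetical and are not taken from the service's API.

```python
import json
from typing import List, Tuple


def build_requests(jobs: List[Tuple[str, str]]) -> List[str]:
    """Build one JSON request body per (title, tasks) pair."""
    return [
        json.dumps({"occupation_title": title, "occupation_tasks": tasks})
        for title, tasks in jobs
    ]


# One person holding two jobs becomes two separate request bodies.
bodies = build_requests([
    ("Barista", "Making and serving coffee"),
    ("Bookkeeper", "Recording business transactions"),
])
```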