Coding service formats
Request formats and recommended text inputs.
Request formats
The coding service uses JSON format for the following services:
- Real-time (synchronous) public coding service
- Real-time partner coding service
- Real-time small batch coding service
It uses JSONL format for the large batch/bulk (asynchronous) partner coding service.
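The difference between the two formats can be sketched as follows. This is a minimal illustration only: the field names `occupation_title` and `occupation_tasks` are hypothetical placeholders, not the service's documented schema, and the helper names are likewise assumptions.

```python
import json

# Hypothetical field names; the real request schema is not shown in this section.
records = [
    {"occupation_title": "registered nurse", "occupation_tasks": "caring for patients"},
    {"occupation_title": "bricklayer", "occupation_tasks": "laying bricks"},
]

def to_json_body(record):
    """Serialise one record as a single JSON document (real-time style)."""
    return json.dumps(record)

def to_jsonl(many_records):
    """Serialise many records as JSONL: one JSON object per line (bulk style)."""
    return "\n".join(json.dumps(r) for r in many_records) + "\n"

json_body = to_json_body(records[0])
jsonl_payload = to_jsonl(records)
```

Each line of a JSONL payload parses on its own, so a large batch file can be written and read one record at a time.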
The GET Data method will return the following for each of the specified services for occupation coding:
Service | Returns |
---|---|
Real-time (synchronous) public coding service | |
Real-time partner coding service | |
Real-time small batch coding service (up to 300 records) | |
Large batch/bulk (asynchronous) partner coding service | |
Recommended text input for coding
- The occupation coder performs best when given both a job title and a description of tasks as free text inputs, because the ML model was trained on inputs in this form.
- The coder will not perform as well with just one text field entered (i.e. if only the job title or only the task text is entered). If the results are unsatisfactory, entering more information will help the Coding Service make better predictions.
- Input text is limited to 100 characters in total, counted across the combined occupation and task text entries.
- The coding service API will not accept custom data queries or query string parameters.
The coding service has been trained on English inputs only. The service accepts printable ASCII characters, which include all English letters and connectives, but exclude certain accents, foreign currency symbols and control characters like file endings or backspace. Including an unsupported character may result in an 'Invalid request body' error.
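The length and character constraints above could be checked before a request is sent. The following is a minimal sketch under stated assumptions: it treats "printable ASCII" as code points 32 to 126, and it counts the combined length as the simple sum of the two strings. The service's exact rules may differ, and `check_input` is a hypothetical helper, not part of the API.

```python
MAX_COMBINED_LENGTH = 100  # combined title + task limit described above

def is_printable_ascii(ch):
    """Assumption: printable ASCII means code points 32 (space) to 126 (tilde)."""
    return 32 <= ord(ch) <= 126

def check_input(title, tasks=""):
    """Return a list of problems found; an empty list means no problem was detected."""
    problems = []
    if len(title) + len(tasks) > MAX_COMBINED_LENGTH:
        problems.append("combined text exceeds 100 characters")
    bad = sorted({ch for ch in title + tasks if not is_printable_ascii(ch)})
    if bad:
        problems.append("unsupported characters: " + repr(bad))
    return problems
```

For example, `check_input("café manager", "serving customers")` reports the accented character, which could then be replaced or removed before the text is submitted.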
The coder assumes that the input text relates to and describes a person’s job. It is able to recognise a very broad vocabulary and will attempt to code all input text, regardless of context, so users need to ensure a contextual fit between their input data and the coding task being undertaken.
For example, if a person describes their job as a ‘prisoner’, the Coding Service assumes that the occupation to be coded involves working with prisoners in some way, and codes to ‘Correctional Officer’. Likewise, the input text ‘baby’ codes to ‘Nanny’.
Multiple occupation entries
The service is designed to provide a single occupation code and title for a single occupation record. If multiple occupations per record are entered in the occupation title text input, the coder will attempt to code the provided text to a single best-fit occupation code at the most detailed level.
The output will reflect the training data and will depend on how many times the two jobs were present together in the training data. The Coding Service will default to whatever is most commonly found in the training data.
- If multiple occupations are present, you will need to format each job as a separate request.
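Formatting each job as a separate request could look like the sketch below. The field names `occupation_title` and `occupation_tasks` and the helper `build_requests` are hypothetical, used only to illustrate the one-job-per-request rule.

```python
import json

def build_requests(jobs):
    """Turn a list of (title, tasks) pairs into one request body per job.

    Field names are illustrative assumptions, not the documented schema.
    """
    return [
        json.dumps({"occupation_title": title, "occupation_tasks": tasks})
        for title, tasks in jobs
    ]

# A respondent who holds two jobs yields two separate request bodies.
jobs = [
    ("bartender", "serving drinks"),
    ("taxi driver", "driving passengers"),
]
request_bodies = build_requests(jobs)
```

Each body in `request_bodies` would then be submitted on its own, so that each job receives its own occupation code.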