Normalising the long tail of role titles in an online employment marketplace

A key feature of a successful online employment marketplace is the ability to match people with the most relevant job opportunities. Our business uses data about candidates, jobs and hirers to perform this task. One valuable data point in this process is the job title, which we find in semi-structured form in a candidate’s employment history and in a hirer’s job advertisement. For Search, Matching and Insights purposes, it is essential to normalise the various forms in which users enter a given job title on-site to an authoritative form, and to understand how that job title relates to others. Our team has improved our job title normalisation process, first by building NERD (Named Entity Recognition and Discovery), a suite of web services that leverage data represented using Linked Data standards and housed in the PoolParty ontology management software. The team then developed automated processes to harvest and ingest new role titles from our marketplace data. More recently, the team has proved out an approach that scales the normalisation process better: separating the notions of role function and specialism in our ontology, and training a model to classify the different facets of a role. A second model then concatenates the different role title combinations and assigns confidence scores based on historical data. This presentation will outline the details of these innovations, the challenges faced along the way, and the key ways we measure success.
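As a rough illustration of the two-step idea described above (classify the facets of a title, then recombine them and score the result against historical data), here is a minimal Python sketch. It is not the team's implementation: the vocabularies, the function names `classify_facets` and `rank_combinations`, the token lookup standing in for a trained classifier, and the frequency-based confidence score are all assumptions made for the example.

```python
from collections import Counter
from itertools import product

# Hypothetical facet vocabularies; a real system would draw these from the ontology.
FUNCTIONS = {"engineer", "nurse", "manager"}
SPECIALISMS = {"software", "paediatric", "sales"}


def classify_facets(raw_title):
    """Stand-in for the first model: tag each token as a role function or a specialism."""
    tokens = raw_title.lower().split()
    return {
        "function": [t for t in tokens if t in FUNCTIONS],
        "specialism": [t for t in tokens if t in SPECIALISMS],
    }


def rank_combinations(facets, history):
    """Stand-in for the second model: join each specialism with each function
    and score the result by its share of historical occurrences."""
    total = sum(history.values())
    candidates = []
    for specialism, function in product(facets["specialism"], facets["function"]):
        title = f"{specialism} {function}"
        confidence = history[title] / total if total else 0.0
        candidates.append((title, confidence))
    # Highest-confidence combination first.
    return sorted(candidates, key=lambda c: c[1], reverse=True)


# Hypothetical historical counts of normalised titles.
history = Counter({"software engineer": 6, "sales manager": 3, "sales engineer": 1})

facets = classify_facets("Senior Software / Sales Engineer")
ranked = rank_combinations(facets, history)
print(ranked)
```

Keeping functions and specialisms as separate facets means new combinations can be proposed without each full title having to exist in the ontology beforehand, which is the scaling benefit the abstract points to.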

Speakers: