AI today rests on three pillars: ML algorithms, the hardware they run on, and the data used to train and test the models. While the first two pose no real obstacle, obtaining high-quality, up-to-date data at scale remains a challenge. One way to resolve this is to adopt a data-centric approach to data labeling that entails building human-in-the-loop pipelines, i.e., hybrid pipelines that combine machine and human effort.
Human-in-the-loop (HITL) refers to a computational process that combines the efforts of human labelers with automated code and is normally managed by a human architect, known as a crowd solutions architect (CSA).
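As a minimal sketch of the idea (all function names here are hypothetical, not any platform's API), a HITL step might let a model label what it is confident about and route everything else to human annotators:

```python
# Sketch of a human-in-the-loop labeling step: the model handles
# high-confidence items, and the rest go to human annotators.

def hitl_label(items, model_predict, request_human_label, threshold=0.9):
    """Label items automatically when confident, otherwise ask the crowd."""
    labels = {}
    for item in items:
        label, confidence = model_predict(item)
        if confidence >= threshold:
            labels[item] = label                      # machine effort
        else:
            labels[item] = request_human_label(item)  # human effort
    return labels

# Toy usage: a "model" that is only confident about short strings,
# and a stand-in for a call to a crowdsourcing platform.
if __name__ == "__main__":
    predict = lambda s: ("short", 0.95) if len(s) < 5 else ("long", 0.5)
    ask_crowd = lambda s: "long"
    print(hitl_label(["cat", "elephant"], predict, ask_crowd))
```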
Crowdsourcing is an online activity in which individuals complete tasks assigned to them on a platform; it is growing in popularity because of its cost- and time-effectiveness. Let’s look at two categories of case studies in which crowdsourcing successfully aided AI production.
Many industries today rely on recommender systems to support their business. Recommender systems are built around learning-to-rank algorithms, which are used in search engines (ranking documents), e-commerce sites (shopping items), and social networks and sharing apps (images and videos). The main hurdle when testing and improving learning-to-rank systems is obtaining enough relevance data, which by definition consists of the subjective opinions of individual users.
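To ground the idea, many learning-to-rank models optimize a pairwise objective. As a hedged sketch (a RankNet-style logistic loss, not any specific product's implementation), the loss on a pair where the first item should rank higher looks like this:

```python
import math

# Pairwise (RankNet-style) ranking objective: given model scores for two
# items where the first should rank above the second, the loss is the
# logistic loss on the score gap. Correctly ordered pairs with a large
# gap yield a small loss; mis-ordered pairs yield a large one.

def pairwise_rank_loss(score_preferred, score_other):
    """Logistic loss encouraging score_preferred > score_other."""
    return math.log(1.0 + math.exp(-(score_preferred - score_other)))

print(round(pairwise_rank_loss(2.0, 0.0), 3))  # ≈ 0.127 (well ordered)
print(round(pairwise_rank_loss(0.0, 2.0), 3))  # ≈ 2.127 (mis-ordered)
```

Judgment data collected from the crowd supplies exactly the labeled pairs this kind of objective needs.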
We have found that the following pipeline enables effective testing and validation, shortening testing periods from many weeks to just a few hours:
To follow it, you need to:
With pairwise comparison used in recommender system testing, annotators indicate which object in each pair better matches the user’s preference. While the task looks simple, it allows us to address a number of complex problems, including information retrieval evaluation. Since we’re annotating object pairs, we need to aggregate these comparisons into ranked lists for further use.
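One standard way to turn pairwise comparisons into a ranked list is the Bradley–Terry model. The sketch below fits it with the classic MM iteration; the input format, a list of (winner, loser) pairs, is an assumption for illustration:

```python
from collections import defaultdict

# Aggregate pairwise comparisons into a ranked list with the
# Bradley-Terry model, fit by minorization-maximization (MM):
#   p_i <- W_i / sum_j n_ij / (p_i + p_j)
# where W_i is item i's total wins and n_ij the number of comparisons
# between items i and j.

def bradley_terry_rank(comparisons, iters=100):
    wins = defaultdict(int)   # W_i: total wins of item i
    games = defaultdict(int)  # n_ij: comparisons between i and j
    items = set()
    for winner, loser in comparisons:
        wins[winner] += 1
        games[(winner, loser)] += 1
        games[(loser, winner)] += 1
        items.update((winner, loser))
    strength = {i: 1.0 for i in items}
    for _ in range(iters):
        new = {}
        for i in items:
            denom = sum(games[(i, j)] / (strength[i] + strength[j])
                        for j in items if j != i)
            new[i] = wins[i] / denom if denom else strength[i]
        total = sum(new.values())
        strength = {i: s / total for i, s in new.items()}  # normalize
    return sorted(items, key=strength.get, reverse=True)

# Toy usage: A beats B twice, B beats C once, A beats C once.
print(bradley_terry_rank([("A", "B"), ("A", "B"), ("B", "C"), ("A", "C")]))
# -> ['A', 'B', 'C']
```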
To do that and obtain an improved recommender system based on up-to-date human judgements, we need to:
Another common task utilized by many companies is spatial crowdsourcing, also known as field tasks. Spatial crowdsourcing is used to gather information about brick-and-mortar stores (i.e., physical retail) for digital maps and directory services. Obtaining up-to-date information about such establishments normally poses a huge challenge, because many businesses open, close, or relocate on a regular basis.
Spatial crowdsourcing is a powerful HITL pipeline element that can successfully overcome this issue. Unlike traditional survey-like crowdsourcing tasks, spatial tasks are shown on a map, so performers can sign up to visit any number of locations and gather the latest information a task requires about a business (and, for example, take a photo).
Just like with pairwise comparison, this sounds deceptively simple, but it can actually help us resolve a number of extremely complex problems. We suggest using the following pipeline:
If the business in question can be located, the information is transcribed: the name of the business, telephone number, website, and working hours. ML algorithms are then applied to retrieve company codes and other structured information. Finally, we ask the crowd whether the photo can be used as part of a map or directory service. If the business cannot be located, we request a different, more suitable photo.
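The branch described above can be sketched in a few lines; the field and function names here are hypothetical stand-ins, not a real platform API:

```python
# Route a spatial-crowdsourcing photo submission through the pipeline:
# transcribe and crowd-check located businesses, otherwise ask for a
# different photo.

def process_field_submission(photo, business_found, transcribe,
                             crowd_approves, request_new_photo):
    if not business_found(photo):
        return request_new_photo(photo)  # choose a more suitable photo
    record = transcribe(photo)           # name, phone, website, hours
    record["usable_in_maps"] = crowd_approves(photo)  # crowd quality check
    return record

# Toy usage with stand-in callbacks for each pipeline stage.
if __name__ == "__main__":
    result = process_field_submission(
        "storefront.jpg",
        business_found=lambda p: True,
        transcribe=lambda p: {"name": "Cafe"},
        crowd_approves=lambda p: True,
        request_new_photo=lambda p: "retake",
    )
    print(result)
```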
This screenshot shows a typical spatial crowdsourcing task, in which one person takes the picture and another transcribes it.
Important points to consider:
With human-in-the-loop data labeling, humans and machines complement each other, which results in simple solutions for a variety of difficult problems at scale.