In an earlier post, we discussed various approaches to implementing autocomplete functionality. We concluded that the Completion Suggester covers most of the cases required to implement a fully functional and fast autocomplete. This post explains in detail what the Completion Suggester is and how to use it in practice.
The Completion Suggester is a type of suggester in Elasticsearch that is used to implement autocomplete functionality. It uses an in-memory data structure called a Finite State Transducer (FST). Elasticsearch stores FSTs on a per-segment basis, which means suggestions scale horizontally as more nodes are added.
To use the Completion Suggester, a field is mapped with a special mapping type called completion. Let’s take the example of Marvel movie data and define an index named movies with type marvels. The complete movie list can be accessed from here.
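A minimal sketch of what such a mapping body could look like, built as a Python dictionary that would be sent when creating the movies index. The layout is an assumption: name is treated as an object field with a completion sub-field (so that it is addressed as name.completion), year is assumed to be a keyword field, and the analyzers shown are just the defaults written out. The type level (marvels) is only accepted by older Elasticsearch versions; passing no type produces the typeless form.

```python
import json


def completion_mapping(doc_type=None):
    """Build an index-creation body with a completion field at name.completion."""
    properties = {
        "name": {
            # Assumption: `name` is an object field, so its sub-field
            # is addressed as name.completion in queries.
            "properties": {
                "completion": {
                    "type": "completion",
                    "analyzer": "simple",
                    "search_analyzer": "simple",
                }
            }
        },
        # Assumption: the release year is stored as a keyword.
        "year": {"type": "keyword"},
    }
    if doc_type is None:
        # Typeless form used by newer Elasticsearch versions.
        return {"mappings": {"properties": properties}}
    # Typed form, matching the "marvels" type used in this post.
    return {"mappings": {doc_type: {"properties": properties}}}


body = completion_mapping("marvels")
print(json.dumps(body, indent=2))
```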
Here, name.completion is a field of type completion. On this field, we can add various other mapping parameters such as analyzer, search_analyzer, etc.
To index data, a slightly different syntax is used. A suggestion is made of an input and an optional weight parameter. Let’s index a movie into our movies index.
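As a rough illustration, a document body for Iron Man might look like the following. The nesting under name and completion, and the year field stored as a string, are assumptions that follow from addressing the field as name.completion; only input is part of the suggester syntax itself.

```python
import json

# Assumed document shape: the suggestion lives at name.completion.
doc = {
    "name": {
        "completion": {
            # `input` holds the text(s) that prefix matching runs against.
            "input": ["Iron Man"],
        }
    },
    "year": "2008",
}

print(json.dumps(doc, indent=2))
```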
We can also define a weight for each suggestion. This weight helps us control the ranking of documents when querying.
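A small sketch of a weighted suggestion, using Iron Man 2 with a weight of 2. The helper weighted_suggestion is hypothetical and only builds the dictionary; the document shape around it is the same assumed layout with the field at name.completion.

```python
import json


def weighted_suggestion(text, weight):
    """Build a completion value with one input and a weight."""
    # Elasticsearch expects the weight to be a positive integer.
    if not isinstance(weight, int) or weight <= 0:
        raise ValueError("weight must be a positive integer")
    return {"input": [text], "weight": weight}


doc = {
    "name": {"completion": weighted_suggestion("Iron Man 2", 2)},
    "year": "2010",
}

print(json.dumps(doc, indent=2))
```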
We can also index multiple suggestions for a document at the same time.
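One way this could look is a list of input/weight entries for a single document. The second input and both weights below are hypothetical values chosen for illustration, and the surrounding document shape is again an assumption.

```python
import json

# Each entry is its own suggestion with its own weight (values are illustrative).
suggestions = [
    {"input": "Captain America: The First Avenger", "weight": 3},
    {"input": "The First Avenger", "weight": 1},
]

doc = {"name": {"completion": suggestions}, "year": "2011"}

print(json.dumps(doc, indent=2))
```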
To query documents, we need to specify the suggest type as completion. Let’s query for thor in our movies index. The movies index contains all 22 movies from the Marvel Cinematic Section of this page.
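A minimal sketch of the search body for this query, which would be posted to the _search endpoint of the movies index. The suggestion name movie-suggest is a hypothetical label, and the field path name.completion assumes the layout described earlier.

```python
import json


def completion_query(prefix, field="name.completion", name="movie-suggest"):
    """Build a search body containing a single completion suggestion."""
    return {
        "suggest": {
            # `name` is an arbitrary label; the response groups results under it.
            name: {
                "prefix": prefix,
                "completion": {"field": field},
            }
        }
    }


body = completion_query("thor")
print(json.dumps(body, indent=2))
```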
We get the following movies as result
We see that all documents have a _score of 1. This means that all documents in the completion suggester are ranked equally. To boost a particular document, or to alter the ranking, we can use the optional parameter called weight. We have already indexed Iron Man (with no weight) and Iron Man 2 (with a weight of 2). Let’s search for Iron Man in our movies index.
We get the following movies as result
We can clearly see here how weight is used to control the ranking of documents. This is why Iron Man 2 is ranked higher than Iron Man when searching for Iron Man.
We can also specify the size to control the number of documents returned.
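As a sketch, size goes inside the completion object of the suggestion. The suggestion name movie-suggest, the field path, and the value 3 are assumptions for illustration; when size is omitted, Elasticsearch returns up to 5 suggestions.

```python
import json


def sized_completion_query(prefix, size, field="name.completion"):
    """Build a completion suggestion that returns at most `size` options."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return {
        "suggest": {
            "movie-suggest": {
                "prefix": prefix,
                "completion": {"field": field, "size": size},
            }
        }
    }


body = sized_completion_query("iron man", size=3)
print(json.dumps(body, indent=2))
```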
We can also add fuzziness to the completion suggester. This helps us provide suggestions even when there is a typo. Let’s try searching for captain amrica the with a fuzzy query.
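A rough sketch of the fuzzy variant: a fuzzy object is added next to field inside completion. The fuzziness value of 1 is an assumption, chosen because the typo is a single missing letter; the suggestion name and field path are the same hypothetical ones used for the other queries.

```python
import json


def fuzzy_completion_query(prefix, fuzziness=1, field="name.completion"):
    """Build a completion suggestion that tolerates small typos in the prefix."""
    return {
        "suggest": {
            "movie-suggest": {
                "prefix": prefix,
                "completion": {
                    "field": field,
                    # Maximum edit distance allowed between prefix and suggestion.
                    "fuzzy": {"fuzziness": fuzziness},
                },
            }
        }
    }


body = fuzzy_completion_query("captain amrica the")
print(json.dumps(body, indent=2))
```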
We get the following movies as result
Let’s try finding suggestions for movie names that contain america.
We get no results. This is because the completion suggester supports only prefix matching. It starts matching from the start of the string, and there is no movie that has america at the start of its name. To deal with this type of situation, we can tokenize the input text on spaces and index each resulting phrase as an additional input alongside the canonical name. This way Captain America: The First Avenger will be inserted as
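A minimal sketch of that idea, assuming each suffix of the space-separated title becomes its own input. The helper suffix_inputs is hypothetical; it only prepares the list that would go into the input parameter.

```python
def suffix_inputs(title):
    """Return every suffix of the title that starts at a word boundary."""
    words = title.split()
    return [" ".join(words[i:]) for i in range(len(words))]


inputs = suffix_inputs("Captain America: The First Avenger")
for phrase in inputs:
    print(phrase)
```

With these inputs, a prefix such as america now lines up with the start of the second phrase, "America: The First Avenger".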
In queries, we can filter documents by using filter, but filter does not work with the Completion Suggester. To understand this better, let’s run a query which finds all movies with the name iron man released in the year 2008.
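A sketch of such a request, combining a filtered query with a completion suggestion in one search body. The term filter on a year field stored as a string, the suggestion name, and the field path are all assumptions.

```python
import json

search_body = {
    # The filter narrows the documents returned under `hits`.
    "query": {"bool": {"filter": [{"term": {"year": "2008"}}]}},
    # The suggestion is a sibling of `query`, not nested inside it.
    "suggest": {
        "movie-suggest": {
            "prefix": "iron man",
            "completion": {"field": "name.completion"},
        }
    },
}

print(json.dumps(search_body, indent=2))
```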
The response received looks like
In the response, we see that the hits key is present along with suggest. This happens because query and suggest work in parallel at the same level of the request, so we get both keys in the response. The filter applies only to hits, so we cannot use it to filter suggestions.
To deal with this, the Completion Suggester provides the Context Suggester, whose contexts are basically filters for a completion field. Let’s define another mapping for the movies index, this time with year as a context for the name field.
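A minimal sketch of a context-enabled mapping together with one matching document. It assumes a category context named year, the same name.completion layout as before, and that each document supplies its context values explicitly; the marvels type level is only accepted by older Elasticsearch versions.

```python
import json

context_mapping = {
    "mappings": {
        "marvels": {
            "properties": {
                "name": {
                    "properties": {
                        "completion": {
                            "type": "completion",
                            # A category context acts as a filter on suggestions.
                            "contexts": [{"name": "year", "type": "category"}],
                        }
                    }
                }
            }
        }
    }
}

# Assumed document shape: context values are given next to the input.
doc = {
    "name": {
        "completion": {
            "input": ["Iron Man"],
            "contexts": {"year": ["2008"]},
        }
    }
}

print(json.dumps(context_mapping, indent=2))
print(json.dumps(doc, indent=2))
```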
We can index our complete movies data into this index. Let’s find all movies with the name iron man released in the year 2008.
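As a sketch, the context filter is passed in a contexts object inside completion, keyed by the context name. The context name year, the suggestion name, and the field path are assumptions carried over from the mapping.

```python
import json

body = {
    "suggest": {
        "movie-suggest": {
            "prefix": "iron man",
            "completion": {
                "field": "name.completion",
                # Only suggestions indexed with this year context are considered.
                "contexts": {"year": ["2008"]},
            },
        }
    }
}

print(json.dumps(body, indent=2))
```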
We get the following movies as result
We can also boost contexts. Let’s search for movies with the name iron man, released in the years 2008 and 2010, giving a boost of 4 to the year 2008.
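A rough sketch of the boosted query: each context value becomes an object, and a boost is attached only where one is wanted. The helper boosted_contexts is hypothetical, as are the context name year and the suggestion name.

```python
import json


def boosted_contexts(boosts):
    """Turn {value: boost or None} into a list of context clauses."""
    clauses = []
    for value, boost in boosts.items():
        clause = {"context": value}
        if boost is not None:
            clause["boost"] = boost
        clauses.append(clause)
    return clauses


body = {
    "suggest": {
        "movie-suggest": {
            "prefix": "iron man",
            "completion": {
                "field": "name.completion",
                # 2008 gets a boost of 4; 2010 is included without a boost.
                "contexts": {"year": boosted_contexts({"2008": 4, "2010": None})},
            },
        }
    }
}

print(json.dumps(body, indent=2))
```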
We get the following movies as the result