As an e-commerce professional, you know the importance of providing a five-star search experience on your site or in your app.
In the fast-paced world of digital marketing, the user experience, starting when someone lands on your website and ending with them leaving as a satisfied customer, is nothing short of everything.
But do you know anything about semantic textual similarity (or just semantic similarity for short) and how it helps create that first-rate information-retrieval experience for your shoppers?
It comes down to this: when someone comes searching for a product or content, they fully expect to be given relevant, personalized, fabulous search results.
That's where semantic textual similarity (STS) comes in. It compares two pieces of text by analyzing their underlying meaning and context.
Armed with this "understanding" of context and depth, a search engine can excel at pegging someone's intent.
And then, like a thoughtful butler, it can suggest search results that are the most likely to resonate.
What Is Semantic Textual Similarity (STS)?
So what, exactly, is this complicated-sounding similarity-task technology?
Semantic textual similarity is a key metric used to assess likeness in meaning between terms or documents. Beyond simply looking at words, it assigns numerical scores that measure the strength of semantic relationships.
In other words, semantic similarity is the ability of a computer system to understand the meaning of a piece of text and compare it to another. For instance, this could apply to sentence similarity.
Two sentences that convey the same meaning could be phrased slightly (or significantly) differently, and the STS technology would be able to identify the similarity in their meanings.
This process is rooted in natural language processing (NLP), a discipline that draws on both linguistics and computer science, and it utilizes approaches such as word embedding. Semantic analysis is a sub-field of computational linguistics that looks at the meanings of words and how they relate.
Artificial-intelligence-aided semantic analysis technology examines vocabulary, grammar, structure, and context.
Just as identical twins are pretty different from fraternal ones, semantic similarity is different from semantic relatedness.
As Wikipedia notes, semantic relatedness "includes any relation between two terms, while semantic similarity only includes 'is a' relations... 'car' is similar to 'bus', but is also related to 'road' and 'driving'... semantic similarity, semantic distance, and semantic relatedness all mean, 'How much does term A have to do with term B?'
The answer to this question is usually a number between -1 and 1, or between 0 and 1, where 1 signifies extremely high similarity."
Where is semantic textual similarity currently being utilized? Natural language understanding (NLU), sentiment analysis, and machine translation (automatically converting content to another language) are a few domains.
Determining Semantic Similarity
At Algolia, we use neural network-based technology to facilitate comprehension of search intent. We utilize vector search and machine learning to determine semantic similarity as part of providing the best search results.
With vectors, computers make sense of terms by placing them as points in n-dimensional space. Each term is located by its coordinates, such as (x, y, z) in three dimensions, and the similarity of two terms can then be assessed using the distance and angle between their vectors (our post on cosine similarity has details).
Machine-learning models learn to place words with similar meanings near each other in vector space, so close neighbors could be synonyms. When two pieces of content are embedded as vectors, deep learning helps determine how similar they are.
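To make the distance-and-angle idea concrete, here is a minimal sketch in plain Python. The three-dimensional vectors and the product names are invented for illustration; real embeddings come from a trained model and typically have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (ranges from -1 to 1)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def euclidean_distance(a, b):
    """Straight-line distance between two points (0 means identical)."""
    return math.dist(a, b)  # available in Python 3.8+

# Hypothetical toy embeddings, made up for this example.
embeddings = {
    "sneakers": [0.9, 0.1, 0.2],
    "trainers": [0.8, 0.2, 0.3],
    "blender": [0.1, 0.9, 0.4],
}

sim_close = cosine_similarity(embeddings["sneakers"], embeddings["trainers"])
sim_far = cosine_similarity(embeddings["sneakers"], embeddings["blender"])
dist_close = euclidean_distance(embeddings["sneakers"], embeddings["trainers"])
dist_far = euclidean_distance(embeddings["sneakers"], embeddings["blender"])

print(f"sneakers vs. trainers: cosine={sim_close:.3f}, distance={dist_close:.3f}")
print(f"sneakers vs. blender:  cosine={sim_far:.3f}, distance={dist_far:.3f}")
```

With these toy numbers, the near-synonyms point in almost the same direction (high cosine, small distance), while the unrelated pair does not.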
We also use a tie-breaking algorithm that applies various criteria to compare matching items.
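As a rough illustration of tie-breaking in general, the sketch below sorts hits by a sequence of criteria, where each later criterion only matters when all earlier ones are tied. The criteria (`typos`, `exact_words`, `popularity`), their order, and the sample hits are all hypothetical; this post does not specify which criteria are actually used.

```python
from typing import NamedTuple

class Hit(NamedTuple):
    name: str
    typos: int        # hypothetical criterion: fewer is better
    exact_words: int  # hypothetical criterion: more is better
    popularity: int   # hypothetical criterion: more is better

def tie_break_key(hit):
    # Tuples compare left to right, so a later criterion is consulted
    # only when every earlier one is equal. Negation flips "more is better"
    # into ascending sort order.
    return (hit.typos, -hit.exact_words, -hit.popularity)

hits = [
    Hit("Trail running shoe", typos=0, exact_words=2, popularity=40),
    Hit("Road running shoe", typos=0, exact_words=2, popularity=95),
    Hit("Runing shoe laces", typos=1, exact_words=1, popularity=99),
    Hit("Running shoe", typos=0, exact_words=3, popularity=10),
]

ranked = sorted(hits, key=tie_break_key)
for position, hit in enumerate(ranked, start=1):
    print(position, hit.name)
```

Here the two running-shoe hits tie on the first two criteria, so popularity decides their order, while the hit with a typo lands last despite being the most popular.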
Here are the basic steps in our process:
- Query understanding: NLP techniques are used to prepare and structure the search query so the search engine can analyze it.
- Retrieval: Next in the AI search process comes neural hashing. The search engine retrieves the most relevant results and ranks them from most to least relevant. We measure retrieval quality using precision and recall. Precision is the percentage of retrieved documents that are relevant. Recall is the percentage of all relevant documents that are retrieved. Both metrics help determine whether the search results are any good.
- Semantic similarity measurement: Based on the extracted embeddings, a semantic similarity score is calculated, representing how closely the two pieces of text are related.
- Re-ranking: Based on clicks and conversions, plus rules and personalization as they relate to the particular shopper, a dynamic re-ranking process pushes the best results to the top of the list.
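Two of the steps above are easy to sketch in a few lines of Python: measuring precision and recall, and re-ranking with a behavioral signal. The document IDs, scores, and click counts are invented, and the `boost` formula is an assumption made for illustration only, not a description of Algolia's dynamic re-ranking.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query."""
    retrieved_set, relevant_set = set(retrieved), set(relevant)
    hits = len(retrieved_set & relevant_set)
    precision = hits / len(retrieved_set) if retrieved_set else 0.0
    recall = hits / len(relevant_set) if relevant_set else 0.0
    return precision, recall

def rerank(scored_hits, click_counts, boost=0.1):
    """Re-sort (doc_id, similarity) pairs after adding a click-based bonus.

    The bonus is the document's share of the maximum click count, scaled
    by `boost`. This scoring rule is a hypothetical example.
    """
    max_clicks = max(click_counts.values(), default=0)

    def boosted(item):
        doc_id, score = item
        share = click_counts.get(doc_id, 0) / max_clicks if max_clicks else 0.0
        return score + boost * share

    return sorted(scored_hits, key=boosted, reverse=True)

# Made-up data: four retrieved documents, three truly relevant ones.
retrieved = ["p1", "p2", "p3", "p4"]
relevant = ["p1", "p3", "p7"]
precision, recall = precision_recall(retrieved, relevant)
print(f"precision={precision:.2f}, recall={recall:.2f}")

# Made-up similarity scores and click counts.
scored_hits = [("p1", 0.82), ("p2", 0.80), ("p3", 0.60)]
click_counts = {"p2": 50, "p1": 5}
reranked = rerank(scored_hits, click_counts)
print([doc_id for doc_id, _ in reranked])
```

In this toy run, two of the four retrieved documents are relevant (precision 0.5) and two of the three relevant documents were found (recall of about 0.67). After re-ranking, the heavily clicked `p2` moves ahead of `p1` even though its raw similarity score is slightly lower.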
Adventures in Text Similarity (and Differences)
Whether people are terrible at asking for what they want or know exactly how to phrase a query to zero in on their desired item, STS has their back. Here are examples of content that might be processed in STS tasks:
"Best fitness tracker for weight loss" vs. "fitness tracker for losing weight"
Consider the English phrases "best fitness tracker for weight loss" and "fitness tracker for losing weight." At first glance, they may appear to have virtually identical meanings.
However, with the help of semantic textual similarity, a search engine can delve deeper and identify slight variations in the intent.
Whether the searcher is interested in the most highly rated trackers people use when they want to lose weight or simply wants to know if wearing a fitness tracker is helpful when trying to lose weight, STS is the key to displaying the most relevant results, which ultimately leads to a more satisfied user.
"Makeshift studio" vs. "homemade studio"
If content talks about a "makeshift studio" as opposed to a "homemade studio," a savvy, fine-tuned search engine can determine whether the phrases refer to the same concept.
In this case, "makeshift" could mean something more temporary, like a setup in a living room that must be torn down in order to have people over for dinner, whereas "homemade" could mean a space that's a bit hokey (but still permanent) in a corner of the basement.
"New York Knicks" vs. "Madison Square Garden"
Sometimes a search engine must rack its digital brain to determine whether two completely different phrases refer to the same entity. If someone is searching for information about New York Knicks games, for instance, they might only type the venue name in their query.
But STS can make associations from benchmark phrases like "Home of the New York Knicks" and gather that the searcher might want to know about upcoming games.
From examples like these, it's easy to see why semantic textual similarity is a critical component of modern search-engine skill sets.
Why Is STS a No-Brainer for Search?
As someone steeped in all things online, you probably hear the phrase "game-changing" on a regular basis, and you're aware that some of that is simply overblown marketing speak.
In this case, however, genuine game-changing is occurring, as STS fundamentally improves search-engine and recommendation system accuracy and relevance.
That's key because there's nothing more necessary than knowing your users' needs and ensuring that you're getting them the right search results.
Semantic textual similarity functionality improves search relevance, and with it satisfaction, in every user interaction.
For the record, STS goes way beyond traditional keyword matching. It empowers a search engine to understand the different ways people might express the same idea, which means linguistic ambiguity and variation are no longer roadblocks.
That hasn't been the case with earlier-generation, traditional keyword search techniques.
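To see where pure keyword matching falls short, consider a deliberately naive baseline: Jaccard overlap between the sets of words in two phrases. This is a simplified stand-in for keyword search, not how any production engine works, and the function name is our own. The phrases are the example pairs discussed earlier.

```python
def jaccard_overlap(text_a, text_b):
    """Share of distinct words the two texts have in common (0 to 1)."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty texts are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

pairs = [
    ("best fitness tracker for weight loss", "fitness tracker for losing weight"),
    ("makeshift studio", "homemade studio"),
    ("New York Knicks", "Madison Square Garden"),
]

overlaps = {pair: jaccard_overlap(*pair) for pair in pairs}
for (left, right), score in overlaps.items():
    print(f"{left!r} vs. {right!r}: word overlap = {score:.2f}")
```

The Knicks and Madison Square Garden share no words at all, so a word-overlap measure scores them at zero even though a shopper would consider them closely connected. Bridging that gap is the job of a meaning-based similarity score.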
This language-understanding skill is particularly important in e-commerce, where shoppers' intent and context vary, and where online retailers must basically read shoppers' minds in order to stay the least bit competitive.
STS can also improve related recommendations by suggesting items that are semantically similar to what the person has been showing interest in.
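One simple way to picture similarity-based recommendations is a nearest-neighbor lookup over item embeddings. The sketch below assumes each catalog item already has an embedding; the catalog, its three-dimensional vectors, and the `recommend` helper are hypothetical and exist only to show the idea.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def recommend(viewed_id, catalog, k=2):
    """Return the k items most similar to the one the shopper viewed."""
    target = catalog[viewed_id]
    scored = [
        (other_id, cosine_similarity(target, vector))
        for other_id, vector in catalog.items()
        if other_id != viewed_id  # never recommend the item itself
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]

# Hypothetical catalog with made-up embeddings.
catalog = {
    "fitness tracker": [0.9, 0.8, 0.1],
    "smartwatch": [0.8, 0.9, 0.2],
    "running shoes": [0.7, 0.2, 0.6],
    "coffee maker": [0.1, 0.1, 0.9],
}

recommendations = recommend("fitness tracker", catalog, k=2)
for item, score in recommendations:
    print(f"{item}: {score:.3f}")
```

With these toy vectors, a shopper looking at a fitness tracker would be shown the smartwatch first and the running shoes second, while the coffee maker is left out.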
State-of-the-Art STS
Are you tasked with managing a search engine or e-commerce recommendation system?
If so, check out our NeuralSearch, which utilizes vector search in concert with neural hashes to deliver fast, accurate search results. It's allowed us to combine the speed of traditional keyword search with the accuracy of neural search in a single API.
Our technology is rave-worthy at assessing user intent, context, and conceptual meaning to connect a query with the best content.
Then, let's talk about your options for providing the best imaginable customer experience, with all the benefits that it can bring to your business.