Build a Scalable Semantic Search System with Sentence Transformers and FAISS

by Anant Kumar, April 22nd, 2025

Too Long; Didn't Read

This guide walks through building a high-speed semantic search engine using Sentence Transformers for text embeddings and FAISS for vector similarity search. It includes code, performance tips, and scaling strategies to efficiently search large AI/ML datasets with context-aware accuracy.


With the overwhelming volume of information available today, finding relevant, contextual content has become critical. Traditional keyword search often fails to capture the meaning behind a query, returning irrelevant results when synonyms are used or context is misunderstood. Enter semantic search: a technique that matches on the intent and contextual meaning of a query rather than its exact words.
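
To make the synonym problem concrete, here is a minimal sketch of naive keyword matching. The `keyword_search` function and the two sample documents are hypothetical, written only for this illustration:

```python
def keyword_search(query, documents):
    """Return documents that share at least one lowercase word with the query."""
    query_terms = set(query.lower().split())
    return [doc for doc in documents if query_terms & set(doc.lower().split())]

# Hypothetical sample documents
documents = [
    "Comfortable sofa with washable covers",
    "Adjustable standing desk with memory settings",
]

# "couch" and "sofa" mean the same thing, but only the exact word matches
print(keyword_search("couch", documents))
print(keyword_search("sofa", documents))
```

The first query finds nothing because no document contains the literal word "couch", even though the first document is exactly what the user wants.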


This article will show you how to build a lightning-fast semantic search system using Sentence Transformers for creating embeddings and Facebook AI Similarity Search (FAISS) for efficient similarity matching.

Semantic search does this by translating text into numerical vectors (embeddings) that preserve meaning in a high-dimensional space. Texts with similar meanings end up close to one another in this space, so we can find related content by comparing vectors instead of matching exact keywords.
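
As a rough illustration of "similar ideas wind up near one another", the sketch below compares hand-made 3-dimensional vectors with cosine similarity. The vector values are invented for the example; real embeddings come from a trained model and have hundreds of dimensions.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hand-made toy "embeddings" (hypothetical values, not model output)
toy_embeddings = {
    "couch": [0.9, 0.1, 0.0],
    "sofa": [0.85, 0.15, 0.05],
    "laptop": [0.05, 0.1, 0.95],
}

query = toy_embeddings["couch"]
for word in ("sofa", "laptop"):
    score = cosine_similarity(query, toy_embeddings[word])
    print(f"couch vs {word}: {score:.3f}")
```

The "sofa" vector points in nearly the same direction as "couch", so its score is close to 1, while "laptop" scores much lower.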


Our Tech Stack

We'll be using:


  1. Sentence Transformers: A library that provides pre-trained models for converting sentences into dense vector representations (embeddings).
  2. FAISS (Facebook AI Similarity Search): A library for fast similarity search and clustering of dense vectors, designed to scale to collections of billions of vectors.
  3. Python: Our glue language, with auxiliary utilities from NumPy and Pandas.

Implementation

Let's build a semantic search system for a collection of product descriptions. We'll embed these descriptions, index them with FAISS, and then search for similar products based on user queries.

Step 1: Install Required Libraries

# Install required packages
!pip install sentence-transformers faiss-cpu pandas numpy

Step 2: Prepare the Dataset

import pandas as pd
import numpy as np
from sentence_transformers import SentenceTransformer
import faiss
import pickle
import time

# Sample product dataset (in a real scenario, you would load from a CSV/database)
products = [
    {"id": 1, "name": "Ergonomic Office Chair", "description": "Adjustable height chair with lumbar support and breathable mesh back"},
    {"id": 2, "name": "Standing Desk", "description": "Electric height-adjustable desk with memory settings and anti-collision system"},
    {"id": 3, "name": "Wireless Noise-Cancelling Headphones", "description": "Over-ear headphones with active noise cancellation and 30-hour battery life"},
    {"id": 4, "name": "Mechanical Keyboard", "description": "RGB backlit mechanical keyboard with customizable keys and wrist rest"},
    {"id": 5, "name": "Ultra-wide Monitor", "description": "34-inch curved display with 4K resolution and USB-C connectivity"},
    {"id": 6, "name": "Wireless Mouse", "description": "Ergonomic wireless mouse with adjustable DPI and silent clicks"},
    {"id": 7, "name": "Laptop Stand", "description": "Adjustable aluminum stand for improved ergonomics and cooling"},
    {"id": 8, "name": "External SSD", "description": "1TB portable solid-state drive with USB 3.2 and shock resistance"},
    {"id": 9, "name": "Webcam", "description": "4K webcam with auto-focus and built-in privacy cover"},
    {"id": 10, "name": "USB Docking Station", "description": "12-in-1 hub with HDMI, Ethernet, USB-A, USB-C, and SD card reader"}
]

# Convert to DataFrame for easier handling
df = pd.DataFrame(products)

Step 3: Generate Embeddings with Sentence Transformers

# Load pre-trained Sentence Transformer model
# We're using the all-MiniLM-L6-v2 model which provides a good balance of speed and quality
model = SentenceTransformer('all-MiniLM-L6-v2')

# Generate embeddings for the product descriptions
# We'll combine name and description for better context
texts = df['name'] + ": " + df['description']
embeddings = model.encode(texts.tolist(), show_progress_bar=True)

# Check the embedding dimension
print(f"Embedding shape: {embeddings.shape}")

Step 4: Build the FAISS Index

# Get the dimension of the embeddings
dimension = embeddings.shape[1]

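# Initialize a FAISS index
# IndexFlatL2 performs exact nearest neighbor search by comparing the query
# against every stored vector; the distances it returns are squared L2 distances
# For larger datasets, consider an approximate index such as IndexIVFFlat
# (shown later) or a quantized index
index = faiss.IndexFlatL2(dimension)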

# Convert embeddings to float32 (required by FAISS)
embeddings = np.array(embeddings).astype('float32')

# Add the embeddings to the index
index.add(embeddings)

print(f"Total vectors in FAISS index: {index.ntotal}")
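
A note on the distance metric: semantic similarity is often described in terms of cosine similarity, while the index above uses L2 distance. If the embeddings are normalized to unit length, the two give the same ranking, because the squared L2 distance equals 2 - 2 * cosine. The NumPy sketch below checks this on random stand-in vectors (not real model output). If you prefer to work with cosine scores directly, FAISS also offers `faiss.normalize_L2` and `faiss.IndexFlatIP`.

```python
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for embeddings (hypothetical data, not real model output)
vectors = rng.normal(size=(5, 8)).astype("float32")
query = rng.normal(size=(8,)).astype("float32")

# Scale every vector to unit length
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)
query /= np.linalg.norm(query)

squared_l2 = np.sum((vectors - query) ** 2, axis=1)
cosine = vectors @ query

# For unit vectors: ||a - b||^2 = 2 - 2 * cos(a, b)
identity_holds = np.allclose(squared_l2, 2 - 2 * cosine, atol=1e-5)
print("Identity holds:", identity_holds)

# The nearest vector by L2 is also the most similar by cosine
print("Nearest by L2:", int(np.argmin(squared_l2)))
print("Nearest by cosine:", int(np.argmax(cosine)))
```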

Step 5: Implement the Search Function

def semantic_search(query, top_k=3):
    """
    Search for products semantically similar to the query
    """
    # Generate embedding for the query
    query_embedding = model.encode([query])
    
    # Convert to float32
    query_embedding = np.array(query_embedding).astype('float32')
    
    # Perform the search (distances are squared L2; smaller means more similar)
    start_time = time.perf_counter()
    distances, indices = index.search(query_embedding, top_k)
    search_time = time.perf_counter() - start_time
    
    # Get the results
    results = []
    for distance, idx in zip(distances[0], indices[0]):
        if idx == -1:  # FAISS pads with -1 when fewer than top_k vectors exist
            continue
        row = df.iloc[idx]
        results.append({
            "id": int(row["id"]),
            "name": row["name"],
            "description": row["description"],
            "distance": float(distance)
        })
    
    print(f"Search completed in {search_time*1000:.2f} ms")
    return results

Step 6: Try Some Searches

# Search for ergonomic office equipment
results = semantic_search("ergonomic work setup for back pain")
print("Query: ergonomic work setup for back pain")
for i, result in enumerate(results, 1):
    print(f"{i}. {result['name']}")
    print(f"   {result['description']}")
    print(f"   Distance: {result['distance']:.4f}")
    print()

# Search for audio-related products
results = semantic_search("quality sound with noise reduction")
print("\nQuery: quality sound with noise reduction")
for i, result in enumerate(results, 1):
    print(f"{i}. {result['name']}")
    print(f"   {result['description']}")
    print(f"   Distance: {result['distance']:.4f}")
    print()


Scaling to Millions of Documents

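The implementation above works well for small to medium-sized datasets, because a flat index compares the query against every stored vector. To scale to millions of documents, FAISS offers approximate indexes that trade a little accuracy for speed, such as the inverted-file (IVF) index: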


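# Example: Using IndexIVFFlat for faster search on large datasets
# nlist (number of clusters) cannot exceed the number of training vectors, and
# FAISS warns when there are only a few vectors per cluster. A value like 100
# suits a corpus of thousands of vectors; it is capped here so the snippet
# also runs on the 10-product sample.
nlist = min(100, len(embeddings))
quantizer = faiss.IndexFlatL2(dimension)
index_ivf = faiss.IndexIVFFlat(quantizer, dimension, nlist, faiss.METRIC_L2)

# The index must be trained on a representative sample before vectors are added
index_ivf.train(embeddings)
index_ivf.add(embeddings)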

# Set the number of clusters to probe during search
index_ivf.nprobe = 10  # Higher values = more accurate but slower search
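
To see what `nlist` and `nprobe` control, here is a simplified NumPy sketch of the inverted-file idea: cluster the vectors, store each vector under its nearest centroid, and at query time scan only the closest few clusters. This is an illustration under stated assumptions (random data, a basic k-means loop, hypothetical helper names), not how FAISS is implemented internally.

```python
import numpy as np

rng = np.random.default_rng(42)
dim, n_vectors, nlist, nprobe = 16, 2000, 20, 5

# Random stand-ins for embeddings (hypothetical data)
data = rng.normal(size=(n_vectors, dim)).astype("float32")

def squared_l2(a, b):
    """Pairwise squared L2 distances between rows of a and rows of b."""
    return ((a[:, None, :] - b[None, :, :]) ** 2).sum(axis=2)

# "Training": a few rounds of k-means to place nlist centroids
centroids = data[rng.choice(n_vectors, size=nlist, replace=False)].copy()
for _ in range(10):
    assignment = squared_l2(data, centroids).argmin(axis=1)
    for c in range(nlist):
        members = data[assignment == c]
        if len(members) > 0:
            centroids[c] = members.mean(axis=0)

# "Adding": file each vector id under its nearest centroid
assignment = squared_l2(data, centroids).argmin(axis=1)
inverted_lists = [np.flatnonzero(assignment == c) for c in range(nlist)]

def ivf_search(query, top_k=3):
    """Scan only the nprobe clusters whose centroids are closest to the query."""
    centroid_dist = squared_l2(query[None, :], centroids)[0]
    probed = np.argsort(centroid_dist)[:nprobe]
    candidates = np.concatenate([inverted_lists[c] for c in probed])
    dist = squared_l2(query[None, :], data[candidates])[0]
    order = np.argsort(dist)[:top_k]
    return candidates[order], dist[order]

query = rng.normal(size=dim).astype("float32")
ids, dists = ivf_search(query)
exact_ids = np.argsort(squared_l2(query[None, :], data)[0])[:3]
print("IVF result:  ", ids)
print("Exact result:", exact_ids)
```

Only the vectors in the probed clusters are compared against the query, which is where the speedup comes from. The two results can differ when a true neighbor sits in a cluster that was not probed; raising `nprobe` reduces that risk at the cost of scanning more vectors.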


Performance Comparison

The key advantage of using FAISS is its speed, especially at scale. The comparison below times a naive Python loop against the FAISS index from Step 4:


# Simulation of brute force search (without FAISS)
def brute_force_search(query, embeddings, top_k=3):
    query_embedding = model.encode([query])[0]
    
    start_time = time.time()
    distances = []
    for emb in embeddings:
        # Calculate squared Euclidean distance (same metric as IndexFlatL2)
        distance = np.sum((query_embedding - emb) ** 2)
        distances.append(distance)
    
    # Get top k nearest neighbors
    indices = np.argsort(distances)[:top_k]
    search_time = time.time() - start_time
    
    print(f"Brute force search completed in {search_time*1000:.2f} ms")
    return indices, [distances[i] for i in indices]

# Compare on a small dataset (just for demonstration)
query = "portable storage solution"
brute_force_search(query, embeddings)
semantic_search(query)  # FAISS-based search


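FAISS is typically faster even on small datasets, although with only ten vectors much of the gap comes from Python loop overhead rather than from the search algorithm itself. The difference grows with scale: combined with approximate indexes such as IndexIVFFlat, FAISS is designed to search collections of up to billions of vectors, with query latency that depends on the index type, hardware, and accuracy settings.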
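
For a fairer baseline, it helps to separate loop overhead from the cost of the search itself. The sketch below compares the per-vector Python loop with a vectorized NumPy version on random stand-in data (10,000 vectors with 384 dimensions, the output size of all-MiniLM-L6-v2). The function names are hypothetical, and timings will vary by machine.

```python
import time
import numpy as np

rng = np.random.default_rng(0)

# Random stand-ins for a larger corpus (hypothetical data)
corpus = rng.normal(size=(10_000, 384)).astype("float32")
query = rng.normal(size=384).astype("float32")

def loop_search(query, corpus, top_k=3):
    """One distance computation per Python loop iteration."""
    distances = [np.sum((query - emb) ** 2) for emb in corpus]
    return np.argsort(distances)[:top_k]

def vectorized_search(query, corpus, top_k=3):
    """All distances in one NumPy operation."""
    distances = np.sum((corpus - query) ** 2, axis=1)
    # argpartition finds the top_k smallest without sorting everything
    nearest = np.argpartition(distances, top_k)[:top_k]
    return nearest[np.argsort(distances[nearest])]

for fn in (loop_search, vectorized_search):
    start = time.perf_counter()
    result = fn(query, corpus)
    elapsed_ms = (time.perf_counter() - start) * 1000
    print(f"{fn.__name__}: {elapsed_ms:.2f} ms -> {result}")
```

Both functions are still exhaustive: they compute a distance to every vector, just as a flat FAISS index does. The approximate indexes from the previous section go further by skipping most of the corpus.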

Conclusion

Semantic search is a big step beyond simple keyword matching because it ranks results by meaning and user intent rather than by exact wording. With Sentence Transformers and FAISS, you can build a semantic search system that scales to millions, or even billions, of documents while keeping queries fast.

The applications don't end with product search: the same technique can be extended to knowledge bases, document retrieval, recommendation systems, and more. As language models continue to improve, semantic search will become more precise and context-sensitive.

In an age where finding the right information in a timely manner is increasingly valuable, integrating semantic search into your applications gives users a much richer experience.
