A Petabyte-Scale Vector Store for the Future of AGI

by DataStax, July 18th, 2023

Too Long; Didn't Read

DataStax announced the general availability of DataStax Astra DB with vector search.

You stumble upon an intriguing YouTube video that guides you through creating your very own chatbot. After an hour of experimentation in Visual Studio, you have a fantastic little project to showcase to your colleagues.


However, when your boss mandates implementing AI throughout the company, you realize that this proof of concept is only suitable for a laptop; it’s not practical for production.


At this inflection point in artificial intelligence technology, only a tiny percentage of companies have done anything at scale with generative AI. A laptop POC can get away with a gigabyte-scale vector store.


But that will change quickly, and when it does, my colleagues and I at DataStax and the Apache Cassandra® project are already on it—using proven technology to push the boundaries far beyond a gigabyte of vector data in one data center.


To that end, today, we announced the general availability of DataStax Astra DB with vector search. We're building a future for generative AI that features autonomous agents. These AI agents will need a lot of fast-access memory for contextual recall. And guess what?


Vector stores will be the key to satisfying this voracious hunger for memory.
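

To make that concrete, here's a minimal sketch of what storing and searching vectors looks like through the DataStax Python driver. The keyspace, table, column names, and dimension are illustrative assumptions, not a schema from the announcement:

```python
# A minimal sketch: storing and ANN-searching embeddings in Cassandra / Astra DB.
# Keyspace, table, and dimension (1536) are illustrative assumptions.
from cassandra.cluster import Cluster

session = Cluster(["127.0.0.1"]).connect("demo_ks")

# A table with a fixed-dimension vector column alongside the document text.
session.execute("""
    CREATE TABLE IF NOT EXISTS documents (
        doc_id    text PRIMARY KEY,
        body      text,
        embedding vector<float, 1536>
    )
""")

# A storage-attached index (SAI) enables approximate nearest-neighbor queries.
session.execute("""
    CREATE CUSTOM INDEX IF NOT EXISTS documents_ann_idx
    ON documents (embedding) USING 'StorageAttachedIndex'
""")

# Fetch the five documents closest to a query embedding.
query_vector = [0.0] * 1536  # in practice, produced by your embedding model
rows = session.execute(
    "SELECT doc_id, body FROM documents ORDER BY embedding ANN OF %s LIMIT 5",
    (query_vector,),
)
for row in rows:
    print(row.doc_id, row.body[:60])
```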

More Data Everywhere

The more vector data we use, the more apparent it becomes that scale will inevitably be the limiting factor. But this is where Cassandra truly shines. We're confident claiming a vector store can hit a petabyte because it's built on Cassandra.


Yes, the same Cassandra our users are already running with petabyte-size clusters. For the past 12 years, we, as an open-source project, have been building and optimizing a system for the largest transactional data workloads in the world.


Storing and searching vectors is just one more feature to add to an already incredible piece of technology.


As a bonus, one of the most significant advantages of using Cassandra as a vector store is its built-in replication mechanism. This allows for active-active replication globally, which means your data can exist and be updated in real-time in multiple places. In the era of big data, this was a superpower for many organizations.


In the age of generative AI, it will be a matter of survival as agents act independently and globally, demanding consistent data storage anywhere it's needed, with the elasticity required to make it affordable at scale.
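

As a concrete illustration of that replication model, here's a sketch of one keyspace actively replicated across two regions; the datacenter names are made up, and local-consistency writes keep latency low while replication carries updates to the other side of the world:

```python
# Sketch: one keyspace, active-active across two datacenters.
# Datacenter names ('us_east', 'eu_west') are illustrative.
from cassandra import ConsistencyLevel
from cassandra.cluster import Cluster
from cassandra.query import SimpleStatement

session = Cluster(["127.0.0.1"]).connect()

session.execute("""
    CREATE KEYSPACE IF NOT EXISTS agent_data
    WITH replication = {
        'class': 'NetworkTopologyStrategy',
        'us_east': 3,
        'eu_west': 3
    }
""")

session.execute("""
    CREATE TABLE IF NOT EXISTS agent_data.notes (
        id text PRIMARY KEY,
        body text
    )
""")

# LOCAL_QUORUM acknowledges the write within the local datacenter;
# replication propagates it to the other region in the background.
stmt = SimpleStatement(
    "INSERT INTO agent_data.notes (id, body) VALUES (%s, %s)",
    consistency_level=ConsistencyLevel.LOCAL_QUORUM,
)
session.execute(stmt, ("note-1", "written in us_east, readable in eu_west"))
```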

Do We Really Need This?

Now, you might ask, "Who actually needs a vector store that can hold a petabyte?" If history has taught us anything, it's that the need for data storage capacity grows much faster than anyone anticipates.


Using vectors has quickly become the predominant way to incorporate enterprise data into foundation models. Even though fine-tuning might theoretically achieve the same outcome, many businesses have discovered that incorporating vectors offers significant advantages.


It provides data provenance, which is particularly important in regulated fields like healthcare and law, and helps avoid the complexities of model tuning.


Retrieval-Augmented Generation (RAG) and the newer Forward-Looking Active Retrieval Augmented Generation (FLARE) are impressive solutions that can reduce the problem of large language model hallucinations while using the most dynamic and up-to-date information.


If you're looking for the best results, combining LLMs with vector search is the way to go.
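

For readers who want to see the shape of that combination, here's a compact sketch of the RAG loop. The embed() and generate() functions are placeholder stubs for whatever embedding model and LLM you choose, and the documents table comes from the earlier sketch:

```python
# Sketch of a retrieval-augmented generation (RAG) loop.
# embed() and generate() are placeholder stubs, not a real API.
from typing import List

def embed(text: str) -> List[float]:
    raise NotImplementedError("call your embedding model here")

def generate(prompt: str) -> str:
    raise NotImplementedError("call your LLM here")

def answer(session, question: str) -> str:
    qvec = embed(question)  # 1. embed the question
    rows = session.execute(  # 2. retrieve the closest enterprise documents
        "SELECT doc_id, body FROM documents ORDER BY embedding ANN OF %s LIMIT 3",
        (qvec,),
    )
    context = "\n".join(f"[{r.doc_id}] {r.body}" for r in rows)
    # 3. Ground the LLM in the retrieved context. Keeping doc_ids in the
    #    prompt is what provides provenance for regulated fields.
    return generate(
        f"Answer using only this context:\n{context}\n\nQuestion: {question}"
    )
```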


Improved LLMs haven't decreased the need for vectors. With their consumption of compute, network, and storage resources, LLMs are becoming the leader in infrastructure spending. They will overtake the current leader in what some have termed "petacost" infrastructure: the enterprise data lake.


However, combining LLMs with vector search can provide optimal performance and quality at a reasonable cost.


It's only a matter of time before we'll need petabyte-sized vector stores, given the variety of things we'll need to embed. A critical factor in the effectiveness of similarity search is the quality of the embedding algorithm used, coupled with efficient storage and retrieval.


A system shouldn't be efficient only until there is too much data; it should stay efficient well beyond the point where you run out of data to give it.

No Pain for the AI Brain

ChatGPT captured everyone's attention and created an enormous amount of “what if” speculation, but in the end, it’s a product that demonstrates a new class of data architecture. LLMs will continue to improve, but what you do with the LLM is what creates value.


Forward-looking experts in the field have been telling us the real revolution will happen in two parts:


  1. Artificial general intelligence (AGI)


  2. Distributed autonomous AI agents


Either one of these will cause enormous resource strains and, combined, could spell a lot of trouble for organizations that run up against limits. Agents are similar to humans: the more they know, the better the decisions they can make.


If you had a simple flight-booking agent, consider all the relevant things that need immediate recall: not only changing schedules and weather conditions, but also the experience gained after booking many flights. Wait, experience?


Human travel agents have deep experience working with a chaotic system, and that experience can be characterized as one thing: memory. AI agents will become more valuable as they gain insights into their tasks, and those memories will be stored as embeddings.


We don’t want our agents to suffer the same problems seen in the movie Memento, so let's not even start with limits.
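

One way to picture memory as embeddings is the sketch below, with an entirely hypothetical schema: the agent writes a summarized experience after each task and recalls the nearest past experiences before the next one.

```python
# Hypothetical sketch of agent memory as embeddings: write after each task,
# ANN-recall before the next. Table and column names are made up.
from typing import List

def remember(session, summary: str, embedding: List[float]) -> None:
    # Persist one experience: a human-readable summary plus its embedding.
    session.execute(
        "INSERT INTO memories (id, summary, embedding) VALUES (uuid(), %s, %s)",
        (summary, embedding),
    )

def recall(session, query_embedding: List[float], k: int = 5):
    # Surface the k most similar past experiences for the current task.
    return session.execute(
        "SELECT summary FROM memories ORDER BY embedding ANN OF %s LIMIT %s",
        (query_embedding, k),
    )
```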

Start Tomorrow Today

So, my advice? Start thinking about AI agents and how you will scale them today. Don't wait for tomorrow, next week, or when you hit that inevitable roadblock. Set yourself up for success now.


Plan for growth and scalability. Don't put yourself in a position where you're forced to undertake a massive migration later. I’ve been involved in some huge data migration projects that always start with, “Well, we didn’t think we would need more scale.”


Cassandra is open-source and free to use. If you don’t want the toil of running a large cluster, DataStax Astra DB can have you up and running in a few clicks and will auto-scale as high as you ever want.


And for those looking at trendlines and trying to plan the next move, AI agents are what you need to consider. The future of AI is vast, and it's exciting. But to be ready for it, we need to prepare today.


Learn about frameworks like LangChain and LlamaIndex and use CassIO to access a petabyte-scale vector store built on robust and reliable Cassandra. Start on the right foot today, and don't set yourself up for migration later.
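

Here's a sketch of what the LangChain route looks like, with CassIO powering the integration under the hood. Import paths and constructor arguments vary across LangChain versions, so treat this as illustrative rather than definitive:

```python
# Sketch of the LangChain + Cassandra vector store path (backed by CassIO).
# Keyspace and table names are illustrative.
from cassandra.cluster import Cluster
from langchain.embeddings import OpenAIEmbeddings
from langchain.vectorstores import Cassandra

session = Cluster(["127.0.0.1"]).connect()

store = Cassandra(
    embedding=OpenAIEmbeddings(),
    session=session,
    keyspace="demo_ks",
    table_name="demo_vectors",
)

# Embeddings are computed and stored automatically on insert.
store.add_texts(["Cassandra clusters already run at petabyte scale."])
hits = store.similarity_search("How large can a vector store grow?", k=1)
```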


Let's usher in the future of AI together, one petabyte-scale vector store at a time.


By Patrick McFadin, DataStax