It has been a while since my last “What the heck is?” article, and I’ve recently seen some rapid growth from
GlareDB is an
They describe how it fits in the stack in this diagram:
It supports data located on GCS or S3 of the following types:
They are quickly adding support for various engines, so this list could be incomplete by the time you read this.
At first blush, you look at this and think, hey, this seems a lot like
Given that Trino is written in Java, there is a lot of Java ecosystem to deal with if you want to use it. Sure, there are pre-built Docker containers that can shorten this path, but generally, if you are “just trying to do something,” installing and setting up Trino is a heavy lift. With GlareDB, you can either download and run a single executable or use their SaaS product, which looks like this when you first use it:
Now to Hybrid Execution. I’ll paraphrase some of what GlareDB had to say in their blog post on the topic. Say you have a CSV list of user IDs that some other tool extracted from your database. Now you want to enrich that data with some of the users’ demographic information from your database. We’ll say our table name is user_demo and our CSV file is user_id.csv; the query would look something like this:
SELECT DISTINCT
    m.user_id,
    m.first_name,
    m.last_name,
    m.birth_date
FROM
    user_demo m
    INNER JOIN '/user_id.csv' u ON m.user_id = u.id;
Clearly, this is a simple example, but you could extend it to pull information from other joined tables as well. You can also go in the other direction: you have a local file with a key field plus some data that doesn’t exist in the database, and you join it to a database table to bring the two together. Either way, you skip creating and loading a new table just for an ad-hoc report, which saves a lot of time.
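For contrast, here is a rough sketch of the manual route that this kind of query lets you skip: stage the CSV in a temporary table, load it, join, and clean up. This is not GlareDB code. It uses Python's built-in sqlite3 module as a stand-in for "your database", and the staging table name user_ids, the helper enrich_user_ids, and the sample rows are all made up for illustration.

```python
import csv
import io
import sqlite3

# Hypothetical stand-ins: an in-memory SQLite database plays the role of
# "your database", and a small in-memory CSV plays the role of user_id.csv.
CSV_TEXT = "id\n1\n3\n"


def enrich_user_ids(conn, csv_file):
    """Stage the CSV in a temp table, then join it to user_demo."""
    # Step 1: create the staging table.
    conn.execute("CREATE TEMP TABLE user_ids (id INTEGER)")
    # Step 2: load every row of the CSV into it.
    reader = csv.DictReader(csv_file)
    conn.executemany(
        "INSERT INTO user_ids (id) VALUES (?)",
        [(int(row["id"]),) for row in reader],
    )
    # Step 3: run the enrichment join (DISTINCT guards against repeated IDs).
    rows = conn.execute(
        """
        SELECT DISTINCT m.user_id, m.first_name, m.last_name, m.birth_date
        FROM user_demo m
        INNER JOIN user_ids u ON m.user_id = u.id
        ORDER BY m.user_id
        """
    ).fetchall()
    # Step 4: drop the staging table again.
    conn.execute("DROP TABLE user_ids")
    return rows


# Build a tiny user_demo table with made-up sample data.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE user_demo ("
    "user_id INTEGER PRIMARY KEY, first_name TEXT, "
    "last_name TEXT, birth_date TEXT)"
)
conn.executemany(
    "INSERT INTO user_demo VALUES (?, ?, ?, ?)",
    [
        (1, "Ada", "Lovelace", "1815-12-10"),
        (2, "Alan", "Turing", "1912-06-23"),
        (3, "Grace", "Hopper", "1906-12-09"),
    ],
)

enriched = enrich_user_ids(conn, io.StringIO(CSV_TEXT))
for row in enriched:
    print(row)
```

Steps 1, 2, and 4 are pure overhead for a one-off report; querying the file in place collapses all of this into the single join.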
That’s all just meant to give you a quick taste of what GlareDB can do and where it currently stands. The docs and blogs on their site are well done, making it pretty quick to jump in.
GlareDB is very interesting, and I appreciate how quickly they are iterating and updating the software. I need to spend some more time thinking about how it plays in the
You can read the other “What the heck” articles at these links: