Trying Aerospike as Data store in Kurio

First Impression and A Little Story When Trying Aerospike in Kurio

aerospike logo from google images search

Last week we developed a new feature in Kurio. It felt big to me because only three of us were assigned to finish it: two from the backend team and one from the infra team. So we needed to find a database that suited our requirements:

It supports Golang, through an official library or an easy integration, because our current project already runs on Golang.
It is fast for both reads and writes, whether it is an RDBMS or NoSQL. That means writing data should not affect read performance.
It must be easy to scale.
The data must be persistent and saved to disk.

If drawn as a picture, we have something like this:

Our System Schema

After figuring out the requirements, we listed a few database options.

Redis

We know Redis is very good and fast because it keeps the data in memory. But we knew it was not suitable for our case. Redis persists the data to disk only when something triggers it, while our case needs the data stored and persisted on disk, together with fast reads.

MongoDB

MongoDB was our second option because our current system already uses MongoDB as its data store. But we need more performance.

MyRocks

Another option we considered was MyRocks. MyRocks, introduced by Facebook, is MySQL with RocksDB as the storage engine. Because Facebook uses it, we thought it would be a better option. But after discussing it with our infrastructure team, we learned that MyRocks is the same as any other MySQL: it cannot scale out. The only difference is the storage engine, so in terms of "scaling out" there is no big difference compared to a normal MySQL.

Aerospike

Later, I found many databases out there with great performance, such as Cassandra and Scylla. I don't remember many of them. Then I found Aerospike. It looked like a rising star among databases, and there were many benchmark articles about it on the Internet. To be honest, Aerospike was new to me and to the team in Kurio. But after reading the reviews and seeing that its features fit our case quite well, we decided to try it. After discussing with the team and the infra team, we decided to use Aerospike. Thanks to its scalability, the infra team does not need extra effort to maintain Aerospike when scaling out.

First Impression

These are a few features of Aerospike that amazed our team and also fit our case.

Redis-way

After learning the concept and how data is saved in Aerospike, I learned that Aerospike has a concept similar to Redis: it uses a key:value model.

As for the performance of retrieving data by key, we expected it to be comparable to Redis, since both are key:value lookups.

Secondary Index

Another thing I learned is that Aerospike supports secondary indexes. So even though Aerospike is a key:value store, we can also query using another index that we create.

Asynchronously Persisted

Unlike Redis, Aerospike persists the data to disk asynchronously. While Redis persists data on a trigger or action, Aerospike can persist the data to disk automatically, because in Aerospike we can use hybrid data storage: it saves to both memory and disk.

Data Model and Schema

In Aerospike, there are a few data-related terms that we must know first. They are:

Namespace
Set
Record
Bin

Aerospike Data Schema

Namespaces

Namespaces are the top-level containers. A namespace contains one or more sets, records, bins, and indexes. Compared to an RDBMS, a namespace is similar to a database schema.

Namespace image from Aerospike documentations

Sets

A set is similar to a collection in MongoDB or a table in an RDBMS. It contains many records and bins.

Set in Aerospike

Records

Records are similar to rows in an RDBMS. One record has one PK (key) and one or many bins. One set/collection may have many records.

Record in Aerospike

Bins

Bin in Aerospike

Bins in Aerospike are similar to columns in an RDBMS. We can add an index to a bin, as we would to a column in an RDBMS. The difference is that bins are more flexible and dynamic: a record can have a lot of bins, and a single bin can store any data type (Int, String, Byte, etc.).

Example of Bins

More about this is already explained well in the official documentation here: https://www.aerospike.com/docs/architecture/data-model.html. So I will not say much about these four here.
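To make the nesting of the four terms concrete, here is a minimal sketch in plain Go. It is only a mental model built from maps, not the Aerospike client API and not how the server stores data. All the type and method names (Namespace, Set, Record, Bins, Put, Get) and the sample values are my own assumptions for illustration.

```go
package main

import "fmt"

// Bins maps a bin name to its value. Values are untyped because one bin
// can hold an integer, a string, bytes, and so on.
type Bins map[string]interface{}

// Record is like a row: one primary key plus its bins.
type Record struct {
	PK   interface{}
	Bins Bins
}

// Set groups records by primary key, like a table or a collection.
type Set map[interface{}]Record

// Namespace is the top-level container and holds sets by name.
type Namespace map[string]Set

// Put stores a record in the named set, creating the set on first use.
func (ns Namespace) Put(set string, rec Record) {
	if ns[set] == nil {
		ns[set] = Set{}
	}
	ns[set][rec.PK] = rec
}

// Get looks a record up by set name and primary key.
func (ns Namespace) Get(set string, pk interface{}) (Record, bool) {
	rec, ok := ns[set][pk]
	return rec, ok
}

func main() {
	sample := Namespace{}
	sample.Put("user", Record{PK: 12, Bins: Bins{"name": "Alice", "age": 30}})
	// Records in the same set do not need to share the same bins.
	sample.Put("user", Record{PK: 13, Bins: Bins{"name": "Bob", "avatar": []byte{0x1}}})

	if rec, ok := sample.Get("user", 12); ok {
		fmt.Println(rec.Bins["name"], rec.Bins["age"])
	}
}
```

The point of the sketch is the shape: a namespace holds sets, a set holds records keyed by PK, and each record carries its own loosely typed bins.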

Querying and Indexing

So, while developing the feature (which uses Aerospike as the data store), we had to learn how to query in Aerospike.

Luckily, Aerospike already provides client libraries for many programming languages. We can see them in their official GitHub account here http://github.com/aerospike. To help with debugging and managing data, they also created aql (Aerospike Query Language). It provides a SQL-like command line interface for database, UDF (User Defined Function) and index management.

With aql, we can query the Aerospike server like this:

$ aql> SELECT * FROM test.user
$ aql> SELECT * FROM test.user WHERE PK=2

You can read more about aql and its commands here: https://www.aerospike.com/docs/tools/aql

For our case, because we use Golang in our project, we use the official client created by Aerospike here: https://github.com/aerospike/aerospike-client-go

Indexing

As we know, Aerospike is a key:value data store. But Aerospike also supports secondary indexes. That means we can also add an index on a value/bin and then query on that value. So we can get data not just by key, but also by value, through the indexed bin.

For example, let's say I have a User set with the bins user_id, name, and email. For this example, I will make user_id the PK. So in total, one record will have a minimum of 2 bins.

Example of User

With this record, I can get the record directly by PK. With aql, it is just this command:

$ aql> SELECT * FROM sample.user WHERE PK=12

In another case, let's say I want to query by email. I want to get the user with the email [email protected]. With aql, it looks more like this.

# Add Index on email bin
$ aql> CREATE INDEX email_user_idx ON sample.user (email) STRING
# Query by Email
$ aql> SELECT * FROM sample.user WHERE email="[email protected]"
# Will Display the result
|-----|---------------|--------------------|
| PK  | name          | email              |
| 14  | Iman Ganteng  | [email protected]  |
|-----|---------------|--------------------|

You can read more about querying and indexing in the official documentation.
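To show the idea behind querying by an indexed bin, here is a small sketch in plain Go. It is a toy in-memory model, not how Aerospike builds its indexes and not the client API. The Store type, its methods, and the sample email addresses are hypothetical names I made up for the illustration.

```go
package main

import "fmt"

// User is a hypothetical record with two bins; ID plays the role of the PK.
type User struct {
	ID    int
	Name  string
	Email string
}

// Store keeps records by primary key and maintains a secondary index
// from the email bin back to the primary keys that hold that value.
type Store struct {
	byPK    map[int]User
	byEmail map[string][]int
}

// NewStore returns an empty store with both maps initialised.
func NewStore() *Store {
	return &Store{byPK: map[int]User{}, byEmail: map[string][]int{}}
}

// Put writes the record and keeps the secondary index in sync.
func (s *Store) Put(u User) {
	if old, ok := s.byPK[u.ID]; ok {
		s.unindex(old) // drop the stale index entry on overwrite
	}
	s.byPK[u.ID] = u
	s.byEmail[u.Email] = append(s.byEmail[u.Email], u.ID)
}

// unindex removes a record's PK from the entry for its old email.
func (s *Store) unindex(u User) {
	ids := s.byEmail[u.Email]
	kept := ids[:0]
	for _, id := range ids {
		if id != u.ID {
			kept = append(kept, id)
		}
	}
	if len(kept) == 0 {
		delete(s.byEmail, u.Email)
		return
	}
	s.byEmail[u.Email] = kept
}

// GetByPK is the plain key:value lookup.
func (s *Store) GetByPK(id int) (User, bool) {
	u, ok := s.byPK[id]
	return u, ok
}

// QueryByEmail goes through the secondary index, then fetches by PK.
func (s *Store) QueryByEmail(email string) []User {
	var out []User
	for _, id := range s.byEmail[email] {
		out = append(out, s.byPK[id])
	}
	return out
}

func main() {
	s := NewStore()
	s.Put(User{ID: 14, Name: "Alice", Email: "alice@example.com"})
	fmt.Println(s.QueryByEmail("alice@example.com"))
}
```

The trade-off visible here is that every write also has to update the index, which is the price for being able to look records up by value.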

Deploying To Production

Well, back to our story. If you want to know more about Aerospike, you can read the official documentation on their website.

After finishing all the features and the environment, we tried to deploy it to production.

We deployed it late at night, from around 11.00 PM until 11.59 PM, and left it running until morning to gather data.

But in the morning, at 06.00 AM, our CPU usage spiked. Unfortunately, we had to roll back the service to the stable version.

Detecting and Fixing Issues

So, after trying to release it to production, we hit a critical issue. When the request rate was high, CPU usage was abnormally higher than in the old version. To be honest, the new version does a lot more computation than the previous one, and we had not implemented an autoscaling mechanism yet. So we assumed that our added functions caused the high CPU usage. But when we profiled our application, we found something unexpected: the profile showed that the client library was slow and used quite a lot of CPU.

Profiling golang using pprof. Show the CPU usage in client library aerospike

More usage caused by Syscall.

But later, after looking at the slide presentation by the CTO of Aerospike here: https://www.slideshare.net/brian-aerospike/go-meetup-nov142, and after going through all the pprof images, we could see that this was caused by network I/O. So, to fix the issue, we implemented the autoscaling mechanism in our system.
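For readers who have not used pprof before, here is a minimal sketch of one way to capture a CPU profile with the standard runtime/pprof package. This is not our exact setup. The busyWork function is a hypothetical stand-in for the real request handling, and profileCPU is a helper name I made up.

```go
package main

import (
	"bytes"
	"fmt"
	"runtime/pprof"
)

// busyWork is a hypothetical stand-in for the code we want to measure.
func busyWork(n int) int {
	sum := 0
	for i := 0; i < n; i++ {
		sum += i % 7
	}
	return sum
}

// profileCPU runs fn while the CPU profiler is active and returns the
// raw profile bytes (the format that `go tool pprof` reads).
func profileCPU(fn func()) ([]byte, error) {
	var buf bytes.Buffer
	if err := pprof.StartCPUProfile(&buf); err != nil {
		return nil, err
	}
	fn()
	// StopCPUProfile flushes the collected samples into buf.
	pprof.StopCPUProfile()
	return buf.Bytes(), nil
}

func main() {
	data, err := profileCPU(func() { busyWork(50000000) })
	if err != nil {
		fmt.Println("could not profile:", err)
		return
	}
	fmt.Printf("captured %d bytes of CPU profile\n", len(data))
}
```

If the bytes are written to a file, the profile can be opened with `go tool pprof` to see which functions (including syscalls made by a client library) take the most CPU time.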

Conclusion

So, trying Aerospike was quite challenging because it was new to us, and only three of us worked on it: two backend engineers plus one member of the infra team. From my own perspective, Aerospike is worth trying for anyone looking for a data store for cases like ours: Redis-like but persisted (hybrid: memory and disk), with support for secondary indexes.

As for drawbacks, I found one, and it was in the Golang client library, not in Aerospike itself. The library returns the data as map[string]interface{}. I wish someone out there would submit a PR to the repository so that the client library can return only bytes when querying, and we can handle the marshalling ourselves. LOL 😈
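To illustrate why the map[string]interface{} result is inconvenient, here is a minimal sketch of the conversion code we end up writing by hand. It uses only plain Go and does not call the Aerospike client. The User type, the bin names, the sample email, and the userFromBins helper are assumptions for the example.

```go
package main

import (
	"errors"
	"fmt"
)

// User is a hypothetical typed view of one record.
type User struct {
	Name  string
	Email string
}

// errBadBin reports a bin that is missing or holds an unexpected type.
var errBadBin = errors.New("bin missing or of unexpected type")

// userFromBins converts untyped bins into a User. Each bin needs a
// checked type assertion, because the map gives no compile-time types.
func userFromBins(bins map[string]interface{}) (User, error) {
	name, ok := bins["name"].(string)
	if !ok {
		return User{}, fmt.Errorf("name: %w", errBadBin)
	}
	email, ok := bins["email"].(string)
	if !ok {
		return User{}, fmt.Errorf("email: %w", errBadBin)
	}
	return User{Name: name, Email: email}, nil
}

func main() {
	bins := map[string]interface{}{"name": "Alice", "email": "alice@example.com"}
	u, err := userFromBins(bins)
	if err != nil {
		fmt.Println("conversion failed:", err)
		return
	}
	fmt.Println(u.Name, u.Email)
}
```

Every new bin means another assertion like these, which is why I would prefer to receive raw bytes and unmarshal them into a struct in one step.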

Well, maybe there are a few things I missed, but I hope I wrote it well. By the way, I wrote this based on my own perspective, opinion, and experience when trying Aerospike directly.

If you think this story is worth reading, share it with your circle so your friends can read it too. If you have a question or another view, or if I wrote something wrong, just put a response below, or email me. Thank you.