Manticore is a Faster Alternative to Elasticsearch in C++

Written by snikolaev | Published 2022/07/25
Tech Story Tags: elasticsearch | manticore | manticore-search | c++ | c-programming | learn-c-plus | c-basics | coding

TLDR: Manticore Search can be used as an alternative to Elasticsearch for both full-text search and (now) data analytics too. The project was built as a fork of an open source version of the once popular search engine Sphinx Search, whose main use case was indexing data from an external database. It is now being developed as an open-source alternative to Elasticsearch, focused on the functionality regular users need in everyday use today.

Five years ago Manticore began as a fork of an open source version of the once popular search engine Sphinx Search. We had two bags of grass, seventy-five pellets of mescaline, three C++ developers, a support engineer, a power user of Sphinx Search / backend team lead, an experienced manager, a mother of five helping us part-time, and a ton of bugs, crashes, and technical debt. So we got a shovel and other digging tools and started working to get it up to the search engine industry standards. Not that Sphinx was impossible to use, but many things were missing, and existing features weren't quite stable or mature. And we had pushed it about as far as we could. So after 5 years and hundreds of new users, we're ready to say that Manticore Search can be used as an alternative to Elasticsearch for both full-text search and (now) data analytics too.

In this article, I want to share a little of the project's history and show where Manticore stands against Elasticsearch today.

ā­ā­ā­ Your star onĀ GitHubĀ supports the project and makes us think we are on the right path!ā­ā­ā­

A little of the history

2001 - just Lucene and Sphinx

The first Apple store opened, Windows XP, iTunes and Mac OS X were released.

The genius Andrey Aksyonoff started working on Sphinx Search, for which I want to thank him very much! There was no SOLR or Elasticsearch yet, but there was already Lucene, on which they were both subsequently built. Sphinx Search started slowly coming together and in a few years became quite a popular technology, used by thousands of websites.

2010 - Elasticsearch appeared

Retina display, systemd, iPad, and Elasticsearch appeared.

By this time Sphinx was already a popular full-text search engine, but Sphinx's concept of "source data has to be somewhere and we just make a full-text index that needs to be rebuilt regularly" was not as interesting as Elasticsearch's "give me any JSON via HTTP in real-time, I will find a node to place it on". SOLR wasn't very good with data distribution, and JSON was gaining popularity, while XML was losing its attraction. Soon Elasticsearch started to rapidly gain popularity.

2017 - Manticore appeared

  • Elastic had firmly established itself as a standard tool for full-text search and log and data analytics.
  • Sphinx ceased its development as an open source project. Development, in general, slowed down significantly, and for some time was completely suspended.
  • Many Sphinx users who loved it and knew how to deal with it were not pleased about this and it was painful for them to migrate to Elasticsearch. In addition, by then, Elasticsearch's conceptual flaws had surfaced: excessive memory consumption, difficulty in maintaining large clusters, and some performance issues.

As a result, the frustrated users and some former Sphinx developers teamed up and built a fork - Manticore Search. Our primary goals were as follows:

  • Continue developing the project as open source
  • Look at everything from just a regular everyday normal user's point of view and add the functionality they need in today's environment
  • Strengthen Sphinx's strong sides and eliminate obvious weaknesses

2022: Five more years later

"Okay, who wants to find out if this thing works?"


šŸ™ Sphinx 2: The main use case is indexing data from an external database: Sphinx returns id, then by id you have to go to the database and search there for the source document. The data schema can only be declared in the config.

āœ… Manticore: The basic way to work with it is exactly the same as in MySQL / Postgres and Elasticsearch: aĀ table can be created on the fly, data can be modified by a single/bulk INSERT/REPLACE/DELETE query, the data gets automatically compacted in the background. There is no need to look up the original document in an external source. Auto ID supported.
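
To illustrate, here is a minimal sketch of that workflow (table and column names are made up for the example; see the Manticore docs for the exact data types and options):

```sql
-- create a table on the fly, no config editing or index rebuild required
CREATE TABLE products(title TEXT, price FLOAT);

-- single insert; the id is generated automatically
INSERT INTO products(title, price) VALUES ('crossbody bag with tassel', 19.85);

-- bulk insert
INSERT INTO products(title, price) VALUES
  ('microfiber sheet set', 19.99),
  ('pet hair remover glove', 7.99);

-- replace a document by id and delete by condition;
-- compaction happens automatically in the background
REPLACE INTO products(id, title, price) VALUES (1, 'travel pillow', 9.99);
DELETE FROM products WHERE price < 10;
```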


šŸ™ Sphinx 2: No replication.

āœ… Manticore:Ā Replication based on Galera, which is also used by Mariadb and Percona Server.


šŸ™ Sphinx 2: Queries can be done via SQL (MySQL wire protocol) or Sphinx binary protocol, there are clients for a few programming languages.

āœ… Manticore: AddedĀ JSON interface very similar to Elasticsearchā€™s. Based on the new protocol,Ā new clients for PHP, Python, Java, Javascript, and ElixirĀ were built. The clients are generated automatically, making new functionality available in the client sooner after it appears in the engine.


šŸ™ Sphinx 2: Difficult to configure text tokenization for most languages

āœ… Manticore: Simplified: madeĀ aliasesĀ cjkĀ andĀ non_cjk. Made tokenization of Chinese based on ICU. Added many new stemmers, including Ukrainian.


šŸ™ Sphinx 2: No official docker image and no support in the Kubernetes ecosystem

āœ… Manticore: Made and supportĀ official dockerĀ andĀ Helm chartĀ for Kubernetes


šŸ™ Sphinx 2: No APT/YUM/Homebrew repositories

āœ… Manticore: AddedĀ APT/YUM/HomebrewĀ repositories.Ā Nightly builds are also availableĀ in theĀ development repository. Each new commit becomes available as a package.


šŸ™ Sphinx 2: Novice users had a hard time understanding whatā€™s what.

āœ… Manticore: MadeĀ platform with interactive coursesĀ ā€”Ā https://play.manticoresearch.com/


šŸ™ Sphinx 2: Few examples in the documentation

āœ… Manticore:Ā rewrote documentation, made our own rendering engine for it -Ā https://manual.manticoresearch.com/. Itā€™s also available in a simpleĀ markdown formatĀ for contributions and easy editing.


šŸ™ Sphinx 2: Bugs, that often lead to crashes

āœ… Manticore:Ā Crashes are now rare. Hundreds of old bugs have been fixed.


šŸ™ Sphinx 2: Running search queries in parallel is limited

āœ… Manticore: Migrated toĀ coroutines. Made itĀ possible to parallelize any search query, so as to fully load the CPU and reduce the response time to a minimum


šŸ™ Sphinx 2: Cannot be used without full-text fields

āœ… Manticore: Can be usedĀ without full-text, like any other database.


šŸ™ Sphinx 2: Non-full-text data is stored row-wise, it must be in memory to work efficiently.

āœ… Manticore: Implemented and open-sourcedĀ Manticore Columnar Library, an external fully independent library that allows storing data column-oriented in blocks with support for different codecs for compressing different types of data efficiently. Requires almost no memory. You can now handle much larger amounts of data on the same server.
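
As a sketch of how this looks in practice (assuming the columnar library is installed; table and column names are illustrative), the storage engine can be chosen per table or even per column:

```sql
-- store all attributes of the table column-wise
CREATE TABLE logs(message TEXT, status INT, took FLOAT) engine='columnar';

-- or mix storages: a row-wise table with one columnar attribute
CREATE TABLE products(title TEXT, price FLOAT engine='columnar');
```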


šŸ™ Sphinx 2: No secondary indexes

āœ… Manticore: The second important functionality of Manticore Columnar LIbrary isĀ support for secondary indexesĀ based on the modern and innovativeĀ PGM algorithm.


šŸ™ Sphinx 2: No percolate indexes for reverse search (when there are queries in the index and documents are used as input to find out which queries would match them)

āœ… Manticore: AddedĀ percolate type indexes.

This is only about a third of the changes - the ones you can easily see. On top of that, there have been many months of refactoring different parts of the system, resulting in much simpler, more reliable, and better-performing code. We hope this will attract new developers to the project.

What about Elasticsearch?

Elasticsearch is fine: it's not very hard to use up to a certain amount of data, there's replication, fault tolerance, and rich functionality. But there are nuances.

Let's take a look at those nuances and at what Manticore is like compared to Elasticsearch now (July 2022). Future reader, we've already bolted something else on, check out our Changelog.

Search Speed

Performance, namely low response time, is important in many cases, especially in log and data analytics, when there is a lot of data and not many search queries. You don't want to wait 30 seconds instead of two for a response, do you? So here's the nuance: Elasticsearch is considered a standard for log management, but, for example, it can't effectively parallelize a query to a single index shard. And Elasticsearch has only 1 shard by default, while there are many more CPU cores in modern servers. Making too many shards is also bad. None of this makes life easier for a DevOps engineer who cares about response time: you have to think about what hardware Elasticsearch will run on and tune the sharding accordingly.

Manticore, on the contrary, is able to parallelize the search query to all CPU cores unconditionally and by default. It would be more correct to say that Manticore itself decides when to parallelize and when not to, but in most cases it does, which allows you to efficiently load the CPU cores (often idle in log and data analytics cases) and significantly reduce response time.

But even if you make as many shards in Elasticsearch as there are CPU cores on the server, Manticore turns out to be significantly faster. Specifically, here's a test on 1.7 billion documents, which shows that overall Manticore is 4 times faster than Elasticsearch. If you are interested in the details or want to reproduce it on your own hardware, here is the article: https://db-benchmarks.com/test-taxi/ (all examples below are also backed by scripts, links, etc.; you won't find any idle talk in this blog).

Here is a different case: no big data, just 1.1 million comments from Hacker News. In this test, Manticore is 15x faster than Elasticsearch. All the details here.

And another test, indicative of Elasticsearch's role as a standard log analytics tool - 10 million Nginx logs and various quite realistic analytical queries - Manticore is 22 times faster than Elasticsearch here. All the details here.

Data ingestion performance

There are also nuances with Elasticsearch's write speed. For example, the dataset for the 1.7 billion-document test discussed above was loaded:

  • to Elasticsearch - in 28 hours and 33 minutes
  • to Manticore Search - in 1 hour and 8 minutes

This was on a 32-core server with SSD. The amounts of data after indexing are about the same. To learn more about how exactly the load was handled, read here.

In brief:

  • Source - csv
  • Logstash was used to put data to Elasticsearch with PIPELINE_BATCH_SIZE=10000 and PIPELINE_WORKERS=32 into 32 shards.
  • Manticore Search used the built-in tool indexer to put data into 32 shards in parallel.

Here is the log of the data loading to Elasticsearch and Manticore: https://gist.github.com/sanikolaev/678dd862a7668921e3417321be0a2513

It turns out that in this test Manticore is 25 times faster in terms of data ingestion. Maybe I don't know how to cook Logstash and Elasticsearch, but the import of the same dataset (of a slightly smaller size) took Mark Litwintschik even longer - 4 days and 16 hours.

Maybe the problem is in Logstash, not Elasticsearch? Let's find out by writing directly to Elasticsearch. The index schema is as follows:

"properties": {
  "name": {"type": "text"},
  "email": {"type": "keyword"},
  "description": {"type": "text"},
  "age": {"type": "integer"},
  "active": {"type": "integer"}
}

Starting Manticore and Elasticsearch using their official docker images like this:

docker run --name manticore --rm -p 9308:9308 -v $(pwd)/manticore_idx:/var/lib/manticore manticoresearch/manticore:5.0.2

docker run --name elasticsearch --rm -p 9200:9200 -e discovery.type=single-node -e xpack.security.enabled=false -v $(pwd)/es_idx/:/usr/share/elasticsearch/data docker.elastic.co/elasticsearch/elasticsearch:8.3.2

Let's now put 50 million random docs like this to both:

{
  "active": 1,
  "age": 84,
  "description": "Aut corporis qui necessitatibus architecto est. Harum laboriosam temporibus praesentium quis et nulla. Consequuntur quia neque et repellat.",
  "email": "[email protected]",
  "name": "Keely Doyle Sr."
}

We'll use simple php scripts with a batch size of 10,000 and concurrency of 32 (there are 16 physical CPU cores on the server plus hyper-threading).

root@perf3 ~ # php load_elasticsearch.php 10000 32 1000000 50
preparing...
found in cache
querying...
finished inserting
Total time: 178.24096798897
280519 docs per sec

root@perf3 ~ # php load_manticore.php 10000 32 1000000 50
preparing...
found in cache
querying...
finished inserting
Total time: 215.7572619915
231742 docs per sec

OK, now Elastic is 21% faster, but again there is an interesting nuance: Elasticsearch by default buffers new documents for one second, which means the last batch will not be available for searching right away. This is OK in many cases, but to make things fair let's do /_bulk?refresh=1 in Elasticsearch and see what it gives:

root@perf3 ~ # php load_elasticsearch.php 10000 32 1000000 50
preparing...
found in cache
querying...
finished inserting
Total time: 307.47588610649
162614 docs per sec

In this case Manticore is again faster by 43%.

If we want to test the maximum performance, we can:

  • Use sharding in both Elasticsearch and Manticore
  • Let Elasticsearch buffer incoming documents at maximum
  • Use the MySQL interface to put data into Manticore Search (it's slightly faster)
  • Disable binlog in Manticore Search (unfortunately, you can't do that in Elasticsearch)

Here's what it gives:

Manticore:

// docker run -p9306:9306 --name manticore --rm -v $(pwd)/manticore_idx:/var/lib/manticore -e searchd_binlog_path= manticoresearch/manticore:5.0.2

root@perf3 ~ # php load_manticore_sharded.php 10000 32 1000000 32 50
preparing...
found in cache /tmp/bc9719fb0d26e18fc53d6d5aaaf847b4_10000_1000000
querying...
finished inserting
Total time: 55.874907970428
894856 docs per sec

Elasticsearch:

root@perf3 ~ # php load_elasticsearch_sharded.php 10000 32 1000000 32 50
preparing...
found in cache
querying...
finished inserting
Total time: 119.96515393257
416788 docs per sec

But, remember the nuance: you have to spend another 13 seconds to make the documents searchable:

root@perf3 ~ # curl -s -X POST "localhost:9200/_sql?format=json&pretty" -H 'Content-Type: application/json' -d'{"query": "select count(*) from user"}'
{
  "columns" : [
    {
      "name" : "count(*)",
      "type" : "long"
    }
  ],
  "rows" : [
    [
      0
    ]
  ]
}

root@perf3 ~ # time curl -XPOST "localhost:9200/user/_refresh"
{"_shards":{"total":64,"successful":32,"failed":0}}
real    0m13.505s
user    0m0.003s
sys     0m0.000s

root@perf3 ~ # curl -s -X POST "localhost:9200/_sql?format=json&pretty" -H 'Content-Type: application/json' -d'{"query": "select count(*) from user"}'
{
  "columns" : [
    {
      "name" : "count(*)",
      "type" : "long"
    }
  ],
  "rows" : [
    [
      50000000
    ]
  ]
}

All in all, Manticore is 2x faster than Elasticsearch in terms of data ingestion performance. And the data is searchable immediately after the batch is loaded, not 2 minutes later. The scripts used for this test can be found here.

What it's written in

  • Elasticsearch itself is written in Java, and the Lucene library it uses and depends on is also written in Java.
  • Manticore is written in C++. What it gives:
    • The code is harder to write, yes.
    • But we are closer to the hardware, so we can write more optimized code.
    • No need to think about JVM heap size.
    • There is no risk of the JVM garbage collector starting GC at an inappropriate moment, which can greatly affect performance.
    • No need to start a heavy JVM, which takes quite some time.

Open source

  • Elasticsearch is not pure open source anymore. The license was changed from Apache 2 to the Elastic License in 2021.
  • Manticore is purely open source, with the GPLv2 license for the daemon and the Apache 2 license for the columnar library.

JSON vs SQL

Both Elasticsearch and Manticore can do both SQL and JSON, but the difference is:

  • Elasticsearch is based on JSON by default while Manticore is SQL-first. What we love about SQL is that many things are much easier to do with it at the proof of concept stage. For example, here are 2 queries that do the same thing. Do you want to spend a minute counting { and } brackets or ...?

  • SQL is very limited in Elasticsearch, for example:
    • you can't do SELECT id
    • you can't INSERT/UPDATE/DELETE
    • you can't run service commands (create cluster, see status, etc.).
  • In Manticore it's the other way around:
    • You can do everything via SQL
    • JSON covers only basic functionality: search and data modification queries.
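
As a sketch of the SQL-vs-JSON difference (table and field names are made up), the same full-text search in the SQL dialect is one short line:

```sql
SELECT * FROM products WHERE MATCH('remove hair') LIMIT 5;
```

while the equivalent via the Elasticsearch-like JSON HTTP API would look roughly like this (treat the exact endpoint and payload shape as an illustration, to be checked against the manual):

```json
POST /search
{
  "index": "products",
  "query": {
    "match": { "*": "remove hair" }
  },
  "limit": 5
}
```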

Startup time

In some cases, you need to be able to launch a service quickly. For example, in IoT (Internet of things) or some ETL scenarios.

Near-real-time vs real-time


As mentioned above, by default, when you put data into Elasticsearch, it becomes searchable only after a second. This can be adjusted, but then the ingestion rate becomes significantly slower, as you can see above.

Manticore always works in real-time mode.

Full-text search

Probably worth another article to explain it all. In short: both Manticore and Elasticsearch are good in terms of full-text search and have a lot in common, but there are a lot of differences, too. According to these objective tests (which is important when evaluating relevance), on almost default settings Manticore can give higher relevance than Elasticsearch. Here is the relevant pull request in BEIR (an information retrieval benchmark).

Aggregations

Both Manticore and Elasticsearch provide rich aggregation functionality. You probably know what Elasticsearch can do; here's what can be done in Manticore, for you to compare:

  • Just grouping: SELECT release_year FROM films GROUP BY release_year LIMIT 5

  • Get aggregates: SELECT release_year, AVG(rental_rate) FROM films GROUP BY release_year LIMIT 5

  • Sort buckets: SELECT release_year, count(*) from films GROUP BY release_year ORDER BY release_year asc limit 5

  • Group by multiple fields at the same time: SELECT category_id, release_year, count(*) FROM films GROUP BY category_id, release_year ORDER BY category_id ASC, release_year ASC

  • Get N records from each bucket, not 1: SELECT release_year, title FROM films GROUP 2 BY release_year ORDER BY release_year DESC LIMIT 6

  • Sort inside a bucket: SELECT release_year, title, rental_rate FROM films GROUP BY release_year WITHIN GROUP ORDER BY rental_rate DESC ORDER BY release_year DESC LIMIT 5

  • Filter buckets: SELECT release_year, avg(rental_rate) avg FROM films GROUP BY release_year HAVING avg > 3

  • Use GROUPBY() to access the aggregation key: SELECT release_year, count(*) FROM films GROUP BY release_year HAVING GROUPBY() IN (2000, 2002)

  • Group by array value: SELECT groupby() gb, count(*) FROM shoes GROUP BY sizes ORDER BY gb asc

  • Group by json node: SELECT groupby() color, count(*) from products GROUP BY meta.color

  • Get count of distinct values: SELECT major, count(*), count(distinct age) FROM students GROUP BY major

  • Use GROUP_CONCAT(): SELECT major, count(*), count(distinct age), group_concat(age) FROM students GROUP BY major

  • Use FACET after your main query and it will group the main query's results: SELECT *, price AS aprice FROM facetdemo LIMIT 10 FACET price LIMIT 10 FACET brand_id LIMIT 5

  • Faceting by aggregation over another attribute: SELECT * FROM facetdemo FACET brand_name by brand_id

  • Faceting without duplicates: SELECT brand_name, property FROM facetdemo FACET brand_name distinct property

  • Facet over expressions: SELECT * FROM facetdemo FACET INTERVAL(price,200,400,600,800) AS price_range

  • Facet over multi-level grouping: SELECT *,INTERVAL(price,200,400,600,800) AS price_range FROM facetdemo FACET price_range AS price_range, brand_name ORDER BY brand_name asc;

  • Sorting of facet results:

    SELECT * FROM facetdemo
    FACET brand_name BY brand_id ORDER BY FACET() ASC
    FACET brand_name BY brand_id ORDER BY brand_name ASC
    FACET brand_name BY brand_id ORDER BY COUNT(*) DESC
    
  • Pagination in facet results:

    SELECT * FROM facetdemo
    FACET brand_name BY brand_id ORDER BY FACET() ASC  LIMIT 0,1
    FACET brand_name BY brand_id ORDER BY brand_name ASC LIMIT 2,4
    FACET brand_name BY brand_id ORDER BY COUNT(*) DESC LIMIT 4;
    

Schemaless

Elasticsearch is famous for the fact that you can write anything into it. With Manticore Search, you have to create a schema beforehand. Many Elasticsearch experts recommend using static mapping, for example, https://octoperf.com/blog/2018/09/21/optimizing-elasticsearch/#index-mapping:

One of the very first things you can do is to define your indice mapping statically.


But we find dynamic mapping important in the area of log management and analysis. Since we want Manticore to be easy to use for that, we have plans to enable dynamic mapping in Manticore, too.

Integrations

  • Both Elasticsearch and Manticore have clients for different programming languages.
  • MySQL wire protocol support:
    • An important advantage of Manticore is the possibility to use MySQL clients to work with the server. Even if there is no official Manticore client for some language, there is definitely a MySQL client you can use. Using the command line MySQL client for administration is more convenient than using curl, because the commands are much more compact and the session is supported.
    • The support for the MySQL protocol has also made it possible to support the MySQL/MariaDB FEDERATED engine for tight integration between those and Manticore.
    • In addition, Manticore can be used via ProxySQL.
  • HTTP JSON API is supported in both Elasticsearch and Manticore.
  • Logstash, Kibana: Manticore supports Kibana, but it's a work in progress and in a beta stage. We'll get those integrations up to speed soon. This is how you can try Manticore with Kibana:
# download manticore beta version with support for Kibana, check https://repo.manticoresearch.com/repository/kibana_beta/ for different OS versions
wget https://repo.manticoresearch.com/repository/kibana_beta/ubuntu/jammy.zip

# unarchive it
unzip jammy.zip

# install the packages
dpkg -i build/*

# switch Manticore to the mode supporting Kibana
mysql -P9306 -h0 -e "set global log_management = 0; set global log_management = 1;"

# start Kibana pointing it to Manticore Search instance listening on port 9308
docker run -d --name kibana --rm -e ELASTICSEARCH_HOSTS=http://127.0.0.1:9308 -p 5601:5601 --network=host docker.elastic.co/kibana/kibana:7.4.2

# install php and composer, download loading script and put into Manticore 1 million docs of fake users
apt install php composer php8.1-mysql
wget https://gist.githubusercontent.com/sanikolaev/13bf61bbe6c39350bded7c577216435f/raw/8d8029c0d99998c901973fd9ac66a6fb920deda7/load_manticore_sharded.php
composer require fakerphp/faker
php load_manticore_sharded.php 10000 16 1000000 16 1

# don't forget to create an index pattern in Kibana (user*)

# run `docker stop kibana` to stop the Kibana server

If all went well, you should see your data in Kibana.


Replication

  • Both Elasticsearch and Manticore Search use synchronous replication. At Manticore we decided not to reinvent the wheel and made an integration with the Galera library, which is also used by MariaDB and Percona XtraDB Cluster.
  • An important difference in managing replication and clustering between Manticore and Elasticsearch is that with Elasticsearch you need to edit the config to set up a replica, while in Manticore you don't: replication is always enabled, and it's very easy to connect to and sync up with another node.
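
Setting up replication can be sketched like this (cluster, table, and host names are hypothetical; check the replication section of the manual for the exact syntax):

```sql
-- on node 1: create a cluster and add an existing table to it
CREATE CLUSTER posts;
ALTER CLUSTER posts ADD products;

-- on node 2: join the existing cluster; the node syncs up automatically
JOIN CLUSTER posts AT '10.12.1.35:9312';

-- write through the cluster so the change is replicated to all nodes
INSERT INTO posts:products(title) VALUES ('hello from node 2');
```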


Sharding and distributed indexes

Unlike Elasticsearch, Manticore does not yet have automatic sharding, but combining multiple indexes into one for manual sharding is easier than in Elasticsearch.

Adding an index located on a remote node is also supported, just specify the remote host, port, and index name.
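
A manually sharded setup can be sketched like this (shard and host names are made up; the distributed table syntax follows the Manticore manual):

```sql
-- combine two local shards and one remote shard into a single distributed table
CREATE TABLE products_dist type='distributed'
  local='products_shard_0' local='products_shard_1'
  agent='host2:9312:products_shard_2';

-- then query it like a regular table
SELECT * FROM products_dist WHERE MATCH('bag') LIMIT 10;
```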

Ease of use and learning

Our thinking is that we don't want our users, be they developers or DevOps engineers, to become experts in databases or search engines, or to need a PhD to be able to use Manticore products. We assume you have other things to do rather than spending hours trying to understand how this or that setting affects this or that functionality. Hence, Manticore Search should work fine in most cases even on defaults.


Our ultimate goal is to make Manticore Search as easy to use and learn as possible.

  • As mentioned previously, Manticore is SQL-first, which we find important while you are just getting started with Manticore compared to Elasticsearch.
  • Manticore provides interactive courses - play.manticoresearch.com - to walk you through the essential steps to get familiar with Manticore.
  • There is a guide on how to get started with examples for different OSes and programming languages - https://manual.manticoresearch.com/Quick_start_guide .
  • You can talk directly to the developers in public channels: Slack, Telegram, Forum.
  • We have a special short domain mnt.cr integrated with the documentation, so that mnt.cr/<keyword> takes you to the search results in the documentation in a special mode - it immediately rewinds to the most relevant section. This is especially handy when you need to recall the details of some setting, e.g. mnt.cr/max_packet_size.

Cloud native

Imperative and declarative usage modes

In Elasticsearch, most things are only done through the API. There is no way (anymore) to add mappings to a configuration file so that they are available immediately after startup.

Manticore, like Kubernetes, supports two usage modes:

  • Imperative: when everything can be managed online using CREATE TABLE/DROP TABLE/ALTER TABLE, CREATE CLUSTER/JOIN CLUSTER/DELETE CLUSTER, etc.
  • Declarative: when you can define mappings in a configuration file, which gives greater portability and easier integration of Manticore into CI/CD, ETL, and other processes.
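
A declarative definition might look roughly like this fragment of manticore.conf (a sketch; the table name is made up and the exact directive names should be checked against the manual):

```
table products {
    type = rt
    path = /var/lib/manticore/products
    rt_field = title
    rt_attr_float = price
}
```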

Percolate

Percolate, or Persistent Query, is when a table contains queries, not documents, and the search is performed on documents, not queries. The search results are the queries that match the documents. This type of search is useful for users' subscriptions: if you subscribed, for example, to the query TV > 42 inches, then as soon as such an item appears on the site, you will be notified about it. Manticore provides this functionality as well as Elasticsearch does. According to the tests we did a few years ago, the throughput of this type of search in Manticore is significantly higher than in Elasticsearch.
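
In Manticore this looks roughly as follows (table and field names are illustrative, and the exact CALL PQ options should be checked against the percolate docs):

```sql
-- a percolate table stores queries instead of documents
CREATE TABLE alerts(title TEXT, size INT) type='pq';

-- a user subscribes to "tv" bigger than 42 inches
INSERT INTO alerts(query, filters) VALUES ('tv', 'size>42');

-- a new document arrives: find out which stored queries match it
CALL PQ('alerts', '{"title": "fancy new tv", "size": 50}', 1 AS docs_json);
```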

Whatā€™s next?

We are now developing the project in the following directions:

  • Drop-in replacement for Elasticsearch in the ELK stack, so Kibana and Logstash (or the OpenSearch alternatives) can work fine with it. We want the low latency that's easier to achieve with Manticore to be available to people for log analysis. We already have a beta.
  • Schemaless mode, so when you use Manticore as a log analysis solution you don't have to think about the schemas.
  • Automatic sharding and orchestration of shards, so you can load data into Manticore even faster and the shards will be spread out in an optimal order for better fault tolerance.
  • Further performance optimizations. We just want even lower latency and higher throughput, so you can run Manticore on cheaper hardware and make the Earth greener.

Conclusions

So, at the end of it all, what do we have? Manticore may now be of interest to those:

  • Who cares about low response times on both small and large amounts of data,
  • Who likes SQL,
  • Who wants something simpler than Elasticsearch to integrate search into their application faster,
  • Who wants something more lightweight which starts fast,
  • Who cares about using purely open source software.

We are continuing!

ā­ā­ā­ Your star onĀ GitHubĀ supports the project and makes us think we are on the right path!ā­ā­ā­



Written by snikolaev | Database expert. Passionate about databases and search engines.
Published by HackerNoon on 2022/07/25