CrateDB is an open-source, distributed SQL database system developed by Crate.io and built on top of a variety of open-source projects, including Facebook’s Presto SQL parser and Apache’s Lucene search library. The system is designed for high scalability and is written in Java, so it can run on any operating system with a Java 7 VM. CrateDB is typically used with machine-generated data, particularly for operational analytics applications. There is a free community edition under the Apache 2 license and an enterprise edition that includes premium features and support options. Crate.io also offers other products: CrateDB Cloud, CrateDB Cloud on Azure, and Crate IoT Data Platform.
CrateDB started as a standalone project by Jodok Batlogg (who previously contributed to Open Source Initiative Vorarlberg), Christian Lutz, and Bernd Dorn. The group had previously run a consulting business that helped companies use tools for their data needs, and turned that knowledge into a product. The team won Judge’s Choice at the GigaOm Structure Launchpad competition in June 2014 and TechCrunch Disrupt Europe in October 2014. The first version was released in September 2016 and was reportedly downloaded millions of times. The second version and the enterprise edition were released in May 2017.
Optimistic concurrency control is implemented using each row’s sequence number and primary term. A row’s sequence number starts at 0 and is incremented with every insert, update, or delete on its shard (a partition of the table). The primary term is incremented whenever a shard is promoted to primary. An update or delete must specify the row’s current sequence number and primary term; otherwise the statement has no effect. CrateDB does not support transactions.
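The version check described above can be illustrated with a small in-memory model. This is a minimal sketch and not CrateDB's implementation: the `Shard` and `Row` classes, their method names, and the sample key are hypothetical, and the sequence counter is simplified to one counter per shard. In CrateDB itself the two values are exposed as the system columns `_seq_no` and `_primary_term`.

```python
from dataclasses import dataclass, field


@dataclass
class Row:
    value: str
    seq_no: int
    primary_term: int


@dataclass
class Shard:
    """Toy model of a single shard (hypothetical, for illustration only)."""
    primary_term: int = 1
    next_seq_no: int = 0
    rows: dict = field(default_factory=dict)

    def _bump(self):
        # Hand out the next sequence number for a write on this shard.
        seq = self.next_seq_no
        self.next_seq_no += 1
        return seq

    def insert(self, key, value):
        self.rows[key] = Row(value, self._bump(), self.primary_term)
        return self.rows[key]

    def update(self, key, value, seq_no, primary_term):
        """Apply the write only if the caller read the latest row version."""
        row = self.rows.get(key)
        if row is None or (row.seq_no, row.primary_term) != (seq_no, primary_term):
            return 0  # stale version: the update has no effect
        self.rows[key] = Row(value, self._bump(), self.primary_term)
        return 1  # one row changed

    def promote(self):
        # A shard becoming primary starts a new primary term.
        self.primary_term += 1


shard = Shard()
row = shard.insert("sensor-1", "21.5")

# Two clients read the same version of the row, then both try to write.
seen = (row.seq_no, row.primary_term)
first = shard.update("sensor-1", "22.0", *seen)   # succeeds
second = shard.update("sensor-1", "19.0", *seen)  # stale, rejected
```

Only the first writer's value is stored; the second writer would have to re-read the row and retry with the new sequence number.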
CrateDB supports two join algorithms: nested loop joins and block hash joins. By default the system uses nested loop joins. Block hash joins can only be applied to inner joins whose condition meets the following criteria: it contains at least one equality operator, it contains no OR operators, and every argument of an equality operator references fields from only one relation. The hash join algorithm can be enabled or disabled explicitly. The system supports cross joins, inner joins, and left, right, and full outer joins, but performance is limited when joining two or more tables, which can result in poor execution plans.
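To make the difference between the two algorithms concrete, the following is a generic sketch of a block hash join next to a nested loop join. It illustrates the textbook techniques under simplifying assumptions (rows are dictionaries, the join is a single-column inner equi-join, and `right` can be iterated more than once); it is not CrateDB's code, and all function and field names are hypothetical.

```python
from collections import defaultdict
from itertools import islice


def block_hash_join(left, right, left_key, right_key, block_size=2):
    """Inner equi-join: hash one block of `left` at a time, then probe with `right`."""
    left_iter = iter(left)
    while True:
        block = list(islice(left_iter, block_size))
        if not block:
            break
        # Build phase: hash table over the current block only.
        table = defaultdict(list)
        for row in block:
            table[row[left_key]].append(row)
        # Probe phase: scan the other relation once per block.
        for r in right:
            for l in table.get(r[right_key], ()):
                yield {**l, **r}


def nested_loop_join(left, right, predicate):
    """Compare every pair of rows; works for any join condition."""
    for l in left:
        for r in right:
            if predicate(l, r):
                yield {**l, **r}


sensors = [
    {"sensor_id": 1, "name": "a"},
    {"sensor_id": 2, "name": "b"},
    {"sensor_id": 3, "name": "c"},
]
readings = [
    {"sid": 1, "temp": 20},
    {"sid": 3, "temp": 25},
    {"sid": 3, "temp": 26},
    {"sid": 4, "temp": 30},
]

hash_result = list(block_hash_join(sensors, readings, "sensor_id", "sid"))
loop_result = list(
    nested_loop_join(sensors, readings, lambda l, r: l["sensor_id"] == r["sid"])
)
```

The hash variant needs an equality condition to compute lookup keys, which is why it applies only to conditions like the ones listed above, while the nested loop accepts an arbitrary predicate.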
CrateDB supports creating, querying, and dropping views. Views are not materialized, so the query associated with a view is rerun every time the view is used. The enterprise version supports user privileges: to query a view, a user must have DQL privileges on the view, and the user who created the view must have DQL privileges on all the relations referenced in the view.
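The behavior of a non-materialized view can be demonstrated with any SQL engine that has the same semantics. The sketch below uses Python's built-in SQLite module purely as a stand-in for CrateDB, and the table and view names are hypothetical. Because the view stores only its query, a row inserted after the view was created shows up the next time the view is queried.

```python
import sqlite3

# SQLite is used here only as a stand-in; its views are also not materialized.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor TEXT, temp REAL)")
conn.execute("INSERT INTO readings VALUES ('a', 20.0), ('b', 35.0)")

# The view stores the query text, not its result.
conn.execute(
    "CREATE VIEW hot_readings AS "
    "SELECT sensor, temp FROM readings WHERE temp > 30"
)

before = conn.execute("SELECT sensor FROM hot_readings ORDER BY sensor").fetchall()

# A later insert into the base table is visible through the view,
# because the view's query is rerun on every use.
conn.execute("INSERT INTO readings VALUES ('c', 40.0)")
after = conn.execute("SELECT sensor FROM hot_readings ORDER BY sensor").fetchall()

conn.execute("DROP VIEW hot_readings")
```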