GrapheekDB is a lightweight graph database that supports multiple back-end storage managers. It represents only directed graphs and is persistent when the chosen storage backend is a key/value store.
GrapheekDB was developed in 2014 by Raphaël Braud, a freelance developer from France. It was built for a recommendation system that extracts the contents of documents, tokenizes them, and recommends similar documents based on user queries. A graph database was chosen over a relational database to improve performance by avoiding multiple joins on tables of several million rows. Built specifically for document recommendation, it has a Python-like API (close to Django and Gremlin).
The Naive PageRank compression algorithm is listed as a to-do item in the source code but is not yet supported.
The DBMS is a multi-model document store. At present it can act as either a graph store or a key/value store (KVS). It supports several KVS backends, such as Kyoto Cabinet and Symas LMDB. When a KVS backend is used, stored objects are persistent. No strict constraints are imposed on data modelling.
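To illustrate how a key/value backend can make a graph persistent, here is a minimal sketch that writes nodes and directed edges as key/value pairs. It uses Python's standard `shelve` module as a stand-in for a backend such as Kyoto Cabinet or LMDB. The key layout (`n:` and `e:` prefixes) and the function names are assumptions made for illustration, not GrapheekDB's actual storage format or API.

```python
import os
import shelve
import tempfile


def save_graph(path, nodes, edges):
    """Write nodes and directed edges as key/value pairs (hypothetical layout)."""
    with shelve.open(path) as kv:
        for node_id, data in nodes.items():
            kv["n:" + node_id] = data
        for edge_id, (src, dst, data) in edges.items():
            kv["e:" + edge_id] = (src, dst, data)


def load_nodes(path):
    """Reopen the store and read every node record back."""
    with shelve.open(path) as kv:
        return {key[2:]: kv[key] for key in kv if key.startswith("n:")}


# Usage: the data survives closing and reopening the store.
store_path = os.path.join(tempfile.mkdtemp(), "graph_kv")
save_graph(
    store_path,
    nodes={"1": {"title": "first"}, "2": {"title": "second"}},
    edges={"1": ("1", "2", {"kind": "similar"})},
)
reloaded = load_nodes(store_path)
```

Because every record lives under its own key, a single node or edge can be read without loading the rest of the graph.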
A graph database is index-free in the sense that each element holds direct pointers to its adjacent elements (a property known as index-free adjacency), so GrapheekDB does not need an index to reach neighbouring nodes and edges. However, the latest version of the DBMS does support node and edge indices for lookups on sparse graphs. The current version supports only "exact match indices" and performs a depth-first search (DFS) to match indices. Storing the indices adds storage overhead and slows down writes.
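The trade-off of an exact match index can be sketched as follows. This is an illustrative hash-based index, not GrapheekDB's implementation (which, as noted above, matches indices with a DFS). The class and method names are hypothetical.

```python
from collections import defaultdict


class ExactMatchIndex:
    """Maps (field, value) pairs to the ids of nodes holding exactly that value."""

    def __init__(self, *fields):
        self.fields = fields
        self.entries = defaultdict(set)

    def on_write(self, node_id, data):
        # Extra work and storage on every write: one entry per indexed field.
        for field in self.fields:
            if field in data:
                self.entries[(field, data[field])].add(node_id)

    def lookup(self, field, value):
        # Exact match only: no range or prefix queries.
        return set(self.entries.get((field, value), ()))


nodes = {}
index = ExactMatchIndex("lang")
for node_id, data in enumerate([{"lang": "fr"}, {"lang": "en"}, {"lang": "fr"}]):
    nodes[node_id] = data
    index.on_write(node_id, data)

# Indexed lookup versus a full scan over every node.
french = index.lookup("lang", "fr")
scanned = {nid for nid, d in nodes.items() if d.get("lang") == "fr"}
```

Both approaches return the same node ids. The index avoids visiting every node at read time, at the cost of the bookkeeping in `on_write`.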
The DBMS was built with serializable execution in mind, so that the intended recommendation algorithm would not have to load the entire data set into memory each time it ran to produce the list of documents for a user query.
As a graph database, GrapheekDB does not need join operations, which are expensive; related elements are reached by following edges instead. The DBMS is also schemaless.
The query interface is close to those of Gremlin and Django. The DBMS has methods for lookups on graphs that resemble Django lookups, and methods for traversing paths through inner and outer vertices and edges that resemble Gremlin traversal methods. It also has aliasing and collecting methods, as well as aggregation methods such as count and sum, which are implemented using Python's entity iterators.
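The flavour of such an interface can be sketched with a toy in-memory graph. All names below (`Graph`, `Traversal`, `V`, `outV`, and the supported lookup suffixes) are assumptions chosen to mimic Django-style lookups and Gremlin-style steps; they are not taken from GrapheekDB's documentation.

```python
import operator

# Django-style lookup suffixes supported by this sketch.
_OPS = {"exact": operator.eq, "gt": operator.gt, "lt": operator.lt}


def _matches(data, lookups):
    """Return True if the property dict satisfies every field__op=value lookup."""
    for key, expected in lookups.items():
        field, _, op = key.partition("__")
        if field not in data or not _OPS[op or "exact"](data[field], expected):
            return False
    return True


class Traversal:
    def __init__(self, graph, ids):
        self.graph = graph
        self.ids = ids  # lazy iterator of node ids

    def outV(self):
        """Gremlin-style step: move to the nodes reached by outgoing edges."""
        nxt = (dst for nid in self.ids for dst in self.graph.out_edges[nid])
        return Traversal(self.graph, nxt)

    def count(self):
        return sum(1 for _ in self.ids)

    def sum(self, field):
        return sum(self.graph.nodes[nid].get(field, 0) for nid in self.ids)


class Graph:
    def __init__(self):
        self.nodes = {}      # node id -> property dict
        self.out_edges = {}  # node id -> list of destination node ids

    def add_node(self, node_id, **data):
        self.nodes[node_id] = data
        self.out_edges.setdefault(node_id, [])

    def add_edge(self, src, dst):
        self.out_edges[src].append(dst)

    def V(self, **lookups):
        """Start a traversal at the nodes matching the given lookups."""
        ids = (nid for nid, data in self.nodes.items() if _matches(data, lookups))
        return Traversal(self, ids)


g = Graph()
g.add_node("d1", words=120)
g.add_node("d2", words=300)
g.add_node("d3", words=80)
g.add_edge("d1", "d2")
g.add_edge("d1", "d3")
g.add_edge("d2", "d3")

long_docs = g.V(words__gt=100).count()
neighbour_words = g.V(words=120).outV().sum("words")
```

Each step wraps a generator, so nothing is evaluated until an aggregation such as `count` or `sum` consumes the iterator.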
The DBMS uses in-memory storage to store the graph.
GrapheekDB is a multi-model document store. Nodes and edges can carry related data, but this is not enforced.