Neptune

Revision #5 from 11/24/2019 7:11 p.m.

Amazon Neptune is a fully managed graph database service for working with highly connected datasets. It supports multiple graph models, including Property Graph and W3C's RDF, along with their respective query languages, Apache TinkerPop Gremlin and SPARQL. Neptune is highly available, with read-only replicas, point-in-time recovery, and continuous backup to Amazon S3.

History

Amazon Neptune was announced by Amazon Web Services on November 29, 2017, with a limited preview of the service. On May 30, 2018, Neptune became generally available. Because of use cases involving private data, Neptune became HIPAA eligible on September 12, 2018, and became compliant with the Payment Card Industry Data Security Standard (PCI DSS) on December 12, 2018.

Data Model

Graph Triplestore / RDF

Isolation Levels

Read Committed Snapshot Isolation

Read-only queries are evaluated under snapshot isolation. Mutation queries (i.e., write queries) are executed under read committed isolation.

Storage Architecture

Disk-oriented

Indexes

Not Supported

Neptune stores data as four-position (quad) elements called Neptune quads. A Neptune quad is composed of a subject, a predicate, an object, and a graph identifier. Each quad is an assertion about one or more resources. For example, an edge is described by a quad, and so is each property of a node. A graph is the set of quad statements that share the same graph identifier.
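As an illustration of the quad model described above, the following minimal Python sketch represents quads as named tuples. The class name `Quad`, the helper `graph_of`, and all identifiers such as `person:alice` and `g1` are hypothetical; this is not Neptune's internal encoding, only a way to see how an edge and a node property can both be written as quads.

```python
from typing import NamedTuple


class Quad(NamedTuple):
    """One four-position statement: subject, predicate, object, graph."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Hypothetical data: identifiers and values are made up for illustration.
quads = [
    Quad("person:alice", "name", "Alice", "g1"),        # a property of a node
    Quad("person:bob", "name", "Bob", "g1"),            # a property of a node
    Quad("person:alice", "knows", "person:bob", "g1"),  # an edge
    Quad("person:alice", "name", "Alice A.", "g2"),     # a statement in another graph
]


def graph_of(all_quads, graph_id):
    """Return the quads that share one graph identifier, i.e. one graph."""
    return [q for q in all_quads if q.graph == graph_id]
```

Calling `graph_of(quads, "g1")` returns the three statements that make up the hypothetical graph `g1`.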

Neptune maintains three indices:

SPOG - Key composed of Subject + Predicate + Object + Graph
POGS - Key composed of Predicate + Object + Graph + Subject
GPSO - Key composed of Graph + Predicate + Subject + Object

Amazon Neptune's documentation does not reveal what data structure is used to create these indexes.
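To show why three key orderings are useful, here is a minimal sketch that assumes each index is simply a sorted list of re-ordered quad keys. That data structure is an assumption made for illustration, since the actual structure is not documented; the names `build_index`, `prefix_scan`, and the sample quads are hypothetical as well.

```python
from bisect import bisect_left

# Hypothetical quads as (subject, predicate, object, graph) tuples.
QUADS = [
    ("alice", "knows", "bob", "g1"),
    ("alice", "name", "Alice", "g1"),
    ("bob", "knows", "carol", "g1"),
    ("bob", "name", "Bob", "g1"),
]

# Position of each component inside a quad tuple.
POS = {"S": 0, "P": 1, "O": 2, "G": 3}


def build_index(quads, order):
    """Re-key every quad in the given order (e.g. "POGS") and sort the keys."""
    return sorted(tuple(q[POS[c]] for c in order) for q in quads)


# Assumption: one sorted key list per index ordering.
INDEXES = {order: build_index(QUADS, order) for order in ("SPOG", "POGS", "GPSO")}


def prefix_scan(order, prefix):
    """Return every key in the chosen index that starts with the prefix tuple."""
    index = INDEXES[order]
    start = bisect_left(index, prefix)  # first key >= prefix
    matches = []
    for key in index[start:]:
        if key[:len(prefix)] != prefix:
            break  # keys are sorted, so no later key can match
        matches.append(key)
    return matches
```

Under this assumption, a lookup that knows the subject uses SPOG (`prefix_scan("SPOG", ("alice",))`), one that knows the predicate and object uses POGS, and one scoped to a graph and predicate uses GPSO. Each lookup becomes a contiguous range scan rather than a full scan.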

System Architecture

Shared-Disk

Foreign Keys

Supported

The edges/relationships in the graph are foreign keys.

Query Interface

SPARQL, Gremlin

Compression

Naïve (Page-Level)

Neptune supports compression of single files using the gzip format.

Concurrency Control

Multi-version Concurrency Control (MVCC)

Read-only queries are evaluated under snapshot isolation. That is, a read-only query operates on a single consistent snapshot of the database, taken at the moment the query begins. Snapshot isolation is achieved via multiversion concurrency control and guarantees that dirty reads, non-repeatable reads, and phantom reads do not occur. Read-only queries may be run on read replicas, in which case a small replication lag means the results can be slightly behind the latest committed state on the primary.
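The snapshot behavior described above can be sketched with a toy multiversion store. This is a simplified, single-process illustration and not Neptune's implementation; the classes `VersionedStore` and `Snapshot`, the integer commit timestamps, and the sample keys are all assumptions made for the example.

```python
class VersionedStore:
    """Toy multiversion store: each key keeps a list of (commit_ts, value)."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), oldest first
        self._clock = 0      # timestamp of the most recent commit

    def write(self, key, value):
        """Commit a new version of key and return its commit timestamp."""
        self._clock += 1
        self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def snapshot(self):
        """Begin a read-only query: fix the timestamp it will read at."""
        return Snapshot(self, self._clock)

    def read_at(self, key, ts):
        """Return the newest value of key committed at or before ts, else None."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None


class Snapshot:
    """A read-only view that never sees versions committed after it began."""

    def __init__(self, store, ts):
        self._store = store
        self._ts = ts

    def read(self, key):
        return self._store.read_at(key, self._ts)


store = VersionedStore()
store.write("alice.age", 30)
snap = store.snapshot()       # the read-only query begins here
store.write("alice.age", 31)  # later update, invisible to snap
store.write("bob.age", 40)    # later insert, also invisible to snap
```

In this sketch `snap.read("alice.age")` keeps returning the value that was committed when the snapshot was taken, and `snap.read("bob.age")` returns `None`, which mirrors the absence of non-repeatable and phantom reads. A snapshot started afterwards sees both later writes.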

For mutation queries (i.e., write queries), Neptune locks the records and ranges of records that it reads, so that concurrent transactions cannot change them while the query runs. This keeps the data consistent.
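A rough sketch of record and range locking, under heavy simplification: the toy lock table below tracks closed key ranges per transaction and rejects writes from other transactions into a locked range. The class `RangeLockTable`, its methods, and the integer keys are hypothetical; a real lock manager also handles lock modes, waiting, and deadlocks, none of which is modeled here.

```python
class RangeLockTable:
    """Toy lock table: each transaction holds closed key ranges [lo, hi]."""

    def __init__(self):
        self._locks = []  # list of (txn_id, lo, hi)

    def lock_range(self, txn_id, lo, hi):
        """Record that txn_id read every key in [lo, hi]; lo == hi locks one record."""
        self._locks.append((txn_id, lo, hi))

    def can_write(self, txn_id, key):
        """Allow a write only if no other transaction holds a range covering key."""
        return all(
            owner == txn_id or not (lo <= key <= hi)
            for owner, lo, hi in self._locks
        )

    def release(self, txn_id):
        """Drop all locks held by txn_id, e.g. when it commits or aborts."""
        self._locks = [lock for lock in self._locks if lock[0] != txn_id]
```

For example, after `lock_range("t1", 10, 20)`, a write by `"t2"` to key 15 is refused until `"t1"` releases its locks, while a write to key 25 is allowed. Locking the whole range, not just existing records, is what stops another transaction from inserting a new record into the range that was read.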

Revision #5 | Updated 11/24/2019 7:11 p.m.

Website

https://aws.amazon.com/neptune/

Tech Docs

https://docs.aws.amazon.com/neptune/latest/userguide/

Developer

Amazon

Country of Origin

Start Year

2018