PrestoDB

Revision #12, updated 11/24/2019 7:02 p.m.

Presto is an open source distributed SQL query engine for running interactive analytic queries against heterogeneous data sources. Facebook open-sourced it in 2013. Although it is also known as PrestoDB, Presto is not a general-purpose database management system (DBMS) and does not manage the storage of data. Instead, it is a query engine that queries data where it lives, including Hive, Cassandra, Kafka, and relational databases, and a single Presto query can combine data from multiple sources.

Presto was designed, built, and optimized for interactive queries. Both Presto and Hive support SQL queries against HDFS, but Presto targets interactive queries while Hive is better suited to batch processing. Presto supports ANSI-compatible SQL statements.
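To make the idea of one query spanning several sources concrete, here is a minimal sketch that assembles such a query as a string. Presto addresses tables as catalog.schema.table, so joining across catalogs is what combines the sources. The catalog names (hive, mysql), the schemas, the tables, and the columns are all hypothetical and depend on how a given cluster is configured.

```python
def qualified_name(catalog: str, schema: str, table: str) -> str:
    """Return a fully qualified Presto table name: catalog.schema.table."""
    return f"{catalog}.{schema}.{table}"


# Hypothetical catalogs, schemas, and tables, chosen only for illustration.
orders = qualified_name("hive", "sales", "orders")
customers = qualified_name("mysql", "crm", "customers")

# One SQL statement that joins a Hive-backed table with a MySQL-backed table.
federated_query = (
    "SELECT c.name, sum(o.total) AS revenue\n"
    f"FROM {orders} AS o\n"
    f"JOIN {customers} AS c ON o.customer_id = c.id\n"
    "GROUP BY c.name"
)

print(federated_query)
```

The resulting string would be submitted through whatever Presto client is in use; the sketch only shows how the two sources appear side by side in a single statement.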

History

Presto started out as a project at Facebook. Before Presto, Facebook primarily used Hive, which was not optimized for high-speed interactive queries. Development of Presto began in 2012, and it was rolled out across the company in early 2013. Facebook open-sourced Presto in November 2013 under the Apache License, Version 2.0.

Isolation Levels

Read Uncommitted, Read Committed, Serializable, Repeatable Read

Presto supports four isolation levels. The isolation level is specified when a transaction is started.
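As a rough illustration of choosing the level at transaction start, the sketch below builds a START TRANSACTION statement for one of the four levels. The helper name start_transaction and its read_only flag are assumptions made for this example; only the generated SQL text follows Presto's statement syntax.

```python
ISOLATION_LEVELS = (
    "READ UNCOMMITTED",
    "READ COMMITTED",
    "REPEATABLE READ",
    "SERIALIZABLE",
)


def start_transaction(isolation_level: str, read_only: bool = False) -> str:
    """Build a START TRANSACTION statement (hypothetical helper)."""
    # Normalize case and whitespace so "read  committed" is accepted.
    level = " ".join(isolation_level.upper().split())
    if level not in ISOLATION_LEVELS:
        raise ValueError(f"unsupported isolation level: {isolation_level!r}")
    access = "READ ONLY" if read_only else "READ WRITE"
    return f"START TRANSACTION ISOLATION LEVEL {level}, {access}"


print(start_transaction("serializable"))
print(start_transaction("read committed", read_only=True))
```

The transaction is then ended with COMMIT or ROLLBACK; whether a given connector honors the requested level depends on that connector.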

Storage Architecture

In-Memory

Indexes

Not Supported

Presto does not support indexes. All intermediate processing and storage are done in memory to avoid unnecessary I/O overhead.

Checkpoints

Not Supported

As of late 2018, Presto does not support checkpointing or any other form of fault tolerance. If a failure occurs, the client has to rerun the entire query.
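Because recovery is left to the client, a caller might wrap query submission in a simple retry loop. The sketch below is one way to do that, not part of Presto itself: run_query stands in for whatever function submits the query through a client library, and the attempt count and backoff values are arbitrary assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(
    run_query: Callable[[], T],
    max_attempts: int = 3,
    backoff_seconds: float = 1.0,
) -> T:
    """Rerun the whole query from scratch after each failure.

    run_query is a hypothetical zero-argument callable that submits the
    query and returns its result, raising an exception on failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        try:
            return run_query()
        except Exception:
            # No partial results can be resumed, so give up or start over.
            if attempt == max_attempts:
                raise
            time.sleep(backoff_seconds * attempt)
    raise AssertionError("unreachable")
```

Blind retries of this kind are only reasonable for read-only queries; a statement that writes data may need extra care before it is rerun, and a real client would likely catch a narrower exception type.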

Presto follows the ANSI SQL specification. To improve usability, it also supports anonymous functions (lambda expressions) and built-in higher-order functions, including transform, filter, and reduce.
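For readers unfamiliar with these functions, the sketch below mirrors their behavior in Python, with the corresponding Presto expression shown in a comment above each call. The helper names presto_transform, presto_filter, and presto_reduce are made up for this example, and the sketch ignores SQL NULL handling.

```python
from functools import reduce


def presto_transform(array, fn):
    """Apply fn to every element, like transform(array, x -> ...)."""
    return [fn(x) for x in array]


def presto_filter(array, predicate):
    """Keep elements for which predicate is true, like filter(array, x -> ...)."""
    return [x for x in array if predicate(x)]


def presto_reduce(array, initial, input_fn, output_fn):
    """Fold the array into a state, then map the final state to a result."""
    return output_fn(reduce(input_fn, array, initial))


# SELECT transform(ARRAY[5, 20, 50], x -> x * 2)
print(presto_transform([5, 20, 50], lambda x: x * 2))  # [10, 40, 100]

# SELECT filter(ARRAY[5, -6, 7], x -> x > 0)
print(presto_filter([5, -6, 7], lambda x: x > 0))  # [5, 7]

# SELECT reduce(ARRAY[5, 20, 50], 0, (s, x) -> s + x, s -> s)
print(presto_reduce([5, 20, 50], 0, lambda s, x: s + x, lambda s: s))  # 75
```

Note that reduce in Presto takes two lambdas: one that folds each element into the running state and one that converts the final state into the returned value.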
