DBDB.io The Encyclopedia of Database Systems · Est. 2017
Database of Databases

Database Entry

DeepDB


DeepDB (also sometimes called DeepSQL near the end of the project) was a proprietary MySQL storage engine designed for both OLAP and OLTP workloads, and intended as an alternative to MySQL's InnoDB storage engine. To scale MySQL to large data volumes, it uses adaptive data structures and machine learning algorithms to optimize transactional workloads at big-data scale. Unlike classic B+Tree and LSM-Tree based storage engines, DeepDB is built on a new tree structure called CASSI (Continuous Adaptive Sequential Summarization of Info), which dynamically reconfigures the database at runtime to adapt to the hardware it is deployed on. CASSI continuously cycles through three steps: analysis, adaptation, and optimization. As a result, the storage engine allows enterprises to run MySQL on new hardware without manual configuration.[03][04][05]

Source Code: https://github.com/DeepFound/deep_engine [02]
Country of Origin: US
Start Year: 2010
End Year: 2017 [19]
Former Name: DeepSQL
Project Types: Commercial, Open Source
Written in: C++
Derived From: MySQL
Licenses: AGPL v3, Proprietary

History[06][07]


Deep Information Sciences was founded in 2010 based on research conducted at the University of New Hampshire. After the company went under in 2017, the source code of the DeepDB engine was released as open source as part of a new Deep Software Foundation holding. A large portion of the system's source code is a custom C++ implementation of Java Development Kit software and is not related to the DBMS itself.

Checkpoints[08][09][10]


DeepDB supports last transaction checkpoints, and it takes checkpoints asynchronously.

Compression[11]


DeepDB applies prefix compression to its indexes. It keeps data compressed in the cache and decompresses it during operations. The system supports high levels of compression through a compact representation of keys and delta compression.
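The prefix compression mentioned above can be illustrated with a minimal sketch. This is not DeepDB's actual encoding, which the cited sources do not describe in detail. It assumes sorted string keys and stores each key as the number of leading characters it shares with the previous key plus the remaining suffix. The function names are hypothetical.

```python
from os.path import commonprefix  # character-wise common prefix of a list of strings


def prefix_compress(sorted_keys):
    """Encode each key as (shared_prefix_len, suffix) relative to the previous key."""
    encoded = []
    previous = ""
    for key in sorted_keys:
        shared = len(commonprefix([previous, key]))
        encoded.append((shared, key[shared:]))
        previous = key
    return encoded


def prefix_decompress(encoded):
    """Rebuild the original keys from (shared_prefix_len, suffix) pairs."""
    keys = []
    previous = ""
    for shared, suffix in encoded:
        key = previous[:shared] + suffix
        keys.append(key)
        previous = key
    return keys


# Hypothetical index keys, already in sorted order.
keys = ["customer:1001", "customer:1002", "customer:1010", "order:17"]
packed = prefix_compress(keys)
restored = prefix_decompress(packed)
```

Neighbouring keys in a sorted index tend to share long prefixes, so most entries shrink to a small integer and a short suffix.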

Concurrency Control[11][09]


DeepDB uses multi-version concurrency control (MVCC), which allows the database to be rolled back to any transactional state.
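The idea of rolling back to an earlier transactional state can be sketched with a toy multi-version store. This is an illustration of the general MVCC technique under simplifying assumptions (a single writer, in-memory data, integer commit timestamps), not DeepDB's implementation. The class and method names are hypothetical.

```python
class VersionedStore:
    """Toy multi-version store: every write adds a new version, nothing is overwritten."""

    def __init__(self):
        self._versions = {}  # key -> list of (commit_ts, value), ascending by commit_ts
        self._clock = 0      # timestamp of the last committed transaction

    def commit(self, writes):
        """Apply a dict of writes as one transaction and return its commit timestamp."""
        self._clock += 1
        for key, value in writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock

    def read(self, key, as_of=None):
        """Return the newest value committed at or before `as_of` (default: latest)."""
        as_of = self._clock if as_of is None else as_of
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= as_of:
                return value
        return None

    def rollback_to(self, commit_ts):
        """Discard every version newer than `commit_ts`."""
        for key in list(self._versions):
            kept = [v for v in self._versions[key] if v[0] <= commit_ts]
            if kept:
                self._versions[key] = kept
            else:
                del self._versions[key]
        self._clock = commit_ts


store = VersionedStore()
t1 = store.commit({"balance": 100})
t2 = store.commit({"balance": 80, "note": "fee"})
snapshot_read = store.read("balance", as_of=t1)  # reads the older version
store.rollback_to(t1)                            # drops everything after t1
```

Because old versions are retained rather than overwritten, a reader can see a consistent past state and a rollback only has to discard newer versions.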

Data Model[12]


While the DeepDB storage engine implements a Key/Value model, the data model is fully relational, as specified in MySQL.

Indexes[09][12][13]


DeepDB's index structure does not fit any of the standard index categories. Instead of B+Tree or LSM-Tree indexing, it uses hyper-indexing and the new CASSI (Continuous Adaptive Sequential Summarization of Info) tree. The CASSI tree is a persistent data structure that supports ACID transactions. It improves on B+Trees by collapsing the internal structure through virtualization and summarization. The CASSI tree provides O(1) write complexity and O(log(N)) read complexity in the worst case. Depending on the type of data and workload (transactional, data stream capture, or analytics), the tree and the indexes used in queries are adjusted dynamically to make the best use of hardware resources. To speed up lookups, it sometimes indexes every column in a database table.

Joins[14]


DeepDB is a storage engine that is fully compatible with the other components of MySQL. It therefore supports the same join algorithms as MySQL, including the Nested-Loop Join Algorithm and the Block Nested-Loop Join Algorithm.
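The two join algorithms named above can be sketched as follows. This is a simplified in-memory illustration of the general techniques, not MySQL's or DeepDB's code. Rows are modelled as dictionaries, and the table contents, column names, and `block_size` parameter are assumptions made for the example.

```python
def nested_loop_join(outer, inner, outer_key, inner_key):
    """For each outer row, scan the whole inner table for matching rows."""
    result = []
    for o in outer:
        for i in inner:
            if o[outer_key] == i[inner_key]:
                result.append({**o, **i})
    return result


def block_nested_loop_join(outer, inner, outer_key, inner_key, block_size=2):
    """Buffer `block_size` outer rows, then scan the inner table once per block."""
    result = []
    for start in range(0, len(outer), block_size):
        block = outer[start:start + block_size]
        for i in inner:          # one inner scan serves the whole buffered block
            for o in block:
                if o[outer_key] == i[inner_key]:
                    result.append({**o, **i})
    return result


# Hypothetical tables.
customers = [{"cid": 1, "name": "Ann"}, {"cid": 2, "name": "Bob"}, {"cid": 3, "name": "Cy"}]
orders = [{"oid": 10, "customer_id": 1}, {"oid": 11, "customer_id": 3}, {"oid": 12, "customer_id": 1}]

nlj = nested_loop_join(customers, orders, "cid", "customer_id")
bnl = block_nested_loop_join(customers, orders, "cid", "customer_id")
```

Both functions produce the same set of joined rows. The block variant reads the inner table once per block of outer rows instead of once per outer row, which is the point of buffering.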

Logging[09]


DeepDB maintains state logs so that replicas can build the same indexes and perform the same operations from the streamed state log.
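A minimal sketch of log-based replication, assuming a toy log format of `(operation, primary_key, row)` tuples. The format, the `Node` class, and the `city` index are hypothetical and only illustrate how replaying the same log yields the same rows and the same index on a replica.

```python
class Node:
    """Toy node holding a table plus one secondary index derived from it."""

    def __init__(self):
        self.rows = {}     # primary key -> row dict
        self.by_city = {}  # secondary index: city -> set of primary keys

    def apply(self, entry):
        """Apply one log entry, keeping the secondary index in step with the rows."""
        op, pk, row = entry
        old = self.rows.pop(pk, None)
        if old is not None:
            pks = self.by_city[old["city"]]
            pks.discard(pk)
            if not pks:
                del self.by_city[old["city"]]
        if op == "put":
            self.rows[pk] = row
            self.by_city.setdefault(row["city"], set()).add(pk)


primary, replica = Node(), Node()
state_log = []

for entry in [
    ("put", 1, {"city": "Boston"}),
    ("put", 2, {"city": "Nashua"}),
    ("put", 1, {"city": "Nashua"}),  # update: row 1 moves within the index
    ("delete", 2, None),
]:
    primary.apply(entry)     # the primary executes the operation...
    state_log.append(entry)  # ...and records it in the state log

for entry in state_log:      # the replica replays the streamed log in order
    replica.apply(entry)
```

Since index maintenance is a deterministic function of the log entries, the replica never needs to receive the index itself.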

Query Interface[08][15]


SQL

DeepDB is designed as a MySQL-compatible storage engine. Because it uses the same MySQL API, the query interface is unchanged: it supports the standard SQL query interface with several additional extensions.

Storage Architecture[16][11][17]


DeepDB runs on POSIX file systems backed by disk storage (e.g. SSD/HDD). All files are stored persistently on disk, and data is written temporarily into in-memory files. DeepDB stores data in three forms: on-disk row store tables, in-memory row store tables, and on-disk column store indexes. Instead of organizing data in pages, the in-memory row store is designed to manage individual rows as far as possible. Data and indexes, both on disk and in memory, are organized into segments of varying size. Segments may carry summary data or metadata, so that this information remains in the cache when the segments themselves are evicted.

The system manages cache usage using adaptive algorithms. Variable-sized segments rather than pages are used to store data. In addition, summary indexing is used to identify relevant segments.
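The role of summary data in locating relevant segments can be sketched as below. It assumes, purely for illustration, that each segment's summary is its minimum and maximum key. The sources do not specify what DeepDB's summaries contain, and all names here are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Segment:
    """Variable-sized run of (key, value) rows with a small min/max key summary."""
    min_key: int
    max_key: int
    rows: Optional[list]  # None once the row data has been evicted from the cache


def make_segment(rows):
    """Build a segment of any size from (key, value) rows."""
    rows = sorted(rows)
    return Segment(min_key=rows[0][0], max_key=rows[-1][0], rows=rows)


def evict(segment):
    """Drop the row data but keep the summary in memory."""
    segment.rows = None


def candidates(segments, low, high):
    """Use only the summaries to find segments that may hold keys in [low, high]."""
    return [n for n, s in enumerate(segments)
            if s.min_key <= high and s.max_key >= low]


segments = [
    make_segment([(1, "a"), (4, "b")]),
    make_segment([(5, "c"), (6, "d"), (9, "e")]),
    make_segment([(12, "f")]),
]
evict(segments[0])                  # summary of segment 0 survives eviction
hits = candidates(segments, 6, 12)  # segment 0 is skipped without touching disk
```

A range lookup consults the cached summaries first, so only segments whose key range overlaps the query have to be loaded back in.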

Storage Model[08][09]


DeepDB supports transactional row storage for OLTP workloads. It also implements segmented column stores to provide capabilities similar to those of OLAP (column store) databases. Unlike traditional column stores, which maintain large, monolithic column store files, DeepDB maintains variable-sized segments to improve space utilization.

Storage Organization[16]


A shadow copy is maintained for recovery after a system crash. Disk snapshots are maintained so that the database can roll forward and back through its data history. The database files, including indexes, transactional state, and metadata, are streamed append-only files.
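Why append-only files make it possible to move forward and back through history can be shown with a small sketch. It assumes a hypothetical file format of one JSON record per line and uses an in-memory buffer in place of a real file. It is not DeepDB's on-disk layout.

```python
import io
import json


class AppendOnlyFile:
    """Toy append-only file: records are only ever added at the end."""

    def __init__(self):
        self._buf = io.BytesIO()  # stands in for a file on disk

    def append(self, record):
        """Write one JSON record at the end and return the offset just after it."""
        self._buf.seek(0, io.SEEK_END)
        self._buf.write(json.dumps(record).encode() + b"\n")
        return self._buf.tell()

    def state_at(self, offset):
        """Rebuild the key/value state by replaying all records before `offset`."""
        state = {}
        for line in self._buf.getvalue()[:offset].splitlines():
            record = json.loads(line)
            if record["value"] is None:       # a null value marks a delete
                state.pop(record["key"], None)
            else:
                state[record["key"]] = record["value"]
        return state


f = AppendOnlyFile()
s1 = f.append({"key": "a", "value": 1})
s2 = f.append({"key": "a", "value": 2})
s3 = f.append({"key": "b", "value": 3})
s4 = f.append({"key": "a", "value": None})

rolled_back = f.state_at(s1)     # state as of the first snapshot offset
rolled_forward = f.state_at(s4)  # state after replaying the whole file
```

Because earlier bytes are never modified, any recorded offset acts as a snapshot: replaying up to it rolls the state back, and replaying further rolls it forward again.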

System Architecture[16][03][18]


The DeepDB storage engine is designed as an easy-to-install plug-in replacement for MySQL's native InnoDB storage engine. Using DeepDB does not require any application or schema changes. It augments MySQL with full ACID compliance and additional machine-learning metrics. The system is architected for complex environments and supports HTAP (Hybrid Transactional/Analytical Processing).

Citations

19 sources
  1. http://deep.is (flagged as spam)
  2. GitHub - DeepFound/deep_engine: High-performance C++ key/value database storage engine (github.com)
  3. Deep Information Sciences Releases New DeepSQL Engine (eweek.com)
  4. http://misclassblog.com/databases-and-data-warehouses/deepsql-the-next-generation-of-database-optimization (dead link)
  5. DeepDB: General Purpose Database For Big Data Era | InformationWeek (informationweek.com)
  6. https://www.crunchbase.com/organization/deep-information-sciences-inc
  7. https://www.businesswire.com/newsroom
  8. http://dev.deepis.com.473elmp01.blackmesh.com/WhatisDeep (dead link)
  9. http://dev.deepis.com.473elmp01.blackmesh.com/technical-documents/deepdb-white-paper (dead link)
  10. http://dev.deepis.com.473elmp01.blackmesh.com/product-documentation/deepsql-users-guide-rel-320 (dead link)
  11. Introduction to Deep Information Sciences and DeepDB | DBMS 2 : DataBase Management System Services (dbms2.com)
  12. Cloud Computing recent news | InformationWeek (informationweek.com)
  13. deep_docs/Whitepaper_Continuous-Adaptive-Seq-Sum-Info.pdf at master · DeepFound/deep_docs (github.com)
  14. https://dev.mysql.com/doc/refman/5.7/en/nested-loop-joins.html
  15. http://dev.deepis.com.473elmp01.blackmesh.com/How-Deep-Works (dead link)
  16. Best storage engine for MySQL | PPTX (slideshare.net)
  17. http://dev.deepis.com.473elmp01.blackmesh.com/blog/deepsql-amazon-ec2-smokes-aurora-and-rds-performance (dead link)
  18. https://www.businesswire.com/news/home/20160204005083/en/Deep-Data-Game-deepSQL-World’s-Cloud-Aware-Autonomic-Scaling
  19. https://finance.yahoo.com/news/deep-information-sciences-goes-open-130000559.html (dead link)
Revision #18