Velox

OLAP

Velox is a reusable vectorized database execution engine. It can be used to build compute engines focused on analytical workloads, including batch (Spark, Presto), interactive (PyVelox), stream, log processing, and AI/ML.

Unlike a complete database, Velox cannot be used directly by end-users. Rather, it is designed as a general-purpose execution component that database developers can embed in their own systems.
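To illustrate what "vectorized" execution means in general, here is a minimal sketch in plain C++. It does not use Velox's API. The names `Batch`, `filterPriceGreaterThan`, and `projectRevenue` are hypothetical and exist only for this example. The idea shown is that each operator processes a whole batch of column values in a tight loop instead of handling one row at a time.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical column batch: each column is stored as one contiguous vector.
// This is an illustration only, not a Velox data structure.
struct Batch {
  std::vector<int64_t> price;
  std::vector<int64_t> quantity;
};

// Filter operator: scans the whole price column and returns the indices of
// the rows where price > threshold (often called a selection vector).
std::vector<std::size_t> filterPriceGreaterThan(const Batch& batch,
                                                int64_t threshold) {
  std::vector<std::size_t> selected;
  selected.reserve(batch.price.size());
  for (std::size_t i = 0; i < batch.price.size(); ++i) {
    if (batch.price[i] > threshold) {
      selected.push_back(i);
    }
  }
  return selected;
}

// Projection operator: computes price * quantity for the selected rows only,
// producing a new output column.
std::vector<int64_t> projectRevenue(const Batch& batch,
                                    const std::vector<std::size_t>& selected) {
  std::vector<int64_t> revenue;
  revenue.reserve(selected.size());
  for (std::size_t row : selected) {
    revenue.push_back(batch.price[row] * batch.quantity[row]);
  }
  return revenue;
}
```

A host system would chain such operators, passing each batch and its selection vector from one to the next. Working on contiguous columns keeps the inner loops simple, which is the usual motivation for this execution style.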

History

Meta's data infrastructure contains dozens of specialized data computation engines, which were developed largely independently. Maintaining and enhancing each of them can be difficult, especially given rapidly changing workload requirements and hardware.

Velox was created in 2020 and open-sourced in 2021 to address this problem by providing a unified execution engine. It is under active development and is already in various stages of integration with several systems, including Presto, Spark, and PyTorch (the latter through a data preprocessing library called TorchArrow). Intel, ByteDance, and Ahana have also contributed to the project.

Data Model

Relational

Query Execution

Vectorized Model

Query Interface

Custom API

Storage Format

Apache Arrow

System Architecture

Embedded

Website

https://velox-lib.io/

Source Code

https://github.com/facebookincubator/velox

Tech Docs

https://facebookincubator.github.io/velox/

Developer

Meta

Country of Origin

US

Start Year

2020

Project Type

Industrial Research, Open Source

Written in

C++

Supported languages

C++, Python

Operating Systems

Linux, macOS

Licenses

Apache v2