Velox is a reusable vectorized database execution engine. It can be used to build compute engines focused on analytical workloads, including batch (Spark, Presto), interactive (PyVelox), stream, log processing, and AI/ML.
Velox is not a complete database and is not meant to be used directly by end users. Rather, it is a general-purpose execution component that database developers can embed in their own systems.
Meta's data infrastructure contains dozens of specialized data computation engines, most of which were developed independently. Maintaining and enhancing each of them is difficult, especially given how quickly workload requirements and hardware change.
Velox was created in 2020 and open-sourced in 2021 to address this problem by serving as a unified execution engine. It is under active development but is already in various stages of integration with several systems, including Presto, Spark, and PyTorch (the latter through a data preprocessing library called TorchArrow). Intel, ByteDance, and Ahana have also contributed to the project.
https://github.com/facebookincubator/velox
https://facebookincubator.github.io/velox/
Meta
2020
Industrial Research, Open Source