In Vertica, each node maintains its own checkpoints and transaction logs. Users can tune the synchronization interval. A single failed node can be recovered from the other nodes. If the entire cluster fails, it can be recovered up to the earliest checkpoint at which all nodes were in a good state. New transaction log entries cannot be appended while a new checkpoint is beginning.
Vertica uses both run-length encoding (RLE) and delta encoding. RLE is only used when the number of repetitions is large. Delta encoding applies to the INTEGER, DATE, TIME, TIMESTAMP, and INTERVAL types: instead of the actual values, it stores each value's difference from the smallest value, which saves space.
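The two encodings described above can be illustrated with a minimal Python sketch. It shows the general techniques only and is not Vertica's on-disk format; the function names (`rle_encode`, `delta_encode`, and their decoders) are hypothetical and chosen for this example.

```python
from itertools import groupby


def rle_encode(values):
    """Collapse each run of equal adjacent values into a (value, run_length) pair."""
    return [(value, len(list(run))) for value, run in groupby(values)]


def rle_decode(pairs):
    """Expand (value, run_length) pairs back into the original sequence."""
    return [value for value, count in pairs for _ in range(count)]


def delta_encode(values):
    """Store the smallest value once, plus each value's offset from it."""
    base = min(values)
    return base, [value - base for value in values]


def delta_decode(base, offsets):
    """Rebuild the original values by adding each offset to the base."""
    return [base + offset for offset in offsets]
```

RLE pays off only when runs are long, since every run costs one pair regardless of its length. Delta encoding pays off when the values cluster close to the minimum, so the offsets need fewer bits than the raw values.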
Vertica supports multi-version concurrency control (MVCC) to achieve data consistency. Both the current and previous versions of the data are stored and visible to transactions.
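As a rough illustration of how keeping previous versions lets a reader see a consistent snapshot, here is a minimal Python sketch. The `VersionedStore` class and its integer epoch counter are assumptions made for this example; they do not describe Vertica's internal data structures.

```python
class VersionedStore:
    """Toy multi-version store: writes append versions, reads pick one by epoch."""

    def __init__(self):
        self.versions = {}  # key -> list of (epoch, value), in commit order
        self.epoch = 0      # epoch of the most recent commit

    def write(self, key, value):
        """Commit a new version under a new epoch; older versions are kept."""
        self.epoch += 1
        self.versions.setdefault(key, []).append((self.epoch, value))
        return self.epoch

    def read(self, key, snapshot_epoch):
        """Return the newest value committed at or before snapshot_epoch."""
        for epoch, value in reversed(self.versions.get(key, [])):
            if epoch <= snapshot_epoch:
                return value
        return None  # the key did not exist yet at that snapshot
```

A reader that started at an earlier epoch keeps seeing the value from that epoch even after later writes commit, so readers and writers do not block each other in this sketch.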
Vertica uses a columnar store, which improves the performance of sequential access at the cost of single-row access. Unlike row-oriented databases, which have to scan the whole table, Vertica retrieves only the few columns a query needs, which improves throughput by reducing disk I/O.
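The trade-off can be sketched in a few lines of Python. This is an in-memory illustration under simplifying assumptions, not Vertica's storage layout; the helper names `to_columns`, `sum_column`, and `get_row` are hypothetical.

```python
def to_columns(rows):
    """Pivot a non-empty list of row dicts into one list per column."""
    return {name: [row[name] for row in rows] for name in rows[0]}


def sum_column(columns, name):
    """Aggregate one column; no other column is read."""
    return sum(columns[name])


def get_row(columns, index):
    """Reassemble a single row, which needs one lookup in every column."""
    return {name: values[index] for name, values in columns.items()}
```

An aggregate over one column touches a single list, while fetching one complete row has to visit every column. This mirrors why a columnar layout favors scans over point lookups.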
Vertica allows users to define foreign key constraints. Foreign keys can be defined when a table is created or added afterwards with "ALTER TABLE".
Indexes are not supported in Vertica. Projections are used instead to improve query performance.
Vertica supports the Read Committed and Serializable isolation levels. Read Committed is the default. Read Uncommitted and Repeatable Read are automatically treated as Read Committed and Serializable, respectively.
Vertica supports both merge join and hash join. Merge join is generally faster and requires less memory, but the data must be sorted beforehand. Hash join requires more memory, but it is faster if the inner table fits in memory.
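The difference between the two algorithms can be seen in a minimal Python sketch of each. These are textbook versions operating on lists of dicts, not Vertica's implementation; the function names and the `key` parameter are hypothetical.

```python
def hash_join(outer, inner, key):
    """Build a hash table on the inner rows, then probe it with each outer row."""
    table = {}
    for row in inner:
        table.setdefault(row[key], []).append(row)
    return [{**o, **i} for o in outer for i in table.get(o[key], [])]


def merge_join(left, right, key):
    """Join two inputs that are already sorted on key by advancing two cursors."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1
        elif lk > rk:
            j += 1
        else:
            # Find the end of the run of right rows that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][key] == lk:
                j_end += 1
            # Pair every left row with this key against that run.
            while i < len(left) and left[i][key] == lk:
                out.extend({**left[i], **r} for r in right[j:j_end])
                i += 1
            j = j_end
    return out
```

The hash join holds the whole inner input in `table`, which is where its memory cost comes from. The merge join keeps only two cursors but produces correct results only when both inputs are sorted on the join key.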
Data is stored in Vertica in columnar format to improve query performance, since a lot of disk I/O can be avoided.
Stored procedures are not supported in Vertica. External procedures, written in languages such as R or C++, can be used instead.
Vertica uses a shared-nothing architecture, in which nodes share neither memory nor disk storage. Shared-nothing architectures are easier to scale, since there are no races or contention caused by locks on shared resources. Vertica also uses a massively parallel processing (MPP) architecture, which can improve query performance, for example by increasing the throughput of large joins when multiple machines are involved.
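The idea behind combining shared-nothing storage with parallel processing can be sketched in Python as follows. This toy example assumes integer keys and a simple modulo placement rule, and it runs the per-node work sequentially; the names `partition`, `local_sum`, and `cluster_sum` are hypothetical and do not reflect how Vertica segments data.

```python
def partition(rows, num_nodes, key):
    """Assign each row to exactly one node by its integer key."""
    nodes = [[] for _ in range(num_nodes)]
    for row in rows:
        nodes[row[key] % num_nodes].append(row)
    return nodes


def local_sum(node_rows, column):
    """Work done by a single node, using only the rows it owns."""
    return sum(row[column] for row in node_rows)


def cluster_sum(nodes, column):
    """Combine per-node partial results into the final answer."""
    # Each partial sum depends on one node's rows only, so on a real
    # cluster the partial sums could be computed in parallel.
    return sum(local_sum(node_rows, column) for node_rows in nodes)
```

Because no row lives on more than one node, each node can compute its partial result without coordinating with the others, and only the small partial results need to be exchanged.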
C++, Java, Perl, Python, R