Splunk is a database system designed for extracting structure from, and analyzing, machine-generated data. It collects data from sources such as other databases, web servers, and networks, and provides services for analyzing that data and producing dashboards, graphs, reports, and alerts.
Splunk supports checkpoints. While reading and indexing data, it can create a checkpoint that marks the data as already read or indexed, so that processing can resume from that point rather than starting over.
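The idea can be illustrated with a minimal sketch, assuming a checkpoint is simply a byte offset into a log file stored in a small JSON file. The function name `read_new_lines` and the checkpoint layout are hypothetical and are not Splunk's actual on-disk format or API.

```python
import json
import os


def read_new_lines(log_path, checkpoint_path):
    """Return the lines appended to log_path since the last checkpoint.

    Hypothetical sketch: the checkpoint is a JSON file holding a byte offset.
    """
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]

    # Read everything after the saved offset.
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
        new_offset = f.tell()

    # Record the new position only after the data has been read.
    with open(checkpoint_path, "w") as f:
        json.dump({"offset": new_offset}, f)

    return data.decode("utf-8").splitlines()
```

Calling the function twice on an unchanged file returns an empty list the second time, because the checkpoint already covers everything that was read.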
Splunk compresses raw data to as little as half its original size.
Splunk indexes data by breaking it into events based on the timestamps in the data. The events then pass through the indexing pipeline, which takes additional steps: breaking the events into segments so that indexing and searching can be done efficiently, building the data structures for the indexes, and writing the events out to disk.
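The steps above can be sketched roughly as follows. This is a simplified illustration, not Splunk's implementation: the timestamp format, the rule that a line starting with a timestamp begins a new event, the tokenizing regular expression, and all function names are assumptions made for the example.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed timestamp format for this sketch: "YYYY-MM-DD HH:MM:SS" at line start.
TIMESTAMP = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})")


def break_into_events(raw):
    """Start a new event at each line that begins with a timestamp."""
    events = []
    for line in raw.splitlines():
        match = TIMESTAMP.match(line)
        if match:
            ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
            events.append({"time": ts, "raw": line})
        elif events:
            # A line without a timestamp continues the previous event.
            events[-1]["raw"] += "\n" + line
    return events


def segment(event_text):
    """Split an event into lowercase tokens (segments) for indexing."""
    return {tok.lower() for tok in re.split(r"[^A-Za-z0-9_]+", event_text) if tok}


def build_index(events):
    """Build an inverted index mapping each token to the ids of events containing it."""
    index = defaultdict(set)
    for event_id, event in enumerate(events):
        for token in segment(event["raw"]):
            index[token].add(event_id)
    return index
```

A search for a term then becomes a lookup in the inverted index followed by fetching the matching events, which is why segmentation makes searching efficient.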
Splunk supports inner and outer joins; inner join is the default.
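The difference between the two join types can be shown with a small sketch over lists of dictionaries. It assumes the outer join behaves as a left outer join (every row of the main result set is kept) and, for simplicity, keeps only the first matching row from the second set. The function `join` and its parameters are hypothetical and do not reproduce Splunk's search syntax or implementation.

```python
def join(main, sub, field, join_type="inner"):
    """Join two lists of dict rows on a shared field.

    Hypothetical sketch: "inner" drops main rows without a match,
    "outer" keeps them unchanged (left outer join).
    """
    if join_type not in ("inner", "outer"):
        raise ValueError("join_type must be 'inner' or 'outer'")

    # Index the second result set by the join field, keeping the first match.
    lookup = {}
    for row in sub:
        lookup.setdefault(row[field], row)

    results = []
    for row in main:
        match = lookup.get(row[field])
        if match is not None:
            results.append({**row, **match})
        elif join_type == "outer":
            results.append(dict(row))
    return results


logins = [{"user": "alice", "action": "login"}, {"user": "bob", "action": "logout"}]
departments = [{"user": "alice", "dept": "eng"}]

inner_rows = join(logins, departments, "user")           # default: inner
outer_rows = join(logins, departments, "user", "outer")  # keeps bob as well
```

With the default inner join only the row for `alice` is returned; with the outer join the row for `bob` is also returned, without a `dept` field.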
Splunk is disk-oriented.