Splunk structures machine data. It collects data from sources such as databases, web servers, and networks, and provides services to analyze that data and to produce dashboards, graphs, reports, and alerts from it.
Splunk supports checkpoints. While reading and indexing data, it can create a checkpoint that marks data as already read or indexed, so that the same data is not processed again.
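The checkpoint idea can be illustrated with a minimal sketch. This is not Splunk's implementation: the function name `read_new_lines`, the JSON checkpoint file, and the use of a byte offset as the checkpoint are all assumptions made for illustration.

```python
import json
import os


def read_new_lines(log_path, checkpoint_path):
    """Return the lines added to log_path since the last checkpoint.

    Hypothetical helper: the checkpoint is a byte offset stored as JSON.
    """
    offset = 0
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            offset = json.load(f)["offset"]

    # Read only the bytes that have not been seen before.
    with open(log_path, "rb") as f:
        f.seek(offset)
        data = f.read()
        new_offset = f.tell()

    # Record the checkpoint only after the data has been read.
    with open(checkpoint_path, "w") as f:
        json.dump({"offset": new_offset}, f)

    return data.decode("utf-8").splitlines()
```

Calling the function twice on an unchanged file returns the data once and then an empty list, because the second call starts from the stored offset.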
Splunk compresses raw data to as little as half its original size.
Splunk's indexes do not appear to be traditional B+ trees, skip lists, or radix trees. Instead, Splunk indexes data by breaking it into events based on the timestamps in the data. The events then pass through the indexing pipeline, which breaks each event into segments so that indexing and searching can be done efficiently, builds the data structures for the indexes, and writes the events out to disk.
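The steps above can be sketched roughly as follows. The sketch makes several assumptions that are not taken from Splunk: a fixed `YYYY-MM-DD HH:MM:SS` timestamp format, a simple regular-expression segmenter, and an in-memory inverted index (segment to event ids) standing in for the real on-disk index structures. All function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumed timestamp format; real data would need configurable recognition.
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def break_into_events(raw):
    """Start a new event at each line that begins with a timestamp."""
    events = []
    for line in raw.splitlines():
        if TIMESTAMP.match(line) or not events:
            events.append(line)
        else:
            # Continuation line (e.g. a stack trace) joins the previous event.
            events[-1] += "\n" + line
    return events


def segment(event):
    """Split an event into lowercase searchable segments."""
    return {tok.lower() for tok in re.findall(r"[A-Za-z0-9_.]+", event)}


def build_index(events):
    """Map each segment to the ids of the events that contain it."""
    index = defaultdict(set)
    for event_id, event in enumerate(events):
        for tok in segment(event):
            index[tok].add(event_id)
    return index
```

A search for a term then becomes a lookup in `index` followed by fetching the matching events by id, which is why segmenting at index time makes later searches cheaper.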
Splunk supports inner and outer joins; inner join is the default.
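The difference between the two join types can be shown with a small sketch over lists of dictionaries. It is an illustration only, not Splunk's join logic: the function `join` and its parameters are hypothetical, "outer" is assumed here to mean a left outer join, and letting left-side values win on field collisions is a choice made for this sketch.

```python
def join(left, right, field, join_type="inner"):
    """Join two lists of dicts on `field`; inner is the default."""
    if join_type not in ("inner", "outer"):
        raise ValueError("join_type must be 'inner' or 'outer'")

    # Group right-side rows by the join field for fast lookup.
    by_key = {}
    for row in right:
        if field in row:
            by_key.setdefault(row[field], []).append(row)

    results = []
    for row in left:
        matches = by_key.get(row[field], []) if field in row else []
        if matches:
            for match in matches:
                # Left-side values win on collisions (a choice of this sketch).
                results.append({**match, **row})
        elif join_type == "outer":
            # Assumed left outer join: keep unmatched left rows as they are.
            results.append(dict(row))
    return results
```

With the default inner join, left rows without a match are dropped; with `join_type="outer"` they are kept without the right-side fields.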
Splunk is disk-oriented.