Splunk is a database system designed for extracting structure from machine-generated data and analyzing it. It ingests data from other databases, web servers, networks, sensors, and similar sources, then offers services to analyze that data and produce dashboards, graphs, reports, alerts, and other visualizations. All of this data is captured in a searchable repository and served through a web interface called Splunk Web.
Splunk is a horizontal application that serves users with different backgrounds across an organization, for use cases such as IT operations monitoring, security, and business analytics. The Splunk environment can also be extended by installing or developing an app. An app runs on the Splunk platform and bundles inputs, lookups, and reports that add specific functionality and display information about the data.
Splunk was founded by Erik Swan, Michael Baum, and Rob Das in 2002. Before founding Splunk, all three founders had worked with large-scale search infrastructures and were unhappy with the tools then available for analyzing log files. Early customers described debugging their environments as ‘digging through caves’ and ‘crawling through the muck to find the problems’, which inspired the founders to name the company after spelunking, the exploration of caves.
Splunk raised a $5 million Series A in 2004 led by August Capital and went public in 2012. It acquired SignalFx, a cloud monitoring platform for infrastructure, microservices, and applications, in August 2019 for $1.1 billion.
Splunk processes all incoming data and then adds it to indexes. It first breaks the data into events based on timestamps. The events then pass through the indexing pipeline, where additional steps are performed: breaking the events into segments so that indexing and searching can be done efficiently, building the data structures for the indexes, and writing the events to disk.
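The steps above can be illustrated with a minimal sketch. It is not Splunk's implementation: the timestamp format, the tokenizer, and the function names (break_into_events, segment, build_index) are assumptions chosen for illustration, and the write-to-disk step is omitted.

```python
import re
from collections import defaultdict
from datetime import datetime

# Assumed log format: each new event begins with "YYYY-MM-DD HH:MM:SS".
TIMESTAMP = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")


def break_into_events(raw):
    """Start a new event at each line that begins with a timestamp.

    Lines without a timestamp are treated as continuations of the
    previous event (for example, a stack trace).
    """
    events = []
    for line in raw.splitlines():
        match = TIMESTAMP.match(line)
        if match:
            ts = datetime.strptime(match.group(), "%Y-%m-%d %H:%M:%S")
            events.append({"time": ts, "raw": line})
        elif events:
            events[-1]["raw"] += "\n" + line
    return events


def segment(event_text):
    """Split an event into lowercase search terms (a simplified tokenizer)."""
    return {tok.lower() for tok in re.split(r"[^A-Za-z0-9_.]+", event_text) if tok}


def build_index(events):
    """Map each term to the positions of the events that contain it."""
    index = defaultdict(set)
    for pos, event in enumerate(events):
        for term in segment(event["raw"]):
            index[term].add(pos)
    return index


raw_log = (
    "2024-01-01 10:00:00 ERROR db connection failed\n"
    "  at connect(db.py:42)\n"
    "2024-01-01 10:00:05 INFO retry succeeded\n"
)
events = break_into_events(raw_log)
index = build_index(events)
```

Searching for a term then becomes a dictionary lookup in the inverted index rather than a scan over every raw event, which is the efficiency that segmentation is meant to provide.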
Splunk supports two index types: events and metrics. Events indexes are the default index type, impose minimal structure, and can accommodate any type of data. Metrics indexes are highly structured and designed to handle high-volume, low-latency demands. Compared to events indexes, metrics indexes offer better performance and use less space.
Splunk stores data in indexes organized into a set of buckets by age. Hot buckets contain data that is currently being written. As the data ages, it is rolled to warm, then cold, and finally frozen buckets. Hot buckets cannot be backed up, but Splunk provides the ability to create a consistent snapshot of the other buckets, either through incremental ongoing backups (using the user's preferred snapshot utility) or through a single backup of all data. Taking periodic snapshots of a healthy environment allows you to recover from the last valid checkpoint after a catastrophic failure.
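A rough sketch of this age-based lifecycle follows. The day thresholds and the names Stage, stage_for_age, and can_snapshot are hypothetical; an actual deployment decides when to roll a bucket from its own index configuration, which may consider more than age alone.

```python
from enum import Enum


class Stage(Enum):
    HOT = "hot"
    WARM = "warm"
    COLD = "cold"
    FROZEN = "frozen"


# Hypothetical upper age limits in days for each stage, checked in order.
AGE_LIMITS = [(1, Stage.HOT), (30, Stage.WARM), (90, Stage.COLD)]


def stage_for_age(age_days):
    """Return the stage for a bucket of the given age (illustrative only)."""
    for limit, stage in AGE_LIMITS:
        if age_days < limit:
            return stage
    return Stage.FROZEN


def can_snapshot(stage):
    """Per the text above, every bucket except a hot one can be snapshotted."""
    return stage is not Stage.HOT


ages = [0.5, 10, 45, 365]
stages = [stage_for_age(a) for a in ages]
```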
Splunk supports concurrent searches but limits their number to preserve performance. It also lets you configure how the maximum number of concurrent searches is divided between scheduled and summarization searches, based on your usage.
Splunk also supports concurrent users. Each search a user runs occupies exactly one CPU core on each indexer for the duration of the search. By default, a single search cannot use multiple cores.
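To see how a core-bound search model leads to a concurrency cap, consider the sketch below. The formula, the parameter names (searches_per_core, base_searches, scheduled_share, summarization_share), and their default values are all assumptions made for illustration; they are not taken from the text above and should be checked against the limits configured in a real deployment.

```python
def concurrency_limits(cpu_cores, searches_per_core=1, base_searches=6,
                       scheduled_share=0.5, summarization_share=0.5):
    """Derive illustrative concurrency caps from the number of CPU cores.

    Assumed model: the total cap grows linearly with cores, scheduled
    searches get a share of the total, and summarization searches get
    a share of the scheduled cap.
    """
    total = searches_per_core * cpu_cores + base_searches
    scheduled = int(total * scheduled_share)
    summarization = int(scheduled * summarization_share)
    return {"total": total, "scheduled": scheduled, "summarization": summarization}


limits = concurrency_limits(cpu_cores=16)
```

Under these assumed values, a 16-core machine would allow 22 concurrent searches in total, of which 11 could be scheduled searches and 5 of those could be summarization searches.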
A Splunk index stores the raw data in compressed form, along with index files containing metadata that is used to search the event data. For indexes, it supports gzip (default), lz4, and zstd for compression and can handle buckets compressed with different algorithms. Splunk estimates disk storage using the formula (daily average indexing rate) * (retention policy) * 0.5, because it compresses raw data to approximately half its original size.
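As a worked example of the storage formula, the following sketch computes the estimate directly. The function name estimate_disk_gb and the sample figures (100 GB per day, 90 days of retention) are assumptions chosen for illustration.

```python
def estimate_disk_gb(daily_rate_gb, retention_days, compression_factor=0.5):
    """Estimate disk usage as daily rate * retention * compression factor.

    The default factor of 0.5 reflects the rule of thumb that raw data
    compresses to roughly half its original size.
    """
    return daily_rate_gb * retention_days * compression_factor


# 100 GB/day kept for 90 days: 100 * 90 * 0.5 = 4500 GB
estimate = estimate_disk_gb(daily_rate_gb=100, retention_days=90)
```

Because the 0.5 factor is only an approximation, the actual ratio depends on the data and on the compression algorithm in use, so the result is best treated as a planning estimate.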