Elasticsearch is a highly scalable, open-source full-text search and analytics engine based on Lucene. It allows you to store, search, and analyze large volumes of data quickly and in near real time. It is generally used as the underlying engine that powers applications with complex search features and requirements. Sample use cases include online store catalogs, log collection and analysis for data mining, monitoring and alerting systems, and business-intelligence workloads. The Elastic Stack is used by many technology companies, including LinkedIn and Uber; its commercial counterpart is Splunk. Elasticsearch is the search engine part of the Elastic Stack. In most cases you will also need Logstash, the data collection and processing pipeline that feeds data into Elasticsearch, and Kibana, the data visualization tool.
Compass, the precursor to Elasticsearch, was created by Shay Banon in 2004. While working on its third version, Banon realized that he would need to rewrite big parts of Compass to "create a scalable search solution". He therefore created a solution built from the ground up to be distributed, with a common interface: JSON over HTTP. Banon released the first version of Elasticsearch in February 2010. Elasticsearch BV was founded in 2012 to provide commercial services and products around Elasticsearch and related software. In March 2015, the company Elasticsearch changed its name to Elastic.
By default, Logstash uses in-memory bounded queues between pipeline stages to buffer events; their contents are lost if Logstash stops unexpectedly. Persistent queues provide durability of data within Logstash and can absorb bursts of events by buffering them on disk. When persistent queues are enabled, Logstash stores events on disk and commits them using checkpointing. The persistent queue has two kinds of pages: head pages and tail pages. There is only one head page; when it reaches a certain size, it becomes a tail page and a new head page is created. Tail pages are immutable, and the head page is append-only. When recording a checkpoint, Logstash calls fsync on the head page and atomically writes the current state of the queue to disk. Because checkpointing is atomic, an update to the checkpoint file is saved only if it completes successfully. If Logstash is terminated or there is a hardware-level failure, any data that is buffered in the persistent queue but not yet checkpointed is lost.
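The head page, tail page, and checkpoint mechanics described above can be illustrated with a small Python sketch. This is a toy model, not Logstash's actual implementation or on-disk format: the class name ToyPersistentQueue, the page.N and checkpoint file names, the byte-based page_capacity, and the JSON checkpoint layout are all assumptions made for illustration. It only shows the general idea of an append-only head page, sealed tail pages, and a checkpoint written with fsync followed by an atomic rename.

```python
import json
import os
import tempfile


class ToyPersistentQueue:
    """Toy paged queue with checkpoints (hypothetical; not Logstash's real format)."""

    def __init__(self, directory, page_capacity=64):
        self.directory = directory
        self.page_capacity = page_capacity  # assumed: max bytes per page
        self.tail_pages = []                # sealed pages, never written again
        self.head_page = 0                  # the single append-only page
        self.head_size = 0                  # bytes appended to the head page
        self.written_events = 0             # events appended so far
        self._head = open(self._page_path(self.head_page), "ab")

    def _page_path(self, number):
        return os.path.join(self.directory, f"page.{number}")

    def write(self, event):
        record = (json.dumps(event) + "\n").encode("utf-8")
        # If the record does not fit, the head page becomes a tail page.
        if self.head_size > 0 and self.head_size + len(record) > self.page_capacity:
            self._seal_head_page()
        self._head.write(record)
        self.head_size += len(record)
        self.written_events += 1

    def _seal_head_page(self):
        # Make the full page durable, then treat it as immutable.
        self._head.flush()
        os.fsync(self._head.fileno())
        self._head.close()
        self.tail_pages.append(self.head_page)
        # Start a fresh head page and record the new layout.
        self.head_page += 1
        self.head_size = 0
        self._head = open(self._page_path(self.head_page), "ab")
        self.checkpoint()

    def checkpoint(self):
        # Step 1: fsync the head page so its bytes reach the disk.
        self._head.flush()
        os.fsync(self._head.fileno())
        # Step 2: write the queue state to a temporary file and fsync it.
        state = {
            "head_page": self.head_page,
            "tail_pages": self.tail_pages,
            "events": self.written_events,
        }
        tmp_path = os.path.join(self.directory, "checkpoint.tmp")
        with open(tmp_path, "w", encoding="utf-8") as tmp:
            json.dump(state, tmp)
            tmp.flush()
            os.fsync(tmp.fileno())
        # Step 3: rename into place; readers see the old or the new file, never a mix.
        os.replace(tmp_path, os.path.join(self.directory, "checkpoint"))

    def close(self):
        self._head.close()


def recover_event_count(directory):
    """Return how many events the last checkpoint covers (0 if none exists)."""
    path = os.path.join(directory, "checkpoint")
    if not os.path.exists(path):
        return 0
    with open(path, encoding="utf-8") as f:
        return json.load(f)["events"]


def demo():
    with tempfile.TemporaryDirectory() as directory:
        queue = ToyPersistentQueue(directory, page_capacity=64)
        for seq in range(5):
            queue.write({"seq": seq})
        queue.checkpoint()
        # This write seals the first page, but is itself never checkpointed.
        queue.write({"seq": 5})
        queue.close()
        return recover_event_count(directory), queue.written_events, len(queue.tail_pages)


if __name__ == "__main__":
    print(demo())
```

In this model, recovery trusts only the checkpoint file, so the sixth event in the demo is written but not counted after a simulated restart. That mirrors the point above that data buffered in the queue but not yet checkpointed can be lost. Checkpointing more often narrows that window at the cost of more fsync calls.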