Solr

Solr is an open-source NoSQL enterprise search platform built on Apache Lucene. With support for distributed search and index replication, Solr is designed to be reliable, scalable, and fault tolerant. It runs as a standalone full-text search server with a REST-like API and provides features including hit highlighting, faceted search, near real-time indexing, dynamic clustering, database integration, and geospatial search. Solr powers some of the highest-traffic websites and applications in the world.

History

In 2004, CNET Networks, an American media company, started Solr to add search capability, and in 2006 donated it to the Apache Software Foundation as an open-source project. After graduating as a top-level project (TLP) in 2007, Solr grew steadily, gaining features and powering several popular websites. In 2010, Solr was merged with Lucene as a subproject, and its version number jumped from 1.4 to 3.1 to match Lucene's.

Concurrency Control

Optimistic Concurrency Control (OCC)

Solr uses Optimistic Concurrency Control to prevent multiple client applications from concurrently overwriting each other's changes to a document. Every document is assigned a version field (_version_). To update a document, a client reads it together with its current version, modifies it locally, and resubmits it with that version. If the stored version has changed in the meantime, Solr reports a version conflict, and the client should re-read the document and redo the update.
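The read-modify-resubmit cycle can be illustrated with a small in-memory sketch. This is not Solr's implementation: the TinyStore class, the update_with_retry helper, and the counter-based version numbers are hypothetical stand-ins, and only the field name _version_ follows Solr's convention.

```python
import itertools


class VersionConflict(Exception):
    """Raised when the supplied version does not match the stored one."""


class TinyStore:
    """Hypothetical in-memory stand-in for an index; not Solr itself."""

    def __init__(self):
        self._docs = {}
        self._clock = itertools.count(1)  # source of increasing version numbers

    def get(self, doc_id):
        # Return a copy so each client edits its own local view.
        return dict(self._docs[doc_id])

    def put(self, doc, expected_version=None):
        current = self._docs.get(doc["id"])
        if expected_version is not None:
            # Reject the write if someone else changed the document first.
            if current is None or current["_version_"] != expected_version:
                raise VersionConflict(doc["id"])
        stored = dict(doc)
        stored["_version_"] = next(self._clock)
        self._docs[doc["id"]] = stored
        return stored["_version_"]


def update_with_retry(store, doc_id, modify, max_attempts=5):
    """Read, modify locally, resubmit; redo the cycle on a version conflict."""
    for _ in range(max_attempts):
        doc = store.get(doc_id)
        version = doc["_version_"]
        modify(doc)  # mutates the local copy in place
        try:
            return store.put(doc, expected_version=version)
        except VersionConflict:
            continue  # another writer won; re-read and try again
    raise RuntimeError("gave up after repeated version conflicts")
```

A client holding a stale version is rejected by put, while update_with_retry simply re-reads and reapplies its change. Against a real Solr server the same loop would send the version along with the update request and retry when the server reports a conflict.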

Checkpoints

Consistent

In standalone mode, Solr supports checkpoints through the replication handler, which backs up the index from the latest commit point. Backups can be triggered manually, or configured to run automatically after each commit or at startup. In SolrCloud mode, Solr uses the Collections API, which backs up the indexes and configurations to a shared filesystem, taking the checkpoint across all shards. On restore, a new collection is created with the same number of shards, preserving the shard structure, including routing information.
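As a rough illustration of how these backups are requested over HTTP, the sketch below only builds the request URLs and does not contact a server. The host, core, collection, backup name, and location used in the usage note are hypothetical, and the parameter names follow the Solr 7 reference guide as best understood here, so they should be checked against the version in use.

```python
from urllib.parse import urlencode


def core_backup_url(base_url, core, name=None):
    """Standalone mode: ask a core's replication handler for a backup."""
    params = {"command": "backup"}
    if name:
        params["name"] = name  # optional label for the snapshot
    return f"{base_url}/{core}/replication?{urlencode(params)}"


def collection_backup_url(base_url, collection, name, location):
    """SolrCloud mode: back up a whole collection to a shared location."""
    params = {
        "action": "BACKUP",
        "name": name,
        "collection": collection,
        "location": location,  # must be reachable by every node
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"


def collection_restore_url(base_url, new_collection, name, location):
    """SolrCloud mode: restore a named backup into a new collection."""
    params = {
        "action": "RESTORE",
        "name": name,
        "collection": new_collection,
        "location": location,
    }
    return f"{base_url}/admin/collections?{urlencode(params)}"
```

For example, core_backup_url("http://localhost:8983/solr", "techproducts", "nightly") yields a URL that could be fetched with any HTTP client to trigger a manual backup of that core.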

Data Model

Document / XML

Solr stores data as documents, each consisting of a set of fields. Each field holds a specific piece of information about the document and has its own data type. Solr builds an index over these documents to search them efficiently.
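To make the document-and-fields model concrete, the following sketch serializes a Python dictionary into the XML update message form that Solr accepts (an add element wrapping doc and field elements). The helper name to_add_xml and the sample field names are illustrative assumptions; a real schema defines which fields and types are allowed.

```python
import xml.etree.ElementTree as ET


def to_add_xml(doc):
    """Serialize one document (a dict of field name to value) as an <add> message.

    List or tuple values become repeated <field> elements, which is how
    multi-valued fields are written in this format. Values are converted
    with str(), so this sketch only suits strings and numbers.
    """
    add = ET.Element("add")
    doc_el = ET.SubElement(add, "doc")
    for name, value in doc.items():
        values = value if isinstance(value, (list, tuple)) else [value]
        for v in values:
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(v)
    return ET.tostring(add, encoding="unicode")


# Hypothetical sample document with a multi-valued "tags" field.
book = {"id": "book-1", "title": "Solr in Brief", "tags": ["search", "lucene"]}
message = to_add_xml(book)
```

The resulting string could be posted to a core's update endpoint; the same document could equally be sent as JSON.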

Website: http://lucene.apache.org/solr/
Source Code: https://github.com/apache/lucene-solr
Tech Docs: https://lucene.apache.org/solr/7_3_0/index.html
Developer: Apache Software Foundation
Country of Origin: US
Start Year: 2004
Project Type: Open Source
Written in: Java
Supported languages: C#, C++, Clojure, Go, Java, JavaScript, Lua, Perl, PHP, Python, R, Ruby, Rust, Scala
Operating Systems: Linux, OS X, Windows
Licenses: Apache v2