
Data Science, Data Analytics

The Beginners’ Guide to Elasticsearch — Part 1

A brief introduction to Elasticsearch

Chetan Ambi

Published in

Towards AI

4 min read · Sep 12, 2020

If you would like to know what Elasticsearch is, why you should use it, and what its alternatives and competitors are, you are in the right place. In this article, I will try to answer all of these questions. So, let’s get started.

What is Elasticsearch

As mentioned on the official page, Elasticsearch is a distributed, open-source search and analytics engine for all types of data, including textual, numerical, geospatial, structured, and unstructured. Elasticsearch is built on Apache Lucene.

So, what does this mean?

Distributed: a distributed system consists of multiple nodes (machines), possibly spread across different locations, that are connected over a network so that they can communicate and coordinate to achieve a common goal.

Open-source: the original source code is freely available and can be modified.

Search & analytics engine: a system capable of searching for a given text or keyword in the stored data and displaying the relevant results to end users. This is much like the Google or Bing search engines.

Data: Elasticsearch works with both structured and unstructured data, such as textual, numerical, and geospatial data.

Apache Lucene: Apache Lucene is a free and open-source search library written in Java. It is maintained under the Apache Software Foundation, and Elasticsearch is built on top of it.
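
To make the "search engine" idea a little more concrete, here is a toy sketch of an inverted index, the kind of data structure Lucene relies on for full-text search. This is only an illustration and not how Lucene or Elasticsearch is actually implemented: the function names (`tokenize`, `build_inverted_index`, `search`) and the sample documents are made up for this example, and the tokenizer is deliberately naive.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase the text and split it on whitespace (a deliberately naive analyzer)."""
    return text.lower().split()


def build_inverted_index(documents):
    """Map each term to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for term in tokenize(text):
            index[term].add(doc_id)
    return index


def search(index, query):
    """Return the IDs of documents that contain every term in the query."""
    terms = tokenize(query)
    if not terms:
        return set()
    # Start with the documents matching the first term, then intersect with the rest.
    result = set(index.get(terms[0], set()))
    for term in terms[1:]:
        result &= index.get(term, set())
    return result


# Hypothetical sample documents, keyed by a made-up document ID.
documents = {
    1: "Elasticsearch is built on Apache Lucene",
    2: "Apache Lucene is written in Java",
    3: "Elasticsearch is a distributed search engine",
}

index = build_inverted_index(documents)
print(sorted(search(index, "apache lucene")))     # [1, 2]
print(sorted(search(index, "elasticsearch is")))  # [1, 3]
```

Because the index maps terms to documents, a query only has to look up its own terms instead of scanning every document. Real engines add much more on top of this, such as text analysis and relevance scoring.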

Why use Elasticsearch

Distributed: Elasticsearch is a distributed document store. Documents are stored on nodes spread across the cluster and can be accessed from any node.
Scalable: Elasticsearch lets you quickly add or remove servers (nodes). It automatically distributes the data and the query load across all of the available nodes.
Fast: Elasticsearch is very fast. It can search documents in near real time, usually returning results in less than a second. Note that there is a slight latency between the time when the documents are indexed and when the…