Introduction
Elasticsearch is a distributed, open-source search and analytics engine built on top of Apache Lucene. It is designed to handle large volumes of data and to provide fast, efficient, near real-time search.
Elasticsearch is a document-oriented database that stores data in JSON format. It is often described as schema-less: you can index and search data without defining a rigid structure up front, because field types are inferred automatically (dynamic mapping). However, you can define explicit mappings to optimize how data is indexed and searched.
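As an illustration of an explicit mapping, here is a minimal sketch of what the body of a create-index request might look like, written as a Python dictionary. The field names (`title`, `tags`, `published_on`, `views`) are made up for this example. Only the field types (`text`, `keyword`, `date`, `integer`) are standard Elasticsearch types.

```python
import json

# Hypothetical fields for a blog-post index; only the field types
# are real Elasticsearch types, the names are assumptions.
index_body = {
    "mappings": {
        "properties": {
            "title": {"type": "text"},         # analyzed for full-text search
            "tags": {"type": "keyword"},       # exact-match values, not analyzed
            "published_on": {"type": "date"},
            "views": {"type": "integer"},
        }
    }
}

# The JSON that would be sent as the body of a create-index request.
index_body_json = json.dumps(index_body, indent=2)
print(index_body_json)
```

Fields left out of the mapping can still be indexed. Elasticsearch would then guess their types from the first document that contains them.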
Major Advantages
- Full-Text search
- Elasticsearch excels at full-text search, allowing you to run complex queries such as phrase searches and relevance-ranked searches. It uses ranking algorithms such as TF-IDF and BM25 to order search results by relevance.
- Real-Time analytics
- If you want insights into your data, Elasticsearch provides real-time data analysis. You can import, index, and query data in near real time.
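To make the relevance ranking mentioned under full-text search more concrete, here is a minimal sketch of the textbook BM25 formula in plain Python. It is an illustration only, not the code Lucene or Elasticsearch actually runs. The function name `bm25_scores`, the whitespace tokenization, and the sample documents are assumptions made for this example. The defaults `k1=1.2` and `b=0.75` match the default BM25 settings in Elasticsearch.

```python
import math
from collections import Counter

def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with textbook BM25."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(toks) for toks in tokenized) / n_docs
    # Document frequency: number of documents containing each term.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))

    scores = []
    for toks in tokenized:
        term_freq = Counter(toks)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            # Term frequency saturates (k1) and is normalized by length (b).
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores

docs = [
    "The fox jumps over the fence",
    "The dog sleeps near the fence",
    "A fox and another fox play",
]
scores = bm25_scores("fox", docs)
ranked = sorted(zip(scores, docs), reverse=True)
for score, doc in ranked:
    print(f"{score:.3f}  {doc}")
```

In this sketch the third document ranks first because it mentions "fox" twice, and the second document scores zero because it does not contain the term at all.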
A brief on how Elasticsearch works
- Index
- Data is stored in indices. An index is like a container that holds multiple documents, each of which is a JSON object.
- Tokenization
- The input document text is then broken down into individual words known as tokens. Example: “The Fox jumps over the fence” is broken into the tokens {“The”, “Fox”, “jumps”, “over”, “the”, “fence”}.
- Pre-processing the data
- Before the tokens are used for searching, some filtering needs to be done, such as lowercasing them so that matching is case-insensitive: if we are searching for fox, the hits should include every document that contains FOX, Fox, etc.
- Search
- Elasticsearch uses an inverted index to quickly locate relevant documents. The hits are ranked by relevance and returned in order, with the most relevant content at the top and the least relevant at the bottom.
- Fine Tuning
- You can also fine-tune the search ranking by customizing some values, so that you can build a search engine suited to your use cases.
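The tokenization, pre-processing, and search steps above can be sketched in a few lines of plain Python. This is a toy model under simplifying assumptions, not how Elasticsearch is implemented: the helper names (`analyze`, `build_inverted_index`, `search`) are invented for this example, tokens are split on word characters with a regular expression, a query matches only documents containing every query token, and no relevance ranking is applied.

```python
import re
from collections import defaultdict

def analyze(text):
    """Tokenize on word characters, then lowercase each token."""
    return [token.lower() for token in re.findall(r"\w+", text)]

def build_inverted_index(docs):
    """Map each token to the set of document IDs that contain it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in analyze(text):
            index[token].add(doc_id)
    return index

def search(index, query):
    """Return IDs of documents containing every query token."""
    token_sets = [index.get(token, set()) for token in analyze(query)]
    if not token_sets:
        return set()
    return set.intersection(*token_sets)

docs = {
    1: "The Fox jumps over the fence",
    2: "A FOX sleeps near the barn",
    3: "The dog barks at the fence",
}
index = build_inverted_index(docs)
print(analyze(docs[1]))
print(sorted(search(index, "fox")))
print(sorted(search(index, "Fox fence")))
```

Because the same `analyze` function is applied to both the documents and the query, searching for fox finds documents 1 and 2 even though they were written as Fox and FOX. Looking up a token is a single dictionary access, which is why an inverted index avoids scanning every document.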
Getting Started With Elasticsearch
- You can refer to this GitHub repository to get started with Elasticsearch. It is a very simple and useful tutorial made by the official Elasticsearch developers, especially for beginners. Link
- To run Elasticsearch locally with Kibana, you can refer to this GitHub repository. Link
- Since Elasticsearch doesn’t support analyzing the Tibetan language, you can refer to this GitHub repository made by @Elie_Roux, which provides a custom analyzer built specifically for Tibetan.
Conclusion
Elasticsearch is a powerful tool for search and analytics, offering speed, flexibility, and scalability. Whether you’re building a search engine, analyzing logs, or adding a search tool to your website, Elasticsearch can help you unlock the full potential of your data.