In the past, most business data was managed in structured operational databases. But under pressure from sales and marketing, companies have been looking for new sources of information. As a result, messaging, social networks and the Internet are increasingly consulted to enrich an organization’s knowledge, not to mention geolocation data, statistics and graphs. These new media require specific collection and analysis tools.
Open source engine
In the early 2010s,
Shay Banon, the founder of the Compass Project, wrote his own version of
Compass to provide a distributed solution, built in Java, that exchanges JSON
(JavaScript Object Notation) through HTTP requests. That made the search engine
usable with any programming language. In 2012, he set up the ElasticSearch
company, together with Steven Schuurman.
In practice, ElasticSearch is a NoSQL database
that searches any type of document using text-based indexing, all through a
REST interface, eliminating the need for different types of search mechanisms.
The database has an adaptable architecture and allows you to perform searches
in real time and at very high speed. The solution indexes the full text of each
document, not just selected metadata. ElasticSearch relies on Apache Lucene, an
open source search library, and is able to store large volumes of documents
that can be analyzed in (near) real time.
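To make the REST interface concrete, here is a minimal Python sketch of a full-text search request. It only builds the request and does not send it. The address `localhost:9200`, the index name `articles` and the field `title` are assumptions made for illustration; the `_search` endpoint and the `match` query belong to the standard query DSL.

```python
import json
from urllib.request import Request

# Assumption: a node is reachable at this address.
BASE_URL = "http://localhost:9200"


def build_search_request(index: str, field: str, text: str) -> Request:
    """Build (but do not send) a full-text search request on one field."""
    # A "match" query analyzes the text and searches the indexed terms.
    body = {"query": {"match": {field: text}}}
    return Request(
        url=f"{BASE_URL}/{index}/_search",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical index and field names, for illustration only.
request = build_search_request("articles", "title", "distributed search")
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
```

Against a running cluster, the request could be sent with `urllib.request.urlopen(request)`. Because the exchange is plain JSON over HTTP, the same call can be made from any language with an HTTP client.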
ElasticSearch is distributed across several servers, which makes it possible to route requests, run processes in parallel, replicate data to guard against failure and increase the system’s indexing capacity. In fact, the stored data is spread over several nodes.
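The idea of spreading and replicating data over nodes can be illustrated with a toy Python model. This is a simplified sketch, not the engine's actual allocation logic: the CRC32 hash is a stand-in for the real routing hash, the round-robin placement is an assumption, and the node names are hypothetical.

```python
import zlib
from typing import Dict, List


def pick_shard(routing_key: str, num_primary_shards: int) -> int:
    """Map a routing key (here: a document id) to a primary shard number."""
    # CRC32 is a stand-in; the real engine uses its own hash function.
    return zlib.crc32(routing_key.encode("utf-8")) % num_primary_shards


def place_copies(num_primary_shards: int, num_replicas: int,
                 nodes: List[str]) -> Dict[int, List[str]]:
    """Place each shard's primary and replicas on distinct nodes, round-robin.

    The first node in each list holds the primary, the others hold replicas.
    """
    copies_per_shard = num_replicas + 1
    if copies_per_shard > len(nodes):
        # Simplification: this sketch refuses layouts it cannot fully place.
        raise ValueError("not enough nodes to keep all copies on distinct nodes")
    layout = {}
    for shard in range(num_primary_shards):
        layout[shard] = [nodes[(shard + i) % len(nodes)]
                         for i in range(copies_per_shard)]
    return layout


# Hypothetical three-node cluster with 3 primary shards and 1 replica each.
layout = place_copies(3, 1, ["node-a", "node-b", "node-c"])
for shard, holders in layout.items():
    print(f"shard {shard}: primary on {holders[0]}, replica on {holders[1]}")
print("doc-42 is routed to shard", pick_shard("doc-42", 3))
```

In this model, losing any single node still leaves one copy of every shard available, which is the point of keeping replicas on separate machines.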
Strictly speaking, ElasticSearch is not a search
system that connects to data and displays the results, but rather a
scalable (thanks to its distribution over several machines) and easily managed
backbone that supports REST calls.
In addition to its scalability (hence its name), ElasticSearch is particularly fast: it can analyze billions of records in a few seconds. ElasticSearch is also multilingual, which is a significant asset in a country like Belgium. And there’s more. The ‘completion suggester’ offers relevant results while the user is still typing the query, improving the accuracy of the search. Finally, ElasticSearch does not require index, type or field definitions before indexing starts. When an object with a new property is indexed later on, that property is automatically added to the mapping definitions.
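The automatic extension of the mapping can be pictured with a small Python sketch. It is a deliberately simplified model, assuming only a handful of type names (`text`, `long`, `float`, `boolean`, `object`); the real engine applies richer rules, and the document fields used here are hypothetical.

```python
from typing import Any, Dict, List


def infer_type(value: Any) -> str:
    """Guess a field type from a JSON value (simplified rules)."""
    # bool must be tested before int: in Python, bool is a subclass of int.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value in this sketch: {value!r}")


def update_mapping(mapping: Dict[str, str], document: Dict[str, Any]) -> List[str]:
    """Add fields the mapping has not seen yet; return the names that were added."""
    added = []
    for field, value in document.items():
        if field not in mapping:
            mapping[field] = infer_type(value)
            added.append(field)
    return added


mapping: Dict[str, str] = {}
first = update_mapping(mapping, {"title": "Annual report", "views": 3})
second = update_mapping(mapping, {"title": "Update", "published": True})
print(first, second, mapping)
```

The second document introduces only one unseen property, so only `published` is added; the existing definitions for `title` and `views` are left untouched.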
Applications
At a time when
companies store and analyze more and more unstructured data, especially from
messaging and from social networks such as Facebook and Twitter, a suitable
search engine is needed. ElasticSearch’s scalability and ability to search
data in near real time are important assets. What’s more, the product has proven
to be stable and user-friendly.
Moreover, in the era of big data, ElasticSearch can be positioned as a complement to solutions like Hadoop or Solr, especially considering its ease of use and ability to support very large volumes of data (up to petabytes). In this case, ElasticSearch can be used as the front end of the Hadoop framework.
At the same time, ElasticSearch can be used for many other purposes: logging and log analysis, especially in combination with a dashboarding and analysis tool; collecting and combining public data (tweets or hashtags); full-text search; the collection and analysis of event and measurement data; and data visualization, for example in combination with Kibana.
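For the logging use case, events are typically sent in batches. The following Python sketch prepares a payload in the newline-delimited JSON format expected by the `_bulk` endpoint: one action line followed by one document line per event. The index name `app-logs` and the event fields are assumptions chosen for illustration, and the payload is only built, not sent.

```python
import json
from typing import Any, Dict, List


def to_bulk_body(index: str, events: List[Dict[str, Any]]) -> str:
    """Serialize events as newline-delimited JSON for a bulk indexing request."""
    lines = []
    for event in events:
        # Action line: index the document that follows into `index`.
        lines.append(json.dumps({"index": {"_index": index}}))
        # Source line: the event itself.
        lines.append(json.dumps(event))
    # The bulk format requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"


# Hypothetical log events.
events = [
    {"level": "ERROR", "service": "checkout", "message": "payment timeout"},
    {"level": "INFO", "service": "search", "message": "query served"},
]
body = to_bulk_body("app-logs", events)
print(body, end="")
```

Such a body would be posted to `/_bulk` with the content type `application/x-ndjson`; a dashboarding tool such as Kibana can then query the resulting index.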
Implementing ElasticSearch, however, requires not only technological knowledge, but also a perfect understanding of the customer’s business. Who else but your trusted IT partner can understand your strategy, needs and goals? Aprico Consultants’ ElasticSearch team of experts helps, guides, facilitates and coordinates such projects to enable faster implementation, increase efficiency and reduce project costs.
For more information: marketing@aprico-consult.com