Transform logging from ELK to EFK

The best-known logging pipeline is probably the Elasticsearch-Logstash-Kibana (ELK) stack. We used it in our own projects until last year, but we have since switched to Elasticsearch-Filebeat-Kibana (EFK), removing Logstash.

1. Why?

  1. It is difficult for developers to create and run filters on the logs they produce. For security reasons, server credentials are restricted to the infrastructure team, so developers cannot connect to the Logstash servers and add new filters on their own.
  2. An unresolved Logstash error. One morning we noticed that the log count was lower than normal: Logstash could not send logs to Elasticsearch, apparently for network reasons. We still don't understand what the problem was. When we connected to the Logstash server, we verified that Elasticsearch was reachable over the network, yet Logstash kept throwing connection-type errors. When we restarted Logstash, the problem went away. However, if you are not running Logstash with its persistent queue, you lose any buffered logs on restart (see the snippet below). This is a real risk in an environment where logs matter.
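For reference, the persistent queue is enabled in logstash.yml. A minimal sketch with illustrative values (these are assumptions, not the settings we ran):

  # logstash.yml -- buffer in-flight events on disk so a restart does not drop them
  queue.type: persisted                 # default is "memory"
  queue.max_bytes: 1gb                  # cap on disk space used by the queue
  path.queue: /var/lib/logstash/queue   # illustrative path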

2. Removing Logstash

Application -> Logstash -> Elasticsearch

The new pipeline will no longer have Logstash; Filebeat takes its place.

Application -> File <- Filebeat -> Elasticsearch

The first question that comes to mind: have we given up the features we used in Logstash? That depends on your use case. There were two capabilities we were concerned about.

Backpressure

Logstash's most important feature is that it adapts its output rate to what Elasticsearch can currently handle. While Elasticsearch might normally accept, say, 100 events per second, it may only manage 10 per second while heavy queries are running against it. In such cases Logstash sends 10 events instead of 100, without overloading Elasticsearch. If it sent all 100 and there were no retry mechanism, the remaining 90 would be lost. This is what Logstash calls backpressure.

Filebeat also provides this feature.
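Filebeat only advances its file offsets once the output acknowledges the events, so when Elasticsearch slows down, Filebeat simply reads more slowly instead of dropping data. The internal queue that mediates this can be tuned in filebeat.yml; a sketch with illustrative values (assumptions, not our production settings):

  # filebeat.yml -- in-memory queue between the inputs and the output
  queue.mem:
    events: 4096           # maximum events buffered in memory
    flush.min_events: 512  # batch size handed to the Elasticsearch output
    flush.timeout: 5s      # flush a partial batch after this long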

Ingest Nodes

In Logstash, we used to format incoming events with filters. This made queries run through Kibana faster. In some cases, we even reduced storage costs by dropping logs that were of no use to us.

We delegated these tasks to ingest pipelines, which run on Elasticsearch's built-in ingest nodes. This way, developers can easily create and modify their own filters.
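As an illustration, a small ingest pipeline created from the Kibana Dev Tools console might look like the following. The pipeline name, fields, and condition are hypothetical; the grok and drop processors stand in for the kind of parsing and filtering we previously did in Logstash:

  PUT _ingest/pipeline/my-app-logs
  {
    "description": "Parse application logs and drop health-check noise",
    "processors": [
      {
        "grok": {
          "field": "message",
          "patterns": ["%{LOGLEVEL:log.level} %{GREEDYDATA:message}"]
        }
      },
      {
        "drop": {
          "if": "ctx.url?.path == '/health'"
        }
      }
    ]
  }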

Ingest node management on Kibana

The name of the created ingest pipeline should be specified as the pipeline setting in Filebeat's Elasticsearch output. This way, logs reaching Elasticsearch are first processed by that pipeline. (https://www.elastic.co/guide/en/beats/filebeat/current/configuring-ingest-node.html)
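The relevant fragment of filebeat.yml might look like this, assuming the hypothetical pipeline name from the sketch above:

  # filebeat.yml -- route events through the ingest pipeline on arrival
  output.elasticsearch:
    hosts: ["elasticsearch:9200"]  # placeholder host
    pipeline: my-app-logs          # name of the ingest pipeline to run first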

3. Adding Filebeat

ECS (Elastic Common Schema)

The Elastic Common Schema (ECS) defines a common set of fields for ingesting data into Elasticsearch. A common schema helps you correlate data from sources like logs and metrics or IT operations analytics and security analytics. (https://github.com/elastic/ecs) (For example, logback-ecs-encoder for Spring Boot. https://github.com/elastic/ecs-logging-java/tree/master/logback-ecs-encoder)

Example Spring Logback setting for ECS
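A minimal sketch of what this might look like after adding the co.elastic.logging:logback-ecs-encoder dependency; the service name and file paths are placeholders:

  <!-- logback-spring.xml: write ECS-formatted JSON logs to a file -->
  <configuration>
    <appender name="ECS_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
      <file>/var/log/my-app/my-app.json</file>
      <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
        <fileNamePattern>/var/log/my-app/my-app.%d{yyyy-MM-dd}.json</fileNamePattern>
        <maxHistory>7</maxHistory>
      </rollingPolicy>
      <encoder class="co.elastic.logging.logback.EcsEncoder">
        <serviceName>my-app</serviceName>
      </encoder>
    </appender>
    <root level="INFO">
      <appender-ref ref="ECS_FILE"/>
    </root>
  </configuration>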
  • The application will write its logs to a file in ECS format.
  • Filebeat will read the file (no extra parsing needed) and send it to Elasticsearch (see the sketch after this list).
  • Logs arriving at Elasticsearch will be processed by the ingest pipeline.
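A minimal filebeat.yml input to go with this; the path matches the hypothetical Logback configuration above, and the json.* options are assumptions about how you want the ECS JSON fields handled:

  # filebeat.yml -- tail the ECS-formatted JSON log file
  filebeat.inputs:
    - type: log
      paths:
        - /var/log/my-app/*.json
      json.keys_under_root: true  # promote the parsed JSON fields to the event root
      json.overwrite_keys: true   # let fields from the log (e.g. @timestamp) win on conflict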

4. Result

By removing Logstash, we eliminated the cost of its servers and the time spent running and maintaining it. We also got rid of the unresolved error I mentioned at the beginning.
