Probably the best-known logging pipeline is the Elasticsearch-Logstash-Kibana (ELK) stack. We used this pipeline in our own projects until last year, but we have since moved to Elasticsearch-Filebeat-Kibana, removing Logstash.
1. Why?
- The main use case of Logstash is sending logs from a source to multiple destinations, for example from the application to both Elasticsearch and Apache Kafka. In our case, Elasticsearch was the only place where we sent and stored logs from the inputs.
- It is difficult for developers to create and run filters on the logs they produce. For security reasons, server credentials are held only by the infrastructure team, so developers cannot connect to the Logstash servers and add new filters on their own.
- An unresolved Logstash error. One morning we noticed that the log count was lower than normal: Logstash could not send logs to Elasticsearch for network-related reasons. We still don't understand what the problem was. When we connected to the Logstash server, we verified that Elasticsearch was reachable over the network, yet Logstash kept logging connection-type errors. When we restarted Logstash, the problem was gone. The catch: unless you run Logstash with its persistent queue, you lose the in-memory logs on restart (see the snippet below). This poses a risk in an environment where logs are important.
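For reference, the persistent queue is enabled in logstash.yml; we were running with the default in-memory queue. A minimal sketch, with an illustrative size limit:

# logstash.yml: keep in-flight events on disk so they survive a restart
queue.type: persisted
# cap the disk space the queue may use (illustrative value)
queue.max_bytes: 1gb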
2. Removing Logstash
Application -> Logstash -> Elasticsearch
The new pipeline no longer includes Logstash. It is replaced by Filebeat:
Application -> File <- Filebeat -> Elasticsearch
The first question that comes to mind: have we given up the features we used in Logstash? It depends on your use case. There were two issues we were concerned about.
Backpressure
The most important feature of Logstash is that it checks on Elasticsearch's load before sending it the inputs. While ES can normally receive 100 (an arbitrary number) documents per second, it may only be able to take 10 per second while heavy queries are running on it. In such cases, instead of sending 100 documents, Logstash sends 10 and does not tire ES. If it sent all 100 and there were no retry mechanism, the remaining 90 would be lost. This is called backpressure in Logstash.
Filebeat also provides this feature.
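Because Filebeat tails files and records its read position in a registry, events it cannot deliver yet simply stay in the file and are picked up later. The retry and backoff behaviour of the Elasticsearch output can also be tuned. A minimal sketch, with option names from the Filebeat reference and illustrative values:

# filebeat.yml: tune how Filebeat retries when Elasticsearch pushes back
output.elasticsearch:
  hosts: ["https://es-host:9200"]
  bulk_max_size: 50    # events per bulk request
  max_retries: 3       # retry attempts before an event is dropped
  backoff.init: 1s     # wait after the first failure
  backoff.max: 60s     # upper bound of the exponential backoff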
Ingest Nodes
In Logstash, we used to format inputs with filters. This made the queries we ran through Kibana faster. In some cases we even reduced storage costs by dropping logs that were not useful to us.
We delegated these tasks to ingest nodes, which run built-in processors on Elasticsearch itself. This way, developers can easily create and modify their own filters.
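A minimal sketch of such a pipeline, created through Kibana Dev Tools; the pipeline name my-app-logs, the DEBUG-dropping rule, and the added field are hypothetical examples:

PUT _ingest/pipeline/my-app-logs
{
  "description": "Drop logs that are not useful to us and enrich the rest",
  "processors": [
    { "drop": { "if": "ctx.log?.level == 'DEBUG'" } },
    { "set": { "field": "service.environment", "value": "production" } }
  ]
}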
The name of the ingest pipeline you create should be specified as the pipeline setting in Filebeat's Elasticsearch output. This way, logs reaching Elasticsearch are first processed by this pipeline. (https://www.elastic.co/guide/en/beats/filebeat/current/configuring-ingest-node.html)
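In filebeat.yml this looks roughly as follows, using the hypothetical pipeline name from the sketch above:

output.elasticsearch:
  hosts: ["https://es-host:9200"]
  pipeline: "my-app-logs"   # logs are routed through this ingest pipeline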
3. Adding Filebeat
When using Logstash, the logs generated by the application must reach Logstash in an appropriate format; LogstashEncoder is used for this. When Logstash is not used, the logs generated by the application should reach ES in an appropriate format. This is done with ECS.
ECS (Elastic Common Schema)
The Elastic Common Schema (ECS) defines a common set of fields for ingesting data into Elasticsearch. A common schema helps you correlate data from sources like logs and metrics or IT operations analytics and security analytics. (https://github.com/elastic/ecs) (For example, logback-ecs-encoder for Spring Boot. https://github.com/elastic/ecs-logging-java/tree/master/logback-ecs-encoder)
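For a Spring Boot application using Logback, wiring the encoder looks roughly like this; the file path, service name, and rollover policy are illustrative:

<!-- logback-spring.xml: write ECS-formatted JSON logs to a file for Filebeat -->
<configuration>
  <appender name="ECS_FILE" class="ch.qos.logback.core.rolling.RollingFileAppender">
    <file>/var/log/myapp/app.log</file>
    <rollingPolicy class="ch.qos.logback.core.rolling.TimeBasedRollingPolicy">
      <fileNamePattern>/var/log/myapp/app.log.%d{yyyy-MM-dd}</fileNamePattern>
      <maxHistory>7</maxHistory>
    </rollingPolicy>
    <!-- EcsEncoder comes from the logback-ecs-encoder dependency -->
    <encoder class="co.elastic.logging.logback.EcsEncoder">
      <serviceName>my-app</serviceName>
    </encoder>
  </appender>
  <root level="INFO">
    <appender-ref ref="ECS_FILE"/>
  </root>
</configuration>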
- The application will write the logs to a file using ECS.
- Filebeat will read the file (no extra action needed) and send it to Elasticsearch.
- Logs arriving at Elasticsearch will be processed by the ingest nodes, as in the configuration sketch below.
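A minimal filebeat.yml covering these steps might look like the following; paths and names are illustrative, and the JSON decoding shown here can alternatively be left to the ingest pipeline:

filebeat.inputs:
- type: log
  paths:
    - /var/log/myapp/*.log
  # the ECS encoder writes one JSON object per line;
  # decode it so the ECS fields land at the top level of the event
  json.keys_under_root: true
  json.overwrite_keys: true

output.elasticsearch:
  hosts: ["https://es-host:9200"]
  pipeline: "my-app-logs"   # hypothetical ingest pipeline from section 2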
4. Result
In our own projects, alongside the ELK stack, we also used Beats to collect access logs (Filebeat) and server metrics (Metricbeat).
By removing Logstash, we eliminated the server cost and the time spent running and maintaining it. We also got rid of the unresolved error I mentioned at the beginning.