Heading towards Central Log System

Mar 24, 2021

If you work in IT, you follow logs every day as a developer or tester, and if you are working on multiple things, you have to check logs on multiple servers. Suppose searching for one error in a log file takes roughly 3–4 minutes. How long does that add up to over a day, or over a month?

That is what motivated me: as the DevOps engineer of my office, it was my responsibility to save my developers’ time. So I came up with a Central Log System. At that point I had two options, Logstash or Fluentd, and I went with Fluentd.

Requirements:

  • Fluentd (Central Server)
  • Elasticsearch (Central Server)
  • Kibana (Central Server)
  • Open ports for Kibana and ES
  • Fluentd (td-agent) (Log Servers)
  • Metricbeat (Log Servers)
  • Heartbeat (Log Servers, not necessary)

Why Fluentd? Fluentd vs Logstash?

  • Fluentd is a lightweight tool to collect and filter logs. It is built from two components (CORE and PLUGIN).
  • In Fluentd, routing is based on tags, which makes it easier to filter logs by tag.
  • Fluentd has a built-in alert mechanism in its CORE component, so we don’t have to add an extra plugin to set up alerting for our system.

Setting Up Application

Setting Up Central Server First:

Setting Up Fluentd for Central Server:

In the Fluentd config file (td-agent.conf), we provide details about the source it listens on and about the destination we want to send the data to. In our case we want to send all the data to Elasticsearch (ES), so we provide the Elasticsearch connection details.
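
For reference, a minimal td-agent.conf for the central server could look like the sketch below. The hostnames, port, and shared key are placeholders, and it assumes the fluent-plugin-secure-forward (input) and fluent-plugin-elasticsearch (output) plugins are installed:

# /etc/td-agent/td-agent.conf on the central server (a minimal sketch)
<source>
  @type secure_forward                  # input from fluent-plugin-secure-forward
  shared_key  my_shared_key             # must match the log servers
  self_hostname central.example.com     # placeholder hostname
  secure no                             # switch to yes with certificates in production
  port 24284
</source>

<match **>
  @type elasticsearch                   # output from fluent-plugin-elasticsearch
  host 127.0.0.1
  port 9200
  logstash_format true                  # writes daily logstash-YYYY.MM.DD indices
</match>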

Restart fluentd.

Setting Up ES and Kibana

Allow remote access to both Elasticsearch and Kibana.
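
Concretely, that means binding both services to a non-loopback address. A minimal sketch is below; 0.0.0.0 opens them on all interfaces, so in practice you may want to restrict access to your office network or a security group:

# /etc/elasticsearch/elasticsearch.yml
network.host: 0.0.0.0
http.port: 9200

# /etc/kibana/kibana.yml
server.host: "0.0.0.0"
elasticsearch.hosts: ["http://localhost:9200"]   # elasticsearch.url on older Kibana versions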

Restart ES

Restart Kibana

Setting Up Log Server:

Setting Up Fluentd for Log Server:

The Fluentd config file basically has two parts, <source></source> and <match></match>. In <source>, we write the configuration for the logs we want to fetch and how to parse and filter them.

Keywords: path, tag, log_level.

  • Tag is used to distinguish between the logs.
  • Path is the path of the file whose logs we are reading.
  • Log_level is how verbose we need the logs to be.

We also have to filter the logs in the <source> part. Fluentd provides many inbuilt parsers and filters for common applications. To parse custom application logs we’ll use a grok pattern, as in the sketch below.
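
As an illustration, a <source> block on a log server could look like this. The path, tag, and grok pattern are placeholders for a hypothetical application log, and grok parsing assumes the fluent-plugin-grok-parser plugin is installed:

<source>
  @type tail                            # follow the file like `tail -f`
  path /var/log/myapp/application.log   # hypothetical application log
  pos_file /var/log/td-agent/myapp.pos  # remembers the read position across restarts
  tag myapp.application                 # tag used for routing in <match>
  <parse>
    @type grok
    grok_pattern %{TIMESTAMP_ISO8601:time} %{LOGLEVEL:level} %{GREEDYDATA:message}
  </parse>
</source>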

In <match>, we use the secure_forward method to send data to the central server’s Fluentd. We need to install the secure_forward plugin to use this method. It can be installed with:

/usr/sbin/td-agent-gem install fluent-plugin-secure-forward

We have to provide a shared_key value, and it must be the same as the shared_key configured in the central server’s Fluentd.

To send the data to the central server, we enter the central server’s details in a <server> tag.
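
Putting it together, the forwarding <match> block on a log server could look like this sketch (the hostnames and shared_key are placeholders and must match the central server’s config):

<match myapp.**>
  @type secure_forward
  shared_key  my_shared_key             # same value as on the central server
  self_hostname logserver1.example.com  # placeholder hostname of this log server
  secure no
  <server>
    host central.example.com            # central Fluentd server
    port 24284
  </server>
</match>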

Restart fluentd

Setting Up Metricbeat for Log Server:

  • Enter the Kibana hostname in metricbeat.yml.
  • Enter the Elasticsearch hostname in metricbeat.yml (see the sketch below).
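
In metricbeat.yml those two settings look roughly like this (the hostnames are placeholders for the central server):

setup.kibana:
  host: "central.example.com:5601"

output.elasticsearch:
  hosts: ["central.example.com:9200"]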

Restart metricbeat

Setting Up Email Alert for 5xx Error

To catch server-related errors, we need to monitor the web server log. In our case we are using Nginx, so we will read the nginx access log to get the response code of every request hitting the server. If it contains a 5xx response, it triggers an email saying the server is down.

We add a <source> block in the Fluentd config file of the log server that reads the nginx access log. We parse the Nginx log using an inbuilt Fluentd parser and route those records under a separate tag (let us say nginx.access).

Now, from the nginx.access tag, we add a regular expression that matches all 5xx responses and set the threshold to 1, so the rule fires on the very first 5xx response. The matching records are re-emitted under another tag (let us say error.5xx).

Whenever an event with the nginx.access.error.5xx tag is created, it triggers an email. For the email service we used AWS SES (Simple Email Service).
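
One way to wire this up is sketched below. It assumes Fluentd’s built-in nginx parser plus the fluent-plugin-grepcounter and fluent-plugin-mail plugins, and all paths, hostnames, credentials, and addresses are placeholders (fluent-plugin-mail can talk to AWS SES through its SMTP endpoint). The resulting tag ordering in this sketch is error.5xx.nginx.access, since grepcounter adds the prefix in front:

<source>
  @type tail
  path /var/log/nginx/access.log
  pos_file /var/log/td-agent/nginx-access.pos
  tag nginx.access
  <parse>
    @type nginx                         # inbuilt nginx access-log parser
  </parse>
</source>

<match nginx.access>
  @type grepcounter
  count_interval 60                     # evaluate once per minute
  input_key code                        # HTTP status field from the nginx parser
  regexp ^5\d\d$                        # any 5xx response
  threshold 1                           # alert on the first occurrence
  add_tag_prefix error.5xx              # re-emits the count as error.5xx.nginx.access
</match>

<match error.5xx.**>
  @type mail
  host email-smtp.us-east-1.amazonaws.com   # AWS SES SMTP endpoint (placeholder region)
  port 587
  user YOUR_SES_SMTP_USERNAME
  password YOUR_SES_SMTP_PASSWORD
  enable_starttls_auto true
  from alerts@example.com
  to devs@example.com
  subject Server is Down - 5xx responses from nginx
</match>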

Setting Up Dashboard for Metricbeat

Please make sure all services are running on both the servers.

metricbeat setup --dashboards

Introducing Kafka (server log → Kafka → ES)

I introduced Kafka to avoid any data loss from any server. Kafka is added in between Fluentd/Metricbeat and Elasticsearch, so from now on, all data that was going directly to ES goes through Kafka.

To set up Kafka, I used Confluent directly. It provides all the required services for Kafka (you could call it a Kafka package). And to connect Kafka to Elasticsearch, I used the Elasticsearch sink connector.
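
For example, the central Fluentd output can be re-pointed from Elasticsearch to Kafka with the fluent-plugin-kafka output; the broker address and topic below are placeholders. Metricbeat has an equivalent output.kafka section with hosts and topic settings.

<match **>
  @type kafka2
  brokers kafka.example.com:9092        # Kafka broker started by Confluent
  default_topic fluentd-logs            # topic the sink connector will read
  <format>
    @type json
  </format>
</match>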

So, the Kafka stack is started with confluent start.

And to establish the connection between Kafka and ES:

/usr/bin/connect-standalone worker.properties filesource.properties
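
The second properties file describes the Elasticsearch sink connector itself. A sketch of what it might contain (the connector name, topic, and URL are placeholders):

# Elasticsearch sink connector properties (sketch)
name=elasticsearch-sink
connector.class=io.confluent.connect.elasticsearch.ElasticsearchSinkConnector
tasks.max=1
topics=fluentd-logs
connection.url=http://localhost:9200
type.name=_doc
key.ignore=true
schema.ignore=true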

Happy Logging :)
