Last year, I did a research project called "Multi-Cloud Container using SDN Approach", in which I built my own cross-cloud platform for the 3rd level infrastructure team to deploy containers on the fly. The project made me learn a lot of new things! I picked up Ansible, a bit of Terraform, AWS, containers, etc...
In the project, I dealt with lots of containers, and I had to maintain their logs in a centralized place!
So, in this blog, I am gonna cover why we need to centralize all the logs and show you how to achieve it.
WHY?
Let's say we have deployed tons of containers and they are generating logs. Visiting each container is not practical, and once containers are stopped or killed and replaced with new ones, the old logs are gone.
Note: logging is essential in any application to know what happened to the containers and to troubleshoot issues.
To view a container's logs, the docker logs <ContainerId> command is used.
Container logs are stored under /var/lib/docker/containers/<containerid>.
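For example, something like this (assuming the default json-file logging driver, with <ContainerId> being the full container ID):
# follow the logs of a running container
docker logs -f <ContainerId>
# the same logs live on the host as a JSON file
sudo cat /var/lib/docker/containers/<ContainerId>/<ContainerId>-json.log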
So, the solution is to centralize the logs in a single place. This is where the Elastic (ELK) Stack comes in.
Let me give a small brief about the ELK stack to connect some dots.
E - Elasticsearch: a JSON-based NoSQL database that stores and indexes the data (logs). It runs on port 9200 by default (you can edit this in /etc/elasticsearch/elasticsearch.yml).
L - Logstash: a data collection pipeline, which collects data from various inputs (shippers, e.g. Filebeat) and filters & parses it.
K - Kibana: a data visualization tool, which presents the logs for easy analysis. It runs on port 5601 (you can edit this in /etc/kibana/kibana.yml).
Beats: these are known as shippers, lightweight agents that ship logs to the ELK stack. (I used Filebeat.)
You've almost got the point, so let's move on to my setup and build on the idea.
My setup has two t2.micro (free tier) AWS EC2 instances running Ubuntu 18.04. The 1st instance runs the webapp (as a container) and the 2nd instance runs the ELK stack.
The webapp container's logs will be collected via Filebeat, running on the 1st instance as a container. Filebeat has to be installed on each Docker host machine (we will be using a custom Filebeat Dockerfile for this, which will be explained later in this blog).
Download my filebeat container repo here.
Our goal is to send the logs to Logstash, so in my repo, please update the filebeat.yml file, which contains the Filebeat configuration.
Jump to line 23, which has the output.logstash configuration, and update the hosts field with the 2nd instance's IP.
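For reference, that part of filebeat.yml should look roughly like this (the IP below is a placeholder for your 2nd instance, and 5044 is the port Logstash will listen on later):
output.logstash:
  hosts: ["<2nd-instance-ip>:5044"]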
Note: To learn more about the Filebeat Docker configuration parameters, look here. As per the linked document, you can decode the JSON of the log field and map each field (such as timestamp, version, message, logger_name, …) to an indexed Elasticsearch field. This decoding and mapping is the transform done by the Filebeat processor decode_json_fields, which helps you build your own filebeat.yml.
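As a rough illustration only (the field to decode depends on how your input is configured; "log" is assumed here because that is where the Docker json-file driver puts the container's output), such a processor block might look like:
processors:
  - decode_json_fields:
      fields: ["log"]
      target: ""
      overwrite_keys: true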
After that, you can create your own Filebeat Docker image by using the Dockerfile in the repo.
Build the image from the Dockerfile with the docker build command, passing an image name:
docker build -t <imagename> .
Once the image is built, run it with a persistent volume to store the Filebeat data, and bind-mount the Docker logs directory and the Docker socket so Filebeat can access the containers from inside:
docker run -v filebeat_data:/usr/share/filebeat/data:rw -v '/var/lib/docker/containers:/usr/share/dockerlogs/data:ro' -v '/var/run/docker.sock:/var/run/docker.sock' --name filebeat <imagename>:latest
With the above docker run command, three volume parameters are passed: /var/lib/docker/containers is the path where the Docker logs exist, as mentioned earlier, and it is bound to the /usr/share/dockerlogs/data path inside the Filebeat container with read-only access.
/var/run/docker.sock is bind-mounted into the Filebeat container. It is the Unix socket the Docker daemon listens on by default, and it can be used to communicate with the daemon from within a container. This allows our Filebeat container to obtain Docker metadata, enrich the container log entries with that metadata, and push them to the ELK stack.
The filebeat_data volume is created as a Docker volume to persist the Filebeat container's state; /usr/share/filebeat/data is bound to it with read-write permission.
Now we are done with the 1st instance; let's move to the 2nd instance to configure the ELK stack.
We can do this by simply installing Docker Compose, checking out the awesome deviantony/docker-elk repo, and running docker-compose up -d.
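In short, something like this (assuming Docker and Docker Compose are already installed on the 2nd instance):
git clone https://github.com/deviantony/docker-elk.git
cd docker-elk
docker-compose up -d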
Note: make sure your logstash.conf file is properly configured to listen for incoming Beats logs on port 5044 and that the logs are being properly pushed to the Elasticsearch host. Also, make sure to add an index parameter to the Elasticsearch output so the logs generated by Filebeat can be identified uniquely. The logstash.conf lives in the repo at docker-elk/logstash/pipeline.
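As a rough sketch only (the exact hosts, credentials, and index name depend on your docker-elk version and your own preferences), the pipeline could look like:
input {
  beats {
    port => 5044
  }
}

output {
  elasticsearch {
    hosts => "elasticsearch:9200"
    index => "filebeat-%{+YYYY.MM.dd}"
  }
}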
Once everything is done, access your Kibana dashboard on port 5601. Under the Management tab, you can create an index pattern for the Filebeat logs. This has to be done before you can view the logs on the Kibana dashboard.
If your containers are pushing logs properly into Elasticsearch via Logstash, and you have successfully created the index pattern, you can go to the Discover tab on the Kibana dashboard and view your Docker container application logs, along with the Docker metadata, under the filebeat* index pattern.
That's pretty much it for today, PEACE!
Thank you