Log aggregation in AWS – part 2 – keeping your index under control

This is the second part in the series, following on from /log-aggregation-aws-part-1/

Hopefully by this point you’ve got Kibana up and running, gathering the logs from each of your desired CloudWatch log groups. Over time the amount of data stored in the index will keep growing, so we need to keep things under control.

Here is a good view of the issue. We introduced our cleanup Lambda on the 30th; if we hadn’t, I reckon we’d have had about two more days before the disks ran out. The oscillating pattern from the 31st onward is exactly what we want to see: indices older than 10 days being deleted every day.

Initially this was done via a scheduled task on a box we host. It worked, but wasn’t ideal: it relied on that box being up, on user credentials, and more. AWS Lambda seemed a much better fit for keeping the index under control.

Getting set up

Luckily you don’t need to set up much for this. One Lambda function, a trigger and some role permissions and you should be up and running.

  1. Create a new Lambda function based on the script shown below
  2. Add two environment variables:
    1. daysToKeep=10
    2. endpoint=your Elasticsearch endpoint e.g. search-###-###.eu-west-1.es.amazonaws.com
  3. Create a new role as part of the setup process
    1. Note: roles can then be found in the IAM section of AWS, e.g. https://console.aws.amazon.com/iam/home?region=eu-west-1#/roles
    2. Update the role to allow Get and Delete access to your index (see the example policy after this list)
  4. Set up a trigger (in CloudWatch -> Events -> Rules)
    1. Here you can set how frequently the function runs, e.g. a cron expression of 0 2 * * ? * will run it at 2am every night
  5. Test your function: you can always run it on demand and then check whether the old indices have been removed
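The policy attached to that role needs to grant HTTP Get and Delete access to your Elasticsearch domain. Something along these lines should do it (a sketch only; swap the account ID and domain name placeholders for your own):

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "es:ESHttpGet",
        "es:ESHttpDelete"
      ],
      "Resource": "arn:aws:es:eu-west-1:YOUR_ACCOUNT_ID:domain/YOUR_DOMAIN/*"
    }
  ]
}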

And finally the lambda code:
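A minimal sketch of the function is below (Node.js, using the aws-sdk request signing available in the Lambda runtime; the exact parsing and the signedRequest helper are illustrative rather than a definitive implementation):

var AWS = require("aws-sdk");

var endpoint = process.env.endpoint;                        // e.g. search-###-###.eu-west-1.es.amazonaws.com
var daysToKeep = parseInt(process.env.daysToKeep, 10) || 10;

exports.handler = function (event, context, callback) {
  // Ask Elasticsearch for every index it currently holds
  signedRequest("GET", "/_cat/indices", function (err, body) {
    if (err) { return callback(err); }

    var cutoff = new Date();
    cutoff.setDate(cutoff.getDate() - daysToKeep);

    // Pick out any cwl-YYYY.MM.dd indices older than the cutoff
    var toDelete = [];
    body.split("\n").forEach(function (row) {
      var match = row.match(/cwl-(\d{4})\.(\d{2})\.(\d{2})/);
      if (!match) { return; }
      var indexDate = new Date(+match[1], +match[2] - 1, +match[3]);
      if (indexDate < cutoff) { toDelete.push(match[0]); }
    });

    if (toDelete.length === 0) { return callback(null, "nothing to delete"); }

    // Issue a delete for each old index and finish once they have all completed
    var remaining = toDelete.length;
    toDelete.forEach(function (index) {
      signedRequest("DELETE", "/" + index, function (delErr) {
        if (delErr) { console.log("Failed to delete " + index, delErr); }
        if (--remaining === 0) { callback(null, "deleted " + toDelete.length + " indices"); }
      });
    });
  });
};

// Signs a request with the Lambda role's credentials and sends it to the domain
function signedRequest(method, path, done) {
  var req = new AWS.HttpRequest(new AWS.Endpoint(endpoint), "eu-west-1");
  req.method = method;
  req.path = path;
  req.region = "eu-west-1";
  req.headers["Host"] = endpoint;

  var signer = new AWS.Signers.V4(req, "es");
  signer.addAuthorization(new AWS.EnvironmentCredentials("AWS"), new Date());

  new AWS.HttpClient().handleRequest(req, null, function (response) {
    var body = "";
    response.on("data", function (chunk) { body += chunk; });
    response.on("end", function () { done(null, body); });
  }, function (reqErr) { done(reqErr); });
}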

Note: if you are running in a different region you will need to tweak the req.region = "eu-west-1"; line.

How does it work?

Elasticsearch lets you list every index via the URL /_cat/indices. The Lambda function makes a web request to this URL, parses each row and finds any indices matching the pattern cwl-YYYY.MM.dd. If an index is found that is older than daysToKeep, a delete request is issued to Elasticsearch.
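For example, the rows returned by /_cat/indices look roughly like the following (the exact columns vary between Elasticsearch versions, and the figures here are purely illustrative), and removing an old index is then a single delete against its name:

GET https://search-###-###.eu-west-1.es.amazonaws.com/_cat/indices

green open cwl-2017.07.21 5 1 1203456 0 1.2gb 620mb
green open cwl-2017.07.22 5 1 1187204 0 1.1gb 590mb
...

DELETE https://search-###-###.eu-west-1.es.amazonaws.com/cwl-2017.07.21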

Was this the best option?

There are tools available for cleaning up old indices, including one Elastic themselves provide: https://github.com/elastic/curator. However, Curator needs an additional box to run on, hence the choice to keep it all wrapped in a simple Lambda.
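For comparison, if you did have somewhere to run it, a Curator action file for the same job would look something like this (a rough sketch, not something taken from our setup; the filter values are a guess at matching the cwl- indices):

actions:
  1:
    action: delete_indices
    description: Delete cwl- indices older than 10 days
    options:
      ignore_empty_list: True
    filters:
    - filtertype: pattern
      kind: prefix
      value: cwl-
    - filtertype: age
      source: name
      direction: older
      timestring: '%Y.%m.%d'
      unit: days
      unit_count: 10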

Happy indexing!


