Customizing logging in a C# dotnetcore AWS Lambda function

November 27, 2019 by boro

One challenge we hit recently was how to build our dotnetcore Lambda functions in a consistent way, and in particular how to approach logging.

A pattern we’ve adopted is to write the core functionality for our functions so that it’s as easy to run from a console app as it is from a lambda. The lambda can then be considered only as the entry point to the functionality.

Serverless Dependency injection

I am sure there are different schools of thought on whether you should use a container within a serverless function. For this post, the design assumes you do make use of the Microsoft DependencyInjection libraries.

Setting up your projects

Based on the design mentioned above, i.e. you can run the functionality as easily from a console app as you can from a lambda, I often set up the following projects:

Project.ActualFunctionality (e.g. SnsDemo.Publisher)
Project.ActualFunctionality.ConsoleApp (e.g. SnsDemo.Publisher.ConsoleApp)
Project.ActualFunctionality.Lambda (e.g. SnsDemo.Publisher.Lambda)

The actual functionality lives in the first project and is shared with both other projects. Dependency injection and AWS profiles are used to run the functionality locally.

The actual functionality

Let’s assume the functionality for your function does something simple, like pushing messages into an SQS queue:

using System;
using Amazon.SQS;
using Amazon.SQS.Model;
using Microsoft.Extensions.Logging;

public class SqsSender
{
    private readonly IAmazonSQS _amazonSQS;
    private readonly ILogger<SqsSender> _logger;

    public SqsSender(IAmazonSQS amazonSQS, ILogger<SqsSender> logger)
    {
        _amazonSQS = amazonSQS;
        _logger = logger;
    }

    public void SendMessage()
    {
        var message = new SendMessageRequest
        {
            QueueUrl = "...",
            MessageBody = $"Message {Guid.NewGuid()}"
        };

        // Block until the send completes; GetAwaiter().GetResult() rethrows the
        // original exception instead of wrapping it in an AggregateException
        _amazonSQS.SendMessageAsync(message).GetAwaiter().GetResult();

        _logger.LogInformation("_logger Messages sent");

        Console.WriteLine("Console Message(s) sent");
    }
}

The console app version

It’s pretty simple to get DI working in a dotnetcore console app:

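static void Main(string[] args)
{
    // appsettings.json is optional and is reloaded if it changes
    IConfiguration config = new ConfigurationBuilder()
        .AddJsonFile("appsettings.json", optional: true, reloadOnChange: true)
        .Build();

    // Disposing the provider flushes the console logger before the process exits
    using (var serviceProvider = new ServiceCollection()
        .AddSingleton(config)
        .AddSingleton<SqsSender>()
        .AddLogging(a =>
        {
            a.AddConsole();
        })
        .AddAWSService<IAmazonSQS>()
        .BuildServiceProvider())
    {
        // GetRequiredService throws if SqsSender isn't registered, rather than returning null
        serviceProvider.GetRequiredService<SqsSender>().SendMessage();
    }
}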

The lambda version

This looks very similar to the console version:

public string FunctionHandler(object input, ILambdaContext context)
{
    IConfiguration config = new ConfigurationBuilder()
            .AddJsonFile("lambdasettings.json", optional: true, reloadOnChange: true)
            .Build();

    // The container is built per invocation because the log provider wraps
    // this invocation's context.Logger
    var serviceProvider = new ServiceCollection()
        .AddSingleton(config)
        .AddSingleton<SqsSender>()
        .AddLogging(a => a.AddProvider(new CustomLambdaLogProvider(context.Logger)))
        .AddAWSService<IAmazonSQS>()
        .BuildServiceProvider();

    // GetRequiredService throws if SqsSender isn't registered, rather than returning null
    serviceProvider.GetRequiredService<SqsSender>().SendMessage();

    return "...";
}

The really interesting bit to take note of is: .AddLogging(a => a.AddProvider(new CustomLambdaLogProvider(context.Logger)))

In the actual functionality we can log in many ways:

_logger.LogInformation("_logger Messages sent");

Console.WriteLine("Console Message(s) sent");

To make things lambda agnostic I’d argue injecting ILogger<Type> and then calling _logger.LogInformation("_logger Messages sent"); is the preferred option.
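
One nice side effect of this choice is testability. As a rough sketch (Greeter and ListLogger are made-up names for illustration, they are not part of the demo code), a class that only depends on ILogger<T> can be exercised with a tiny in-memory logger, with no Lambda or console involved:

```csharp
using System;
using System.Collections.Generic;
using Microsoft.Extensions.Logging;

// Hypothetical stand-in for a class such as SqsSender: it only knows about ILogger<T>.
public class Greeter
{
    private readonly ILogger<Greeter> _logger;

    public Greeter(ILogger<Greeter> logger)
    {
        _logger = logger;
    }

    public void Greet(string name)
    {
        _logger.LogInformation("Hello {Name}", name);
    }
}

// Hypothetical test double: records every formatted message in a list.
public class ListLogger<T> : ILogger<T>
{
    public List<string> Messages { get; } = new List<string>();

    // Scopes are ignored in this sketch
    public IDisposable BeginScope<TState>(TState state)
    {
        return null;
    }

    public bool IsEnabled(LogLevel logLevel)
    {
        return true;
    }

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state, Exception exception, Func<TState, Exception, string> formatter)
    {
        Messages.Add($"{logLevel}: {formatter(state, exception)}");
    }
}
```

Creating a ListLogger<Greeter>, passing it to new Greeter(...) and calling Greet("boro") leaves a single entry in Messages: "Information: Hello boro". The same Greeter would work unchanged with the console logger or the custom Lambda provider.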

Customizing the logger

It’s simple to customize the dotnetcore logging framework. For this demo I set up two things: the CustomLambdaLogProvider and the CustomLambdaLogger.

internal class CustomLambdaLogProvider : ILoggerProvider
{
    private readonly ILambdaLogger _logger;

    private readonly ConcurrentDictionary<string, CustomLambdaLogger> _loggers = new ConcurrentDictionary<string, CustomLambdaLogger>();

    public CustomLambdaLogProvider(ILambdaLogger logger)
    {
        _logger = logger;
    }

    public ILogger CreateLogger(string categoryName)
    {
        return _loggers.GetOrAdd(categoryName, a => new CustomLambdaLogger(a, _logger));
    }

    public void Dispose()
    {
        _loggers.Clear();
    }
}

And finally a basic version of the actual logger:

internal class CustomLambdaLogger : ILogger
{
    private readonly string _categoryName;
    private readonly ILambdaLogger _lambdaLogger;

    public CustomLambdaLogger(string categoryName, ILambdaLogger lambdaLogger)
    {
        _categoryName = categoryName;
        _lambdaLogger = lambdaLogger;
    }

    public IDisposable BeginScope<TState>(TState state)
    {
        // Scopes aren't supported in this basic version
        return null;
    }

    public bool IsEnabled(LogLevel logLevel)
    {
        //todo - add logic around filtering log messages if desired
        return true;
    }

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state, Exception exception, Func<TState, Exception, string> formatter)
    {
        if (!IsEnabled(logLevel))
        {
            return;
        }

        // The formatter may ignore the exception, so append it explicitly when there is one
        var message = formatter(state, exception);
        if (exception != null)
        {
            message += Environment.NewLine + exception;
        }

        _lambdaLogger.LogLine($"{logLevel} - {_categoryName} - {message}");
    }
}
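
The todo in IsEnabled could be filled in along these lines. This is only a sketch: FilteringLambdaLogger, ReadMinimumLevel and the idea of passing in a minimum level are my assumptions for illustration, not something the demo project contains.

```csharp
using System;
using Amazon.Lambda.Core;
using Microsoft.Extensions.Logging;

// Hypothetical variant of CustomLambdaLogger that drops messages below a minimum level.
internal class FilteringLambdaLogger : ILogger
{
    private readonly string _categoryName;
    private readonly ILambdaLogger _lambdaLogger;
    private readonly LogLevel _minimumLevel;

    public FilteringLambdaLogger(string categoryName, ILambdaLogger lambdaLogger, LogLevel minimumLevel)
    {
        _categoryName = categoryName;
        _lambdaLogger = lambdaLogger;
        _minimumLevel = minimumLevel;
    }

    // Assumption: the raw value comes from configuration or an environment variable of your choosing.
    // Anything that isn't a LogLevel name falls back to Information.
    public static LogLevel ReadMinimumLevel(string rawValue)
    {
        LogLevel level;
        return Enum.TryParse(rawValue, true, out level) ? level : LogLevel.Information;
    }

    public IDisposable BeginScope<TState>(TState state)
    {
        return null;
    }

    public bool IsEnabled(LogLevel logLevel)
    {
        // LogLevel.None means "log nothing", so it is never enabled
        return logLevel != LogLevel.None && logLevel >= _minimumLevel;
    }

    public void Log<TState>(LogLevel logLevel, EventId eventId, TState state, Exception exception, Func<TState, Exception, string> formatter)
    {
        if (!IsEnabled(logLevel))
        {
            return;
        }

        _lambdaLogger.LogLine($"{logLevel} - {_categoryName} - {formatter(state, exception)}");
    }
}
```

The provider would then need to pass the chosen level through when it creates loggers. Keeping the parsing in one static helper means the fallback behaviour is easy to check on its own.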

Summary

The aim here is to keep your application code agnostic to where it runs. Using dependency injection we can share core logic between any ‘runner’, e.g. Lambda functions, Azure functions, console apps, you name it.

With some small tweaks to the lambda logging calls you can ensure the out-of-the-box lambda logger is still used under the hood, while your implementation code can inject things like ILogger<T> wherever needed 🙂

Automating a multi region deployment with Azure Devops

October 18, 2019 by boro

For a recent project we’ve invested a lot of time in Azure Devops, and for the most part found it a very useful toolset for deploying our code to both Azure and AWS.

When we started on this process, YAML pipelines weren’t available for our source code provider, which meant everything had to be set up manually 🙁

However, recently this has changed 🙂 This post will run through a few ways you can optimize your release process and automate the whole thing.

First a bit of background and then some actual code examples.

Why YAML?

Setting up your pipelines via the UI is a really good way to quickly prototype things. However, what if you need to change these pipelines so that deployment features evolve alongside code features? YAML allows you to keep the pipeline definition in the same codebase as the actual features. You deploy branch XXX and it can be configured differently to branch YYY.

Another benefit is that pipeline changes are then visible in your pull requests, so validating them is a lot easier.

Async Jobs

A big optimization we gained was releasing to different regions in parallel. YAML makes this very easy by using jobs: each job can run on its own agent, so you can push to multiple regions in parallel.

https://docs.microsoft.com/en-us/azure/devops/pipelines/process/phases?view=azure-devops&tabs=yaml

Yaml file templates

If you have common functionality you want to duplicate, e.g. ‘Deploy to Eu-West-1’, templates are a good way to split your functionality. They allow you to group logical functionality you want to run multiple times.

https://docs.microsoft.com/en-us/azure/devops/pipelines/process/templates?view=azure-devops

Azure Devops rest API

All of your builds/releases can be triggered via the UI portal; however, if you want to automate that process I’d suggest looking into the REST API. With it you can trigger, monitor and administer builds, releases and a whole load more.

We use PowerShell to orchestrate the process.

https://docs.microsoft.com/en-us/rest/api/azure/devops/build/builds/queue?view=azure-devops-rest-5.1
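
Every call to the REST API has to be authenticated. With a personal access token, Azure Devops accepts HTTP Basic auth with an empty user name and the token as the password. Our own orchestration is PowerShell, but as a minimal C# sketch of just the header (DevOpsAuth and BuildPatHeader are names I made up for illustration):

```csharp
using System;
using System.Net.Http.Headers;
using System.Text;

// Hypothetical helper: turns a personal access token into an Authorization header.
public static class DevOpsAuth
{
    public static AuthenticationHeaderValue BuildPatHeader(string personalAccessToken)
    {
        if (string.IsNullOrEmpty(personalAccessToken))
        {
            throw new ArgumentException("A personal access token is required.", nameof(personalAccessToken));
        }

        // Basic auth is base64("user:password"); the user name is left empty for a PAT
        var raw = Encoding.ASCII.GetBytes($":{personalAccessToken}");

        return new AuthenticationHeaderValue("Basic", Convert.ToBase64String(raw));
    }
}
```

The result can be assigned to HttpClient.DefaultRequestHeaders.Authorization before calling the builds endpoint. Keep the token itself out of source control.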

Variables, and variable groups

I have to confess, this syntax feels slightly cumbersome, but it’s very possible to reference variables passed into a specific pipeline along with global variables from groups you set up in the Library section of the portal.

Now, some examples

The root YAML file:

pr: none
trigger: none

variables:
- group: 'DataDog' # reference Variable groups if needed
- name : 'system.debug'
  value: true
- name : 'DynamicParameter' # these can be calculated off other variable values
  value: "name-$(EnvironmentName)-$(ColourName)"
- name: 'WebsiteFolder'
  value: 'Website/FolderName'

#- name: "EnvironmentName" # see the rest api example below for how to pass in variables
#  value: "Set externally"
#- name: "ColourName"
#  value: "Set externally"
#- name: "AwsCredentials"
#  value: "Set externally"

jobs:
- job: Build
  pool:
    vmImage: 'windows-2019' # vmImages: https://docs.microsoft.com/en-us/azure/devops/pipelines/agents/hosted?view=azure-devops#use-a-microsoft-hosted-agent
  steps:  
 
  - task: NuGetToolInstaller@0
    displayName: 'Use NuGet 4.4.1'
    inputs:
      versionSpec: 4.4.1    

  - task: NuGetCommand@2 # if using secure artifacts, you can download them into a dotnetcore project this way
    displayName: 'NuGet restore'
    inputs:
      restoreSolution: 'Website/###.sln'
      feedsToUse: config
      nugetConfigPath: Website/nuget.config

  - task: Npm@1
    displayName: 'NPM install'
    inputs:
      workingDir: '$(WebsiteFolder)'     
      verbose: false

  - task: Npm@1
    displayName: 'NPM build scss'
    inputs:
      workingDir: '$(WebsiteFolder)'     
      command: custom
      verbose: false
      customCommand: 'run scss-build'

  - task: DotNetCoreCLI@2
    displayName: 'dotnet publish'
    inputs:
      command: publish
      publishWebProjects: false
      projects: '$(WebsiteFolder)/Website.csproj'
      arguments: '--configuration Release --output $(Build.ArtifactStagingDirectory)\Website'
      zipAfterPublish: false  

  - task: PublishPipelineArtifact@0 # in order to share the common build with multiple releases you need to publish the artifact
    inputs:
      artifactName: "Website"
      targetPath: '$(Build.ArtifactStagingDirectory)'

- job: ReleaseEU
  pool:
    vmImage: 'windows-2019'
  dependsOn: Build # this job will only start once the 'Build' job above has completed
  steps:
  - template: TaskGroups/DeployToRegion.yaml # this template file is shown below
    parameters:
      AwsCredentials: '$(AwsCredentials)'
      RegionName: 'eu-west-1'      
      EnvironmentName: '$(EnvironmentName)'
      ColourName: '$(ColourName)'
      DatadogApiKey: '$(DatadogApiKey)' # referenced from a variable group      

- job: ReleaseRegionN # Will run in parallel with ReleaseEU if you have enough build agents
  pool:
    vmImage: 'windows-2019'
  dependsOn: Build
  steps:
  - template: TaskGroups/DeployToRegion.yaml # this template file is shown below
    parameters:
      AwsCredentials: '$(AwsCredentials)'
      RegionName: 'ANother region'      
      EnvironmentName: '$(EnvironmentName)'
      ColourName: '$(ColourName)'
      DatadogApiKey: '$(DatadogApiKey)' # referenced from a variable group

The ‘DeployToRegion’ template:

parameters:
  AwsCredentials: ''
  RegionName: ''  
  EnvironmentName: ''
  ColourName: ''
  DatadogApiKey: ''  

steps:
- task: DownloadPipelineArtifact@1 # you can download artifacts from other builds if needed
  inputs:
      buildType: 'specific'
      project: 'Project Name'
      pipeline: '##'
      buildVersionToDownload: 'latest'
      artifactName: 'Devops'
      targetPath: '$(System.ArtifactsDirectory)/Devops'

- task: DownloadPipelineArtifact@1 # or download from the current one
  inputs:
      buildType: 'current'
      artifactName: 'Website'
      targetPath: '$(System.ArtifactsDirectory)'

- template: DeployToElasticBeanstalk.yaml # and can chain templates if needed
  parameters:
      AwsCredentials: '${{ parameters.AwsCredentials }}'
      RegionName: '${{ parameters.RegionName }}'      
      EnvironmentName: '${{ parameters.EnvironmentName }}'
      ColourName: '${{ parameters.ColourName }}'
      DatadogApiKey: '${{ parameters.DatadogApiKey }}'

And finally some PowerShell to fire it all off:

### Example usage: .\TriggerBuild.ps1 -branch "release/release-006" -isReleaseCandidate $false -additionalReleaseParameters @{ "EnvironmentName" = "qa"; "ColourName" = "blue"; }

param (
    [Parameter(Mandatory = $true)][string]$branch,   
    [boolean]$isReleaseCandidate = $false,
    [HashTable]$additionalReleaseParameters = @{ }
)

$ErrorActionPreference = "Stop"

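# Get-DevOpsAuthToken and Get-DevOpsHeaders are custom helper functions that aren't shown in this post
$authToken = Get-DevOpsAuthToken # see https://docs.microsoft.com/en-us/azure/devops/organizations/accounts/use-personal-access-tokens-to-authenticate?view=azure-devops for how to get a token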
$accountName = "AzureDevopsAccountName" 
$projectName = "AzureDevopsProjectName"

$buildDefinitionIds = @(27) # the build pipeline id

Write-Host "Building with settings:"
Write-Host "Branch: '$branch'"
Write-Host "Tag as 'release-candidate' and retain build: $isReleaseCandidate"
Write-Host "Build definition IDs: $buildDefinitionIds"
Write-Host "Additional parameters: $($additionalReleaseParameters | ConvertTo-Json) "
Write-Host ""

$releaseIds = @()

$result = @{
    Success = $false;    
}

foreach ($definitionId in $buildDefinitionIds)
{
    $deploymentParams = @{
        "definition" = @{
            "id" = $definitionId;
        }
        "sourceBranch" = $branch;
    }

    if ($additionalReleaseParameters.Count -gt 0)
    {
        $deploymentParams.parameters = $additionalReleaseParameters | ConvertTo-Json
    }

    $content = (Invoke-WebRequest -uri "https://dev.azure.com/$accountName/$projectName/_apis/build/builds?api-version=4.1" `
        -ContentType "application/json" -Headers (Get-DevOpsHeaders -AuthToken $authToken) -Method POST -Body ($deploymentParams | ConvertTo-Json)).Content | ConvertFrom-Json

    $releaseIds += $content.id  

    Write-Host "Build $($content.id) queued: https://dev.azure.com/$accountName/$projectName/_build/results?buildId=$($content.id)" -ForegroundColor Yellow
}

$aBuildFailed = $false

foreach ($releaseId in $releaseIds)
{
    $status = ""

    while ($status -ne "completed")
    {
        try
        {
            $content = (Invoke-WebRequest -uri "https://dev.azure.com/$accountName/$projectName/_apis/build/builds/$releaseId" -Headers (Get-DevOpsHeaders -AuthToken $authToken)).Content | ConvertFrom-Json
        }
        catch
        {
            Write-Host "  Error calling DevopsAPI. If this happens several times check the url: https://dev.azure.com/$accountName/$projectName/_apis/build/builds/$releaseId" -ForegroundColor red
        }

        $status = $content.status

        Write-Host " Build id $releaseId has status: $status"

        if ($content.result -eq "failed" -or $content.result -eq "canceled")
        {
            $aBuildFailed = $true

            Write-Host "Build $releaseId failed - check https://dev.azure.com/$accountName/$projectName/_build/results?buildId=$releaseId for details" -ForegroundColor Red
        }
        elseif ($content.result -eq "succeeded") # 'completed' is a status value; the result of a good build is 'succeeded'
        {
            Write-Host "Build $releaseId completed successfully" -ForegroundColor Green
        }

        Start-Sleep -s 5
    }

    if ($isReleaseCandidate -eq $true)
    {
        Write-Host " Adding RC tags: release-candidate"
        $tags = (Invoke-WebRequest -uri "https://dev.azure.com/$accountName/$projectName/_apis/build/builds/$releaseId/tags/release-candidate?api-version=4.1" -Headers (Get-DevOpsHeaders -AuthToken $authToken) -Method PUT).Content | ConvertFrom-Json

        Write-Host " Adding retain build to $releaseId"
        $updates = (Invoke-WebRequest -uri "https://dev.azure.com/$accountName/$projectName/_apis/build/builds/$($releaseId)?api-version=4.1" -ContentType "application/json" -Headers (Get-DevOpsHeaders -AuthToken $authToken) -Method PATCH -Body (@{"retainedByRelease" = $true } | ConvertTo-Json)).Content | ConvertFrom-Json
    }
}

$result.Success = !$aBuildFailed

return $result

Happy deploying 🙂

AWS Serverless template – inline policies

August 3, 2018 by boro

If you’ve worked with AWS Serverless templates, you’ll appreciate how quickly you can deploy a raft of infrastructure with very little template code. The only flaw I’ve found so far is the documentation is a bit tricky to find.

Say you want to attach some custom policies to your function: you can simply embed them into your template. For example:

{
  "AWSTemplateFormatVersion": "2010-09-09",
  "Transform": "AWS::Serverless-2016-10-31",
  "Description": "An AWS Serverless Application.",
  "Resources": {
    "BackupTriggerGeneratorFunction": {
      "Type": "AWS::Serverless::Function",
      "Properties": {
        "Handler": "BackupTriggerGenerator::BackupTriggerGenerator.Functions::FunctionHandler",
        "Runtime": "dotnetcore2.0",
        "CodeUri": "",
        "MemorySize": 256,
        "Timeout": 30,
        "Environment": {
          "Variables": {
            "BucketName": "...",
            "FolderNames": "...",
            "FileName": "..."
          }
        },
        "Role": null,
        "Policies": [
          "AWSLambdaBasicExecutionRole",
          "AmazonS3ReadOnlyAccess",
          {
            "Version": "2012-10-17",
            "Statement": [
              {
                "Effect": "Allow",
                "Action": [
                  "s3:Put*"
                ],
                "Resource": [
                  "arn:aws:s3:::bucketname-*-*-*-1/Databases/*"
                ]
              }
            ]
          }
        ],
        "Events": {
          "Schedule": {
            "Type": "Schedule",
            "Properties": {
              "Schedule": "cron(30 1,3,5,7,9,11,13,15,17,19,21,23 * * ? *)"
            }
          }
        }
      }
    }
  }
}

This also shows a few other neat features:

Wildcards in the custom policy’s resource ARN, allowing it to work across multiple buckets
Cron triggered events
How to set environment variables from your template
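
On the function side, the variables set in the template arrive as ordinary environment variables. A minimal sketch of reading them follows. BackupSettings is a made-up class, and treating FolderNames as a comma-separated list is my assumption, since the template above only shows "...".

```csharp
using System;

// Hypothetical settings holder for the three variables declared in the template.
public class BackupSettings
{
    public string BucketName { get; }
    public string[] FolderNames { get; }
    public string FileName { get; }

    // The lookup is passed in so the class can be used without touching the real environment
    public BackupSettings(Func<string, string> readVariable)
    {
        BucketName = Require(readVariable, "BucketName");

        // Assumption: FolderNames holds a comma-separated list
        FolderNames = Require(readVariable, "FolderNames")
            .Split(new[] { ',' }, StringSplitOptions.RemoveEmptyEntries);

        FileName = Require(readVariable, "FileName");
    }

    // Reads the values Lambda exposes from the template's Environment section
    public static BackupSettings FromEnvironment()
    {
        return new BackupSettings(Environment.GetEnvironmentVariable);
    }

    private static string Require(Func<string, string> readVariable, string name)
    {
        var value = readVariable(name);

        if (string.IsNullOrEmpty(value))
        {
            throw new InvalidOperationException($"Environment variable '{name}' is not set.");
        }

        return value;
    }
}
```

Failing fast on a missing variable tends to give a clearer error in the logs than a null reference further down the line.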

Serving images through AWS Api Gateway from Serverless Lambda_proxy function

April 18, 2018 by boro

In another post I mentioned the neat features now available in AWS: Serverless templates.

As part of an experiment I thought it would be interesting to see how you could put a light security layer over S3, so that media requests only stream from S3 if a user is logged in.

The WebApi template already ships with an S3 proxy controller. Tweaking this slightly allowed me to stream or download the image:

        [HttpGet("{*key}")]
        public async Task Get(string key)
        {
            try
            {
                using (var getResponse = await S3Client.GetObjectAsync(new GetObjectRequest
                {
                    BucketName = BucketName,
                    Key = key
                }))
                {
                    Response.ContentType = getResponse.Headers.ContentType;

                    // Copy asynchronously so the request thread isn't blocked while streaming
                    await getResponse.ResponseStream.CopyToAsync(Response.Body);
                }
            }
            catch (AmazonS3Exception e)
            {
                Response.StatusCode = (int)e.StatusCode;

                // WriteAsync (Microsoft.AspNetCore.Http) writes the text straight to the body;
                // an undisposed StreamWriter could leave it sitting in its buffer
                await Response.WriteAsync(e.Message);
            }
        }

        [HttpGet("dl/{*key}")]
        public async Task<IActionResult> Download(string key)
        {
            try
            {
                var getResponse = await S3Client.GetObjectAsync(new GetObjectRequest
                {
                    BucketName = BucketName,
                    Key = key
                });

                // Returning a result requires Task<IActionResult> rather than plain Task
                return File(getResponse.ResponseStream, "application/octet-stream");
            }
            catch (AmazonS3Exception e)
            {
                Response.StatusCode = (int)e.StatusCode;
                await Response.WriteAsync(e.Message);
            }

            return new EmptyResult();
        }

The issue I ran into was accessing images via the endpoint you then get in API Gateway – images were getting encoded so wouldn’t render in the browser.

The solution:

In the settings for your API Gateway, add the Binary Media Type: */*
On the {proxy+} method’s ‘Method Response’, add a 200 response with a Content-Type header. Finally, publish (deploy) your API.

The second step here may not be necessary but I found the */* didn’t kick in until I made the change.

PUB SUB in AWS Lambda via SNS using C#

March 21, 2018 by boro

AWS Lambdas are a great replacement for things like Windows Services which need to run common tasks periodically. A few examples would be triggering scheduled backups or polling URLs.

You can set many things as the trigger for a lambda. For scheduled operations this can be a CRON value triggered from a CloudWatch event. Alternatively, lambdas can be triggered via a subscription to an SNS topic.

Depending on the type of operation you want to perform on a schedule you might find it takes longer than the timeout restriction imposed by AWS. If that’s the case then a simple PUB SUB (publisher, subscriber) configuration should help.

Sample scenario

We want to move database backups between 2 buckets in S3. There are several databases to copy, each of which is a large file.

In one lambda you can easily find all the files to copy, but if you also try to copy them all, at some point your function will time out.

Pub sub to the rescue

Why not set up 2 lambda functions? One as the publisher and one as the subscriber, glued together with an SNS (Simple Notification Service) topic.

The publisher

Typically this would be triggered from a schedule and would raise an event for each operation. Let’s assume we use a simple POCO for conveying the information we need:

class UrlRequestMessage
{
    public string[] Urls {get;set;}
}

public class Function
{
	public string FunctionHandler(object input, ILambdaContext context)
	{
                //This could gather urls to poll from a file, db or anywhere
		var urlsToScan = LoadUrls();

		var snsClient = new AmazonSimpleNotificationServiceClient();

		int batchSize = 4;

		context.Logger.LogLine($"Batch size: {batchSize}");

		foreach (var urlToScan in urlsToScan.Batch(batchSize))
		{
			snsClient.PublishAsync("topic arn e.g. arn:aws:sns:eu-west-1:98976####:UrlPoller_Topic",
				JsonConvert.SerializeObject(new UrlRequestMessage {Urls = urlToScan.ToArray()})).Wait();

			context.Logger.LogLine($"Raised event for urls: {String.Join(" | ", urlToScan)}");
		}

		return "ok";
	}
}

The batching can be ignored if needs be – in this scenario this allows multiple urls to be handled by one subscriber.
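The Batch call used in the publisher isn’t shown in the post, so here is a minimal sketch of the idea. It is written in JavaScript to match the Node lambdas used in the log aggregation posts rather than the C# above, and the batch function name and the example.com urls are purely illustrative assumptions.

```javascript
// Hypothetical helper: split an array into consecutive groups of at most `size` items.
function batch(items, size) {
    if (size < 1) throw new RangeError("size must be at least 1");
    var batches = [];
    for (var i = 0; i < items.length; i += size) {
        batches.push(items.slice(i, i + size));
    }
    return batches;
}

// Illustrative input only: five urls with a batch size of 4 gives two messages.
var urls = [
    "https://example.com/a",
    "https://example.com/b",
    "https://example.com/c",
    "https://example.com/d",
    "https://example.com/e"
];

// Each group becomes one message body, shaped like UrlRequestMessage.
var messages = batch(urls, 4).map(function (group) {
    return JSON.stringify({ Urls: group });
});

console.log(messages.length);
console.log(messages[1]);
```

Each entry in messages would then be published as one SNS message, so a single subscriber invocation handles up to four urls.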

The subscriber

Next we need to listen for the messages – you want to configure the subscriber function to have an SNS trigger that uses the same topic you posted to before.

public class Function
{
	public string FunctionHandler(SNSEvent message, ILambdaContext context)
	{
		foreach (var record in message.Records)
		{
			var decodedMessage = JsonConvert.DeserializeObject<UrlRequestMessage>(record.Sns.Message);

			foreach (var url in decodedMessage.Urls)
			{
                                //here you just need to implement your logic for polling a url
                                // e.g. var result = new WebClient().DownloadStringTaskAsync(url).Result;
				var requestSummary = RequestUrl(url, context.Logger);				
			}
		}

		return "OK!";
	}
}

Debugging things
You can either run each function on demand and see any output directly in the Lambda test window, or dig into your cloudwatch logs for each function.

AWS Lambda now supports Serverless applications including WebApi

February 16, 2018 by boro

One of the most exciting areas that I’ve seen emerging in the Cloud space recently is Serverless computing. Both AWS and Azure have their own flavour: AWS Lambda and Azure Functions.

An intro into Serverless

It really does what it says on the tin. You can run code but without dedicated infrastructure that you host. A good example is when building Alexa Skills.

You create an AWS Lambda function, in whichever of the supported languages you prefer, and then deploy it into the cloud. Whenever someone uses your skill the lambda gets invoked and returns the content you need.

Behind the scenes AWS host your function in a container. If it receives traffic the container remains hot; if it doesn’t receive traffic it’s ‘frozen’. There is a very good description of this at https://medium.com/@tjholowaychuk/aws-lambda-lifecycle-and-in-memory-caching-c9cd0844e072

Your language of choice

AWS Lambda supports a raft of languages: Python, Node, Java, .net core and others. Recently this has been upgraded so that it supports .net core 2.

Doing the legwork

With a basic lambda function you can concoct different handlers (methods) which respond to requests. This allows one lambda to service several endpoints. However, you need to do quite a lot of wiring and it doesn’t feel quite like normal WebApi programming.

Enter the serverless applications

This came right out of the blue, but was very cool – Amazon released some starter kits that allow you to run both RazorPage and WebApi applications in Lambdas! https://aws.amazon.com/blogs/developer/serverless-asp-net-core-2-0-applications/

Woah, you can write normal WebApi and deploy into a lambda. That is big.

Quick, migrate all the things

So I tried this. And for the most part everything worked pretty seamlessly. All the code I’d already written easily mapped into WebApi controllers I could then run locally. Tick.

Deploying was simple, either via Visual Studio or the dotnet lambda tools. Tick.

Using the serverless.template that ships with the starter pack, it even set up my API Gateway. Tick.

Dependency injection that’s inherently available in .net core all worked. Tick.

WebApi attribute routing all works. Tick.

So far so good right 🙂

What I haven’t quite cracked yet?

In my original deployment (pre WebApi) I was using API level caching over a couple of specific endpoints. This was path based as it was for specific methods. The new API Gateway deployment directs all traffic to a /{proxy+} url in order to route any request to the routing in your WebApi. If you turn caching on here, it’s a bit of a race: whichever url is hit first will fill the cache for all requests. Untick!

Startup errors don’t always bubble up very well when debugging locally. I have a feeling this isn’t anything Amazon related but it is worth being aware of. E.g. if you mess up your DI, it takes some ctor null debugging to find the cause. Untick.

Summary

I was hugely impressed with WebApi integration. Once the chinks in the path based caching at the API Gateway can get ironed out I’d consider this a very good option for handling API requests.

Watch this space 🙂

Copying large files between S3 buckets

January 30, 2018 by boro

There are many different scenarios you might face where you need to migrate data between S3 buckets and folders. Depending on the use case you have several options for the language to select.

Lambdas – this could be Python, Java, JavaScript or C#
Bespoke code – again, this could be any language you select. To keep things different from above, let’s add Powershell to the mix

Behind the scenes a lot of these SDKs call into common endpoints Amazon host. As a user you don’t need to delve too deeply into the specific endpoints unless you really need to.

Back to the issue at hand – copying large files
Amazon impose a limit of roughly 5GB on regular copy operations. If you use the standard copy operations you will probably hit exceptions when the file sizes grow.

The solution is to use the multipart copy. It sounds complex but all the code is provided for you:

Python
This is probably the easiest to do as the boto3 library already does this. Rather than using copy_object, the copy function already handles multipart uploads: http://boto3.readthedocs.io/en/latest/reference/services/s3.html#S3.Client.copy

C#
The C# implementation is slightly more complex, however Amazon provide good worked examples: https://docs.aws.amazon.com/AmazonS3/latest/dev/CopyingObjctsUsingLLNetMPUapi.html

Powershell
A close mimic of the C# implementation – someone has ported the C# example into powershell: https://stackoverflow.com/a/32635525/1065332
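To give a feel for what the multipart copy does under the hood, here is a minimal sketch of the range arithmetic, assuming each part is copied with its own "bytes=first-last" source range. The buildCopyRanges name is hypothetical, and the tiny sizes are only there to keep the output readable; real parts are much larger (S3 expects every part except the last to be at least 5 MB). The libraries linked above do all of this for you.

```javascript
// Hypothetical helper: split an object of totalBytes into inclusive byte ranges
// of at most partBytes each, numbered from 1 as multipart parts are.
function buildCopyRanges(totalBytes, partBytes) {
    if (totalBytes <= 0 || partBytes <= 0) {
        throw new RangeError("sizes must be positive");
    }
    var ranges = [];
    for (var start = 0, part = 1; start < totalBytes; start += partBytes, part++) {
        // The end of the range is inclusive, hence the - 1.
        var end = Math.min(start + partBytes, totalBytes) - 1;
        ranges.push({ partNumber: part, range: "bytes=" + start + "-" + end });
    }
    return ranges;
}

// Illustrative sizes only: a 25 byte object copied in 10 byte parts.
console.log(buildCopyRanges(25, 10));
```

The last part simply takes whatever is left over, which is why it is allowed to be smaller than the others.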

Happy copying!

AlexaCore – a c# diversion into writing Alexa skills

September 11, 2017 by boro

Following the recent Amazon Prime day, I thought it was time to jump on the home assistant bandwagon – £80 seemed a pretty good deal for an Alexa Echo.

If you’ve not tried writing Alexa skills there are some really good blog posts to help get started at: http://timheuer.com/blog/archive/2016/12/12/amazon-alexa-skill-using-c-sharp-dotnet-core.aspx

Skills can be underpinned by an AWS lambda function. In experimenting with writing these I’ve started putting together some helpers which remove a lot of the boiler-plate code needed for C# Alexa Lambda functions, including some fluent tools for running unit tests.

The code and some examples are available at https://github.com/boro2g/AlexaCore. Hopefully they will help get your skills off the ground!

Log aggregation in AWS – part 3 – enriching the data sent into ElasticSearch

June 9, 2017 by boro

This is the third, and last part of the series that details how to aggregate all your log data within AWS. See Part 1 and Part 2 for getting started and keeping control of the size of your index.

By this point you should have a self sufficient ElasticSearch domain running that pools logs from all the CloudWatch log groups that have been configured with the correct subscriber.

The final step is to enrich the data being sent into the index.

By default AWS will set you up a lambda function that extracts information from the CloudWatch event. It will contain things like: instanceId, event timestamp, the source log group and a few more. This is handled in the lambda via:

var source = buildSource(logEvent.message, logEvent.extractedFields);
source['@id'] = logEvent.id;
source['@timestamp'] = new Date(1 * logEvent.timestamp).toISOString();
source['@message'] = logEvent.message;
source['@owner'] = payload.owner;
source['@log_group'] = payload.logGroup;
source['@log_stream'] = payload.logStream;

Note, a tip around handling numeric fields – in order for ElasticSearch to treat fields as numbers rather than strings you can multiply the value by 1, e.g.: source['@fieldName'] = 1 * value;
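To see why the multiplication matters, here is a minimal sketch. Values extracted from a log line arrive as strings, and JSON.stringify keeps them quoted unless they are coerced first. The toNumericField helper and the field names are hypothetical, they aren’t part of the script AWS generates.

```javascript
// Hypothetical helper: coerce an extracted log value to a number when possible.
function toNumericField(value) {
    var coerced = 1 * value; // the same trick as in the tip above
    // 1 * "abc" is NaN, so fall back to the original value in that case.
    return Number.isNaN(coerced) ? value : coerced;
}

var source = {};
source['@duration_raw'] = "125";              // stays a string in the JSON
source['@duration'] = toNumericField("125");  // becomes a number in the JSON
source['@status'] = toNumericField("OK");     // not numeric, left untouched

console.log(JSON.stringify(source));
// prints {"@duration_raw":"125","@duration":125,"@status":"OK"}
```

The unquoted 125 is what lets ElasticSearch map the field as a number when it first sees it.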

What to enrich the data with?

This kind of depends on your use case. As we were aggregating logs from a wide range of boxes, applications and services, we wanted to enrich the data in the index with the tags applied to each instance. This sounds simple in principle but needed some planning around how to access the tags for each log entry – I’m not sure AWS would look kindly on making 500,000 API requests in 15 minutes!

Lambda and caching

Lambda functions are a very interesting offering and I’m sure you will start to see a lot more use cases for them over the next few years. One challenge they bring is they are stateless – in our scenario we need to provide a way of querying an instance for its tags based on its InstanceId. Enter DynamoDb, another AWS service that provides scalable key-value pair storage.

Amazon define Dynamo as: Amazon DynamoDB is a fully managed non-relational database service that provides fast and predictable performance with seamless scalability.

Our solution

There were 2 key steps to the process:

Updating dynamo to gather tag information from our running instances
Updating the lambda script to pull the tags from dynamo as log entries are processed

1. Pushing instance data into Dynamo

Set up a lambda function that periodically scans all running instances in our account and pushes the respective details into Dynamo.

Setup a new Dynamo db table
1. Named: kibana-instanceLookup
2. Region: eu-west-1 (note, adjust this as you need)
3. Primary partition key: instanceId (of type string)
  1. Note – we will tweak the read capacity units once things are up and running – for production we average about 50
Setup a new lambda function
1. Named: LogsToElasticSearch_InstanceQueries_regionName
2. Add environment variable: region=eu-west-1
  1. Note, if you want this to pool logs from several regions into one dynamo setup a lambda function per region and set different environment variables for each. You can use the same trigger and role for each region
3. Use the script shown below
4. Set the execution timeout to be: 1 minute (note, tweak this if the function takes longer to run)
5. Create a new role and give the following permissions:
  1. AmazonEC2ReadOnlyAccess (assign the OTB policy)
  2. Plus add the following policy:
    {
    "Effect": "Allow",
    "Action": "dynamodb:*",
    "Resource": "arn:aws:dynamodb:eu-west-1:###:table/kibana-instanceLookup"
    }
    1. Note, the ### in the resource ARN needs to be your account id
6. Setup a trigger within Cloudwatch -> Rules
  1. To run hourly, set the cron to be: 0 * * * ? *
  2. Select the target to be your new lambdas
    1. Note, you can always test your lambda by simply running on demand with any test event

And the respective script:

var AWS = require("aws-sdk");

var tableName = 'kibana-instanceLookup';

exports.handler = function (input, context)
{
	AWS.config.update({ region: process.env.region });

	this._ec2 = new AWS.EC2();

	var queryParams = {
		MaxResults: 200
	};

	var instances = [];

	this._ec2.describeInstances(queryParams, function (err, data)
	{
		if (err) console.log(err, err.stack);
		else
		{
			data.Reservations.forEach((r) =>
			{
				r.Instances.forEach((i) =>
				{
					instances.push({ "instanceId": i.InstanceId, "tags": i.Tags });
				});
			});

			console.log(JSON.stringify(instances));

			instances.forEach((instance) =>
			{
				pushInstanceToDynamo(instance);
			});
		}
	});

	function pushInstanceToDynamo(instance)
	{
		AWS.config.update({ region: 'eu-west-1' });

		var params = {
			Key: {
				"instanceId": {
					S: instance.instanceId
				}
			},
			TableName: tableName
		};

		new AWS.DynamoDB().getItem(params, function (err, data)
		{
			if (err)
			{
				console.log(err, err.stack);
			}
			else
			{
				if (data.Item)
				{
					console.log("Item found in dynamo - not updating " + instance.instanceId);
				}
				else
				{
					var tags = {};
					tags.L = [];
					instance.tags.forEach((tag) =>
					{
						if (tag.Key.indexOf("aws:") === -1)
						{
							tags.L.push(buildArrayEntry(tag));
						}
					});

					var insertParams = {
						Item: {
							"instanceId": {
								S: instance.instanceId
							},
							"tags": tags
						},
						ReturnConsumedCapacity: "TOTAL",
						TableName: tableName
					};

					new AWS.DynamoDB().putItem(insertParams, function (err, data)
					{
						if (err) console.log(err, err.stack);
						else console.log(data);
					});
				}
			}
		});
	}

	function buildArrayEntry(tag)
	{
		var m = {};

		m[tag.Key] = { "S": tag.Value };

		return { "M": m };
	}
}

Note, if your dynamo runs in a different region to eu-west-1, update the first line of the pushInstanceToDynamo method and set the desired target region.
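The shape of the data written to Dynamo is the fiddly part of the script, so here is the tag conversion pulled out on its own as a minimal sketch. The buildTagsAttribute name and the sample tags are hypothetical, and the guard for a missing tag array is my assumption. The filtering and the list-of-maps layout follow the script above.

```javascript
// Hypothetical helper: convert EC2 tags ([{ Key, Value }]) into a DynamoDB
// list attribute, skipping any tag whose key contains "aws:".
function buildTagsAttribute(ec2Tags) {
    var entries = (ec2Tags || []) // assumption: treat missing tags as none
        .filter(function (tag) { return tag.Key.indexOf("aws:") === -1; })
        .map(function (tag) {
            var m = {};
            m[tag.Key] = { "S": tag.Value }; // every tag value is stored as a string
            return { "M": m };
        });
    return { "L": entries };
}

// Illustrative tags only.
var attribute = buildTagsAttribute([
    { Key: "Name", Value: "web-01" },
    { Key: "aws:autoscaling:groupName", Value: "asg-1" }
]);

console.log(JSON.stringify(attribute));
// prints {"L":[{"M":{"Name":{"S":"web-01"}}}]}
```

Storing each tag as a one-key map inside a list keeps the read side simple: the streaming lambda only needs to look at the first key of each entry.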

Running the function on demand should then fill your Dynamo table with data.

2. Querying dynamo when you process log entries

The final piece of the puzzle is to update the streaming function to query dynamo as required. This needs a few things:

Update the role used for the lambda that streams data from CloudWatch into ElasticSearch
{ "Effect": "Allow", "Action": "dynamodb:GetItem", "Resource": "arn:aws:dynamodb:eu-west-1:###:table/kibana-instanceLookup" }

where ### is your account id
Update the lambda script setup in stage 1 and tweak as shown below

Add the AWS variable to the requires at the top of the file:

var AWS = require("aws-sdk");

Update the exports.handler & transform methods and add loadFromDynamo to be:

exports.handler = function (input, context)
{
	this._dynamoDb = new AWS.DynamoDB();	

	// decode input from base64
	var zippedInput = Buffer.from(input.awslogs.data, 'base64');

	// decompress the input
	zlib.gunzip(zippedInput, function (error, buffer)
	{
		if (error) { context.fail(error); return; }

		// parse the input from JSON
		var awslogsData = JSON.parse(buffer.toString('utf8'));

		// transform the input to Elasticsearch documents
		transform(awslogsData, (elasticsearchBulkData) =>
		{
			// skip control messages
			if (!elasticsearchBulkData)
			{
				console.log('Received a control message');
				context.succeed('Control message handled successfully');
				return;
			}

			// post documents to the Amazon Elasticsearch Service
			post(elasticsearchBulkData, function (error, success, statusCode, failedItems)
			{
				console.log('Response: ' + JSON.stringify({
					"statusCode": statusCode
				}));

				if (error)
				{
					console.log('Error: ' + JSON.stringify(error, null, 2));

					if (failedItems && failedItems.length > 0)
					{
						console.log("Failed Items: " +
							JSON.stringify(failedItems, null, 2));
					}

					context.fail(JSON.stringify(error));
				} else
				{
					console.log('Success: ' + JSON.stringify(success));
					context.succeed('Success');
				}
			});
		});
	});
};

function transform(payload, callback)
{
	if (payload.messageType === 'CONTROL_MESSAGE')
	{
		// pass null through the callback so the handler can report the control message
		callback(null);
		return;
	}

	var bulkRequestBody = '';

	var instanceId = payload.logStream;

	if (instanceId.indexOf(".") > -1)
	{
		instanceId = instanceId.substring(0, instanceId.indexOf("."));
	}

	loadFromDynamo(instanceId,
		(dynamoTags) =>
		{
			payload.logEvents.forEach(function (logEvent)
			{
				var timestamp = new Date(1 * logEvent.timestamp);

				// index name format: cwl-YYYY.MM.DD
				var indexName = [
					'cwl-' + timestamp.getUTCFullYear(),              // year
					('0' + (timestamp.getUTCMonth() + 1)).slice(-2),  // month
					('0' + timestamp.getUTCDate()).slice(-2)          // day
				].join('.');				

				var source = buildSource(logEvent.message, logEvent.extractedFields);
				source['@id'] = logEvent.id;
				source['@timestamp'] = new Date(1 * logEvent.timestamp).toISOString();
				source['@message'] = logEvent.message;
				source['@owner'] = payload.owner;
				source['@log_group'] = payload.logGroup;
				source['@log_stream'] = payload.logStream;				

				var action = { "index": {} };
				action.index._index = indexName;
				action.index._type = payload.logGroup;
				action.index._id = logEvent.id;

				bulkRequestBody += [
					JSON.stringify(action),
					JSON.stringify(Object.assign({}, source, dynamoTags))
				].join('\n') + '\n';
			});
			callback(bulkRequestBody);
		});
}

function loadFromDynamo(instanceId, callback)
{
	var tagsSource = {};

	try
	{
		var params = {
			Key: {
				"instanceId": {
					S: instanceId
				}
			},
			TableName: "kibana-instanceLookup"
		};
		this._dynamoDb.getItem(params, function (err, data)
		{
			if (err)
			{
				console.log(err, err.stack);
				callback(tagsSource);
			}
			else
			{
				if (data.Item) 
				{
					data.Item.tags.L.forEach((tag) =>
						{

							var key = Object.keys(tag.M)[0];
							tagsSource['@' + key] = tag.M[key].S;
						});
				}

				callback(tagsSource);
			}
		});
	}
	catch (exception)
	{
		console.log(exception);
		callback(tagsSource);
	}
}
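The two small transformations in the script above are easy to try out in isolation. This is a minimal sketch, assuming log stream names of the form instanceId.something. The instanceIdFromLogStream and flattenTags names and the sample data are hypothetical.

```javascript
// Hypothetical helper: take everything before the first "." as the instance id.
function instanceIdFromLogStream(logStream) {
    var dot = logStream.indexOf(".");
    return dot > -1 ? logStream.substring(0, dot) : logStream;
}

// Hypothetical helper: flatten the DynamoDB tag list
// ({ L: [{ M: { key: { S: value } } }] }) into '@key' fields.
function flattenTags(item) {
    var tagsSource = {};
    if (item && item.tags && item.tags.L) {
        item.tags.L.forEach(function (tag) {
            var key = Object.keys(tag.M)[0];
            tagsSource['@' + key] = tag.M[key].S;
        });
    }
    return tagsSource;
}

// Illustrative data only.
var item = {
    instanceId: { S: "i-0abc123" },
    tags: { L: [{ M: { Environment: { S: "prod" } } }] }
};
var source = { '@message': 'GET /health 200', '@log_stream': 'i-0abc123.app-log' };

// Same merge as in transform: log fields first, then the tags.
var enriched = Object.assign({}, source, flattenTags(item));

console.log(instanceIdFromLogStream(source['@log_stream'])); // prints i-0abc123
console.log(JSON.stringify(enriched));
```

Because the tags are merged last, a tag whose name clashes with one of the built-in fields (e.g. a tag called message) would overwrite it, which is worth keeping in mind when naming tags.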

The final step is to refresh the index definition within Kibana: Management -> Index patterns -> Refresh field list.

Final thoughts
There are a few things to keep an eye on as you roll this out – bear in mind these may need tweaking over time:

If the lambda function that scans EC2 times out, increase the timeout
If the elastic search index runs out of space, adjust the environment variables used in step 2
If the dynamo read capacity hits its ceiling, increase the read capacity (this can be seen in the Metrics section of the table in Dynamo)

Happy logging!

Log aggregation in AWS – part 2 – keeping your index under control

June 6, 2017 by boro

This is the second part in the series as a follow on to /log-aggregation-aws-part-1/

Hopefully by this point you’ve now got kibana up and running, gathering all the logs from each of your desired CloudWatch groups. Over time the amount of data being stored in the index will constantly be growing so we need to keep things under control.

Here is a good view of the issue. We introduced our cleanup lambda on the 30th, if we hadn’t I reckon we’d have about 2 days more uptime before the disks ran out. The oscillating items from the 31st onward are exactly what we’d want to see – we delete indices older than 10 days every day.

Initially this was done via a scheduled task from a box we host – it worked but wasn’t ideal as it relies on the box running, potentially user creds and lots more. What seemed a better fit was to use AWS Lambda to keep our index under control.

Getting setup

Luckily you don’t need to set up much for this. One AWS Lambda, a trigger and some role permissions and you should be up and running.

Create a new lambda function based off the script shown below
Add 2 environment variables:
1. daysToKeep=10
2. endpoint=elastic search endpoint e.g. search-###-###.eu-west-1.es.amazonaws.com

Create a new role as part of the setup process

Note, these can then be found in the IAM section of AWS e.g. https://console.aws.amazon.com/iam/home?region=eu-west-1#/roles
Update the role to allow Get and Delete access to your index with the policy:

{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "es:ESHttpGet",
                "es:ESHttpDelete"
            ],
            "Resource": "ARN of elastic search index"
        }
    ]
}

Setup a trigger (in CloudWatch -> Events -> Rules)
1. Here you can set the frequency of how often to run e.g. a CRON of 0 2 * * ? * will run at 2am every night
Test your function, you can always run on demand and then check whether the indices have been removed

And finally the lambda code:

var AWS = require('aws-sdk');

var endpoint; 
var creds = new AWS.EnvironmentCredentials('AWS');

Date.prototype.addDays = function(days) {
	var dat = new Date(this.valueOf());
	dat.setDate(dat.getDate() + days);
	return dat;
}

exports.handler = function(input, context)
{
	endpoint = new AWS.Endpoint(process.env.endpoint);

	let dateBaseline = new Date();

	dateBaseline = dateBaseline.addDays(-parseInt(process.env.daysToKeep));

	console.log("Date baseline: " + dateBaseline.toISOString());

	getIndices(context, function(data)
	{
		data.split('\n').forEach((row) =>
			{
				let parts = row.split(" ");
				
				if (parts.length > 2)
				{
					let indiceName = parts[2];

					if (indiceName.indexOf("cwl") > -1)
					{
						let indiceDate = new Date(indiceName.substr(4, 4), indiceName.substr(9, 2)-1, indiceName.substr(12, 2));
						
						if (indiceDate < dateBaseline)
						{
							console.log("Planning to delete indice: " + indiceName);

							removeIndice("/"+indiceName, context);
						}
					}
				}
			});
	});
}

function removeIndice(indiceName, context) 
{
	makeRequest("DELETE", indiceName, context);
}

function getIndices(context, callback)
{
	makeRequest("GET", '/_cat/indices', context, callback);
}

function makeRequest(method, path, context, callback)
{
	console.log(`Making ${method} call to ${path}`);

	var req = new AWS.HttpRequest(endpoint);

	req.method = method;
	req.path = path;
	req.region = "eu-west-1";
	req.headers['presigned-expires'] = false;
	req.headers['Host'] = endpoint.host;

	var signer = new AWS.Signers.V4(req, 'es');
	signer.addAuthorization(creds, new Date());

	var send = new AWS.NodeHttpClient();
	send.handleRequest(req,
		null,
		function(httpResp)
		{
			var respBody = '';
			httpResp.on('data',
				function(chunk)
				{
					respBody += chunk;
				});
			httpResp.on('end',
				function(chunk)
				{
					if (callback)
					{
						callback(respBody);
					}
					//console.log(respBody);
				});
		},
		function(err)
		{
			console.log('Error: ' + err);
			context.fail('Lambda failed with error ' + err);
		});
}

Note: if you are running in a different region, you will need to change req.region = "eu-west-1"; in makeRequest to match.
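
One way to avoid editing the code per region is to read the region from the AWS_REGION environment variable, which the Lambda runtime sets. This is only a sketch, and it assumes the function runs in the same region as the Elasticsearch domain. resolveRegion is a hypothetical helper, not part of the original function.

```javascript
// Hypothetical helper: prefer the region the Lambda runtime exposes,
// falling back to the value hard-coded in the post.
function resolveRegion(env) {
	return env.AWS_REGION || 'eu-west-1';
}

// Inside makeRequest this would replace the hard-coded assignment:
// req.region = resolveRegion(process.env);
```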

How does it work?

Elasticsearch lets you list all indices via the URL /_cat/indices. The lambda function makes a web request to this URL, parses each row and picks out any indices whose names match cwl-YYYY.MM.dd. If an index is older than the number of days to keep (the daysToKeep environment variable), a DELETE request is issued to Elasticsearch.
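
The parsing and date comparison can be pulled out into pure functions, which makes them easy to try locally without calling Elasticsearch. The following is a minimal sketch, not the function's actual code: baselineDate, parseIndexDate and findExpiredIndices are hypothetical names, the sample rows are made up, and the regular expression is stricter than the indexOf("cwl") check used in the lambda. It assumes the index name is the third column of each /_cat/indices row, as the lambda does.

```javascript
// Hypothetical helper: the cut-off date, daysToKeep days before "now".
function baselineDate(now, daysToKeep) {
	const baseline = new Date(now.valueOf());
	baseline.setDate(baseline.getDate() - daysToKeep);
	return baseline;
}

// Hypothetical helper: turn "cwl-YYYY.MM.dd" into a Date, or null if the name does not match.
function parseIndexDate(indexName) {
	const match = /^cwl-(\d{4})\.(\d{2})\.(\d{2})$/.exec(indexName);
	if (!match) {
		return null;
	}
	// JavaScript months are zero-based, hence the -1.
	return new Date(Number(match[1]), Number(match[2]) - 1, Number(match[3]));
}

// Hypothetical helper: names of all cwl-* indices dated before the baseline.
function findExpiredIndices(catOutput, baseline) {
	return catOutput
		.split('\n')
		.map((row) => row.trim().split(/\s+/))
		.filter((parts) => parts.length > 2)
		.map((parts) => parts[2]) // third column is assumed to be the index name
		.filter((name) => {
			const indexDate = parseIndexDate(name);
			return indexDate !== null && indexDate < baseline;
		});
}

// Made-up sample of /_cat/indices output, for illustration only.
const sampleOutput = [
	'yellow open cwl-2019.01.01 abc 5 1 10 0 1mb 1mb',
	'yellow open cwl-2019.03.01 def 5 1 10 0 1mb 1mb',
	'yellow open .kibana ghi 1 1 2 0 10kb 10kb'
].join('\n');

// Keep 28 days, counting back from 1 March 2019: the baseline is 1 February 2019.
const expired = findExpiredIndices(sampleOutput, baselineDate(new Date(2019, 2, 1), 28));
console.log(expired);
```

Each name returned would then be passed to removeIndice with a leading "/", as the lambda does.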

Was this the best option?

There are tools available for cleaning up old indices, including one that Elastic themselves provide: https://github.com/elastic/curator. However, this requires additional boxes to run on, hence the choice to keep the cleanup wrapped in a simple lambda.

Happy indexing!
