Serving personalized content as JSON from Sitecore

As with many tools and approaches to solving technical issues, you can often find many ways to achieve the same output. The challenge I was up against was how to serve personalized content as JSON when being served from Sitecore.

The solution below is one way you can achieve this, undoubtedly there are many more!

The over-arching setup

Think of your JSON feed like a Sitecore page. You will need to break a rule of REST as Sitecore personalization requires session, and therefore isn’t stateless. This will need to be reflected in your consuming app, if you don’t provide an identifier every request it will need to understand and persist cookies between requests.

First up you need to select a device, a layout and some renderings. None of this differs to normal Sitecore development. I’ve found for debugging purposes using a new device works better as you can view the content as a web page as well as JSON.

The layout

Assuming you’ve decided on a device, you will need to setup a layout:

The renderings
Again, much like you would for a page, you can create e.g. Controller Renderings which output the content as you need. One thing to note, these will want to render as JSON e.g.:

These components can have datasources setup as normal, and hence personalization is available into your JSON feed.

Via a browser you would then load the url as normal, remembering to specify the device you’ve selected and you should see content based off rules, user information and behaviours and more.

Taking it to the next level
The simpler approach assumes you have one component per page. Complexity comes in when you need to generate valid JSON based off multiple controls – this can be achieved but requires you to either configure things via rendering parameters or at the point the page is rendered, interrogate the counterpart presentation components and work out whether you have sibling controls.

If you find you have sibling components to render you’d need to add commas after your controls to ensure valid JSON.

Who makes the tea? Why not ask Alexa?

Following on from the previous post, AlexaCore, this post will explain some of the challenges you might encounter when launching Alexa Skills into the store. It will also cover some cool things you can do if you want to enrich the feedback users receive as they use their skill.

Time for a brew?

If you need to solve your debate of who makes the tea why not check out Tea Round, recently enabled on the Skill Store.


First up, how can you find skills?

They are all available on the Amazon site, or accessible via the alexa.amazon.com portal.

Running test versions of your skill

This is pretty straight forwards. You need to login to developer.amazon.com, run through the wizard to create a skill which includes pairing up with an AWS Lambda (or controller endpoint). You should then see the newly created skill in your Alexa app but marked with a ‘dev’ flag.

Testing your skill

You have a few options, either you can simply talk to your Alexa with voice commands or use the text test tool within the developer.amazon.com console.

Getting certified gotcha’s

The certification process looks to validate and check a few things:

  • Are there bugs in the skill?
  • Do the descriptions and prompts align with your skill’s intents?
  • Do you leave the users hanging?

Things that caught me out during this process were:

  • Testing the skill where you skip past the launch intent
    • E.g. rather than asking ‘Alexa, open the tea round’ and then allowing the LaunchIntent to run, you can ask ‘Alexa, open the tea round and spin the wheel’. My logic around initializing the session originally ran in the LaunchIntent, some simple refactoring resolved this.
  • Leaving the users hanging – in my opinion this isn’t great UX but rules are rules
    • If you respond to a user without a prompt e.g. without a request of the user, the rules define you should end the session. My AddIntent would respond to ‘Add Nick’ with ‘Ok, Nick is now in the team’. To get past the certification it needed updating to ‘Ok, Nick is now in the team. Why not spin the wheel?’
  • Make sure the suggested prompts you include match up to the text set in your intents. The best bet here is to look at other skills and see how they phrase the prompts.

Saving data beyond a session

Much like a session in a web request, for the context of the lifetime of a skill a session gets persisted. This can be used to store anything you want, some simple examples would be an array of the names of the people in the Tea Round. That’s fine, but next time someone loads the skill, the session will be empty and the user will need to re-add each user – with all the extra questioning needed for certification this could get painful.

AWS provide a document model DB, Dynamo, that’s very well suited to this kind of thing. The Tea Round stores the permanent team in Dynamo, updated from the Lambda function that sits behind the skill.

Understanding users names

This can be tricky as subtle variations behind names can lead to them being spelt, and pronounced very differently – especially when regional dialect comes into play. The best success I’ve found is to provide as broad a set of example names as possible when setting up your {slots} in the intent.

Enriching the responses

A typical request & response cycle will send information that Amazon decodes from speech into your Lambda function. From there you can then return text that gets read out by your Alexa. By using Sitecore as a headless backend, the response text can be driven from the CMS – some recent updates to AlexaCore provide helpers for making these requests.

Where things then get interesting are that you can personalize messaging based off behaviour and user interactions. Big brother is watching?!?!

Closing thoughts

Gasp, after all that I’m thirsty – time for a brew! (how English eh :))

If you fancy allowing Alexa into your tea making process, have a look at Tea Round

AlexaCore – a c# diversion into writing Alexa skills

Following the recent Amazon Prime day, I thought it was time to jump on the home assistant bandwagon – £80 seemed a pretty good deal for an Alexa Echo.

If you’ve not tried writing Alexa skills there are some really good blog posts to help get started at: http://timheuer.com/blog/archive/2016/12/12/amazon-alexa-skill-using-c-sharp-dotnet-core.aspx

Skills can be underpinned by an AWS lambda function. In experimenting with writing these I’ve started putting together some helpers which remove a lot of the boiler-plate code needed for C# Alexa Lambda functions, including some fluent tools for running unit tests.

The code and some examples are available at https://github.com/boro2g/AlexaCore. Hopefully you will find them help get your skills off the ground!

Make request with fiddler based off a timer

Fiddler is a great tool for debugging web requests. Things like the composer section allow you to concoct requests and then test out the responses you get. If you’re using composer and look at the raw view you will see something like:

If you want to run this request every N seconds you can setup a quick script to achieve this:

The full script will be shown below, you need to simply add into the class Handlers code, save the script and then use the new menu options that get added to the Tools menu (Request by Timer, Stop Timer and Request Once):

One thing to note, in your script ensure you keep the trailing \r\n\r\n in the request url otherwise you’ll receive an error.

Enjoy

Log aggregation in AWS – part 3 – enriching the data sent into ElasticSearch

This is the third, and last part of the series that details how to aggregate all your log data within AWS. See Part 1 and Part 2 for getting started and keeping control of the size of your index.

By this point you should have a self sufficient ElasticSearch domain running that pools logs from all the CloudWatch log groups that have been configured with the correct subscriber.

The final step will be how we can enrich the data being sent into the index?

By default AWS will set you up a lambda function that extracts information from the CloudWatch event. It will contain things like: instanceId, event timestamp, the source log group and a few more. This is handled in the lambda via:

Note, a tip around handling numeric fields – in order for ElasticSearch to believe fields are numbers rather than strings you can multiply the value by 1 e.g.: source[‘@fieldName’] = 1*value;

What to enrich the data with?

This kind of depends on your use case. As we were aggregating logs from a wide range of boxes, applications and services, we wanted to enrich the data in the index with the tags applied to each instance. This sounds simple in practice but needed some planning around how to access the tags for each log entry – I’m not sure AWS would look kindly on making 500,000 API requests in 15 mins!

Lambda and caching

Lambda functions are a very interesting offering and I’m sure you will start to see a lot more use cases for them over the next few years. One challenge they bring is they are stateless – in our scenario we need to provide a way of querying an instance for its tags based of its InstanceId. Enter DynamoDb, another AWS service that provides scalable key-value pair storage.

Amazon define Dynamo as: Amazon DynamoDB is a fully managed non-relational database service that provides fast and predictable performance with seamless scalability.

Our solution

There were 2 key steps to the process:

  1. Updating dynamo to gather tag information from our running instances
  2. Updating the lambda script to pull the tags from dynamo as log entries are processed

1. Pushing instance data into Dynamo

Setup a lambda function that would periodically scan all running instances in our account and push the respective details into Dynamo.

  1. Setup a new Dynamo db table
    1. Named: kibana-instanceLookup
    2. Region: eu-west-1 (note, adjust this as you need)
    3. Primary partition key: instanceId (of type string)
      1. Note – we will tweak the read capacity units once things are up and running – for production we average about 50 
  2. Setup a new lambda function
    1. Named: LogsToElasticSearch_InstanceQueries_regionName 
    2. Add environment variable: region=eu-west-1
      1. Note, if you want this to pool logs from several regions into one dynamo setup a lambda function per region and set different environment variables for each. You can use the same trigger and role for each region
    3. Use the script shown below
    4. Set the execution timeout to be: 1 minute (note, tweak this if the function takes longer to run)
    5. Create a new role and give the following permissions:
      1. AmazonEC2ReadOnlyAccess (assign the OTB policy)
      2. Plus add the following policy:
        1. Note, the ### in the role wants to be your account id
    6. Setup a trigger within Cloudwatch -> Rules
      1. To run hourly, set the cron to be: 0 * * * ? *
      2. Select the target to be your new lambdas
        1. Note, you can always test your lambda by simply running on demand with any test event

And the respective script:

Note, if your dynamo runs in a different region to eu-west-1, update the first line of the pushInstanceToDynamo method and set the desired target region.

Running on demand should then fill your dynamo with data e.g.:

2. Querying dynamo when you process log entries

The final piece of the puzzle is to update the streaming function to query dynamo as required. This needs a few things:

  1. Update the role used for the lambda that streams data from CloudWatch into ElasticSearch

  2. where ### is your account id
  3. Update the lambda script setup in stage 1 and tweak as shown below

Add the AWS variable to the requires at the top of the file:

Update the exports.handler & transform methods and add loadFromDynamo to be:

The final step is to refresh the index definition within Kibana: Management -> Index patterns -> Refresh field list.

Final thoughts
There are a few things to keep an eye on as you roll this out – bear in mind these may need tweaking over time:

  • The lambda function that scans EC2 times out, if so, up the timeout
  • The elastic search index runs out of space, if so, adjust the environment variables used in step 2
  • The dynamo read capacity threshold hits its ceiling, if so increase the read capacity (this can be seen in the Metrics section of the table in Dynamo)

Happy logging!

Log aggregation in AWS – part 2 – keeping your index under control

This is the second part in the series as a follow on to /log-aggregation-aws-part-1/

Hopefully by this point you’ve now got kibana up and running, gathering all the logs from each of your desired CloudWatch groups. Over time the amount of data being stored in the index will constantly be growing so we need to keep things under control.

Here is a good view of the issue. We introduced our cleanup lambda on the 30th, if we hadn’t I reckon we’d have about 2 days more uptime before the disks ran out. The oscillating items from the 31st onward are exactly what we’d want to see – we delete indices older than 10 days every day.

Initially this was done via a scheduled task from a box we host – it worked but wasn’t ideal as it relies on the box running, potentially user creds and lots more. What seemed a better fit was to use AWS Lambda to keep our index under control.

Getting setup

Luckily you don’t need to setup much for this. One AWS Lambda, a trigger and some role permissions and you should be up and running.

  1. Create a new lambda function based off the script shown below
  2. Add 2 environment variables:
    1. daysToKeep=10
    2. endpoint=elastic search endpoint e.g. search-###-###.eu-west-1.es.amazonaws.com
  3. Create a new role as part of the setup process
    1. Note, these can then be found in the IAM section of AWS e.g.  https://console.aws.amazon.com/iam/home?region=eu-west-1#/roles
    2. Update the role to allow Get and Delete access to your index with the policy:
  4. Setup a trigger (in CloudWatch -> Events -> Rules)
    1. Here you can set the frequency of how often to run e.g. a CRON of

      will run at 2am every night
  5. Test your function, you can always run on demand and then check whether the indices have been removed

And finally the lambda code:

Note, if you are running in a different region you will need to tweak req.region = “eu-west-1”;

How does it work?

Elastic search allows you to query the index to find all indices via the url: /_cat/indices. The lambda function makes a web request to this url, parses each row and finds any indices that match the name: cwl-YYYY.MM.dd. If an indice is found that is older than days to keep, a delete request is issued to elasticSearch

Was this the best option?

There are tools available for cleaning up old indices, even ones that Elastic themselves provide: https://github.com/elastic/curator however this requires additional boxes to run hence the choice for keeping it wrapped in a simple lambda.

Happy indexing!

Log aggregation in AWS – part 1

As with most technologies these days you get plenty of options as to how you solve your technical and logistical problems. The following set of posts details one way you can approach solving what I suspect is quite a common problem – how to usefully aggregate large quantities of logs in the cloud.

What to expect from these blog posts

  1. Getting started – how to get log data off each box into a search index (this post)
  2. Keeping the search index under control
  3. Enriching the data you push into your search index

Some background
Our production infrastructure is composed of roughly 30 windows instances, some web boxes and some sql boxes. These are split between 2 regions and within each region are deployed to all the availability zones that AWS provides. We generate roughly 500k log entries in 15 mins which ends up as 20-25GB log data per day.

The first attempt
When we started out on this work we didn’t appreciate quite how much log data we’d be generating. Our initial setup was based around some CloudFormation templates provided by AWSLabs: https://github.com/awslabs/cloudwatch-logs-subscription-consumer. Initially this worked fine however we quickly hit the issue where the index, and hence kibana, would stop working. We aren’t elasticSearch, kibana or even linux experts here so troubleshooting was taking more time than the benefits we got from the tools.

Getting log data off your instances
As much as the first attempt for querying and displaying log entries didn’t quite work out, we did make good progress as to how we pooled all the log data generated across the infrastructure. You have a few options here – we chose to push everything to CloudWatch and then stream onto other tools.

Note, CloudWatch is a great way of aggregating all your logs however searching across large numbers of log groups and log streams isn’t particularly simple or quick.

To push data into CloudWatch you have a few options:
– Write log entries directly to a known group – e.g. setup a log4net appender that writes directly to CloudWatch
– If you are generating physical log files on disk, use EC2Config (an ootb solution provided by AWS) which streams the data from log files into CloudWatch.

Note, this needs configuring to specify which folders contain the source log files. See http://docs.aws.amazon.com/AWSEC2/latest/WindowsGuide/UsingConfig_WinAMI.html for more info.

Provided things have gone to plan, you should now start to see log entries show up in CloudWatch:

CloudWatch log group subscribers
CloudWatch allows you to wire up subscribers to log groups – these forward on any log entries to the respective subscriber. Subscribers can be multiple things, either: kinesis streams or to a lambda function. Via the web ui you can select a log group, choose actions then e.g. ‘Stream to AWS Lambda’.

AWS Lambda functions
Lambda functions can be used for many things – in these examples we use them to:
– Transform cloudformation log entries into a format we want to index in ElasticSearch
– Run a nightly cleanup to kill off old search indices
– Run an hourly job to scrape meta data out of EC2 and store into Dynamo
Note, we chose to use the NodeJS runtime – see http://docs.aws.amazon.com/AWSJavaScriptSDK/latest/top-level-namespace.html for the API documentation

ElasticSearch as a service
Our first attempt, ie self hosting, did provide good insight into how things should work but the failure rate was too high. We found we were needing to rebuild the stack every couple weeks. Hence, SAAS was a lot more appealing an option. Let the experts handle the setup.

Note, Troy Hunt has written some good posts on the benefits of pushing as much to SAAS – https://www.troyhunt.com/heres-how-i-deal-with-managed-platform-outages/

Setting up ElasticSearch
This is the cool bit as it can all be done through the AWS UI! The steps to follow are:

  1. Create an Elasticsearch domain
    1. Ensure to pick a large enough volume size. We opted for 500GB in our production account
    2. Select a suitable access policy. We whitelisted our office IP
    3. This takes a while to rev up so wait until it goes green
    4. One neat thing is you now get Kibana automatically available. The UI will provide the kibana url.
  2. Setup a CloudWatch group subscriber
    1. Find the group you want to push to the index, then ‘Actions’ -> ‘Stream to Amazon Elasticsearch Service’
    2. Select ‘Other’ for the Log Format
    3. Complete the wizard, which ultimately will create you a Lambda function
  3. Start testing things out
    1. If you now push items into CloudWatch, you should see indices created in ElasticSearch
    2. Within Kibana you need to let Kibana know how the data looks:
      1. Visit ‘Management’ -> ‘Index Patterns’ -> ‘Add new’
      2. The log format is [cwl-]YYYY.MM.DD

Next up we’ll go through:

  • How to prune old indices in order to keep a decent level of disk space left
  • How to transform the data from CloudWatch, through a Lambda function, into the ElasticSearch index

TypeScript mixins

This popped into my news feed a couple days ago and it seemed like a really neat feature that’s been added into version 2.2. of TypeScript – the concept of mixins.

This terminology (mixin) has been available in SASS for quite a while and provides the ability to extend functionality by composing functionality from various sources. This same concept holds true in TypeScript.

A few examples based off a User class:

Now a couple mixins:

The first adds the ability to ‘tag’ the base class, the second exposes a couple methods to activate/deactivate the class.

Pulling this together you can then create basic TaggableUser:

Or an ActivatableUser:

Or even better, a combination of both:

Pretty neat huh?

The inspiration for this came from https://blog.mariusschulz.com/2017/05/26/typescript-2-2-mixin-classes

Newtonsoft – deserializing POCO objects that contain interfaces

Just a quick post here but hopefully helpful if you hit the same issue. If you are dealing with json serialization, newtonsoft is one of our goto options.

When deserializing json to poco’s, sometimes the structure of your poco’s require a bit of extra setup to play fair with newtonsoft. Consider the following objects:

So when this gets serialized, you’d end up with:

If you then try to deserialize:

then you will get an exception.

The solution is to setup a converter:

Which then gets passed into the deserialize call e.g.:

Then you can simply add as many InterfaceConverter’s as you need

Scan large S3 bucket with node js and AWS lambda

AWS lamdba’s are a really cool way to remove the need for specific hardware when running things like scheduled operations. The challenge we had was to find the latest 2 files in a large bucket which matched specific key prefixes. This is easy enough on smaller buckets as the listObjectsV2 call is limited to return 1000 items. What to do if you need to scan more?

The following example shows how you can achieve this. You need to fill in a couple parts:

  • the bucket name
  • the filename / folder prefix
  • the file suffixes

What’s really neat with Lambda’s is you can pass in parameters from the test event e.g.:

When this runs it will fire off SNS alerts if it finds the files to be out of date.

The key bit is the recursive calls in GetLatestFiles which finally triggers the callback from the parent function (ie the promise in GetLatestFileForType).