AWS Lambda – build yourself a URL shortener in 2 hours

An interesting requirement came up at work this week where we discussed potentially having to run our own URL Shortener because the Universal Links mechanism (in iOS 9 and above) requires a JSON manifest at

https://domain.com/apple-app-site-association

Since the OS doesn’t follow redirects this manifest has to be hosted on the URL shortener’s root domain.

Owing to a limitation with our attribution partner they’re currently not able to shorten links when you have Universal Links configured for your app. Whilst we can switch to another vendor it means more work for our (already stretched) client devs and we really like our partner’s support for attributions in links.

Which brings us back to the question

“should we build a URL shortener?”

swiftly followed by

“how hard can it be to build a scalable URL shortener in 2017?”

Well, turns out it wasn’t hard at all.

ape-shortener

Lambda FTW

For this URL shortener we’ll need several things:

  1. a GET /{shortUrl} endpoint that will redirect you to the original URL
  2. a POST / endpoint that will accept an original URL and return the shortened URL
  3. an index.html page where someone can easily create short URLs
  4. a GET /apple-app-site-association endpoint that serves a static JSON response

all of which can be accomplished with API Gateway + Lambda.

Overall, this is the project structure I ended up with:

  • using the Serverless framework’s aws-nodejs template
  • each of the above endpoints has a corresponding handler function
  • the index.html file is in the static folder
  • the test cases are written in such a way that they can be used both as integration as well as acceptance tests
  • there’s a build.sh script which facilitates running
    • integration tests, eg ./build.sh int-test {env} {region} {aws_profile}
    • acceptance tests, eg ./build.sh acceptance-test {env} {region} {aws_profile}
    • deployment, eg ./build.sh deploy {env} {region} {aws_profile}

ape-shortener-project-structure

GET /apple-app-site-association endpoint

Seeing as this is a static JSON blob, it makes sense to precompute the HTTP response and return it every time.
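As a rough sketch of what "precompute the response" can look like (not the original handler), the response object is built once at module load and handed back on every invocation. The manifest contents and the function name `appSiteAssociation` are placeholders; the response shape assumes API Gateway's Lambda proxy integration.

```javascript
// Hypothetical manifest: the real appID and paths depend on your app.
const manifest = {
  applinks: {
    apps: [],
    details: [{ appID: 'TEAMID.com.example.app', paths: ['*'] }],
  },
};

// Built once when the module is loaded, then reused by every invocation.
const response = {
  statusCode: 200,
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify(manifest),
};

// Callback-style Lambda handler: always returns the same precomputed object.
function appSiteAssociation(event, context, callback) {
  callback(null, response);
}
```

Because nothing is computed per request, the function does no work beyond returning a reference.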

ape-shortener-app-association

POST / endpoint

For an algorithm to shorten URLs, you can find a very simple and elegant solution on StackOverflow. All you need is an auto-incremented ID, like the ones you normally get with RDBMS.
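The post doesn’t reproduce the StackOverflow answer, so the following is only a sketch of one common approach that fits the description: treat the numeric ID as a base-62 number and map each digit to a character. The alphabet order and the function names `encode` and `decode` are assumptions.

```javascript
const ALPHABET = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
const BASE = ALPHABET.length; // 62

// Convert a non-negative integer ID into a short base-62 string.
function encode(id) {
  if (!Number.isSafeInteger(id) || id < 0) {
    throw new RangeError('id must be a non-negative integer');
  }
  if (id === 0) return ALPHABET[0];
  let shortUrl = '';
  while (id > 0) {
    shortUrl = ALPHABET[id % BASE] + shortUrl; // prepend the least significant digit
    id = Math.floor(id / BASE);
  }
  return shortUrl;
}

// Convert a short string back into the numeric ID.
function decode(shortUrl) {
  let id = 0;
  for (const ch of shortUrl) {
    const digit = ALPHABET.indexOf(ch);
    if (digit < 0) throw new RangeError(`invalid character: ${ch}`);
    id = id * BASE + digit;
  }
  return id;
}
```

With this alphabet, ID 125 becomes `cb` (125 = 2 × 62 + 1), and decoding `cb` returns 125.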

However, I find DynamoDB a more appropriate DB choice here because:

  • it’s a managed service, so no infrastructure for me to worry about
  • OPEX over CAPEX, man!
  • I can scale reads & writes throughput elastically to match utilization level and handle any spikes in traffic

But DynamoDB has no concept of an auto-incremented ID, which the algorithm needs. Instead, you can use an atomic counter to simulate one (at the expense of an extra write unit per request).
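A minimal sketch of such an atomic counter, assuming the aws-sdk v2 `DocumentClient` and a hypothetical counter item keyed by `id: 'url-counter'` with a numeric `value` attribute. The `ADD` update action increments the attribute atomically (creating it if it is missing), and `ReturnValues: 'UPDATED_NEW'` hands back the new number.

```javascript
// Build the UpdateItem parameters. Table, key and attribute names are hypothetical.
function nextIdParams(tableName) {
  return {
    TableName: tableName,
    Key: { id: 'url-counter' },
    UpdateExpression: 'ADD #value :incr',
    ExpressionAttributeNames: { '#value': 'value' }, // alias avoids reserved-word clashes
    ExpressionAttributeValues: { ':incr': 1 },
    ReturnValues: 'UPDATED_NEW',
  };
}

// docClient is expected to be an AWS.DynamoDB.DocumentClient (aws-sdk v2).
// Resolves with the freshly incremented ID.
function nextId(docClient, tableName) {
  return docClient
    .update(nextIdParams(tableName))
    .promise()
    .then((result) => result.Attributes.value);
}
```

The returned ID can then be fed into whatever encoding turns it into the short URL.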

ape-shortener-auto-incr-id

ape-shortener-auto-incr-id-dynamodb

GET /{shortUrl} endpoint

Once we have the mapping in a DynamoDB table, the redirect endpoint is a simple matter of fetching the original URL and returning it as part of the Location header.

Oh, and don’t forget to return the appropriate HTTP status code, in this case a 308 Permanent Redirect.
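The redirect itself can be sketched as follows, again assuming API Gateway’s Lambda proxy integration. `lookup` is a hypothetical function standing in for the DynamoDB read, and the 404 branch for unknown short URLs is an addition the post doesn’t mention.

```javascript
// 308 response with the original URL in the Location header.
function redirectTo(longUrl) {
  return { statusCode: 308, headers: { Location: longUrl }, body: '' };
}

function notFound() {
  return { statusCode: 404, body: '' };
}

// lookup(shortUrl, cb) is hypothetical: it calls cb(err, longUrl),
// with longUrl undefined when there is no mapping.
function makeRedirectHandler(lookup) {
  return (event, context, callback) => {
    lookup(event.pathParameters.shortUrl, (err, longUrl) => {
      if (err) return callback(err);
      return callback(null, longUrl ? redirectTo(longUrl) : notFound());
    });
  };
}
```

Passing the lookup in keeps the handler easy to exercise in tests without a real table.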

ape-shortener-redirect

 

GET / index page

Finally, for the index page, we’ll need to return some HTML instead (and a different content-type to go with the HTML).

I decided to put the HTML file in a static folder, which is loaded and cached the first time the function is invoked.

ape-shortener-index

 

Getting ready for production

Fortunately I have had plenty of practice getting Lambda functions to production readiness, and for this URL shortener we will need to:

  • configure auto-scaling parameters for the DynamoDB table (we have an internal system that manages the auto-scaling side of things)
  • turn on caching in API Gateway for the production stage

Future Improvements

If you put in the same URL multiple times you’ll get back different short URLs. One optimization (for storage and caching) would be to return the same short URL instead.

To accomplish this, you can:

  1. add a GSI to the DynamoDB table on the longUrl attribute to support efficient reverse lookup
  2. in the shortenUrl function, perform a query against the GSI to find existing short URL(s)

I think it’s better to add a GSI than to create a new table here because it avoids having “transactions” that span across multiple tables.
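The reverse lookup could be sketched like this, assuming a GSI named `longUrl-index` (a made-up name) whose partition key is the `longUrl` attribute. A GSI has to be read with a Query, since GetItem only works against the table’s primary key.

```javascript
// Build Query parameters for the reverse lookup. Index name is hypothetical.
function findByLongUrlParams(tableName, longUrl) {
  return {
    TableName: tableName,
    IndexName: 'longUrl-index',
    KeyConditionExpression: 'longUrl = :longUrl',
    ExpressionAttributeValues: { ':longUrl': longUrl },
    Limit: 1, // one existing mapping is enough
  };
}

// Usage with an aws-sdk v2 DocumentClient would look roughly like:
//   docClient.query(findByLongUrlParams('urls', longUrl)).promise()
//     .then((result) => result.Items[0]);
```

GSI reads are eventually consistent, so two near-simultaneous requests for the same URL could still end up with different short URLs.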


Devops at Scale: Videos

A big thank you to everyone who came to our offices to see our speakers last Wednesday night at the Devops at Scale event. A big thank you, also, to everyone involved in organising everything for the big night!

We’ve uploaded the videos of our talks, so if you weren’t able to come or are interested in what folk had to say, here they all are!

Steve Lowe: Devops at Scale: A Cultural Change

Sam Pointer: Smashing the Monolith for Fun and Profit: Telemetry-led Infrastructure at Hive

Louis McCormack: Monitoring at Scale

Upcoming Event: Devops at Scale

We’re working with Burns Sheehan, Wavefront and Hive to host an event at Space Ape HQ on Wednesday April 12th on the theme of Devops at Scale.

The evening will explore topics focussing on the adoption of DevOps at scale, hearing from businesses and individuals who have successfully driven these new DevOps approaches.

Richard Haigh and Steve Lowe will be speaking from Betfair, telling the story of their shift to DevOps and how pushing for attitudinal change drives effective DevOps implementation. From Hive, Sam Pointer will talk about how they’ve used a telemetry-first approach to break apart a monolithic application and implement infrastructure transformation at scale. Finally, Louis McCormack of Space Ape Games will take a look at the challenges of monitoring everything, when “everything” keeps changing.

You can learn more at the event page, where you can also sign up to attend.

DevOps-At-Scale-Invitation

Space Ape Live Ops Boot Camp – Part 2 (GDC Edition)

(This is the second in our series of posts on Live Ops.  Part 1 can be found here: https://tech.spaceapegames.com/2016/12/07/space-ape-live-ops-boot-camp/)

Last week at the annual Games Developers Conference (GDC) in San Francisco, Space Ape’s first product guy, Joe Raeburn, took to the stage to share what we’ve learned about Live Ops.

The full video will be available in the GDC Vault soon, but we thought we’d share a summary of the talk and annotated slides for those who can’t get access.

Apart from some quality trolling of Kiwis and puns about goats, the point of the presentation was to help frame how studios should think about Live Ops. Joe’s personal experience was formed at his previous company, where he was a Product Manager on the hugely successful Sims Social. That game rocketed to over 60m users in a matter of months, but the studio paid a high price, with 75% of its headcount consumed with running the game. At Space Ape we nearly fell into the same trap, with around the same proportion focussed on operating our live games in 2015.

The solution:  we radically transformed how we develop and operate games to comply with Joe’s first commandment:

“LEAN COMMANDMENT 1: Thou Shalt Go Lean … or thine studio shalt be encumbered with unsustainable weight and die.”

Today, we are 100 people, with a little over one third of the studio working on live games that more than pay the bills, freeing up the majority of developers to work on new transformative games.

Joe’s talk shows how we quantify the impact of Lean Live Ops and how we designed systems to ensure that the most desirable content, the content we wanted players to chase, could be produced cheaply without reliance on developers. He also shows how we’ve carried this philosophy through to the design of our new games Super Karts and Fastlane: Road to Revenge.

The full slideshare presentation can be found here: https://www.slideshare.net/SimonHade1/lean-live-ops-free-your-devs-annotated-edition-joe-raeburn

Space Ape Live Ops Boot Camp

We spend a lot of time swapping notes with other game developers about one thing or another, but by far, the most common questions we get are about Live Ops. How are our teams organised? What tools and tech did we build vs buy? How impactful is it really? How do we avoid cannibalisation screwing up an already successful game? What stats do we track? How transferable is what we do to other genres?

In three years our games have generated over $80m revenue. Depending on how you account for it, Live Ops initiatives generated between one and two-thirds of that revenue. At Space Ape, Live Ops is more of a pervasive philosophy than a discrete team or tool. We don’t really distinguish in our sprints or dev teams between game features and tools, or between community marketing and events, because Live Ops underpins everything we do.

I thought it would be nice to pull together some of the presentations we’ve shared on this topic over the years into one place.  Combined, they paint a good picture of how we manage our games over the long haul in a very efficient way, freeing up front line developers to work on new concepts.

Plan Now, Live Forever: Engineering Mobile Games for Live Ops from Day 1 

Download the slides here

Designing Successful Live Ops Systems in Free to Play Gacha Economies

Live Ops Lead Andrew Munden (formerly Live Ops at Kabam and Aeria) shares the content strategies that work in gacha collection games as well as how to build a manageable content furnace and balance player fatigue in a sustainable way.

A Brief History of In-Game Targeting 

Analytics lead Fred Easy (ex Betfair, Playfish/EA) will share the evolution of his offer targeting technology from its belt and braces beginnings to sophisticated value based targeting and the transition to a dynamic in-session machine learning approach.

Under the Hood: Rival Kingdoms’ CMS tools

Game changing content is introduced to Rival Kingdoms every month, with in-game events at least every week. Product Manager Mitchell Smallman (formerly Rovio, Next Games) and Steven Hsiao (competitive StarCraft player turned community manager turned Live Ops lead) will demonstrate the content management tools that allow them to keep the game fresh for players without developer support.

Games First Helsinki (April 2016) 

Product Owner Joe Raeburn and Game Designer John Collins talk to the Finnish gaming scene about how we transitioned Samurai Siege into Live Ops mode and laid the foundation of our current Live Ops platform. A good overview of the journey to Live Ops and introduction of the high-level concepts that underpin our approach: the Tools, the Toybox and the Treadmill.

GDC Europe  (June 2016)

Life After Launch: How to Grow Mobile Games with In-Game Events. COO Simon Hade talks generally about Live Ops, the impact it has had on the business and shares more detail about the Space Ape Toybox for in-game events.

The Great British Big Data Game Analytics Show & Tell (February 2015)

Space Ape Analytics Stack. A somewhat outdated but still relevant overview of the original data stack used in Samurai Siege.

I’d love to hear from studios who have used any of the content we shared, or have different approaches that could help us improve. We get inspired by the stories that come out of Finland about the gaming ecosystem’s willingness to share and collaborate and how that ultimately benefits everyone. Hopefully, this will inspire other London developers to do likewise.

Custom Inspec Resources

When developing Chef cookbooks, a good test suite is an invaluable ally. It confers the power of confidence, confidence to refactor code or add new functionality and be…confident that you haven’t broken anything.

But when deciding how to test cookbooks, there is a certain amount of choice. Test Kitchen is a given, there really is no competition. But then do you run unit tests, integration tests, or both? Do you use Chefspec, Serverspec or Inspec? At Spaceape we have settled on writing unit tests only where they make sense, and concentrating on integration tests: we want to test the final state of servers running our cookbooks rather than necessarily how they get there. Serverspec has traditionally been our framework of choice but, following the lead of the good folks at Chef, we’ve recently started using Inspec.

Inspec is the natural successor to Serverspec. We already use it to test for security compliance against the CIS rulebook, so it makes sense for us to try and converge onto one framework. As such we’ve been writing our own custom Inspec resources and, with it being a relatively new field, wanted to share our progress.

The particular resource we’ll describe here is used to test our in-house Redis cookbook, sag_redis. It is a rather complex cookbook that actually uses information stored in Consul to build out Redis farms that register themselves with a Sentinel cluster. We’ll forego all that complexity here and just concentrate on how we go about testing the end state.

In the following example, we’ll be using Test Kitchen with the kitchen-vagrant plugin.

Directory Structure:

Within our sag_redis cookbook, we’ll create an Inspec profile. This is a set of files that describe what should be tested, and how. The directory structure of an Inspec profile is hugely important: if you deviate even slightly then the tests will fail to run. The best way to ensure compliance is to use the Inspec CLI, which is bundled with later versions of the Chef DK.

Create a directory test/integration then run:

inspec init profile default

This will create an Inspec profile called ‘default’ consisting of a bunch of files, some of which can be unsentimentally culled (the example control.rb for instance). As a bare minimum, we need a structure that looks like this:

test
└── integration
    └── default
        ├── controls
        ├── inspec.yml
        └── libraries
The default inspec.yml will need to be changed, that should be self-evident. The controls directory will house our test specs, and the libraries directory is a good place to stick the custom resource we are about to write.

The Resource:

First, let’s take a look at what an ‘ordinary’ Inspec matcher looks like:

describe user('redis') do
  it { should exist }
  its('uid') { should eq 1234 }
  its('gid') { should eq 1234 }
end

Fairly self-explanatory and readable (which incidentally was one of the original goals of the Inspec project). The purpose of writing a custom resource is to bury a certain amount of complexity in a library, and expose it in the DSL as something akin to the above.

The resource we’ll write will be used to confirm that on-disk Redis configuration is as we expect. It will parse the config file and provide methods to check each of the options contained therein. In DSL it should look something like this:

describe redis_config('my_redis_service') do
  its('port') { should eq('6382') }
  its('az') { should eq('us-east-1b') }
end

So, in the default/libraries directory, we’ll create a file called redis_config.rb with the following contents:

class RedisConfig < Inspec.resource(1)
  name 'redis_config'

  desc 'Check Redis on-disk configuration.'

  # Double-quoted so the single quotes in the example do not end the string.
  example "
    describe redis_config('dummy_service_6') do
      its('port') { should eq('6382') }
      its('slave-priority') { should eq('69') }
    end
  "

  def initialize(service)
    @service = service
    @path = "/etc/redis/#{service}"
    # Use the Inspec file resource so the file is read on the machine under test.
    @file = inspec.file(@path)

    begin
      # Build a hash of option => value, skipping comments and 'save' lines.
      @params = Hash[*@file.content.split("\n")
                           .reject { |l| l =~ /^#/ || l =~ /^save/ }
                           .collect { |v| v.chomp.split }
                           .flatten]
    rescue StandardError => e
      return skip_resource "#{@path}: #{e}"
    end
  end

  def exists?
    @file.file?
  end

  # Expose each config option (e.g. 'port') as a method on the resource.
  def method_missing(name)
    @params[name.to_s]
  end
end

There’s a fair bit going on here.

The resource is initialised with a single parameter: the name of the Redis service under test. From this we derive the @path of its on-disk configuration. We then use this @path to initialise another Inspec resource: @file.

Why do this, why not just use a common-or-garden ::File object and be done with it? There is a good reason, and this is important: the test is run on the host machine, not the guest. If we were to use ::File then Inspec would check the machine running Test Kitchen, not the VM being tested. By using the Inspec file resource, we ensure we are checking the file at the given path on the Vagrant VM.

The remainder of the initialize function is dedicated to parsing the on-disk Redis config into a hash (@params) of attribute:value pairs. The ‘save’ lines that configure RDB snapshotting are unique in that they have more than one value after the parameter name, so we ignore them. If we wanted to test these options we’d need to write a separate function.

The exists? function acts on our Inspec file resource, returning a boolean. Through some Inspec DSL sleight-of-hand this allows us to use the matcher it { should exist } (or indeed it { should_not exist } ).

The final function delegates all missing methods to the @params hash, so we are able to reference the config options directly as ‘port’ or ‘slave-priority’, for instance.

The Controls:

In Inspec parlance, the controls are where we describe the tests we wish to run.

In the interests of keeping it simple, we’ll write a single test case in default/controls/redis_configure_spec.rb that looks like this:

describe redis_config('leaderboard_service') do
  it { should exist }
  its('slave-priority') { should eq('50') }
  its('rdbcompression') { should eq('yes') }
  its('dbfilename') { should eq('leaderboard_service.rdb') }
end

The Test:

Now we just need to instruct Test Kitchen to actually run the test.

The .kitchen.yml file in the base of our sag_redis cookbook looks like this:

driver:
  name: vagrant
  require_chef_omnibus: 12.3.0
  provision: true
  vagrantfiles:
    - vagrant.rb

provisioner:
  name: chef_zero

verifier:
  name: inspec

platforms:
  - name: ubuntu-14.04
    driver:
      box: ubuntu64-ami
      customize:
        memory: 1024

suites:
  - name: default
    provisioner:
      client_rb:
        environment: test
    run_list:
      - role[sag_redis_default]

Obviously this is quite subjective, but the important points to note are that we set the verifier to be inspec and we provide the name: default to the particular suite we wish to test (recall that our Inspec profile is called ‘default’).

And that’s it! Now we can just run kitchen test and our Inspec custom resource will check that our Redis services are configured as we expect.

Space Ape are hiring for Devops

Here on the Space Ape Devops Team, we’ve been busy building out the tech for our next generation of mobile games and now it’s time to bring some fresh faces onto the team to help continue our journey. If you’re a passionate technologist, Devops engineer or infrastructure wrangler then we’d love to hear from you.

Being a Devop at Space Ape is an important role. On our existing titles, you’ll be responsible for maintaining the quality of our players’ experience, working with the team to roll out new features and upgrades and finding new ways to optimise the stacks. On our new titles, you’ll be working with the development team to build out new stacks, solve new problems and prepare for big scale launches.

Along the way, you’ll learn how we use tools to build and update our stacks and roll them out without impacting our players and developers. You’ll also learn how we write those tools in Ruby, Angular and sometimes Go. Eventually, you’ll learn what it is that our teams need and start bringing fresh new ideas for how we can make things better; perhaps improving our containerisation platform, serverless workloads or the security of our platforms.

If you’re interested, have a poke round some of our other posts and drop your details in on our careers page where you can find out a bit more about the Devops role and the technology we use.

Trajectory prediction with Unity Physics

In some recent prototyping work, we needed to display a prediction for a projectile trajectory in the game. You’ve probably seen something similar in many games before, such as Angry Birds:

AngryBirdsTrajectory

The tutorial from Angry Birds 2. Note the dotted line, showing you the predicted trajectory of your bird, if you released the slingshot now.

Our prototype game was in Unity, and the projectile was set up using Unity’s physics engine. We had several requirements for the prediction:

  • Immediate. Player input can change from frame to frame, and the prediction needs to stay in sync with it.
  • Accurate. The time of flight could be several seconds, and any small error will accumulate to produce significantly incorrect results.
  • Simulates drag. We’re using drag on our rigidbody, which many solutions do not account for.

I assumed this sort of problem came up often and searched online to see what popular implementations were out there. They generally fell into three groups:

  • Accurate, but slow. These solutions introduce an invisible projectile clone into the world and launch it along the flight path, recording its motion over time. As there’s no way to step the Unity physics simulation along yourself, you have to wait for this prediction in real time. This means that a three-second flight takes three seconds to fully predict. This is far too slow – the prediction would constantly lag behind the player’s changing inputs.
  • Doesn’t include drag. There are some good, accurate solutions, but most will specifically rule out drag.
  • Inaccurate. Some combination of incorrect equations, assumptions, and approximations meant that with longer flight times and more drag (or different gravity) the prediction would be wrong.

Perhaps the perfect solution for us is out there, but I hadn’t found it. By combining existing solutions and running some tests, I came up with my own implementation, which is presented below.

 public static Vector2[] Plot(Rigidbody2D rigidbody, Vector2 pos, Vector2 velocity, int steps)
 {
     Vector2[] results = new Vector2[steps];
 
     float timestep = Time.fixedDeltaTime / Physics2D.velocityIterations;
     Vector2 gravityAccel = Physics2D.gravity * rigidbody.gravityScale * timestep * timestep;
     float drag = 1f - timestep * rigidbody.drag;
     Vector2 moveStep = velocity * timestep;
 
     for (int i = 0; i < steps; ++i)
     {
         moveStep += gravityAccel;
         moveStep *= drag;
         pos += moveStep;
         results[i] = pos;
     }
 
     return results;
 }

This function plots the trajectory of a rigidbody under the effect of Unity’s physics by simulating some FixedUpdate iterations and returning the positions of the projectile at each iteration. It uses the global Physics2D.gravity setting, and takes into account rigidbody drag and gravityScale. Note that the mass of the rigidbody is irrelevant.

float timestep = Time.fixedDeltaTime / Physics2D.velocityIterations;

The code attempts to produce the same results as running the normal Unity physics iterations. To do this, it must also run as an iterative solution. A common error here is to assume that one iteration is run every FixedUpdate(). Instead, the number of iterations to be performed is accessible and tweakable – it’s Physics2D.velocityIterations. This helps us compute the timestep.

Vector2 gravityAccel = Physics2D.gravity * rigidbody.gravityScale * timestep * timestep;

We take into account the rigidbody’s gravityScale property when computing the effect of gravity. We found that we wanted a different amount of gravity and drag on each object, so this per-body setting was really helpful.

float drag = 1f - timestep * rigidbody.drag;

Drag acts as a reduction on moveStep in each iteration. We can compute it upfront and then apply it to each step of the iteration, producing a cumulative effect.

for (int i = 0; i < steps; ++i)
 {
     moveStep += gravityAccel;
     moveStep *= drag;
     pos += moveStep;
     results[i] = pos;
 }

Finally, the main loop. Each iteration, you’ll move due to gravity, reduce the movement due to drag, and then accumulate and store the new position in the results.

This solution worked well for us. While not exhaustively tested, we used it for projectiles that had lots of different velocities and drags, and it proved accurate each time, even after 4-5 seconds of flight.
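For checking the arithmetic outside Unity, the same loop can be ported to plain JavaScript. This is only an engine-independent sketch: the `settings` object is a stand-in for the values the C# version reads from `Time`, `Physics2D` and the rigidbody, and nothing here touches Unity itself.

```javascript
// Port of the Plot loop. pos and velocity are {x, y} objects; settings mirrors
// Time.fixedDeltaTime, Physics2D.velocityIterations, Physics2D.gravity,
// rigidbody.gravityScale and rigidbody.drag.
function plot(pos, velocity, steps, settings) {
  const timestep = settings.fixedDeltaTime / settings.velocityIterations;
  const gravityStep = {
    x: settings.gravity.x * settings.gravityScale * timestep * timestep,
    y: settings.gravity.y * settings.gravityScale * timestep * timestep,
  };
  const dragFactor = 1 - timestep * settings.drag;

  let move = { x: velocity.x * timestep, y: velocity.y * timestep };
  let current = { x: pos.x, y: pos.y };
  const results = [];

  for (let i = 0; i < steps; i += 1) {
    // Apply gravity, then drag, then advance the position.
    move = {
      x: (move.x + gravityStep.x) * dragFactor,
      y: (move.y + gravityStep.y) * dragFactor,
    };
    current = { x: current.x + move.x, y: current.y + move.y };
    results.push(current);
  }
  return results;
}
```

With zero drag, a 0.02 s fixed timestep and 8 velocity iterations, 400 steps cover one second of flight, and the end point stays close to the textbook p + v·t + ½·g·t² result. The small difference comes from the step-by-step integration.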

 

Drawbacks

There’s a lot of computation involved for long trajectories. With default settings (a fixed timestep of 0.02 seconds, so 50 physics updates per second, and 8 velocity iterations), you have to run the loop 400 times for each second of flight you want to predict. We only have one projectile to predict in our prototype, so we’re just running one prediction, which doesn’t cost very much. If you used this to predict lots of projectiles for lots of different launchers in a large scale game, perhaps this would begin to be a problem for you.

Also, it’s only simulating the trajectory, and not actually running the physics engine or simulating anything else in the game. This means it doesn’t predict collisions or collision resolution. If you render this path as-is, it’ll just clip through walls or other obstacles in the world, which obviously isn’t actually what will happen when the projectile is launched.

These drawbacks weren’t a problem for our prototype, so it turned out to be pretty useful code. We share this now in the hope that someone else out there is faced with the same requirements and finds it useful too.

 

Possible Future Upgrades

I have some ideas around the drawbacks of this method. This is the main area for improvement, as the actual functionality is fine.

For performance, no profiling or optimisation work has been performed. I’ve just laid things out in the way that made sense to me. It’s hard to guess at optimisations, but perhaps a little profiling would reveal some simple speedups. The bigger step would be to push this code out to a native DLL and get down to nitty-gritty C++ optimisation, perhaps with SIMD instructions. You can’t parallelise the steps (each iteration of the loop depends on the result of the previous one), but you could parallelise multiple projectile predictions: if you have many projectiles, run 4 or 8 predictions in parallel.

The other big upgrade is around prediction. For some games a true prediction would be really valuable – for example, visualising the outcome of collisions and reactions in a pool table game. You’d want to see the predicted path of the ball, even after several bounces. This isn’t going to happen with any simple model if you have any in-depth physics properties. You’d need a big shift in your approach – to run the physics engine yourself. I’d find an appropriate existing physics engine and build it into the game/Unity, which is a shame as it’s duplicating the work that Unity’s already done. But after doing that, you’d have control over the physics simulation and how you update it.

You’d try for a setup where you could clone the existing simulation and run some update ticks to, essentially, look into the future, tracking the future state of the simulation on the assumption that no inputs change. This would have to be a separate simulation, as you wouldn’t want the actual state of the pool table to change, just to compute the predicted future state. This will be even more expensive than just running the basic prediction code we had above, since it’s the full physics simulation.