API Gateway Limitations
As I've mentioned a couple times in the past, I've been working with Lambda and API Gateway.
We're using it to host/deploy a big app for a big client, as well as some of the ancillary tooling to support the app (such as testing/builds, scheduling, batch jobs, notifications, authentication services, etc.).
For the most part, I love it. It's helped evaporate the most boring—and often most difficult—parts of deploying highly-available apps.
But it's not all sunshine and rainbows. Once the necessary allowances are made for a new architecture (for example: with a concurrency of 10,000, the consequences of a runaway process are amplified; database connection pools are easily exhausted; and there's no simple way to use static IP addresses), there are four main problems that I've encountered with serving an app on Lambda and API Gateway.
The first two problems are essentially the same. When a header or query string parameter appears more than once in a request, API Gateway clobbers the duplicates when it creates the API event object, so only the last value survives.
Consider the following Lambda function (note: this does not use Zappa, but functions provisioned by Zappa have the same limitation):
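import json


def respond(err, res=None):
    # Build a proxy response: 400 with the error text, or 200 with a JSON body
    return {
        'statusCode': '400' if err else '200',
        # str(err) rather than err.message, which does not exist on Python 3
        'body': str(err) if err else json.dumps(res),
        'headers': {
            'Content-Type': 'application/json',
        },
    }


def lambda_handler(event, context):
    # Echo back the query string parameters exactly as API Gateway delivered them
    return respond(None, event.get('queryStringParameters'))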
Then if you call your (properly-configured) function via an API Gateway URL such as https://lambdatest.example.com/test?foo=1&foo=2, you will only get the following queryStringParameters:
{"foo": "2"}
Similarly, a modified function that dumps the event's headers, when called with duplicate headers, such as with:
curl 'https://lambdatest.example.com/test' -H'Foo: 1' -H'Foo: 2'
…will result in the second header overwriting the first:
{ … "headers": … "Foo": "2", … … }
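One way to picture the clobbering in both cases is what happens when a list of name/value pairs is collapsed into a flat dictionary. The sketch below is only an illustration of that effect, not a description of how API Gateway is implemented; `to_flat_dict` is a hypothetical helper.

```python
from urllib.parse import parse_qsl


def to_flat_dict(pairs):
    """Collapse (name, value) pairs into a dict; later duplicates overwrite earlier ones."""
    flat = {}
    for name, value in pairs:
        flat[name] = value  # a repeated name silently replaces the previous value
    return flat


# The query string and headers from the examples above, as ordered pairs
query_pairs = parse_qsl("foo=1&foo=2")
header_pairs = [("Foo", "1"), ("Foo", "2")]

print(to_flat_dict(query_pairs))   # {'foo': '2'}
print(to_flat_dict(header_pairs))  # {'Foo': '2'}
```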
The AWS folks have backed themselves into a bit of a corner, here. It's not trivial to change the way these events work, without affecting the thousands (millions?) of existing API Gateway deployments out there.
If they could make a change like this, it might make sense to turn queryStringParameters into an array when there would previously have been a clobbering:
{"foo": ["1", "2"]}
This is a bit more dangerous for headers:
{ … "headers": … "Foo": [ "1", "2" ], … … }
This is not impossible, but it is a backwards-compatibility (BC) break.
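To make the BC break concrete, here is a rough sketch of the "array only when there would have been a clobbering" shape, and of the extra type check every consumer would then need. Both `collect_params` and `first_value` are hypothetical names used for illustration only.

```python
def collect_params(pairs):
    """Keep a plain string for single values; switch to a list on duplicates."""
    merged = {}
    for name, value in pairs:
        if name not in merged:
            merged[name] = value
        elif isinstance(merged[name], list):
            merged[name].append(value)
        else:
            merged[name] = [merged[name], value]
    return merged


def first_value(params, name):
    """Existing code that expects a string now has to handle both shapes."""
    value = params.get(name)
    return value[0] if isinstance(value, list) else value


params = collect_params([("foo", "1"), ("foo", "2"), ("bar", "x")])
print(params)  # {'foo': ['1', '2'], 'bar': 'x'}
```

Any handler that currently does something like `params['foo'].upper()` would fail as soon as a client repeats the parameter, which is why this cannot simply be switched on for existing deployments.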
What AWS could do, without breaking BC, is (even optionally, based on the API's Gateway configuration/metadata) supply us with an additional field in the event object: rawQueryString. In our example above, it would be foo=1&foo=2, and it would be up to my app to parse this string into something more useful.
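If such a field existed, the parsing on the app side could be very small. This is a minimal sketch that assumes the event carries the proposed rawQueryString field (which is only a suggestion here, not something the event is shown to contain); `query_params` is a hypothetical helper built on the standard library's `urllib.parse.parse_qs`.

```python
from urllib.parse import parse_qs


def query_params(event):
    """Parse the (hypothetical) rawQueryString field into a dict of lists."""
    raw = event.get("rawQueryString") or ""
    # parse_qs keeps every value for a repeated name, in order
    return parse_qs(raw, keep_blank_values=True)


# Hand-built event: rawQueryString is the assumed extra field
event = {
    "queryStringParameters": {"foo": "2"},
    "rawQueryString": "foo=1&foo=2",
}

print(query_params(event))  # {'foo': ['1', '2']}
```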
Again, headers are a bit more difficult, but (optionally, as above), one solution might be to supply a rawHeaders field:
{ … "rawHeaders": [ "Foo: 1", "Foo: 2", … ], … }
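With a list like that, the app could rebuild a multi-valued header mapping itself. The following is a sketch under the assumption that rawHeaders is a list of "Name: value" strings as in the example above; `parse_raw_headers` is a hypothetical helper, and lowercasing the names is a design choice made here because HTTP header names are case-insensitive.

```python
from collections import defaultdict


def parse_raw_headers(raw_headers):
    """Group "Name: value" strings by lowercased header name, keeping every value."""
    headers = defaultdict(list)
    for line in raw_headers:
        name, sep, value = line.partition(":")
        if not sep:
            continue  # no colon: not a header line, skip it
        headers[name.strip().lower()].append(value.strip())
    return dict(headers)


raw = ["Foo: 1", "Foo: 2", "Content-Type: application/json"]
print(parse_raw_headers(raw))
# {'foo': ['1', '2'], 'content-type': ['application/json']}
```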
We've been lucky so far in that these first two quirks haven't been showstoppers for our apps. I was especially worried about a conflict with access-es, which is effectively a proxy server.
The next two limitations (API Gateway, Lambda) are more difficult, but I've come up with some workarounds:
- 6 MB payload size workaround
- 30 second integration timeout workaround