Guide to Serverless & Lambda Testing - Part 3 - Advanced Asynchronous Flows

Ran Isenberg
Mar 27, 2023
7 min read

Software testing increases application quality and reliability. It allows developers to find and fix software bugs, mitigate security issues, and simulate real user use cases.

It is an essential part of any application development.

Serverless is an amazing technology, almost magic-like. Serverless applications, like any other applications, require testing.

However, testing Serverless applications differs from traditional testing and introduces new challenges.

In this post, you will learn to test asynchronous event-driven flows that may or may not contain Lambda functions.

Previous posts in the series:

Part 1 - you learned why Serverless services introduce new testing challenges and my practical guidelines for testing Serverless services and AWS Lambda functions that mitigate these challenges.
Part 2 - you learned to write tests for your Serverless service. We focused on Lambda functions and provided tips & tricks and code examples by writing tests for a real Serverless application. In addition, you learned my Serverless adaptation to the testing pyramid and implemented it.

A complementary Serverless service project that utilizes Serverless testing best practices can be found here.

Serverless Testing Pyramid Re:cap
My Testing Methodology
1. Lambda Functions are Included
2. Lambda Functions are Not Included
Testing Asynchronous Messaging Flows
Testing Synchronous & Asynchronous Flows
Testing Non Lambda Based Asynchronous Flows

Serverless Testing Pyramid Re:cap

In the first post in the series, I presented the Serverless testing pyramid.

This post assumes you are familiar with the testing pyramid and understand its principles.

We defined the following test stages:

Unit tests for our Lambda function business domain code - test input validation schemas and test isolated functions.
Infrastructure tests for our AWS CDK code that assert security best practices.
Integration tests that run post service deployment to AWS. The tests run in the IDE, call the Lambda handler entry function with a generated event, and simulate happy flows and edge cases (edge cases are simulated with mocks).
End-to-end tests that trigger the entire service on AWS with a customer input event.

My Testing Methodology

In this section, I will present my approach to Serverless testing and how I design the required tests for Serverless microservices based on the Serverless testing pyramid.

I base my approach on whether Lambda functions are present in the flow I'd like to test.

Lambda Functions are Included

We learned how to test Lambda functions in the previous post.

We follow the testing pyramid guidelines and add unit, integration, and end-to-end tests to our Lambda function.

For the integration tests, we must generate the proper input event according to the asynchronous Lambda trigger: it may be an SQS list of events (a batch), an EventBridge event, a DynamoDB streams event, etc.
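As an illustration of generating such an input event, here is a minimal sketch of a helper that builds an SQS batch event for an integration test. The function name `build_sqs_event`, the queue ARN, and the attribute values are placeholders made up for this example; only the overall record layout follows the SQS event that Lambda receives.

```python
import hashlib
import json
import uuid


def build_sqs_event(bodies: list, queue_arn: str = "arn:aws:sqs:us-east-1:123456789012:test-queue") -> dict:
    """Build an SQS-shaped Lambda event with one record per message body."""
    records = []
    for body in bodies:
        raw_body = json.dumps(body)
        records.append(
            {
                "messageId": str(uuid.uuid4()),
                "receiptHandle": "test-receipt-handle",  # placeholder value
                "body": raw_body,
                "attributes": {
                    "ApproximateReceiveCount": "1",
                    "SentTimestamp": "1679900000000",
                    "SenderId": "123456789012",
                    "ApproximateFirstReceiveTimestamp": "1679900000001",
                },
                "messageAttributes": {},
                "md5OfBody": hashlib.md5(raw_body.encode("utf-8")).hexdigest(),
                "eventSource": "aws:sqs",
                "eventSourceARN": queue_arn,
                "awsRegion": "us-east-1",
            }
        )
    return {"Records": records}
```

An integration test would pass the returned dictionary to the Lambda handler as its `event` argument.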

For end-to-end tests, we must trigger the chain of events that will eventually trigger the Lambda function on our AWS account.

Lambda Functions are Not Included

Let's discuss use cases where there are no Lambda functions in your microservice.

Some examples come to mind: a Step Function state machine with intrinsic functions and EventBridge pipes that conduct an ETL process and send an event forward to an EventBridge bus.

How do you test them? You can't write unit or integration tests as you did for your Lambda functions; there's no code and entry point to trigger locally; it's all infrastructure configuration!

You should ask yourself: SHOULD I test these AWS-managed services, or focus on the bigger picture, the event destination and its side effects?

My answer is to focus on the bigger picture: all we can and SHOULD do is write end-to-end tests.

We will write end-to-end tests, trigger the chain of events, validate its side effects, and ensure that the event reaches the end of the road properly as expected.

Let's look at some examples and see how to test the service.

Testing Asynchronous Messaging Flows

A pure asynchronous event-driven flow.

A Lambda function is involved so we will write unit and integration tests.

The integration tests will generate a list of records (see AWS SQS event example), call the Lambda handler, and validate its side effects.

The Lambda function's side effects might include writing to a DynamoDB table, calling an API, etc. In integration tests, you can mock the calls to these side effects locally and assert that they are called and called with the correct parameters.
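The mocking approach can be sketched roughly as follows. The `process_orders` function and the `order_id` field are hypothetical stand-ins for real business logic, and the mock takes the place of a boto3 DynamoDB `Table` resource; this is one possible shape, not the project's actual code.

```python
import json
from unittest.mock import MagicMock


def process_orders(event: dict, table) -> None:
    """Hypothetical business logic: store every SQS record body as an item."""
    for record in event["Records"]:
        table.put_item(Item=json.loads(record["body"]))


def test_orders_are_written_to_table() -> None:
    # A mock stands in for the boto3 DynamoDB Table resource.
    table = MagicMock()
    event = {"Records": [{"messageId": "1", "body": json.dumps({"order_id": "42"})}]}

    process_orders(event, table)

    # Assert the side effect was requested once, with the expected parameters.
    table.put_item.assert_called_once_with(Item={"order_id": "42"})
```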

It's also crucial in this case to make sure that the Lambda function, which receives an SQS input event as a list of records (a batch), does not raise an uncaught exception, since that causes the entire batch to be returned to the queue.
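One way to avoid failing the whole batch is to catch exceptions per record and report only the failed records back to SQS. The sketch below assumes that `ReportBatchItemFailures` is enabled on the function's SQS event source mapping; `handle_record` and the `order_id` field are hypothetical.

```python
import json


def handle_record(body: dict) -> None:
    """Hypothetical per-record logic; raises on malformed input."""
    if "order_id" not in body:
        raise ValueError("missing order_id")


def handler(event: dict, context: object) -> dict:
    failures = []
    for record in event["Records"]:
        try:
            handle_record(json.loads(record["body"]))
        except Exception:
            # Report only this record as failed instead of failing the batch.
            failures.append({"itemIdentifier": record["messageId"]})
    # Records listed here are retried; all other records count as processed.
    return {"batchItemFailures": failures}
```

An integration test can feed this handler a batch with one valid and one malformed record and assert that only the malformed record's `messageId` is reported.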

Be advised that, in this case, we can't validate SQS-related side effects locally, such as:

Proper handling of partial failures.
A properly configured SQS dead letter queue.

So these side effects will be tested only by E2E tests.

For E2E happy flow, publish an SNS message to the SNS topic via AWS SDK and validate the function side effects ('put item' to a DynamoDB table, etc.). Another option is to poll the function log group, filter by the expected log timestamp and verify that the function has reached a specific log message that asserts the action has occurred.
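A minimal sketch of the first option might look like this. It assumes `sns_client` is a boto3 SNS client and `table` is a boto3 DynamoDB `Table` resource, both created by the test; the function names, the `order_id` key, and the timeout values are made up for the example.

```python
import json
import time


def wait_for_item(table, key: dict, timeout_sec: float = 30.0, interval_sec: float = 2.0):
    """Poll a DynamoDB table until the item exists or the timeout expires."""
    deadline = time.monotonic() + timeout_sec
    while True:
        response = table.get_item(Key=key, ConsistentRead=True)
        if "Item" in response:
            return response["Item"]
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval_sec)


def publish_and_verify(sns_client, table, topic_arn: str, order: dict):
    # Trigger the chain of events by publishing to the SNS topic.
    sns_client.publish(TopicArn=topic_arn, Message=json.dumps(order))
    # Validate the side effect the function is expected to produce.
    return wait_for_item(table, {"order_id": order["order_id"]})
```

The test passes if `publish_and_verify` returns the expected item and fails if it returns `None`.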

For E2E failures, make sure to read the documentation of every managed service involved.

Amazon SQS has very informative documentation regarding handling partial failures and dead letter queue integration.

You can publish malformed events to the SNS topic and make sure they are sent to the preconfigured dead letter queue by waiting and fetching (with a timeout) messages from the dead letter SQS queue until the expected event is found (test passes) or a timeout occurs (test fails).
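The wait-and-fetch loop could be sketched as below. It assumes `sqs_client` is a boto3 SQS client and that `matches` is a test-supplied predicate that recognizes the expected event in a message body; the function name and the default timeout are illustrative.

```python
import time


def wait_for_message(sqs_client, queue_url: str, matches, timeout_sec: float = 60.0):
    """Fetch messages until one satisfies `matches` or the timeout expires."""
    deadline = time.monotonic() + timeout_sec
    while time.monotonic() < deadline:
        response = sqs_client.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=5,  # long polling keeps the loop from spinning
        )
        for message in response.get("Messages", []):
            if matches(message["Body"]):
                return message
    # Timeout: the expected event never reached the queue.
    return None
```

Note that the sketch does not delete the messages it reads, so they become visible again after the queue's visibility timeout. Tagging each published event with a unique ID and matching on it helps keep parallel test runs from picking up each other's messages.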

Testing Synchronous & Asynchronous Flows

API GW -> DynamoDB -> DynamoDB Streams -> Lambda

This use case contains synchronous and asynchronous flows: the first two being synchronous and the last being asynchronous.

An API Gateway gets a POST request and writes to a DynamoDB table without a Lambda function involved (uses an AWS Service proxy). The DynamoDB table change triggers a DynamoDB streams event that triggers a Lambda function.

You cannot test the API Gateway locally in the IDE and verify that it writes to DynamoDB.

You can only test it with an end-to-end test that sends the required POST request and validates that the side effect happened, i.e., that the item was written. You can do that by directly checking the DynamoDB table or by calling a GET item REST API on the API Gateway (assuming there is such an API).

Now, let's test the asynchronous flow.

A Lambda function is involved so we can write unit and integration tests.

The integration tests will generate an input event based on the DynamoDB stream event schema.
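For illustration, a small helper for building such an event could look like the sketch below. The function name, table name, ARN, and sequence values are placeholders invented for the example; keys and images are expected in DynamoDB attribute-value format.

```python
def build_dynamodb_stream_event(keys: dict, new_image: dict, event_name: str = "INSERT") -> dict:
    """Build a single-record DynamoDB stream event for a Lambda handler."""
    return {
        "Records": [
            {
                "eventID": "1",
                "eventName": event_name,  # INSERT, MODIFY, or REMOVE
                "eventVersion": "1.1",
                "eventSource": "aws:dynamodb",
                "awsRegion": "us-east-1",
                "dynamodb": {
                    "Keys": keys,
                    "NewImage": new_image,
                    "SequenceNumber": "111",
                    "SizeBytes": 26,
                    "StreamViewType": "NEW_AND_OLD_IMAGES",
                },
                # Placeholder ARN of a hypothetical "orders" table stream.
                "eventSourceARN": "arn:aws:dynamodb:us-east-1:123456789012:table/orders/stream/2023-03-27T00:00:00.000",
            }
        ]
    }
```

For example, `build_dynamodb_stream_event({"order_id": {"S": "42"}}, {"order_id": {"S": "42"}, "total": {"N": "10"}})` produces an event that describes one newly inserted item.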

For error simulation, you can inject invalid input event data, call the Lambda function handler in the IDE, and ensure it handles it correctly. You can do that by mocking a specific error-handling function and asserting that it was called with the correct parameters.
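A simplified sketch of this technique follows. To keep the example self-contained, the error-handling function is passed in as a parameter (`on_invalid_record`); in a real handler it would more likely be a module-level function replaced with `unittest.mock.patch`. All names and the `order_id` field are hypothetical.

```python
from unittest.mock import MagicMock


def process_stream_event(event: dict, on_invalid_record) -> None:
    """Hypothetical logic: delegate malformed records to an error handler."""
    for record in event.get("Records", []):
        new_image = record.get("dynamodb", {}).get("NewImage")
        if not new_image or "order_id" not in new_image:
            on_invalid_record(record, "missing order_id")
            continue
        # ... business logic for valid records would go here ...


def test_invalid_record_is_reported() -> None:
    on_invalid_record = MagicMock()
    bad_record = {"eventName": "INSERT", "dynamodb": {"NewImage": {"name": {"S": "x"}}}}

    process_stream_event({"Records": [bad_record]}, on_invalid_record)

    # The error handler must be called once, with the offending record.
    on_invalid_record.assert_called_once_with(bad_record, "missing order_id")
```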

For the end-to-end tests, we will reuse the test for the synchronous part. Now, this is where it gets tricky. We need to understand what the side effect of the Lambda function is. We cannot use its return value, as it is triggered asynchronously and does not return a response we can read.

We will need to verify its side effect. If the function writes to the DynamoDB table, we will use a polling mechanism (within reason: a short timeout and a few retries) and check the DynamoDB entry by calling the GET API (if there's no such API, we will use the AWS SDK to read the DynamoDB table directly). This method applies to any other side effect the function produces.

Another option is to poll the function log group, as done in the previous example and validate it has processed the event properly.
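Polling the log group might be sketched like this, assuming `logs_client` is a boto3 CloudWatch Logs client. The function name and default timings are illustrative, and `pattern` must follow the CloudWatch Logs filter pattern syntax.

```python
import time


def wait_for_log_message(
    logs_client,
    log_group: str,
    pattern: str,
    start_time_ms: int,
    timeout_sec: float = 60.0,
    interval_sec: float = 3.0,
) -> bool:
    """Poll a log group until an event matching `pattern` appears."""
    deadline = time.monotonic() + timeout_sec
    while True:
        response = logs_client.filter_log_events(
            logGroupName=log_group,
            startTime=start_time_ms,  # epoch milliseconds; skips older runs
            filterPattern=pattern,
        )
        if response.get("events"):
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_sec)
```

This sketch reads only the first page of results on each attempt. For busy log groups, a fuller version would also follow the `nextToken` value returned by `filter_log_events`.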

Please note that, if the side effect is tricky to validate in E2E flow, it must at least get validated in the integration tests. Once the integration test flow validates it, the log group assertion can be good enough.

Testing Non Lambda Based Asynchronous Flows

SQS -> EventBridge Pipes -> Step Function -> SNS

The EventBridge pipe definition:

In this case, we have a non-Lambda-based EventBridge pipe:

Sets source event from an SQS queue.
Filters the event according to a predefined configuration.
Enriches the event with a Step Function state machine that contains only intrinsic functions (no Lambda functions involved) and returns an enriched event.
Sends the enriched event to an SNS topic.

In this case, we use an AWS-managed Serverless service and write ZERO Lambda function code. Our "code" is the infrastructure configuration code that builds these resources on AWS:

The Step Function state machine
The SQS queue
The SNS topic
The EventBridge pipe itself
All the required roles and the configuration that ties the resources together

Since no Lambda function is involved, we can only write end-to-end tests.

We will need to write a test that puts a message that matches the pipe filter in the entry SQS queue and then looks at the end of the chain, the SNS topic.

In our test environment account (not production), we will add a test-only SQS queue to subscribe to the destination SNS topic. We will not add this SQS in production environments.

In our test, we will fetch messages from the test destination SQS until a message is received or a timeout is reached. The test will fail if we encounter a timeout on our polling, meaning the SNS message was not sent, or if we get a malformed SQS message that does not match the entire enriched message schema the EventBridge pipe was supposed to produce.
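Once a message has been fetched from the test queue, the schema check could look roughly like the sketch below. It assumes raw message delivery is disabled on the SNS subscription, so the SQS body is an SNS envelope whose `Message` field holds the actual payload. The required fields passed to `validate_enriched_event` are hypothetical; a real test would use the service's own schema.

```python
import json


def extract_sns_payload(sqs_body: str) -> dict:
    """Unwrap the SNS envelope that SQS receives when raw delivery is off."""
    envelope = json.loads(sqs_body)
    return json.loads(envelope["Message"])


def validate_enriched_event(payload: dict, required_fields: dict) -> list:
    """Return a list of schema problems; an empty list means valid."""
    problems = []
    for field, expected_type in required_fields.items():
        if field not in payload:
            problems.append(f"missing field: {field}")
        elif not isinstance(payload[field], expected_type):
            problems.append(f"wrong type for field: {field}")
    return problems
```

The test fails if the returned list is not empty, which covers the malformed message case.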

Another option for E2E tests is to individually test the Step Function state machine; trigger it with an input, wait for it to finish its run, and validate its output via AWS SDK API, whether it enriched the input event properly or not. You should also test use cases where invalid inputs get to a 'fail' state in the state machine.
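Testing the state machine on its own might be sketched as follows. The example assumes `sfn_client` is a boto3 Step Functions client and that the state machine is an Express workflow, which can be run synchronously with `start_sync_execution`. A Standard workflow would instead need `start_execution` followed by polling `describe_execution`. The function name is made up for the example.

```python
import json


def run_express_state_machine(sfn_client, state_machine_arn: str, payload: dict):
    """Run an Express workflow synchronously and return (status, output)."""
    response = sfn_client.start_sync_execution(
        stateMachineArn=state_machine_arn,
        input=json.dumps(payload),
    )
    # 'output' is a JSON string and is absent when the execution fails.
    raw_output = response.get("output")
    output = json.loads(raw_output) if raw_output else None
    return response["status"], output
```

A happy-flow test asserts that the status is `SUCCEEDED` and that the output contains the enriched fields, while a negative test sends an invalid input and asserts a `FAILED` status.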

A few observations:

While we didn't write ANY Lambda function code, we did write parts of the logic in the infrastructure configuration code: the Step Function intrinsic steps and the EventBridge pipe filter both contain business domain schema logic.
We require fewer tests (no integration or unit), but I think the tests are less intuitive to write.
We can write only E2E tests that trigger AWS resources and run on our AWS account.