Through experience, we have established a set of guiding principles and rules for testing LocalStack. They aim to keep the pipeline stable, flakes to a minimum, and maintenance effort low. Keep them in mind for any newly added test or feature!
ID | Rule |
---|---|
R01 | Inform code owners and/or test authors about flaky tests by creating a PR skipping them (reason: flaky), so that they can be fixed ASAP. |
R02 | Do not assume external dependencies are indefinitely available at the same location. They can move, and we need to be able to adapt when they do. |
R03 | Where possible, tests should be in control of the resources they use and re-create them if removed. |
R04 | If on-demand creation is not possible, opt for a fail-fast approach and make retrieval failures clearly visible for further investigation. |
R05 | Add mechanisms to avoid access failures caused by rate limiting. |
R06 | For asynchronous (long-running) operations, do not wait a fixed amount of time; instead, opt for a reactive approach using notification systems or polling. |
R07 | For tests with multiple steps, handle waits separately and start each wait in the correct state. |
R08 | Ensure features interacting with account numbers work with arbitrary account numbers and multiple accounts simultaneously. |
R09 | Make sure that your tests are idempotent and could theoretically run in parallel, by using randomized IDs and not re-using IDs across tests. |
R10 | Ensure deterministic responses for anything that reaches an assertion or a snapshot match. |
R11 | Be vigilant about changes to dependencies that can affect the stability of your added features and tests. |
R12 | Ensure all dependencies are available and functional on both AMD64 and ARM64 architectures. If a dependency is exclusive to one architecture, mark the corresponding test accordingly. |
R13 | After the test run, make sure that the created resources are cleaned up properly. |
R14 | Utilize fixture scopes for ensuring created resources exist as long as they should. |
Inform code owners and/or test authors about flaky tests by creating a PR skipping them (reason: flaky), so that they can be fixed ASAP. This way, flaky tests do not block the pipeline and can be fixed in a separate PR. We also set the test author and/or service owner as a reviewer to ensure that the test is fixed in a timely manner.
Bad:
- Noticing a flake
- Ignoring it

Good:
- Noticing a flake
- Creating a new PR skipping the test and marking it as flaky

  ```python
  @pytest.mark.skip(reason="flaky")
  def test_xyz():
      pass
  ```
- Setting the test author and/or service owner as reviewer
Do not assume external dependencies (AWS resources, files, packages, images, licenses) are indefinitely available at the same location. They can move, and we need to be able to adapt when they do. This can be done by checking the status code of the response and reacting accordingly. Ideally, the test should be able to guide anyone on how to find the new location.
Bad:
```python
response = requests.get("http://resource.com/my-resource.tar.gz")
use_resource(response.content)
```

Good:
```python
response = requests.get("http://resource.com/my-resource.tar.gz")
if response.status_code == 404:
    further_steps()  # e.g. clear error message, potential documentation on where to find a new location, etc.
use_resource(response.content)
```
Where possible, tests should be in control of the resources they use and re-create them if removed (e.g., S3 buckets, roles).
Bad:
```python
bucket = s3_client.get_bucket("test-bucket")
use_bucket(bucket)
```

Good:
```python
buckets = s3_client.list_buckets()
if "test-bucket" not in buckets:
    # re-create the bucket on demand if it has been removed
    s3_client.create_bucket("test-bucket")
bucket = s3_client.get_bucket("test-bucket")
use_bucket(bucket)
```
If on-demand creation is not possible, opt for a fail-fast approach and make retrieval failures clearly visible for further investigation. We should not proceed with the test if the resource is not available; doing so can lead to long-running retry loops, bloated log files, and unclear error messages.
Bad:
```python
bucket = s3_client.get_bucket("test-bucket")
use_bucket(bucket)
```

Good:
```python
buckets = s3_client.list_buckets()
if "test-bucket" not in buckets:
    pytest.fail("Expected test-bucket to exist - it doesn't")
bucket = s3_client.get_bucket("test-bucket")
use_bucket(bucket)
```
Add mechanisms to avoid access failures caused by rate limiting. This can be done by adding exponential backoff or caching mechanisms. In some cases, rate limits can be avoided by using an authenticated request.
Bad:
```python
while True:
    response = requests.get("http://resource.com")
    if response.status_code == 429:  # Too many requests
        pass  # immediately try again
    else:
        use(response)
        break
```

Good:
```python
import time

from cachetools import TTLCache, cached

cache = TTLCache(maxsize=128, ttl=60)

@cached(cache)
def get_resource(url, token, retries=10):
    for retry in range(retries):
        response = authenticated_request(url, token)
        if response.status_code == 429:
            time.sleep(2 ** retry)  # exponential backoff
        else:
            return response
    raise RuntimeError(f"still rate-limited after {retries} retries: {url}")

resource = get_resource("http://resource.com", "abdfabdf")
use(resource)
```
For asynchronous (long-running) operations, do not wait a fixed amount of time; instead, opt for a reactive approach using notification systems or polling. Waiting a fixed amount of time can lead to long test runs and flaky tests, as the time needed for the operation can vary.
Bad:
```python
create_resource()
time.sleep(300)
use_resource()
```

Good:
```python
create_resource()
# e.g. poll_condition from localstack.utils.sync
poll_condition(resource_exists, timeout=60)
use_resource()
```
For tests with multiple steps, handle waits separately and start each wait in the correct state. This way, the test stays reactive and does not have to wait a fixed amount of time at any single step.
Bad:
```python
create_resource()
deploy_resource()
use_resource()
```
or
```python
create_resource()
deploy_resource()
poll_condition(resource_deployed, timeout=60)
use_resource()
```

Good:
```python
create_resource()
poll_condition(resource_exists, timeout=20)
deploy_resource()
poll_condition(resource_deployed, timeout=60)
use_resource()
```
Ensure features interacting with account numbers work with arbitrary account numbers and multiple accounts simultaneously. See here for further documentation on multi-account/multi-region testing.
Bad:
- Add new feature
- Use it with a fixed account number
- Works -> done

Good:
- Add new feature
- Use it with a fixed account number
- Works
- Try it with randomized account numbers (as described in the documentation)
- Works -> done
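Such a check could look roughly like the following sketch, where `client_factory` and `create_feature_resource` are hypothetical placeholders for whatever your test setup and feature actually provide:

```python
def test_feature_across_accounts(client_factory):
    # exercise the feature with more than one account ID
    for account_id in ("000000000000", "111111111111"):
        client = client_factory(account_id=account_id)  # hypothetical helper returning an account-scoped client
        arn = create_feature_resource(client)  # hypothetical feature call returning a resource ARN
        # the resource must end up in the account it was created for
        assert account_id in arn
```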
Make sure that your tests are idempotent and could theoretically run in parallel, by using randomized IDs and not re-using IDs across tests. This also means that tests should not depend on each other and should be able to run in any order.
Bad:
```python
def test_something():
    key = "test-bucket"
    create_bucket(key)

def test_something_else():
    key = "test-bucket"
    create_bucket(key)
```

Good:
```python
def test_something():
    key = f"test-bucket-{short_uid()}"
    create_bucket(key)

def test_something_else():
    key = f"test-bucket-{short_uid()}"
    create_bucket(key)
```
Ensure deterministic responses for anything that reaches an assertion or a snapshot match. This is especially important when you have randomized IDs in your tests as per R09. You can achieve this by using proper transformations. See here for further documentation on parity testing and how to use transformers.
snapshot = {"key": "key-asdfasdf"} # representing snapshot as a dict for presentation purposes
def test_something(snapshot):
key = f"key-{short_uid()}"
snapshot.match(snapshot, {"key": key})
snapshot = {"key": "<key:1>"} # representing snapshot as a dict for presentation purposes
def test_something(snapshot):
snapshot.add_transformer(...) # add appropriate transformers
key = f"key-{short_uid()}"
snapshot.match(snapshot, {"key": key})
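As a concrete illustration of the `add_transformer(...)` call, a key-value transformer can replace the randomized value with a stable reference such as `<key:1>`. This is a sketch; the exact transformer API available to you may differ:

```python
def test_something(snapshot):
    # replace the randomized "key" value with a deterministic placeholder before matching
    snapshot.add_transformer(snapshot.transform.key_value("key"))
    key = f"key-{short_uid()}"
    snapshot.match("resource", {"key": key})
```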
Be vigilant about changes happening to dependencies (Python packages and other dependencies) that can affect the stability of your added features and tests.
Bad:
- Add dependency
- Forget about it
- Dependency adds instability
- Disregard it

Good:
- Add dependency
- Check the weekly Python upgrade PR for upgrades to the dependency
- Keep track of relevant changes in the changelog
Ensure all dependencies are available and functional on both AMD64 and ARM64 architectures. If a dependency is exclusive to one architecture, mark the corresponding test accordingly. However, if possible, try to use multi-platform resources.
Bad:
```python
def test_docker():
    docker.run(image="amd64-only-image")
```

Good:
```python
def test_docker():
    docker.run(image="multi-platform-image")
```
If that is not possible, mark the test accordingly:
```python
@markers.only_on_amd64
def test_docker():
    docker.run(image="amd64-only-image")
```
After the test run, make sure that the created resources are cleaned up properly. This can easily be achieved by using fixtures with a yield statement: the resources are cleaned up after the test run, even if the test fails. Furthermore, you can use factory fixtures to create resources on demand and clean them up together (a sketch of this pattern follows the examples below).
Bad:
```python
def test_something():
    key = f"test-{short_uid()}"
    s3_client.create_bucket(key)
    use_bucket(key)
    # bucket still exists after test run
```

Good:
```python
@pytest.fixture
def bucket():
    key = f"test-{short_uid()}"
    s3_client.create_bucket(key)
    yield key
    s3_client.delete_bucket(key)

def test_something(bucket):
    use_bucket(bucket)
    # bucket is deleted after test run
```
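The factory-fixture pattern mentioned above could look roughly like this sketch, reusing the illustrative `s3_client`, `short_uid`, and `use_bucket` helpers from the examples above:

```python
@pytest.fixture
def create_bucket():
    created = []

    def _create():
        name = f"test-{short_uid()}"
        s3_client.create_bucket(name)
        created.append(name)
        return name

    yield _create

    # clean up every bucket created through the factory, even if the test failed
    for name in created:
        s3_client.delete_bucket(name)

def test_multiple_buckets(create_bucket):
    first = create_bucket()
    second = create_bucket()
    use_bucket(first)
    use_bucket(second)
```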
Utilize fixture scopes to ensure created resources exist as long as they should. For example, if a resource should exist for the duration of the whole test run, use the session scope; if it should only exist for the duration of a single test, use the function scope.
@pytest.fixture(scope="function") # function scope is default
def database_server():
server = start_database_server()
yield server
stop_database_server()
@pytest.fixture(scope="function") # function scope is default
def database_connection(database_server):
conn = connect_to_database(database_server)
yield conn
conn.close()
def test_insert_data(database_connection):
insert_data(database_connection)
# The database server is started and stopped for each test function,
# leading to increased overhead and potential performance issues.
def test_query_data(database_connection):
query_data(database_connection)
# Similar issue here, the server is started and stopped for each test.
@pytest.fixture(scope="session")
def database_server():
server = start_database_server()
yield server
stop_database_server()
@pytest.fixture(scope="function") # function scope is default
def database_connection(database_server):
conn = connect_to_database(database_server)
yield conn
conn.close()
def test_insert_data(database_connection):
insert_data(database_connection)
def test_query_data(database_connection):
query_data(database_connection)
For tests, we offer additional markers, which can be found in `localstack/testing/pytest/marking.py`.
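As a minimal usage sketch: assuming the `markers` object from that module is importable as shown below (the exact import path may differ), a marker such as `only_on_amd64` (used in the R12 example above) is applied as a plain decorator:

```python
from localstack.testing.pytest import markers  # assumed import path for the markers object

@markers.only_on_amd64  # architecture marker, as used in the R12 example above
def test_amd64_only_feature():
    ...
```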