AmazonServiceException. The security token included in the request is expired (Service: AmazonChimeSDKMessaging; Status Code: 403; Error Code: ExpiredTokenException) #3648

Open
Orbyt opened this issue Dec 6, 2024 · 8 comments

Orbyt commented Dec 6, 2024

Describe the bug
We've suddenly begun receiving error reports via our error reporting software that many users are encountering the following exception when attempting to send a message via Chime:

AmazonServiceException

The security token included in the request is expired (Service: AmazonChimeSDKMessaging; Status Code: 403; Error Code: ExpiredTokenException; Request ID: .....)

This issue appeared suddenly about a week ago across multiple app versions in the wild and doesn't seem to be associated with any application changes we've made (recent changes have been completely unrelated to anything using AWS/Chime/Cognito/Amplify). The number of occurrences seems to be rising quickly. The exception appears to occur when users are hitting the send button in our app, which sends a message via Chime's SendChannelMessageRequest.

I have not found any relevant information regarding this error online, either related to what it signifies, why it began, or how to resolve it.

To Reproduce
We're unsure of the cause, or how to reproduce the issue. The issue seems to be occurring when a user attempts to send a message (i.e. sending a SendChannelMessageRequest via messagingClient.sendChannelMessage(request))

Which AWS service(s) are affected?
We are using AWS Chime, Cognito, and Amplify, via the following dependencies:

// Amplify
implementation("com.amplifyframework.ui:authenticator:1.2.0")
implementation("com.amplifyframework:aws-storage-s3:2.19.1")

// Chime
implementation("com.amazonaws:aws-android-sdk-chimesdkmessaging:2.35.0")
implementation("com.amazonaws:aws-android-sdk-chimesdkidentity:2.35.0")

Expected behavior
We're expecting no exception to occur.

Screenshots
Not applicable.

Environment Information (please complete the following information):

  • AWS Android SDK Version: 2.35.0 (aws-android-sdk-chimesdkmessaging and aws-android-sdk-chimesdkidentity), used alongside Amplify authenticator 1.2.0 and aws-storage-s3 2.19.1, as listed under the affected services
  • Device: Seen on multiple device models, e.g. Samsung Galaxy S22, Pixel 6a, etc.

  • Android Version: Seen on multiple versions, Android 9, Android 13, Android 14

  • Specific to simulators: No

Additional context
Again, this issue appears to have begun suddenly, without any changes introduced by us. Users on older versions of our Android application also began encountering the issue about a week ago, lending credence to the theory that it is not associated with any recent changes we've made to our application.

@github-actions github-actions bot added pending-triage Issue is pending triage pending-maintainer-response Issue is pending response from an Amplify team member labels Dec 6, 2024
@tylerjroach
Member

Can you share code snippets showing how you are configuring the Chime clients? Are you following the instructions to inject a custom credentials provider as shown here: https://docs.amplify.aws/gen1/android/sdk/configuration/amplify-compatibility/.

By default, the clients would use AWSMobileClient, which is incompatible with Amplify v2 and would actually cause credentials to be wiped.

@tylerjroach tylerjroach added the question General question label Dec 6, 2024
@github-actions github-actions bot removed pending-maintainer-response Issue is pending response from an Amplify team member pending-triage Issue is pending triage labels Dec 6, 2024
@Orbyt
Author

Orbyt commented Dec 6, 2024

Hey @tylerjroach, thanks for your prompt reply. Here's how the messaging client is being configured:

val CHIME_SDK_APP_INSTANCE_ARN="arn:aws:chime:us-east-1:xxxxxxxx"

lateinit var messagingClient: AmazonChimeSDKMessagingClient
lateinit var chimeUser: ChimeUser
lateinit var chimeUserCredentials: ChimeUserCredentials

/**
 * The primary channel the user uses to chat with their provider.
 */
lateinit var primaryChannelArn: String

private var session: MessagingSession? = null
private var sessionMessagingId: String? = null
fun fetchSession(queryForMessages: Boolean = true) {
    viewModelScope.launch {
        withContext(Dispatchers.IO) {
            Amplify.Auth.fetchAuthSession(
                {
                    val session = it as AWSCognitoAuthSession
                    when (session.identityIdResult.type) {
                        AuthSessionResult.Type.SUCCESS -> {
                            val tempCreds = session.awsCredentialsResult.value as AWSTemporaryCredentials
                            chimeUserCredentials = ChimeUserCredentials(
                                accessKeyId = session.awsCredentialsResult.value!!.accessKeyId,
                                secretAccessKey = session.awsCredentialsResult.value!!.secretAccessKey,
                                sessionToken = tempCreds.sessionToken
                            )
                            chimeUser = ChimeUser(
                                chimeDisplayName = "test",
                                chimeUserId = session.identityIdResult.value!!,
                                chimeAppInstanceUserArn = "${CHIME_SDK_APP_INSTANCE_ARN}/user/${session.userSubResult.value!!}"
                            )
                            initializeClient(chimeUserCredentials, chimeUser, queryForMessages)
                        }
                        AuthSessionResult.Type.FAILURE -> {
                            Log.w("AuthQuickStartAmplify", "IdentityId not found", session.identityIdResult.error)
                        }
                    }
                },
                {
                    Log.e("AuthQuickStartAmplify", "Failed to fetch session", it)
                }
            )
        }
    }
}

private fun initializeClient(
    credentials: ChimeUserCredentials,
    chimeUser: ChimeUser,
    queryForMessages: Boolean
) {
    messagingClient = AmazonChimeSDKMessagingClient(
        BasicSessionCredentials(
            credentials.accessKeyId,
            credentials.secretAccessKey,
            credentials.sessionToken
        )
    ).apply { setRegion(Region.getRegion("us-east-1")) }

    if (queryForMessages) {
        listChannelsForUser(messagingClient, chimeUser)
    } else {
        startMessagingSession(null)
    }
}

Are you following the instructions to inject a custom credentials provider as shown here: https://docs.amplify.aws/gen1/android/sdk/configuration/amplify-compatibility/.

By default, the clients would use AWSMobileClient, which is incompatible with Amplify v2 and would actually cause credentials to be wiped.

I'm not entirely sure I understand your comments here. I don't see an AWSMobileClient reference in our application, so I assume the answer to your second question is "no". We started using Amplify after v2 was released, and therefore aren't using anything related to v1 (assuming the v2 setup docs didn't reference v1 instructions). Would you mind elaborating on your questions? Hopefully the code sample I provided adds more context. I appreciate the assistance!

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 6, 2024
@tylerjroach
Member

The recommended way to initialize the Chime clients would be to implement an AmplifyCredentialsProvider as shown here: https://docs.amplify.aws/gen1/android/sdk/configuration/amplify-compatibility/#creating-an-amplifycredentialsprovider.

The issue with the way your code is configured above is that it takes credentials directly, and the client itself has no way to refresh them. When the token expires (by default, I think Cognito tokens only last 1 hour), I believe you would run into the issue you have posted. The token received from fetchAuthSession may not even be newly generated, so it could expire much sooner.

If you pass an implementation of AWSCredentialsProvider (such as the AmplifyCredentialsProvider we suggest) to the client constructor instead of AWSCredentials, the client will ensure it is using valid tokens and will receive refreshed ones as needed.
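
As an illustration of the distinction drawn in this comment, here is a minimal, library-free sketch. None of these types come from the AWS SDK or Amplify: `Credentials`, `CredentialsSource`, `SnapshotClient`, `ProviderClient`, and `RefreshingSource` are hypothetical stand-ins, the clock is a plain counter in minutes, and the 60-minute lifetime is only the default guessed at above. It models why a client built from a one-time credentials snapshot eventually sends an expired token, while a client that asks a provider on every request does not.

```kotlin
// Hypothetical stand-ins for illustration; these are not AWS SDK types.
data class Credentials(val token: String, val expiresAtMinute: Long)

fun interface CredentialsSource {
    fun current(nowMinute: Long): Credentials
}

class ExpiredTokenException(message: String) : Exception(message)

// Simulated service-side check: rejects any token at or past its expiry.
fun send(credentials: Credentials, nowMinute: Long): String {
    if (nowMinute >= credentials.expiresAtMinute) {
        throw ExpiredTokenException("token ${credentials.token} expired")
    }
    return "sent with ${credentials.token}"
}

// Holds one snapshot for its whole lifetime, like a client built from static credentials.
class SnapshotClient(private val credentials: Credentials) {
    fun sendMessage(nowMinute: Long): String = send(credentials, nowMinute)
}

// Asks the source on every request, like a client built from a credentials provider.
class ProviderClient(private val source: CredentialsSource) {
    fun sendMessage(nowMinute: Long): String = send(source.current(nowMinute), nowMinute)
}

// Issues tokens valid for 60 minutes and reissues once the cached one has expired.
class RefreshingSource : CredentialsSource {
    private var cached: Credentials? = null
    private var issued = 0

    override fun current(nowMinute: Long): Credentials {
        val existing = cached
        if (existing != null && nowMinute < existing.expiresAtMinute) return existing
        issued += 1
        return Credentials("token-$issued", nowMinute + 60).also { cached = it }
    }
}

// The snapshot works at minute 10 but is rejected at minute 75.
fun snapshotFailsAfterExpiry(): Boolean {
    val client = SnapshotClient(Credentials("token-1", expiresAtMinute = 60))
    client.sendMessage(nowMinute = 10)
    return try {
        client.sendMessage(nowMinute = 75)
        false
    } catch (e: ExpiredTokenException) {
        true
    }
}

// The provider-backed client picks up a reissued token at minute 75.
fun providerSurvivesExpiry(): String {
    val client = ProviderClient(RefreshingSource())
    client.sendMessage(nowMinute = 10)
    return client.sendMessage(nowMinute = 75)
}

fun main() {
    println(snapshotFailsAfterExpiry())
    println(providerSurvivesExpiry())
}
```

In this toy model the snapshot client fails in the same way as the reported 403, purely because nothing ever replaces the token it was constructed with.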

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 6, 2024
@Orbyt
Author

Orbyt commented Dec 6, 2024

@tylerjroach Thanks again for your prompt reply.

The recommended way to initialize the Chime clients would be to implement an AmplifyCredentialsProvider as shown here: https://docs.amplify.aws/gen1/android/sdk/configuration/amplify-compatibility/#creating-an-amplifycredentialsprovider.

Is this noted anywhere in the Chime documentation? Prior to your reply I had not read anything about an AmplifyCredentialsProvider, and the single Chime example app doesn't use it. Regardless, as per your suggestion, I've attempted the following:

class AmplifyCredentialsProvider : AWSCredentialsProvider {

    override fun getCredentials(): AWSCredentials = runBlocking {
        suspendCoroutine { continuation ->
            Amplify.Auth.fetchAuthSession(
                { authSession ->
                    val awsTemporaryCredentials = (authSession as? AWSCognitoAuthSession)
                        ?.awsCredentialsResult?.value as? AWSTemporaryCredentials

                    val sdkCredentials = awsTemporaryCredentials?.let {
                        BasicSessionCredentials(it.accessKeyId, it.secretAccessKey, it.sessionToken)
                    }

                    if (sdkCredentials != null) {
                        continuation.resume(sdkCredentials)
                    } else {
                        val authException = RuntimeException("Failed to get credentials")
                        continuation.resumeWithException(authException)
                    }
                },
                {
                    continuation.resumeWithException(
                        RuntimeException("Failed to get credentials. See exception.", it)
                    )
                }
            )
        }
    }

    override fun refresh() = runBlocking {
        suspendCoroutine { continuation ->
            Amplify.Auth.fetchAuthSession(
                AuthFetchSessionOptions.builder().forceRefresh(true).build(),
                // We do not need to capture value if refresh succeeds
                { continuation.resume(Unit) },
                // We do not need to throw if refresh fails
                { continuation.resume(Unit) }
            )
        }
    }
}

Then, replacing the following:

// Replaced this:
// messagingClient = AmazonChimeSDKMessagingClient(
//     BasicSessionCredentials(
//         credentials.accessKeyId,
//         credentials.secretAccessKey,
//         credentials.sessionToken)
// )

// With this:
messagingClient = AmazonChimeSDKMessagingClient(AmplifyCredentialsProvider())
    .apply { setRegion(Region.getRegion("us-east-1")) }

That change seems to break the integration entirely. Breakpoints in the refresh and getCredentials functions in AmplifyCredentialsProvider show those functions are never called. Is the above implementation incorrect? It was copied without changes from the documentation you referenced. Does something else need to be done to ensure those overridden functions are invoked?

The issue with the way your code is configured above is that it takes credentials directly, and the client itself has no way to refresh them. When the token expires (by default, I think Cognito tokens only last 1 hour), I believe you would run into the issue you have posted. The token received from fetchAuthSession may not even be newly generated, so it could expire much sooner.

The original implementation I posted was called in a Fragment's ViewModel whenever the Fragment was created. I'm not sure what you meant regarding fetchAuthSession; why would credentials from a fresh call to fetchAuthSession already be expired? Those credentials are then immediately used to create a new instance of the messaging client, so even if the messaging client doesn't refresh its own credentials internally, any instance of the messaging client should be using relatively fresh credentials, as it would have been instantiated in the success callback of fetchAuthSession.

As an aside, I'd like to reiterate that this issue only just recently started occurring (~10 days ago). The implementation I originally posted has been successfully used in production for months. If the issue was an implementation issue, why did this just suddenly start occurring? Could this have been caused by configuration changes upstream, or something of the like?

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 6, 2024
@tylerjroach
Member

I'm not sure what you meant regarding fetchAuthSession; why would credentials from a fresh call to fetchAuthSession already be expired? Those credentials are then immediately used to create a new instance of the messaging client, so even if the messaging client doesn't refresh its own credentials internally, any instance of the messaging client should be using relatively fresh credentials, as it would have been instantiated in the success callback of fetchAuthSession.

They would not expire immediately, but credentials from fetchAuthSession are cached. For example, let's say you call fetchAuthSession for the first time and the tokens have a 1 hr validity period.

Then 30 minutes from now (either through another fetchAuthSession call within the same session, or even a device restart), the next fetchAuthSession call is going to return those same credentials, which are now valid for only another 30 minutes.

You can actually force fetchAuthSession to get brand new credentials by adding forceRefresh = true. https://docs.amplify.aws/gen1/android/build-a-backend/auth/accessing-credentials/#force-refreshing-session
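
The arithmetic in the example above can be written out as a small sketch. The helper `remainingValidity` is hypothetical (it is not part of Amplify or the AWS SDK), and the timestamps are made up; the one-hour lifetime is simply the figure used in the example.

```kotlin
import java.time.Duration
import java.time.Instant

// Time left on credentials issued at `issuedAt` with the given lifetime; never negative.
fun remainingValidity(issuedAt: Instant, lifetime: Duration, now: Instant): Duration {
    val remaining = Duration.between(now, issuedAt.plus(lifetime))
    return if (remaining.isNegative) Duration.ZERO else remaining
}

fun main() {
    val issuedAt = Instant.parse("2024-12-06T12:00:00Z") // first fetchAuthSession call
    val lifetime = Duration.ofHours(1)                   // assumed validity period

    // A second call 30 minutes later returns the same cached credentials.
    val secondCall = issuedAt.plus(Duration.ofMinutes(30))
    println(remainingValidity(issuedAt, lifetime, secondCall).toMinutes()) // prints 30

    // A client still holding that snapshot 90 minutes after issue has nothing left.
    val later = issuedAt.plus(Duration.ofMinutes(90))
    println(remainingValidity(issuedAt, lifetime, later).toMinutes()) // prints 0
}
```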

That change seems to break the integration entirely. Breakpoints in the refresh and getCredentials functions in AmplifyCredentialsProvider show those functions are never called. Is the above implementation incorrect?

What happens when you make a call from within the client? You wouldn't expect to see credentials fetched until a method from within the client is called. What happens when you call one? Do you get an error message? Can you show the code for listChannelsForUser as an example?

I'm not sure why it would begin having issues all of a sudden. If issues persist, it may be necessary to reach out to the Chime service team, as my team is only responsible for the SDK (which in this case has not changed). I'm just trying to rule out any configuration issues, and noticed the potential that tokens could expire (which would line up with the 403 code).

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 6, 2024
@Orbyt
Author

Orbyt commented Dec 10, 2024

They would not expire immediately, but credentials from fetchAuthSession are cached. For example, let's say you call fetchAuthSession for the first time and the tokens have a 1 hr validity period.

Then 30 minutes from now (either through another fetchAuthSession call within the same session, or even a device restart), the next fetchAuthSession call is going to return those same credentials, which are now valid for only another 30 minutes.

You can actually force fetchAuthSession to get brand new credentials by adding forceRefresh = true. https://docs.amplify.aws/gen1/android/build-a-backend/auth/accessing-credentials/#force-refreshing-session

Interesting! I was under the impression they were refreshed automatically.

What happens when you make a call from within the client? You wouldn't expect to see credentials fetched until a method from within the client is called. What happens when you call one? Do you get an error message? Can you show the code for listChannelsForUser as an example?

That is not what I expected, but you're right: it appears the credentials call is not invoked on client instantiation, but rather when some functionality needs it (e.g. loading messages).

I'm not sure why it would begin having issues all of a sudden. If issues persist, it may be necessary to reach out to the Chime service team, as my team is only responsible for the SDK (which in this case has not changed). I'm just trying to rule out any configuration issues, and noticed the potential that tokens could expire (which would line up with the 403 code).

I'm not familiar with how long-lived those tokens are expected to be. This issue started occurring in late November, about a month (and a day or two) after our initial public release. Is it possible the original tokens had a one-month validity period, and are just now starting to expire? I will look into contacting Chime support; I assume that is done via an AWS Support account as per this link, but please let me know otherwise.
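
The lazy behavior observed here can be modeled with a short sketch. `CountingProvider` and `LazyClient` are hypothetical stand-ins, not SDK classes; the sketch only shows the general pattern of a client that stores its provider at construction and consults it on the first real request, which would explain breakpoints not firing until something like loading messages happens.

```kotlin
// Hypothetical stand-ins for illustration; these are not AWS SDK types.
class CountingProvider {
    var calls = 0
        private set

    fun getCredentials(): String {
        calls += 1
        return "token-$calls"
    }
}

class LazyClient(private val provider: CountingProvider) {
    // Construction only stores the provider; credentials are requested per call.
    fun loadMessages(): String = "loaded with ${provider.getCredentials()}"
}

// Number of provider calls made by constructing a client and doing nothing else.
fun callsAfterConstruction(): Int {
    val provider = CountingProvider()
    LazyClient(provider)
    return provider.calls
}

// Number of provider calls made once the client performs its first request.
fun callsAfterFirstRequest(): Int {
    val provider = CountingProvider()
    LazyClient(provider).loadMessages()
    return provider.calls
}

fun main() {
    println(callsAfterConstruction()) // prints 0
    println(callsAfterFirstRequest()) // prints 1
}
```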

I've refactored our code, replacing fetchAuthSession with the new AmplifyCredentialsProvider discussed earlier. One of the problems that has arisen is that Chime, as per the Chime example app available here, requires a ChimeUserCredentials object (a simple data class containing an accessKeyId, secretAccessKey, and sessionToken) for various purposes, including establishing a web socket connection. For example, those credentials are used here to set an X-Amz-Security-Token. If the AmplifyCredentialsProvider can refresh tokens at will (or rather, can be used to refresh as needed), how does one ensure those values are properly updated in, e.g., a DefaultMessagingSession?

That example app has not been updated in some time, so I assume it may be severely outdated and not in line with some of the other Chime documentation. Please let me know if I should be looking elsewhere!

@github-actions github-actions bot added the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 10, 2024
@Orbyt
Author

Orbyt commented Dec 10, 2024

As an aside:

You can actually force fetchAuthSession to get brand new credentials by adding forceRefresh = true

Would adding that option to the fetchAuthSession call I originally posted sufficiently resolve this problem? It seems the refresh() implementation in the AmplifyCredentialsProvider simply performs a call to Amplify.Auth.fetchAuthSession with .forceRefresh(true), so I'm wondering if simply making that change would be more prudent than swapping the entire implementation to use the new AmplifyCredentialsProvider. Would there be any notable drawbacks in doing that?

@tylerjroach
Member

It looks like there may be a newer Android Chime SDK here: https://github.com/aws/amazon-chime-sdk-android. It may be worth exploring that.

Interesting! I was under the impression they were refreshed automatically.

The token will be refreshed automatically. This happens on a call to fetchAuthSession where we detect the token is expired (or close to expiration). You can just pass the forceRefresh flag to force it.
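
A rough sketch of that refresh rule, for illustration only: this is not Amplify's implementation. `Session` and `SessionCache` are hypothetical, the clock is a counter in minutes, and both the 60-minute lifetime and the 5-minute "close to expiration" margin are assumptions chosen to make the example concrete.

```kotlin
// Hypothetical stand-ins; this does not reflect Amplify's internals.
data class Session(val token: String, val expiresAtMinute: Long)

class SessionCache(
    private val lifetimeMinutes: Long = 60,     // assumed token lifetime
    private val refreshMarginMinutes: Long = 5  // assumed "close to expiration" margin
) {
    private var cached: Session? = null
    private var issued = 0

    // Returns the cached session unless it is missing, near expiry, or a refresh is forced.
    fun fetch(nowMinute: Long, forceRefresh: Boolean = false): Session {
        val existing = cached
        val stillFresh = existing != null &&
            nowMinute < existing.expiresAtMinute - refreshMarginMinutes
        if (existing != null && stillFresh && !forceRefresh) return existing
        issued += 1
        return Session("token-$issued", nowMinute + lifetimeMinutes).also { cached = it }
    }
}

// Walks through a cached hit, an automatic refresh, and a forced refresh.
fun tokensSeen(): List<String> {
    val cache = SessionCache()
    return listOf(
        cache.fetch(0).token,                      // first call issues token-1 (expires at 60)
        cache.fetch(30).token,                     // well before expiry: cached token-1
        cache.fetch(57).token,                     // inside the margin: reissued as token-2
        cache.fetch(58, forceRefresh = true).token // forced: reissued as token-3
    )
}

fun main() {
    println(tokensSeen()) // prints [token-1, token-1, token-2, token-3]
}
```

Note that in this model a refresh only ever happens inside `fetch`, so a caller that copies a token out and never calls `fetch` again is unaffected by either rule.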

I'm not familiar with how long lived those tokens are expected to be.

You can check access token validity length in the Cognito console.

As far as the sample app goes, our team maintains Amplify and the AWS Android SDK. The Chime team would own the sample apps. I would recommend reaching out via the link you provided, as well as requesting that the sample app be updated to Amplify v2 as Amplify v1 is in maintenance mode and no longer supported. You may want to create a GH issue here: https://github.com/aws-samples/amazon-chime-sdk/tree/main/apps/chat-android

@github-actions github-actions bot removed the pending-maintainer-response Issue is pending response from an Amplify team member label Dec 11, 2024