Re-subscribing to a chid that was recently unsubscribed gives code 410 errno 103 #1445
Comments
From a user, it seems like we're hitting this in non-FxA applications as well: mozilla-mobile/fenix#15028 (comment) |
I wonder if this is connected a little bit to the behaviour in #1444, in which I wrote:
Perhaps this is now evidence of "devices that seem to be in such a state in the wild", since "UAID not found" is exactly the sort of error message I'd expect for a client whose uaid record had been discarded by the server. |
FWIW, https://sql.telemetry.mozilla.org/queries/73067#183005 is showing send tab success nose-diving on Android - which prompted me to see how it works for me, and logcat shows what Ryan was getting at:
|
If this started happening in the last week or so, it's probably because on Android we see the I'm guessing the send tab metrics may have been reporting it as a success but the messages never reached the device and silently dropped. |
In theory, if the |
On the other hand, if the uaid record had been discarded by the server, I would expect the call to |
Yes, it's why I gave the bug the weight I did. There's going to be a bit of investigation to see exactly what's going on here and I'm not sure how easily I can replicate what's happened. The database appears to have gotten into a strange state. |
If it's helpful, we're tracking the exceptions from the unsubscribe in Nightly here: https://sentry.prod.mozaws.net/operations/firefox-nightly/issues/10175063/ |
The sentry breadcrumbs tell us that it is in fact the unsubscribe call that's failing, which makes more sense. |
I believe it's the responsibility of the rust code to handle the "missing uaid" case better, and have filed mozilla/application-services#3751 to follow up. |
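As a rough illustration of what handling the "missing uaid" case could look like, here is a minimal sketch. It is not the application-services implementation: `ServerError`, `Registrar`, `LocalState`, `FakeServer` and `unsubscribe_with_recovery` are all hypothetical names, and the recovery policy (drop the stale uaid and register again) is only one possible choice.

```rust
/// Hypothetical error type; the real component's errors may look different.
#[derive(Debug, PartialEq)]
enum ServerError {
    /// The server has no record of the uaid the client sent.
    UaidNotFound,
    /// Any other failure.
    Other(String),
}

/// Hypothetical view of the push server as seen by the client.
trait Registrar {
    /// Registers a brand new client and returns its uaid.
    fn register(&mut self) -> Result<String, ServerError>;
    /// Drops the subscription for `chid` under `uaid`.
    fn unsubscribe(&mut self, uaid: &str, chid: &str) -> Result<(), ServerError>;
}

/// Hypothetical locally persisted registration state.
struct LocalState {
    uaid: Option<String>,
}

/// Unsubscribes `chid`; if the server no longer knows our uaid, the local
/// record is stale, so it is discarded and replaced by a fresh registration.
fn unsubscribe_with_recovery<R: Registrar>(
    server: &mut R,
    state: &mut LocalState,
    chid: &str,
) -> Result<(), ServerError> {
    let uaid = match state.uaid.clone() {
        Some(u) => u,
        None => return Ok(()), // never registered, nothing to unsubscribe
    };
    match server.unsubscribe(&uaid, chid) {
        Err(ServerError::UaidNotFound) => {
            state.uaid = None;
            let fresh = server.register()?;
            state.uaid = Some(fresh);
            Ok(())
        }
        other => other,
    }
}

/// In-memory fake used only to exercise the sketch.
struct FakeServer {
    known: Vec<String>,
    next_id: u32,
}

impl Registrar for FakeServer {
    fn register(&mut self) -> Result<String, ServerError> {
        let uaid = format!("uaid-{}", self.next_id);
        self.next_id += 1;
        self.known.push(uaid.clone());
        Ok(uaid)
    }

    fn unsubscribe(&mut self, uaid: &str, chid: &str) -> Result<(), ServerError> {
        if chid.is_empty() {
            return Err(ServerError::Other("empty chid".to_string()));
        }
        if self.known.iter().any(|k| k.as_str() == uaid) {
            Ok(())
        } else {
            Err(ServerError::UaidNotFound)
        }
    }
}
```

With a `FakeServer` that knows no uaids and a `LocalState` holding a stale uaid, the call returns `Ok(())` and leaves a freshly registered uaid in the local state, so the next subscribe starts from a clean record.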
Doing some research on this, I am not able to reproduce it locally. One kind of dumb thought is that if the CHID isn't included and the client sends a In any case, I see that the number of subscription errors seems a lot lower recently. |
@jrconlin I left a comment on that Sentry exception that I'll copy here:
We reverted our changes that found the error during the holiday break since we didn't have a solution. We do still have devices that are stuck in this state as well. |
Yeah, I get that. FWIW, I was working with someone outside trying to resolve an issue and was seeing something kind of odd. I cataloged it here: mozilla-mobile/android-components#9426 but I wonder if it might be related? Basically, I wonder if Firefox might be getting an old FCM id for some weird reason, which would get rejected by FCM on the server side, leading to the 410. It kinda gets back to the whole "FCM is a weird black box" problem, and I'd love to get away from it, but that's just not happening anytime soon. |
I also have a device that is stuck with a |
I am contributing here, as many related issues are closed with reference to this issue. Samsung Galaxy S9 / Android 10 / OneUI 2.5
Please let me know if anything else is needed from my side or if I should file a new issue. |
@wurstsemmel Thanks for this, but I think it would be better posted on the fenix repo. This repo is the back-end service that sends the notifications to desktop and mobile devices. Thanks! |
Is this the same bug as mozilla-mobile/fenix#19152? My device logs (which I have just attached to that issue) have the same message. I am using the Firebase Messaging Web library and had been repeatedly subscribing and toggling notification settings for testing, if that's relevant. Happy to participate in any logging/debugging. |
I don't think this is the same issue - if I'm reading the client code correctly, the check for a |
I'd have to dig into the code a bit. I can't think of a reason offhand that you should get a non-error registration response that doesn't have a UAID in it, unless there might be something that's causing the UAID value to be unreadable by the client (e.g. dashes can sometimes be a problem, if I recall; the code tries to return the same value format that it got, but something could be tripping it up). |
In Android Components, we recently landed a change that attempts to fix an expired FxA push subscription when the FxA server sets the flag subscriptionExpired. The fix is to perform an "unsubscribe" and then a "subscribe" with the same chid. However, we're seeing logs where the subscribe fails with a code 410 errno 103 ("Expired URL endpoint") after a successful unsubscribe:
This is the code that is executed to show the logs above.
A breakdown of the logs:
"Un-subscribing successful: true" is when the Android code executes the native Rust component's unsubscribe.
"Subscribe call should give you a new endpoint." is the native result coming back as true (as seen in the previous line).
"Re-subscribing failed; FxA push events will not be received." is when we receive a PushError from the native component's wrapper.
Since the unsubscribe reported success, the subscribe call to the same chid should work as well and provide us with a new endpoint.
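The unsubscribe-then-subscribe sequence described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the Android Components or application-services code: `PushClient`, `PushError`, `Renewal`, `FakeClient` and `renew_subscription` are hypothetical names, and the endpoint URL is a placeholder. Only the 410/103 pair is taken from the logs in this issue.

```rust
/// Hypothetical error type; only the 410 / 103 pair comes from this issue.
#[derive(Debug, PartialEq)]
enum PushError {
    /// The server replied with an HTTP status code and an errno.
    Server { code: u16, errno: u16 },
}

/// Hypothetical stand-in for the native push component.
trait PushClient {
    fn unsubscribe(&mut self, chid: &str) -> Result<bool, PushError>;
    fn subscribe(&mut self, chid: &str) -> Result<String, PushError>;
}

/// Outcome of one renewal attempt.
#[derive(Debug, PartialEq)]
enum Renewal {
    NewEndpoint(String),
    UnsubscribeFailed,
    ResubscribeFailed(PushError),
}

/// Unsubscribes `chid`, then subscribes the same chid again.
/// A successful unsubscribe is expected to be followed by a new endpoint.
fn renew_subscription<C: PushClient>(client: &mut C, chid: &str) -> Renewal {
    match client.unsubscribe(chid) {
        Ok(true) => {}
        Ok(false) | Err(_) => return Renewal::UnsubscribeFailed,
    }
    match client.subscribe(chid) {
        Ok(endpoint) => Renewal::NewEndpoint(endpoint),
        Err(e) => Renewal::ResubscribeFailed(e),
    }
}

/// Fake client; with `reject_resubscribe` set it mimics the reported bug by
/// refusing to subscribe a chid that was unsubscribed earlier.
struct FakeClient {
    reject_resubscribe: bool,
    unsubscribed: Vec<String>,
}

impl PushClient for FakeClient {
    fn unsubscribe(&mut self, chid: &str) -> Result<bool, PushError> {
        self.unsubscribed.push(chid.to_string());
        Ok(true)
    }

    fn subscribe(&mut self, chid: &str) -> Result<String, PushError> {
        let was_unsubscribed = self.unsubscribed.iter().any(|c| c.as_str() == chid);
        if self.reject_resubscribe && was_unsubscribed {
            Err(PushError::Server { code: 410, errno: 103 })
        } else {
            // Placeholder endpoint, not a real push URL.
            Ok(format!("https://push.example/{}", chid))
        }
    }
}
```

Keeping the two failure cases apart in `Renewal` lets the caller tell "unsubscribe never succeeded" from the situation reported here, where unsubscribe returns true and the following subscribe is rejected.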
Sentry (Nightly): https://sentry.prod.mozaws.net/operations/firefox-nightly/issues/10175063/