-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Hack Week] A generic solution for resolving the ANRs caused by accessing local database #20937
Conversation
Generated by 🚫 Danger |
Quality Gate passedIssues Measures |
📲 You can test the changes from this Pull Request in WordPress by scanning the QR code below to install the corresponding build.
|
📲 You can test the changes from this Pull Request in Jetpack by scanning the QR code below to install the corresponding build.
|
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## trunk #20937 +/- ##
==========================================
- Coverage 40.93% 40.93% -0.01%
==========================================
Files 1518 1519 +1
Lines 69550 69554 +4
Branches 11473 11473
==========================================
Hits 28473 28473
- Misses 38491 38495 +4
Partials 2586 2586 ☔ View full report in Codecov by Sentry. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your work on this @jarvislin 🙇
I've tested the specific use case and it works as expected for me 🎉
The introduced AsyncHandler
helper makes a lot of sense and adds structure to the code especially when calling from java
🥇
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this PR has already been merged, but looking at it I felt it could be worth a post-merge Review to maybe rethink if this makes sense to be kept as-is.
I understand this PR is more of a quick solution to some of the ANR problems we have, but I have concerns about its rationale being a bit too simplistic and potentially introducing other issues that degrade user experience and are harder to track (non-crashing/behavioral bugs).
Besides the technical concerns I pointed below in the code comments, I feel like each place in the codebase that currently calls DB/network code from the main thread needs to be investigated individually, and the long-term solution for most of them is likely some refactor to work correctly with async and concurrent calls.
That said it might make sense to still have some quick fix like this for some stuff, but we need to be aware of the risks.
*/ | ||
@JvmStatic | ||
fun <T> load(backgroundTask: () -> T, callback: AsyncTaskCallback<T>) { | ||
CoroutineScope(Dispatchers.IO).launch { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a few concerns about the usage of Coroutines here.
CoroutineScope
The CoroutineScope
is being created ad hoc instead of using an existing Scope or Context/Job, which is usually attached to some lifecycle-aware component. This is essentially the same as using the GlobalScope
, which should be avoided.
This means that the Job
launched here is completely standalone, bypassing one of the biggest advantages introduced by coroutines: structured concurrency.
In practice, this means that even if the caller of the coroutine dies, the callback lambda will still be called (since the Scope is never canceled) which can cause issues if said callback references something that was destroyed.
TBH I'm not sure what would be the best solution here, but using an existing Scope that's close to the component that launches this Job would be preferred, all-in-all, thinking about this helper class, I'd say receiving the CoroutineScope
via argument would make more sense so the caller can pass an appropriate Scope.
Hardcoded Dispatchers.IO
This is another thing that should be avoided, not only because it's hard to test, but also because, based on the name of this class (async TASK), it looks like this is supposed to be a generic "offload-to-background" helper.
This means that passing a backgroundTask
that runs computational work could be expected, but that kind of work should be run in the Dispatchers.Default
, while Dispatchers.IO
should be used only for IO/waiting tasks. More info here.
The solution for this should probably be to receive the CoroutineDispatcher
as an argument since the caller would be the one who knows what kind of task is being run.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thomashorta Thank you for pointing out the potential issues in the current implementation.
I'd say receiving the CoroutineScope via argument would make more sense so the caller can pass an appropriate Scope.
Yeah this is also the way to handle the lifecycle stuff properly, will update it later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the Hardcoded Dispatchers.IO, I'm not sure if I should receive the CoroutineDispatcher as an argument since the initial idea was to address issues related to accessing the database, opening it up to all dispatchers seems a bit over-engineered to me.
Therefore, I propose we keep this part as is and handle it through renaming for now. If a future need arises, we can then add the dispatcher argument. What do you think?
P.S. I think this helper is more related to the legacy code, especially in Java. Those codebase write in Kotlin should be able to handle the requirements without using this helper. So the code will use this helper might be less than we expected (and will be less after some refactors).
final boolean isAskingToFollow = !ReaderTagTable.isFollowedTagName(currentTag.getTagSlug()); | ||
|
||
final String slugForTracking = currentTag.getTagSlug(); | ||
AsyncTaskHandler.load( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand the usage here was more of an example of how to use this new AsyncTaskHandler
but in this case (and I guess many other cases that are currently using DBs/network in the main thread) I feel the legacy code is in a bad state and the real fix should be a complete refactor.
For instance: this is a RecyclerView Adapter and it's dealing with data loading, which is clearly a bad practice, so a more coherent solution would be refactoring how the data is updated in the Adapter, probably moving the data loading to the appropriate ViewModel
(which would have access to a proper CoroutineScope and ways of doing async work) and propagating the list state correctly to the Activity/Fragment containing the Adapter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My approach is to first address the root cause with minimal changes and then look for opportunities to refactor. Refactoring involves more extensive changes, and dealing with legacy code introduces even more uncertainty. Therefore, I avoid making major changes right away unless I'm very confident or have plenty of time. It's a bit like a philosophical trade-off, but I agree that these related files need a complete refactor. In the long run, I believe the final outcome will be consistent.
@thomashorta Thank you so much for reviewing this PR. I plan to create a follow-up PR to handle the issues in the current implementation. And I just updated the section of |
This is a Hack Week project, this PR also fixes #20850 for demonstrating how to use the helper to update the current codebase.
Why
There are a number of Android ANR errors reported on Sentry, and some of these ANRs are caused by local database access. The likely cause is that some code is accessing the local database without switching to the background thread, which can lead to ANRs in certain situations.
To address this issue, I have created a generic helper that can be used to easily handle thread switching. This helper can be used by anyone who encounters a similar situation in the future.
P.S. This helper is not limited to accessing the database; this helper can also be used for any requirement that needs to be executed on a IO thread and then switched back to the main thread.
Use with Caution
This helper looks convenient, but the goal is not to use it extensively. Good code should have a well-designed architecture and proper implementations, which wouldn't require this helper. It's meant for quick fixes in the code's incorrect implementations. However, in the long run, we should aim for a complete refactor.
Note
While working on #20851, I found that the solution to modify the original blocking operation to thread switching introduced a race condition that caused UI errors. The culprit was a piece of code operating on another thread. This means that the original code needs to be refactored further to avoid this issue. It's crucial to carefully check for potential race conditions when dealing with similar problems.
To Test:
Reader
tab.Regression Notes
Potential unintended areas of impact
What I did to test those areas of impact (or what existing automated tests I relied on)
What automated tests I added (or what prevented me from doing so)
PR Submission Checklist:
RELEASE-NOTES.txt
if necessary.Testing Checklist (strike-out the not-applying and unnecessary ones):
skipped