Asyncio #210

michaelweiser · 2022-03-31T14:05:50Z

These changes switch the rest of Peekaboo beyond the sanic server to asyncio to avoid the symptoms stemming from interaction of threading and asyncio we've seen (e.g. #205). Multiprocessing is maintained through threadpool.

This is meant for discussion and inclusion early in the 2.2 cycle to give us time for testing and improvements.

Currently Peekaboo uses classic threading for the work queue and toolbox modules, particularly Cuckoo and Cortex job trackers. The server now uses asyncio. This combination throws up problems such as signal handling needing to be done by the event loop to work reliably but the event loop getting blocked by toolbox module startup not being async. Instead of working around these symptoms we fully switch to asyncio: - Threads can become asyncio Tasks. - Shutdown threading.Events can be replaced by simply cancelling the tasks. - Locking mostly goes away. The Queue still needs to lock accesses to the duplicates logs at least where asyncio awaitables are awaited and may yield from the critical section modifying the log. - sqlite (still serialised through a 1-connection QueuePool) gives no deadlocks/database is locked errors any more. This state works but has a number of issues that are to be solved in succeeding commits: - Because of the elimination of threads we no longer scale beyond one CPU core. - requests still needs replacing by aiohttp to become non-blocking. - We only get to see any exceptions thrown in tasks created by asyncio.ensure_future() when retrieving their result with await on close_down(), i.e. on shutdown. Handing all tasks to a single asyncio.gather() in the daemon might remedy that.

This change restores multiprocessing and scaling beyond a single CPU core by using a concurrent.futures.ThreadPoolExecutor. This allows running blocking code in a separate thread while internally using a queue to implement a pool - basically what our Queue was doing before asyncio. We add this to the sample for calculating the sha256sum and dumping processing info and the ruleset where we use it to delegate oletools and filetools to executors.

The Cuckoo toolbox module uses requests which does not interact well with the asyncio event loop. These changes make the Cuckoo module use aiohttp instead of requests. For retries we switch to tenacity.

The Cortex toolbox module used to use cortex4py and was recently converted to requests. Both do not interact well with the asyncio event loop. Therefore we switch to aiohttp instead of requests. Retries are implemented using tenacity instead of a requests Retry policy.

Python 3.6 does not have asyncio.get_running_loop(). At the time, calling asyncio.get_event_loop() was the only way to get at the running event loop. Unfortunately this has the side-effect of silently starting a new loop if there is none. We want to know if that were to happen, either through an exception or follow-up errors. So we intentionally use asyncio.get_running_loop. Now we monkey-patch it into asyncio if not present based on the code of Python 3.8.

Adjust the testsuite (again) to the asyncio changes.

Our running of workers, job trackers and the cluster duplicate handler as asyncio tasks shows any unexpected error conditions by way of exceptions only on shutdown. This was also the case with classic threading and threads (and now tasks) would die silently, stop functioning and even prevent regular shutdown. To finally remedy this situation we make all startup routines that create asyncio tasks return those tasks so we can build a list of awaitables from which to gather any error conditions immediately when they happen. A single call to asyncio.gather with these awaitables now becomes the central serialisation point in the daemon instead of PeekabooServer.serve(). The latter doesn't even make it into the list of awaitables since asyncio servers in general and Sanic servers in particular run fully enclosed in the event loop and do not expose any means to gather their error conditions. Here we need to continue to rely on Sanic's built-in error handling and logging (for now). This restructuring now also allows us to fully start up the server before reporting readiness to systemd. Logging is restructured and extended to make every component report startup and shutdown in a similar fashion.

Jack28

LGTM

michaelweiser added the enhancement label Mar 31, 2022

michaelweiser added this to the 2.2 milestone Mar 31, 2022

michaelweiser requested a review from Jack28 March 31, 2022 14:05

michaelweiser self-assigned this Mar 31, 2022

michaelweiser force-pushed the asyncio branch 2 times, most recently from 3ec2736 to 693cbfe Compare April 6, 2022 09:42

michaelweiser force-pushed the asyncio branch from 693cbfe to 793c764 Compare April 14, 2022 07:15

michaelweiser force-pushed the asyncio branch from 793c764 to 152ec3d Compare May 18, 2022 11:06

michaelweiser added 6 commits May 25, 2022 12:43

cuckoo: Switch module to asyncio

835b119

The Cuckoo toolbox module uses requests which does not interact well with the asyncio event loop. These changes make the Cuckoo module use aiohttp instead of requests. For retries we switch to tenacity.

testsuite: Sync up with asyncio support

e367a5d

Adjust the testsuite (again) to the asyncio changes.

michaelweiser force-pushed the asyncio branch 2 times, most recently from 4189e79 to 51ccbd4 Compare May 25, 2022 12:50

michaelweiser mentioned this pull request May 30, 2022

Knowntools efficiency improvements #217

Closed

Jack28 approved these changes Jun 1, 2022

View reviewed changes

michaelweiser merged commit b2734b9 into scVENUS:master Jun 2, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Asyncio #210

Asyncio #210

michaelweiser commented Mar 31, 2022

Jack28 left a comment

Asyncio #210

Asyncio #210

Conversation

michaelweiser commented Mar 31, 2022

Jack28 left a comment

Choose a reason for hiding this comment