I noticed that if I run the AGC workload with an increasing number of Python futures workers on a node with 128 CPU cores, reading from the local file system, the total CPU time for the entire process tree as measured by PrMon starts to grow at high worker counts (~30 or more). Ideally it should remain constant: if the CPU time depended only on the amount of data to process, it would not change with the number of workers. A similar thing happens with the RDF version, so it's not specific to Coffea, at least qualitatively (though the causes might be different). Any ideas about possible causes of this overhead?

I am attaching a plot comparing the CPU time to the process time (as reported by Coffea) and to the wall-clock time of the entire AGC workload multiplied by the number of workers.
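For concreteness, here is a stripped-down sketch of the kind of scan I am running, with a plain `ProcessPoolExecutor` standing in for the Coffea futures executor and `do_work` as a placeholder for the per-chunk AGC processing; the accounting via `os.times()` is only a rough stand-in for PrMon's polling of the process tree.

```python
import os
import time
from concurrent.futures import ProcessPoolExecutor

def do_work(item):
    # Placeholder for processing one chunk of AGC input; purely illustrative.
    return sum(i * i for i in range(1_000_000))

def run_scan(n_workers, items):
    c0 = os.times()
    t0 = time.monotonic()
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        list(pool.map(do_work, items))
    wall = time.monotonic() - t0
    c1 = os.times()
    # os.times() includes the CPU time of reaped children (the pool workers)
    # once the executor has shut down -- roughly what PrMon sees by polling
    # the whole process tree. Deltas are taken because the counters are
    # cumulative over the life of the parent process.
    cpu = ((c1.user - c0.user) + (c1.system - c0.system)
           + (c1.children_user - c0.children_user)
           + (c1.children_system - c0.children_system))
    return wall, cpu

if __name__ == "__main__":
    items = list(range(256))
    for n in (8, 16, 32, 64, 128):
        wall, cpu = run_scan(n, items)
        print(f"{n:3d} workers: wall={wall:6.1f}s  cpu={cpu:7.1f}s")
```

If the CPU time depended only on the work, the `cpu` column would stay flat across worker counts; what I observe instead is that it starts climbing above ~30 workers.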
@sciaba My first inclination is to think this has something to do with dask (if that's what you're using!) and the way it communicates with workers. It might be worth cross-posting on their GitHub (with plenty of details about the setup) to see whether anyone comes back with immediate answers.
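If it is dask, one quick way to gather that kind of detail is dask's built-in performance report, which records the task stream, worker profiles, and inter-worker transfer costs, which is where communication overhead would show up. A minimal sketch, assuming a `LocalCluster`; the array reduction is just a stand-in for the real workload:

```python
from dask.distributed import Client, LocalCluster, performance_report
import dask.array as da

if __name__ == "__main__":
    # Adjust worker/thread counts to match the actual deployment.
    cluster = LocalCluster(n_workers=32, threads_per_worker=1)
    client = Client(cluster)
    # Everything computed inside this block is captured in an HTML report,
    # including scheduler/worker communication.
    with performance_report(filename="agc-dask-report.html"):
        x = da.random.random((20_000, 20_000), chunks=(1_000, 1_000))
        x.sum().compute()  # stand-in for the actual AGC computation
    client.close()
    cluster.close()
```

Attaching the generated HTML report to a cross-post would give the dask folks a lot to go on.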