project.py: Add new 'joblib' dependency to parallelize update #713

RobertGalatNordic · 2024-06-11T08:00:03Z

The projects from manifests were updated one by one. The parallel approach has potential to fully utilize CPU and network connection to speed up the update.

Needs more testing, I checked only few common cases and it seems fine.

Performance tests:
west init -m https://github.com/nrfconnect/sdk-nrf --mr main
time west update

real 14m35.399s
user 6m2.368s
sys 1m34.562s

system and network load during update:

Clean up and replace west for version from this PR
rm -rf <.west and all downloaded repositories>
pip3 uninstall west
cd <west_source_dir>
pip3 install -e .
cd ..

repeat the test
west init -m https://github.com/nrfconnect/sdk-nrf --mr main
time west update

real 6m34.408s
user 9m25.977s
sys 2m7.339s

system and network load during parallel update:

As we can see, the result is substantially faster. The same operation finishes in less than half of the original time (from 14.5 minutest down to 6.5 minutes).

ktaborowski

looks good to me

RobertGalatNordic · 2024-06-11T11:51:06Z

Another tip for faster fetch would be to use ssh instead of http.
By adding the following config to git it will automatically translate the http to ssh, for github repos.
git config --global url.ssh://git@github.com/.insteadOf https://github.com/

marc-hb · 2024-06-11T16:12:02Z

There are a few test failures in https://github.com/zephyrproject-rtos/west/actions/runs/9462265663/job/26086666688?pr=713, please take a look at "Running the tests" in the README.rst

Ignore the macOS failures, unrelated. Fixed in #712.

But more importantly, unless I'm confused (please correct me) I believe there was a previous attempt to do this and it had to be reverted because of terminal issues on Windows, see 2nd and abandoned attempt in #551. Were you aware of it? This new #713 attempt looks surprisingly much shorter: sounds too good to be true.

@M1cha can you take a look at this #713?

RobertGalatNordic · 2024-06-12T08:36:32Z

I did have a minor copy-paste bug, The tests ware very helpful to catch this.
I rerun the tests locally, and got a green mark. So I hope this time it will be OK.

I looked into the previous attempt, and my approach seems simpler because I used a library that has been designed to be a drop-in replacement to for loops.

marc-hb

I looked into the previous attempt, and my approach seems simpler because I used a library that has been designed to be a drop-in replacement to for loops.

Ha! That explains. Re-using a higher-level library: great idea. This makes the PR very short indeed, nice job.

So this is actually the most important change in this PR: the addition of a new dependency! This must always be performed very carefully because it carries maintenance and security risks https://cacm.acm.org/practice/surviving-software-dependencies/

This new dependency will also break all existing users, so its name must be at the very least visible in the commit title (and PR title). Something like: "Add new 'joblib' dependency to parallelize update". (while at it don't forget to fix the spelling of "paralelize")

I had a quick look at joblib and it seems like a serious and actively maintained library; nice pick!

Nevertheless, joblib cannot be immune to weird terminal issues. Does it have workarounds for all Windows quirks? There is only one way to tell: testing. So please state clearly in the commit message all the operating systems and terminals you've tested this with. This must include Powershell and CMD.EXE which is apparently still very important:

project: overwrite stdout/stderr objects as well #533 (comment)

Don't forget to make sure colors are still working:

Parallel update #551 (comment)

Last but not least, you must provide some -j 1 option. Not just for optimizing and fine-tuning the number of threads but more importantly for the ability to turn that feature completely OFF for two important reasons:

Debugging git configuration and network issues.
As a workaround in case someone has weird terminal or other issues.

I have debugged countless build issues and whether it's make, ninja, west or anything concurrent, the very first step is always -j 1 to stop relevant errors from being drowned by concurrent threads. Even when debugging race conditions or other concurrency issues, -j 1 is still a critical datapoint and reference for comparison purposes.

For those reasons, 1. requires unbuffered git output, see --help string in #551. Can joblib do that? If not then you need to add a "joblib-free" code path. Automatically falling back on this joblib-free code path when the dependency is missing would be the icing on the cake (but not required IMHO)

marc-hb · 2024-06-12T17:57:29Z

I did have a minor copy-paste bug, The tests ware very helpful to catch this.

Yes, tests save lives. Speaking of which: make sure all the existing tests stick to -j 1 by default. Again, the last thing we want is concurrency when debugging some totally unrelated issue. You should also add at least one test using a parallel update for minimal test coverage.

(pytest should run tests in parallel but that's a completely different story)

For those reasons, 1. requires unbuffered git output, see --help string in #551.

If joblib provides terminal output ONLY after a job is done, then this means west update will now be surprisingly quiet for a possibly long time. This could misled users into thinking they have a network issue or some proxy misconfiguration when git is actually running fine. So now I'm actually wondering whether -j 1 should maybe stay the default, among other benefits it would avoid disorienting old users. It would also make the new dependency "more optional". Either way, when all output is buffered then print a message like "parallel update started, be patient..."

Another tip for faster fetch ...

BTW: performance How long things take

marc-hb · 2024-06-12T18:07:10Z

Now that memories are coming back I feel like I just started repeating the previous discussions on this topic. To avoid such a repeat, please review all past discussions in #533, #551 and the ones before that. It's a big long sorry but it will be shorter than repeating them.

RobertGalatNordic · 2024-06-18T09:06:39Z

rebased on main,
added detection if joblib is available (if not use old single core approach),
added -j [JOBS] parameter to west update ( -j will use all cores, same as -j -1)
when joblib is not installed or -j 1 has been passed or -j has not been passed at all, the old single thread approach with for loop will be used.
Updated tests to add parametrization to test cases

-j 1 - same as without -j
-j - use all threads
-j -1 - use all threads
-j 2 - use two cores

RobertGalatNordic · 2024-06-18T09:21:29Z

I did not test those changes on Windows, as I do not work on that platform. I think that if there is an issue there, we could remove this feature for that platform, or maybe someone more experienced on Windows could help us. But as I mentioned previously, I used off the shelf library to do the heavy lifting, and it is possible that we could delegate any new issues to joblib project

stale review

marc-hb

Thank you, I didn't review in detail yet but this is looking better and better.

All tests are failing, hopefully only codestyle issues. Can you please fix? Do you have any issue running tox on your system? I added some tox tips in the README.md a while ago.

I did not test those changes on Windows, as I do not work on that platform.

I'm sorry but you must find a Windows system or "Windows friend" to smoke test this PR with at least -j1 and not, in both CMD.EXE and Powershell. The previous attempt went through a couple rounds of reverts and we really don't want to do that again. Maybe ask on Discord? There is a #west channel there; also try all the CI and infrastructure channels. People there may not use Windows but they may desire this enough to go and find a "Windows friend" of theirs :-)

I think that if there is an issue there, we could remove this feature for that platform,

Yes: for instance we could default to -j1 on Windows, at least initially! This would make it so much easier to find "Windows friends" willing to test it: now they don't have to fetch a PR to test. This is exactly what I strongly advocated in the previous attempt but I was unfortunately overruled.

src/west/app/project.py

The projects from manifests were updated one by one. The parallel aproach has potential to fully utilize CPU and network connection to speed up the update. Signed-off-by: Robert Gałat <robert.galat@nordicsemi.no>

RobertGalatNordic · 2024-06-19T13:54:21Z

I tested it on Windows in Power shell.

When using west update or west update -j 1 the results are as expected, nice clean output.

On the other hand, when more threads are used, the output gets a little messy.
It seems like the multiple threads push to stdout at the same time, and some of the lines are placed in the middle of other lines.

I can see how this could be fixed, but it will require a lot more refactoring in the self.update(project) which does all the job the output is generated and pushed to stdout with print statements, or it calls the git commands without catching its output, We would have to collect all this output and return it from the self.update(project) function.

So just to summarize, the default behavior is the same as before this PR, so you have to explicitly state that you want to use multiple threads for update. On Windows the output with multiple threads is messy, but when single thread is used there is no change compared to version from main.

On Linux the output looks a little better, as the lines are not interrupted, but you see lines from multiple processes one under another.

IMO this is not a big deal, to make it right we would have to collect all the output, and print it only after the job is completed, this would cause some idle time for the user (there is a package to provide progress bar for joblib so this could solve our issue of idle output for user https://pypi.org/project/joblib-progress/).

Furthermore, there is always option to not use this, and I imagine that this feature will be used in automation scripts, where we do not care about the output, just on the time.

marc-hb · 2024-06-19T15:17:28Z

I believe there was a previous attempt to do this and it had to be reverted because of terminal issues on Windows, see 2nd and abandoned attempt in #551. Were you aware of it? This new #713 attempt looks surprisingly much shorter: sounds too good to be true.

I looked into the previous attempt, and my approach seems simpler because I used a library that has been designed to be a drop-in replacement to for loops.

Ha! That explains. Re-using a higher-level library: great idea. This makes the PR very short indeed, nice job.
[...]
Either way, when all output is buffered then please print a message like "parallel update started, be patient..."

On the other hand, when more threads are used, the output gets a little messy.
It seems like the multiple threads push to stdout at the same time, and some of the lines are placed in the middle of other lines.

I can see how this could be fixed, but it will require a lot more refactoring in the self.update(project) which does all the job the output is generated and pushed to stdout with print statements, or it calls the git commands without catching its output, We would have to collect all this output and return it from the self.update(project) function.

Wait, so you're saying joblib/loky have no ability to capture terminal/standard outputs?

I skimmed their APIs and I found joblib/loky#306 which all seem to confirm they indeed don't care about standard outputs. Their focus is parallel computation, which I bet uses standard outputs rarely - only when something goes wrong.

This is a MAJOR disappointment :-(

All this time I was confused and assumed joblib/loky offered this ability to capture standard outputs. I thought this was why the joblib solution was much shorter. Now I understand it is shorter because it does not try to capture outputs and just let them pass through.

I wouldn't mind a first shot that does not capture outputs (others may disagree) but there would need to be very clear path towards implementing this in the future and I see none with joblib right now; it looks like a dead-end.

Now that memories are coming back I feel like I just started repeating the previous discussions on this topic. To avoid such a repeat, please review all past discussions in #533, #551 and the ones before that. It's a big long sorry but it will be shorter than repeating them.

As you can see the title of the aborted fix #533 was project: overwrite stdout/stderr objects as well, that was a pretty strong clue. I also mentioned multiple times that the previous attempt was reverted because of terminal issues.

Furthermore, there is always option to not use this, and I imagine that this feature will be used in automation scripts, where we do not care about the output, just on the time.

Failures happen in automation, then output is important too. In case of failure, outputs can be even more important than in interactive use because there is often no simple access to the automation environment and no or little "trial and error" possibility to test again. Changing verbosity levels and other parameters is also more difficult at best and impossible at worst.

PS: I also discovered that the main reason loky was created was to avoid bugs in Python multiprocessing. But these bugs have apparently been fixed in 3.7 and west requires 3.8 minimum

marc-hb · 2024-06-19T20:27:23Z

Failures happen in automation, then output is important too.

Moreover: compilers, linkers etc. are used to have their outputs interleaved with concurrent processes, it's been like this for them for decades. So they expect concurrency and generally make a good effort to push standalone "chunks" of information. At the receiving end, make and ninja try to somehow respect that.

On the other hand, I really doubt git output comes prepared for such concurrency and "naive" parallelization like the one in this PR does not either; your test in Powershell proves that.

Trying to summarize:

I would approve a PR that defaults to -j 1 and prints interleaved garbage terminal output for -j > 1 but only if it does not add any new, significant dependency (like joblib) that does not care about outputs and just lets them through. Maybe Python's built-in multiprocessing could be enough, I don't know.
Also, there should be a very clear warning in the --help that -j > 1 can produce garbage.
Based on past discussions (links already shared above), I suspect @mbolivar-ampere may disagree with 1. He should be back from vacation soon. August 2024 EDIT: he doesn't have time to maintain west any more.
Without waiting, please extract from this PR and submit separately the creation of the project_update() and project_update_some() functions. Leave aside the actual -j bits for now. This will lay the ground for ANY parallelization attempt with pretty much any library and it can't hurt so I think we can merge that preliminary bit yesterday. Let's get those mundane bits merged and out of the way.
Preferably in a separate commit, please also submit your test changes like this:

pytest.mark.parametrize("options", [""]) #, "-j 1", "-j 2", "-j -1"])`
def test_update_some_with_imports(options, repos_tmpdir):

Again, this will help anyone experimenting with any sort of -j prototype. We don't want to lose this no matter what happens next.

pdunaj · 2024-06-26T15:55:41Z

tests/test_project.py

@@ -416,14 +416,14 @@ def test_grep(west_init_tmpdir):

    assert re.search('west-commands', cmd('grep -- -- -commands'))

-
-def test_update_projects(west_init_tmpdir):
+@pytest.mark.parametrize("options", ["", "-j 1", "-j 2", "-j"])


is -j indented? should not be -j -1?

I agree seems so from here: parser.add_argument('-j', '--jobs', nargs='?', const=-1, that -1 should be passed

marc-hb

See above - joblib looks nice but it does not seem to add any value compared to built-in Python modules.

pdgendt · 2024-08-28T09:30:57Z

See above - joblib looks nice but it does not seem to add any value compared to built-in Python modules.

Agreed, something like?

from multiprocessing.pool import Pool

# ...
        processes = self.args.jobs if self.args.jobs > 0 else None
        if processes == 1:
            for project in self.manifest.projects:
                project_update(project)
        else:
            pool = Pool(processes=processes)
            pool.map_async(project_update, self.manifest.projects)
            pool.join()

marc-hb · 2024-08-28T16:05:22Z

Agreed, something like?

Yes something like this.

pdgendt · 2024-09-12T17:05:55Z

@RobertGalatNordic any update?

marc-hb · 2024-09-12T21:38:11Z

@pdgendt if you have time to spend on this, I think you could make pretty short work of a re-implementation of this that uses some built-in, multiprocessing module instead (assuming all old multiprocessing bugs that motivated the creation of joblib have been fixed)

pdgendt · 2024-09-13T07:16:36Z

@pdgendt if you have time to spend on this, I think you could make pretty short work of a re-implementation of this that uses some built-in, multiprocessing module instead (assuming all old multiprocessing bugs that motivated the creation of joblib have been fixed)

I took a quick stab at it, but it's not as easy apparently, because of the shared memory stuff.

pdgendt · 2024-09-13T07:16:58Z

src/west/app/project.py

+                                help='''Use multiple jobs to paralelize update process.
+                                Pass -1 to use all avaliable jobs.


Suggested change

help='''Use multiple jobs to paralelize update process.

Pass -1 to use all avaliable jobs.

help='''Use multiple jobs to parallelize the update process.

Pass -1 to use all available jobs.

ktaborowski approved these changes Jun 11, 2024

View reviewed changes

RobertGalatNordic force-pushed the make_west_update_parallel branch from 3fa370f to 6c1fb2e Compare June 11, 2024 08:32

RobertGalatNordic changed the title ~~project.py: use asyncio to paralelize west update~~ project.py: paralelize west update Jun 11, 2024

marc-hb mentioned this pull request Jun 11, 2024

.github: upgrade to checkout@v4 and setup-python@v5 #712

Merged

RobertGalatNordic force-pushed the make_west_update_parallel branch from 6c1fb2e to da14c09 Compare June 12, 2024 08:24

marc-hb previously requested changes Jun 12, 2024

View reviewed changes

marc-hb added the performance How long things take label Jun 12, 2024

RobertGalatNordic changed the title ~~project.py: paralelize west update~~ project.py: Add new 'joblib' dependency to parallelize update Jun 14, 2024

RobertGalatNordic force-pushed the make_west_update_parallel branch from da14c09 to 7f70536 Compare June 18, 2024 08:58

marc-hb reviewed Jun 18, 2024

View reviewed changes

src/west/app/project.py Outdated Show resolved Hide resolved

project.py: Add new 'joblib' dependency to parallelize update

9d8c8af

The projects from manifests were updated one by one. The parallel aproach has potential to fully utilize CPU and network connection to speed up the update. Signed-off-by: Robert Gałat <robert.galat@nordicsemi.no>

RobertGalatNordic force-pushed the make_west_update_parallel branch from 7f70536 to 9d8c8af Compare June 19, 2024 12:05

pdunaj approved these changes Jun 26, 2024

View reviewed changes

marc-hb requested changes Jun 26, 2024

View reviewed changes

carlescufi requested a review from pdgendt August 27, 2024 11:34

pdgendt reviewed Sep 13, 2024

View reviewed changes

pdgendt added this to the v2.0.0 milestone Sep 27, 2024

marc-hb mentioned this pull request Sep 27, 2024

Add progress info in west update command #742

Open

marc-hb mentioned this pull request Oct 17, 2024

app: Update Forall command to allow multiple concurrent processes #755

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

project.py: Add new 'joblib' dependency to parallelize update #713

project.py: Add new 'joblib' dependency to parallelize update #713

RobertGalatNordic commented Jun 11, 2024 •

edited

Loading

ktaborowski left a comment

RobertGalatNordic commented Jun 11, 2024

marc-hb commented Jun 11, 2024

RobertGalatNordic commented Jun 12, 2024

marc-hb left a comment

marc-hb commented Jun 12, 2024 •

edited

Loading

marc-hb commented Jun 12, 2024

RobertGalatNordic commented Jun 18, 2024 •

edited

Loading

RobertGalatNordic commented Jun 18, 2024

marc-hb left a comment •

edited

Loading

RobertGalatNordic commented Jun 19, 2024

marc-hb commented Jun 19, 2024

marc-hb commented Jun 19, 2024 •

edited

Loading

pdunaj Jun 26, 2024 •

edited

Loading

sigvartmh Jul 3, 2024

marc-hb left a comment

pdgendt commented Aug 28, 2024 •

edited

Loading

marc-hb commented Aug 28, 2024

pdgendt commented Sep 12, 2024

marc-hb commented Sep 12, 2024

pdgendt commented Sep 13, 2024

pdgendt Sep 13, 2024

		help='''Use multiple jobs to paralelize update process.
		Pass -1 to use all avaliable jobs.

project.py: Add new 'joblib' dependency to parallelize update #713

Are you sure you want to change the base?

project.py: Add new 'joblib' dependency to parallelize update #713

Conversation

RobertGalatNordic commented Jun 11, 2024 • edited Loading

ktaborowski left a comment

Choose a reason for hiding this comment

RobertGalatNordic commented Jun 11, 2024

marc-hb commented Jun 11, 2024

RobertGalatNordic commented Jun 12, 2024

marc-hb left a comment

Choose a reason for hiding this comment

marc-hb commented Jun 12, 2024 • edited Loading

marc-hb commented Jun 12, 2024

RobertGalatNordic commented Jun 18, 2024 • edited Loading

RobertGalatNordic commented Jun 18, 2024

marc-hb left a comment • edited Loading

Choose a reason for hiding this comment

RobertGalatNordic commented Jun 19, 2024

marc-hb commented Jun 19, 2024

marc-hb commented Jun 19, 2024 • edited Loading

pdunaj Jun 26, 2024 • edited Loading

Choose a reason for hiding this comment

sigvartmh Jul 3, 2024

Choose a reason for hiding this comment

marc-hb left a comment

Choose a reason for hiding this comment

pdgendt commented Aug 28, 2024 • edited Loading

marc-hb commented Aug 28, 2024

pdgendt commented Sep 12, 2024

marc-hb commented Sep 12, 2024

pdgendt commented Sep 13, 2024

pdgendt Sep 13, 2024

Choose a reason for hiding this comment

RobertGalatNordic commented Jun 11, 2024 •

edited

Loading

marc-hb commented Jun 12, 2024 •

edited

Loading

RobertGalatNordic commented Jun 18, 2024 •

edited

Loading

marc-hb left a comment •

edited

Loading

marc-hb commented Jun 19, 2024 •

edited

Loading

pdunaj Jun 26, 2024 •

edited

Loading

pdgendt commented Aug 28, 2024 •

edited

Loading