Start move to dask histEFT #422

btovar · 2024-07-09T16:20:21Z

@bryates This is as far as I can confidently go. I tried to update the code in corrections.py, but that requires knowing what the array dimensions mean in terms of the physics, so I didn't want to introduce errors there.

The rough roadmap you want to follow is to replace numpy with dask.array, and awkward with dask_awkward. Also, the only time you should call dask.compute is in run_analysis.
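As a rough illustration of that roadmap (a minimal sketch, not code from this PR): the helper names `sum_of_squares` and `run` below are hypothetical, and `run` only stands in for the role run_analysis plays here. The helper builds a lazy dask.array graph, and dask.compute is called in exactly one place.

```python
import dask
import dask.array as da


def sum_of_squares(n):
    """Hypothetical helper: builds a lazy graph, computes nothing."""
    x = da.arange(n, chunks=4)   # lazy counterpart of numpy.arange
    return (x ** 2).sum()        # still a lazy dask array


def run(n):
    """Hypothetical driver: the single place where the graph is evaluated."""
    lazy = sum_of_squares(n)
    (result,) = dask.compute(lazy)  # dask.compute returns a tuple
    return result


total = run(10)
```

Keeping the compute call in one driver lets dask see the whole task graph before anything is evaluated.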

My first roadblock in corrections.py is creating a random matrix with dimensions similar to those of a dask_awkward array. (Looking at the code, I think you don't need to generate the full matrix?) The issue is that the original code flattens and unflattens arrays, and such operations don't really make sense in dask when the arrays are not known ahead of time. I think someone who understands the physics can probably change that computation to use dak.mask, etc.
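To make the flatten/unflatten issue concrete, here is a minimal eager sketch in plain Python. Nested lists stand in for a jagged awkward array, and `random_like` is a hypothetical name, not a function in corrections.py. It shows that the pattern needs the concrete per-row counts before any random numbers can be drawn, which is exactly what is unavailable while a dask graph is still lazy.

```python
import random


def random_like(jagged, seed=0):
    """Hypothetical helper: random numbers with the same jagged shape as the input."""
    rng = random.Random(seed)
    counts = [len(row) for row in jagged]               # needs concrete lengths
    flat = [rng.random() for _ in range(sum(counts))]   # "flattened" random values
    out, start = [], 0
    for n in counts:                                    # "unflatten" using the counts
        out.append(flat[start:start + n])
        start += n
    return out


muons = [[25.0, 40.1], [], [33.3, 12.0, 8.5]]  # made-up jagged input
rand = random_like(muons)
```

The total number of random values depends on `sum(counts)`, so this approach cannot be expressed lazily without first evaluating the array.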

bryates · 2024-07-09T16:27:14Z

Thanks @btovar! I agree with you that the randomness in the Rochester corrections can be done better. We can take a look at other things as well.

btovar · 2024-07-09T16:35:06Z

The futures test was not removed, it was renamed to test_dask. There is not a "futures" executor in the new coffea.

bryates · 2024-07-09T16:35:59Z

Yes, that's what I meant to put, thanks!

btovar added 2 commits July 9, 2024 12:12

move run_analysis.py to new histEFT, taskvine

fb7bb98

move analysis_processor to dask

0ec938a

bryates added 3 commits July 9, 2024 12:29

Adding dask test, restoring old futures test for CI

d251a3e

Remove blank lines

6a373f3

Futures was removed

93a4bfb

bryates added 4 commits July 9, 2024 12:37

Adding dask stuff to environment yml

f17064b

Unpin coffea, use python 3.10

0a2801e

Default redirector '' -> '.'

d8dba61

Fixes CI not finding local root file

Use dask_hists branch of TopCoffea

a206030

bryates linked an issue Jul 9, 2024 that may be closed by this pull request

Switch processor to dask_awkward [example](https://github.com/cmstas/ewkcoffea/pull/14/files) #392

Open

btovar and others added 7 commits July 9, 2024 13:39

uncomment actual computation

fc55fca

Allow coffea warnings for now

a87b457

Various da and dak fixes

275d979

Disable random for now, other minor fixes

40f3657

Fix lint

5ed215f

Another lint change

680f239

Remove print

ea2e8c2
