-
Notifications
You must be signed in to change notification settings - Fork 253
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WW3 cannot build pre-and-post processing jobs #2502
Comments
I'm working on some updates here: https://github.com/JessicaMeixner-NOAA/WW3/tree/bugfix/buildww3devufs which have allowed me to successfully build the pre & post scripts in the global-workflow. More testing is necessary but sharing current progress. |
I'm really confused by this. The WW3 build in UFS has always referenced files which were only in the dev/ufs-weather-model branch. The previous (prior to PIO commit) src file list for example contained
Why was this working prior to PIO and now is not? |
The files in nuopc_mesh_cap_src were previously not referenced in the ftn_src files. There are now files in ftn_src trying to call files that are in the nuopc_mesh_cap_src. Moreover, the standalone WW3 builds do not seem to be finding PIO in the build either. With the additions I added in https://github.com/JessicaMeixner-NOAA/WW3/tree/bugfix/buildww3devufs we can build the pre/post WW3 scripts, however we cannot build with MPI, which is where we start to get into additional issues including not being able to find the PIO variables, etc. |
These mesh-cap branch files
have always been in the EDIT: I suspect what you might need to do is just move some of the new files to the ftn_src list. |
@DeniseWorthen - Moving things into the ftn_src list would solve some of the issues, yes. We will still need W3_MPI if defs around things that are parallel/use MPI so that when we build without MPI (for pre and post processing steps) that can still be built. I haven't quite figured out how to deal with the PIO parts - and the fact that we aren't finding PIO in WW3. I'm not sure if that's unrelated and just showing up because other errors have not yet been fixed, cmake updates are needed, or if we also need something in WW3 to say yes/no for PIO. |
I'm not the best source for CMake build know-how. But when I first started working on the PIO feature, I played around w/ adding PIO to the model/src/CMakeLists.txt, thinking I might need it. It turned out I didn't
|
Thank you @DeniseWorthen ! I didn't find any CMake additions in ufs-weather-model for PIO either, so maybe we just don't need it? Eventually when this goes back into develop, we'll have to figure out how to get around potentially putting more barriers aroudn PIO, but we don't have to worry about that for today. I added a lot more files to the ftn_src and got the MPI build going, but now the standalone is breaking because those files have places we need the ifdefs for MPI. I'll want to run some additional standalone regression testing and a quick g-w run, but we should have a fix that shouldn't change any answers or anything like that tomorrow or Thursday. I keep adding my updates to: https://github.com/JessicaMeixner-NOAA/WW3/tree/bugfix/buildww3devufs if anyone is curious. |
Okay - I ended up going a slightly different direction today (https://github.com/JessicaMeixner-NOAA/WW3/tree/bug/addPIOswitch) and found a minor bug that we needed to move and end statement inside of a W3_MPI ifdef, but I think the new way, which adds a W3_PIO if defs, will help us in the future and is a lot cleaner. It's not perfect in the sense that I cannot build standalone ww3 w/PIO - yet - because of EMSF dependencies in routines, but it opens up the door for that in the future. I can build the pre-and post-processed standalone routines with and without MPI. That being said, I think we either need to change some of the src lists or maybe add a check that the nuopc cap has to use the PIO? (which I think is desirable, and I don't see anyone using NUOPC that would object). Still to do: more standalone ww3 runs, ufs-rt tests, and running in g-w. I'm hoping I can do all of these things by Friday and have a PR ready by no later than Monday is my goal, if not before. @DeniseWorthen - let me know if you would like to see other tests, have objections to this approach, or would like to see other additions. |
This change will breaks the capability of turning the netcdf feature on/off at run time vs compile time. That's a big step backward, imo. I understand the intent is simply to solve the immediate problem of compiling the codes that are needed for pre/post in G-W. I think there are better ways of solving this, but those would take time to sort out. My concern is that it will persist (as things in WW3 have a tendency to do). That leads to all sorts of messy problems, eg broken use-cases, like trying to set init-from-binary true and not have either restartnc or PIO set. There's basically no utility to having any code inside a use_restartnc or use_historync flag if you're not using PIO. The PIO feature in the develop branch will require creating new nml namelists so that PIO can be configured that way (CICE has this implemented). I don't suspect that is happening any time soon. |
Only having this based on a compile time flag means that even if you are not using PIO you have to have the library built correctly, yes? My concern with this is that requires the many users around the world outside of NOAA to at minimum build PIO, which is possibly non-trivial and not possible for them. WW3 is used by many small meteorology organizations where even building netcdf in the past was difficult and we need to discuss adding PIO with the llarger WW3 community which has not been done. I will happily bring this to that larger community and see if this is okay, but generally this is the way WW3 handles adding libraries that are not required for all use-cases. No it's not perfect, and I understand how you see this as a major step backwards. I just don't know another solution at the moment. I agree it might be a while until a feature is ready to be used from PIO in the develop branch, but my hope is this is a small step towards enabling dev/ufs-weather-model to get merged into the develop branch without this feature fully ready. And solving the immediate issue of being able to run this in the end-to-end system. We do need to get some sort of solution in sooner than later, I've mostly stopped all of my other work to address this, and will continue to so if you have a better solution - please let me know and I'll start implementing that immediately. At this point I am going to start testing this to make sure this is truly a viable solution. |
In terms of dev/ufs-weather-model going into develop in the short term, I think that is putting the cart before the horse. Unless there is a way (ie, outside of ESMF config) to use the feature, then what is the point? Get that done, then I can see the utility. And of course dev/uwm is a year behind develop---so how is that going to work? There is also utility in UWM pointing to it's own branch. If UWM uses develop, then the merging of PRs to WW3 develop means that users 'around the world' are stuck in the UWM commit Q. WW3 cannot merge PRs to develop unless UWM also points to develop. Given how slow the UWM Q works..... In any case, my essential point still holds. Anything w/in logical flags |
@DeniseWorthen if you can explain the other way to hide PIO please let me know and I'll start to implement it. |
I don't have a solution at this instant. I'm saying I think it could be done. I'm also getting confused because it seems you're doing some things in some branches by adding an MPI switch, and other things in other branches adding a PIO switch. I guess I'm not doing a good job of explaining....you have inside of a
But that doesn't make sense---nothing inside of the use_restartnc is usable w/o PIO. So why not just #ifdef PIO |
@DeniseWorthen Apologies for the confusion. My first effort was to see if just putting things behind W3_MPI siwtches in this branch: https://github.com/JessicaMeixner-NOAA/WW3/tree/bugfix/buildww3devufs however, I realized in the end it wasn't sufficient and I thought a cleaner way (and possibly more future proof), would be to add a new switch W3_MPI instead. So I abandoned that branch, and started a new approach here: https://github.com/JessicaMeixner-NOAA/WW3/tree/bug/addPIOswitch) I was trying to put it in as small of place of possible while I was debugging an unrelated issue and forgot to circle back. I agree, it makes more sense to put it in the larger area - I'll make that change now. If you think of a different solution - please let me know and I'm happy to dedicate time to implement it. I'll try to brainstorm other ways too. |
@DeniseWorthen change in the if-defs is committed here: JessicaMeixner-NOAA/WW3@f528003 |
In terms of thinking about 'a better way', exactly which codes needed for pre/post in G-W are in conflict w/ how things are currently implemented? |
w3init and w3wave are the two codes that caused build errors. |
Understood, but which pre/post programs? I've been assuming these are 'ww3_X.F90' standalone programs which are built on top of w3 modules that are causing the issues. |
Ah, thanks. Looks very much like the |
@JessicaMeixner-NOAA I've had a re-think, and I'm pretty sure now that adding a PIO switch would not break the ability to choose netcdf vs binary at run-time for the UWM model itself. |
As long as the PIO switch is there, you then have the runtime option of yes/no in the UWM. To run the NUOPC cap, we have to have the PIO switch included in the build. I know we've had a history of not changing things, but if anyone comes up with a different solution to what I've implemented now -I really am happy to implement it. |
Description
In PR #2445 WW3 was updated with PR NOAA-EMC/WW3#1303 which makes references to modules that are only in the cap code. This means that we cannot run the end-to-end system (for example see: NOAA-EMC/global-workflow#3106 (comment)).
We will need ot make an update to WW3 so that we can compile the pre-and post-scripts within the dev/ufs-weather-model.
To Reproduce:
The easiest way to reproduce this is to clone the g-w, update the model to the hash in develop and then build all or just the ww3 pre/post:
Additional context
@sbanihash had noticed this while reviewing a PR FYI @jswhit who ran into this in g-w.
@sbanihash and myself will be meeting with @MatthewMasarik-NOAA first thing tomorrow morning to find a solution.
Output
The text was updated successfully, but these errors were encountered: