Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Neural snow depth model #345

Merged
merged 1 commit into from
Dec 13, 2023
Merged

Neural snow depth model #345

merged 1 commit into from
Dec 13, 2023

Conversation

a-charbon
Copy link
Contributor

Purpose

Opening a request to move neural-network-based snow depth model into the main repository. Developing tests and tutorials for public usage as well as restructuring for resubmission of associated paper.

To-do

  1. Write tests for code
  2. Develop tutorials (paper figures)
  3. Integrate into existing snow model for usage in next-steps model
  4. Check existing guidelines/policies against submitted code.

Content

  • four tools files (data, model, test/train, display) for usage (will eventually be moved to an appropriate "src" folder)
  • code for generating the dataset used for the paper
  • example code for training/validation pipeline (likely to become part of an example/tutorial)

  • I have read and checked the items on the review checklist.

@a-charbon a-charbon requested a review from kmdeck October 5, 2023 01:49
Copy link
Member

@kmdeck kmdeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

great work, Andy! the code is clean and well documented. I left a number of small comments, but the more major things are:

  • think about which files should be modules that go into src (DataTools, ModelTools ?), ideally these are fairly agnostic about the specific dataset in hand. Functions in src ideally are tested with unit tests.
  • It might make more sense for DataTools to be a local file in experiments. but we need the model code in src to evaluate it as part of ClimaLSM.
  • use existing tools in MLUtils when possible (for train test split, other standard ML tasks)
  • in experiments or tutorials, can show the results of training (test_run.jl) using functionality from display_tools.jl, analysis_tools.jl.
  • dont need to run data_retriever in the experiment itself, but can include this file (and DataTools.jl?) saying that this would be run to obtain the preprocessed/transformed data. Then, the completely clean/transformed data is saved in box & used in the training script, which is run in experiments.

What do you think?

docs/src/APIs/Snow.md Outdated Show resolved Hide resolved
Return the trainable weights for the developed neural model.

# Arguments
- `model::Chain`: the neural model to be used.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIUC, Chain is a Flux type, the neural model is of type Chain. Can the linear model be expressed as a Chain also, but with fewer layers/fixed width of the layer? Does it need its own type?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If so, we wouldnt need two methods for each of these, right? get_model_ps, evaluate, etc.

Copy link
Contributor Author

@a-charbon a-charbon Oct 29, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there's no need to get the model parameters for the linear model, as the linear model as an object is just the coefficient/parameter list (it's trained directly as A\b upon creation and not trained in a trainmodel! call, calling a created linear model just returns the vector of coefficients for the linear model), there is an evaluate call already written - that is, this function is specific to neural model types and not to linear models (it's not like "evaluate" etc, in the same way settimescale! is specific to the neural model)

@@ -0,0 +1,549 @@
module DataTools
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am wondering if this particular module belongs in src vs in the docs/tutorials/Snow folder. The reason why is because this functionality is needed to create the dataset used in your paper, and has some handy function for processing data, but, nothing in here is required for evaluating a trained model to get the snow depth at the next step - which is what we need in the land model simulation (all that is in ModelTools.jl, which definitely belongs in src).

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I figured I was helping future work in case CliMA eventually wanted other types of SNOTEL data in training other models (albedo, runoff, etc., the module has all this functionality) and therefore wasn't specifically tied to this particular tutorial/use-case, and therefore should be in a location intended for wider usage. If you disagree I have no issues moving it to the tutorial folder since others can always go grab the tools from there anyway.

Copy link
Member

@kmdeck kmdeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mostly minor comments - thanks for putting this together! The tutorials seem really clear. Let me know if you would like to discuss further.

@a-charbon a-charbon marked this pull request as ready for review November 9, 2023 01:52
@a-charbon a-charbon marked this pull request as draft November 9, 2023 02:06
Copy link
Member

@kmdeck kmdeck left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good Andy. Once you squash/rebase, it should show as a single commit. Then all the tests should pass, and we can merge it in.

I just added a couple comments re: making sure the float type is maintained - it might be good to add a couple tests for this.

@kmdeck kmdeck marked this pull request as ready for review November 13, 2023 21:35
@kmdeck kmdeck changed the title first commit Neural snow depth model Dec 11, 2023
updated manifests

new tutorial folder structure

format changes

changed file referencing
@kmdeck kmdeck merged commit 1f46dc4 into main Dec 13, 2023
6 checks passed
@kmdeck kmdeck deleted the neural_snow branch December 13, 2023 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants