Question: Calculations on 3hr frequency #45

Open
emsonali opened this issue Feb 15, 2024 · 14 comments
Comments

@emsonali
Contributor

Hi Ludwig,

I'd like to check whether computations of climate indices at a 3hr frequency are built into index_calculator. If so, could you point me to the script where I can see how the 3hr computations are handled, particularly for UTCI? Thank you!

Best Regards,
Sonali

@ludwiglierhammer
Collaborator

Hi @emsonali,

There are two reasons why the code has no explicit treatment for 3-hourly data:

  • The calculation of UTCI is independent of the input frequency: it is just a "conversion" of several parameters into a single index, so the frequency does not change.
  • Most indices whose input and output frequencies differ need daily input. The main focus of index_calculator is therefore on daily input data.
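
A toy illustration of the first point, under loud assumptions: `toy_index` and its coefficients are made up for this sketch and are not the UTCI formula. It only shows that an element-wise conversion yields one value per input time step, so a 3-hourly series stays 3-hourly.

```python
from datetime import datetime, timedelta


def toy_index(temperature, wind_speed):
    """Made-up element-wise combination of two inputs; NOT the UTCI formula."""
    return temperature - 0.5 * wind_speed


# Eight 3-hourly time steps covering one day.
times = [datetime(2024, 1, 1) + timedelta(hours=3 * i) for i in range(8)]
temperature = [10.0 + i for i in range(8)]
wind_speed = [2.0] * 8

# One output value per input time step: the time axis is reused unchanged.
index = [toy_index(t, w) for t, w in zip(temperature, wind_speed)]
series = list(zip(times, index))
```

The output has exactly as many time steps as the input, which is why no frequency-specific code path is needed for this kind of index.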

If you want to calculate UTCI from 3-hourly input data just use this:

result = xcalc.index_calculator(
    ds=ds,
    freq="3hr",
    crop_time_axis=False,
    index="UTCI",
    ...
)

crop_time_axis trims the start and end of the input time axis to match the output frequency. For instance, if freq is year, the left bound is January 1st and the right bound is December 31st. This option does not work with 3-hourly input data yet (not implemented), so please set it to False.
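
To make the idea of cropping concrete, here is a minimal standard-library sketch of trimming a time axis to complete calendar years. It is not the index_calculator implementation; `crop_to_full_years` is a hypothetical helper, and the rule for deciding whether a year is complete is an assumption of this sketch.

```python
from datetime import datetime, timedelta


def crop_to_full_years(times, step):
    """Keep only timestamps from years that `times` covers completely.

    `times` is a sorted list of datetimes with constant spacing `step`.
    """
    if not times:
        return []
    first, last = times[0], times[-1]
    # The first year counts as complete if the series starts within one
    # step of January 1st, 00:00.
    start_year = first.year if first < datetime(first.year, 1, 1) + step else first.year + 1
    # The last year counts as complete if the series reaches to within one
    # step of the following January 1st, 00:00.
    end_year = last.year if last + step >= datetime(last.year + 1, 1, 1) else last.year - 1
    return [t for t in times if start_year <= t.year <= end_year]


# 3-hourly series from 2000-11-01 up to (but excluding) 2002-02-01.
step = timedelta(hours=3)
times = []
t = datetime(2000, 11, 1)
while t < datetime(2002, 2, 1):
    times.append(t)
    t += step

cropped = crop_to_full_years(times, step)
```

Here only 2001 is fully covered, so the cropped axis runs from 2001-01-01 00:00 to 2001-12-31 21:00.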

If you have any questions, please don't hesitate to contact me.

Cheers,
Ludwig

@emsonali
Contributor Author

Hi @ludwiglierhammer, thank you for the clarification. May I know what xcalc is and the appropriate way to install it?

@larsbuntemeyer
Collaborator

larsbuntemeyer commented Feb 27, 2024

Hey @emsonali, I think what Ludwig means here is just an alias: xcalc is an abbreviation for index_calculator:

import index_calculator as xcalc

@emsonali
Contributor Author

@larsbuntemeyer @ludwiglierhammer I managed to get index_calculator working for 3h by adding crop_time_axis=False and setting freq="3hr". However, the outputs are still on the daily scale. Do you know why this might be happening? I'm running it with the index_calculation function on Levante.

@ludwiglierhammer
Collaborator

😕 @emsonali: Can you please send me the ncdumps of your input and output data?

@emsonali
Contributor Author

@ludwiglierhammer attaching ncdumps from sample input and output files
input_ncdump.txt
output_ncdump.txt

@ludwiglierhammer
Collaborator

@emsonali: I think I figured out the problem. The first step of the preprocessor is to convert the input data to the requested input frequency. Unfortunately, this converter turns your 3-hourly input data into daily data, since index_calculator is mainly written for daily input data.
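
A rough standard-library sketch of why the output ends up on the daily scale once such a conversion runs. The aggregation by daily mean is an assumption of this sketch (the thread does not say which aggregation the converter uses), and `to_daily_mean` is a hypothetical helper.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def to_daily_mean(times, values):
    """Group sub-daily values by calendar day and average them (assumed aggregation)."""
    buckets = defaultdict(list)
    for t, v in zip(times, values):
        buckets[t.date()].append(v)
    return {day: mean(vals) for day, vals in sorted(buckets.items())}


# Two days of 3-hourly data: 16 input time steps.
times = [datetime(2024, 1, 1) + timedelta(hours=3 * i) for i in range(16)]
values = [float(i) for i in range(16)]

daily = to_daily_mean(times, values)
```

Sixteen 3-hourly values collapse into two daily values, so any index computed afterwards can only be daily.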

Here you may add some adjustments to the code. Just fork index_calculator to your private GitHub account, create a new branch and open a new pull request. I am pretty sure @larsbuntemeyer and/or @KatharinaBuelow can help you with the technical issues. I can help you edit the code.

Here are some suggestions:

  • return the dataset unchanged, with a warning, if the requested frequency is unknown to the converter:
if self.ifreq not in fjson.keys():
    warnings.warn(
        f"Could not convert to frequency {self.ifreq}. "
        f"Try one of {list(fjson.keys())}."
    )
    return ds
  • in addition, you can add a new option similar to check_time_axis (e.g. convert_time_axis) that controls whether the converter runs:
if convert_time_axis is True:
    ds_ = self._convert_to_frequency(self.ds)
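
The two suggestions can be put together in a self-contained toy sketch. Everything here is a stand-in: `PreProcessingSketch`, `FREQUENCY_TABLE` and the dict used as a dataset are hypothetical and only mimic the control flow; the real preprocessor and the contents of convert_to_frequency.json will differ.

```python
import warnings

# Hypothetical stand-in for the table read from convert_to_frequency.json.
FREQUENCY_TABLE = {"day": "D", "mon": "MS", "year": "YS"}


class PreProcessingSketch:
    """Toy preprocessor; `ds` is a plain dict standing in for a dataset."""

    def __init__(self, ds, ifreq="day", convert_time_axis=True):
        self.ds = ds
        self.ifreq = ifreq
        self.convert_time_axis = convert_time_axis

    def _convert_to_frequency(self, ds):
        if self.ifreq not in FREQUENCY_TABLE:
            # A single message string; a second positional argument to
            # warnings.warn would be interpreted as the warning category.
            warnings.warn(
                f"Could not convert to frequency {self.ifreq}. "
                f"Try one of {list(FREQUENCY_TABLE)}."
            )
            return ds  # hand the input back unchanged instead of None
        # Placeholder for the real resampling step.
        return {**ds, "frequency": self.ifreq}

    def preprocess(self):
        # The new switch lets callers skip the conversion entirely.
        if self.convert_time_axis:
            return self._convert_to_frequency(self.ds)
        return self.ds


ds = {"frequency": "3hr", "values": [1.0, 2.0]}
untouched = PreProcessingSketch(ds, ifreq="day", convert_time_axis=False).preprocess()
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    fallback = PreProcessingSketch(ds, ifreq="3hr").preprocess()
converted = PreProcessingSketch(ds, ifreq="day").preprocess()
```

With the switch off, or with an unknown frequency, the 3-hourly input passes through untouched; only a known frequency with the switch on triggers the conversion.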

@larsbuntemeyer
Collaborator

@emsonali here are some hints on how to contribute using a fork. Let me know if you need more help.

@emsonali
Contributor Author

Hi @ludwiglierhammer @larsbuntemeyer, I've been attempting to modify _preprocessing.py, _processing.py, _postprocessing.py, _consts.py and convert_to_frequency.json so that index_calculator can do the computation on 3h frequencies and output the 3h UTCI values as 5-year files. I can now produce an output, but the UTCI values show as NaN; previously I got a memory error because the files being written were too big. I think the issue essentially has to do with pyhomogenize, as it only handles daily data. Is there a way to bypass pyhomogenize in the code and hardcode the 3h computations and the creation of the time dimension of the netCDF files, or do we need to edit pyhomogenize so that it can handle 3h computations?

@ludwiglierhammer
Collaborator

Hi @emsonali, the test suite no longer passes with your changes. Please make sure that all tests run successfully; afterwards we can fix your code. pyhomogenize does not only handle daily data. I think the problem is here. It returns None for the dataset if time frequency is not 3h.

We use pyhomogenize to manage the time axis. pyhomogenize is able to "work" with those frequencies.

Can you please open a PR? That makes many things easier.

@emsonali
Contributor Author

Hi @ludwiglierhammer I reverted the changes made to _preprocessing.py and index_calculator is able to run on 3h frequency just with the addition I made to convert_to_frequency.json.

2 remaining issues:

  1. I have a runtime issue on Levante, as the maximum runtime for each job is only 8h. Any advice on how I could break up the jobs further? Should I modify index_calculation to facilitate this, or is there a way to increase the runtime? @larsbuntemeyer @KatharinaBuelow
  2. I get the following RuntimeWarning (see attached screenshot) many times (hundreds of lines of the same warning), but I checked the output UTCI values and they seem to make sense. Should I do anything about this?
    Screenshot 2024-03-14 at 13 40 59

@ludwiglierhammer
Collaborator

  1. The maximum job runtime on compute nodes is 8 hours.
  2. Those are just warnings caused by division by zero, which also explains some NaNs in your UTCI result. You can disable those warnings:
import warnings
warnings.filterwarnings("ignore")
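
Note that the blanket filter above silences every warning in the process. A narrower alternative, assuming the messages really are of category RuntimeWarning as reported, is to ignore only that category and only inside a limited block. `noisy_division` is a hypothetical stand-in for the computation that emits the warning.

```python
import warnings


def noisy_division(numerator, denominator):
    """Hypothetical stand-in that warns and returns NaN on division by zero."""
    if denominator == 0:
        warnings.warn("divide by zero encountered", RuntimeWarning)
        return float("nan")
    return numerator / denominator


# record=True collects the warnings that are NOT filtered out, so the
# effect of the filter can be inspected afterwards.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # Ignore only RuntimeWarning; other categories stay visible.
    warnings.filterwarnings("ignore", category=RuntimeWarning)
    result = noisy_division(1.0, 0.0)
    warnings.warn("still visible", UserWarning)
```

Outside the `with` block the previous warning filters are restored, so unrelated warnings elsewhere in a job are not lost.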

I can have a look at your PR #46 next week.

@emsonali
Contributor Author

@ludwiglierhammer could you point me to the script(s) where the time bounds for the index calculations are set? Either in index_calculator or index_calculation. I will modify the time bounds there so I can submit smaller jobs that can run within 8h

@ludwiglierhammer
Collaborator

Is that what you are looking for? You can simply set your left and right time bounds by calling index_calculator.preprocessing.
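
One way to prepare smaller jobs is to split the full period into fixed-length chunks and submit one job per pair of bounds. The sketch below only generates the bounds; `split_into_chunks` is a hypothetical helper, the years are example values, and how the bounds are passed to index_calculator.preprocessing is not shown in this thread, so that step is left out.

```python
from datetime import date


def split_into_chunks(first_year, last_year, years_per_chunk):
    """Return (left, right) ISO date bounds covering first_year..last_year."""
    chunks = []
    for start in range(first_year, last_year + 1, years_per_chunk):
        # The last chunk may be shorter than years_per_chunk.
        end = min(start + years_per_chunk - 1, last_year)
        chunks.append((date(start, 1, 1).isoformat(), date(end, 12, 31).isoformat()))
    return chunks


# Example: a 12-year period split into chunks of at most 5 years.
chunks = split_into_chunks(1981, 1992, 5)
for left, right in chunks:
    print(f"job bounds: {left} .. {right}")
```

Each pair can then be used as the left and right time bound of a separate job, with the chunk length tuned so that a single job stays within the 8-hour limit.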
