What happened?
I am trying to run ai-models on a set of models. I want to launch multiple forecasts at different timesteps and compare them with the outputs of ERA5. Due to the restrictions of my HPC, I do not have internet access and therefore needed to download the data in advance. For that purpose, I created a Zarr dataset from NetCDF files previously requested from CDS. From this dataset I extract the initial conditions required by a model and convert them to NetCDF to use as input for ai-models.
Now I'm encountering several difficulties, mainly with using earthkit-data to parse these NetCDF files. The main one concerns the NetCDFMultiFieldList object, as it does not implement to_xarray.
To explain why this object is created: the function reader from earthkit.data.readers.netcdf returns fs because fs.has_fields() returns True. This creates a NetCDFFieldListReader with all the fields read from the NetCDF file. Afterwards, this object mutates into a NetCDFMultiFieldList; ai-models then tries to convert it to Xarray and fails because this conversion is not implemented.
If fs.has_fields() returned False, a NetCDFReader object would be created instead. Although this raises other errors, a conversion to Xarray is implemented for it. I have noticed that the mutate_source methods of NetCDFFieldListReader and NetCDFFieldListUrlReader carry a comment describing these classes as already being NetCDFReader objects ("# A NetCDFReader is a source itself"). I wonder whether these classes were supposed to inherit from NetCDFReader, or whether there are plans for this at some point.
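The flow described above can be sketched as a toy model. The class names mirror the ones mentioned in this report, but every body is a placeholder written for illustration: this is not earthkit-data's actual code, and the helpers open_source and can_convert are hypothetical.

```python
# Toy model of the dispatch described above. NOT earthkit-data's real code:
# the class names match the report, the bodies are illustrative placeholders.


class NetCDFReader:
    """Stand-in for the plain reader, which can be converted to Xarray."""

    def to_xarray(self):
        return "xarray.Dataset (placeholder)"


class NetCDFMultiFieldList:
    """Stand-in for the object ai-models ends up holding."""

    def to_xarray(self):
        raise NotImplementedError("to_xarray() is not implemented for this object")


class NetCDFFieldListReader:
    """Stand-in for the reader created when the file contains fields."""

    def mutate(self):
        # Assumption: somewhere along the way the reader turns into a
        # NetCDFMultiFieldList, as described in the report.
        return NetCDFMultiFieldList()


def open_source(has_fields):
    """Hypothetical helper mimicking the has_fields() branch in reader()."""
    if has_fields:
        return NetCDFFieldListReader().mutate()
    return NetCDFReader()


def can_convert(source):
    """Return True if the object can be converted to Xarray."""
    try:
        source.to_xarray()
    except NotImplementedError:
        return False
    return True


if __name__ == "__main__":
    print("has_fields=True  ->", can_convert(open_source(True)))
    print("has_fields=False ->", can_convert(open_source(False)))
```

In this sketch only the has_fields=False branch yields an object that supports the conversion, which is the asymmetry I am asking about.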
What are the steps to reproduce the bug?
Requirements with versions to reproduce the bug:
xarray=2024.0.0
netCDF4=1.7.1.post2
ai-models=0.7.0
ai-models-panguweather=0.0.7
Download 2023-01-01 initial conditions required to run Panguweather from CDS. To do that, I have used the following climetlab requests.
Locate them and join them into a single NetCDF file. I joined them with xr.open_mfdataset using default arguments, without any errors and preserving all the variables with the appropriate dimensions. The files are named "sfc_pangu_test.nc" and "pl_pangu_test.nc". Note that pressure_level is renamed to pl since it will be the levtype used to select atmospheric variables in ai-models (in this line)
Run ai-models with the created file pangu_test.nc as follows: ai-models --download-assets --input file --file pangu_test.nc --output file --path pangu_output.grib --date 20230101 --time 1200 --lead-time 24 panguweather
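The rename and join from the second step can be sketched with small in-memory datasets. This is a minimal illustration under assumptions, not the exact script I ran: the variable names t2m and t, the grid, and the level values are made up, xr.merge stands in for xr.open_mfdataset so that no files are needed, and I assume the rename applies to the pressure_level dimension and its coordinate.

```python
import numpy as np
import xarray as xr

# Made-up grid and levels, only to keep the example self-contained.
lat = [10.0, 0.0, -10.0]
lon = [0.0, 90.0, 180.0, 270.0]

# Stand-in for the surface file ("sfc_pangu_test.nc" in the report).
sfc = xr.Dataset(
    {"t2m": (("latitude", "longitude"), np.zeros((3, 4)))},
    coords={"latitude": lat, "longitude": lon},
)

# Stand-in for the pressure-level file ("pl_pangu_test.nc" in the report).
pl = xr.Dataset(
    {"t": (("pressure_level", "latitude", "longitude"), np.zeros((2, 3, 4)))},
    coords={"pressure_level": [1000, 500], "latitude": lat, "longitude": lon},
)

# Rename the vertical dimension and its coordinate to "pl".
pl = pl.rename({"pressure_level": "pl"})

# Join surface and pressure-level variables into one dataset.
merged = xr.merge([sfc, pl])

# Writing the result needs a NetCDF backend such as netCDF4:
# merged.to_netcdf("pangu_test.nc")
```

With real files, xr.open_mfdataset on both paths would replace the two in-memory datasets and the merge.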
I will append the returned log. Don't mind the ONNX error; it is not relevant to this issue and simply means that the model runs on CPU instead of GPU because cuDNN was not found. This code was run from an environment with the aforementioned dependencies, among others. It was launched on a login node of the MareNostrum5 accelerated partition, which has internet access and is only accessible to staff of the Barcelona Supercomputing Center. I am confident that the error is not related to the platform and will occur on any other platform.
Version
0.10.4
Platform (OS and architecture)
Linux alogin4 5.14.0-284.30.1.el9_2.x86_64 #1 SMP PREEMPT_DYNAMIC Fri Aug 25 09:13:12 EDT 2023 x86_64 x86_64 x86_64 GNU/Linux
Relevant log output
2024-10-23 12:25:01,987 INFO Writing results to pangu_output.grib
2024-10-23 12:25:01,987 INFO Downloading /gpfs/home/bsc/bsc032010/git/climate-emulators/test_earthkit/pangu_weather_24.onnx
2024-10-23 12:25:01,987 INFO Downloading https://get.ecmwf.int/repository/test-data/ai-models/pangu-weather/pangu_weather_24.onnx
2024-10-23 12:25:48,428 INFO Downloading /gpfs/home/bsc/bsc032010/git/climate-emulators/test_earthkit/pangu_weather_6.onnx
2024-10-23 12:25:48,428 INFO Downloading https://get.ecmwf.int/repository/test-data/ai-models/pangu-weather/pangu_weather_6.onnx
2024-10-23 12:26:40,144 INFO Using device 'GPU'. The speed of inference depends greatly on the device.
2024-10-23 12:26:40,145 INFO ONNXRuntime providers: ['CUDAExecutionProvider', 'CPUExecutionProvider']
2024-10-23 12:26:40.755689924 [E:onnxruntime:Default, provider_bridge_ort.cc:1992 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1637 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcudnn.so.9: cannot open shared object file: No such file or directory
2024-10-23 12:26:40.755744660 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:965 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
2024-10-23 12:26:50,047 INFO Loading ./pangu_weather_24.onnx: 9 seconds.
2024-10-23 12:26:50.663496877 [E:onnxruntime:Default, provider_bridge_ort.cc:1992 TryGetProviderInfo_CUDA] /onnxruntime_src/onnxruntime/core/session/provider_bridge_ort.cc:1637 onnxruntime::Provider& onnxruntime::ProviderLibrary::Get() [ONNXRuntimeError] : 1 : FAIL : Failed to load library libonnxruntime_providers_cuda.so with error: libcudnn.so.9: cannot open shared object file: No such file or directory
2024-10-23 12:26:50.663522316 [W:onnxruntime:Default, onnxruntime_pybind_state.cc:965 CreateExecutionProviderInstance] Failed to create CUDAExecutionProvider. Require cuDNN 9.* and CUDA 12.*. Please install all dependencies as mentioned in the GPU requirements page (https://onnxruntime.ai/docs/execution-providers/CUDA-ExecutionProvider.html#requirements), make sure they're in the PATH, and that your GPU is supported.
2024-10-23 12:27:00,014 INFO Loading ./pangu_weather_6.onnx: 9 seconds.
2024-10-23 12:27:00,065 INFO Starting date is 2023-01-01 12:00:00
2024-10-23 12:27:00,066 INFO Writing input fields
2024-10-23 12:27:00,067 INFO Total time: 1 minute 58 seconds.
Traceback (most recent call last):
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/bin/ai-models", line 8, in <module>
sys.exit(main())
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/ai_models/__main__.py", line 362, in main
_main(sys.argv[1:])
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/ai_models/__main__.py", line 310, in _main
run(vars(args), unknownargs)
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/ai_models/__main__.py", line 335, in run
model.run()
File "/home/bsc/bsc032010/.local/lib/python3.10/site-packages/ai_models_panguweather/model.py", line 89, in run
self.write_input_fields(fields_pl + fields_sfc)
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/ai_models/model.py", line 545, in write_input_fields
fields.save("input.grib")
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/earthkit/data/decorators.py", line 65, in wrapped
return func(self, *args, **kwargs)
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/earthkit/data/core/fieldlist.py", line 1418, in save
self.write(f, **kwargs)
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/earthkit/data/readers/netcdf/fieldlist.py", line 286, in write
return self.to_netcdf(*args, **kwargs)
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/earthkit/data/readers/netcdf/fieldlist.py", line 215, in to_netcdf
return self.to_xarray().to_netcdf(*args, **kwargs)
File "/gpfs/projects/bsc32/ml_models/emulator_models/ecmwf_ai_models/wf_emulator_snake_2/lib/python3.10/site-packages/earthkit/data/readers/netcdf/fieldlist.py", line 363, in to_xarray
raise NotImplementedError(
NotImplementedError: NetCDFMultiFieldList.to_xarray() does not supports NetCDFMaskFieldList
Accompanying data
No response
Organisation
Barcelona Supercomputing Center
@aperaza-bsc, thank you for the detailed report. Apart from the issue with NetCDFMultiFieldList, I am not sure ai-models can be run with NetCDF input. The documentation only mentions GRIB.
Thanks for the quick response!
You are right. I wrongly assumed that, since it uses earthkit-data as its backend, it would also be format agnostic.
Unless you are still interested in looking into the issue, we can close this on my side, since even if it were solved I might encounter other issues when using NetCDF with ai-models.