Performance regressions on changing netcdf dependency to hdf5 v1.10.8 #2229
-
Hi @abhibaruah, thank you for your patience! The performance regressions with HDF5 1.10.x as compared to 1.8.x are expected, as you point out; I will take a look to see if I observe similar regressions. I know that the HDF Group has put effort into reverting some of these regressions in the 1.12.x branch. Have you tried your benchmarks against the latest HDF5 1.12.x release?
-
As @WardF points out, this is a known problem in HDF5 and beyond the control of netCDF. However, using HDF5-1.12.1 helps with performance. Here's a recent chart in which we measured the different HDF5 versions in a real-world forecast output on HPC systems. Upgrading to HDF5-1.12.1 is transparent to netCDF: just rebuild netCDF with HDF5-1.12.1 and everything will work just as before, but slightly faster. ;-)
-
Thanks a lot @WardF and @edwardhartnett. The largest regressions that we see are with the NetCDF VLEN functions.
-
NetCDF version: 4.8.1
OS: Windows 10 and Debian 10
I am trying to change the dependency of NetCDF from hdf5 v1.8.12 to v1.10.8. I was able to change the dependency successfully (thanks to help from you). After this, when I run our performance suite, we see regressions in several of the netcdf functions. I know that netcdf regressions with hdf5 1.10 are a known issue, but I still thought I would report this and attach a summary table of some of the regressions that we saw.
For each of the functions below, I am comparing the same workflow in my sandboxes with hdf5 1.8.12 and hdf5 1.10.8. I made several runs and the results below are the average of all the runs.
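As a rough illustration of how such figures can be derived, here is a minimal sketch. It assumes that "% regression" means the relative slowdown of the mean run time against the hdf5 1.8.12 baseline; the post does not spell out the formula, and the helper names `mean_runtime` and `percent_regression` are hypothetical, not part of our actual performance suite.

```python
import statistics
import time


def mean_runtime(workflow, runs=5):
    """Return the mean wall-clock time (seconds) of calling workflow() `runs` times."""
    samples = []
    for _ in range(runs):
        start = time.perf_counter()
        workflow()
        samples.append(time.perf_counter() - start)
    return statistics.mean(samples)


def percent_regression(baseline_seconds, candidate_seconds):
    """Slowdown of the candidate build relative to the baseline, in percent.

    Assumed definition: (candidate - baseline) / baseline * 100.
    A positive value means the candidate build is slower.
    """
    if baseline_seconds <= 0:
        raise ValueError("baseline time must be positive")
    return (candidate_seconds - baseline_seconds) / baseline_seconds * 100.0
```

Under this definition, a workflow that averages 2.0 s with the old library and 3.0 s with the new one gives `percent_regression(2.0, 3.0) == 50.0`, and values above 100 mean the run time more than doubled.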
| Function | Workflow | Windows % regression | Linux % regression |
| --- | --- | --- | --- |
| nc_get_var_string | Create a temp file, write an NC_STRING variable to it and read it back | 88.48 | 21.048951 |
| nc_create | Create a netcdf-4 file | 26.53 | 115.43 |
| nc_get_var | Read a VLEN variable of basetype NC_BYTE from the attached file (/numeric_types/samples_int8) | 1561.09 | 2.11 |
| nc_get_var | Read a VLEN variable of basetype NC_STRING from the attached file (/text_types/samples_string) | 76.74 | 36.74 |
The highest regressions are seen with nc_get_var and nc_get_att with VLEN types. I am still working on getting the performance numbers for nc_get_att and will post them as soon as they are ready.
Although I have only posted results for nc_get_var with basetypes NC_BYTE and NC_STRING, I see regressions with other basetypes as well.
I know that regressions with hdf5 1.10.8 are expected, but still, I would like to get your opinions on the results I posted above.
Are these slowdowns expected? Is there any way to improve performance with hdf5 1.10.8?
Kindly let me know if you want more information regarding the investigation.
Here is the link to the nc file we used for the 'nc_get_var' tests:
https://mathworks-my.sharepoint.com/:u:/p/abaruah/ES2lQMAc0MRJpBOF8huTQ3EB02Y0oynhIY1CYC_mo08j-w?e=Sfb4iz