-
Notifications
You must be signed in to change notification settings - Fork 285
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Severe slicing bug when slicing a cube with a boolean ndarray #6251
Comments
Whoops forgot to tag poor @schlunma who worked heaps on isolating the behaviour out of the stack 🍺 |
Just as an extra clue: moving the slice forward in time, and asking for elements equally shifted in time, results in consitent results lazy-realized-netCDF4: import netCDF4
import iris
import dask
print(dask.__version__)
import numpy as np
fil = "tos_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc"
def do_lazy_iris(c):
bo = np.zeros(c.core_data().shape[0], dtype=bool)
bo[1460:1860] = True
d = c[bo, ...]
print("Lazy Iris")
print(d[100, 30, 50].data)
print(d[56, 30, 50].data)
print(d[144, 30, 50].data)
def do_realized_iris(c):
bo = np.zeros(c.data.shape[0], dtype=bool)
bo[1460:1860] = True
d = c[bo, ...]
print("Realized Iris")
print(d[100, 30, 50].data)
print(d[56, 30, 50].data)
print(d[144, 30, 50].data)
def do_netCDF4():
ds = netCDF4.Dataset(fil)
print(ds["tos"][1560, 30, 50])
print(ds["tos"][1516, 30, 50])
print(ds["tos"][1604, 30, 50])
cub = iris.load_cube(fil)
do_lazy_iris(cub)
print("\n")
do_realized_iris(cub)
print("\n")
do_netCDF4() and in the same vein, trying to retrieve the data elements frm the example above (the MRE), with a slice shifted in time, results in consistent results too: import netCDF4
import iris
import dask
print(dask.__version__)
import numpy as np
fil = "tos_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc"
def do_lazy_iris(c):
bo = np.zeros(c.core_data().shape[0], dtype=bool)
bo[1460:1860] = True
d = c[bo, ...]
print("Lazy Iris")
print(d[200, 30, 50].data)
print(d[156, 30, 50].data)
print(d[244, 30, 50].data)
def do_realized_iris(c):
bo = np.zeros(c.data.shape[0], dtype=bool)
bo[1460:1860] = True
d = c[bo, ...]
print("Realized Iris")
print(d[200, 30, 50].data)
print(d[156, 30, 50].data)
print(d[244, 30, 50].data)
def do_netCDF4():
ds = netCDF4.Dataset(fil)
print(ds["tos"][1660, 30, 50])
print(ds["tos"][1616, 30, 50])
print(ds["tos"][1704, 30, 50])
cub = iris.load_cube(fil)
do_lazy_iris(cub)
print("\n")
do_realized_iris(cub)
print("\n")
do_netCDF4() so there is something very odd when starting the slice at time index 1560 out of 1980 index length |
...and shifting the slice forward in time, results in consistent results: import netCDF4
import iris
import dask
print(dask.__version__)
import numpy as np
fil = "tos_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc"
def do_lazy_iris(c):
bo = np.zeros(c.core_data().shape[0], dtype=bool)
bo[1660:1860] = True
d = c[bo, ...]
print("Lazy Iris")
print(d[100, 30, 50].data)
print(d[56, 30, 50].data)
print(d[144, 30, 50].data)
def do_realized_iris(c):
bo = np.zeros(c.data.shape[0], dtype=bool)
bo[1660:1860] = True
d = c[bo, ...]
print("Realized Iris")
print(d[100, 30, 50].data)
print(d[56, 30, 50].data)
print(d[144, 30, 50].data)
def do_netCDF4():
ds = netCDF4.Dataset(fil)
print(ds["tos"][1760, 30, 50])
print(ds["tos"][1716, 30, 50])
print(ds["tos"][1804, 30, 50])
cub = iris.load_cube(fil)
do_lazy_iris(cub)
print("\n")
do_realized_iris(cub)
print("\n")
do_netCDF4() anything special about 1560 out of 1980? Apart from 1650 / 1980 = 0.78787878... which is a beautiful number 😁 |
@valeriupredoi To clarify if a bug was solved or introduced: which of the two versions of Dask produces the correct numbers? |
ah a good point @bouweandela 🍺 :
|
Thanks, that is very useful information. Have you been able to reproduce this using only Dask arrays? Or does the issue only show up when the data is loaded through Iris? |
let me try get it to go through netCDF4+Dask only, can't really load the data in a safe way in any other way (so that I preserve some iris-like structure) |
OK, netCDF4-python+Dask with MRE: def do_netCDF4_Dask(c):
ds = netCDF4.Dataset(fil)
bo = np.zeros(c.data.shape[0], dtype=bool)
bo[1560:1860] = True
dat = da.array(ds["tos"])[bo, ...]
print(dat[100, 30, 50].compute())
print(dat[56, 30, 50].compute())
print(dat[144, 30, 50].compute()) gives correct ie unrotated results for both dask 2024.11 (12 too) and 2024.8
|
Curiously, I can't seem to replicate this behaviour directly from the `NetCDFDataProxy object. from iris.fileformats.netcdf import NetCDFDataProxy
def do_netcdfproxy(c):
bo = np.zeros(c.data.shape[0], dtype=bool)
bo[1560:1860] = True
print("NetCDF Proxy")
proxy = NetCDFDataProxy(shape=(1980, 384, 320), dtype=np.float32(), path=fil, variable_name="tos", fill_value=999)
dask_proxy = da.from_array(proxy, chunks=(300, 384, 320))
dask_slice = dask_proxy[bo, ...]
print(dask_slice[100, 30, 50].compute())
print(dask_slice[56, 30, 50].compute())
print(dask_slice[144, 30, 50].compute()) This seems like a pretty tricky one to debug. I think for the time being a dask pin is probably sensible. |
It's also worth noting that this problem doesn't seem to be due to iris specific indexing since the core data dask array also has this problem. def do_lazy_iris_core(c):
bo = np.zeros(c.core_data().shape[0], dtype=bool)
bo[1560:1860] = True
print("Lazy Iris core data")
dc = c.core_data()[bo, ...]
print(dc[100, 30, 50].compute())
print(dc[56, 30, 50].compute())
print(dc[144, 30, 50].compute()) |
@stephenworsley thanks for looking into this! Am afraid this is an iris issue, only for the lazy data array though. Please see #6251 (comment) - that uses pure netCDF4+Dask, and all is fine with both the lazy, and the realized array. A pin on Dask is really unwelcome, since that version we'd pin on is already 4+ months old. An alternative, that I would propose to my colleagues, would be to use the int indexed ndarray, literally converting the boolean slice to an int dtype-d array. Let me know if you need me to run any more fixes. Also pinging @trexfeathers in case he's seen such behaviour in the past 🍺 |
@valeriupredoi we've discussed this at length, and we think there's a significant risk of others also encountering this problem, and potentially in ways where a workaround is not possible. We don't know how long it will take to develop a full understanding of the problem, and also how long it might take to fix, especially with the Christmas period coming up. To avoid leaving our users exposed to this problem, a pin is unfortunately necessary. We're going to implement the pin in a bugfix release, which should capture most 'naive' users, including those on PyPI. If the pin is a problem for you then you can specify a different Iris version in your requirements. |
many thanks for the heads up @trexfeathers 🍺 Sounds reasonable to me, completely understand the Xmas bit - I myself have just about an hour left to work this year, and am not gonna ask others nor give myself more work 😀 Have a great Xmas break @stephenworsley and @trexfeathers - see you folks in the new year 🎄 |
@HGWright this is not completed, the pin on iris is but a workaround to avoid it 🍺 |
It is indeed solved upstream in Dask, see the issue dask/dask#11614 but I suggest keeping the issue open until both Dask release the version with the fix, and iris unpins Dask, I'd also pin Dask to that version in your next release 🍺🎄 |
Hey @valeriupredoi, the work to unpin dask has been captured in #6264, so I think this is good to remain closed (assuming there are no issues with yesterdays v3.11.1 release)? |
well, it's up to you guys to decide how to handle an issue, my take is keep it open as long as the bug is still out there, the workaround the bug (ie pin Iris) is done, but the actual bug is still lurking (in Dask, as we've seen, until Dask release a new version, after the Xmas hols). At any rate, you folks have a good break and happy Xmas! 🎄 |
🐛 Bug Report
Hi folks, buggy-bug here, please see the following:
MRE
file can be downloaded directly from https://esgf.ceda.ac.uk/thredds/catalog/esg_cmip6/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/tos/gn/v20190308/catalog.html?dataset=esg_cmip6/CMIP6/CMIP/NCAR/CESM2/historical/r1i1p1f1/Omon/tos/gn/v20190308/tos_Omon_CESM2_historical_r1i1p1f1_gn_185001-201412.nc
Environment
Things that matter:
Results
Run the above Python code for the two versions of Dask (2024.8 and 2024.11) and you'll see the differences 😁
Fixing the issue
Hints:
Good luck 🍺
The text was updated successfully, but these errors were encountered: