Shape of 4DSTEM .mib data triggered with lineskip=1 #270

emichr · 2024-06-10T08:13:26Z

Describe the bug

When loading 4DSTEM .mib data (hdr file: SPEDSTEM_256x256_8cm_100pct.txt) acquired with lineskip=1, hyperspy.api.load() throws an IO error (see error_msg.txt). The error disappears when navigation_shape=(256,256) is given, but the resulting signal is distorted.

To load the data correctly with the current version of rsciio, an extra frame must be included along the x-axis of the navigation shape: navigation_shape=(257, 256) gives the correct signal.

As a workaround this is fine, but maybe some improvements can be made to the reader.

The data is available at: https://drive.google.com/drive/folders/1jFQIKGnT1Ru0faFC_rLzbhrjrVIqx_8q?usp=drive_link.

To Reproduce

Steps to reproduce the behavior:

import hyperspy.api as hs

signal = hs.load("SPEDSTEM_256x256_8cm_100pct.mib", lazy=True)  # Throws an IO error.
signal = hs.load("SPEDSTEM_256x256_8cm_100pct.mib", lazy=True, navigation_shape=(256, 256))  # Works, but the resulting signal is distorted.
signal = hs.load("SPEDSTEM_256x256_8cm_100pct.mib", lazy=True, navigation_shape=(257, 256))  # Works, and the resulting signal is correct.

Expected behavior

The raw .mib data has a total of 65792 frames (as seen in the .hdr file), of which $256\times256=65536$ frames contain the actual data (as given by the ScanX and ScanY parameters in the .hdr file). That leaves $65792/256=257$ frames per scan line, i.e. one extra frame per line. hs.load("SPEDSTEM_256x256_8cm_100pct.mib", lazy=True, navigation_shape=(256,256)) should therefore skip that extra frame on every line (the frame at index 256 of each 257-frame line, counting from 0), as this is the most usual way of interpreting the navigation shape. It should be possible to get the required information from the .hdr file and thus avoid the need to specify the navigation shape when reading the data.
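To make the frame arithmetic concrete, here is a minimal sketch of which frames would be kept. It is not the rsciio implementation: `kept_frame_indices` is a hypothetical helper, and it assumes the extra frame sits at the end of every scan line.

```python
def kept_frame_indices(scan_x, scan_y, frame_to_skip_per_line=1):
    """Return the flat indices of the frames that belong to the scan grid.

    Assumption (not confirmed by the reader code): the frames to skip are
    the last ones recorded on each scan line.
    """
    frames_per_line = scan_x + frame_to_skip_per_line
    return [
        line * frames_per_line + column
        for line in range(scan_y)
        for column in range(scan_x)
    ]


total_frames = 65792  # frame count reported in the .hdr file
scan_x, scan_y = 256, 256  # ScanX and ScanY from the .hdr file

# 65792 frames over 256 lines is 257 frames per line, so 1 extra per line.
extra_per_line = total_frames // scan_y - scan_x
indices = kept_frame_indices(scan_x, scan_y, extra_per_line)

print(extra_per_line, len(indices), indices[255], indices[256])
# prints: 1 65536 255 257
```

Frame 256 (the last frame of the first line) is dropped, and the second line starts at frame 257.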

Python environment:

RosettaSciIO version: 0.4
Python version: 3.11.9
HyperSpy version: 2.1.0
Pyxem version: 0.19.0


ericpre · 2024-06-10T15:17:04Z

Considering that the number of frames to skip per line is not saved in the file, what logic would you suggest for guessing the number of frames to skip that would also support incomplete acquisitions?
@matkraj, can you please confirm that the number of frames to skip for each line is not saved in the files?

Out of curiosity, are these being used to mitigate flyback when line trigger or pixel trigger are not used? If so, would it not be simpler to use line/pixel trigger instead? If there are use cases, it would be useful to document them for future reference.

ericpre · 2024-06-10T15:22:57Z

Traceback when reading a file containing more frames than the number of scanned pixels (in this case 1 extra pixel at the end of each line):

ERROR:hyperspy.io:If this file format is supported, please report this error to the HyperSpy developers.

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 1
----> 1 s = hs.load(str(datapath), lazy=False)

File /cluster/projects/itea_lille-nv-fys-tem/miniforge3/envs/pyxem0.19.0/lib/python3.11/site-packages/hyperspy/io.py:536, in load(filenames, signal_type, stack, stack_axis, new_axis_name, lazy, convert_units, escape_square_brackets, stack_metadata, load_original_metadata, show_progressbar, **kwds)
    533         objects.append(signal)
    534 else:
    535     # No stack, so simply we load all signals in all files separately
--> 536     objects = [
    537         load_single_file(filename, lazy=lazy, **kwds) for filename in filenames
    538     ]
    540 if len(objects) == 1:
    541     objects = objects[0]

File /cluster/projects/itea_lille-nv-fys-tem/miniforge3/envs/pyxem0.19.0/lib/python3.11/site-packages/hyperspy/io.py:537, in <listcomp>(.0)
    533         objects.append(signal)
    534 else:
    535     # No stack, so simply we load all signals in all files separately
    536     objects = [
--> 537         load_single_file(filename, lazy=lazy, **kwds) for filename in filenames
    538     ]
    540 if len(objects) == 1:
    541     objects = objects[0]

File /cluster/projects/itea_lille-nv-fys-tem/miniforge3/envs/pyxem0.19.0/lib/python3.11/site-packages/hyperspy/io.py:596, in load_single_file(filename, **kwds)
    590     raise ValueError(
    591         "`reader` should be one of None, str, " "or a custom file reader object"
    592     )
    594 try:
    595     # Try and load the file
--> 596     return load_with_reader(filename=filename, reader=reader, **kwds)
    598 except BaseException:
    599     _logger.error(
    600         "If this file format is supported, please "
    601         "report this error to the HyperSpy developers."
    602     )

File /cluster/projects/itea_lille-nv-fys-tem/miniforge3/envs/pyxem0.19.0/lib/python3.11/site-packages/hyperspy/io.py:617, in load_with_reader(filename, reader, signal_type, convert_units, load_original_metadata, **kwds)
    615 lazy = kwds.get("lazy", False)
    616 if isinstance(reader, dict):
--> 617     file_data_list = importlib.import_module(reader["api"]).file_reader(
    618         filename, **kwds
    619     )
    620 else:
    621     # We assume it is a module
    622     file_data_list = reader.file_reader(filename, **kwds)

File /cluster/projects/itea_lille-nv-fys-tem/miniforge3/envs/pyxem0.19.0/lib/python3.11/site-packages/rsciio/quantumdetector/_api.py:585, in file_reader(filename, lazy, chunks, mmap_mode, navigation_shape, first_frame, last_frame, print_info)
    582     else:
    583         navigation_shape = (frame_per_trigger, frames_number // frame_per_trigger)
--> 585 data = load_mib_data(
    586     filename,
    587     lazy=lazy,
    588     chunks=chunks,
    589     mmap_mode=mmap_mode,
    590     navigation_shape=navigation_shape,
    591     first_frame=first_frame,
    592     last_frame=last_frame,
    593     mib_prop=mib_prop,
    594     print_info=print_info,
    595     return_mmap=False,
    596 )
    597 data = np.flip(data, axis=-2)
    599 # data has 3 dimension but we need to to take account the dimension of the
    600 # navigation_shape after reshape

File /cluster/projects/itea_lille-nv-fys-tem/miniforge3/envs/pyxem0.19.0/lib/python3.11/site-packages/rsciio/quantumdetector/_api.py:342, in load_mib_data(path, lazy, chunks, mmap_mode, navigation_shape, first_frame, last_frame, mib_prop, return_headers, print_info, return_mmap)
    340 # remove navigation_dimension with value 1 before reshaping
    341 navigation_shape = tuple(i for i in navigation_shape if i > 1)
--> 342 data = data.reshape(navigation_shape + mib_prop.merlin_size)
    343 if lazy and isinstance(chunks, tuple) and len(chunks) > 2:
    344     # rechunk navigation space when chunking is specified as a tuple
    345     data = data.rechunk(chunks)

ValueError: cannot reshape array of size 0 into shape (65793,256,256)

When there is a mismatch in shape, we could raise a better error that mentions the additional frames per line, and add an additional argument (frame_to_skip_per_line?) and/or use the approach in #235?
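As a rough sketch of what such a check could look like (purely illustrative: `check_navigation_shape` is a hypothetical helper, and `frame_to_skip_per_line` is only the argument name floated above, not an existing rsciio parameter):

```python
import math


def check_navigation_shape(total_frames, navigation_shape, frame_to_skip_per_line=0):
    """Raise a descriptive ValueError if the frames cannot fill the scan grid.

    navigation_shape follows the (x, y, ...) order used in this thread, so
    its first entry is the number of frames kept per scan line.
    """
    frames_per_line = navigation_shape[0] + frame_to_skip_per_line
    n_lines = math.prod(navigation_shape[1:])
    expected = frames_per_line * n_lines
    if expected == total_frames:
        return

    message = (
        f"Cannot arrange {total_frames} frames into navigation shape "
        f"{navigation_shape} with {frame_to_skip_per_line} frame(s) skipped "
        f"per line ({expected} frames expected)."
    )
    surplus = total_frames - expected
    # Hint at extra frames per line only when the surplus divides evenly.
    if n_lines > 0 and surplus > 0 and surplus % n_lines == 0:
        message += (
            f" The file holds {surplus // n_lines} additional frame(s) per "
            "line; consider skipping them."
        )
    raise ValueError(message)


# Consistent: 256 lines of 256 kept frames plus 1 skipped frame each.
check_navigation_shape(65792, (256, 256), frame_to_skip_per_line=1)

# Inconsistent: the error now points at the extra frame on every line.
try:
    check_navigation_shape(65792, (256, 256))
except ValueError as error:
    print(error)
```

An incomplete acquisition would generally not divide evenly, so it would get the plain mismatch message without the hint.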

emichr · 2024-06-10T15:44:24Z

> Considering the number of frame to skip per line is not saved in the file, what logic would you suggest to guess the number of frame to skip, that would also support incomplete acquisition? @matkraj, can you please confirm that this is correct that the number of frame to skip for each line is not saved in the files?
>
> Out of curiosity, are these being used to mitigate flyback, when line trigger or pixel trigger are not used? Is so, would it be more simple not to use line/pixel trigger instead? If there are use cases, it would be useful to document them for future reference.

If the data was acquired using the "Scan mode" of the Merlin software, the information is already in the .hdr file as "ScanX" and "ScanY" fields. I am working on a pull request to add this (will probably have it ready tomorrow or the day after).
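A minimal sketch of how the two fields might be pulled out of the header text. Only the field names ScanX and ScanY come from this thread; the "key: value" one-pair-per-line layout, the helper name `read_scan_shape`, and the sample string are assumptions for illustration, not the actual .hdr format or the code in the pull request.

```python
def read_scan_shape(header_text):
    """Return (ScanX, ScanY) as integers, or None if either field is missing.

    Assumption: the header stores one "key: value" pair per line. The real
    .hdr layout may differ.
    """
    fields = {}
    for line in header_text.splitlines():
        key, separator, value = line.partition(":")
        if separator:
            fields[key.strip()] = value.strip()
    try:
        return int(fields["ScanX"]), int(fields["ScanY"])
    except (KeyError, ValueError):
        # Field absent or not an integer: caller must pass navigation_shape.
        return None


# Made-up header excerpt, for illustration only.
sample_header = "ScanX:\t256\nScanY:\t256\n"
print(read_scan_shape(sample_header))  # prints: (256, 256)
print(read_scan_shape("no scan fields here"))  # prints: None
```

Returning None rather than raising leaves room to fall back to a user-supplied navigation shape when the header is missing or lacks the fields.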

If the data was acquired in "Standard mode", it becomes a bit more complicated. However, this is slightly outdated, and it also requires users to keep track of this information themselves anyway. In this case, it should suffice to add an example to the docstring showing how to give a suitable navigation shape.

emichr · 2024-06-11T05:28:10Z

I created PR #272, which should solve some of this. I have not yet updated the docstrings with more detailed examples of how to handle cases where the .hdr file is either missing or does not contain the required ScanX and ScanY fields. Let me know what you think, @ericpre and @matkraj, and I'll try to get it into the PR if you think it's a good idea.

emichr · 2024-06-11T05:31:32Z

By the way, I didn't add a better exception for when the data cannot be reshaped. I guess it's straightforward, but it might require some more detailed work? I'm not sure what kind of "freak" cases might appear in real life...

emichr added the type: bug Something isn't working label Jun 10, 2024

ericpre added status: feature request and removed type: bug Something isn't working labels Jun 10, 2024

ericpre mentioned this issue Jun 10, 2024, in: Support more features of the mib reader #219

emichr mentioned this issue Jun 11, 2024, in: Add support for reading mib acquired with a given number of frame to skip per line #272
