Add support for saving segments in a single file #464

fedorov · 2023-01-25T16:35:09Z

WIP towards addressing #462

WIP towards QIICR#462

fedorov · 2023-01-25T17:42:36Z

Failures of the round-trip tests are expected, since output is not what is expected.

Issues identified that need to be fixed:

copy of the unpacked frame in https://github.com/QIICR/dcmqi/blob/master/libsrc/ImageSEGConverter.cpp#L757-L772 must be very inefficient - use something like https://github.com/InsightSoftwareConsortium/ITK/blob/633f84548311600845d54ab2463d3412194690a8/Examples/Iterators/ImageRegionIterator.cxx#L201-L212 instead? (also https://examples.itk.org/src/core/common/importpixelbufferintoanimage/documentation)

For whatever reason, a new list was created for each segment

fedorov · 2023-02-06T21:05:13Z

🎆

fedorov · 2023-02-06T21:08:01Z

Now I need to fix DICOMSegmentationPlugin.py which unfortunately makes the assumption that there will always be one segment per nrrd file:

https://github.com/QIICR/QuantitativeReporting/blob/71f01876d95b6190218b4e23b2dd30d9d5e82488/DICOMPlugins/DICOMSegmentationPlugin.py#L142-L150

perhaps this is faster than iterating over the pixels in a loop?

Introducing a new class OverlapUtil that is able to check whether segments of a given DICOM segmentation are overlapping. It computes an overlap matrix denoting which segments overlap with each other. Also, from that it can derive groups of segments that do not overlap, e.g. for exporting them later into NRRD files or similar formats that cannot deal with overlapping segments.

The code no produces first results - NRRD file(s) for merged segments, though they still need to be checked for correctness. Also on some segmentations the code still seems to crash (to be fixed).

Ensure that errors are passed up the hierarchy. This esp. fixes handling of files with non-consecutive segment numbers (such as an old verison of the brain atlas that uses weird segment numbers).

Fix computation of Image origin for ITK which has been introduced when refactoring the code. Let DCMTK make use of STL classes which gives a some speedup in particular regarding the usage of OFMap that is used in the functional group classes (dcmfg module).

Fix assignment of frames to NRRD file (used index instead of vector element at that index). Speed up writing of segmentation by disabling functional group structure checking. The segmentation to ITK validator should now produce valid NRRD files, grouping non-overlapping segments into the same file (if option --mergedSegments is being provided on the command line).

This also make sure that the updated workflow actions are used.

Since ITK to DICOM and DICOM to ITK conversion do not have much in common, the original class ImageSEGConverter is now splitted into Itk2DicomConverter and Dicom2ItkConverter.

fedorov · 2023-11-03T16:00:54Z

@michaelonken I think I know why it fails. The NRRD outputs you are saving start with prefix 0, while they used to start from 1. E.g., 23x38x3-0.nrrd should have been named 23x38x3-1.nrrd. It should be a trivial fix, since I just think you need to bump the key returned in this map to start from 1: https://github.com/fedorov/dcmqi/blob/merged-segments/apps/seg/segimage2itkimage.cxx#L43. I think it would be better it is fixed at the time that map is created rather than in the loop that writes the files, for consistency of the API, so I think it will be more efficient if you do it, because I can't tell for sure what this may affect.

michaelonken · 2023-11-03T21:15:52Z

Makes sense,thanks! I'll try this on Monday.

fedorov · 2023-11-06T03:50:34Z

@michaelonken here are commands to download the image and segmentation for an oblique acquisition:

seg: s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com cp 's3://idc-open-data/c6d7ad8a-835c-478f-956e-a09171698f51/*' .
image: s5cmd --no-sign-request --endpoint-url https://s3.amazonaws.com cp 's3://idc-open-data/89ac2bc6-893e-4fc7-9dcc-4731617fd716/*' .

You will need to have s5cmd installed first: https://github.com/peak/s5cmd.

For the sake of completeness, the above commands will download corresponding DICOM series from the public AWS buckets maintained by NCI Imaging Data Commons (IDC). The samples are from the QIN-Prostate-Repeatablity collection, which is hosted on IDC.

Fixed tests caused by bug not exporting last frame of a segment. The underlying API has been changed so usage error is less likely to occur. Simplified includes, better error messages, documentation.

Bumping node version -- but not too high since at some version node requires higher GLIBC version. Higher verisons than 12.19.1 may work, though, earlier versions did not work.

The coordinate relevant for frame sorting is now identified by computing the cross product for image orientation patient and selecting the coordinate that results in the bigges absolute value. Minor enhancements in error handling and documentation.

Before, all ITK images have been returned at once after conversion. Now, this is done with an iterator-like access that allows to delete an ITK image once produced by the converter. The converter class does not keep any reference to the ITK image, so once it goes out of scope of the "user" class (like segimage2itkimage.cxx), memory of the image is freed (converter returns smart pointer).

Fixed bug when creating the JSON for merged segments where more than one file was produced. Instead of then splitting segments into different entries in the outer array of the JSON, the segments have been written to the inner array instead.

Added test that writes 3 segments, 2 of them non-overlapping. That means that two NRRD files will be produced, one with 2 segments, the other with only a single one.

michaelonken · 2023-11-10T13:57:03Z

Current status of this pull request:

Overall state: The DICOM to ITK conversion is now in a quite stable state -- from my point of view :-). More tests would be good though since I used limited amount of test data so far.
Refactoring: I refactored the SegImageConverter code into two separate classes (Itk2DicomConverter and Dicom2ItkConverter) since both use cases did not have much doe in common. Also I (earlier) refactored the DIcom2Itk code quite a lot to make it more readable and reusable.
Tests: Tests are working now (again) on all 3 supported platforms, including one test to verify the NRRD and JSON files produced for the --mergeSegment case.

Weaknesses:

The slice sorting is done by identifying the relevant coordinate for sorting (by evaluation cross product of image orientation vectors). The resulting frame (positions) are grouped into a "logical frame" if they are not further away than 1% of slice thickness. I am not sure whether this is a valid approach or not.
Right now the overlap checking compares 8 pixels at once through a binary AND operation. This works well for images where rows * cols MOD 8 is 0, i.e. dividable by 8 since then it is easy to access each frame separately. When this is not the case, frames need to be unpacked from 1 bit binary DICOM into 8 bit pixel cells (one byte per bit) which takes significant time. It should be evaluated whether the byte-wise comparison method also works for rows * cols % 8!=0 cases, or, at least with some extra code.

Other remaining work:

Bonus: NRRD export w/o ITK: Looking at the simplicity of the NRRD files, I could imagine a direct export from DICOM to NRRD without the ITK detour would also be significantly faster. But maybe we can see whether performance and memory is already good enough right now. For instance, writing the BWH Total Segmentator example to a merged ITK image only(?) takes around 5 seconds right now. This could also go to DCMTK (as the OverlapUtil class probably will) as a little extra tool.

fedorov · 2023-11-14T22:53:23Z

@michaelonken I built the branch, and it appears that the output files that are created by segimage2itkimage are still numbered starting from 0. Is there a reason for this? I think there is benefit to maintaining numbering conventions for compatibility with other tools. Specifically, I ran across this because Slicer plugin expected 0-based numbering. I can fix that, but I wonder if there are other tools/workflows that can get broken because of this change.

michaelonken · 2023-11-15T08:03:01Z

@michaelonken I built the branch, and it appears that the output files that are created by segimage2itkimage are still numbered starting from 0.

Hm, I could not confirm this on my system, e.g. see the following log:

/dcmqi-build/bin/segimage2itkimage --inputDICOM ~/data/dcm/SEG/bwh_NLST_207114_TotalSegmentator/seg/segmentation.dcm --outputDirectory /tmp

dcmqi repository URL: git@github.com:QIICR/dcmqi.git revision: c1f09ef tag: latest-89-gc1f09ef
Row direction: 1 0 0
Col direction: 0 -1 0
Z direction: 0 0 -1
Total frames: 9654
Total frames with unique IPP: 455
Total overlapping frames: 455
Origin: [-179.1, 185.217, -31.755]
Slice extent: 329.15
Slice spacing: 0.725002
Will not merge segments: Splitting segments into 77 groups
Writing itk image to /tmp/1.nrrd ... done
Writing itk image to /tmp/2.nrrd ... done

Also using --mergedSegements starts exports with:
Writing itk image to /tmp/1.nrrd ... done
Listing /tmp also shows that there is no file /tmp/0.nrrd and the lowest number found is /tmp/1.nrrd.

Maybe I am overlooking something, or have you maybe used an old version that has been in your PATH or so?

fedorov · 2023-11-15T17:37:50Z

@michaelonken I am very sorry - I messed it up. Indeed, I was on a stale branch. I thought I updated, but I was multitasking, and apparently I did something wrong. Output file numbering works as expected! I am fixing few items in the Slicer QuantitativeReporting extension in this PR that I discovered in the process of testing.

fedorov · 2023-11-15T19:28:05Z

@michaelonken I did testing with various samples from IDC, and it works great!

I am very tempted to just merge it and get this out into Slicer. We can address refinements in subsequent work. What do you think?

It should be evaluated whether the byte-wise comparison method also works for rows * cols % 8!=0 cases, or, at least with some extra code.

Indeed, it will require some extra code, since AND is applied to frames before unpacking. Such frame sizes are not common. For the record, I used the query below to search IDC for such segmentations:

SELECT
  collection_id,
  PatientID,
  StudyInstanceUID,
  SeriesInstanceUID,
  `Rows`,
  `Columns`,
  ARRAY_LENGTH(SegmentSequence) AS numberOfSegments
FROM
  `bigquery-public-data.idc_current.dicom_all`
WHERE
  Modality="SEG"
  AND MOD(`Rows`*`Columns`, 8) <> 0
ORDER BY
  Collection_id desc

and only three collections contain such samples, all of them containing just single segment. So I do not think it makes sense to spend too much time on this, or delay integration of this PR.

michaelonken · 2023-11-16T07:51:13Z

Thanks for testing @fedorov ! Nice SQL query :-)

For the MOD 8 issue: I agree, let's skip it for now. I will look into it if there is some time left. I thought about it and I think the bit-wise comparison will also work for the other frame sizes as long as the "margin" bytes at the beginning and at the end of the frame, which may contain overlapping bits from other frames, are masked out correctly first. This is a bit intricate so I don't want to implement this quickly without proper unit tests.

The one remaining thing that we might to guard against are oblique slices. If I understand correctly, in those Series the position (Image Position Patient) does not only vary in one (e.g. z) coordinate between slices, but also in the other two coordinates (x,y). The overlap detection code right now superimposes frames that are on the same z coordinate, without "aligning" them on their x,y coordinate. This way pixels are compared that might have totally different positions in 3D space, eventually leading to broken merges or non-merges of segments. The right way would be to accommodate for the x,y shift before doing the comparison. Probably it is just some "simple" matrix operation(s) but nothing I could shake off my sleeve. As a workaround, one might try to detect oblique slicing and then fallback to the non-merging operation.

What do you think?

fedorov · 2023-11-16T18:50:16Z

@michaelonken I spent some time looking at the code in question, which as I understand is around here: https://github.com/fedorov/dcmqi/blob/merged-segments/libsrc/OverlapUtil.cpp#L736-L761.

I think this logic could be significantly simplified. Instead of trying to detect relevantCoordinate, we can use the existing FrameSorter class to order frames by their position along the ImageOrientationPatient vector, we already have the function for this: https://github.com/fedorov/dcmqi/blob/merged-segments/include/dcmqi/framesorter.h#L230. Given that ordering, I think we should use the difference between the values of dist here https://github.com/fedorov/dcmqi/blob/merged-segments/include/dcmqi/framesorter.h#L268 to detect overlapping frames, applying the same threshold logic. I am pretty sure this should be robust to any orientation.

michaelonken · 2023-11-16T19:12:40Z

@fedorov I can check this in detail Tuesday next week. From just taking a quick look I am not sure whether the frame sorting code you referenced takes into account the x,y coordinate shifting that I mentioned. It's clear, also for oblique cases, how to identify the relevant coordinate (using Image Orientation Patient as the framesorter code and the OverlapUtil code does).

The problem I see is in the shifting of the other two coordinates, so that if two frames are on the same "plane" (e.g. z coordinate, "relevant coordinate"), we cannot simply compare value for "pixel" 1,1 from frame A to the 1,1 from frame B since Frame B might start 2 cm more to the right (e.g. x direction) and 1 cm more in y-direction. Here is a very sophisticated picture to demonstrate my bad explanation :-)

fedorov · 2023-11-16T19:30:43Z

Michael, I do not think this would be the case for any volumetric acquisition. In oblique scans, the frames are not shifted when observed along the scan direction. What you show would be the case if one projected oblique acquisition slices to a plane.

michaelonken · 2023-11-16T19:35:43Z

Ah, thank you. Sorry for the noise then. And we can always assume that we don't have such kind of shifts in any acquisitions we handle?

I think then we are safe to go with the current code. We could simplify by using the existing framesorter code (that I just didn't know) but maybe this can be a second step a bit later if you want to merge.

P.S: Note that Frame A and Frame B are usually parts of two different segments, not of a real "acquisition" of a modality, so a segmentation creator could in theory at least place frames in space where he wants to; this is what I am afraid of.

fedorov · 2023-11-16T20:01:54Z

P.S: Note that Frame A and Frame B are usually parts of two different segments, not of a real "acquisition" of a modality, so a segmentation creator could in theory at least place frames in space where he wants to; this is what I am afraid of.

This may be a valid scenario in theory, but it would be crazy, and in any case it is probably not handled in dcmqi in other places. I am not sure. I have never seen such cases. Would be nice to add an error check for this or something like this!

I will go ahead merging this, to make this functionality easier accessible to the users, and make it easier to test and identify deficiencies!

There are two very important improvements in this updated version: 1. Performance of conversion is improved perhaps by at least an order of magnitude, cutting down the conversion for large DICOM SEG (such as TotalSegmentator results) from minutes to seconds. 2. It is now possible to save non-overlapping segments into a single file instead of previous convention of saving each segment as a separate file. See details in QIICR/dcmqi#464. Note that to use this feature you will need to use the --merge flag (there may be changes needed to the MITK dcmqi wrapper - I am sure you will test this - but there are no changes to the JSON schema)

Summary: Upgrade DCMQI to 1.3.0 There are two very important improvements in this updated version: 1. Performance of conversion is improved perhaps by at least an order of magnitude, cutting down the conversion for large DICOM SEG (such as TotalSegmentator results) from minutes to seconds. 2. It is now possible to save non-overlapping segments into a single file instead of previous convention of saving each segment as a separate file. See details in QIICR/dcmqi#464. Note that to use this feature you will need to use the --merge flag (there may be changes needed to the MITK dcmqi wrapper - I am sure you will test this - but there are no changes to the JSON schema) Update dcmqi to 1.3.1 Bump to DCMQI v1.3.2 Adapt to new DCMQI interface + fixed T30286 Fixed error in label generation when loading multiple groups from DICOM Seg Fixed error in assigning the label name when DCM Seg is loaded Added support for SegmentOverlap tag. If segment overlap == NO then all segements are added as label of one group. In all other cases each segment is added as new group (old behavior). + fixed memory leak Test Plan: code review, test to load and save segs as/from DCM Seg. Segmentations of a group in MITK should stay now a group also after loading. Loading old/other DCM seg results in a group per segment/label. Reviewers: O1 MITK Reviewer Group I, kislinsk Reviewed By: O1 MITK Reviewer Group I, kislinsk Subscribers: kislinsk Maniphest Tasks: T30286 Differential Revision: https://phabricator.mitk.org/D966

fedorov added 2 commits January 25, 2023 11:10

ENH: write all segments to the same file

52de374

ENH: adding parameterization for merged segments

65b9418

WIP towards QIICR#462

fedorov added 4 commits January 27, 2023 10:50

BUG: add segments to existing segmentation

ae1cab1

For whatever reason, a new list was created for each segment

STYLE: remove blanks per codefactor

980ccd3

ENH: fix tests

0b6e726

BUG: fix overwrite of segment metadata

cbad6aa

fix metadata init with mergedSegments on

2daf29c

fedorov mentioned this pull request Feb 20, 2023

allow import of multiple segments QIICR/QuantitativeReporting#270

Merged

fedorov and others added 13 commits April 7, 2023 13:08

WIP: use ImportImageFilter + iterators to copy frame

0a4f0a1

perhaps this is faster than iterating over the pixels in a loop?

WIP: Fully integrated code producing first results.

d9dc34f

The code no produces first results - NRRD file(s) for merged segments, though they still need to be checked for correctness. Also on some segmentations the code still seems to crash (to be fixed).

Better error handling.

57cd7ef

Ensure that errors are passed up the hierarchy. This esp. fixes handling of files with non-consecutive segment numbers (such as an old verison of the brain atlas that uses weird segment numbers).

Error handling.

921b947

Merge latest code from 'master'.

3c3df73

This also make sure that the updated workflow actions are used.

Merge branch 'master' into merged-segments

1f136ca

Fix accidentially commited typo.

d9fce9d

Cleanups, robustness, documentation.

8c32931

Refactored converter classes.

5761fc5

Since ITK to DICOM and DICOM to ITK conversion do not have much in common, the original class ImageSEGConverter is now splitted into Itk2DicomConverter and Dicom2ItkConverter.

Merge branch 'master' into merged-segments

52a8e92

michaelonken added 3 commits November 7, 2023 16:43

Fixed tests and bug, minor changes.

d677992

Fixed tests caused by bug not exporting last frame of a segment. The underlying API has been changed so usage error is less likely to occur. Simplified includes, better error messages, documentation.

Try to fix Linux docker build.

137f309

Bumping node version -- but not too high since at some version node requires higher GLIBC version. Higher verisons than 12.19.1 may work, though, earlier versions did not work.

michaelonken added 6 commits November 9, 2023 11:49

Try to fix compile error on Windows.

9ee7d5b

Another fix for windows build errors.

0a070f2

Replace use of deprecated jsoncpp class.

35e761f

Fix bug when creating JSON output for merges.

c1f09ef

Fixed bug when creating the JSON for merged segments where more than one file was produced. Instead of then splitting segments into different entries in the outer array of the JSON, the segments have been written to the inner array instead.

Added test for merging segments into NRRD files.

25d8165

Added test that writes 3 segments, 2 of them non-overlapping. That means that two NRRD files will be produced, one with 2 segments, the other with only a single one.

fedorov assigned pieper and unassigned pieper Nov 14, 2023

fedorov requested a review from pieper November 14, 2023 17:45

fedorov marked this pull request as ready for review November 16, 2023 21:24

fedorov merged commit a67c6c7 into QIICR:master Nov 16, 2023
5 of 6 checks passed

fedorov mentioned this pull request Dec 5, 2023

Upgrade DCMQI to 1.3.1 - performance improvements and merged segments support MITK/MITK#279

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add support for saving segments in a single file #464

Add support for saving segments in a single file #464

fedorov commented Jan 25, 2023

fedorov commented Jan 25, 2023 •

edited

Loading

fedorov commented Feb 6, 2023

fedorov commented Feb 6, 2023

fedorov commented Nov 3, 2023

michaelonken commented Nov 3, 2023

fedorov commented Nov 6, 2023 •

edited

Loading

michaelonken commented Nov 10, 2023

fedorov commented Nov 14, 2023

michaelonken commented Nov 15, 2023

fedorov commented Nov 15, 2023

fedorov commented Nov 15, 2023

michaelonken commented Nov 16, 2023

fedorov commented Nov 16, 2023

michaelonken commented Nov 16, 2023 •

edited

Loading

fedorov commented Nov 16, 2023

michaelonken commented Nov 16, 2023 •

edited

Loading

fedorov commented Nov 16, 2023

Add support for saving segments in a single file #464

Add support for saving segments in a single file #464

Conversation

fedorov commented Jan 25, 2023

fedorov commented Jan 25, 2023 • edited Loading

fedorov commented Feb 6, 2023

fedorov commented Feb 6, 2023

fedorov commented Nov 3, 2023

michaelonken commented Nov 3, 2023

fedorov commented Nov 6, 2023 • edited Loading

michaelonken commented Nov 10, 2023

fedorov commented Nov 14, 2023

michaelonken commented Nov 15, 2023

fedorov commented Nov 15, 2023

fedorov commented Nov 15, 2023

michaelonken commented Nov 16, 2023

fedorov commented Nov 16, 2023

michaelonken commented Nov 16, 2023 • edited Loading

fedorov commented Nov 16, 2023

michaelonken commented Nov 16, 2023 • edited Loading

fedorov commented Nov 16, 2023

fedorov commented Jan 25, 2023 •

edited

Loading

fedorov commented Nov 6, 2023 •

edited

Loading

michaelonken commented Nov 16, 2023 •

edited

Loading

michaelonken commented Nov 16, 2023 •

edited

Loading