
Get dcm_write() to work. Switch to using Dict for handling data. #22

Merged
merged 17 commits on Mar 3, 2018

Conversation

notZaki
Member

@notZaki notZaki commented Jan 29, 2018

This was meant to be a simple fix to get dcm_write() to work, but it grew into a much larger set of changes along the way.
The good news is dcm_write() now works, but breaking changes were made along the way, so old code using DICOM.jl will have to be updated.

Summary of Changes

  • dcm_write() now works - on the current test cases at least. It also accepts a path to a file now, instead of requiring an IOStream.
  • dcm_parse() now works on a larger variety of DICOM files. It can read files without headers (Ability to read DICOM files without headers #5), parse signed data (Properly support signed types in pixeldata_parse #3), and handle data with multiple frames. It might also handle private creators (Handle private creator data elements #6) - I'm not sure about that, but it does let the user supply a VR when lookup_vr() fails, or skip those elements entirely. It can't handle big-endian files; perhaps the changes from PR Handle big-endian data #18 could be incorporated eventually.
  • Breaking change: The DcmElt composite type is removed. dcm_parse now returns a dictionary, so we no longer need lookup(d, tag) and can just use d[tag]. The double indexing issue (fix Arrays for 0.5 #14) is also gone. I think this leads to cleaner code, e.g. to get the number of rows we can now do dcm[(0x0028,0x0010)] instead of lookup(dcm,(0x0028,0x0010)).data[1][1] (see the sketch after this list).
  • runtests.jl now has more tests (tests #1), using sample data from http://www.barre.nom.fr/medical/samples/ including cases with signed data, missing headers, multiple frames, and retired elements. The tests no longer create temporary directories, and instead create a folder in the Pkg dir (ignored by git). This was to make it easier to find/see the results of the tests, such as the written/saved dicom files.
  • added a simple readme.md to get users started (addresses a comment in tests #1), though I started to ramble a bit in the latter portion about VR handling. Some examples made while testing the new changes are at https://gist.github.com/notZaki/d89e6cf6bf44886cddbd2717b1749bf9 and might be useful - I'll add examples using dcm_write() eventually.
  • updated travis to use Julia 0.6 and nightly, and to use code coverage - I'm not sure if the latter will work through a PR. The tests on nightly fail for now.
  • added some general comments, two new entries to dcm_dict.jl, and type annotations in some functions
  • updated license
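
For illustration, here is a minimal sketch of the new Dict-based workflow described above. The file names are placeholders, the tags shown are Rows/Columns, and the argument order of dcm_write() (output path first) is my reading of the new API, so double-check against the readme:

```julia
using DICOM

# Parse a DICOM file straight from a path; the result is a dictionary
# keyed by (group, element) tag tuples.
dcm = dcm_parse("sample.dcm")          # "sample.dcm" is a placeholder path

# Elements are indexed directly; no lookup()/DcmElt unwrapping is needed.
nrows = dcm[(0x0028, 0x0010)]          # Rows
ncols = dcm[(0x0028, 0x0011)]          # Columns

# Write the (possibly modified) dictionary back out; a file path is accepted.
dcm_write("output.dcm", dcm)
```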

The performance of the package is about the same as before. Results from BenchmarkTools are in the table below (bold marks the better result); a sketch of the benchmarking call follows the table.

| Test | Master | PR |
| --- | --- | --- |
| Memory (KiB) | 342 | **335** |
| Allocs. | 2113 | **1754** |
| Median Time (μs) | **528** | 567 |
| Samples | **8155** | 7948 |
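
For context, numbers like these come from BenchmarkTools; a minimal sketch of the kind of call used (the file path is a placeholder, not necessarily the file benchmarked above):

```julia
using BenchmarkTools, DICOM

# Benchmark parsing a sample file; the memory, allocation, median time,
# and sample counts in the table above come from a report like this one.
trial = @benchmark dcm_parse("sample.dcm")   # "sample.dcm" is a placeholder
display(trial)
```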

@notZaki
Member Author

notZaki commented Jan 30, 2018

Update: Tests pass on nightly now. For some reason, some elements that were Strings in v0.6 became SubStrings in v0.7, and this made string_write() unhappy. The fix was to add a method that accepts SubStrings (see the sketch below).

There's still a swarm of deprecation warnings on nightly, which can be resolved later when v0.7 releases.
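
For illustration only, a sketch of the kind of method addition described above; the real internal signature of string_write() in DICOM.jl may differ:

```julia
# Hypothetical sketch. Given an existing String-based method such as
#     string_write(s::String, maxlen) = ...
# the SubString inputs seen on v0.7 can simply be converted and forwarded:
string_write(s::SubString, maxlen) = string_write(String(s), maxlen)
```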

@ihnorton ihnorton self-requested a review February 14, 2018 02:37
@ihnorton
Collaborator

Took a quick look through - looks good to me. You have commit access now, so merge away when ready.

@notZaki if you want to adopt/maintain this package please do so! You clearly have a good sense of what is going on. Feel free to ping on issues/PRs and I'll do my best to help out, if needed. But this package has been orphaned for a while so would be great to see someone run with it.

@notZaki
Member Author

notZaki commented Mar 3, 2018

I postponed the merge because the website hosting the sample data was down, so all tests were failing. Maybe the test data ought to be moved into the repo.
Anyway, that seems to be resolved now, so I'll merge.

@ihnorton, is code coverage (with codecov) enabled for this repository? It is configured in .travis.yml, but I think the owner has to enable it.

@notZaki notZaki merged commit 1ed787c into JuliaHealth:master Mar 3, 2018
@ihnorton
Collaborator

ihnorton commented Mar 3, 2018

It's enabled, and it seems to be showing something since you opened this PR: https://codecov.io/gh/JuliaIO/DICOM.jl

@notZaki notZaki deleted the Switch2Dict branch March 3, 2018 03:07
@ihnorton
Collaborator

ihnorton commented Mar 8, 2018

FWIW, one comment about data: my personal rule of thumb is to check in "small" test data, to make baseline regression testing as simple as possible, but to avoid checking in anything over, say, 50 KB. The reason is that image data tends not to diff well, even though git has improved its handling of binary files, so checked-in data tends to bloat the repo history with objects that can only practically be removed by rewriting history. For anything larger, git-lfs is potentially a good solution (there are many other ways, of course, but git-lfs seems the easiest to integrate and reuses existing knowledge).

@notZaki
Member Author

notZaki commented Mar 9, 2018

Most of the test DICOM files are ~500 KB.
I think that's small by modern standards, but it's still a road to bloat-town, because once a binary file is added to the repo it can never truly be removed without rewriting history.

I considered making a new repo solely for the test data, but the tests are working fine for now, so I'll wait until there are new hiccups.
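
As a sketch of what the separate-data-repo idea could look like in runtests.jl (the URL and helper name below are hypothetical, not an actual resource):

```julia
# Hypothetical sketch: fetch test files from a dedicated data location on
# demand and cache them locally, so the tests keep working even when the
# original website is down.
const TESTDATA_URL = "https://example.com/DICOM-test-data"   # placeholder URL
const TESTDATA_DIR = joinpath(@__DIR__, "testdata")

function fetch_testfile(name)
    isdir(TESTDATA_DIR) || mkpath(TESTDATA_DIR)
    dest = joinpath(TESTDATA_DIR, name)
    isfile(dest) || download(string(TESTDATA_URL, "/", name), dest)
    return dest
end
```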
