
Memory pressure and streaming received DICOM content to disk #39

Open · richard-viney opened this issue Jan 30, 2023 · 3 comments · May be fixed by #41
@richard-viney (Contributor)
I've run into situations where large DICOMs being C-STORE'd cause the container running Node.js + dcmjs-dimse to run out of memory. This appears to be because the whole DICOM dataset is stored in memory, which is understandable because it makes the implementation and interface simpler, but it also means that it's very easy to crash the server running dcmjs-dimse simply by sending it a large enough DICOM (or, likely, by sending several medium-sized DICOMs in parallel).

One solution would be to stream received data straight to a DICOM file on disk, thus avoiding OOM problems, but I'm not currently familiar enough with the DIMSE protocol to know how difficult this would be to implement. I do recall seeing some kind of streaming support go into dcmjs a while back, but I never worked with it directly myself and also don't know if it covered streaming writes.

A few related questions:

  • Does the SCP know at the beginning how big the DICOM content being C-STORE'd to it is going to be? If so, could we use this knowledge to put an upper bound on the size of DICOM content allowed to be received by the SCP?
  • Has there been any work done to address memory usage and scalability in dcmjs-dimse or dcmjs?
  • How amenable to streaming straight to disk is the DIMSE protocol? Is a DIMSE implementation, at least in theory, able to handle large datasets in a memory-constrained environment?
@PantelisGeorgiadis PantelisGeorgiadis self-assigned this Jan 30, 2023
@PantelisGeorgiadis PantelisGeorgiadis added the enhancement New feature or request label Jan 30, 2023
@PantelisGeorgiadis (Owner)
Hi Richard!
Thank you for reporting this!
I totally understand the situations you have run into. It is actually something that we have been discussing lately with @kinsonho, another good friend of dcmjs-dimse!
Currently, the received DIMSE fragment buffers are appended, within the _processPDataTf function, to a self-growing buffer that is parsed and returned to the user as a Dataset once the last-fragment flag is received.
We could change that to optionally pass the incoming fragment buffers to the user, giving the application the chance to perform any kind of streaming operation (e.g. write to disk, upload to cloud, etc.). Because the first fragment of a C-STORE operation might contain the full metadata object (plus part of the pixel data), a nice feature would be to also return a “pixel-less” Dataset to the user.
That is as far as we discussed it with @kinsonho, but no implementation has been attempted yet!
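
To make this concrete, here's a minimal sketch of what such an application-side hook might look like. The hook names (onStoreBegin, onStoreFragment) and their shape are invented purely for illustration; nothing like this exists in dcmjs-dimse today:

```js
// Hypothetical sketch only -- dcmjs-dimse does not expose these hooks.
// It illustrates the proposed change: instead of accumulating P-DATA-TF
// fragments in a self-growing buffer, hand each fragment to the
// application so it can stream it (here, straight to a file on disk).
const fs = require('fs');

class StreamingStoreScp /* would extend Scp */ {
  onStoreBegin() {
    // Destination path is arbitrary for the sketch.
    this.file = fs.createWriteStream('/tmp/incoming-store.dcm');
  }
  onStoreFragment(fragmentBuffer, isLast) {
    this.file.write(fragmentBuffer); // bounded memory: one fragment at a time
    if (isLast) this.file.end();
  }
}
```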

Now, regarding your questions (although they might have been answered by the above!):

  • Not really, unless you examine attributes such as rows, columns, bitsAllocated, etc. and do your own calculation to get an estimate (see the sketch after this list). However, most often you know what images are going to be sent from a particular device, so it is easier to configure limits per device than to determine them dynamically per association.
  • Unfortunately no (I’m talking about dcmjs-dimse), but now is a good chance for us to engage in a discussion that will lead us to address some of the memory usage and scalability issues.
  • It’s not so much the DIMSE implementation per se, but DICOM allows you to use multiple fragments per frame so that you can process smaller fragments each time and then recombine them into a single frame (which is what the library currently does, but in memory).
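
For the first point, here is a rough sketch of that calculation (assuming uncompressed pixel data and dcmjs-style attribute keywords; the values would have to be parsed out of the metadata first):

```js
// Rough upper-bound estimate of uncompressed pixel data size, computed
// from standard attributes. Attribute keywords follow dcmjs naming
// conventions (Rows, Columns, BitsAllocated, SamplesPerPixel,
// NumberOfFrames).
function estimatePixelDataBytes(ds) {
  const frames = Number(ds.NumberOfFrames || 1);
  const samples = ds.SamplesPerPixel || 1;
  return ds.Rows * ds.Columns * (ds.BitsAllocated / 8) * samples * frames;
}

// Example: a single-frame 512x512 16-bit CT slice
// estimatePixelDataBytes({ Rows: 512, Columns: 512, BitsAllocated: 16 })
// -> 524288 bytes (~0.5 MB); the result could be compared against a
//    configured per-device limit.
```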

@richard-viney (Contributor, Author)
Thanks for that and for the recent PR; we're keen to look at this, as it should help a lot with memory pressure and stability.

To clarify the last point about fragments: if a dataset did not use fragments for its image data, is it still possible for a piece of data of arbitrarily large size to be received?

@PantelisGeorgiadis (Owner)
Hi @richard-viney,
I think you are mixing up pixel data fragments with PDV (Presentation Data Value) fragments, which, as far as I can tell, are unrelated. During transmission, a dataset is split into chunks (whether or not it has fragmented pixel data, i.e. an encapsulated transfer syntax) with a maximum length of the negotiated max PDU length. These chunks are placed in PDV structures and sent across the wire. These are the chunks we are taking advantage of to perform streaming operations, not the pixel data fragments. Therefore, dataset streaming will always be performed at a granularity decided by the negotiated max PDU length. Is that clear now?
