
Issue with sys.argv #1

Open
creativedoctor opened this issue Apr 16, 2022 · 7 comments
Labels: bug (Something isn't working), enhancement (New feature or request)

Comments

@creativedoctor

Hello there. Thanks for putting this out there first of all.
I am attempting to set this up at my workplace (without Docker), but I have been having issues with the sys.argv calls.
The first is right at the beginning of inference.py, on the GPU call (easily circumvented by replacing it with 'cpu'), but the problem recurs at line 45 of inference.py (and, I suspect, in the lines that follow). Running it in JupyterLab, I believe the cause is that my sys.argv has no entry at index 3 or above, while your script reads [3], [4], and [5] at some point.
Would you have any ideas on how to deal with this?
Thanks.
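For context on the failure mode above: a notebook kernel populates sys.argv with its own arguments, so indexing sys.argv[3] and beyond raises an IndexError. A minimal guard might look like the sketch below (the positional names here are hypothetical, not the actual ones in inference.py):

```python
import sys

def get_arg(index, default):
    # Return sys.argv[index] if present, otherwise a fallback value.
    # Useful when running a CLI-style script inside Jupyter, where
    # sys.argv holds the kernel's arguments rather than the script's.
    return sys.argv[index] if len(sys.argv) > index else default

device = get_arg(1, "cpu")          # hypothetical positional argument
subject_id = get_arg(3, "sub-001")  # hypothetical positional argument
```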

@ravnoor ravnoor reopened this Apr 18, 2022
@ravnoor
Contributor

ravnoor commented Apr 18, 2022

Hey João!

Thank you for taking the time to test this!

I've added a demo Jupyter notebook (app/inference.demo.ipynb) detailing the end-to-end analysis for FCD detection. This should help you get the detection up and running without fiddling with the sys.argv calls in inference.py. Make sure to pull the latest version of the main branch, though.

In the latest version, I've also clustered all the sys.argv calls towards the beginning of the script, so porting any future iterations to a notebook should be trivial. Thanks for pointing it out!
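As a rough illustration of that layout (the variable names below are assumptions for the sketch, not the actual ones in inference.py): with every sys.argv read hoisted to the top, a notebook port only needs to replace this one block with literal values.

```python
import sys

# Sketch: read all positional arguments once, up front, with fallbacks,
# so the rest of the script never touches sys.argv directly.
args = sys.argv[1:]
device      = args[0] if len(args) > 0 else "cuda0"     # hypothetical
subject_dir = args[1] if len(args) > 1 else "./data"    # hypothetical
subject_id  = args[2] if len(args) > 2 else "sub-001"   # hypothetical
```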

Also, I wouldn't recommend running the analysis using CPU (not sure if that's the case for you). Using a cheap GPU would easily net you a 10-20x speedup. For reference, the current notebook (for a single patient) takes 50 minutes to execute on a TITAN RTX.

Let me know how it goes. I'll leave the issue open for your feedback and questions.

Best,
Ravnoor

@creativedoctor
Author

creativedoctor commented Apr 18, 2022

Hi Ravnoor,

Thanks for your update. I have been experimenting with this since yesterday and eventually worked out the directory structure and got it running, only for it to crash when the process was killed for running out of memory in a Linux VM on my 6-core MacBook Pro (since it runs FreeSurfer, I assumed it could handle this). I am currently looking into cloud computing with a GPU for testing.

Also, I see you have Training [TODO] in your README file. Does that mean there will be a training component for others to use as well (e.g., site- or scanner-specific)?

@ravnoor
Contributor

ravnoor commented Apr 18, 2022

The documentation was clearly lacking, but I'm glad you got it to work! I have updated the README to indicate the expected directory structure. It's not BIDS-compliant, but that's probably a future upgrade.

I haven't profiled the RAM usage, but it could be the reason for the crash. You could try reducing the variables options['batch_size'] = 350000 and options['mini_batch_size'] = 2048, monitor RAM usage, and see if that helps.
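For intuition, batch_size bounds how many samples are held in memory at once, so lowering it trades throughput for peak RAM. An illustrative sketch of the idea (not deepFCD's actual prediction loop):

```python
# Predicting in chunks of `batch_size` keeps only one chunk's worth of
# inputs and outputs in memory at a time, instead of the whole dataset.
def predict_in_chunks(data, batch_size, predict_fn):
    out = []
    for start in range(0, len(data), batch_size):
        out.extend(predict_fn(data[start:start + batch_size]))
    return out

# Toy example: halve ten values, four at a time.
probs = predict_in_chunks(list(range(10)), batch_size=4,
                          predict_fn=lambda xs: [x * 0.5 for x in xs])
```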

Alternatively, you could use Google Colab to test the notebook with a GPU. It has a free tier with time restrictions and a Pro version that's reasonably priced.

Yes, we're planning to release a train.py for anyone to use on their own data. The patch-based data derived from the 9 sites in the Neurology article is already open-source. Essentially, anyone could use (a fraction of) their own data and/or our patch-based data to train a new model tuned to their specific site/scanner.

@creativedoctor
Author

I lowered batch_size as low as 5 (five) and mini_batch_size as low as 2 (two), and the script still gets killed on my MacBook. I understand it should take a long time, but tuned down that low, wouldn't you expect it to keep running?

NOTE: I'm still running the version from before you updated the other day. Could that be causing an issue you have since corrected?

@ravnoor
Contributor

ravnoor commented May 3, 2022

Unless it exits with an error, lowering the batch parameters should be fine.

The older versions should work just fine.

How much RAM and how many logical cores are allocated to the VM? Which Linux distribution (and version) are you using? I can try simulating your environment to replicate the issue.
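One quick way to confirm what the VM actually exposes is to query the OS directly. A Linux-only sketch (not part of deepFCD; it reads /proc/meminfo, which does not exist on macOS):

```python
import os

def vm_resources():
    # Report the logical core count and total RAM visible to this system.
    cores = os.cpu_count()
    mem_kb = None
    try:
        with open("/proc/meminfo") as f:
            for line in f:
                if line.startswith("MemTotal:"):
                    mem_kb = int(line.split()[1])  # value is in kB
                    break
    except OSError:
        pass  # non-Linux host: leave mem_kb as None
    return cores, mem_kb

cores, mem_kb = vm_resources()
print(cores, "logical cores,", mem_kb, "kB RAM")
```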

@creativedoctor
Author

I am running CentOS 8.4 in a VM with 4 processors and 10 GB RAM allocated, on a 2019 MacBook Pro with 6 cores and 16 GB RAM.

@ravnoor
Contributor

ravnoor commented May 4, 2022

I haven't tested this on CentOS, and I won't be able to spin up a working virtual CentOS installation to help diagnose your issue before the end of next week.

The easiest solution would be to use the Docker version. I can show you how to access the bash terminal inside the container without actually running inference.

The next best option would be an Ubuntu 18.04/20.04 LTS VM or bare-metal system. Those are the two versions tested to work with deepFCD, and I have access to such systems to help troubleshoot any issues.

@ravnoor ravnoor added bug Something isn't working enhancement New feature or request labels May 14, 2022