NIH Molecular Imaging Branch - Deep Learning Preprocessing Tools

This repository shares the preprocessing scripts we use at the NIH Molecular Imaging Branch. Depending on your raw data and labels, several preprocessing steps are needed before Deep Learning models can be trained. This repository focuses mainly on Computer Vision and Natural Language Processing tasks.

Contents

1. Image patch extractor for VOI files

The VOI file format is used to save medical imaging segmentations. Although not very common, it is still used by some applications, e.g. Brainmaker or NIH MIPAV. With this library, image patches and masks can be created from segmentations stored in VOI format.
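
To illustrate what "image patches and masks" means here, the following is a minimal sketch and not the code shipped in this repository. It assumes the VOI segmentation has already been rasterized into a binary NumPy mask with the same shape as the image; parsing the VOI file itself is not shown. The function name `extract_patch` and the `margin` parameter are hypothetical and chosen only for this example.

```python
import numpy as np


def extract_patch(image, mask, margin=0):
    """Crop image and mask to the mask's bounding box, padded by `margin` pixels.

    Hypothetical helper for illustration; assumes `mask` is a 2D boolean array
    that was rasterized from a segmentation beforehand.
    """
    rows, cols = np.nonzero(mask)
    if rows.size == 0:
        raise ValueError("mask is empty, nothing to crop")
    # Bounding box of the segmented region, clipped to the image borders.
    r0 = max(int(rows.min()) - margin, 0)
    r1 = min(int(rows.max()) + 1 + margin, mask.shape[0])
    c0 = max(int(cols.min()) - margin, 0)
    c1 = min(int(cols.max()) + 1 + margin, mask.shape[1])
    return image[r0:r1, c0:c1], mask[r0:r1, c0:c1]


# Synthetic 8x8 image with a 3x3 segmented region (made-up demo data).
image = np.arange(64, dtype=np.float32).reshape(8, 8)
mask = np.zeros((8, 8), dtype=bool)
mask[2:5, 3:6] = True

img_patch, mask_patch = extract_patch(image, mask, margin=1)
print(img_patch.shape, int(mask_patch.sum()))
```

Cropping image and mask with the same slice keeps them pixel-aligned, which is what a segmentation model expects from its training pairs.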

2. DOC/DOCX document converter

Microsoft Office switched from the old binary DOC format to the XML-based DOCX format as its default with Office 2007. DOCX is more suitable for Natural Language Processing because extracting strings and parsing text is straightforward. This script converts DOC files to DOCX files or vice versa and can be run directly from the command line.
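
As a rough illustration of why DOCX is convenient for text extraction (this is a sketch, not the converter script from this repository): a DOCX file is a ZIP archive whose main text lives in `word/document.xml`, so paragraphs can be read with the Python standard library alone. The helper names `docx_paragraphs` and `make_minimal_docx` are hypothetical, and the demo archive below contains only the one XML part needed for the example, so it is not a complete DOCX that Word would open.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml.
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_paragraphs(source):
    """Return the text of each paragraph in a DOCX (path or file-like object)."""
    with zipfile.ZipFile(source) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        # A paragraph's text is split across one or more <w:t> run elements.
        runs = [node.text or "" for node in para.iter(f"{{{W_NS}}}t")]
        paragraphs.append("".join(runs))
    return paragraphs


def make_minimal_docx(lines):
    """Build an in-memory archive holding only word/document.xml (demo input)."""
    body = "".join(f"<w:p><w:r><w:t>{line}</w:t></w:r></w:p>" for line in lines)
    xml = f'<w:document xmlns:w="{W_NS}"><w:body>{body}</w:body></w:document>'
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("word/document.xml", xml)
    buffer.seek(0)
    return buffer


demo = make_minimal_docx(["First finding.", "Second finding."])
print(docx_paragraphs(demo))
```

The legacy DOC format is a binary container and cannot be read this way, which is why converting DOC to DOCX first is a useful preprocessing step.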