This repository is for MediTables, a 200-image camera-captured dataset of medical reports with table annotations. Two table types are common in healthcare documents, which we refer to as T1 (conventional tables) and T2 (key-value pair tables). The dataset can be accessed here: https://zenodo.org/record/5048287#.YNzazBMzZhE
We also provide pre-trained models for localizing these tables.
For additional details, check out our paper "MediTables: A New Dataset and Deep Network for Multi-Category Table Localization in Medical Documents", accepted for oral presentation at the 14th IAPR International Workshop on Graphics Recognition (GREC 2021).
Sample images with table annotations from the MediTables dataset:
Comparison of our proposed model with baselines:
All the model code is self-contained in the ModifiedUnet directory; the baselines trained in the paper are described in the baselines directory. To run the code, first set up a virtual environment:
pip install virtualenv
virtualenv <environment name> --python=python3
source <environment name>/bin/activate
Then install all the required packages:
pip install -r requirements.txt
Once the packages are installed, download the dataset from the link above.
The downloaded data should be arranged in the following format:
.
+-- train
| +-- 1.jpg
| +-- 1.json
| +-- ..
| +-- ..
+-- val
| +-- 1.jpg
| +-- 1.json
| +-- ..
| +-- ..
+-- test
| +-- 1.jpg
| +-- 1.json
| +-- ..
| +-- ..
i.e., the image files and the corresponding annotation files from a split are to be placed in the same directory. For an example, see the "sample_data_folder" directory.
The download_data_split.py
script can be used to download one split of the dataset at a time, using the urls.txt file present in the dataset.
python ModifiedUnet/train.py --train_dir <path to directory containing training data > --val_dir <path to directory containing the validation data> --num_class <number of classes to train on>
Other parameters, such as batch_size and lr, can also be modified.
Once training is finished, the model checkpoints and TensorBoard logs will be saved in the outputs
directory.
python ModifiedUnet/inference.py --checkpoint_path <path to trained checkpoint> --infer_dir <path to inference images directory> --num_class <number of classes; default = 2>
Once inference is finished, the predicted JSONs will be saved in the infer_results
directory. The JSONs follow the COCO annotation format. Annotation format converters such as https://github.com/fcakyon/labelme2coco can be used to convert between formats.