Processing and tiling of histological slides

Repository setup

We suggest using Miniconda as the package manager for your system. Create and activate the conda environment with:

conda env create -f environment.yml
conda activate wsi-pre2

Coding best practices

The main branch is protected. Please check out your own development branch and open a pull request to merge your changes.

This repository uses pre-commit hooks and GitHub Actions for code quality, inspired by the Lightning Hydra Template.

The pre-commit library is installed with the conda environment. Next, set up pre-commit, which uses the hooks from .pre-commit-config.yaml:

pre-commit install

After that, your code will be automatically reformatted on every new commit.

To reformat all files in the project, run:

pre-commit run -a

To update the hook versions in .pre-commit-config.yaml, run:

pre-commit autoupdate

Run tiling with provided config files

The main script is tile_generator.py. The configs provided in the configs/ folder generate tables of patch locations with the corresponding pixel sizes. The tables are stored as one .csv file per slide in the configured output_path. Multiprocessing is enabled by default, so that multiple slides can be processed simultaneously.
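Once a run has finished, the per-slide tables can be summarized with a few lines of Python. The sketch below is an illustration only: the function name `count_patches` is hypothetical, and it assumes that every .csv file below output_path starts with one header row and then holds one row per patch. Any other .csv files in that directory would be counted as well.

```python
import csv
from pathlib import Path


def count_patches(output_path):
    """Return {csv filename: number of data rows} for every .csv below output_path."""
    counts = {}
    for table in sorted(Path(output_path).rglob("*.csv")):
        with table.open(newline="") as handle:
            rows = list(csv.reader(handle))
        # Assumption: the first row is a header, every further row is one patch.
        counts[table.name] = max(0, len(rows) - 1)
    return counts
```

Calling `count_patches` on the configured output_path returns a dictionary that maps each table's filename to its row count.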

As an example, the tiling of TCGA slides with patch_size=256 can be started as follows:

python tile_generator.py --config configs/tcga-crc_256.json
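To try a different setting without editing a provided config in place, one option is to derive a copy with a single entry overridden. The sketch below assumes only that the config is a JSON file of (possibly nested) dictionaries. The helper names `override_entry` and `derive_config` are hypothetical and not part of the repository.

```python
import json
from pathlib import Path


def override_entry(config, key, value):
    """Set `key` to `value` wherever it occurs in a nested dict; return the hit count."""
    hits = 0
    for name, entry in config.items():
        if name == key:
            config[name] = value
            hits += 1
        elif isinstance(entry, dict):
            # Descend into nested groups of parameters.
            hits += override_entry(entry, key, value)
    return hits


def derive_config(src, dst, key, value):
    """Copy the JSON config at `src` to `dst` with one entry overridden."""
    config = json.loads(Path(src).read_text())
    if override_entry(config, key, value) == 0:
        raise KeyError(f"{key!r} not found in {src}")
    Path(dst).write_text(json.dumps(config, indent=4))
    return config
```

For instance, `derive_config("configs/tcga-crc_256.json", "configs/my_config.json", "tissue_coverage", 0.9)` would write a copy (the target filename is made up) that can then be passed to tile_generator.py via --config.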

Config parameters

The table shows descriptions for the most important config parameters:

Dictionary Entry	Description
check_resolution	Perform a resolution check of all slides before extracting patches
use_tissue_detection	Toggle the activation of tissue detection
remove_top_border	Useful for Camelyon slides. Default is false
save_patches	Whether to store the patch images. Older pipelines stored patches; in this project the default is false
zip_patches	Experimental option to test whether zipping the patch image directories increases transfer speeds. Default is false
tissue_coverage	Threshold [0,1] for how much tissue coverage is necessary, default is 0.8
processing_level	Downscaling level used by OpenSlide. A lower level increases precision but takes more time. Default is 3
blocked_threads	Number of threads that won't be used by the program
patches_per_tile	Number of patches used for lower resolution operations like tissue detection
overlap	Value [0,1] to set the overlap between neighbouring unannotated patches
annotation_overlap	Value [0,1] to set the overlap between neighbouring annotated patches
patch_size	Output size in pixels of the square patches
calibration
use_non_pixel_lengths	Activate calibration and use micrometers instead of pixels
patch_size_microns	Specify the patch size in micrometers. At 0.25 $\mu\text{m}$ / pixel, 64 $\mu\text{m}$ equal 256 pixels
resize	Whether to resize the patches in micrometers to the given patch_size
dataset	Provide name for the dataset
slides_dir	Directory where the different slides and subdirs are located
slideinfo_file	Provide a .csv file with filenames and labels
annotation_dir	Directory where the annotations are located
annotation_file_format	File format of the input annotations ("xml","geojson")
output_path	Output directory where the resulting files will be stored
skip_unlabeled_slides	Boolean to skip slides without an annotation file
save_annotated_only	Boolean to only save annotated patches
output_format	Image output format. Either "jpeg" or "png"
metadata_format	Format in which slide metadata is stored. Default is "csv"
write_slideinfo	Write information about the processed slide
show_mode	Boolean to enable plotting of some intermediate results/visualizations
label_dict	Structure to set up the operator and the threshold for checking the coverage of a certain class
type	Operator type ["==", ">=", "<="]
threshold	Coverage threshold for the individual class
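A few of the entries above are easier to read with numbers attached. The sketch below is one plausible reading of patch_size_microns, overlap, and the label_dict operator and threshold. It is not taken from tile_generator.py, and all function names are hypothetical.

```python
import operator

# Operator strings listed for the label_dict "type" entry in the table above.
OPERATORS = {"==": operator.eq, ">=": operator.ge, "<=": operator.le}


def microns_to_pixels(patch_size_microns, microns_per_pixel):
    """Convert a physical patch size to a pixel count at the given resolution."""
    return round(patch_size_microns / microns_per_pixel)


def patch_origins(axis_length, patch_size, overlap):
    """Start offsets of patches along one axis for a fractional overlap."""
    # Assumption: overlap is the fraction of a patch shared with its neighbour.
    stride = max(1, int(patch_size * (1 - overlap)))
    return list(range(0, axis_length - patch_size + 1, stride))


def coverage_passes(coverage, op_type, threshold):
    """Compare a class coverage fraction to its threshold with the configured operator."""
    return OPERATORS[op_type](coverage, threshold)


print(microns_to_pixels(64, 0.25))       # 256
print(patch_origins(1024, 256, 0.5))     # [0, 128, 256, 384, 512, 640, 768]
print(coverage_passes(0.85, ">=", 0.8))  # True
```

Under this reading, an overlap of 0.5 halves the stride between neighbouring patches, so each patch shares half of its width with the next one.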

DBO-DKFZ/wsi_preprocessing-crc