DSMIL: Dual-stream multiple instance learning networks for tumor detection in Whole Slide Image

## Training

  $ python train_tcga.py --dataset=[DATASET_NAME]

You will need to adjust the --num_classes option if the dataset contains more than 2 positive classes, or if it contains only 1 positive class and 1 negative class (binary classifier). See the label conventions under "Feature vector csv files explanation" for details.

Useful arguments:

[--num_classes]         # Number of non-negative classes.
[--feats_size]          # Size of feature vector (depends on the CNN backbone).
[--thres]               # List of thresholds for the classes returned by the training function.
[--embedder_weights]    # Path to the embedder weights file (saved by SimCLR). Use 'ImageNet' if an ImageNet-pretrained embedder is used.
[--aggregator_weights]  # Path to the aggregator weights file.
[--bag_path]            # Path to a folder containing folders of patches.
[--patch_ext]           # File extension of patches.
[--map_path]            # Path of output attention maps.

Folder structures

Data is organized in two folders, WSI and datasets. The WSI folder contains the slide images, and the datasets folder contains the computed features.

root
|-- WSI
|   |-- DATASET_NAME
|   |   |-- CLASS_1
|   |   |   |-- SLIDE_1.svs
|   |   |   |-- ...
|   |   |-- CLASS_2
|   |   |   |-- SLIDE_1.svs
|   |   |   |-- ...

Once patch extraction is performed, a folder named "single" or "pyramid" will appear inside the DATASET_NAME folder.

root
|-- WSI
|   |-- DATASET_NAME
|   |   |-- single
|   |   |   |-- CLASS_1
|   |   |   |   |-- SLIDE_1
|   |   |   |   |   |-- PATCH_1.jpeg
|   |   |   |   |   |-- ...
|   |   |   |   |-- ...
|   |   |-- pyramid
|   |   |   |-- CLASS_1
|   |   |   |   |-- SLIDE_1
|   |   |   |   |   |-- PATCH_LOW_1
|   |   |   |   |   |   |-- PATCH_HIGH_1.jpeg
|   |   |   |   |   |   |-- ...
|   |   |   |   |   |-- ...
|   |   |   |   |   |-- PATCH_LOW_1.jpeg
|   |   |   |   |   |-- ...
|   |   |   |   |-- ...
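
The "single" layout above can be walked with a few lines of Python. This is a minimal sketch, not part of the repository: the function name `list_bags` is hypothetical, and it assumes one folder per class, one folder per slide, and patches stored directly inside the slide folder with a common file extension.

```python
from pathlib import Path


def list_bags(single_root, patch_ext="jpeg"):
    """Map (class_name, slide_name) to a sorted list of patch paths.

    Hypothetical helper: assumes the layout single/CLASS/SLIDE/PATCH.<ext>.
    """
    bags = {}
    root = Path(single_root)
    for class_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        for slide_dir in sorted(p for p in class_dir.iterdir() if p.is_dir()):
            # Only files with the requested extension count as patches.
            bags[(class_dir.name, slide_dir.name)] = sorted(
                slide_dir.glob(f"*.{patch_ext}")
            )
    return bags
```

Calling `list_bags("WSI/DATASET_NAME/single")` would then give one entry per slide, which is a quick way to check that extraction produced the expected number of patches. The "pyramid" layout nests high-magnification patches one level deeper and is not handled here.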

Once features are computed, a DATASET_NAME folder will appear inside the datasets folder.

root
|-- datasets
|   |-- DATASET_NAME
|   |   |-- CLASS_1
|   |   |   |-- SLIDE_1.csv
|   |   |   |-- ...
|   |   |-- CLASS_2
|   |   |   |-- SLIDE_1.csv
|   |   |   |-- ...
|   |   |-- CLASS_1.csv
|   |   |-- CLASS_2.csv
|   |   |-- DATASET_NAME.csv

Feature vector csv files explanation

For each bag, there is a .csv file in which each row contains the feature vector of one instance. The file is named "bagID.csv" and placed in a folder named "dataset-name/category/".
There is also a "dataset-name.csv" file with two columns: the first column contains the paths to all bagID.csv files, and the second column contains the bag labels.
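
The two kinds of .csv files described above can be read with the standard library alone. The sketch below is illustrative only: `read_index` and `read_bag` are hypothetical names, and the `has_header` flag is an assumption, since the text does not say whether the files start with a header row. Set it to False if yours do not.

```python
import csv


def read_index(index_csv, has_header=True):
    """Return (bag_csv_path, label) pairs from a "dataset-name.csv" file."""
    with open(index_csv, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    if has_header:  # assumption: first row holds column names
        rows = rows[1:]
    # Column 0 is the path to a bagID.csv file, column 1 is the bag label.
    return [(row[0], int(float(row[1]))) for row in rows]


def read_bag(bag_csv, has_header=True):
    """Return a bag as a list of feature vectors, one per instance (row)."""
    with open(bag_csv, newline="") as f:
        rows = [row for row in csv.reader(f) if row]
    if has_header:  # assumption: first row holds column names
        rows = rows[1:]
    return [[float(value) for value in row] for row in rows]
```

Each vector returned by `read_bag` should have the length passed as --feats_size, so comparing the two is a cheap sanity check before training.
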

Labels

For a binary classifier, use 1 for positive bags and 0 for negative bags. Use --num_classes=1 at training.
For a multi-class classifier (N positive classes and one optional negative class), use 0 to N-1 for the positive classes. If you have a negative class (bags that do not belong to any of the positive classes), use N as its label. Use --num_classes=N (N equals the number of positive classes) at training.
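
To make the label convention concrete, here is one way an integer bag label could be turned into a target vector of length --num_classes. This is a sketch under an assumption, not a description of what train_tcga.py does: it treats each positive class as one binary output and maps the negative class (label N) to an all-zero vector, which is one natural reading of "N equals the number of positive classes". The function name `encode_label` is hypothetical.

```python
def encode_label(label, num_classes):
    """Turn an integer bag label into a list of num_classes binary targets.

    Assumed encoding (not confirmed by the repository):
      - num_classes == 1: label 0 -> [0.0], label 1 -> [1.0]
      - num_classes == N: labels 0..N-1 -> one-hot, label N (negative) -> all zeros
    """
    if num_classes == 1:
        if label not in (0, 1):
            raise ValueError("binary labels must be 0 or 1")
        return [float(label)]
    if not 0 <= label <= num_classes:
        raise ValueError("label must be in 0..num_classes")
    target = [0.0] * num_classes
    if label < num_classes:  # positive class: switch on its own output
        target[label] = 1.0
    return target  # negative class (label == num_classes) stays all zeros
```

With N = 3, labels 0, 1 and 2 each switch on a single output, while label 3 leaves all outputs off.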

Citation

If you use the code or results in your research, please use the following BibTeX entry.

@inproceedings{li2021dual,
  title={Dual-stream multiple instance learning network for whole slide image classification with self-supervised contrastive learning},
  author={Li, Bin and Li, Yin and Eliceiri, Kevin W},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={14318--14328},
  year={2021}
}
