
Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach (CVPR 2024)

Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal, James J. Little

This repository contains source code for our CVPR 2024 paper titled, Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach.

🎬 Getting Started

1️⃣ Requirements

We used Python 3.9.0 in our experiments; the list of required packages is available in the requirements.txt file. You can install them with pip install -r requirements.txt.
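A minimal sketch of the environment setup is shown below; the virtual-environment name vpgfss is an arbitrary choice, and this assumes Python 3.9 is already available as python3.9:

python3.9 -m venv vpgfss            # create a Python 3.9 virtual environment (name is arbitrary)
source vpgfss/bin/activate          # activate it
pip install -r requirements.txt     # install the packages listed in requirements.txt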

Setting up CUDA kernel for MSDeformAttn

After preparing the required environment, run the following commands to compile the CUDA kernel for MSDeformAttn:

cd VisualPromptGFSS/src/model/ops/
sh make.sh

2️⃣ Dataset

We used the versions of PASCAL and MS-COCO provided by DIaM. You can download the dataset from here.

The data folder should look like this:

data
├── coco
│   ├── annotations
│   ├── train
│   ├── train2014
│   ├── val
│   └── val2014
└── pascal
    ├── JPEGImages
    └── SegmentationClassAug

The train/val split

The train/val splits can be found in the directory src/lists/. We borrowed the lists from https://github.com/Jia-Research-Lab/PFENet.

3️⃣ Download pre-trained base models

Please download our pre-trained base models from this Google Drive link. Place the initmodel directory inside the src/ directory of this repo; it contains the pre-trained ResNet model. The directories coco and pascal contain the pre-trained base models for the different splits of coco-20i and pascal-5i.
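For reference, a possible layout after downloading is sketched below; placing the coco and pascal checkpoint folders alongside initmodel under src/ is an assumption, so adjust the paths in the config files if you keep them elsewhere:

src
├── initmodel   # pre-trained ResNet backbone
├── coco        # base models for the coco-20i splits (assumed location)
└── pascal      # base models for the pascal-5i splits (assumed location)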

🗺 Overview of the repo

Default configuration files can be found in config/. The directory src/lists/ contains the train/val splits for each dataset. All the code is provided in src/.

⚙ Training The Base

If you want to train the base models from scratch, please run the following:

python3 train_base.py --config=../config/pascal_split0_resnet50_base_m2former.yaml --arch=M2Former  # For pascal-20 split0 base class training
python3 train_base.py --config=../config/coco_split0_resnet50_base_m2former.yaml --arch=M2Former  # For coco-80 split0 base class training

Modify the config files accordingly for the split that you want to train.

🧪 Few-shot fine-tuning

For inductive fine-tuning, please modify coco_m2former.yaml or pascal_m2former.yaml (depending on the dataset you want to run inference on). In the config file, specify the split and number of shots you want to evaluate on, along with the pre-trained base model.

For transductive fine-tuning, please modify coco_m2former_transduction.yaml or pascal_m2former_transduction.yaml in a similar manner for the split and number of shots you want to evaluate on; an illustrative sketch of the kind of fields to edit follows below.
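The snippet below only illustrates the kind of settings to change; the key names (split, shot, ckpt_path) are assumptions for illustration, so check the provided YAML files for the exact keys used in this repo:

# Illustrative only: key names are assumptions, see the actual config for the real keys
split: 0          # which split to evaluate on
shot: 1           # number of support shots (e.g., 1 or 5)
ckpt_path: <path to the downloaded pre-trained base model>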

To run few-shot inference, first go to the src/ directory and execute any of the following commands:

python3 test_m2former.py --config ../config/pascal_m2former.yaml  --opts  pi_estimation_strategy self  n_runs 5 gpus [0]  # For pascal inductive inference
python3 test_m2former.py --config ../config/coco_m2former.yaml  --opts  pi_estimation_strategy self  n_runs 5 gpus [0]  # For coco inductive inference
python3 test_m2former.py --config ../config/pascal_m2former_transduction.yaml  --opts  pi_estimation_strategy self  n_runs 5 gpus [0]  # For pascal transductive inference
python3 test_m2former.py --config ../config/coco_m2former_transduction.yaml  --opts  pi_estimation_strategy self  n_runs 5 gpus [0]  # For coco transductive inference

🙏 Acknowledgments

We thank the authors of DIaM and Mask2Former, whose code inspired parts of ours.

📚 Citation

If you find this project useful, please consider citing:

@inproceedings{hossain2024visual,
  title={Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach},
  author={Hossain, Mir Rayat Imtiaz and Siam, Mennatullah and Sigal, Leonid and Little, James J},
  booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
  pages={23470--23480},
  year={2024}
}
