Mir Rayat Imtiaz Hossain, Mennatullah Siam, Leonid Sigal, James J. Little
This repository contains the source code for our CVPR 2024 paper, Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach.
We used Python 3.9.0 in our experiments. The list of required packages is available in the requirements.txt file; you can install them with:
pip install -r requirements.txt
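Optionally, you can do this inside an isolated virtual environment first. A minimal sketch (the environment name vpgfss is arbitrary, and python3 is assumed to point to a Python 3.9 interpreter):
python3 -m venv vpgfss
source vpgfss/bin/activate
pip install -r requirements.txt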
After preparing the required environment, run the following commands to compile the CUDA kernel for MSDeformAttn:
cd VisualPromptGFSS/src/model/ops/
sh make.sh
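To sanity-check the build, you can try importing the compiled extension. This assumes the ops directory follows the Mask2Former layout, where make.sh installs a module named MultiScaleDeformableAttention; the module name may differ in this repo:
python3 -c "import MultiScaleDeformableAttention; print('MSDeformAttn kernel OK')"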
We used the versions of PASCAL and MS-COCO provided by DIaM. You can download the datasets from here.
The data folder should look like this:
data
├── coco
│   ├── annotations
│   ├── train
│   ├── train2014
│   ├── val
│   └── val2014
└── pascal
    ├── JPEGImages
    └── SegmentationClassAug
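One way to arrive at this layout is with symlinks. A sketch, where the source paths are placeholders for wherever you extracted the downloads:
mkdir -p data/coco data/pascal
ln -s /path/to/coco/annotations data/coco/annotations   # placeholder source paths
ln -s /path/to/coco/train2014 data/coco/train2014
ln -s /path/to/coco/val2014 data/coco/val2014
ln -s /path/to/VOCdevkit/VOC2012/JPEGImages data/pascal/JPEGImages
ln -s /path/to/SegmentationClassAug data/pascal/SegmentationClassAug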
The train/val splits can be found in the directory src/lists/. We borrowed the lists from https://github.com/Jia-Research-Lab/PFENet.
Please download our pre-trained base models from this Google Drive link. Place the initmodel directory inside the src/ directory of this repo; it contains the pre-trained ResNet model. The coco and pascal directories contain pre-trained base models for the different splits of COCO-20i and PASCAL-5i.
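For example, after downloading the archive (its name here is a placeholder for whatever the Google Drive download is called):
unzip pretrained_models.zip   # placeholder archive name
mv initmodel VisualPromptGFSS/src/
The coco and pascal model directories can be placed similarly, wherever your config files point to.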
Default configuration files can be found in config/. The directory src/lists/ contains the train/val splits for each dataset. All the code is provided in src/.
If you want to train the base models from scratch, run the following:
python3 train_base.py --config=../config/pascal_split0_resnet50_base_m2former.yaml --arch=M2Former # For pascal-20 split0 base class training
python3 train_base.py --config=../config/coco_split0_resnet50_base_m2former.yaml --arch=M2Former # For coco-80 split0 base class training
Modify the config files accordingly for the split you want to train on; an example follows.
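Assuming the config files follow the same naming pattern across splits, base training on pascal split1 would look like:
python3 train_base.py --config=../config/pascal_split1_resnet50_base_m2former.yaml --arch=M2Former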
For inductive fine-tuning, please modify coco_m2former.yaml or pascal_m2former.yaml (depending on the dataset you want to run inference on). Specify in the config file the split and number of shots you want to evaluate on, along with the path to the pre-trained model.
For transductive fine-tuning, please modify coco_m2former_transduction.yaml or pascal_m2former_transduction.yaml in a similar manner for the split and number of shots you want to evaluate on.
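Alternatively, since the test script accepts --opts overrides (as in the inference commands below), these values may be settable from the command line. The key names split and shot here are assumptions based on DIaM-style configs and may differ in this repo:
python3 test_m2former.py --config ../config/pascal_m2former.yaml --opts split 0 shot 5 gpus [0]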
To run few-shot inference, first go to the src/ directory and execute any of the following commands:
python3 test_m2former.py --config ../config/pascal_m2former.yaml --opts pi_estimation_strategy self n_runs 5 gpus [0] # For pascal inductive inference
python3 test_m2former.py --config ../config/coco_m2former.yaml --opts pi_estimation_strategy self n_runs 5 gpus [0] # For coco inductive inference
python3 test_m2former.py --config ../config/pascal_m2former_transduction.yaml --opts pi_estimation_strategy self n_runs 5 gpus [0] # For pascal transductive inference
python3 test_m2former.py --config ../config/coco_m2former_transduction.yaml --opts pi_estimation_strategy self n_runs 5 gpus [0] # For coco transductive inference
We thank the authors of DIaM and Mask2Former, whose work inspired parts of our code.
If you find this project useful, please consider citing:
@inproceedings{hossain2024visual,
title={Visual Prompting for Generalized Few-shot Segmentation: A Multi-scale Approach},
author={Hossain, Mir Rayat Imtiaz and Siam, Mennatullah and Sigal, Leonid and Little, James J},
booktitle={Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition},
pages={23470--23480},
year={2024}
}