ColonINST-v1

Visualisation of the ColonINST

Dataset Information

ColonINST is a large-scale instruction tuning dataset designed for multimodal analysis in colonoscopy. It comprises 303,001 colonoscopy images aggregated from 19 publicly available source datasets. Using a semi-automated pipeline powered by GPT-4V, we generated 128,620 detailed medical captions. Finally, we organised the data into 450,724 visual dialogues that guide an AI model through four downstream tasks, i.e., image classification (CLS), referring expression generation (REG), referring expression comprehension (REC), and caption generation (CAP), all of which are critical for multimodal medical AI applications.

Dataset Meta Information

| Language | Task | File Format | Data Count | Data Type |
| --- | --- | --- | --- | --- |
| English | VQA | .json, .jpg, .png | 450,724 | image-text pair |

Dataset Information Statistics

Details of the multimodal instruction tuning dataset, ColonINST.

This figure shows: (a) The three sequential steps used to create the instruction tuning dataset for multimodal research. (b) Numbers of colonoscopy images designated for training, validation, and testing. (c) A three-level data taxonomy of categories. (d) A word cloud of the category distribution, in which the size of each name reflects its frequency. (e) The caption generation pipeline, which uses the VL prompting mode of GPT-4V. (f) Numbers of human-machine dialogues created for the four downstream tasks.

  • Colonoscopy images

    The following table shows the number of colonoscopy images designated for training, validation, and testing.

    | Split | Positive | Negative | Total |
    | --- | --- | --- | --- |
    | Train set | 74,407 | 106,570 | 180,977 |
    | Val set | 8,929 | 17,328 | 26,257 |
    | Test set | 45,284 | 50,483 | 95,767 |
    | Total | 128,620 | 174,381 | 303,001 |

  • Medical captions: We feed GPT-4V a custom prompt together with a hierarchical category prior. The model then generates detailed, professional medical descriptions of the colonoscopy images, which are intended to improve diagnostic clarity and specificity.

  • Instruction tuning pairs: The following table summarises the image-instruction pairs used for training, validation, and testing, broken down by the four tasks.

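    | Split | CLS | REG | REC | CAP | Total |
    | --- | --- | --- | --- | --- | --- |
    | Train | 74,407 | 54,237 | 54,237 | 74,407 | 257,288 |
    | Val | 8,929 | 4,874 | 4,874 | 8,929 | 27,606 |
    | Test | 45,284 | 37,631 | 37,631 | 45,284 | 165,830 |
    | Total | 128,620 | 96,742 | 96,742 | 128,620 | 450,724 |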
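
The figures in the two tables above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of the official tooling: the dictionaries simply copy the per-split counts from the tables, and the helper names (`split_totals`, `column_totals`, `grand_total`) are hypothetical.

```python
# Per-split counts copied from the two tables above.
IMAGES = {
    "train": {"positive": 74_407, "negative": 106_570},
    "val": {"positive": 8_929, "negative": 17_328},
    "test": {"positive": 45_284, "negative": 50_483},
}

DIALOGUES = {
    "train": {"CLS": 74_407, "REG": 54_237, "REC": 54_237, "CAP": 74_407},
    "val": {"CLS": 8_929, "REG": 4_874, "REC": 4_874, "CAP": 8_929},
    "test": {"CLS": 45_284, "REG": 37_631, "REC": 37_631, "CAP": 45_284},
}


def split_totals(table):
    """Sum each row, giving one total per split."""
    return {split: sum(row.values()) for split, row in table.items()}


def column_totals(table):
    """Sum each column across all splits."""
    totals = {}
    for row in table.values():
        for name, count in row.items():
            totals[name] = totals.get(name, 0) + count
    return totals


def grand_total(table):
    """Sum every cell in the table."""
    return sum(split_totals(table).values())


if __name__ == "__main__":
    print("images per split:", split_totals(IMAGES))
    print("images per label:", column_totals(IMAGES))
    print("dialogues per task:", column_totals(DIALOGUES))
    print("all images:", grand_total(IMAGES))
    print("all dialogues:", grand_total(DIALOGUES))
```

In these tables the CLS and CAP counts in every split equal the number of positive images in that split, and the REG and REC counts are equal to each other.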

Dataset Example

File Structure

├──cache
    ├──ColonINST
        ├──Json-file
            ├──train
                ├──ColonINST-train.json
            ├──val
                ├──ColonINST-val-cls.json
                ├──...
            ├──test
                ├──ColonINST-test-cls.json
                ├──...
        ├──Positive-images
            ├──CPC-Paired
                ├──Train
                    ├──polyp
                        ├──image_name.jpg
                        ├──...
                ├──Val
                    ├──polyp
                        ├──image_name.jpg
                        ├──...
                ├──Test
                    ├──polyp
                        ├──image_name.jpg
                        ├──...
            ├──...
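
As an illustration of how this layout might be navigated, the sketch below builds paths with `pathlib` and reads an annotation file with the standard `json` module. The function names and the `root` parameter (the directory that contains `cache`) are hypothetical, and the internal schema of the JSON files is not described here, so `load_annotations` simply returns whatever the file contains.

```python
import json
from pathlib import Path

# Relative location of the dataset, taken from the tree above.
DATASET_DIR = Path("cache") / "ColonINST"


def annotation_files(root, split):
    """List the JSON files for one split ("train", "val" or "test")."""
    folder = Path(root) / DATASET_DIR / "Json-file" / split
    return sorted(folder.glob("*.json"))


def positive_image_dir(root, source, split, category):
    """Build the image folder for one source dataset, split and category.

    Image split folders are capitalised in the tree ("Train", "Val", "Test"),
    whereas the JSON split folders are lowercase.
    """
    return (
        Path(root) / DATASET_DIR / "Positive-images"
        / source / split.capitalize() / category
    )


def load_annotations(path):
    """Parse one annotation file; the schema is not assumed here."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


if __name__ == "__main__":
    for path in annotation_files(".", "val"):
        print(path)
    print(positive_image_dir(".", "CPC-Paired", "train", "polyp"))
```

If the dataset is stored elsewhere, only `root` should need to change, provided the folder names match the tree above.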

Authors and Institutions

Ge-Peng Ji (Australian National University, Canberra, Australia)

Jingyi Liu (Keio University, Yokohama, Japan)

Peng Xu (Tsinghua University, Beijing, China)

Nick Barnes (Australian National University, Canberra, Australia)

Fahad Shahbaz Khan (Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE)

Salman Khan (Mohamed bin Zayed University of Artificial Intelligence, Abu Dhabi, UAE)

Deng-Ping Fan (Nankai University, Tianjin, China)

Source Information

Official Website: https://github.com/ai4colonoscopy/IntelliScope

Download Link: https://huggingface.co/ai4colonoscopy/ColonGPT-v1

Article Address: https://arxiv.org/abs/2410.17241

Publication Date: 2024-10

Citation

@article{ji2024frontiers,
  author = {Ji, Ge-Peng and Liu, Jingyi and Xu, Peng and Barnes, Nick and Khan, Fahad Shahbaz and Khan, Salman and Fan, Deng-Ping},
  title = {Frontiers in Intelligent Colonoscopy},
  journal = {arXiv preprint arXiv:2410.17241},
  year = {2024}
}