Hi, while running on a Colab A100 instance I noticed that the VRAM consumed by finetune.py was only about 5 GB for starcoderbase-1b, so I attempted it on my local machine, which has a GTX 1070 card (8 GB VRAM, Pascal architecture). This didn't work, and I got the same error when attempting again with either starcoderbase-1b or starcoderbase-3b on a larger, but still older, GPU (NVIDIA Quadro P6000, 24 GB VRAM). Here is the error:
RuntimeError: expected mat1 and mat2 to have the same dtype, but got: c10::Half != float
At first I thought this might be due to a difference in GPU architecture (Pascal vs. Ampere), but that is contradicted by the fact that I have a Kaggle notebook which can fine-tune StarCoder on two P100 GPUs, which are also Pascal cards.
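In case it helps, my understanding is that the error itself just means a half-precision tensor was multiplied against float32 weights somewhere in the forward pass. Here is a minimal standalone sketch (my own illustration, not code from finetune.py) that triggers the same message:

```python
import torch

# Minimal sketch: fp16 activations hitting a float32 Linear layer,
# without autocast, raise the same dtype-mismatch RuntimeError.
layer = torch.nn.Linear(16, 16).cuda()                      # weights stay in float32
x = torch.randn(4, 16, device="cuda", dtype=torch.float16)  # activations in half precision

try:
    layer(x)
except RuntimeError as e:
    # e.g. "expected mat1 and mat2 to have the same dtype, but got: c10::Half != float"
    print(e)
```

So it looks like some layer is being kept in float32 while the rest of the model runs in fp16, but I don't know why that only happens on these GPUs.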
Is there some other explanation for this?
A longer stack trace is attached.
dump.txt