Is it possible to permanently move a complete dataset to GPU memory before training? #20554
Unanswered
NiklasKappel asked this question in Lightning Trainer API: Trainer, LightningModule, LightningDataModule
Replies: 0 comments
I have a very small dataset on disk. I do not need to do any data augmentation. I would like to load the dataset into CPU memory, then move it to GPU memory, then proceed with training as usual. Is this possible and supported by Lightning? What would be the best way to go about it?
In my mind, if I have only one GPU, the whole dataset should be in that GPU's memory. If I have multiple GPUs, one copy of the dataset should be in each GPU's memory.
In the single-GPU case, I think I can move the dataset to the GPU by calling `.to("cuda")` once on each tensor in the dataset. Then I would like to pass the dataset into a regular PyTorch `DataLoader` and pass that dataloader to the Lightning trainer's `.fit` method. Would that be correct?
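For what it's worth, here is a minimal sketch of the single-GPU approach described above, using plain PyTorch (the tensor shapes and batch size are made up for illustration; the real dataset would be loaded from disk instead of generated). The key constraint is that a `DataLoader` over CUDA tensors must use `num_workers=0` and leave `pin_memory` off, since worker subprocesses cannot re-share CUDA tensors created in the main process and pinning only applies to CPU memory:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Hypothetical stand-in for a small dataset loaded from disk.
device = "cuda" if torch.cuda.is_available() else "cpu"
features = torch.randn(1000, 16)
targets = torch.randint(0, 2, (1000,))

# Move every tensor to the GPU once, before training starts.
dataset = TensorDataset(features.to(device), targets.to(device))

# num_workers=0: worker processes cannot hand back CUDA tensors that
# were created in the main process. pin_memory stays False because the
# tensors are already on the GPU; pinning is for CPU-side staging only.
loader = DataLoader(dataset, batch_size=32, shuffle=True, num_workers=0)

# Batches now come out already on the GPU, so no per-batch host-to-device
# copy happens during training. The loader would then be passed to
# Lightning as usual, e.g. trainer.fit(model, train_dataloaders=loader)
```

Whether Lightning's own device-transfer hooks interact cleanly with dataloaders that already yield CUDA tensors (especially under multi-GPU strategies, where each process would presumably need its own copy) is exactly the open question here.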