Multiple workers for single batch

lorenz-gorini · July 12, 2021, 2:17pm

Hi everyone!
I am working on a project with histopathological images (so-called Whole-Slide Images). Each of these images is ~1 GB, so they are really hard to handle.
In particular, I struggle when I use DataLoader(num_workers=N) (where N>1), because each worker loads and preprocesses (slight data augmentation in our case) its own batch, so multiple batches sit in RAM at the same time and the RAM fills up fast.
I wanted to know if there are other people working on implementing an alternative DataLoader mechanism that could allow us to have multiple workers working on the same batch.
I would also like to know if you have any suggestions regarding this topic.
For example, I was wondering whether the best option would be to let worker processes create new child processes that each process single elements of the batch (so that it would be easier to support a hybrid approach, with workers working on the same batch or on different batches). The alternative option could be to have the main process gather processed samples (from the same batch) from the workers.
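To make the second option concrete, here is a minimal sketch in plain Python rather than PyTorch. It is not how DataLoader works internally; it only illustrates a main process that hands the indices of one batch to a pool of workers and gathers the processed samples before moving on to the next batch. The names `load_sample` and `iter_batches` are hypothetical, and `load_sample` is a stand-in for reading and augmenting one tile.

```python
from concurrent.futures import Executor, ProcessPoolExecutor
from typing import Callable, Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def load_sample(index: int) -> List[float]:
    """Hypothetical stand-in for loading and augmenting a single sample."""
    return [float(index), float(index) * 2.0]


def iter_batches(
    indices: Sequence[int],
    batch_size: int,
    load_fn: Callable[[int], T],
    pool: Executor,
) -> Iterator[List[T]]:
    """Yield batches whose samples were processed in parallel by the pool."""
    for start in range(0, len(indices), batch_size):
        batch_indices = indices[start:start + batch_size]
        # All workers cooperate on this one batch; map() keeps the input order.
        yield list(pool.map(load_fn, batch_indices))


if __name__ == "__main__":
    with ProcessPoolExecutor(max_workers=4) as pool:
        for batch in iter_batches(list(range(10)), 4, load_sample, pool):
            print(len(batch))
```

With this layout only one batch is in flight at a time, so the memory held by loaded samples is bounded by roughly one batch, at the cost of losing the prefetching of later batches that the per-batch workers provide.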
Since I have never opened a PyTorch PR, and since I noticed that worker shutdown/handling is a very delicate matter, do you think I could open a draft so that someone could provide some suggestions and support?

smth · July 13, 2021, 5:10am

Have a look at WebDataset: Efficient PyTorch I/O library for Large Datasets, Many Files, Many GPUs | PyTorch

Vitaly and team are working on a more performant and more flexible dataset API update.
Read the RFC here: RFC-0009: DataLoader architecture updates by VitalyFedyunin · Pull Request #15 · pytorch/rfcs · GitHub

Another thread worth reading might be: [RFC] Add tar-based IterableDataset implementation to PyTorch · Issue #38419 · pytorch/pytorch · GitHub

Bernd1969 · July 14, 2021, 3:43pm

+1
I would also love that feature!
Multiple workers really speed up my Mask R-CNN project, but they also multiply my CPU RAM footprint,
so I usually have to use a lower number than I would like, otherwise I exceed my 32 GB of RAM.
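As a rough illustration of why the footprint scales with the number of workers: the DataLoader documentation states that `prefetch_factor` batches are loaded in advance per worker, so about `num_workers * prefetch_factor` batches can be waiting in memory. The helper below is a hypothetical back-of-the-envelope estimate of that prefetch buffer only; it ignores per-worker copies of the dataset object and any other overhead, and the numbers are made up for the example.

```python
def prefetched_bytes(
    num_workers: int,
    prefetch_factor: int,
    batch_size: int,
    bytes_per_sample: int,
) -> int:
    """Estimate the bytes held by prefetched batches (loaded samples only)."""
    batches_in_flight = num_workers * prefetch_factor
    return batches_in_flight * batch_size * bytes_per_sample


# Illustrative numbers: 8 workers, prefetch_factor=2, batches of 4 samples,
# 100 MiB per preprocessed sample.
estimate = prefetched_bytes(8, 2, 4, 100 * 1024**2)
print(f"{estimate / 1024**3:.2f} GiB")  # prints "6.25 GiB"
```

Halving either `num_workers` or `prefetch_factor` halves this estimate, which matches the experience of having to lower the worker count to stay within RAM.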
