Training RuntimeError: stack expects each tensor to be equal size, but got XXX vs YYY #262

opened 2026-01-29 21:46:17 +00:00 by claunia · 0 comments

Originally created by @oTree-org on GitHub (Sep 23, 2022).

Hi, thank you so much for providing this amazing framework!

I'm training with "train_gfpgan_v1_simple.yml" on my own images rather than FFHQ; they are irregularly sized .png files, and I'm using a single GPU.

When I launch training, I get:

RuntimeError: stack expects each tensor to be equal size, but got [3, 660, 472] at entry 0 and [3, 862, 616] at entry 1

Here is the traceback:

Traceback (most recent call last):
  File "C:\gfpgan\gfpgan\train.py", line 11, in <module>
    train_pipeline(root_path)
  File "C:\ProgramData\Miniconda3\lib\site-packages\basicsr\train.py", line 159, in train_pipeline
    train_data = prefetcher.next()
  File "C:\ProgramData\Miniconda3\lib\site-packages\basicsr\data\prefetch_dataloader.py", line 76, in next
    return next(self.loader)
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\utils\data\dataloader.py", line 681, in __next__
    data = self._next_data()
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1376, in _next_data
    return self._process_data(data)
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\utils\data\dataloader.py", line 1402, in _process_data
    data.reraise()
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\_utils.py", line 461, in reraise
    raise exception
RuntimeError: Caught RuntimeError in DataLoader worker process 0.
Original Traceback (most recent call last):
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\utils\data\_utils\worker.py", line 302, in _worker_loop
    data = fetcher.fetch(index)
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\utils\data\_utils\fetch.py", line 52, in fetch
    return self.collate_fn(data)
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\utils\data\_utils\collate.py", line 160, in default_collate
    return elem_type({key: default_collate([d[key] for d in batch]) for key in elem})
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\utils\data\_utils\collate.py", line 160, in <dictcomp>
    return elem_type({key: default_collate([d[key] for d in batch]) for key in elem})
  File "C:\ProgramData\Miniconda3\lib\site-packages\torch\utils\data\_utils\collate.py", line 141, in default_collate
    return torch.stack(batch, 0, out=out)
RuntimeError: stack expects each tensor to be equal size, but got [3, 660, 472] at entry 0 and [3, 862, 616] at entry 1

Interleaved with the traceback, the DataLoader worker also printed this ic (icecream) debug output:

ic| 'resizing back to', h: 329, w: 235, scale: 3.037853791816147
ic| img_lq.shape: (329, 235, 3)
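
If I'm reading the traceback right, the failure happens in PyTorch's default_collate, which torch.stack()s the per-sample tensors for each key into one batch tensor. A tiny standalone snippet of my own (just an illustration, using the two shapes from the error message) reproduces the same error:

```python
import torch

# Dummy tensors with the two shapes reported in the error message.
entry0 = torch.zeros(3, 660, 472)
entry1 = torch.zeros(3, 862, 616)

# default_collate ultimately calls torch.stack on the per-sample tensors,
# which requires every tensor in the batch to have exactly the same shape.
torch.stack([entry0, entry1], 0)
# RuntimeError: stack expects each tensor to be equal size,
# but got [3, 660, 472] at entry 0 and [3, 862, 616] at entry 1
```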

I have verified that (862, 616) is the size of one of my GT images, and (660, 472) is what it gets downsampled to in ffhq_degradation_dataset.py. However, the LQ image is immediately upscaled back to its original dimensions, so I don't understand how PyTorch sees anything of the downscale that happened.
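
One workaround I can imagine (an untested sketch of my own, not code from GFPGAN, with a hypothetical 512 target size) is to force every image to one fixed size inside the dataset's __getitem__, so default_collate always sees equally sized tensors:

```python
import cv2
import numpy as np
import torch
from torch.utils.data import Dataset

FIXED_SIZE = 512  # hypothetical target; the point is only that every sample ends up the same size


class FixedSizeImageDataset(Dataset):
    """Minimal sketch: load each .png and resize it to FIXED_SIZE x FIXED_SIZE."""

    def __init__(self, paths):
        self.paths = paths

    def __len__(self):
        return len(self.paths)

    def __getitem__(self, idx):
        img = cv2.imread(self.paths[idx])  # HWC, BGR, uint8
        img = cv2.resize(img, (FIXED_SIZE, FIXED_SIZE), interpolation=cv2.INTER_LINEAR)
        img = img.astype(np.float32) / 255.0
        return torch.from_numpy(img).permute(2, 0, 1)  # CHW float tensor, identical shape for every sample
```

That said, I'd still like to understand whether irregularly sized images are supposed to work at all, or whether they must be pre-resized to one fixed size (as the FFHQ training images are).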

Any ideas?


Reference: TencentARC/GFPGAN#262