mirror of
https://github.com/TencentARC/GFPGAN.git
synced 2026-02-15 22:04:35 +00:00
unable to start training : NCCL library error #155
Originally created by @nowfalcodmeric on GitHub (Feb 3, 2022).
RuntimeError: NCCL error in: ../torch/csrc/distributed/c10d/ProcessGroupNCCL.cpp:957, invalid usage, NCCL version 21.0.3
ncclInvalidUsage: This usually reflects invalid usage of NCCL library (such as too many async ops, too many collectives at once, mixing streams in a group, etc).
@Asuka001100 commented on GitHub (Feb 8, 2022):
Maybe the number of GPUs you have does not match the number given in your parameter settings.
@ucalyptus2 commented on GitHub (Dec 7, 2022):
@Asuka001100 I couldn't understand what you meant.
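One possible reading of the GPU-count suggestion above: the number of processes started by the distributed launcher is larger than the number of GPUs PyTorch can see, so two ranks end up on the same device, which NCCL may report as invalid usage. The sketch below is only a way to check that guess, not a confirmed diagnosis of this issue. It assumes a single-node run where the launcher sets the `WORLD_SIZE` environment variable; `check_world_size` and `count_visible_gpus` are hypothetical helper names, not part of GFPGAN.

```python
import os


def check_world_size(requested_procs: int, visible_gpus: int) -> str:
    """Return 'ok' if every process can get its own GPU, else describe the problem."""
    if requested_procs < 1:
        return "invalid: at least one process is required"
    if requested_procs > visible_gpus:
        return (
            f"mismatch: {requested_procs} processes requested "
            f"but only {visible_gpus} GPU(s) visible"
        )
    return "ok"


def count_visible_gpus() -> int:
    """Count the GPUs PyTorch can see; returns 0 if PyTorch is not installed."""
    try:
        import torch
    except ImportError:
        return 0
    return torch.cuda.device_count()


if __name__ == "__main__":
    # Assumption: single-node training, so WORLD_SIZE equals the number of
    # processes started on this machine. Defaults to 1 when it is unset.
    requested = int(os.environ.get("WORLD_SIZE", "1"))
    print(check_world_size(requested, count_visible_gpus()))
```

If this reports a mismatch, lowering the launcher's process count to the number of visible GPUs (and keeping any GPU-count option in the training config consistent with it) would be the first thing to try.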