🐛 Describe the bug
#114367 moved some functions and classes that are only used with CUDA into TraceUtils.h. As a result, these symbols cannot be resolved by third-party device implementations that include TraceUtils.h:
```
/opt/_internal/cpython-3.8.18/lib/python3.8/site-packages/torch/include/torch/csrc/distributed/c10d/TraceUtils.h:269:1: error: ‘DebugInfoWriter’ does not name a type
  269 | DebugInfoWriter::DebugInfoWriter(int rank) {
....
/opt/_internal/cpython-3.8.18/lib/python3.8/site-packages/torch/include/torch/csrc/distributed/c10d/TraceUtils.h:328:43: error: ‘CUDAEvent’ is not a member of ‘at::cuda’
  328 | using EventList = std::vector<at::cuda::CUDAEvent>;
...
/opt/_internal/cpython-3.8.18/lib/python3.8/site-packages/torch/include/torch/csrc/distributed/c10d/TraceUtils.h:412:13: error: ‘struct c10d::NCCLTraceBuffer::Entry’ has no member named ‘start_’; did you mean ‘state_’?
  412 |     if (r.start_ != nullptr) {
      |             ^~~~~~
      |             state_
```
Versions
PyTorch version: 2.2.0.dev20231126
Is debug build: False
CUDA used to build PyTorch: None
ROCM used to build PyTorch: N/A
OS: CentOS Linux 7 (AltArch) (aarch64)
GCC version: (GCC) 10.2.1 20210130 (Red Hat 10.2.1-11)
Clang version: Could not collect
CMake version: version 3.27.7
Libc version: glibc-2.17
Python version: 3.8.18 (default, Nov 13 2023, 04:17:39) [GCC 10.2.1 20210130 (Red Hat 10.2.1-11)] (64-bit runtime)
Python platform: Linux-4.15.0-112-generic-aarch64-with-glibc2.17
Is CUDA available: False
CUDA runtime version: No CUDA
CUDA_MODULE_LOADING set to: N/A
GPU models and configuration: No CUDA
Nvidia driver version: No CUDA
cuDNN version: No CUDA
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True
CPU:
Architecture: aarch64
Byte Order: Little Endian
CPU(s): 192
On-line CPU(s) list: 0-191
Thread(s) per core: 1
Core(s) per socket: 48
Socket(s): 4
NUMA node(s): 8
Model: 0
BogoMIPS: 200.00
L1d cache: 64K
L1i cache: 64K
L2 cache: 512K
L3 cache: 24576K
NUMA node0 CPU(s): 0-23
NUMA node1 CPU(s): 24-47
NUMA node2 CPU(s): 48-71
NUMA node3 CPU(s): 72-95
NUMA node4 CPU(s): 96-119
NUMA node5 CPU(s): 120-143
NUMA node6 CPU(s): 144-167
NUMA node7 CPU(s): 168-191
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 atomics fphp asimdhp cpuid asimdrdm jscvt fcma dcpop asimddp asimdfhm ssbs
Versions of relevant libraries:
[pip3] numpy==1.24.4
[pip3] torch==2.2.0.dev20231126
[pip3] torchvision==0.15.2
[conda] Could not collect
cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @rohan-varma @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @malfet @seemethere