How to build PyTorch with a recent LLVM (clang) build?

I have built LLVM locally and want to use that clang build to compile PyTorch from source, but the PyTorch build currently fails.

LLVM version: 17.0.6
Python version: 3.10.8
PyTorch version: 2.2.0-rc8
CUDA Driver Version: 545.23.08
CUDA Version: 12.3
nvcc version: release 12.3, V12.3.107

I am using the following environment variables:

export CC=/home/llvm-project-llvmorg-17.0.6/installed/bin/clang; \
export CXX=/home/llvm-project-llvmorg-17.0.6/installed/bin/clang++; \
export MAX_JOBS=32; \
export DEBUG=1; \
export USE_DISTRIBUTED=0; \
export USE_MKLDNN=0; \
export USE_CUDA=1; \
export CMAKE_CUDA_FLAGS="-allow-unsupported-compiler"; \
export BUILD_TEST=0; \
export USE_FBGEMM=0; \
export USE_NNPACK=0; \
export USE_QNNPACK=0; \
export USE_XNNPACK=0;

The build command is simply: python setup.py develop

The build fails; the most prominent error is:

...
...
[611/2206] Performing build step for 'nccl_external'
FAILED: nccl_external-prefix/src/nccl_external-stamp/nccl_external-build ...
...
...
  151 | #error -- unsupported clang version! clang version must be less than 16 and greater than 3.2 . The nvcc flag '-allow-unsupported-compiler' can be used to override...
...
...
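
The version gate quoted in that message can be reproduced outside the build. A minimal sketch, assuming only the bounds stated in the error (greater than 3.2, less than 16); the helper name `clang_ok_for_nvcc` is hypothetical and is not part of nvcc or PyTorch:

```shell
# Hypothetical helper: succeeds if a clang version string falls inside the
# range quoted in the nvcc error (greater than 3.2 and less than 16).
clang_ok_for_nvcc() {
    ver="$1"
    major="${ver%%.*}"
    case "$ver" in
        *.*) rest="${ver#*.}"; minor="${rest%%.*}" ;;
        *)   minor=0 ;;   # a bare major such as "10" has no minor part
    esac
    # Upper bound: the major version must be below 16.
    [ "$major" -lt 16 ] || return 1
    # Lower bound: anything from 4.x up passes, 3.x only if minor > 2.
    [ "$major" -gt 3 ] && return 0
    [ "$major" -eq 3 ] && [ "$minor" -gt 2 ]
}

# The two compilers mentioned in this post:
clang_ok_for_nvcc 17.0.6 || echo "clang 17.0.6: outside the range nvcc accepts"
clang_ok_for_nvcc 10     && echo "clang 10: inside the range nvcc accepts"
```

This only mirrors the check in the error text; it says nothing about whether a compiler outside the range would actually work once the check is overridden.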

So, is there a way to build PyTorch with newer versions of LLVM?

Apparently, the problem is that newer versions of clang use C++17 as the default dialect. According to this NVIDIA forum post, one solution might be to set the C++ dialect manually with the -std=c++XX flag.
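
The default dialect of a given compiler can be checked through the `__cplusplus` macro it predefines when no -std flag is passed. A minimal sketch; the helper names `dialect_of` and `default_dialect` are hypothetical, and the mapping covers only the standard macro values from C++98 through C++20:

```shell
# Map a __cplusplus macro value to a dialect name.
dialect_of() {
    case "$1" in
        199711L) echo "c++98" ;;
        201103L) echo "c++11" ;;
        201402L) echo "c++14" ;;
        201703L) echo "c++17" ;;
        202002L) echo "c++20" ;;
        *)       echo "unknown" ;;
    esac
}

# Dump the compiler's predefined macros for C++ with no -std flag,
# extract the value of __cplusplus, and translate it.
default_dialect() {
    value="$("$1" -x c++ -dM -E /dev/null | awk '$2 == "__cplusplus" { print $3 }')"
    dialect_of "$value"
}

# Usage (needs the compiler to be installed):
#   default_dialect "$CXX"
```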

I tried

...
export CMAKE_CUDA_FLAGS="-allow-unsupported-compiler -std=c++17"; \
...

However, I am receiving the same error.
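
One possibility I have not verified: the failing step is the nccl_external sub-build, and it may invoke nvcc without the flags given in CMAKE_CUDA_FLAGS. nvcc itself reads the NVCC_PREPEND_FLAGS and NVCC_APPEND_FLAGS environment variables, so a sketch of passing the override that way (an assumption about the cause, not a confirmed fix) would be:

```shell
# Assumption: the NCCL sub-build does not receive CMAKE_CUDA_FLAGS.
# nvcc appends the contents of NVCC_APPEND_FLAGS to each of its invocations,
# regardless of which build system launched it.
export NVCC_APPEND_FLAGS="-allow-unsupported-compiler"
```

This would only bypass the version check; whether clang 17 then compiles the CUDA sources cleanly is a separate question.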

So my questions are:

  1. Can the issue be resolved by setting the C++ dialect to C++17?
  2. If yes, how do I correctly set the C++ dialect for the build process?

For now, I was able to build PyTorch 2.2.0 with CUDA by downgrading my clang compiler (from 17.0.6 to 10):

clear; \
export CC=/usr/bin/clang-10; \
export CXX=/usr/bin/clang++-10; \
#export CC=/home/anurag/llvm-project-llvmorg-17.0.6/installed/bin/clang; \
#export CXX=/home/anurag/llvm-project-llvmorg-17.0.6/installed/bin/clang++; \
export MAX_JOBS=32; \
export DEBUG=1; \
export USE_DISTRIBUTED=OFF; \
export USE_MKLDNN=OFF; \
export USE_CUDA=ON; \
#export CMAKE_CXX_FLAGS="-Wmissing-braces"; \
export CMAKE_CUDA_COMPILER=/usr/local/cuda/bin/nvcc; \
#export CMAKE_CUDA_HOST_COMPILER=/home/anurag/llvm-project-llvmorg-17.0.6/installed/bin/clang++; \
#export CMAKE_CUDA_FLAGS="-allow-unsupported-compiler -std=c++17"; \
export BUILD_TEST=OFF; \
export USE_FBGEMM=OFF; \
export USE_NNPACK=OFF; \
export USE_QNNPACK=OFF; \
export USE_XNNPACK=OFF; \
export NVCC_GENCODE="-gencode=arch=compute_70,code=sm_70"; \
export USE_GOLD_LINKER=ON; \
python setup.py clean; \
ccache -cC; \
python setup.py develop;

(Further flags can be set as needed, following the CUDA-related CMake options.)

So the real question of how to compile PyTorch with CUDA using a newer clang still stands.