TorchInductor: a PyTorch-native Compiler with Define-by-Run IR and Symbolic Shapes

So far, TorchInductor and torch.compile() have been a huge success.

However, I am confused about some details.

According to the docstring of torch.compile(), mode defaults to "default".

Which options does mode="default" enable? I didn’t find any further explanation here.

@jansel

Mapping of modes to options is here:

default is just the empty set (it overrides no options).

All options are here:


Hello @jansel.

I’m wondering if I can make Inductor codegen Triton kernels for all of my operators, without using extern_kernels.
I modified the config as follows, but to no avail:

torch._inductor.config.max_autotune_gemm_backends = "TRITON" # removed ATEN
torch._inductor.config.max_autotune = True

I get the following error:

File "/home/amodab01/anaconda3/envs/ml_training/lib/python3.11/site-packages/torch/_inductor/kernel/mm.py", line 156, in tuned_mm
    return autotune_select_algorithm("mm", choices, [mat1, mat2], layout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amodab01/anaconda3/envs/ml_training/lib/python3.11/site-packages/torch/_inductor/select_algorithm.py", line 991, in autotune_select_algorithm
    return _ALGORITHM_SELECTOR_CACHE(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/amodab01/anaconda3/envs/ml_training/lib/python3.11/site-packages/torch/_inductor/select_algorithm.py", line 723, in __call__
    raise RuntimeError(
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
LoweringException: RuntimeError: No choices to select, please consider adding ATEN into max_autotune_gemm_backends config (defined in torch/_inductor/config.py) to allow at least one choice. 
  target: aten.mm.default
  args[0]: TensorBox(
    ReinterpretView(
      StorageBox(
        InputBuffer(name='primals_3', layout=FixedLayout('cuda', torch.float32, size=[100], stride=[1]))
      ),
      FixedLayout('cuda', torch.float32, size=[1, 100], stride=[100, 1]),
      origins={view}
    )
  )
  args[1]: TensorBox(
    ReinterpretView(
      StorageBox(
        InputBuffer(name='primals_1', layout=FixedLayout('cuda', torch.float32, size=[100, 100], stride=[100, 1]))
      ),
      FixedLayout('cuda', torch.float32, size=[100, 100], stride=[1, 100]),
      origins={permute}
    )
  )

Any idea if I’m missing something?

I found the torch.fx documentation and TorchDynamo Deepdive to be very insightful.

@jansel, is there any design documentation for TorchInductor of a similar flavor? Or any documentation or Google Colab notebook tutorial for understanding the flow of Inductor?

PT2’s ASPLOS Tutorial should be a helpful resource.


@peterbell10 @Chillee

Very much enjoyed your ASPLOS presentation on the inner workings of Inductor. Are these details documented anywhere, or is there a comparable manual for Inductor, like @ezyang's excellent “torch.compile: the missing manual”, which is more focused on Dynamo? It would be helpful for understanding how to contribute graph optimizations, custom lowerings, and kernel templates.

There is a paper linked in this blog post: PyTorch 2 paper and tutorial @ ASPLOS 2024 | PyTorch