Hello @jansel.
I’m wondering if I can make Inductor generate Triton kernels for all of my operators, without falling back to extern_kernels.
I modified the config as follows, but to no avail:

```python
torch._inductor.config.max_autotune_gemm_backends = "TRITON"  # removed ATEN
torch._inductor.config.max_autotune = True
```
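For reference, here is a minimal sketch of the kind of script that hits this for me. The model and input shapes are illustrative, inferred from the traceback (a [100, 100] weight and a [100] input); only the two config lines above are the actual change:

```python
import torch
import torch._inductor.config as inductor_config

inductor_config.max_autotune_gemm_backends = "TRITON"  # removed ATEN
inductor_config.max_autotune = True

# Illustrative model/shapes, inferred from the traceback:
# primals_1 looks like a [100, 100] weight, primals_3 like a [100] input.
model = torch.nn.Linear(100, 100).cuda()
compiled = torch.compile(model)  # default backend is inductor

x = torch.randn(100, device="cuda")
out = compiled(x)  # fails while lowering aten.mm.default (traceback below)
```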
I get the following error:
File "/home/amodab01/anaconda3/envs/ml_training/lib/python3.11/site-packages/torch/_inductor/kernel/mm.py", line 156, in tuned_mm
return autotune_select_algorithm("mm", choices, [mat1, mat2], layout)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/amodab01/anaconda3/envs/ml_training/lib/python3.11/site-packages/torch/_inductor/select_algorithm.py", line 991, in autotune_select_algorithm
return _ALGORITHM_SELECTOR_CACHE(*args, **kwargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/amodab01/anaconda3/envs/ml_training/lib/python3.11/site-packages/torch/_inductor/select_algorithm.py", line 723, in __call__
raise RuntimeError(
torch._dynamo.exc.BackendCompilerFailed: backend='inductor' raised:
LoweringException: RuntimeError: No choices to select, please consider adding ATEN into max_autotune_gemm_backends config (defined in torch/_inductor/config.py) to allow at least one choice.
target: aten.mm.default
args[0]: TensorBox(
ReinterpretView(
StorageBox(
InputBuffer(name='primals_3', layout=FixedLayout('cuda', torch.float32, size=[100], stride=[1]))
),
FixedLayout('cuda', torch.float32, size=[1, 100], stride=[100, 1]),
origins={view}
)
)
args[1]: TensorBox(
ReinterpretView(
StorageBox(
InputBuffer(name='primals_1', layout=FixedLayout('cuda', torch.float32, size=[100, 100], stride=[100, 1]))
),
FixedLayout('cuda', torch.float32, size=[100, 100], stride=[1, 100]),
origins={permute}
)
)
Any idea what I’m missing?