Custom C++ External Kernel for TorchInductor

Is there a way to use a custom C++ lowering of a torch op (which is not part of aten) as an external kernel in TorchInductor?

In TorchInductor, we can choose to lower a torch op either with external kernels or with Triton kernels. However, the external kernels are bound to an aten implementation. I want to know if there is a way to have an external C++ implementation of an op and use it as an external kernel.

cc @zou3519

You need to wrap the C++ function in a custom op. See:
PyTorch Custom Operators Landing Page — PyTorch main documentation


After you wrap the C++ function in a custom op, you may also need to register a function on the “meta” key (either in C++ or, more easily, in Python) for shape propagation. Here is an example of how we do that in IPEX, FYI:
C++ function for rmsnorm: intel-extension-for-pytorch/csrc/cpu/aten/RMSNorm.cpp at 4027749462a5bb5ece1bcf89fdae463e883e3934 · intel/intel-extension-for-pytorch · GitHub
Meta: intel-extension-for-pytorch/intel_extension_for_pytorch/_meta_registrations.py at 4027749462a5bb5ece1bcf89fdae463e883e3934 · intel/intel-extension-for-pytorch · GitHub