Is there a way to use a custom C++ lowering of a torch op (which is not part of aten) as an external kernel in TorchInductor?
In TorchInductor, we can choose to lower a torch op to either an external kernel or a Triton kernel. However, external kernels are bound to an aten
implementation. I want to know if there is a way to write an external C++ implementation of an op that is not part of aten and use it as an external kernel.