Hi all,
I’ve been working on creating a disaggregated backend for PyTorch based on the open registration example, trying to reuse the existing CUDA kernel implementations so that they are instead dispatched via the disaggregated OS. The main issue I’ve run into concerns the registration of kernels such as elu or logaddexp, which are built as structured stubs through a macro. I’m hoping to work out how I can register those operations with the custom backend while still using REGISTER_DISPATCH with the stub and the appropriate dispatch key.
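For context, this is roughly how that stub is set up inside ATen for logaddexp (paraphrased from ATen/native/BinaryOps.h and BinaryOps.cpp, so the exact signatures may differ across versions):

    // ATen/native/BinaryOps.h (approximate)
    using structured_binary_fn = void (*)(at::TensorIteratorBase&);
    DECLARE_DISPATCH(structured_binary_fn, logaddexp_stub);

    // ATen/native/BinaryOps.cpp (approximate)
    DEFINE_DISPATCH(logaddexp_stub);

    // The structured implementation forwards to whichever kernel is registered
    // on the stub for the current device type.
    TORCH_IMPL_FUNC(logaddexp_out)(const Tensor& self, const Tensor& other, const Tensor& result) {
      logaddexp_stub(device_type(), *this);
    }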
Some quick code snippets. The stub registration:

    REGISTER_DISPATCH(logaddexp_stub, &logaddexp_kernel_pu1);
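where logaddexp_kernel_pu1 is my own kernel, written to match the function-pointer type the stub expects. A minimal sketch of its shape (body elided; the real version hands the work described by the TensorIterator off to the disaggregated device):

    // Sketch only: must match structured_binary_fn, i.e. take a TensorIteratorBase&.
    void logaddexp_kernel_pu1(at::TensorIteratorBase& iter) {
      // ... forward the iteration described by `iter` to the disaggregated device ...
    }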
Faulty code:

    at::Tensor& logaddexp_out(const at::Tensor& self, const at::Tensor& other,
                              at::Tensor& out) {
      return at::native::logaddexp_out(self, other, out);
    }
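For completeness, the wrapper itself is wired up to the custom backend in the style of the open registration example, roughly like this (a sketch; the overload name is aten’s, the wrapper and key choice are mine):

    #include <torch/library.h>

    // Sketch: route aten::logaddexp.out to my wrapper for the PrivateUse1 dispatch key.
    TORCH_LIBRARY_IMPL(aten, PrivateUse1, m) {
      m.impl("logaddexp.out", &logaddexp_out);
    }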