Ufunc codegen for pointwise operators

Our goal is to define every ufunc in PyTorch as a simple templated inner-loop function like the one below:

template <typename T>
C10_HOST_DEVICE T add(T self, T other, T alpha) {
  return self + alpha * other;
}

and transform this into a full torch.add kernel.
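
As a rough illustration of the idea (not the actual generated code), here is a minimal CPU-only sketch in which a scalar inner loop is lifted to whole arrays. The `pointwise_binary` driver is a hypothetical name invented for this example, and the empty fallback definition of `C10_HOST_DEVICE` is only there so the snippet compiles outside PyTorch; a real kernel would also handle dtype dispatch, broadcasting, strides, vectorization, and CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Assumption: stand-in for the PyTorch macro so this sketch builds standalone.
#ifndef C10_HOST_DEVICE
#define C10_HOST_DEVICE
#endif

// The scalar inner loop, as written in the post.
template <typename T>
C10_HOST_DEVICE T add(T self, T other, T alpha) {
  return self + alpha * other;
}

// Hypothetical driver (not a PyTorch API): applies a scalar inner-loop
// function to each pair of elements and collects the results.
template <typename T, typename F>
std::vector<T> pointwise_binary(const std::vector<T>& self,
                                const std::vector<T>& other,
                                T alpha, F inner) {
  assert(self.size() == other.size());  // no broadcasting in this sketch
  std::vector<T> out(self.size());
  for (std::size_t i = 0; i < self.size(); ++i) {
    out[i] = inner(self[i], other[i], alpha);
  }
  return out;
}
```

For example, `pointwise_binary(a, b, 2.0f, add<float>)` with `a = {1, 2, 3}` and `b = {10, 20, 30}` yields `{21, 42, 63}`. The point of the design is that the author of an operator writes only `add`; everything in the driver is boilerplate that could be generated.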

The proposal for how to do this is in the Google Doc "Ufunc codegen for pointwise operators".
The WIP PR is "ufunc codegen" by ezyang, Pull Request #65851 in pytorch/pytorch on GitHub.
