Possible to use custom backend to create TensorImpl that allows custom datatype?

Forgive the potential naiveté of this question, but would it be possible to build a custom PyTorch backend that would allow me to create a Tensor with a custom datatype?

For context, this custom datatype is actually a class defined in C++ that acts as a “virtual” float32. The class has all of its binary / arithmetic operators defined, so one can treat it like a “regular” float32. In a similar use case, we’re able to use NumPy arrays with dtype=object to create arrays of our “virtual” float32s and run various operations through NumPy’s Python API. Effectively, this “virtual” float32 doesn’t explicitly compute anything; it just builds, internally, a graph of the computations that produced it.
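
To make that concrete, here is a rough Python stand-in for what the C++ class does. The `VirtualFloat` name and the recorded-expression representation are just placeholders for illustration, not the real class:

```python
import numpy as np

class VirtualFloat:
    """Hypothetical stand-in for the C++ "virtual" float32: it doesn't
    compute anything, it just records the expression that produced it."""
    def __init__(self, expr):
        self.expr = expr

    def __add__(self, other):
        return VirtualFloat(("add", self.expr, getattr(other, "expr", other)))

    def __mul__(self, other):
        return VirtualFloat(("mul", self.expr, getattr(other, "expr", other)))

    def __repr__(self):
        return f"VirtualFloat({self.expr!r})"

# dtype=object lets NumPy hold arbitrary Python objects and route
# elementwise arithmetic through their operator overloads.
a = np.array([VirtualFloat("a0"), VirtualFloat("a1")], dtype=object)
b = np.array([VirtualFloat("b0"), VirtualFloat("b1")], dtype=object)
print(a * b + a)   # each element carries its own little computation graph
```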

PyTorch is an impressive and complicated beast (much more so than NumPy), so figuring out how to do an analogous extension is difficult. Previously, I was able to create a wrapper subclass of PyTorch’s Python Tensor class and use the __torch_dispatch__ method to achieve my desired results. Basically, these WrappedTensors would just carry along a NumPy array of my “virtual” float32s, and __torch_dispatch__ would ensure that whatever tensor operation was occurring also applied the NumPy equivalent to the carried array.
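
For reference, the wrapper looked roughly like the sketch below (heavily simplified; the per-op table and attribute names are placeholders, not the real implementation):

```python
import numpy as np
import torch
from torch.utils._pytree import tree_map

# Hypothetical per-op table mapping aten ops to their NumPy equivalents.
NUMPY_EQUIVALENTS = {
    torch.ops.aten.add.Tensor: np.add,
    torch.ops.aten.mul.Tensor: np.multiply,
}

class WrappedTensor(torch.Tensor):
    @staticmethod
    def __new__(cls, data, virtual):
        # A wrapper subclass: a real tensor as far as PyTorch is concerned,
        # plus a carried NumPy object array of "virtual" float32s.
        t = torch.Tensor._make_wrapper_subclass(
            cls, data.size(), dtype=data.dtype, device=data.device,
            requires_grad=data.requires_grad,
        )
        t._data = data
        t._virtual = virtual
        return t

    def __repr__(self):
        return f"WrappedTensor({self._data}, virtual={self._virtual})"

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        kwargs = kwargs or {}
        unwrap = lambda x: x._data if isinstance(x, WrappedTensor) else x
        unvirt = lambda x: x._virtual if isinstance(x, WrappedTensor) else x

        # Run the real aten op on the plain tensors ...
        out = func(*tree_map(unwrap, args), **tree_map(unwrap, kwargs))

        # ... and mirror it on the carried object arrays when we know how.
        np_op = NUMPY_EQUIVALENTS.get(func)
        virtual = np_op(*tree_map(unvirt, args)) if np_op else None
        return WrappedTensor(out, virtual)
```

With that in place, adding two WrappedTensors runs the real aten::add on the plain tensors and, in the same dispatch call, np.add on the two carried object arrays.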

Unfortunately, this approach ran into trouble when I wanted to overload base aten operations that didn’t involve my WrappedTensor. Without a way to force a __torch_dispatch__ interception, I couldn’t bake in the logic I needed.
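
In other words, something like this never reaches the subclass at all:

```python
import torch

a = torch.ones(3)
b = torch.ones(3)

# No WrappedTensor anywhere in the arguments, so WrappedTensor.__torch_dispatch__
# is never consulted; the stock CPU kernel for aten::add runs unmodified and
# there is nowhere to hook in the "virtual" bookkeeping.
c = torch.add(a, b)
```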

That turned my attention to potentially creating a new PyTorch backend that could support my needs and these “virtual” float32s. My understanding of custom TensorImpls and how they can be used to create custom Tensors is rough at best, so I’m not even sure whether what I’m asking is possible. I believe I’d be able to create a TensorImpl that lets me carry along these “virtual” float32s in a new attribute (an array container, perhaps even NumPy’s PyArrayObject). But if I could make this new TensorImpl hold these “virtual” float32s as its actual element type from the start, that would be even better / cleaner.
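
For what it’s worth, my (possibly wrong) mental model of the Python-visible surface of such a backend is roughly the sketch below. Everything that actually matters here (the allocator, device guard, and the TensorImpl carrying the “virtual” float32s) would presumably still have to live on the C++ side, and `virtual_add` is just a placeholder:

```python
import torch

# PyTorch reserves the PrivateUse1 dispatch key for out-of-tree backends;
# it can be given a friendlier name.
torch.utils.rename_privateuse1_backend("virtual")

# Kernels for the new key are registered against the existing aten schemas.
lib = torch.library.Library("aten", "IMPL")

def virtual_add(self, other, alpha=1):
    # Placeholder: this is where the op would be mirrored onto the carried
    # "virtual" float32s instead of (or in addition to) doing real math.
    raise NotImplementedError

lib.impl("add.Tensor", virtual_add, "PrivateUse1")
```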

In the end, this isn’t so much a hardware backend as a “virtual” hardware backend (or even just another software backend layer).

If anyone has any thoughts on the feasibility of this idea, they would be much appreciated!