If you want to overload `__torch_dispatch__`
to also capture tensor constructors, you can do it via a combination of torch dispatch modes and torch function modes. (A subclass's `__torch_dispatch__` alone never sees constructors like `torch.ones`, since no subclass instance appears in their arguments; a mode intercepts every call regardless.) I made a demo wrapping MLX into a PyTorch backend: tnt/torch_mlx.py at main · qihqi/tnt · GitHub
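A minimal sketch of the idea (my own illustration, not the linked demo's actual code): a `TorchFunctionMode` runs above the dispatcher, so it sees factory calls that take no tensor arguments, while a `TorchDispatchMode` sees the ATen ops on existing tensors. A real backend would allocate on its own device and return a wrapped subclass tensor instead of just recording the call.

```python
import torch
from torch.overrides import TorchFunctionMode
from torch.utils._python_dispatch import TorchDispatchMode

FACTORIES = (torch.ones, torch.zeros, torch.randn, torch.empty)

class CaptureFactories(TorchFunctionMode):
    """Runs above the dispatcher: sees factory calls (no tensor args)."""
    def __init__(self):
        super().__init__()
        self.seen = []

    def __torch_function__(self, func, types, args=(), kwargs=None):
        if func in FACTORIES:
            self.seen.append(func.__name__)
            # A backend would allocate on its own device here and
            # return a wrapped tensor subclass instead.
        return func(*args, **(kwargs or {}))

class LogAtenOps(TorchDispatchMode):
    """Runs below the dispatcher: sees ATen ops on existing tensors."""
    def __init__(self):
        super().__init__()
        self.seen = []

    def __torch_dispatch__(self, func, types, args=(), kwargs=None):
        self.seen.append(str(func))  # e.g. "aten.add.Tensor"
        return func(*args, **(kwargs or {}))

fmode, dmode = CaptureFactories(), LogAtenOps()
with fmode, dmode:
    x = torch.ones(2, 2)  # caught by the function mode as a constructor
    y = x + x             # caught by the dispatch mode as an ATen op
```

Nesting both modes gives full coverage: constructors are caught at the `__torch_function__` layer, and everything else at the `__torch_dispatch__` layer.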
The same idea can be used to wrap any Python numerical library as a PyTorch backend (I have done this for JAX as well: xla/torchax at master · pytorch/xla · GitHub).
I described the approach in more detail here: Embrace tensor subclass as a Python device registration API