Testing in PrivateUse1 for out-of-tree PyTorch Backends

This is a great question!
There is a reason this one is still very open-ended and why we haven’t made much progress on it yet.

This is mainly because it spans a very broad range of topics, including, but not limited to:

  • What is considered “close enough” numerical error for element-wise ops, reductions, etc. (see the tolerance sketch after this list)
  • What is considered “good coverage” across the 3k+ ops in PyTorch, dtypes, broadcasting, etc. (a rough coverage sketch also follows below)
  • What is considered “good support of components” based on the device-generic tests (covering autograd, distributed, nn, optim, foreach, sparse, serialization, etc.)
  • What is considered “good support in the community” based on third-party repos (vllm, deepspeed, ray I guess, but also hf/transformers, lightning, ao, bnb, safetensors, etc.)
  • How these things are going to be checked
  • Who is going to act as an independent third party to validate these results
  • How we present these results to end users so they have a clear picture of the current state of a given backend
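On the tolerance question, here is a minimal sketch of what a check against a CPU reference could look like. Note that “my_device” is a hypothetical PrivateUse1 backend name, and the rtol/atol values are purely illustrative, not agreed-upon thresholds:

```python
import torch

def check_elementwise(op, x_cpu, rtol=1.3e-6, atol=1e-5):
    # Run the op on CPU as the reference implementation
    ref = op(x_cpu)
    # Run the same op on the backend and move the result back to CPU
    out = op(x_cpu.to("my_device")).cpu()  # "my_device" is a placeholder
    # assert_close raises with a detailed mismatch report on failure
    torch.testing.assert_close(out, ref, rtol=rtol, atol=atol)

check_elementwise(torch.sin, torch.randn(1024))
```

Reductions typically need looser tolerances than element-wise ops since error accumulates across the reduced dimension, which is part of why pinning down “close enough” is hard.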
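And on the coverage question, one could in principle enumerate ops via PyTorch’s OpInfo database and record which ones run at all on a given backend. A rough sketch (note that op_db lives in an internal, unstable API, and this only measures “runs without raising”, not numerical correctness):

```python
import torch
from torch.testing._internal.common_methods_invocations import op_db

def coverage_report(device="my_device", dtype=torch.float32):
    # "my_device" is a placeholder PrivateUse1 backend name
    passed, failed = [], []
    for op in op_db:
        try:
            # Try every sample input the OpInfo provides for this dtype
            for sample in op.sample_inputs(device, dtype):
                op(sample.input, *sample.args, **sample.kwargs)
            passed.append(op.name)
        except Exception:
            # Lumps together unsupported dtypes, missing kernels and crashes
            failed.append(op.name)
    return passed, failed

passed, failed = coverage_report()
print(f"{len(passed)}/{len(passed) + len(failed)} OpInfo entries ran")
```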

I do expect it is going to take some time and a concerted effort to figure this out.
