Hi,
I am working on the integration of a new device into PyTorch.
I have been reading the (good) documentation on adding a new backend, and so far things seem pretty simple: adding operator implementations under a custom dispatch key and compiling them as a C++ extension.
However, one piece of information is missing from my reading (sorry if I missed it in the docs): how do I make PyTorch handle the memory operations (allocation, memcpy, copies across different types of devices) for my backend?
I see a reference to VulkanOpaqueTensorImpl that could be helpful, but I struggle to see which bits I would need to override to make the allocations happen on my device.
What I was expecting to do is provide callback functions for memory management.
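Something along these lines is what I had in mind: a minimal sketch subclassing c10::Allocator and registering it for the PrivateUse1 device type. I am not sure this is the intended extension point, the MyDeviceMalloc/MyDeviceFree calls are placeholders for my driver's API, and the exact allocate() signature may differ between PyTorch versions:

```cpp
#include <c10/core/Allocator.h>
#include <c10/core/Device.h>

// Placeholders standing in for my device driver's real allocation API.
void* MyDeviceMalloc(size_t n);
void MyDeviceFree(void* p);

static void my_device_deleter(void* ptr) {
  MyDeviceFree(ptr);
}

// Route all allocations for this device type through the driver callbacks.
struct MyDeviceAllocator final : c10::Allocator {
  c10::DataPtr allocate(size_t n) const override {
    void* data = MyDeviceMalloc(n);
    // DataPtr carries the raw pointer, a context, a deleter,
    // and the device the memory lives on.
    return {data, data, &my_device_deleter,
            c10::Device(c10::DeviceType::PrivateUse1, /*index=*/0)};
  }
  c10::DeleterFnPtr raw_deleter() const override {
    return &my_device_deleter;
  }
};

static MyDeviceAllocator g_my_allocator;

// Make tensors created with the PrivateUse1 device type use this allocator.
REGISTER_ALLOCATOR(c10::DeviceType::PrivateUse1, &g_my_allocator);
```

Is registering an allocator like this enough, or do copies between devices (e.g. to/from CPU) need separate kernels registered as well?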
Can I get a little help with this? A pointer to the relevant code or docs would be enough.
Thank you!