Is there a place for storing custom data within PyTorch Tensor

Sujoy_Saraswati · February 28, 2023, 7:21am

Hi,
Is there a place within the PyTorch C++ tensor object (which is based on the TensorImpl class) to store custom data? It could be a “void*” blob that is used only if a backend wants to keep specific data for a tensor.

While it is possible to have a separate map in the backend for keeping this kind of information, having it within the tensor would make it more efficient to store and retrieve.
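For comparison, the separate-map approach might look roughly like the sketch below. It is only an illustration: FakeTensorImpl, BackendInfo and BackendSideTable are made-up names, and FakeTensorImpl is a stand-in rather than the real TensorImpl.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical stand-in for the framework's tensor object; NOT c10::TensorImpl.
struct FakeTensorImpl {
    long numel;
};

// Hypothetical backend-specific record, e.g. a private device layout tag.
struct BackendInfo {
    int layout_id;
};

// Side table owned by the backend, keyed by the address of the tensor object.
class BackendSideTable {
public:
    void set(const FakeTensorImpl* t, BackendInfo info) {
        assert(t != nullptr);
        table_[t] = info;
    }

    // Returns nullptr when nothing is stored for this tensor.
    // Every query costs one hash lookup.
    const BackendInfo* get(const FakeTensorImpl* t) const {
        auto it = table_.find(t);
        return it == table_.end() ? nullptr : &it->second;
    }

    // The backend has to call this when the tensor is destroyed,
    // otherwise a stale entry remains under a reusable address.
    void erase(const FakeTensorImpl* t) { table_.erase(t); }

    std::size_t size() const { return table_.size(); }

private:
    std::unordered_map<const FakeTensorImpl*, BackendInfo> table_;
};
```

The extra lookup and the manual cleanup on tensor destruction are the costs that a field inside the tensor itself would avoid.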
Regards,
Sujoy

albanD · February 28, 2023, 5:10pm

This only loosely answers the question, but we do have a strict 1-1 match between the TensorImpl and the Python Tensor object. So if that works for you, you can store any custom data you want on the Python object (as attributes) and that will just work!

We don’t have any extra blob field that I’m aware of in C++ though.

dzhulgakov · March 1, 2023, 8:25am

If you have your own “DispatchKey” for your tensor, you can create your own descendant of TensorImpl with additional fields. See the “Extending dispatcher for a new backend in C++” tutorial (PyTorch Tutorials 1.13.1+cu117 documentation) for some pointers.
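The general shape of that approach can be sketched without PyTorch itself. In the snippet below, BaseTensorImpl is a simplified stand-in for the framework base class (not c10::TensorImpl), and MyBackendTensorImpl, device_layout and as_my_backend are hypothetical names chosen for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-in for the framework's base class; NOT c10::TensorImpl.
class BaseTensorImpl {
public:
    explicit BaseTensorImpl(std::vector<std::int64_t> sizes)
        : sizes_(std::move(sizes)) {}
    virtual ~BaseTensorImpl() = default;

    const std::vector<std::int64_t>& sizes() const { return sizes_; }

private:
    std::vector<std::int64_t> sizes_;
};

// Hypothetical backend subclass that carries additional fields.
class MyBackendTensorImpl : public BaseTensorImpl {
public:
    MyBackendTensorImpl(std::vector<std::int64_t> sizes, std::string layout)
        : BaseTensorImpl(std::move(sizes)), device_layout(std::move(layout)) {}

    std::string device_layout;  // backend-private layout tag (made-up example)
};

// Backend code receives the base type and downcasts to reach the extra fields.
// Returns nullptr when the object is not this backend's subclass.
inline const MyBackendTensorImpl* as_my_backend(const BaseTensorImpl& impl) {
    return dynamic_cast<const MyBackendTensorImpl*>(&impl);
}
```

The downcast only succeeds for objects that were created as the subclass, so a plain base-class object carries none of the extra fields.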

An example of doing that is SparseTensor: pytorch/SparseTensorImpl.h at master · pytorch/pytorch · GitHub

We also have a helper class, OpaqueTensorImpl, in case you want to completely hide the backend implementation, e.g. pytorch/MetalTensorImpl.h at master · pytorch/pytorch · GitHub

Just curious, what kind of information are you trying to attach?

Sujoy_Saraswati · March 1, 2023, 11:38am

Using a TensorImpl subclass works, but it has issues with shallow copy. During weight sharing, a shallow copy from a CPU TensorImpl to a backend subclass doesn’t work.

We have a backend-specific data layout for the physical device memory of tensors in some cases. PyTorch natively supports NCHW and channels-last NHWC, but we want to store different layout information for these tensors and use it within the backend. This is one example of the information that we would like to keep within the C++ Tensor object.

Regards,
Sujoy

ezyang · March 6, 2023, 12:18am

We could set up some sort of linked list or hash map in ExtraMeta, which is accessible from TensorImpl. The main difficulty is that I am not sure what the most appropriate data structure for this is.
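One possible shape for the hash-map variant is sketched below, purely as an illustration. ExtraMetaSketch and TensorImplSketch are stand-ins for the real classes, and the string keys, the slots member and the set_custom/get_custom helpers are assumptions made for the example, not an existing or proposed PyTorch API.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical sketch of an ExtraMeta-like holder; NOT the real c10 type.
struct ExtraMetaSketch {
    // Type-erased slots, keyed by a name each backend picks for itself.
    std::unordered_map<std::string, std::shared_ptr<void>> slots;
};

// Stand-in for the tensor object. The holder is allocated lazily, so tensors
// that store nothing pay only for one null pointer.
class TensorImplSketch {
public:
    template <typename T>
    void set_custom(const std::string& key, std::shared_ptr<T> value) {
        if (!extra_meta_) {
            extra_meta_.reset(new ExtraMetaSketch());
        }
        extra_meta_->slots[key] = std::move(value);
    }

    // The caller must ask for the same T that was stored under `key`;
    // nothing in this sketch checks the type.
    template <typename T>
    std::shared_ptr<T> get_custom(const std::string& key) const {
        if (!extra_meta_) {
            return nullptr;
        }
        auto it = extra_meta_->slots.find(key);
        if (it == extra_meta_->slots.end()) {
            return nullptr;
        }
        return std::static_pointer_cast<T>(it->second);
    }

private:
    std::unique_ptr<ExtraMetaSketch> extra_meta_;
};
```

A map like this lets several independent parties attach data to one tensor without subclassing anything, at the cost of a lookup per access and no type checking on retrieval.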

ppiskorski · March 10, 2023, 9:44am

Hi,
Just a super basic proposal that would fit the bill for our backend and hopefully move the discussion forward.

A struct BackendMeta with a virtual destructor, intended to be subclassed by a backend.
An extra_meta_.backend_meta_ field. For our needs, this would ideally be a shared_ptr (or an intrusive equivalent).
A setter and getter in TensorImpl that reach extra_meta_.backend_meta_.
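The three points above can be sketched as follows. This is a minimal model and not the framework code: ExtraMetaSketch and TensorImplSketch are stand-ins for the real classes, the accessor names set_backend_meta/get_backend_meta are assumed for illustration, and MyLayoutMeta and layout_of are a made-up backend-side example.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Point 1: an opaque base with a virtual destructor that a backend derives from.
struct BackendMeta {
    virtual ~BackendMeta() = default;
};

// Point 2: the extra-metadata holder gains one shared pointer field.
// Stand-in type; NOT the real c10 structure.
struct ExtraMetaSketch {
    std::shared_ptr<BackendMeta> backend_meta_;
};

// Point 3: a setter and getter on the tensor object that reach the field.
class TensorImplSketch {
public:
    void set_backend_meta(std::shared_ptr<BackendMeta> meta) {
        if (!extra_meta_) {
            extra_meta_.reset(new ExtraMetaSketch());
        }
        extra_meta_->backend_meta_ = std::move(meta);
    }

    std::shared_ptr<BackendMeta> get_backend_meta() const {
        if (!extra_meta_) {
            return nullptr;
        }
        return extra_meta_->backend_meta_;
    }

private:
    std::unique_ptr<ExtraMetaSketch> extra_meta_;
};

// Backend side (hypothetical): a type the framework never needs to know about.
struct MyLayoutMeta : BackendMeta {
    explicit MyLayoutMeta(int id) : layout_id(id) {}
    int layout_id;
};

// Reads the backend's layout tag, or `fallback` if none (or a foreign type) is set.
inline int layout_of(const TensorImplSketch& t, int fallback) {
    auto meta = std::dynamic_pointer_cast<MyLayoutMeta>(t.get_backend_meta());
    return meta ? meta->layout_id : fallback;
}
```

Because the field is a shared pointer, two tensor objects can refer to the same metadata object, and the framework only ever handles the opaque base type.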

@ezyang can you elaborate on why a map?

ezyang · March 10, 2023, 5:52pm

If you’re willing to subclass the ExtraMeta, you might as well just subclass TensorImpl. The whole point of the map is to avoid having to subclass.

ppiskorski · March 10, 2023, 8:28pm

Sure, my description was not clear. I meant that an intrusive pointer to BackendMeta would be a new field in ExtraMeta. The intention is that backends inherit from BackendMeta, producing a type that is unknown to the framework and contains whatever additional attributes are needed.

But you’ve mentioned some sort of linked list or hash map, which, I am guessing, means the ability to store multiple additional attributes. Why such a design?

ppiskorski · March 23, 2023, 8:59am

Please review PR 97429, which contains my proposal.
In short, since there are downsides to subclassing TensorImpl, the PR introduces BackendMeta, which is intended to be subclassed by backends. The contract would be that, for the framework, the actual implementation is always opaque.
