Automatic out-of-tree backend loading

Hello,

Currently I use the following somewhat ugly snippet to load the OpenCL backend shared object:

if r.device.startswith('opencl'):
    torch.ops.load_library("build/libpt_ocl.so")

Is there any location in the PyTorch tree where I can put my backend shared object/DLL so that it is loaded automatically? That way I could also access various low-level backend functions via the ops library, for example to start low-level profiling or to access cache metadata/operations.

For example:

torch.ops.oclops.prof_start(device)

Now I’d like to have it accessed the way it is done today for CUDA, for example:

torch.cuda.empty_cache()
torch.opencl.empty_cache()

Or alternatively it would be great to have some generic API like:

torch.backend_from_device(device).empty_cache()

Any thoughts? Is it even doable as an out-of-tree backend?

Is there any location in the PyTorch tree where I can put my backend shared object/DLL so that it is loaded automatically?

I don’t think there is such a place.
The usual thing is to have your package’s `__init__.py` load that .so when the user imports the Python package.
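As a rough sketch of that pattern (all names here are hypothetical: `load_backend`, the `ops_loader` callback standing in for `torch.ops.load_library`, and the package layout), the package’s `__init__.py` could locate the bundled .so next to itself and load it at import time:

```python
import os

# Hypothetical helper for the package's __init__.py: find the backend
# shared object shipped inside the Python package and hand it to a
# loader callback (in practice, torch.ops.load_library).
def load_backend(ops_loader, pkg_dir, so_name="libpt_ocl.so"):
    so_path = os.path.join(pkg_dir, so_name)
    ops_loader(so_path)  # e.g. torch.ops.load_library(so_path)
    return so_path
```

In the real package, `__init__.py` would call something like `load_backend(torch.ops.load_library, os.path.dirname(__file__))`, so a plain `import pt_ocl` pulls the custom ops in without the user touching file paths.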

I’m not sure about a generic backend API, since each backend can expose whatever API it wants.
Maybe you want to hijack some of the torch.cuda.* functions so that they behave slightly differently when OpenCL is available.
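That said, the generic `backend_from_device`-style dispatch from the question can be sketched in plain Python, assuming each backend registers a module-like object exposing functions such as `empty_cache()` (the registry and all names here are hypothetical, not an existing PyTorch API):

```python
# Hypothetical registry: each out-of-tree backend registers the object
# that holds its cache/profiling helpers, keyed by device type.
_backends = {}

def register_backend(device_type, backend_module):
    _backends[device_type] = backend_module

def backend_from_device(device):
    # Accept "opencl" or "opencl:0"; strip any device index.
    device_type = str(device).split(":")[0]
    return _backends[device_type]
```

With this, the OpenCL package would call `register_backend("opencl", ...)` once at import time, and user code could write `backend_from_device("opencl:0").empty_cache()` without caring which backend it hits.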