A new strategy for automatic functionalization of custom operators

@laithsakka @bdhirsh

On the PyTorch nightly build 2.9.0.dev20250907+cu128, running the example below produces the results that follow it:

import torch
import torch._inductor.config as inductor_config
from torch._logging import set_logs
import logging

@torch.library.custom_op("mylib::my_func", mutates_args={"x"})
def my_func(x: torch.Tensor) -> None:
    x[0] = 1

@torch.compile()
def func():
    f = torch.ones(10)

    my_func(f)

    return f

if __name__ == "__main__":
    set_logs(inductor=logging.DEBUG)
    with inductor_config.patch({"enable_auto_functionalized_v2": False}):
        func()

Toggling enable_auto_functionalized_v2 has no effect on the output.

Both cases result in the following logs:

I0909 16:26:19.457000 3205004 torch/_inductor/fx_passes/reinplace.py:560] For node auto_functionalized_v2, attempted to reinplace range(0, 1). We were unable to reinplace []; [] (if non-empty) are possible missed reinplacing opportunities that may be bad for memory usage and performance. Total size of missed opportunities with static shapes is : 0 bytes.
=== BEFORE ===
graph():
    %x_1 : [num_users=2] = placeholder[target=x_1]
    %auto_functionalized_v2 : [num_users=2] = call_function[target=torch.ops.higher_order.auto_functionalized_v2](args = (mylib.inc_.default,), kwargs = {_x_base_index: 0, _all_bases: [%x_1]})
    %getitem : [num_users=0] = call_function[target=operator.getitem](args = (%auto_functionalized_v2, 0), kwargs = {})
    %getitem_1 : [num_users=1] = call_function[target=operator.getitem](args = (%auto_functionalized_v2, 1), kwargs = {})
    %copy_ : [num_users=1] = call_function[target=torch.ops.aten.copy_.default](args = (%x_1, %getitem_1), kwargs = {})
    return copy_
=== AFTER ===
graph():
    %x_1 : [num_users=2] = placeholder[target=x_1]
    %inc__default : [num_users=0] = call_function[target=torch.ops.mylib.inc_.default](args = (%x_1,), kwargs = {})
    %copy_ : [num_users=1] = call_function[target=torch.ops.aten.copy_.default](args = (%x_1, %x_1), kwargs = {})
    return copy_
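To make the BEFORE/AFTER transformation concrete: auto-functionalization rewrites an in-place op into a pure call on a clone of the base, and the framework later copies the result back into the original tensor. This is a minimal pure-Python model of that rewrite (toy names, lists instead of tensors; not Inductor's actual machinery):

```python
def mutating_op(x):
    # In-place op, analogous to the custom op in the example above.
    x[0] = 1.0

def auto_functionalized(op, base):
    # Functionalized form: run the op on a clone so the caller's
    # value is untouched, and return the updated clone.
    new_base = list(base)
    op(new_base)
    return new_base

def copy_(dst, src):
    # Copy-back step, analogous to aten.copy_ in the traced graph.
    dst[:] = src
    return dst

x = [0.0, 0.0, 0.0, 0.0]
result = auto_functionalized(mutating_op, x)
copy_(x, result)
print(x)  # the mutation is visible in x only after the copy-back
```

The reinplacing pass then tries to undo this: if no one else reads the pre-mutation value of the base, it can call the mutating op on the base directly, which is what the AFTER graph shows.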

A few questions:

  • Is the copy_ in the post-grad graph a no-op, given that it copies x_1 into itself?
  • Why is there no difference in behavior whether auto_functionalized_v2 is enabled or disabled?
  • How should I interpret the reinplace log line: "For node auto_functionalized_v2, attempted to reinplace range(0, 1). We were unable to reinplace...Total size of missed opportunities with static shapes is : 0 bytes."?
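On the first question, modeling copy_ as an overwrite of the destination with the source's values suggests that a self-copy like copy_(%x_1, %x_1) leaves the tensor unchanged, i.e. it is semantically a no-op. A toy check of that reasoning (plain lists, not Inductor's actual codegen):

```python
def copy_(dst, src):
    # Toy stand-in for aten.copy_: overwrite dst with src's contents.
    dst[:] = src
    return dst

x = [1.0, 2.0, 3.0]
snapshot = list(x)
copy_(x, x)  # dst and src are the same object, as in copy_(%x_1, %x_1)
assert x == snapshot  # the self-copy changed nothing
```

Whether Inductor actually elides such a copy at codegen time is a separate question; the model above only shows it has no observable effect on the values.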