Hi there, I would like to trace a backward graph that has multiple outputs (stage_output) and inputs (input_values), where only some of the outputs require grad (specified by outputs_with_grads_idxs). The function looks like below and contains a backward() operation:
def stage_backward(stage_output, output_grads, input_values, outputs_with_grads_idxs: List[int]):
    # some preprocessing code
    torch.autograd.backward(
        stage_output_tensors,  # outputs that need backward, i.e. stage_output_tensors = stage_output[outputs_with_grads_idxs]
        grad_tensors=output_grad_tensors
    )
(check the full code here).
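For context, the preprocessing before the backward call is essentially just the index selection from the comment above; a simplified sketch of what I mean (the real code also handles non-tensor outputs, and stage_backward_sketch is just an illustrative name):

from typing import List
import torch

def stage_backward_sketch(stage_output, output_grads, input_values,
                          outputs_with_grads_idxs: List[int]):
    # keep only the outputs that should participate in backward, and their grads
    stage_output_tensors = [stage_output[i] for i in outputs_with_grads_idxs]
    output_grad_tensors = [output_grads[i] for i in outputs_with_grads_idxs]
    torch.autograd.backward(stage_output_tensors, grad_tensors=output_grad_tensors)
    # grads w.r.t. the stage inputs are then read back from .grad
    return tuple(v.grad if isinstance(v, torch.Tensor) else None for v in input_values)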
I tried to trace it with a stateless wrapper function like this:
def stateless_backward(params, buffers, activations, kwargs_for_stage_backward):
    func_out = stage_backward(**kwargs_for_stage_backward)
    grads = {k: v.grad for k, v in params.items()}
    return func_out, grads

gm = make_fx(stateless_backward, tracing_mode='fake')(*args)
where params and buffers are FakeTensors saved from forward tracing (something like make_fx(stateless_forward, tracing_mode='fake')), and activations are FakeTensors collected by iterating over the _saved_xxx attributes of the grad_fn nodes reachable from each tensor in stage_output.
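Concretely, the activation-collection part looks roughly like this (collect_saved_tensors is just an illustrative name; simplified, without guarding against saved tensors that have already been freed):

import torch

def collect_saved_tensors(stage_output):
    # walk the autograd graph of every output and gather the tensors that
    # grad_fn nodes expose as _saved_* attributes (my "activations")
    saved, seen = {}, set()

    def visit(fn):
        if fn is None or fn in seen:
            return
        seen.add(fn)
        for name in dir(fn):
            if name.startswith("_saved_"):
                val = getattr(fn, name)
                if isinstance(val, torch.Tensor):
                    saved[f"{type(fn).__name__}.{name}"] = val
        for next_fn, _ in fn.next_functions:
            visit(next_fn)

    for out in stage_output:
        if isinstance(out, torch.Tensor):
            visit(out.grad_fn)
    return saved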
The problem is that the resulting graph always contains _tensor_constant nodes, and I suspect I missed some tensors among the stateless_backward arguments, so they got cloned and baked into the traced graph as constants.
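To see which tensors end up as constants, I dump them like this (a rough check, assuming make_fx registers them directly on the root GraphModule as _tensor_constantN attributes):

# gm is the GraphModule returned by make_fx above
for node in gm.graph.nodes:
    if node.op == "get_attr" and "tensor_constant" in str(node.target):
        const = getattr(gm, node.target)
        print(node.target, tuple(const.shape), const.dtype)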
I’m a beginner with stateless graphs and FakeTensor, so I’m not even sure whether this approach makes sense or is not even wrong. Please share if you have any ideas :) Any suggestion will be appreciated!