I mean: how do you determine which intermediate variables to save for the most efficient backward pass? I'm still trying to understand the post Min-cut optimal(*) recomputation (i.e. activation checkpointing) with AOTAutograd - #9 by Chillee .
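For context on the question, here is a minimal sketch of the save-vs-recompute tradeoff that the min-cut partitioner in that post is optimizing, using the stock `torch.utils.checkpoint` API rather than AOTAutograd itself. The `block` function and tensor shapes are hypothetical, just for illustration: a plain forward saves the intermediate ReLU activation for backward, while a checkpointed forward saves only the inputs and recomputes the activation when backward runs.

```python
import torch
from torch.utils.checkpoint import checkpoint

# Hypothetical two-layer block. Plain autograd saves the intermediate
# relu output for backward; checkpointing drops it and recomputes it,
# trading extra compute for lower peak memory.
def block(x, w1, w2):
    return torch.relu(x @ w1) @ w2

torch.manual_seed(0)
x = torch.randn(8, 16, requires_grad=True)
w1 = torch.randn(16, 32, requires_grad=True)
w2 = torch.randn(32, 4, requires_grad=True)

# Plain forward/backward: intermediate activations are saved.
block(x, w1, w2).sum().backward()
grad_plain = x.grad.clone()
x.grad = w1.grad = w2.grad = None

# Checkpointed forward/backward: only the inputs are saved; the
# relu output is recomputed during backward.
checkpoint(block, x, w1, w2, use_reentrant=False).sum().backward()

# The gradients agree; only the memory/compute tradeoff differs.
assert torch.allclose(grad_plain, x.grad)
```

The min-cut partitioner generalizes this: instead of an all-or-nothing choice per block, it picks the set of tensors to save that minimizes the memory transferred between the forward and backward graphs, recomputing cheap ops (like the ReLU here) rather than storing their outputs.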