@Chillee I’m trying to understand what AOT autograd does to the computation 1.0 / (exp(-x) + 5), and I wrote down the joint graph by hand as follows:
Red circles are what would be saved by the eager mode autograd engine.
I expected AOT autograd to optimize this, e.g. save only x2 for backward and recompute x4 during the backward pass, so that the memory cost is reduced. However, after running AOT autograd, I find that both x2 and x4 are saved for backward.
Is this expected? How can I save only x2 if I’m optimizing for memory efficiency?
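For reference, here is a minimal hand-written sketch of the behavior I was hoping for. It is not what AOT autograd generates. It assumes that x2 in the graph is exp(-x) and x4 is the final output 1.0 / (x2 + 5); the class name `RecipExpSaveOne` is made up for illustration. Only x2 is saved, and x4 is recomputed in backward using dy/dx = x4^2 * x2.

```python
import torch


class RecipExpSaveOne(torch.autograd.Function):
    """y = 1.0 / (exp(-x) + 5), saving only exp(-x) for backward.

    Hypothetical helper for illustration; the mapping of x2/x4 onto
    these intermediates is an assumption about the graph labels.
    """

    @staticmethod
    def forward(ctx, x):
        x2 = torch.exp(-x)           # assumed to be "x2" in the graph
        ctx.save_for_backward(x2)    # the only tensor kept for backward
        return 1.0 / (x2 + 5)        # assumed to be "x4" in the graph

    @staticmethod
    def backward(ctx, grad_out):
        (x2,) = ctx.saved_tensors
        x4 = 1.0 / (x2 + 5)          # recomputed instead of saved
        # d/dx [1 / (exp(-x) + 5)] = x4^2 * exp(-x) = x4^2 * x2
        return grad_out * x4 * x4 * x2


# Compare against ordinary eager autograd on the same input.
x = torch.randn(4, dtype=torch.double, requires_grad=True)
(grad_manual,) = torch.autograd.grad(RecipExpSaveOne.apply(x).sum(), x)

x_ref = x.detach().clone().requires_grad_(True)
y_ref = 1.0 / (torch.exp(-x_ref) + 5)
(grad_ref,) = torch.autograd.grad(y_ref.sum(), x_ref)
```

The recomputation costs one extra add and one extra division per element in backward, in exchange for holding one fewer tensor between the forward and backward passes.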