How does torch.compile work with autograd?

Specifically, how does it decide which intermediate tensors to save for the backward pass, so that the memory/recompute trade-off is as efficient as possible? I'm still trying to understand the post Min-cut optimal(*) recomputation (i.e. activation checkpointing) with AOTAutograd - #9 by Chillee.
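Here is my current mental model as a toy sketch, in case it helps frame the question (please correct me if I've misunderstood the post). All names (`x_in`, `sin_out`, `bwd`, etc.) and the graph construction are my own invention, not PyTorch's actual partitioner code: each tensor is split into an in/out node pair with edge capacity equal to its memory size, op edges and "needed by backward" edges get infinite capacity, and the min cut between the forward inputs and the backward pass is the cheapest set of tensors to save.

```python
# Toy sketch of the min-cut formulation as I understand it -- NOT the real
# AOTAutograd partitioner. A plain Edmonds-Karp max-flow/min-cut on a tiny
# hand-built graph for y = cos(sin(x)).
from collections import defaultdict, deque

INF = float("inf")

def add_edge(cap, u, v, c):
    cap[u][v] += c
    cap[v][u] += 0  # make sure the reverse edge exists for the residual graph

def max_flow_min_cut(cap, s, t):
    flow = defaultdict(lambda: defaultdict(int))
    while True:
        # BFS for an augmenting path (Edmonds-Karp)
        parent = {s: None}
        queue = deque([s])
        while queue and t not in parent:
            u = queue.popleft()
            for v in cap[u]:
                if v not in parent and cap[u][v] - flow[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if t not in parent:
            break
        # find the bottleneck capacity along the path, then push it
        bottleneck, v = INF, t
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v] - flow[parent[v]][v])
            v = parent[v]
        v = t
        while parent[v] is not None:
            flow[parent[v]][v] += bottleneck
            flow[v][parent[v]] -= bottleneck
            v = parent[v]
    # nodes still reachable from s in the residual graph
    reach, queue = {s}, deque([s])
    while queue:
        u = queue.popleft()
        for v in cap[u]:
            if v not in reach and cap[u][v] - flow[u][v] > 0:
                reach.add(v)
                queue.append(v)
    # saturated edges crossing the cut = the tensors to save
    return [(u, v) for u in reach for v in cap[u]
            if v not in reach and cap[u][v] > 0]

# y = cos(sin(x)): backward needs x (for sin') and sin(x) (for cos').
# Saving x alone suffices, because sin(x) can be recomputed from it.
cap = defaultdict(lambda: defaultdict(int))
add_edge(cap, "src", "x_in", INF)
add_edge(cap, "x_in", "x_out", 4)       # memory size of x
add_edge(cap, "x_out", "sin_in", INF)   # op: sin
add_edge(cap, "sin_in", "sin_out", 4)   # memory size of sin(x)
add_edge(cap, "x_out", "bwd", INF)      # backward uses x
add_edge(cap, "sin_out", "bwd", INF)    # backward uses sin(x)

print(max_flow_min_cut(cap, "src", "bwd"))  # -> [('x_in', 'x_out')]
```

In this toy version the cut lands on `x` alone, matching the sin/cos example in the linked post where saving just the input and recomputing `sin(x)` in backward beats saving both intermediates. What I'd like to understand is how the real partitioner assigns these capacities and handles ops that are too expensive to recompute.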