@Chillee Related question: is there a way for me to extract the post-fusion FX graph, or better yet the Inductor scheduled graph? I know there’s TORCH_COMPILE_DEBUG=1, but I was looking for a programmatic way.
I'm interested in extending this to byte counting while accounting for as much of the fusion as I can.