State of PT2: Apr 13, 2024
I’m feeling lazy, so a short report today.
- The monthly review was this week. The big item in this corner was nested tensor for inference. We talked a bit about the PoR at composability sync https://www.youtube.com/watch?v=9JoHZLSLb14 and it seems Sherlock has been waylaid into working on it.
- I spent my return from vacation working on a bunch of unbacked SymInt-related bugs that had piled up while I was gone. A particularly tricky set that I finished diagnosing this week was the remaining blocker for Inductor compilation on ShardedEBC (though it still hangs when you run it). Meta-only status: https://docs.google.com/document/d/1ftC2LOtC9ULOQn4OfEpRd9JZc06esAW2Rt67m2Re7EU/edit#heading=h.u9pn2mueqcry
- James Wu is going to be working on AOTAutograd caching; stay tuned for a design doc.
- Some behind-the-scenes negotiating is going on with OpenTelemetry. A lot of the friction is dealing with the dual environments that PyTorch is used in at Meta: fbcode (which has a lot of preexisting libraries and infra that our PEs would prefer we use) vs conda on Mast (which has nothing). We might end up writing our own implementation of the OpenTelemetry APIs so that it plugs directly into our preexisting infrastructure in fbcode, but this doesn’t really solve the conda on Mast situation.
- Ivan Kobzarev has a proposal to allow us to constant propagate arbitrary tensors, so that when you call tolist() on them you get constants out. This piggybacks off of fake tensor constant propagation (though note that today, fake tensor is only willing to constant propagate numel-1 tensors); see the sketch below.
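To make this concrete, here is a minimal sketch of the numel-1 behavior that exists today; the function and values are just illustrative, not taken from the proposal.

```python
import torch

torch._dynamo.config.capture_scalar_outputs = True

@torch.compile
def f(x):
    # scale is a numel-1 tensor built from a constant, so fake tensor can
    # constant propagate it; item() then has a chance of yielding a known
    # value at trace time rather than an opaque data-dependent scalar.
    scale = torch.tensor(2.0)
    return x * scale.item()

f(torch.randn(4))

# Under the proposal, the same mechanism would extend to arbitrary-sized
# tensors, e.g. torch.tensor([2, 3]).tolist() returning the real Python
# ints [2, 3] inside the compiled region instead of unbacked symbols.
```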
- Horace thinks the important thing about distributed compilation is overlap, but actually, if you just make sure operations are queued appropriately, you get overlap “automatically”; we just need this capability. In some sense this is the lazy scheduler idea taken even further. A rough sketch of the queuing point is below.
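As an illustration of how overlap falls out of queuing: this sketch assumes an already-initialized NCCL process group, and the function and variable names are made up. The collective is queued asynchronously on NCCL’s comm stream, and any independent compute queued behind it overlaps with it; you only synchronize at the point the result is actually needed.

```python
import torch
import torch.distributed as dist

def step(grad, weight, x):
    # Queue the collective asynchronously; NCCL runs it on its own stream.
    work = dist.all_reduce(grad, async_op=True)
    # Independent compute queued meanwhile overlaps with the all_reduce.
    y = x @ weight
    # Synchronize only where the reduced gradient is actually consumed.
    work.wait()
    weight.add_(grad, alpha=-1e-3)
    return y
```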
- Thanks to Brian for writing up the AOTAutograd mutation invariants https://docs.google.com/document/d/1VA-qREPiS0KSNb8RlQs66n4-U6rUU965HSvdOuqAC-w/edit; I talked about them on this week’s podcast.
- Notable bug reports:
  - maybe_evaluate_static performance problem with unbacked SymInts
  - Dynamo unsupported: dynamic padding - an interesting case where our min/max reasoning is soft
  - Substitutions result in unbacked SymInts showing up before their definition sites - time-traveling substitutions are bad news when dependent types are involved
  - Inductor GuardOnDataDependentSymNode in cat - see the sketch after this list for the general shape of this failure
  - Accurately typed ValueRanges / convenient OpsHandler dtype inference - this should be possible, it just needs to be done
  - Dynamo unsupported: Dynamic slicing on data-dependent value is not supported - needs to be looked into
  - torch.compile dynamo fails indexing into array from internal mutable state - needs to be looked into
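Several of the reports above share a common shape: a data-dependent value flows into an op whose meta/shape logic needs to branch on it. This is a minimal sketch of that pattern for the cat case, not the exact repro from the report; depending on your version it may fail to compile with GuardOnDataDependentSymNode or may compile if the relevant guards have been made size-oblivious.

```python
import torch

torch._dynamo.config.capture_scalar_outputs = True

@torch.compile
def f(x, n):
    # u is an unbacked SymInt: its value is unknown at trace time.
    u = n.item()
    # cat's shape logic wants to answer questions like "is this input
    # empty?" (u == 0), which cannot be decided for an unbacked symbol;
    # this is the kind of place GuardOnDataDependentSymNode gets raised.
    return torch.cat([x, x.new_zeros(u)])

f(torch.randn(4), torch.tensor(3))
```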
- Notable fixes: