State of symbolic shapes: Mar 26 edition
Previous update: State of symbolic shapes branch - #46 by ezyang
Executive summary
This was a sleepy week. There’s a lot of activity going on in various dynamic shapes bugs on the bug tracker, though fewer fixes landed than I would have liked.
- Some bug fixes:
- Horace has a dynamic shapes minifier fix PR up at Fix minifier for symbolic shapes by Chillee · Pull Request #97428 · pytorch/pytorch · GitHub; patch it in if you need to minify stuff (a sketch of the workflow is after this list).
- Brian has posted an initial draft of AOTDispatch: [POC] initial version of AOTDispatch by bdhirsh · Pull Request #97540 · pytorch/pytorch · GitHub
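If you want to try the minifier on a dynamic shapes failure, the usual entry point is the `TORCHDYNAMO_REPRO_AFTER` environment variable. A minimal sketch, assuming you have patched in #97428; the function below is a placeholder, substitute your own failing code:

```python
# Minifier workflow sketch: set the env var before importing torch so the
# dynamo config picks it up. "aot" minifies failures that occur after
# AOTAutograd; use "dynamo" for failures earlier in the stack.
import os
os.environ["TORCHDYNAMO_REPRO_AFTER"] = "aot"

import torch

@torch.compile(dynamic=True)
def f(x):
    return x * 2  # placeholder: put the code that actually fails here

f(torch.randn(16, 32))
# On a compiler error, the minifier writes out a standalone repro script
# (the path is printed in the error output).
```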
State of real world model enablement
- MNIST with dynamic batches is triggering lots of workstreams; check out Notes from training mnist dynamic=True - Google Docs (a minimal sketch of the setup is after this list)
- The HuggingFace T5 accuracy failure now has a repro; ready to investigate!
- wav2vec2 inference is working again
- torch_geometric performance - a lot of subtasks:
  - puririshi98’s GCN NeighborSampling minimal test is only slightly faster with static compile (and only after fixing a graph break); dynamic shapes is slower even with the graph break fix. Split reductions could be the culprit, but even with static shapes there is not much payoff.
  - rusty1s’s SAGEConv minimal test is not faster E2E even with static compile. pytorch_geometric has benchmarks at Compiled Graph Neural Networks — pytorch_geometric documentation showing E2E speedups, but I don’t know how to reproduce them yet.
  - In any case, both may be affected by overspecialization in backwards: torch.compile (torch dynamo specifically) failing for simple GNNs trained with Neighbor Sampling (dynamic batches) · Issue #94640 · pytorch/pytorch · GitHub
- vision_maskrcnn (also `torch.compile` + `torch.no_grad` not working for Mask R-CNN · Issue #97340 · pytorch/pytorch · GitHub) - although it’s failing accuracy in the benchmark script, it can be run locally after SymIntify roi_align by ezyang · Pull Request #7448 · pytorch/vision · GitHub and some massaging with static by default. Repro script at https://gist.github.com/ezyang/2024bfdb6a2161f65ad7820264057fe7
- No update: LLAMA, InstructPix2Pix, detectron2, OpenNMT, fused boolean mask
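For reference, the dynamic-batch MNIST setup being discussed boils down to compiling the model with `dynamic=True` and feeding batches of varying size, so the last ragged batch does not trigger a recompile. A minimal self-contained sketch (the model and random data here are stand-ins, not the actual setup from the notes):

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(nn.Flatten(), nn.Linear(28 * 28, 128), nn.ReLU(), nn.Linear(128, 10))
opt = torch.optim.SGD(model.parameters(), lr=0.01)
compiled = torch.compile(model, dynamic=True)  # one graph across batch sizes

for bs in (64, 64, 32):  # the last (ragged) batch has a different size
    x, y = torch.randn(bs, 1, 28, 28), torch.randint(0, 10, (bs,))
    loss = F.cross_entropy(compiled(x), y)
    opt.zero_grad()
    loss.backward()
    opt.step()
```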
The numbers:
- Model status on master. See also Symbolic shapes work items tracker - Google Sheets
  - aot_eager inference: -1 (unchanged). Still the vision_maskrcnn sympy error from `reshape(torch.empty(s1, (s0 + 1)//2, 2), (s1, s0))` (a sketch of the pattern is after this list).
  - aot_eager training: -2 (unchanged). Still botnet26t_256 and eca_botnext26ts_256.
  - inductor inference: -7 (-1 WoW). The regression is tf_efficientnet_b0 from [inductor] hoist symbolic padding expressions by ngimel · Pull Request #97099 · pytorch/pytorch · GitHub; this is a case of temporarily breaking things to move globally in a better direction.
  - inductor training: -1 (+1 WoW). pytorch_unet was fixed, leaving tf_efficientnet_b0 (NameError: name ‘s1’ is not defined)
- Graph breaks on master. Sorry, CI is still not updated to run with dynamic shapes; maybe next week. We think graph breaks are not super high priority at the moment, unless they are also affecting performance.
- Tracing cost of enabling dynamic shapes (aot_eager): `benchmarks/dynamo/run_delta.sh --backend aot_eager --devices cuda --cold-start-latency --ci`. Mean: 12s (unchanged), Max: 153s (+2 WoW, probably noise)
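The vision_maskrcnn sympy error above comes down to symbolic reshape arithmetic: with `s0` and `s1` symbolic, the compiler has to prove `s1 * ((s0 + 1) // 2) * 2 == s1 * s0`, which only holds when `s0` is even. A minimal sketch of the pattern (standalone, not the actual model code; whether it errors will depend on your build):

```python
import torch

@torch.compile(backend="aot_eager", dynamic=True)
def f(x):
    s0, s1 = x.shape  # symbolic sizes under dynamic=True
    # Valid eagerly when s0 is even, but sympy must prove
    # s1 * ((s0 + 1) // 2) * 2 == s1 * s0 to accept the reshape.
    return torch.empty(s1, (s0 + 1) // 2, 2).reshape(s1, s0)

f(torch.randn(8, 3))  # s0 = 8 (even), s1 = 3
```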
What’s coming next?
- Horace on PTO next week
- Voz: finish the refactor at Adjusted design for ShapeEnv. by ezyang · Pull Request #97164 · pytorch/pytorch · GitHub, then shape_env + two-tier guard cache
- Avik: Work on the dynamic_dim(x, 0) <= 2 constraints API after Voz finishes the refactor (a hypothetical sketch is after this list)
- Brian: More AOTDispatch
- Edward: Not sure, flag me if you’re blocked
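For context on Avik’s item: the idea is to let users declare a dimension dynamic while bounding it. The API was still being designed at the time of writing, so everything in this sketch beyond the `dynamic_dim(x, 0) <= 2` expression itself (the import location and the `constraints=` plumbing) is an assumption about the eventual shape:

```python
# Hypothetical sketch of the constraints API under design; the import path and
# the constraints= keyword are assumptions, only dynamic_dim(x, 0) <= 2 is
# taken from the update above.
import torch
from torch._export import dynamic_dim  # assumed location

def f(x):
    return x * 2

x = torch.randn(2, 8)
# Declare dim 0 of x as dynamic, bounded above by 2:
torch._export.export(f, (x,), constraints=[dynamic_dim(x, 0) <= 2])
```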