State of symbolic shapes branch: Oct 16 edition
The symbolic-shapes branch (PyTorch: Symbolic shapes by ezyang · Pull Request #84246 · pytorch/pytorch · GitHub) is a long-running branch containing a large number of features and bugfixes related to dynamic shapes support in PyTorch. Previous update: State of symbolic shapes branch - #5 by ezyang
Commit ID at time of writing: 95cb550231fd36e0fb0e3283c033dee384e3397b
Executive summary
- 15 out of 48 torchbench models are passing with training on master (data collected by @wconstab), compared with 36 out of 48 on the branch as of last week. This means that, modulo inductor, we have hit our goal for dynamic shapes (the goal was “10+ e2e (including BERT) training demonstration on PyTorch master (no recompilation with variable batch size)”).
- @Chillee got 9 torchbench models to run E2E in inference with inductor on Monday. The dynamic-shape-aware timings are comparable, although in some cases slower. These numbers don’t include compilation time.
- OpInfo tests on branch:
  - For test_proxy_tensor.py -k test_make_fx_symbolic_exhaustive, we are at 305 passed (+8 week over week (WoW)), 267 failed (-5 WoW), 35 skipped (unchanged).
  - For test_aotdispatch -k test_aot_autograd_symbolic_exhaustive, we are at 209 passed (+15 WoW), 271 failed (-16 WoW), 127 skipped (+4 WoW). The new skips are probably nn.functional.batch_norm (0 is not tracked with proxy) and some more operators identified as having data-dependent control flow.
- Notable bug fixes:
  - wconstab identified and fixed a pretty major gotcha with braced scalar initializers: std::vector<T>({0}) produces either a one-element list containing zero (when T=int64_t) or a zero-element list (when T=SymInt). This is now durably fixed via Set -Werror=braced-scalar-init by wconstab · Pull Request #86911 · pytorch/pytorch · GitHub (a short demo appears after this list).
  - bdhirsh identified a functionalization bug, manifesting as SymInts showing up without proxies, that showed up in longformer, tracking it down to this line of code: pytorch/gen_functionalization_type.py at 811b8e012b3ddcb84adb2e483089758e84b6a995 · pytorch/pytorch · GitHub (not sure if this is fixed in master/in branch yet?)
- Nick Korovaiko is transitioning off dynamic shapes and moving to help with inductor burndown.
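As a quick illustration of the braced-scalar-init gotcha above, here is a minimal, self-contained sketch; the SymInt line is commented out because it is exactly the pattern that -Werror=braced-scalar-init now rejects at compile time:

```cpp
#include <cstdint>
#include <iostream>
#include <vector>

int main() {
  // With T = int64_t, {0} selects std::vector's initializer_list
  // constructor: a one-element vector containing zero.
  std::vector<int64_t> a({0});
  std::cout << a.size() << "\n";  // prints 1

  // With T = SymInt (per the report above), the same expression instead
  // resolved to the size constructor, i.e. std::vector<SymInt>(0), giving
  // a zero-element vector:
  //
  //   std::vector<c10::SymInt> b({0});  // b.size() == 0 -- surprise!
  return 0;
}
```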
Previous branch diff: 39 files changed, 1309 insertions(+), 233 deletions(-)
Current branch diff: 30 files changed, 1209 insertions(+), 225 deletions(-)
We were briefly at <900 insertions on Monday, before reverts and more pushes to the branch brought the count back up.
Retrospective on merge to master
- Reverted Reland 2 of Merge more symbolic meta kernels and symint changes from branch (#86334) because it broke Executorch with “RuntimeError: Missing out variant for functional op: aten::split.Tensor(Tensor(a -> *) self, SymInt split_size, int dim=0) -> Tensor(a). Make sure you have loaded your custom_ops_generated_lib”. @albanD had a lot of trouble importing the PR, as it often hit merge conflicts on import. On Sunday he managed to merge it internally, but GH1 did not automatically export the PR, so the GH side needs a manual land.
- There were a few cases of the same PR being posted multiple times by different people; examples include adding a meta function to upsample_nearest2d.vec (#86354) and min/max support for SymInt/Floats (finishing as_strided/scatter/squeeze() backward symint support).
- Reverted Add meta support for _adaptive_avg_pool2d_backward because of a land race (an unexpected success in functorch)
- Reverted symintify nll loss fns because a seemingly unrelated CI failure (test_autocast_nn_fp32) was ignored when it was actually relevant to the PR
- Re-land*4 "SymIntify cat and narrow" by malfet · Pull Request #86468 · pytorch/pytorch · GitHub was reverted because it changed a native:: declaration that had internal usages; the initial reland didn’t work because we didn’t include the BC definitions in all the same locations the original native declarations were available (it appears this was fixed by adding the necessary includes at use sites). @malfet helped land this one.
- Lands were impeded for a portion of time by an importing outage caused by the removal of the functorch/test symlink.
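For the cat/narrow revert above, the backwards-compatibility pattern at stake looks roughly like the following toy sketch (names are hypothetical, and a stand-in struct replaces c10::SymInt): when a native:: signature is SymInt-ified, the old signature has to keep forwarding to the new one, and its declaration has to stay visible everywhere the original was.

```cpp
#include <cstdint>
#include <iostream>

// Toy stand-in for c10::SymInt, just to show the forwarding pattern.
struct SymInt {
  explicit SymInt(int64_t v) : value(v) {}
  int64_t value;
};

// New SymInt-aware signature (hypothetical example function).
int64_t narrow_length(SymInt length) { return length.value; }

// Backwards-compatibility shim: the old int64_t signature forwards to the
// new one. Internal callers keep compiling only if this declaration is
// visible in every location the original declaration was available.
int64_t narrow_length(int64_t length) {
  return narrow_length(SymInt(length));
}

int main() {
  std::cout << narrow_length(int64_t{5}) << "\n";  // prints 5, via the shim
  return 0;
}
```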
How to run models E2E
Dynamo has been merged into the PyTorch repo, so the benchmark instructions are simplified:
TORCHDYNAMO_DYNAMIC_SHAPES=1 AOT_DYNAMIC_SHAPES=1 python benchmarks/dynamo/torchbench.py --only BERT_pytorch --accuracy --backend aot_eager --training
What’s new on the branch this week?
Like last week, all changes are listed even if they were already merged into master.
- symintify einsum (anjali411)
- meta impl for unsqueeze_ (anjali411)
- symintify autograd view chaining (anjali411)
- Widen type checks to include PySymFloat (wconstab)
- symintify pad ops (bdhirsh)
- some FunctionalTensorWrapper fixes; dont perform meta reference compu… (bdhirsh)
- fix conv_backward meta and _fused_moving_avg_obs_fq_helper meta, add … (bdhirsh)
- symintify nll loss fns (anjali411)
- add sym_int (anjali411)
- remove _to_copy decomp and add a meta impl (anjali411)
- symintify to_expand called through setitem (anjali411)
What’s made it to master this week?
Some PRs were merged by people other than their authors; the original authors are noted in parentheses.
- albanD (merge captain this week)
  - Reland 2 min/max support for SymInt/Floats, finish as_strided/scatter/squeeze() backward symint support (bdhirsh)
  - Symintify NLL loss, copy and squeeze (anjali411, bdhirsh)
  - More symintification of get/set item
  - symintify autograd view chaining (anjali411)
  - symintify einsum (anjali411)
  - Reland 3 of Symintify getitem and add the required helper functions (#86207)
- wconstab
- anjali411
- bdhirsh
- Chillee
Currently open PRs
- albanD
- anjali411
- bdhirsh
  - ban .sizes() and .strides() calls in derivatives.yaml (was blocked on split, now unblocked)
What’s coming next?
The prime directives (these haven’t really changed):
- E2E training on master with inductor.
  - Plumb fake tensors up to torchdynamo
  - Plumb ShapeEnv guards up to torchdynamo guards
  - Resolve strategy for sharing ShapeEnv between forward and backwards (support passing in symint as input?)
- Full operator coverage for all benchmark models on the branch
  - Fallback implementation for custom operators without symbolic shape propagation, inferred by running the fallback on real operators (see the sketch after this list)
- All OpInfo tests passing
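To make the custom-operator fallback item above concrete: the idea is to run the real kernel once on real tensors built at the concrete “hint” sizes behind the symbolic ones, and read the output sizes off the result. Below is a minimal sketch with a hypothetical infer_shape_by_fallback helper; this is not the actual PyTorch API, just the shape of the idea.

```cpp
#include <torch/torch.h>
#include <functional>
#include <iostream>

// Hypothetical sketch: infer output sizes for a custom op that has no
// symbolic shape rule by executing the real kernel on a concrete probe.
std::vector<int64_t> infer_shape_by_fallback(
    const std::function<torch::Tensor(const torch::Tensor&)>& custom_op,
    torch::IntArrayRef concrete_sizes) {
  // Materialize a real input at the concrete sizes currently pinned down
  // for the symbolic input.
  torch::Tensor probe = torch::empty(concrete_sizes, torch::kFloat);
  // Run the real operator once; its concrete output sizes are the answer,
  // to be re-associated with the symbolic inputs by the caller.
  torch::Tensor out = custom_op(probe);
  return out.sizes().vec();
}

int main() {
  // Example: a "custom op" that flattens its input.
  auto flatten_op = [](const torch::Tensor& t) { return t.reshape({-1}); };
  auto sizes = infer_shape_by_fallback(flatten_op, {2, 3});
  std::cout << sizes[0] << "\n";  // prints 6
  return 0;
}
```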
Some miscellaneous tactical stuff:
- Redundant guards involving FloorDiv are not simplifying away (seen in resnet18, discovered by @Chillee)
- Fix PT with torchdeploy/multipy by making Python op registration work with multiple Python interpreters (@ezyang)
- Get item() tracing working with symbolic floats for Executorch tracing (Michael Voznesensky)