State of symbolic shapes branch

State of symbolic shapes: Dec 19 edition

Previous update: State of symbolic shapes branch - #20 by ezyang

Commit ID at time of writing: 212873c615dd3455a24d390605335aeeebd76236

Executive summary

This week, we turned on dynamic shapes with aot_eager in the inductor CI job. Compared with static shapes aot_eager, we have only a 17-failure difference on master! Inductor itself remains in bad shape on master, as we are still waiting on @Chillee to submit his PR with fixes.

In other news, @ezyang has released a benchmark suite for reasoning about shape computation: GitHub - ezyang/SMT-LIB-benchmarks-pytorch-shapes: SMT-LIB benchmarks for shape computations from deep learning models in PyTorch. If you work on SMT solvers or like symbolic reasoning systems, check it out! It offers an easy way to test new ideas about how to symbolically reason over shape compute. We still have a number of infinite loops in Sympy; as of this week, we simply suppress all stack overflows induced by Sympy.
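
To give a flavor of what these benchmarks exercise, here is a minimal sketch using Z3's Python bindings (pip install z3-solver) to decide a toy shape constraint; the variables and the constraint are invented for illustration and are not drawn from the benchmark suite itself:

    from z3 import Int, Solver, sat

    # Symbolic sizes for two tensor dimensions.
    s0, s1 = Int("s0"), Int("s1")

    solver = Solver()
    solver.add(s0 > 1, s1 > 1)      # sizes are non-trivial
    solver.add((s0 * s1) % 4 == 0)  # e.g. a view(-1, 4) must divide evenly

    if solver.check() == sat:
        print(solver.model())  # one concrete assignment satisfying the guards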

  • Model training status on master. See also Symbolic shapes work items tracker - Google Sheets
  • OpInfo tests on symbolic shapes.
    • pytest test/test_proxy_tensor.py -k test_make_fx_symbolic_exhaustive - 513 passed (+5 WoW), 522 skipped (no change), 227 xfailed (-3 WoW)
    • pytest test/functorch/test_aotdispatch.py -k test_aot_autograd_symbolic_exhaustive - 286 passed (+5 WoW), 142 skipped (+1 WoW), 203 xfailed (-5 WoW)

Notable bugs

  • Despite overhauling ShapeEnv guard production in Dynamo two weeks ago, there were still more stragglers to address this week. The main source of problems was a mismatch between when we add a tensor to GraphArgs (because it is an FX graph input) and when we allocate dynamic shapes for a tensor (at which point we may need to determine the source of its symbolic shape). This led to more refactoring in Dynamo so that we can now guarantee that whenever a tensor has symbolic shapes allocated for it, we also track it for the purposes of guard creation; the first sketch after this list illustrates the invariant. This fixed all bugs except one(!), for which @ezyang has an open PR set (involving more refactoring).
  • The assert that graphs are functional is FINALLY in master, and it caught more bugs in inductor lowerings when it landed. Hooray for more stringent asserts. The second sketch after this list shows what the assert checks.
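
First, a deliberately simplified sketch of the GraphArgs invariant described above. Every name here (GraphArgs, ShapeEnv, create_symbol, allocate_symbolic_shapes) is a hypothetical stand-in rather than Dynamo's actual API; the point is only that guard tracking and symbol allocation happen at a single choke point:

    import itertools

    class ShapeEnv:
        def __init__(self):
            self._names = itertools.count()
            self.symbol_sources = []  # (symbol, source) pairs consulted at guard time

        def create_symbol(self, hint, source):
            sym = f"s{next(self._names)}"
            self.symbol_sources.append((sym, source))
            return sym

    class GraphArgs:
        def __init__(self):
            self.sources = {}  # tensor id -> source expression, e.g. "L['x']"

        def track(self, tensor, source):
            self.sources[id(tensor)] = source

    def allocate_symbolic_shapes(shape_env, graph_args, tensor, source):
        # The invariant: track the tensor for guard creation *before*
        # allocating any symbols for it, in one place.
        graph_args.track(tensor, source)
        return [shape_env.create_symbol(size, source) for size in tensor.shape]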
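
Second, a minimal sketch of what "the graph is functional" means operationally: after functionalization, no node in the traced FX graph should call a mutating op. The trailing-underscore check is an illustrative heuristic, not the actual assert that landed in master:

    import torch
    import torch.fx

    def assert_functional(gm: torch.fx.GraphModule):
        for node in gm.graph.nodes:
            if node.op in ("call_function", "call_method"):
                name = getattr(node.target, "__name__", str(node.target))
                # In-place ATen ops conventionally end with an underscore.
                assert not name.endswith("_"), f"mutating op in graph: {name}"

    gm = torch.fx.symbolic_trace(lambda x: x.add(1))  # functional: passes
    assert_functional(gm)  # x.add_(1) in the traced fn would trip the assert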

What’s made it to master this week?

ezyang

jbschlosser

voz

bdhirsh

What’s coming next?

By Person:

  • voz: vacation
  • ezyang: vacation
  • bdhirsh: continue AOTAutograd v2 follow up
  • jbschlosser: merge to master and burn down
  • Chillee: inductor integration (apparently, Horace “has a few fixes”, but they haven’t been posted yet)

Our north star:

  • All benchmark models are passing aot_eager and inductor training on branch
  • Fallback implementation for custom operators without symbolic shape propagation, inferred by running the fallback on real operators (see the sketch after this list)
  • All OpInfo tests passing
  • Dynamic shapes on by default for developers / users
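
For the fallback item above, a hedged sketch of the idea: when a custom operator has no symbolic shape propagation rule, materialize concrete tensors from the size hints, run the real operator, and read the output shape off the result. The helper below (infer_output_shape) is invented for illustration:

    import torch

    def infer_output_shape(op, *args):
        # Materialize zero-filled stand-ins for tensor arguments; in the real
        # system these sizes would come from the symbolic tensors' hints.
        concrete = [
            torch.zeros(a.shape, dtype=a.dtype) if isinstance(a, torch.Tensor) else a
            for a in args
        ]
        return op(*concrete).shape  # adopt this shape for the symbolic output

    # Example: infer matmul's output shape without a dedicated shape rule.
    print(infer_output_shape(torch.mm, torch.zeros(3, 4), torch.zeros(4, 5)))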