State of symbolic shapes branch

State of symbolic shapes branch: Dec 12 edition

Previous update: State of symbolic shapes branch - #19 by ezyang

Commit ID at time of writing: bcb284d77fe865373b2f1617867320fb32ea68af

Executive summary

We master now peeps! We are officially retiring the symbolic-shapes branch. There are still some changes that need to be merged to master (Symbolic shapes work items tracker - Google Sheets “Merge to master” sheet), but the major fixes for shape guard creation have landed in master, so all that remains on the branch are some QOL fixes (in particular, the debug interpreter is no longer on by default), a little bit of op coverage, and some experimental code (especially for inductor integration) that needs to be rewritten anyway.

Previous branch diff: 68 files changed, 2612 insertions(+), 554 deletions(-)
Current branch diff: 0 files changed, 0 insertions(+), 0 deletions(-)

Notable bugs

  • It turns out checkpointing doesn’t operate on ShapeEnv, but it should. Dynamo uses checkpoints to roll back its internal state when execution of an instruction (or of many instructions, in the case of an inlined function call) fails, so that it can pretend those instructions never executed. However, because ShapeEnv isn’t checkpointed, shape guards created during the rolled-back instructions still end up getting installed. This can result in a hard error if the guards refer to variables that the outer context doesn’t know about (we think hf_Reformer and swin_base_patch4_window7_224 are affected by this). Checkpointing the ShapeEnv performantly is nontrivial, as we refine the context with equalities and use that to drive sympy simplification, all of which would need to be undone (see the sketch after this list). This bug is still unfixed.
  • Preserve original GraphArgs for shape guard codegen and Rewrite dynamo cond() handling to not recursively call export are both fixes for pretty interesting bugs, if I do say so myself. Go check out their PR descriptions for more details.
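
To make the checkpointing problem concrete, here is a minimal sketch of what snapshot/restore of a shape environment could look like. `ToyShapeEnv`, its fields, and its methods are hypothetical simplifications made up for illustration, not the real ShapeEnv API; the real object carries much more state (symbol sources, sympy replacements used for simplification), which is exactly why doing this performantly is hard.

```python
class ToyShapeEnv:
    """Hypothetical, stripped-down stand-in for Dynamo's ShapeEnv."""

    def __init__(self):
        self.guards = []        # shape guards accumulated during tracing
        self.replacements = {}  # symbol -> simplified expression (drives simplification)

    def checkpoint(self):
        # Snapshot all mutable state. Copying this on every speculatively
        # executed instruction is why a performant version is nontrivial.
        return (list(self.guards), dict(self.replacements))

    def restore(self, snapshot):
        # Roll back to the snapshot, discarding guards installed since then.
        self.guards, self.replacements = list(snapshot[0]), dict(snapshot[1])


env = ToyShapeEnv()
snap = env.checkpoint()
env.guards.append("s0 == s1")  # guard installed while speculating an instruction
env.restore(snap)              # the instruction failed: pretend it never ran
assert env.guards == []
```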

What’s made it to master this week?

ezyang

nkaretnikov

voz

SherlockNoMad

What’s coming next?

By Person:

  • voz: Guard refactor in dynamo
  • ezyang: burn down symbolic shapes, fix bugs, work on exporting all shape expressions to form a benchmark, aot autograd default api maybe?
  • bdhirsh: continue fixing AOTAutograd v2 follow up bugs
  • jbschlosser: merge to master tasks, burn down symbolic shapes
  • unallocated: inductor integration

Our north star:

  • All benchmark models are passing aot_eager and inductor training on branch
  • Fallback implementation for custom operators without symbolic shape propagation, inferred by running fallback on real operators (see the sketch after this list)
  • All OpInfo tests passing
  • Dynamic shapes on by default for developers / users
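
On the fallback item above: the idea is to infer output shapes for operators that have no symbolic propagation rule by running the real operator on concrete tensors. Below is a hypothetical sketch of that; `fallback_shape_prop`, its argument format, and the hint mechanism are all made up for illustration and are not an actual PyTorch API. Note the inferred shape is only valid for the particular hint values, which is why this is a fallback rather than a true symbolic rule.

```python
import torch

def fallback_shape_prop(op, arg_metas, hints):
    # Hypothetical helper: materialize real zero tensors by substituting
    # concrete hint values for each symbolic size, run the real op, and
    # read the output shape off the result.
    real_args = [
        torch.zeros([hints.get(d, d) for d in shape], dtype=dtype)
        for shape, dtype in arg_metas
    ]
    return op(*real_args).shape

# Infer the output shape of matmul for inputs (s0, 16) @ (16, s1),
# using example hint values for the symbolic sizes s0 and s1.
out_shape = fallback_shape_prop(
    torch.matmul,
    [(("s0", 16), torch.float32), ((16, "s1"), torch.float32)],
    hints={"s0": 4, "s1": 8},
)
print(out_shape)  # torch.Size([4, 8])
```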