State of symbolic shapes branch

State of symbolic shapes: May 6 edition

Executive summary

  • Big grind for dynamic by default. @voz reports that enabling dynamic by default (or, more specifically, static by default, with dynamic automatically turned on at recompile) has turned into a bit of a slog. Some of the reasons: (1) people have been adding new test files to the Dynamo test suite, but those tests have not been simultaneously run with dynamic shapes, so we were running blind there; (2) it’s not risk-free to recompile into dynamic shapes (our plan to derisk here is to remove dynamic_shapes=True but keep automatic dynamic false to start), and in practice some things broke on the full PyTorch test suite + dynamo; (3) there are a lot of conditionals on dynamic_shapes scattered throughout the codebase, and they all have to be rewritten so they no longer branch on it.
  • Accuracy work on dynamic shapes should be unblocked by the improved accuracy minifier. We’ve fixed one bug with the help of the new minifier infra described in “Major updates to the after AOT accuracy minifier!”; hopefully we can nail more!
  • Notable bug fixes.
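
For readers unfamiliar with the "static by default, dynamic on recompile" policy from the first bullet, here is a toy sketch of the idea in plain Python. It does not use or mirror any real Dynamo API: `ToyCompiler`, `run`, and the guard representation are all hypothetical names made up for illustration. The assumption is simply that the first compile specializes on every dimension, and a later cache miss marks whichever dimensions changed as dynamic.

```python
from typing import List, Optional, Tuple

Shape = Tuple[int, ...]
# Hypothetical guard format: an int means "this dim must equal this size",
# None means "any size is accepted" (the dim is treated as dynamic).
Guard = Tuple[Optional[int], ...]


class ToyCompiler:
    """Toy model of 'static by default, dynamic on recompile' (not Dynamo's real logic)."""

    def __init__(self) -> None:
        self.guards: List[Guard] = []
        self.compile_count = 0
        self.last_compiled_shape: Optional[Shape] = None

    @staticmethod
    def _matches(guard: Guard, shape: Shape) -> bool:
        return len(guard) == len(shape) and all(
            g is None or g == s for g, s in zip(guard, shape)
        )

    def run(self, shape: Shape) -> Guard:
        # Cache hit: an existing guard already accepts this shape.
        for guard in self.guards:
            if self._matches(guard, shape):
                return guard

        # Cache miss: we have to (re)compile.
        self.compile_count += 1
        prev = self.last_compiled_shape
        if prev is None or len(prev) != len(shape):
            # First compile (or rank change): specialize on every dim.
            guard: Guard = tuple(shape)
        else:
            # Recompile: dims that differ from the last compile become dynamic.
            guard = tuple(s if s == p else None for s, p in zip(shape, prev))
        self.guards.append(guard)
        self.last_compiled_shape = shape
        return guard


compiler = ToyCompiler()
first = compiler.run((8, 16))    # static compile
second = compiler.run((4, 16))   # dim 0 changed, so it is marked dynamic
third = compiler.run((32, 16))   # served by the dynamic guard, no recompile
```

In this sketch the third call reuses the guard produced by the second, which is the payoff the policy is after: one extra compile instead of one per batch size. The real system also has to decide what is safe to generalize, which is where the risk mentioned in reason (2) comes from.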

CI skips. 0, 0, 0, -2 (no change WoW). We’re planning to use the improved accuracy minifier tools to nail the accuracy failures.

The dashboard (as of 675029a). This week on HUD.

Here are the top line metric changes from this week:

| Metric | Torchbench | Huggingface | TIMM models |
|---|---|---|---|
| Passrate | 86%, 51/59 → 85%, 50/59 :small_red_triangle_down: | 98%, 44/45 | 98%, 61/62 → 100%, 62/62 |
| Speedup | 1.07x → 1.09x | 1.40x → 1.41x | 1.03x → 1.08x |
| Comptime | 87s → 84s | 111s → 110s | 132s → 133s :small_red_triangle: |
| Memory | 0.86x | 1.06x | 1.01x |

The graphs suggest it was a bit of a roller coaster week.

What’s coming next?

  • Voz: still grinding on dynamic by default, also moonlighting on improving Deberta performance
  • Edward: nail some accuracy bugs, think about de-TensorImpl’ification or partial CUDA graphs application, maybe help on HF
  • Horace: optimizing distributed collectives
  • Joel: jagged tensor in inductor