[RFC] Improve Dynamic Shapes Support Across ATen Operators and Expand Test Coverage

kurator14 · June 2, 2026, 5:51pm

I’d like to propose a focused effort to improve dynamic shapes support across ATen operators and expand the corresponding test coverage. I would really appreciate any feedback on the scope, prioritization, or approach – and I am happy to adjust based on the community’s needs.

Motivation

When `torch.compile(dynamic=True)` works well, users get flexible compiled programs that avoid excessive recompilation as tensor dimensions change. In practice, however, many ATen operators still call concrete-shape APIs (`numel()`, `size()`, `sizes()`) instead of their symbolic counterparts (`sym_numel()`, `sym_size()`, `sym_sizes()`), or use `guard_int` / `guard_static_shape` in Inductor lowerings where symbolic arithmetic would suffice.

The result is that users hit unexpected errors like:

RuntimeError: Cannot call numel() on tensor with symbolic sizes/strides

or experience silent recompilation when guards are placed on dimensions that didn’t need them.
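To make the recompilation cost concrete, here is a minimal sketch that counts how many times Dynamo compiles a function as the input length changes. The function `scale_and_sum` and the helper `count_compiles` are hypothetical names made up for this illustration; `CompileCounter` is the counting backend from `torch._dynamo.testing`, and the sizes are chosen to be at least 2 because sizes 0 and 1 are specialized.

```python
import torch
from torch._dynamo.testing import CompileCounter


def scale_and_sum(x):
    # Trivial op chain whose result does not depend on a concrete size.
    return (x * 2).sum()


def count_compiles(dynamic):
    # Hypothetical helper: compile once, call with three different lengths,
    # and report how many frames Dynamo had to compile.
    torch._dynamo.reset()  # start from a clean compile cache
    counter = CompileCounter()
    compiled = torch.compile(scale_and_sum, backend=counter, dynamic=dynamic)
    for n in (4, 8, 16):
        compiled(torch.ones(n))
    return counter.frame_count


static_compiles = count_compiles(dynamic=False)
dynamic_compiles = count_compiles(dynamic=True)
print(f"dynamic=False: {static_compiles} compile(s)")
print(f"dynamic=True:  {dynamic_compiles} compile(s)")
```

With `dynamic=True` the three calls should share a single compiled graph, whereas `dynamic=False` specializes on each new length. An operator that places an unnecessary guard on a dimension pushes the `dynamic=True` case back toward the specialized behavior.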

Four Failure Modes

My audit of the existing xfail lists and C++/Python operator code reveals that operators crash in four distinct ways. Understanding these modes is essential for choosing the right fix:

| Mode | Frequency | Error | Root Cause | Fix Pattern |
|---|---|---|---|---|
| 1: Direct Throw | ~80% | Cannot call numel/sizes/strides on tensor with symbolic sizes | C++ `.numel()`/`.sizes()`/`.size(dim)`/`.strides()` returns a concrete type and throws immediately on a symbolic tensor | Replace with `sym_numel()`, `sym_sizes()`, `sym_size(dim)` |
| 2: Python SymInt Branch | ~10% | GuardOnDataDependentSymNode or unexpected guard | Python `.numel()`/`.size()` already returns a SymInt (no throw), but code such as `if val == 0:` then branches on it | Wrap with `guard_or_false()` or `torch._check()` |
| 3: TensorIterator Reject | ~5% | TensorIterator does not support symbolic shapes | TensorIterator checks the `has_symbolic_sizes_strides_` flag directly | Implement op in `torch/_refs` |
| 4: Implicit SymInt -> int64 | ~5% | Over-specializing guard or throw in `guard_int()` | After fixing Mode 1, the returned SymInt flows into code expecting `int64_t` (loops, pointers, `parallel_for`) | Restructure to avoid needing a concrete value, or decompose |

Most operators fall into Mode 1 and are straightforward to fix. Modes 2-4 require progressively more judgment but are well-understood patterns with existing precedent in the codebase.

Example: PR #182004 (fixing `cross_entropy_loss` for dynamic shapes).

I think a systematic pass would meaningfully improve the `torch.compile(dynamic=True)` experience for users.

Approach

I plan to land this as a series of small, focused PRs – each fixing a handful of related operators and their corresponding test xfails. This keeps each change easy to review, easy to revert if needed, and allows CI to catch any regressions early.

For each operator, the workflow is:

1. Reproduce the failure with a minimal `torch.compile(dynamic=True)` script
2. Fix the C++ operator (replace concrete APIs with symbolic equivalents) or the Inductor lowering (replace `guard_int` with symbolic arithmetic)
3. Remove xfails from the relevant test suites
4. Add targeted regression tests where coverage is thin, following the pattern established in PR #182004
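As a rough illustration of the reproduce and regression-test steps, the sketch below runs an operator under `torch.compile(dynamic=True)` at several sizes, compares against eager, and reports how many graphs were compiled. The helper `check_dynamic_op` is a hypothetical name invented here, not the pattern from PR #182004, and `softmax` is only a stand-in for whichever operator is being fixed. It uses the `aot_eager` backend so that meta kernels see symbolic shapes without needing a C++ toolchain.

```python
import torch
from torch._dynamo.testing import CompileCounterWithBackend


def check_dynamic_op(op, make_input, sizes=(4, 8, 16)):
    # Hypothetical helper: returns the number of compiled frames after
    # checking compiled results against eager for every size.
    torch._dynamo.reset()  # start from a clean compile cache
    counter = CompileCounterWithBackend("aot_eager")
    compiled = torch.compile(op, backend=counter, dynamic=True, fullgraph=True)
    for n in sizes:
        x = make_input(n)
        # A dynamic-shape bug typically surfaces here as a crash or mismatch.
        torch.testing.assert_close(compiled(x), op(x))
    return counter.frame_count


frames = check_dynamic_op(
    lambda x: torch.softmax(x, dim=-1),  # stand-in operator
    lambda n: torch.randn(n, 5),         # only the leading dimension varies
)
print(f"compiled frames: {frames}")
```

A frame count above 1 suggests that some guard specialized on the varying dimension, which is the silent-recompilation symptom described in the motivation. A crash inside `compiled(x)` points at one of the failure modes.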

Benefits

Better user experience: Fewer cryptic crashes and fewer silent recompilations when using `torch.compile(dynamic=True)`
Broader operator coverage: A more complete and reliable dynamic shapes story across the operator surface area
Improved CI signal: Converting xfails to passing tests gives earlier warning if future changes regress dynamic shape support
Incremental and low-risk: Small PRs mean easy review and safe rollback

How You Can Help

Prioritization feedback: If any of the listed operators are particularly important to your workloads, please let us know so we can tackle those first.
Additional operators: Here is our current list of operators [Operators List]. If you’ve encountered dynamic shape issues with operators not listed here, we’d love to hear about them.
Review bandwidth: We’d appreciate reviewer support as we work through the list. Each PR should be small and self-contained.

Thank you for reading! I am excited to chip away at this and would appreciate any thoughts or suggestions.

cc @morrison-turnansky , @groenenboomj

laithsakka · June 8, 2026, 4:40pm

LGTM. Thank you for working on this!
