Training support is not finished yet, but I don’t see anything about optimizers that would prevent them from being captured with minor tweaks.
Regarding single whole-program graphs: TorchDynamo often produces a single graph, but there is no guarantee you will get a whole-program graph, and that is not the goal. The design philosophy is mixed-mode execution that works with Python and prioritizes preserving the usability of PyTorch. Tons of things will result in graph breaks, including: converting tensors to Python types (e.g. Tensor.item, Tensor.tolist, torch.any, etc.); calling external C libraries (e.g. numpy); printing/logging; data-dependent control flow (e.g. early stopping in a training loop); constructing custom Python classes; and more (a small example is sketched below). If you absolutely require whole-program graphs above all else, then a different approach, like AOT tracing or Lazy Tensors, might be a better fit.
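
To make that concrete, here is a minimal sketch of the kind of code that triggers graph breaks. It assumes a PyTorch build where TorchDynamo is exposed through torch.compile; the function f and its contents are made up for illustration. Dynamo compiles the tensor work on either side of each break and falls back to regular Python for the pieces in between, so the code still runs correctly, just as multiple graphs rather than one.

```python
import torch

def f(x):
    y = x.sin()
    # .item() pulls a Python float out of a tensor -> graph break,
    # and the `if` below branches on that data-dependent Python value
    if y.sum().item() > 0:
        y = y * 2
    # printing is a Python side effect Dynamo does not capture -> graph break
    print("intermediate shape:", y.shape)
    return y.cos()

compiled_f = torch.compile(f)        # TorchDynamo with the default backend
out = compiled_f(torch.randn(8))     # runs fine, just not as a single graph
```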