TorchDynamo: An Experiment in Dynamic Python Bytecode Transformation

Hi! I am interested in the training support. Will it include the optimizer? If so, will the weight update logic be separated out, or will everything (fwd+bwd+optim) be captured in a single graph?