Just to state another valid solution: is this an opportunity to introduce a torch.optim2 that cleans up some of these problems with the existing APIs, while not giving a damn about backward compatibility (because we use a new namespace)?
Or is the change not new and innovative enough to warrant the cost of introducing a v2 API?