What's the focus of each?
The post is trying to clarify exactly that. To summarize: pt2e is mostly for static quantization use cases, and torchao covers the others.
Here's my guess: is it true that if I want to deploy on an edge device for inference, I will need to export the model, which makes pt2e the better choice? And for LLM training, torchao is better because of its more modern features?
It's true that edge use cases mostly go through pt2e quant so far, but we also have edge + LLM use cases that use the torchao (quantize_) API, so it depends on the type of quantization you'd like to do, e.g. static vs. dynamic, weight-only, or more advanced ones like AWQ, GPTQ, SmoothQuant, etc.
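To make the weight-only case mentioned above concrete, here is a minimal, self-contained sketch of symmetric per-tensor int8 weight-only quantization. This is an illustration of the general idea only, not the torchao implementation (in torchao you would call `quantize_` with a weight-only config instead); the function names here are hypothetical.

```python
# Hypothetical sketch of symmetric int8 weight-only quantization.
# Weights are mapped to int8 values in [-127, 127] with one
# per-tensor scale; activations would stay in float.

def quantize_int8_symmetric(weights):
    """Quantize a list of float weights to int8 with a shared scale."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from int8 values."""
    return [qi * scale for qi in q]

w = [0.5, -1.27, 0.03]
q, scale = quantize_int8_symmetric(w)
w_hat = dequantize(q, scale)
# each element of w_hat is within scale/2 of the original weight
```

Static quantization differs in that activations are also quantized, using scales calibrated ahead of time, which is why it pairs naturally with the export-based pt2e flow.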
And thanks for your pt2e example! It looks like it has more features than torch.ao. So no matter whether I choose a pt2e or a torchao quant method, I can consider torch.ao retired?
That's true, we are deprecating torch.ao.quantization; more details are in the Torch.ao.quantization Migration Plan.