Generalizing AMP to work on CPU

Intel is interested in bringing automatic mixed precision to CPU; see [RFC] Extend Autocast to CPU/CUDA with BF16 data type · Issue #55374 · pytorch/pytorch · GitHub. One big question is what the autocasting API should look like for CPU: should we provide a single, generalized API, torch.autocast (keeping in mind that CPU autocasting would use bfloat16, while the existing GPU autocasting uses float16), or provide separate APIs for CPU and CUDA? If you have any thoughts or opinions on the subject, please chime in on the issue.
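For concreteness, here is a sketch of what the two designs under discussion might look like in user code. The names and signatures are illustrative of the proposal, not settled API; only torch.cuda.amp.autocast exists today.

```python
import torch

# Option A: one generalized API, parameterized by device type and dtype.
# (Proposed signature from the RFC discussion -- illustrative only.)
with torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    a = torch.randn(8, 8)
    b = torch.randn(8, 8)
    c = torch.mm(a, b)  # matmul would run in bfloat16 under CPU autocast

# Option B: separate per-backend context managers, mirroring the
# existing CUDA-only API:
#   with torch.cuda.amp.autocast():  # float16 on GPU (exists today)
#   with torch.cpu.amp.autocast():   # hypothetical bfloat16 CPU variant
```

Either way, ops on the autocast allowlist (like the matmul above) would execute in the lower-precision dtype, while precision-sensitive ops stay in float32.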