I am looking into compiling libtorch into a "forward-only" version, i.e. dropping all the CUDA backward kernels. This would reduce binary size for inference-only deployments. Has anybody tried something like this before?
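For context, this is the kind of trimming I can already do with the existing build switches (a sketch, assuming a from-source build; the `USE_*`/`BUILD_*` environment variables below are standard PyTorch build options, but note that none of them is the dedicated forward-only flag I'm after):

```shell
# Sketch: build PyTorch/libtorch from source with optional components
# disabled to shrink the binary. These env vars are consumed by setup.py;
# they trim features, not backward kernels.
git clone --recursive https://github.com/pytorch/pytorch
cd pytorch
USE_DISTRIBUTED=0 \
BUILD_TEST=0 \
USE_FBGEMM=0 \
USE_NNPACK=0 \
USE_QNNPACK=0 \
python setup.py install
```

Even with all of these off, the CUDA backward kernels still get compiled in whenever `USE_CUDA=1`, which is the part I'd like to strip out.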