Forward-only library

I am looking into compiling libtorch into a “forward-only” version, e.g. by dropping all the CUDA backward kernels. This would result in smaller binaries for inference. Has anybody tried something like this before?