Hi, I am developing a memory-saving algorithm for PyTorch.
It basically goes like this:
while (not_enough_memory()) { free_memory(); }
However, this does not really work in extreme cases:
suppose a single operator uses too much memory; PyTorch will then OOM without ever calling free_memory().
I added hooks in the CUDA caching allocator to call free_memory(), but that turns out not to be enough: cuBLAS also tries to malloc memory, and returns CUBLAS_STATUS_NOT_INITIALIZED when its malloc fails.
Is there a way to catch this (and potentially other errors) with an exception in C++? I could then free up some memory and retry the offending memory-allocating computation.