Is Dynamo enabled per-thread or per-process?

I’m diving deep into the internals of Dynamo and found a confusing snippet of code:

static int active_dynamo_threads = 0;

static PyObject* increment_working_threads(PyThreadState* tstate) {
  active_dynamo_threads = active_dynamo_threads + 1;
  if (active_dynamo_threads > 0) {
    // ... install the custom eval frame function (body elided) ...
  }
  // ...
}

static PyObject* decrement_working_threads(PyThreadState* tstate) {
  if (active_dynamo_threads > 0) {
    active_dynamo_threads = active_dynamo_threads - 1;
    if (active_dynamo_threads == 0) {
      // ... restore the default eval frame function (body elided) ...
    }
  }
  // ...
}

static PyObject* set_eval_frame(PyObject* new_callback, PyThreadState* tstate) {
  // Change the eval frame callback and return the old one
  //  - None: disables TorchDynamo
  //  - False: run-only mode (reuse existing compiles)
  //  - Python callable(): enables TorchDynamo
  PyObject* old_callback = eval_frame_callback_get();

  // owned by caller

  if (old_callback != Py_None && new_callback == Py_None) {
    decrement_working_threads(tstate);
  } else if (old_callback == Py_None && new_callback != Py_None) {
    increment_working_threads(tstate);
  }

  // Set thread local callback. This will drive behavior of our shim, if/when it
  // is installed.
  eval_frame_callback_set(new_callback);

  is_dynamo_compiling = !(new_callback == Py_None);
  return old_callback;
}

Per my understanding, both the callback and the eval frame function are per-thread variables, so each thread should be able to turn its own eval frame on and off. Why is this controlled by the per-process variable active_dynamo_threads? And since it is shared across threads, shouldn’t it be protected by a lock to avoid race conditions?

The PEP 523 eval frame API in CPython is per-process, not per-thread. This is a bit annoying, and it is exactly what the code you are pointing to is handling.

The reason no lock is needed for active_dynamo_threads is that it would be redundant with the Python GIL: the counter is only touched from C code that already holds the GIL, so its read-modify-write steps cannot interleave across threads.


That’s indeed a bit annoying. Thanks for the clear explanation, Jason; your explanations always make things crystal clear.