I’m diving deep into the internals of dynamo and found a confusing snippet of code:
static int active_dynamo_threads = 0;

static PyObject* increment_working_threads(PyThreadState* tstate) {
  active_dynamo_threads = active_dynamo_threads + 1;
  if (active_dynamo_threads > 0) {
    enable_eval_frame_shim(tstate);
  }
  Py_RETURN_NONE;
}

static PyObject* decrement_working_threads(PyThreadState* tstate) {
  if (active_dynamo_threads > 0) {
    active_dynamo_threads = active_dynamo_threads - 1;
    if (active_dynamo_threads == 0) {
      enable_eval_frame_default(tstate);
    }
  }
  Py_RETURN_NONE;
}
static PyObject* set_eval_frame(PyObject* new_callback, PyThreadState* tstate) {
  // Change the eval frame callback and return the old one
  //  - None: disables TorchDynamo
  //  - False: run-only mode (reuse existing compiles)
  //  - Python callable(): enables TorchDynamo
  PyObject* old_callback = eval_frame_callback_get();

  // owned by caller
  Py_INCREF(old_callback);
  if (old_callback != Py_None && new_callback == Py_None) {
    decrement_working_threads(tstate);
  } else if (old_callback == Py_None && new_callback != Py_None) {
    increment_working_threads(tstate);
  }

  Py_INCREF(new_callback);
  Py_DECREF(old_callback);

  // Set thread local callback. This will drive behavior of our shim, if/when it
  // is installed.
  eval_frame_callback_set(new_callback);
  is_dynamo_compiling = !(new_callback == Py_None);
  return old_callback;
}
Per my understanding, both the callback and the eval_frame function are per-thread (thread-local) variables, so each thread should turn its own eval_frame on or off. Why is this controlled by the per-process variable active_dynamo_threads? And since that counter is shared across threads, shouldn't it be protected by a lock to avoid a race condition?
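To make the race I'm worried about concrete: if active_dynamo_threads can really be updated by two threads at once, I would expect at least something like the following. This is a hypothetical, self-contained sketch using C11 atomics; the stub functions and the _atomic names are mine, not from the dynamo source:

#include <stdatomic.h>
#include <stdio.h>

/* Hypothetical stand-ins for the per-thread toggles in eval_frame.c. */
static void enable_eval_frame_shim_stub(void) {
  puts("shim enabled for this thread");
}
static void enable_eval_frame_default_stub(void) {
  puts("default eval frame restored for this thread");
}

/* Process-wide counter, made atomic so concurrent updates cannot be lost. */
static atomic_int active_dynamo_threads = 0;

static void increment_working_threads_atomic(void) {
  /* Atomic read-modify-write instead of a plain `x = x + 1`. */
  atomic_fetch_add(&active_dynamo_threads, 1);
  enable_eval_frame_shim_stub(); /* the count is necessarily > 0 here */
}

static void decrement_working_threads_atomic(void) {
  /* fetch_sub returns the previous value: 1 -> 0 means this was the last
     thread using dynamo (the underflow guard from the original is omitted). */
  if (atomic_fetch_sub(&active_dynamo_threads, 1) == 1) {
    enable_eval_frame_default_stub();
  }
}

int main(void) {
  increment_working_threads_atomic();
  decrement_working_threads_atomic();
  return 0;
}

Or is the assumption that set_eval_frame is only ever called while some interpreter-level lock is already held, so the plain int is safe in practice?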