Which papers are relevant for understanding how the core of PyTorch, the GPU kernels, and torch.compile are put together?

Hi, I’m a newbie embarking on a project to study PyTorch holistically. At this point I am mostly interested in the core of PyTorch, the CUDA kernels, and the torch.compile project. I am already somewhat familiar with CUDA kernels, as my job involves some hardware-aware optimizations.

As a first step, I want to take a step back and read some relevant papers to understand how PyTorch is put together before doing a deep dive into the code.

These are the papers I have found on the topic:

  • PyTorch: An Imperative Style, High-Performance Deep Learning Library
  • PyTorch 2: Faster Machine Learning Through Dynamic Python Bytecode Transformation and Graph Compilation
  • Automatic differentiation in PyTorch

For completeness' sake, I wanted to ask whether there are other papers or documentation worth reading. In particular, I cannot find any papers that detail the system design decisions of aten-CUDA. Are there also any papers on torch.inductor, torch.dynamo, and everything else related to torch.compile?

EDIT: I have since read the first two papers. I see that the second paper covers torch.inductor and torch.dynamo.

I’m not sure if this is exactly what you want, but there is a well-known and quite good blog post by one of the PyTorch devs: "PyTorch internals" on ezyang’s blog.

He also hosts a great podcast where he explains the design decisions and history behind various PyTorch libraries: https://pytorch-dev-podcast.simplecast.com/episodes

It may not be specifically about torch.compile or how PyTorch goes down to CUDA kernels, but maybe it’s helpful.

I haven’t read the papers you’ve listed, but as a PyTorch beginner myself, it’s most welcome to see you share those. I will read them myself now :)

Update: Oh, and here’s another amazing resource with some PyTorch folks involved: https://www.youtube.com/channel/UCJgIbYl6C5no72a0NUAPcTA. Again, it may not be 100% what you’re specifically asking for here, but it’s a great resource for learning.

Thanks, I found the blog quite helpful. I also found that the wiki on the project's GitHub page contains a lot of information. Unfortunately, it is a bit all over the place. I was hoping there would be more organized documentation of PyTorch internals, but I suspect that, given the way the project evolved, that was never possible.

There is also https://docs.pytorch.org/assets/pytorch2-2.pdf

Perfect, thanks. I have already started studying it.

I came across these that might be relevant:

The onboarding docs also have some labs, but the CUDA ones seem to be internal-only.

Hope this helps.

Thanks, I didn’t have those yet. Yeah, I think I can get quite far just by reading the source, and then I’ll ask on the forum when I need advice.