PrimTorch: How backend/compiler writers interact with various IRs

I’ve been getting familiar with PT 2.0 recently by reading several docs. The following are some of my understandings about the different IRs, and I’m wondering whether they are right:

  1. I could get all the functionality of PyTorch by integrating with any IR, i.e. even if my backend integrates with only the Core ATen IR, the Prims IR, or the Inductor loop-level IR, the backend could still support all the operators.

  2. Since the Core ATen IR is a subset of the ATen operators, the way I use it is the same as before, i.e. through “native_functions.yaml”. (But how can I distinguish which ops are “Core ATen” and which are plain “ATen”? :face_with_monocle:)

  3. As for the Prims IR, I need to implement my compiler in Python, and then dispatch the fused ops to my hardware(?)

The last two questions are about how to integrate with the PrimTorch IRs. I have no confidence in my understanding, as no relevant references have been found :joy:. If I missed any integration-related docs or tutorials, please let me know, thanks very much~

Hi @Minerva_Yu,

No answers, just hints - I am in the same situation as you, trying to understand how I should integrate my compiler with the torch IR. I got some pointers here:

You can play with a custom backend and get a graph with no ATen lowering, or get only ATen ops if you run the AOT compiler. It seems that to get down to Core ATen and prims you can use what @ezyang pointed to there - proxy_tensor, make_fx & TorchRefsMode - but I am not sure how these levels relate to the ones described in IRs — PyTorch master documentation.

Also see the post here: PrimTorch: could we get pure core-aten-ops or prims-ops after aot_autograd - #5 by SherlockNoMad, on how to get the Core ATen IR.

It also seems that you can define the decompositions you want, in Python, and get varying levels of IR, from Core ATen down to prims - see the notebook I linked to in that issue.
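For example, here is a sketch of supplying a decomposition table so that a composite op gets lowered into more primitive ATen ops. Note that `torch._decomp` is a private module whose API may change; `silu` is just an example of an op that has a registered decomposition:

```python
import torch
from torch.fx.experimental.proxy_tensor import make_fx
from torch._decomp import get_decompositions

def f(x):
    return torch.nn.functional.silu(x)

# Build a decomposition table containing only the silu decomposition;
# make_fx then replaces aten.silu in the trace with its decomposed form
# (roughly x * sigmoid(x)).
decomp = get_decompositions([torch.ops.aten.silu])
gm = make_fx(f, decomposition_table=decomp)(torch.randn(4))
print(gm.code)  # no aten.silu node should remain in the graph
```

By choosing which ops go into the table (or using a prebuilt table such as `torch._decomp.core_aten_decompositions()`), you control how far down the IR stack the resulting graph sits.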

For the second question, some new discoveries:
ops in the “native_functions.yaml” file with the tag “core” are the Core ATen IR ops.
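If I understand correctly, these tags are also exposed at runtime on each `OpOverload` via its `.tags` attribute, so you can query them programmatically rather than grepping the YAML. A small sketch (assuming a recent PyTorch where `torch.Tag.core` exists):

```python
import torch

def is_core_aten(op):
    # Each OpOverload carries the tags declared for it in
    # native_functions.yaml, including the "core" tag.
    return torch.Tag.core in op.tags

print(is_core_aten(torch.ops.aten.abs.default))   # abs is tagged core
print(is_core_aten(torch.ops.aten.abs_.default))  # the in-place variant is not
```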

And by trying the in-place variant torch.abs_(x), the FX graph found is as follows:

abs_1: f16[2048] = torch.ops.aten.abs.default(arg0_1)
copy_: f16[2048] = torch.ops.aten.copy_.default(arg0_1, abs_1);  arg0_1 = None

Although I have not found where the abs_ op is decomposed into the abs and copy_ ops. Probably that is some magic in the register_inplace() function.