The "Ideal" PyTorch FLOP Counter (with __torch_dispatch__)

albanD · April 17, 2023, 2:05pm

Hi,

When you do out = mm(x, y) during the forward pass, the backward pass is given gOut and needs to compute gX and gY.
The formulas are gX = mm(gOut, y^T) and gY = mm(x^T, gOut), which explains why you have 2 mms in the backward pass.
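
If it helps, the two formulas can be checked numerically against autograd. This is only a small sketch: the shapes and variable names are made up for illustration, and float64 is used just to keep the comparison tight.

```python
import torch

torch.manual_seed(0)

# Hypothetical small shapes, chosen only for illustration.
x = torch.randn(4, 3, dtype=torch.float64, requires_grad=True)
y = torch.randn(3, 5, dtype=torch.float64, requires_grad=True)

out = torch.mm(x, y)
g_out = torch.randn_like(out)  # stands in for the incoming gradient gOut

# Gradients as computed by autograd.
g_x_auto, g_y_auto = torch.autograd.grad(out, (x, y), grad_outputs=g_out)

# The same gradients written as the two explicit mms from the formulas.
g_x_manual = torch.mm(g_out, y.detach().T)  # gX = mm(gOut, y^T)
g_y_manual = torch.mm(x.detach().T, g_out)  # gY = mm(x^T, gOut)

mm_grads_match = (
    torch.allclose(g_x_auto, g_x_manual)
    and torch.allclose(g_y_auto, g_y_manual)
)
print(mm_grads_match)
```

Each of the two backward mms involves the same three dimensions as the forward one, so for a plain mm the backward pass costs roughly twice the forward FLOPs.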
Also, since convolutions are just a special kind of mm, you get very similar formulas, and that’s where the transpose comes from.
I would recommend you check online for the difference between transposed and regular convolutions. There are blog posts with visualizations that will be much better than anything I could write!
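
As a rough illustration of the convolution case, the sketch below compares autograd's gradient with respect to the input of a conv2d against a transposed convolution of gOut with the same weight. It assumes the simplest setting (stride 1, no padding, no dilation, no groups); the shapes and names are hypothetical.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical shapes: batch 2, 3 input channels, 4 output channels, 3x3 kernel.
x = torch.randn(2, 3, 8, 8, dtype=torch.float64, requires_grad=True)
w = torch.randn(4, 3, 3, 3, dtype=torch.float64)

out = F.conv2d(x, w)           # default stride=1, padding=0
g_out = torch.randn_like(out)  # stands in for the incoming gradient gOut

# Gradient w.r.t. the input as computed by autograd.
(g_x_auto,) = torch.autograd.grad(out, x, grad_outputs=g_out)

# The same gradient written as a transposed convolution of gOut with w.
g_x_manual = F.conv_transpose2d(g_out, w)

conv_grads_match = torch.allclose(g_x_auto, g_x_manual)
print(conv_grads_match)
```

With other strides or paddings, the same arguments would need to be passed to conv_transpose2d (and possibly output_padding), which this sketch does not cover.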

Cheers,
Alban
