NNC Per-Operator Benchmarks (on CPU)

albanD · January 27, 2021, 6:27pm

That makes sure no graph is created, but most of the autograd logic still runs (the view/inplace handling in particular).
So it would only change the fixed dispatcher overhead. It won't change these results much, beyond making PyTorch more competitive at small sizes.
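
To make the "fixed overhead only matters at small sizes" point concrete, here is a toy cost model, not a measurement. It assumes each op call costs a constant per-call overhead plus work proportional to the number of elements. The function names and all the numbers (5 µs vs. 3 µs overhead, 0.001 µs per element) are made up for illustration and are not taken from the benchmarks in this thread.

```python
def call_time_us(n_elements, fixed_overhead_us, per_element_us=0.001):
    """Toy model: time per call = fixed per-call overhead + size-proportional work.

    All constants are hypothetical; they are not measured PyTorch numbers.
    """
    return fixed_overhead_us + per_element_us * n_elements


def speedup_from_lower_overhead(n_elements, old_overhead_us=5.0, new_overhead_us=3.0):
    """Ratio of modelled call times before and after trimming the fixed overhead."""
    before = call_time_us(n_elements, old_overhead_us)
    after = call_time_us(n_elements, new_overhead_us)
    return before / after


if __name__ == "__main__":
    # Small tensors are dominated by the fixed term, large ones by the work term.
    for n in (16, 1024, 1_000_000):
        print(f"n={n:>9}: modelled speedup {speedup_from_lower_overhead(n):.3f}x")
```

Under these assumed constants the modelled speedup is large for tiny inputs and shrinks toward 1x as the size grows, which is the shape of the argument above: trimming per-call overhead shifts the small-size end of the curve and leaves the large-size results essentially where they were.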
