| FMAs (and softmax (and floating point)) considered harmful | 2 | 760 | November 21, 2024 |
| CUDAGraphs in Pytorch 2.0 | 6 | 5295 | November 20, 2024 |
| Where do the 2000+ PyTorch operators come from?: More than you wanted to know | 13 | 14277 | November 15, 2024 |
| Compiled Optimizer w/ LR Scheduler Now Supported | 3 | 577 | November 13, 2024 |
| Understanding dynamic shapes and guards and when it does/does not cause graph breaks | 1 | 320 | November 7, 2024 |
| What's the difference between `next_variable()` and `reconstruct()` in `IteratorVariable` | 2 | 68 | October 26, 2024 |
| Support for _set with other mutations in graph | 2 | 146 | October 18, 2024 |
| Is it possible to disable inlining of custom module for torch.compile? | 1 | 262 | October 11, 2024 |
| Impact of multithreading and local caching on torch.compile | 3 | 831 | September 27, 2024 |
| TorchInductor Update 9: Harden Vectorization Support and Enhance Loop Optimizations in TorchInductor CPP Backend | 0 | 533 | September 4, 2024 |
| TorchInductor Update 8: Max-autotune Support on CPU with GEMM Template | 0 | 505 | September 4, 2024 |
| PyTorch Runtime Error with Compiled Autograd | 1 | 295 | August 31, 2024 |
| Pytorch to Triton for Non-GPU Devices | 7 | 1471 | August 30, 2024 |
| Difference between the graph break reasons: `Dynamic control flow is not supported at the moment.` and `generic_jump TensorVariable()` | 0 | 324 | August 30, 2024 |
| Defining the Core ATen Opset | 12 | 5818 | August 21, 2024 |
| Why core aten IR doesn’t contain scalar_tensor overload version of some ops, like bitwise-like ops, remainder and so on? | 0 | 39 | August 21, 2024 |
| How to Access Triton Kernels from TorchInductor when running on CPU? | 1 | 699 | August 12, 2024 |
| Connecting PyTorch sparse tensors with MLIR | 4 | 1221 | August 8, 2024 |
| JIT scripting & Autocast | 12 | 3601 | August 8, 2024 |
| PyTorch/XLA 2.4 dev update | 0 | 396 | August 6, 2024 |
| How to get the backward graph while using torch.export? | 4 | 364 | July 29, 2024 |
| TorchInductor: a PyTorch-native Compiler with Define-by-Run IR and Symbolic Shapes | 46 | 68682 | July 29, 2024 |
| How to set wrap function using TorchDynamo graph capture? | 4 | 457 | June 28, 2024 |
| Understanding torch.fx.traceback.preserve_node_meta() | 0 | 136 | July 26, 2024 |
| Supporting Dynamo in Python 3.12 | 0 | 835 | July 26, 2024 |
| Inlining a custom triton kernel | 0 | 172 | July 22, 2024 |
| Compiled autograd with custom ops error | 1 | 308 | July 19, 2024 |
| AOTAutograd incorrect lowering composite ops in inference_mode | 3 | 339 | July 17, 2024 |
| [RFC] Performance profiling at scale with detailed NVTX annotations | 0 | 446 | July 10, 2024 |
| State of symbolic shapes branch | 96 | 32169 | July 7, 2024 |