I spent some time investigating 219 reverts from the week of 05/03/2021 through 10/27/2021.
Some highlights:
- The top reason (29.7%) for reverts is missing PR signal.
- Breaking down the missing signal, we see a plethora of signal types with no clear majority.
- Some of the missing signals are now a thing of the past. We have enabled force_on_cpu tests and Windows CUDA smoke tests on PRs, addressed the docs jobs inconsistency, and disabled tbb tests and target determination on PRs.
- Almost 30% of the reverted commits ignored OSS signal. It is worth tackling why this is the case. Is it flakiness? Did the signals take too long?
- Further breaking down by ignored signal, we see ~25% comes from lint + mypy + clang-tidy, which are all relatively fast signals.
- 26.2% broke multiple jobs!
- Something that surprised me: a significant 12.3% of reverts were due to internal breakages not captured by OSS signal.
- Note: of the 11 "changes not exported" reverts, only 1 dates after the full enablement of the pytorch-diff-not-exported Sandcastle signal.
DISCLAIMER: these stats were manually sorted and thus are subject to error despite my utmost efforts. You can find the raw stats here: https://docs.google.com/spreadsheets/d/14aQ2HBg8gyNlqLZdGorkm7uBh4sIzY4Ux8O6XSocLMs/edit#gid=0&fvid=1408746951
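If you want to double-check the percentages against the raw sheet, here is a minimal sketch of how a breakdown like the one above could be recomputed. It assumes you have exported one reason label per revert into a plain list; the function name `reason_breakdown` and the sample labels are hypothetical and are not taken from the spreadsheet.

```python
from collections import Counter


def reason_breakdown(reasons):
    """Return {reason: percentage of all reverts}, rounded to one decimal.

    `reasons` is a list with one label per revert (hypothetical format).
    """
    counts = Counter(reasons)
    total = sum(counts.values())
    # most_common() orders reasons from most to least frequent.
    return {reason: round(100 * n / total, 1) for reason, n in counts.most_common()}


# Toy labels for illustration only; these are NOT the real revert data.
sample = ["missing signal", "ignored signal", "missing signal", "internal breakage"]
print(reason_breakdown(sample))
```

Because the labels were assigned by hand, rerunning a tally like this after correcting a label is a quick way to see how much any single misclassification moves the numbers.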