State of PT2: Jul 6, 2024
I’m back after having a baby.
- torch.compile, the missing manual is out! I think it’s pretty cool. You should read it if you do anything at all with torch.compile/PT2.
- We’ve been internally deploying remote cache at Meta, and one of the interesting bumps we’ve hit along the way is that if a single rank compiles much faster because it cache hits while other ranks cache miss, this is an easy way to end up with an NCCL timeout! I wrote about this publicly a bit in torch.compile, the missing manual, but we are still discussing how exactly to deal with cases where compilation happens asymmetrically across ranks. The short-term idea most likely to work is a way to add time to a rank’s NCCL timeout when compilation happens on other ranks, but we’re also looking at adding more restrictions to compilation in a distributed setting, e.g., all ranks must compile at the same time.
- Notable fixes:
- Make sympify’ing SymInt/etc produce their sympy expression - We’re pretty sure this will nail the dreaded “125.000000 showing up in integer position” bug
- Print float with full precision, don’t truncate - This is a pretty funny problem that was discovered by Animesh tightening up guards. It should fix excess recompilation in some cases, although it’s not clear this was ever actually a problem in real-world code.
- Enable TORCH_TRACE by default on Conda on Mast - Someone at Meta please tell me if this actually worked or not LOL. (Note: you can easily manually turn it on by running your job with TORCH_TRACE=/logs, so it’s not a big deal now)
- Stop immediately specializing common constants 0/1 for plain int - We nearly gave up on this one but it turned out we didn’t need to fix that many bugs. Nice to have, especially helps with things like step counters.
- Make sym_min/sym_max handle Numpy scalars - IDK man, solves an internal problem, but y’all using np.int64 you’re weird LOL
- Stop updating hints - Beefy performance fix for unbacked code!
- Don’t mark conversion to float as is_integer = False - This fixes a nasty False != True assertion error. On close inspection Sympy’s behavior here makes sense, but it was very unintuitive!
- Fix typo in floordiv solver code that affects flipped relation - This one’s a real howler, I’m glad it was easy to fix
- Correctly put mark_unbacked symbols in shape_env_to_source_to_symbol_cache - mark_unbacked finally works with this; it didn’t really work before.
- Make are_strides_like_channels_last size oblivious - Makes channels last code with unbacked SymInts work better
- Add mark_unbacked - This got landed while I was on vacation, but it’s pretty nice: it essentially treats an input size as unbacked, so you won’t recompile for the 0/1 cases.
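
To make the "Print float with full precision, don’t truncate" fix above a bit more concrete, here is a toy sketch of why a truncated float in a guard could cause excess recompilation. This is not PyTorch’s actual guard code: make_guard and guard_passes are hypothetical helpers that only model a guard rendered as source text and then re-evaluated against the same input.

```python
# Toy model of a value guard rendered as source text.
# make_guard / guard_passes are hypothetical names, not PyTorch APIs.

def make_guard(value: float, full_precision: bool) -> str:
    # repr() round-trips a Python float exactly;
    # the "f" format spec keeps only six digits after the decimal point.
    text = repr(value) if full_precision else f"{value:f}"
    return f"x == {text}"

def guard_passes(guard_src: str, x: float) -> bool:
    # Re-evaluate the printed guard against a runtime value.
    return eval(guard_src, {"x": x})

x = 1 / 3
truncated = make_guard(x, full_precision=False)
exact = make_guard(x, full_precision=True)

# The truncated guard rejects the very value it was built from,
# which in a real system would look like a guard failure and a recompile.
print(truncated, guard_passes(truncated, x))
print(exact, guard_passes(exact, x))
```

In this model, the guard printed with six decimals fails on the exact value that produced it, while the repr-based guard passes. Whether real guards ever hit this in practice is, as noted above, unclear.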