A bit delayed, but: the 1.11 release contains quite a few commits, and some of them are interesting for people who develop within PyTorch.
You can find below a curated list of these changes:
Developers
Python API
- `OpInfo` improvements:
  - More operators now have `OpInfo` tests (see the sketch after these lists):
    - Added `OpInfo` for `nn.functional.batch_norm` (#63218)
    - Added `OpInfo` for `torch.argsort` (#65454)
    - Added `OpInfo` for `torch.repeat_interleave` (#65455)
    - Added `OpInfo` for 2d fft functions (#66128)
    - Added `OpInfo`s for avg pooling ops (#64214)
    - Added `OpInfo` for `torch.bucketize` (#65821)
    - Added `OpInfo`s for `isfinite`, `isinf`, `isposinf`, `isneginf`, `isnan`, `isreal` (#66400)
    - Added `OpInfo` for `torch.nn.functional.pairwise_distance` (#65460)
    - Added `OpInfo` for `torch.nn.pixel_shuffle` (#65467)
    - Added `OpInfo` for `torch.nn.pixel_unshuffle` (#65468)
    - Added `OpInfo` for `torch.bincount` (#65796)
    - Added `OpInfo` for `norm` ops (#67442, #68526)
    - Added `OpInfo` for `torch.nn.functional.gaussian_nll_loss` (#67356)
    - Added `OpInfo` for `nn.functional.hinge_embedding_loss` (#67381)
    - Added `OpInfo` for `nn.functional.gaussian_nll_loss` (#67376)
    - Added `OpInfo` for `nn.functional.poisson_nll_loss` (#67371)
    - Added `OpInfo` for `nn.functional.ctc_loss` (#67464)
    - Added `OpInfo` for `nn.functional.cosine_embedding_loss` (#67465)
    - Added `OpInfo` for `adaptive_max_pool` (#67405)
    - Added `OpInfo` for `logical_or`, `logical_and`, `logical_xor` (#67178)
    - Added `OpInfo` for `torch.allclose` (#68023)
    - Added `OpInfo` for `nn.functional.cross_entropy` (#63547)
    - Added `OpInfo` for `torch.nn.bilinear` and `torch.nn.glu` (#67478)
    - Added `OpInfo` for `torch.histc` (#67452)
    - Added `OpInfo`s for `stft`, `istft`, `fftshift`, `ifftshift` (#68198)
    - Added `OpInfo`s for the "Elementwise Binary II" parcel of operators (#68085)
    - Added `OpInfo` for `torch.linalg.tensorsolve` (#68810)
    - Added `OpInfo` for `torch.nn.functional.kl_div` (#65469)
    - Added `OpInfo` for `torch.diagflat` (#65680)
    - Added `OpInfo`s for some Tensor dtype conversion methods (#64282)
    - Added `OpInfo` for `*_like` functions (#65941)
    - Added `OpInfo` for `torch.unique` and `torch.unique_consecutive` (#67529)
    - Added `OpInfo` for `new_*` functions and some `*_like` functions (#67357)
    - Added `OpInfo` for `torch.nonzero` (#67459)
    - Added `OpInfo`s for `torch.atleast_{1d, 2d, 3d}` (#67355)
    - Added `OpInfo` for `embedding_bag` (#67252)
    - Added `OpInfo`s for `combinations`, `cartesian_prod`, `sum_to_size`, `ldexp`, and `as_strided` (#68853)
    - Added `OpInfo`s for misc `nn.functional` operators (#68922)
    - Added `OpInfo` tests for `(svd|pca)_lowrank` (#69107)
    - Added `OpInfo` for `nn.functional.dropout2d` and revised sample inputs for `dropout` (#67891)
    - Added `OpInfo`s for `normal`, `bernoulli`, `multinomial` (#66358)
    - Added `OpInfo`s for `flatten` and `column_stack` (#69237)
  - Other improvements to `OpInfo` testing:
    - Added inplace_variant for the `resize_` `OpInfo` (#66135)
    - Added a reference vs. noncontiguous `OpInfo` test (#67434)
    - Split channels_last test cases for tensor conversion `OpInfo`s (#67368)
    - Removed `OpInfo` non-contig inputs (#67677)
    - Improved the `OpInfo` test for norm ops: made inputs independent
    - [opinfo] Used `dtypes` instead of `dtypesIfCPU` (#68732)
    - Fixed the gradient `OpInfo` for Python 3.10 (#68113)
    - OpInfo: Converted more `sample_input_func`s to generators (#69976)
    - Updated `poisson_nll_loss` `OpInfo` samples (#70300)
    - Removed unnecessary skips in the `rsub` `OpInfo` (#69973)
    - Merged `index_{add,fill,copy,select}` `OpInfo` sampling (#68184)
    - Labeled more elementwise binary operators correctly as `BinaryUfuncInfo`s (#71622)
    - Deactivated the tracking of gradients in sampling functions within `OpInfo`s (#68522)
    - Removed the special FX `OpInfo` list (#67520)
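For context, adding an `OpInfo` means registering the operator in the test-op database together with its supported dtypes and a sample-input generator. Below is a minimal sketch; it uses the private testing API in `torch/testing/_internal/common_methods_invocations.py`, so the import path and field names may change between releases, and the `sample_inputs_sigmoid` helper is purely illustrative:

```python
import torch
# Private testing API; import path and fields may change between releases.
from torch.testing._internal.common_methods_invocations import OpInfo, SampleInput

def sample_inputs_sigmoid(op_info, device, dtype, requires_grad, **kwargs):
    # Yield a few representative inputs; generators are preferred (#69976).
    def make(shape):
        return torch.randn(shape, device=device, dtype=dtype,
                           requires_grad=requires_grad)
    yield SampleInput(make((2, 3)))
    yield SampleInput(make(()))  # scalar tensor case

entry = OpInfo(
    "sigmoid",                              # resolved as torch.sigmoid
    dtypes=(torch.float32, torch.float64),  # dtypes to test
    sample_inputs_func=sample_inputs_sigmoid,
)
```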
- More informative messages for comparisons against `None` types (#69802)
- Killed the `test_torch.py` mixin and created `test_scatter_gather_ops` (#71691)
- Relaxed tolerance on ROCm for `test_noncontiguous_samples_matmul` (#67593)
- Added support for automated error and warning testing (#67354)
- Skipped the forward-over-reverse gradgrad check for `pinv` singular on CUDA (#70123)
- Made the meta tensor data access error message more expressive in `assert_close` (#68802)
- Removed skips from determinant tests (#70034)
- Refactored repetitions into `TorchVersion._cmp_wrapper` (#71344)
- Expect `test_fn_fwgrad_bwgrad` to fail because forward AD is not implemented (#71944)
- Some Python tensor subclass improvements (see the sketch after this list):
  - Added `Tensor._make_wrapper_subclass` (#65340)
  - getitem: Ensured Tensor subclasses are not treated as tuples (#67202)
  - Fixed `_make_wrapper_subclass`'s storage_offset handling (#68268)
  - Made `empty*` and `*_like` factory functions respect tensor subclasses (#65677)
  - Made `new_empty`/`new_ones`/`new_zeros`/`new_full` respect subclasses (#65169)
  - Ensured that `None` tensors in Python map to undefined tensors in C++ (#67793)
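For those following along, here is a minimal sketch of the wrapper-subclass pattern that `Tensor._make_wrapper_subclass` enables; `LoggingTensor` and its unwrap/rewrap logic are illustrative, not part of the API:

```python
import torch
from torch.utils._pytree import tree_map

class LoggingTensor(torch.Tensor):
    # Wrapper subclass: holds a real tensor in `elem`, mirrors its metadata.
    @staticmethod
    def __new__(cls, elem):
        r = torch.Tensor._make_wrapper_subclass(
            cls, elem.size(), strides=elem.stride(), dtype=elem.dtype,
            device=elem.device, requires_grad=elem.requires_grad)
        r.elem = elem
        return r

    @classmethod
    def __torch_dispatch__(cls, func, types, args=(), kwargs=None):
        print(f"dispatched: {func}")  # log every aten op that reaches us

        def unwrap(t):
            return t.elem if isinstance(t, LoggingTensor) else t

        def wrap(t):
            return LoggingTensor(t) if isinstance(t, torch.Tensor) else t

        out = func(*tree_map(unwrap, args), **tree_map(unwrap, kwargs or {}))
        return tree_map(wrap, out)

x = LoggingTensor(torch.randn(3))
y = x + 1  # prints the dispatched add op and returns a LoggingTensor
```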
- Rationalized API exports in torch_python (#68095)
- Removed `tensor.data` usage from a few places in internals (#65389)
C++ API
- Convolution consolidation:
- Factored backend routing logic out of convolution forward (#67790)
- General `convolution_backward` function (#69044, #70112, #71489, #71490, #71491, #69584, #67283, #70661); see the sketch after this list
- Removed `finput`, `fgrad_input`, `columns`, and `ones` from `slow{2,3}d` and `slow{2,3}d_transpose` signatures (#68897, #68898, #68899)
- Removed backward ops for: cuDNN convolution, cuDNN transposed convolution, deprecated cuDNN convolution, miopen convolution, miopen transposed convolution, miopen depthwise convolution, slow dilated 2d convolution, slow 2d transposed convolution, slow 3d convolution, slow dilated 3d convolution, mkldnn convolution, slow 3d transposed convolution, 2d depthwise convolution, 3d depthwise convolution, and NNPACK spatial convolution (#69901, #69902, #71128, #69987, #70063, #70064, #70067, #70333, #69978, #70068, #70467, #69933, #70461, #70462, #70305)
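To make the consolidation concrete, here is a hedged sketch of calling the unified backward through the dispatcher. The argument layout (bias_sizes, stride, padding, dilation, transposed, output_padding, groups, output_mask) is my reading of the consolidated schema and may not match every build exactly:

```python
import torch

x = torch.randn(1, 3, 8, 8)
w = torch.randn(4, 3, 3, 3)
y = torch.nn.functional.conv2d(x, w, padding=1)
grad_out = torch.randn_like(y)

# One consolidated op computes any subset of the three gradients,
# selected via output_mask = (input?, weight?, bias?).
grad_input, grad_weight, _ = torch.ops.aten.convolution_backward(
    grad_out, x, w,
    None,                 # bias_sizes (no bias gradient requested)
    [1, 1],               # stride
    [1, 1],               # padding
    [1, 1],               # dilation
    False,                # transposed
    [0, 0],               # output_padding
    1,                    # groups
    [True, True, False])  # output_mask
```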
- Removed TH/THC logic (#68127, #68556, #69040, #69041, #65942, #69929, #67940)
- Added `tanh_backward` to ATen symbols (#70071)
- Improved documentation of comparison internals (#68977)
- Added `isUndefined` to the `ExclusivelyOwnedTraits` debug message (#70638)
- Removed buggy `ExclusivelyOwnedTraits<Tensor>` (#70647)
- Generated aten_interned_strings.h automatically (#69407)
- `empty_strided`: Factored out a generic implementation (#70614)
- `empty_meta`: Added functions that don't depend on Tensor (#70615)
- Consolidated the overloads of `TensorImpl::shallow_copy_and_detach` (#68953)
- Improved storage assertion in Tensor's `enforce_invariants` (#70380)
- Fixed docs in aten's native folder (#71395)
- Used `new_empty` in dropout (#72078)
- Simplified the `TensorImpl` size check and fixed its error message (#72070)
- Added `output_mask` argument to `grid_sampler_2d_backward` (#66068)
- Avoided a no-op `shared_ptr` destructor when constructing a tuple (#69337)
- `slow_conv2d` grad_weight: called gemm directly (#65726)
- Made `handle_torch_function_no_python_arg_parser` public (#66054)
- `slow_conv3d`: Avoided dispatch in parallel region (#65737)
- `slow_conv3d` grad_input: Avoided dispatch in parallel region (#65757)
- `slow_conv3d`: Used `at::sum` for grad_bias accumulation (#65758)
- TBB: Used static partitioner to match OpenMP scheduling (#65327)
- Moved `intraop_launch_future` out of Parallel.h (#64166)
- `slow_conv3d` grad_weight: called gemm directly (#65759)
- `-Wextra` fix for TensorShape.cpp (#66320)
- Added InplaceOrView boxed kernel (#63878)
- Used `at::native::is_nonzero` in a few places to skip an unnecessary dispatch trip (#67195)
- Added tags for inplace view ops in native_functions.yaml (#65412)
- Fixed C++ BatchNorm pretty_print() with optional momentum (#67335)
- Inserted check for PyObject_IsInstance in THPVariableCheck (#67588)
- Added SiLU backward ATen symbol (#67665)
- Bumped dlpack.h to latest version (#65047)
- Removed WindowsTorchApiMacro.h in favor of Export.h (#69585)
- Added macro to register CPU kernel for all arch types (#70332)
- Used `c10::irange` around the codebase instead of raw for loops (#70326)
Autograd
- Forward AD can now be tested in gradcheck and OpInfos without also testing backward AD (#65040); see the sketch after this list
- Extended OpInfo and gradgradcheck to test forward-over-reverse Hessian-vector products (#69740)
- Extended OpInfo and gradcheck to test batched forward grad (#66294)
- Enabled warning tests for nondeterministic backward functions (#66736)
- Extended autograd functional benchmarking to run vectorized tasks (#67045)
- Disallowed requires_grad=True in OpInfo's `make_tensor` function for integral inputs (#67149)
- Made autograd codegen for differentiable outputs safer to use (#65823)
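For example, a minimal sketch of checking only forward-mode AD, assuming the `check_forward_ad`/`check_backward_ad` flags described in #65040 (the extra `check_*` flags are disabled explicitly because they exercise backward AD):

```python
import torch
from torch.autograd import gradcheck

x = torch.randn(3, dtype=torch.double, requires_grad=True)

# Verify forward-mode AD against numerical derivatives only,
# skipping all reverse-mode checks.
gradcheck(torch.sin, (x,),
          check_forward_ad=True,
          check_backward_ad=False,
          check_undefined_grad=False,
          check_batched_grad=False)
```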
Build
- Improved name matching for disabled tests (#71499)
- Made permission errors more human readable when using setup.py (#66492)
torch.nn
- Added testing across `memory_format` types to `ModuleInfo`s (#69317)
- Added private `_masked_softmax` function (#69268, #69272, #69924)
- Added `native_dropout` (#63937)
- `F.interpolate`: Removed JIT FC tweaks for the `antialias` flag and `nearest-exact` mode (#71937)
- `F.pad`: Replaced `empty()` with `new_empty()` (#68565)
- `F.softmax`: Changed `dtype` to support TorchScript and MyPy (#68336)
- `nn.BatchNorm*d`: Incremented `num_batches_tracked` in place for improved graph safety (#70444)
- `nn.Embedding`: Passed arguments of embedding as named arguments (#67574)
- `nn.FractionalMaxPool2d`: Fixed to index the correct `_random_samples` dimension when provided (#70031)
- `nn.{GRU, LSTM, RNN}`: Fixed links to docs in comments (#68828)
- `nn.Module`: Added private `_stateless` API (#61447, #68969); see the sketch after this list
- `nn.modules.utils.{_single, _pair, _triple, _quadruple}`: Populated `__name__` (#70459)
- `nn.Parameter`: Used `torch.empty()` instead of `torch.tensor()` (#66486)
- `optim`: Updated `CODEOWNERS` (#65773)
- `optim.Optimizer`: Integrated multi_tensor `zero_grad` into the base class (#69936)
- Refactored cuDNN convolution memory format and conv-bias-relu code (#65594)
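A minimal sketch of the private `_stateless` API; since it is private, the import path and call signature below are best-effort and may differ across versions:

```python
import torch
import torch.nn as nn
from torch.nn.utils import _stateless  # private module; subject to change

m = nn.Linear(3, 2)
x = torch.randn(1, 3)

# Run the module functionally with substituted parameters,
# without mutating the module itself.
override = {"weight": torch.zeros(2, 3), "bias": torch.zeros(2)}
out = _stateless.functional_call(m, override, (x,))
print(out)  # zeros, because the substituted weight and bias are zero
```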
- Testing:
  - Set the cuDNN deterministic flag for `test_conv_double_backward_cuda` (#69941)
  - Increased tolerance for `test_adadelta` (#69919)
  - Set test owner for nn tests (#66850)
  - Changed `test_conv_large` parameter initialization (#71521)
  - Obliviated `ALL_TENSORTYPES` and `ALL_TENSORTYPES2` (#71153)
  - Removed repeat test for types in `test_nn.py` (#70872)
  - Tweaked `rel_tol` for `test_adadelta` (#71880)
  - Added no-input-grad-needed cases to `test_grid_sample` (#66071)
  - Added OpInfo entries for `nn.functional.{conv1d, linear}` (#67747, #65498)
  - Added a host-side memory requirement for `test_softmax_64bit_indexing` (#67922)
  - Made `@dtypes` mandatory when using `@dtypesIf` (#68186); see the sketch after this list
  - Added testing for complex non-vanilla SGD (#66261)
  - Skipped failing tests in `test_nn.py` if compiled without LAPACK (#70913)
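For reference, a sketch of how the device-type test decorators compose, using the private `torch.testing._internal` helpers (names and behavior may change between releases):

```python
import torch
from torch.testing._internal.common_device_type import (
    dtypes, dtypesIfCPU, instantiate_device_type_tests)
from torch.testing._internal.common_utils import TestCase, run_tests

class TestExample(TestCase):
    @dtypes(torch.float)        # a base @dtypes is now mandatory (#68186)...
    @dtypesIfCPU(torch.double)  # ...whenever a per-device override is present
    def test_add(self, device, dtype):
        x = torch.ones(2, device=device, dtype=dtype)
        self.assertEqual(x + x, 2 * x)

# Generates per-device variants such as TestExampleCPU / TestExampleCUDA.
instantiate_device_type_tests(TestExample, globals())

if __name__ == "__main__":
    run_tests()
```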
torch.fx
- Supported type annotations in `operator_support.py` (#65136)
- Added an algo recorder/replayer to `lower.py` (#68194)
- Traced asserts with fx by looking at bytecode (#70960)
- Fixed type checking errors in `node.py` (#68124)
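For context, `torch.fx` captures a Python function or module as an editable graph; a minimal example:

```python
import torch
import torch.fx

def f(x):
    return torch.relu(x) + 1.0

gm = torch.fx.symbolic_trace(f)  # GraphModule wrapping the traced graph
print(gm.graph)                  # shows the captured ops
print(gm(torch.randn(3)))        # runs like the original function
```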
AMD
- Updated the ROCm build to avoid relying on `CUDA_VERSION` or `HIP_VERSION` macros (#65610)