Defining the Core ATen Opset

I have a few concerns regarding the new proposed Core ATen decompositions:

aten::_unsafe_index.Tensor → index.Tensor

aten::_unsafe_index was created as a hint to inductor that the indices originate from a decomposition rather than from a user, and as such they should be trusted. This means we don’t need to generate a tl.device_assert call checking that they are in bounds. Decomposing it to index.Tensor would result in worse performance.

aten::atan2 → atan(input / other)

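This is incorrect because it doesn’t select the correct branch of the atan function: atan2(-x, -y) != atan2(x, y) and atan2(-x, y) != atan2(x, -y), whereas the quotient input / other is identical within each pair, so atan(input / other) cannot tell them apart.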
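To make the branch problem concrete, here is a minimal sketch using Python's standard math module rather than ATen itself. The helper atan_of_quotient is a hypothetical stand-in for the proposed decomposition, not an actual PyTorch function.

```python
import math


def atan_of_quotient(inp, other):
    # Hypothetical stand-in for the proposed decomposition atan(input / other).
    return math.atan(inp / other)


# Two points with the same quotient but in opposite quadrants.
first_quadrant = (1.0, 1.0)     # (input, other)
third_quadrant = (-1.0, -1.0)   # (input, other)

atan2_q1 = math.atan2(*first_quadrant)
atan2_q3 = math.atan2(*third_quadrant)

decomp_q1 = atan_of_quotient(*first_quadrant)
decomp_q3 = atan_of_quotient(*third_quadrant)

# atan2 separates the two quadrants; the decomposition collapses them.
print(atan2_q1, atan2_q3)
print(decomp_q1, decomp_q3)
```

The two atan2 results differ by pi, while the decomposed results are equal because the sign information is lost in the division.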

aten::diagonal

Decomposing views into as_strided should be discouraged because there is far more semantic information in the aten::diagonal call which inductor uses to generate much more efficient code.

aten::div.Tensor_mode, aten::floor_divide → floor(divide(x, y))

This decomposition gives different results from Python’s floor division. Inductor currently does this decomposition, but I don’t think it should be baked in for export.
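A small illustration of the mismatch, using plain Python floats rather than tensors (the variable names are only for this sketch). Rounding in the division step can push the quotient up to the next integer before floor is applied, whereas Python's floor division works from the exact quotient.

```python
import math

a, b = 1.0, 0.1

# floor(divide(x, y)): the division rounds to exactly 10.0 first.
decomposed = math.floor(a / b)

# Python's floor division: 0.1 is slightly above one tenth as a double,
# so the exact quotient is just below 10.
python_floordiv = a // b

print(decomposed, python_floordiv)
```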

aten::expm1, aten::log10, aten::log1p, aten::log2

These are not just convenience functions; they give more numerical precision, so they shouldn’t be decomposed.
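The precision difference is easy to see for small inputs. This sketch uses the standard math module as an analogue of the ATen ops; the "naive" expressions are assumed forms of what a decomposition would compute, not the actual proposed decompositions.

```python
import math

x = 1e-10

# Naive forms: 1 + x is rounded to the nearest double before the
# subtraction or the log, which discards most of the digits of x.
naive_expm1 = math.exp(x) - 1.0
naive_log1p = math.log(1.0 + x)

# Dedicated functions keep full precision near zero.
accurate_expm1 = math.expm1(x)
accurate_log1p = math.log1p(x)

# For tiny x both functions are approximately x, so use x as the reference.
for name, value in [
    ("exp(x) - 1", naive_expm1),
    ("expm1(x)", accurate_expm1),
    ("log(1 + x)", naive_log1p),
    ("log1p(x)", accurate_log1p),
]:
    print(f"{name:12s} relative deviation from x: {abs(value - x) / x:.3e}")
```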

aten::var_mean.correction → return mean(x), var(x)

Inductor implements a single-pass var_mean that already computes the mean, and it is currently not CSE’d with mean, so this decomposition should result in worse performance.
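For readers unfamiliar with the idea, a single-pass variance and mean can be sketched with Welford's algorithm. This is only an illustration in plain Python; the function name is hypothetical, and it is an assumption that this matches how Inductor's kernel is structured.

```python
def var_mean_single_pass(values, correction=1):
    """Return (var, mean) from one pass over values (Welford's algorithm)."""
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for v in values:
        count += 1
        delta = v - mean
        mean += delta / count
        m2 += delta * (v - mean)
    var = m2 / (count - correction)
    return var, mean


data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
var, mean = var_mean_single_pass(data, correction=0)
print(var, mean)
```

The mean falls out of the same loop as the variance, so computing mean(x) separately repeats work unless the two are merged.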
