Defining the Core ATen Opset

@peterbell10 and @Lezcano, thank you for your feedback. Based on your comments, we are making the following changes to the list of identified core operators.

  1. Operators which cleanly map to hardware intrinsics will be promoted to core operators. The same treatment applies to operators where decomposing them would impact the numerical precision/stability of the output (see the precision example after this list). In accordance with this, the following operators, which were previously decomposed, are now added to core:

    • aten::trunc
    • aten::expm1
    • aten::log10
    • aten::log1p
    • aten::log2
    • aten::atan2
  2. div.Tensor_mode and div.Scalar_mode have been added as “core” operators. The "trunc" and "floor" rounding modes are more complex to decompose than initially thought: both need to handle floating-point and integer data types separately, and “floor” in particular is quite involved because it needs to replicate Python’s floor-division behaviour (see the sketch below). The decomposition for this operator would be too close to an outright implementation of the operator, which is why it is preferable to add it as a “core” operator.
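
To illustrate the precision concern, here is a minimal, purely illustrative example (not an actual decomposition used anywhere): rewriting aten::log1p as log(1 + x) collapses to zero for small float32 inputs, while the dedicated op stays accurate.

```python
import torch

x = torch.tensor([1e-10], dtype=torch.float32)

# Decomposed form: 1 + 1e-10 rounds to exactly 1.0 in float32,
# so log(1 + x) returns 0 and the result loses all precision.
decomposed = torch.log(1 + x)

# Dedicated op: accurate for arguments near zero.
fused = torch.log1p(x)

print(decomposed)  # tensor([0.])
print(fused)       # tensor([1.0000e-10])
```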
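
For the rounding modes, the following is a rough sketch (not the actual ATen/Inductor decomposition) of what a floor-rounding division has to do. The separate integer and floating-point paths, plus the sign fixup that reproduces Python's floor-division semantics, are what make such a decomposition nearly an implementation of the op itself:

```python
import torch

def div_floor_sketch(a: torch.Tensor, b: torch.Tensor) -> torch.Tensor:
    # Illustrative only: integer and floating-point inputs need separate paths.
    if not a.is_floating_point() and not b.is_floating_point():
        # Integer path: a truncating quotient (rounding_mode="trunc" here stands
        # in for whatever truncating primitive the backend exposes) rounds toward
        # zero, so it must be corrected whenever the remainder is non-zero and
        # the operands have opposite signs, matching Python's -7 // 2 == -4.
        q = torch.div(a, b, rounding_mode="trunc")
        r = a - q * b
        needs_fixup = (r != 0) & ((a < 0) != (b < 0))
        return q - needs_fixup.to(q.dtype)
    # Floating-point path: divide, then round toward negative infinity.
    return torch.floor(a / b)

a = torch.tensor([-7, 7, -7, 7])
b = torch.tensor([2, 2, -2, -2])
print(div_floor_sketch(a, b))                  # tensor([-4,  3,  3, -4])
print(torch.div(a, b, rounding_mode="floor"))  # tensor([-4,  3,  3, -4])
```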

Despite these changes, there are still some additional considerations I am working through.

  1. For aten::diagonal, as @peterbell10 called out, decomposing into as_strided is not ideal. I am in favor of moving this to a core op as well, but need to confirm this decision internally.
  2. We are still undecided on how to handle var_mean.correction. We can remove this decomposition for Inductor, but need to determine whether the op should be added as core so that the single-pass algorithm can be accessed.
  3. As a general point for the .Scalar variants of ops: should they be decomposed by using full to construct a tensor from the Scalar argument, and then calling the .Tensor variant? (A sketch of what that would look like is below.)
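
For context, here is a minimal sketch of what such a decomposition might look like, using add as a stand-in overload; whether the resulting type-promotion behaviour exactly matches the .Scalar overload in every case is part of what remains to be decided:

```python
import torch

def add_scalar_via_tensor(a: torch.Tensor, scalar: float) -> torch.Tensor:
    # Hypothetical decomposition of a .Scalar overload: materialize the Scalar
    # argument as a 0-d tensor with full, then dispatch to the .Tensor variant.
    scalar_tensor = torch.full((), scalar, dtype=a.dtype, device=a.device)
    return torch.add(a, scalar_tensor)

x = torch.arange(4, dtype=torch.float32)
assert torch.equal(add_scalar_via_tensor(x, 2.0), torch.add(x, 2.0))
```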