Refactor complex asin/acos/asinh/acosh to use asin_acos_kernel. Improve real acos accuracy. (#2496)

As in the title.

This PR introduces an asin_acos_kernel operation that implements a
modified Hull et al. algorithm for evaluating the complex asin, acos,
asinh, and acosh operations. It corresponds to the refactor request in
#2411 (comment)
and it resolves
pearu/functional_algorithms#15.

In addition, the PR improves the real acos accuracy as a fix to
#2452. As a result, the
accuracy of single-precision real acos improved as follows (using 1
million samples):
```
                             main       this PR
ULP difference == 0 count is 971713  -> 999045
ULP difference == 1 count is 28158   ->    962
ULP difference == 2 count is 54      ->      0
ULP difference == 3 count is 20      ->      0
ULP difference >= 4 count is 62      ->      0
```
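The counts above bin results by their ULP distance from a reference value. A minimal sketch of how such a histogram could be computed for float32 is shown below; the helper names (`f32_ordered`, `ulp_difference`, `ulp_histogram`) are hypothetical and are not taken from functional_algorithms, whose own measurement code may differ.

```python
import struct
from collections import Counter


def f32_ordered(x: float) -> int:
    """Map a float32 value to an integer so that adjacent floats differ by 1."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # float32 is sign-magnitude: fold negative values below zero so that the
    # integer ordering matches the floating-point ordering (-0.0 maps to 0).
    return -(bits & 0x7FFFFFFF) if bits & 0x80000000 else bits


def ulp_difference(a: float, b: float) -> int:
    """Number of float32 steps between a and b (both assumed finite)."""
    return abs(f32_ordered(a) - f32_ordered(b))


def ulp_histogram(results, expected, cap=4):
    """Counts for ULP difference == 0, 1, ..., cap - 1 and >= cap."""
    counts = Counter(
        min(ulp_difference(r, e), cap) for r, e in zip(results, expected)
    )
    return [counts.get(k, 0) for k in range(cap + 1)]
```

Called with the computed and the reference values, `ulp_histogram` returns five counts in the same order as the rows of the table: `== 0`, `== 1`, `== 2`, `== 3`, `>= 4`.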


The cause of the revert by PR
#2449 (`acos(-1) -> 0` while
expecting `pi`) is fixed in functional_algorithms, which now also
generates extra samples that correspond to limiting values of real acos.
For instance, `acos(-1) -> pi`, `acos(nextafter(-1, -inf)) -> nan`, and
`acos(nextafter(-1, inf)) -> approx pi` are now included in the tests.
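The three limiting cases around `-1` can be reproduced with the Python standard library. This is only an illustration in double precision (the tests in this commit cover the StableHLO element types), and `acos_or_nan` is a hypothetical helper that maps the domain error raised by `math.acos` to `nan`.

```python
import math


def acos_or_nan(x: float) -> float:
    """Real acos that returns nan outside [-1, 1] instead of raising."""
    try:
        return math.acos(x)
    except ValueError:  # math.acos raises ValueError on a domain error
        return math.nan


below = math.nextafter(-1.0, -math.inf)  # first double below -1
above = math.nextafter(-1.0, math.inf)   # first double above -1

limiting_values = {
    "acos(-1)": acos_or_nan(-1.0),                 # pi
    "acos(nextafter(-1, -inf))": acos_or_nan(below),  # nan
    "acos(nextafter(-1, inf))": acos_or_nan(above),   # slightly less than pi
}
```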


The required functional_algorithms version is now 0.9, which includes a
fix to `fa.utils.real_samples` to return samples that are distributed
uniformly with respect to ULP differences of neighboring samples. This
fix led to updates of `stablehlo/tests/math/*.mlir`.
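One way to read "distributed uniformly with respect to ULP differences" is that neighboring samples are separated by the same number of representable floats. The sketch below illustrates that idea for float32 using only the standard library; it is not the `fa.utils.real_samples` implementation, and `ulp_uniform_samples` and its helpers are hypothetical names.

```python
import struct


def to_ordered(x: float) -> int:
    """float32 -> integer whose ordering matches the float ordering."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return -(bits & 0x7FFFFFFF) if bits & 0x80000000 else bits


def from_ordered(i: int) -> float:
    """Inverse of to_ordered."""
    bits = i if i >= 0 else (0x80000000 | -i)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def ulp_uniform_samples(lo: float, hi: float, n: int) -> list:
    """n >= 2 float32 samples from lo to hi, equally spaced in ULP steps."""
    a, b = to_ordered(lo), to_ordered(hi)
    return [from_ordered(a + (b - a) * k // (n - 1)) for k in range(n)]
```

For example, `ulp_uniform_samples(1.0, 4.0, 3)` gives `[1.0, 2.0, 4.0]` rather than the arithmetic midpoint `2.5`, because there are as many float32 values in `[1, 2)` as in `[2, 4)`.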
pearu authored Aug 21, 2024
1 parent c49838a commit f8fed84
Showing 24 changed files with 1,069 additions and 1,356 deletions.
6 changes: 3 additions & 3 deletions build_tools/math/README.md
@@ -31,7 +31,7 @@ following requirements:

- Python 3.11 or newer
- mpmath 1.3 or newer
-- functional_algorithms 0.7.0 or newer
+- functional_algorithms 0.9.1 or newer

that can be installed via pypi:

@@ -62,8 +62,8 @@ To execute generated tests from a `build` directory, use:

```sh
for t in $(ls ../stablehlo/tests/math/*.mlir); \
-do bin/stablehlo-opt --chlo-legalize-to-stablehlo $t \
-| bin/stablehlo-translate --interpret ; done
+do echo $t && ( bin/stablehlo-opt --chlo-legalize-to-stablehlo $t \
+| bin/stablehlo-translate --interpret ) ; done
```

When new implementations are generated, one likely needs to update
3 changes: 2 additions & 1 deletion build_tools/math/generate_ChloDecompositionPatternsMath.py
@@ -80,6 +80,7 @@ def main():
sources = []
target = fa.targets.stablehlo
for chloname, fname, args in [
+("CHLO_AsinAcosKernelOp", "asin_acos_kernel", ("z:complex",)),
("CHLO_AsinOp", "complex_asin", ("z:complex",)),
("CHLO_AsinOp", "real_asin", ("x:float",)),
("CHLO_AcosOp", "complex_acos", ("z:complex",)),
@@ -92,7 +93,7 @@ def main():
func = getattr(fa.algorithms, fname, None)
if func is None:
warnings.warn(
-"{fa.algorithms.__name__} does not define {fname}. Skipping.")
+f"{fa.algorithms.__name__} does not define {fname}. Skipping.")
continue
ctx = fa.Context(paths=[fa.algorithms])
graph = ctx.trace(func, *args).implement_missing(target).simplify()
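The warning fix in this hunk adds the missing `f` prefix: without it, the braces are emitted literally instead of being interpolated. A small standalone sketch of the same lookup-and-warn pattern follows; `algorithms` is a stand-in namespace for illustration, not the real `fa.algorithms` module, and `lookup` is a hypothetical helper.

```python
import types
import warnings

# Stand-in for fa.algorithms (assumption: only used to demonstrate the pattern).
algorithms = types.SimpleNamespace(__name__="demo_algorithms",
                                   real_asin=lambda x: x)


def lookup(fname: str):
    """Return the named algorithm, or warn and return None if it is missing."""
    func = getattr(algorithms, fname, None)
    if func is None:
        # The f prefix is what makes {algorithms.__name__} and {fname} expand.
        warnings.warn(
            f"{algorithms.__name__} does not define {fname}. Skipping.")
    return func


# Without the prefix the placeholders stay in the message verbatim.
unformatted = "{algorithms.__name__} does not define {fname}. Skipping."
```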
2 changes: 2 additions & 0 deletions build_tools/math/generate_tests.py
@@ -167,6 +167,8 @@ def main():
include_subnormal=not flush_subnormals,
).flatten()

+samples = np.concatenate((samples, fa.utils.extra_samples(opname, dtype)))
+
expected = getattr(nmp, mpmath_opname).call(samples,
enable_progressbar=True)
expected = np.array(expected, dtype)
1 change: 1 addition & 0 deletions stablehlo/dialect/ChloOps.cpp
@@ -72,6 +72,7 @@ namespace chlo {
inferredReturnShapes); \
}

+INFER_RETURN_TYPE_COMPONENTS_FROM_OPERANDS(AsinAcosKernelOp)
INFER_RETURN_TYPE_COMPONENTS_FROM_OPERANDS(AcosOp)
INFER_RETURN_TYPE_COMPONENTS_FROM_OPERANDS(AcoshOp)
INFER_RETURN_TYPE_COMPONENTS_FROM_OPERANDS(AsinOp)
22 changes: 22 additions & 0 deletions stablehlo/dialect/ChloOps.td
@@ -471,6 +471,28 @@ class CHLO_UnaryElementwiseOp<string mnemonic, list<Trait> traits,
}];
}

+def CHLO_AsinAcosKernelOp : CHLO_UnaryElementwiseOp<"_asin_acos_kernel",
+[HLO_CompatibleOperandsAndResultType], HLO_AnyComplexTensor> {
+let summary = "AsinAcosKernel operator";
+
+let description = [{
+Returns `AsinAcosKernel(operand)` element-wise.
+
+If
+w = _asin_acos_kernel(z)
+w' = _asin_acos_kernel(I * z)
+then
+asin(z) = complex(atan2(z.real, w.real), sign(z.imag) * w.imag)
+acos(z) = complex(atan2(w.real, z.real), -sign(z.imag) * w.imag)
+asinh(z) = complex(sign(z.real) * w'.imag, atan2(z.imag, w'.real))
+acosh(z) = complex(w.imag, sign(z.imag) * atan2(w.real, z.real))
+
+This op is used as an intermediate value in decompositions and
+should never be constructed directly by frameworks or consumed by
+backends.
+}];
+}
+
def CHLO_AcosOp : CHLO_UnaryElementwiseOp<"acos",
[HLO_CompatibleOperandsAndResultType], HLO_AnyFpOrComplexTensor> {
let summary = "Acos operator";
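The four identities in the `CHLO_AsinAcosKernelOp` description can be checked numerically against `cmath`. The sketch below uses a naive kernel in the textbook Hull et al. form, assuming `w.real = sqrt(a^2 - x^2)` and `w.imag = acosh(a)` with `a = (|z + 1| + |z - 1|) / 2`, `x = |z.real|`, `y = |z.imag|`. That form is an assumption made for illustration: it is not the modified algorithm generated by functional_algorithms and it lacks that algorithm's accuracy safeguards. `sign` is taken as `copysign(1, v)`, and the `kernel_*` names are hypothetical.

```python
import cmath
import math


def asin_acos_kernel(z: complex) -> complex:
    """Naive kernel (assumed form, see above); not numerically robust."""
    x, y = abs(z.real), abs(z.imag)
    r = math.hypot(x + 1.0, y)  # |z + 1| for the first-quadrant image of z
    s = math.hypot(x - 1.0, y)  # |z - 1| for the first-quadrant image of z
    a = 0.5 * (r + s)
    return complex(math.sqrt((a - x) * (a + x)), math.acosh(a))


def sign(v: float) -> float:
    return math.copysign(1.0, v)


def kernel_asin(z: complex) -> complex:
    w = asin_acos_kernel(z)
    return complex(math.atan2(z.real, w.real), sign(z.imag) * w.imag)


def kernel_acos(z: complex) -> complex:
    w = asin_acos_kernel(z)
    return complex(math.atan2(w.real, z.real), -sign(z.imag) * w.imag)


def kernel_asinh(z: complex) -> complex:
    w = asin_acos_kernel(1j * z)  # w' in the op description
    return complex(sign(z.real) * w.imag, math.atan2(z.imag, w.real))


def kernel_acosh(z: complex) -> complex:
    w = asin_acos_kernel(z)
    return complex(w.imag, sign(z.imag) * math.atan2(w.real, z.real))
```

For points away from the axes, such as `0.5 + 0.25j` or `-2 + 3j`, these functions agree with `cmath.asin`, `cmath.acos`, `cmath.asinh`, and `cmath.acosh` to within rounding error, which shows that a single kernel evaluation is enough to assemble each of the four results.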
1,214 changes: 616 additions & 598 deletions stablehlo/tests/chlo/chlo_legalize_to_stablehlo.mlir

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions stablehlo/tests/math/acos_complex128.mlir

Large diffs are not rendered by default.

4 changes: 2 additions & 2 deletions stablehlo/tests/math/acos_complex64.mlir
@@ -2,11 +2,11 @@
// This file is generated, see build_tools/math/README.md for more information.
module @acos_complex64 {
func.func private @samples() -> tensor<169xcomplex<f32>> {
%0 = stablehlo.constant dense<"0x000080FF000080FFFFFF7FFF000080FFFEFF7FFF000080FFF30435BA000080FFF0379897000080FF01000080000080FF00000000000080FF01000000000080FFF0379817000080FFF304353A000080FFFEFF7F7F000080FFFFFF7F7F000080FF0000807F000080FF000080FFFFFF7FFFFFFF7FFFFFFF7FFFFEFF7FFFFFFF7FFFF30435BAFFFF7FFFF0379897FFFF7FFF01000080FFFF7FFF00000000FFFF7FFF01000000FFFF7FFFF0379817FFFF7FFFF304353AFFFF7FFFFEFF7F7FFFFF7FFFFFFF7F7FFFFF7FFF0000807FFFFF7FFF000080FFFEFF7FFFFFFF7FFFFEFF7FFFFEFF7FFFFEFF7FFFF30435BAFEFF7FFFF0379897FEFF7FFF01000080FEFF7FFF00000000FEFF7FFF01000000FEFF7FFFF0379817FEFF7FFFF304353AFEFF7FFFFEFF7F7FFEFF7FFFFFFF7F7FFEFF7FFF0000807FFEFF7FFF000080FFF30435BAFFFF7FFFF30435BAFEFF7FFFF30435BAF30435BAF30435BAF0379897F30435BA01000080F30435BA00000000F30435BA01000000F30435BAF0379817F30435BAF304353AF30435BAFEFF7F7FF30435BAFFFF7F7FF30435BA0000807FF30435BA000080FFF0379897FFFF7FFFF0379897FEFF7FFFF0379897F30435BAF0379897F0379897F037989701000080F037989700000000F037989701000000F0379897F0379817F0379897F304353AF0379897FEFF7F7FF0379897FFFF7F7FF03798970000807FF0379897000080FF01000080FFFF7FFF01000080FEFF7FFF01000080F30435BA01000080F037989701000080010000800100008000000000010000800100000001000080F037981701000080F304353A01000080FEFF7F7F01000080FFFF7F7F010000800000807F01000080000080FF00000000FFFF7FFF00000000FEFF7FFF00000000F30435BA00000000F037989700000000010000800000000000000000000000000100000000000000F037981700000000F304353A00000000FEFF7F7F00000000FFFF7F7F000000000000807F00000000000080FF01000000FFFF7FFF01000000FEFF7FFF01000000F30435BA01000000F037989701000000010000800100000000000000010000000100000001000000F037981701000000F304353A01000000FEFF7F7F01000000FFFF7F7F010000000000807F01000000000080FFF0379817FFFF7FFFF0379817FEFF7FFFF0379817F30435BAF0379817F0379897F037981701000080F037981700000000F037981701000000F0379817F0379817F0379817F304353AF0379817FEFF7F7FF0379817FFFF7F7FF03798170000807FF0379817000080FFF304353AFFFF7FFFF304353AFEFF7FFFF304353AF30435BAF304353AF0379897F304353A01000080F304353
A00000000F304353A01000000F304353AF0379817F304353AF304353AF304353AFEFF7F7FF304353AFFFF7F7FF304353A0000807FF304353A000080FFFEFF7F7FFFFF7FFFFEFF7F7FFEFF7FFFFEFF7F7FF30435BAFEFF7F7FF0379897FEFF7F7F01000080FEFF7F7F00000000FEFF7F7F01000000FEFF7F7FF0379817FEFF7F7FF304353AFEFF7F7FFEFF7F7FFEFF7F7FFFFF7F7FFEFF7F7F0000807FFEFF7F7F000080FFFFFF7F7FFFFF7FFFFFFF7F7FFEFF7FFFFFFF7F7FF30435BAFFFF7F7FF0379897FFFF7F7F01000080FFFF7F7F00000000FFFF7F7F01000000FFFF7F7FF0379817FFFF7F7FF304353AFFFF7F7FFEFF7F7FFFFF7F7FFFFF7F7FFFFF7F7F0000807FFFFF7F7F000080FF0000807FFFFF7FFF0000807FFEFF7FFF0000807FF30435BA0000807FF03798970000807F010000800000807F000000000000807F010000000000807FF03798170000807FF304353A0000807FFEFF7F7F0000807FFFFF7F7F0000807F0000807F0000807F"> : tensor<169xcomplex<f32>>
%0 = stablehlo.constant dense<"0x000080FF000080FFFFFF7FFF000080FFFEFF7FFF000080FF0000C0BF000080FF0000E09F000080FF01000080000080FF00000000000080FF01000000000080FF0000E01F000080FF0000C03F000080FFFEFF7F7F000080FFFFFF7F7F000080FF0000807F000080FF000080FFFFFF7FFFFFFF7FFFFFFF7FFFFEFF7FFFFFFF7FFF0000C0BFFFFF7FFF0000E09FFFFF7FFF01000080FFFF7FFF00000000FFFF7FFF01000000FFFF7FFF0000E01FFFFF7FFF0000C03FFFFF7FFFFEFF7F7FFFFF7FFFFFFF7F7FFFFF7FFF0000807FFFFF7FFF000080FFFEFF7FFFFFFF7FFFFEFF7FFFFEFF7FFFFEFF7FFF0000C0BFFEFF7FFF0000E09FFEFF7FFF01000080FEFF7FFF00000000FEFF7FFF01000000FEFF7FFF0000E01FFEFF7FFF0000C03FFEFF7FFFFEFF7F7FFEFF7FFFFFFF7F7FFEFF7FFF0000807FFEFF7FFF000080FF0000C0BFFFFF7FFF0000C0BFFEFF7FFF0000C0BF0000C0BF0000C0BF0000E09F0000C0BF010000800000C0BF000000000000C0BF010000000000C0BF0000E01F0000C0BF0000C03F0000C0BFFEFF7F7F0000C0BFFFFF7F7F0000C0BF0000807F0000C0BF000080FF0000E09FFFFF7FFF0000E09FFEFF7FFF0000E09F0000C0BF0000E09F0000E09F0000E09F010000800000E09F000000000000E09F010000000000E09F0000E01F0000E09F0000C03F0000E09FFEFF7F7F0000E09FFFFF7F7F0000E09F0000807F0000E09F000080FF01000080FFFF7FFF01000080FEFF7FFF010000800000C0BF010000800000E09F010000800100008001000080000000000100008001000000010000800000E01F010000800000C03F01000080FEFF7F7F01000080FFFF7F7F010000800000807F01000080000080FF00000000FFFF7FFF00000000FEFF7FFF000000000000C0BF000000000000E09F000000000100008000000000000000000000000001000000000000000000E01F000000000000C03F00000000FEFF7F7F00000000FFFF7F7F000000000000807F00000000000080FF01000000FFFF7FFF01000000FEFF7FFF010000000000C0BF010000000000E09F010000000100008001000000000000000100000001000000010000000000E01F010000000000C03F01000000FEFF7F7F01000000FFFF7F7F010000000000807F01000000000080FF0000E01FFFFF7FFF0000E01FFEFF7FFF0000E01F0000C0BF0000E01F0000E09F0000E01F010000800000E01F000000000000E01F010000000000E01F0000E01F0000E01F0000C03F0000E01FFEFF7F7F0000E01FFFFF7F7F0000E01F0000807F0000E01F000080FF0000C03FFFFF7FFF0000C03FFEFF7FFF0000C03F0000C0BF0000C03F0000E09F0000C03F010000800000C03
F000000000000C03F010000000000C03F0000E01F0000C03F0000C03F0000C03FFEFF7F7F0000C03FFFFF7F7F0000C03F0000807F0000C03F000080FFFEFF7F7FFFFF7FFFFEFF7F7FFEFF7FFFFEFF7F7F0000C0BFFEFF7F7F0000E09FFEFF7F7F01000080FEFF7F7F00000000FEFF7F7F01000000FEFF7F7F0000E01FFEFF7F7F0000C03FFEFF7F7FFEFF7F7FFEFF7F7FFFFF7F7FFEFF7F7F0000807FFEFF7F7F000080FFFFFF7F7FFFFF7FFFFFFF7F7FFEFF7FFFFFFF7F7F0000C0BFFFFF7F7F0000E09FFFFF7F7F01000080FFFF7F7F00000000FFFF7F7F01000000FFFF7F7F0000E01FFFFF7F7F0000C03FFFFF7F7FFEFF7F7FFFFF7F7FFFFF7F7FFFFF7F7F0000807FFFFF7F7F000080FF0000807FFFFF7FFF0000807FFEFF7FFF0000807F0000C0BF0000807F0000E09F0000807F010000800000807F000000000000807F010000000000807F0000E01F0000807F0000C03F0000807FFEFF7F7F0000807FFFFF7F7F0000807F0000807F0000807F"> : tensor<169xcomplex<f32>>
return %0 : tensor<169xcomplex<f32>>
}
func.func private @expected() -> tensor<169xcomplex<f32>> {
%0 = stablehlo.constant dense<"0xE4CB16400000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0F493F0000807FDB0F49400000807FE4CB16406E86B342E4CB16406E86B342DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0F493F6E86B342DB0F493F6E86B342000000000000807FDB0F49400000807FE4CB16406E86B342E4CB16406E86B342DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0F493F6E86B342DA0F493F6E86B342000000000000807FDB0F49400000807FDB0F4940FCD4B242DB0F4940FCD4B2427B26C93FF504353ADB0FC93FF204353ADB0FC93FF204353ADB0FC93FF204353ADB0FC93FF204353ADB0FC93FF204353A3AF9C83FF504353AA8050000FCD4B242A8050000FCD4B242000000000000807FDB0F49400000807FDB0F4940FCD4B242DB0F4940FCD4B2427B26C93FF2379817DB0FC93FF0379817DB0FC93FF0379817DB0FC93FF0379817DB0FC93FF0379817DB0FC93FF03798173AF9C83FF237981700000000FCD4B24200000000FCD4B242000000000000807FDB0F49400000807FDB0F4940FCD4B242DB0F4940FCD4B2427B26C93F01000000DB0FC93F01000000DB0FC93F01000000DB0FC93F01000000DB0FC93F01000000DB0FC93F010000003AF9C83F0100000000000000FCD4B24200000000FCD4B242000000000000807FDB0F4940000080FFDB0F4940FCD4B2C2DB0F4940FCD4B2C27B26C93F00000000DB0FC93F00000000DB0FC93F00000000DB0FC93F00000000DB0FC93F00000000DB0FC93F000000003AF9C83F0000000000000000FCD4B2C200000000FCD4B2C200000000000080FFDB0F4940000080FFDB0F4940FCD4B2C2DB0F4940FCD4B2C27B26C93F01000080DB0FC93F01000080DB0FC93F01000080DB0FC93F01000080DB0FC93F01000080DB0FC93F010000803AF9C83F0100008000000000FCD4B2C200000000FCD4B2C200000000000080FFDB0F4940000080FFDB0F4940FCD4B2C2DB0F4940FCD4B2C27B26C93FF2379897DB0FC93FF0379897DB0FC93FF0379897DB0FC93FF0379897DB0FC93FF0379897DB0FC93FF03798973AF9C83FF237989700000000FCD4B2C200000000FCD4B2C200000000000080FFDB0F4940000080FFDB0F4940FCD4B2C2DB0F4940FCD4B2C27B26C93FF50435BADB0FC93FF20435BADB0FC93FF20435B
ADB0FC93FF20435BADB0FC93FF20435BADB0FC93FF20435BA3AF9C83FF50435BAA8050000FCD4B2C2A8050000FCD4B2C200000000000080FFDB0F4940000080FFE4CB16406E86B3C2E4CB16406E86B3C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0F493F6E86B3C2DA0F493F6E86B3C200000000000080FFDB0F4940000080FFE4CB16406E86B3C2E4CB16406E86B3C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0F493F6E86B3C2DB0F493F6E86B3C200000000000080FFE4CB1640000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0F493F000080FF"> : tensor<169xcomplex<f32>>
%0 = stablehlo.constant dense<"0xE4CB16400000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0FC93F0000807FDB0F493F0000807FDB0F49400000807FE4CB16406E86B342E4CB16406E86B342DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0F493F6E86B342DB0F493F6E86B342000000000000807FDB0F49400000807FE4CB16406E86B342E4CB16406E86B342DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0FC93FFCD4B242DB0F493F6E86B342DA0F493F6E86B342000000000000807FDB0F49400000807FDB0F4940FCD4B242DB0F4940FCD4B242D2461340E590B93FDB0FC93F00EE983FDB0FC93F00EE983FDB0FC93F00EE983FDB0FC93F00EE983FDB0FC93F00EE983F2224573FE590B93F00003000FCD4B24200003000FCD4B242000000000000807FDB0F49400000807FDB0F4940FCD4B242DB0F4940FCD4B242DB0F49406561763FDB0FC93F0000E01FDB0FC93F0000E01FDB0FC93F0000E01FDB0FC93F0000E01FDB0FC93F0000E01F085AC81F6561763F00000000FCD4B24200000000FCD4B242000000000000807FDB0F49400000807FDB0F4940FCD4B242DB0F4940FCD4B242DB0F49406561763FDB0FC93F01000000DB0FC93F01000000DB0FC93F01000000DB0FC93F01000000DB0FC93F01000000000000006561763F00000000FCD4B24200000000FCD4B242000000000000807FDB0F4940000080FFDB0F4940FCD4B2C2DB0F4940FCD4B2C2DB0F4940656176BFDB0FC93F00000000DB0FC93F00000000DB0FC93F00000000DB0FC93F00000000DB0FC93F0000000000000000656176BF00000000FCD4B2C200000000FCD4B2C200000000000080FFDB0F4940000080FFDB0F4940FCD4B2C2DB0F4940FCD4B2C2DB0F4940656176BFDB0FC93F01000080DB0FC93F01000080DB0FC93F01000080DB0FC93F01000080DB0FC93F0100008000000000656176BF00000000FCD4B2C200000000FCD4B2C200000000000080FFDB0F4940000080FFDB0F4940FCD4B2C2DB0F4940FCD4B2C2DB0F4940656176BFDB0FC93F0000E09FDB0FC93F0000E09FDB0FC93F0000E09FDB0FC93F0000E09FDB0FC93F0000E09F085AC81F656176BF00000000FCD4B2C200000000FCD4B2C200000000000080FFDB0F4940000080FFDB0F4940FCD4B2C2DB0F4940FCD4B2C2D2461340E590B9BFDB0FC93F00EE98BFDB0FC93F00EE98B
FDB0FC93F00EE98BFDB0FC93F00EE98BFDB0FC93F00EE98BF2224573FE590B9BF00003000FCD4B2C200003000FCD4B2C200000000000080FFDB0F4940000080FFE4CB16406E86B3C2E4CB16406E86B3C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0F493F6E86B3C2DA0F493F6E86B3C200000000000080FFDB0F4940000080FFE4CB16406E86B3C2E4CB16406E86B3C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0FC93FFCD4B2C2DB0F493F6E86B3C2DB0F493F6E86B3C200000000000080FFE4CB1640000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0FC93F000080FFDB0F493F000080FF"> : tensor<169xcomplex<f32>>
return %0 : tensor<169xcomplex<f32>>
}
func.func public @main() {