[flang] Add -fcomplex-arithmetic= option and select complex division algorithm #146641
This patch adds an option to select the method for computing complex number division. It uses `LoweringOptions` to determine whether to lower complex division to a runtime function call or to MLIR's `complex.div`, and `CodeGenOptions` to select the computation algorithm for `complex.div`. The available option values and their corresponding algorithms are as follows:
- `full`: Lower to a runtime function call. (Default behavior)
- `improved`: Lower to `complex.div` and expand to Smith's algorithm.
- `basic`: Lower to `complex.div` and expand to the algebraic algorithm.

See also the discussion in the following discourse post: https://discourse.llvm.org/t/optimization-of-complex-number-division/83468
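As a rough illustration of the two `complex.div` expansions named in the description, the C++ sketch below contrasts the algebraic formula with Smith's algorithm on `double` operands. The function names `divBasic` and `divSmith` are hypothetical, and this is not the MLIR implementation; it only shows why the scaled form tolerates large operands better.

```cpp
#include <cmath>
#include <complex>

// Algebraic formula: (a+bi)/(c+di) = ((ac+bd) + (bc-ad)i) / (c^2+d^2).
// The products can overflow even when the true quotient is representable.
std::complex<double> divBasic(std::complex<double> x, std::complex<double> y) {
  const double a = x.real(), b = x.imag();
  const double c = y.real(), d = y.imag();
  const double denom = c * c + d * d;
  return {(a * c + b * d) / denom, (b * c - a * d) / denom};
}

// Smith's algorithm: scale by the ratio of the divisor's smaller component
// to its larger one, so the divisor components are never squared.
std::complex<double> divSmith(std::complex<double> x, std::complex<double> y) {
  const double a = x.real(), b = x.imag();
  const double c = y.real(), d = y.imag();
  if (std::fabs(c) >= std::fabs(d)) {
    const double r = d / c;         // |r| <= 1
    const double denom = c + d * r;
    return {(a + b * r) / denom, (b - a * r) / denom};
  }
  const double r = c / d;           // |r| < 1
  const double denom = d + c * r;
  return {(a * r + b) / denom, (b * r - a) / denom};
}
```

For example, when both operands are `(1e300, 1e300)`, `c * c` overflows to infinity in `divBasic` and the result is NaN, while `divSmith` returns `(1, 0)`. Neither sketch adds the NaN and infinity handling that the `full` runtime path provides.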
@llvm/pr-subscribers-flang-codegen @llvm/pr-subscribers-flang-driver
Author: Shunsuke Watanabe (s-watanabe314)
Patch is 102.49 KiB, truncated to 20.00 KiB below, full version: https://github.com/llvm/llvm-project/pull/146641.diff
22 Files Affected:
diff --git a/clang/include/clang/Driver/Options.td b/clang/include/clang/Driver/Options.td
index 9911d752966e3..58209ceb5dc54 100644
--- a/clang/include/clang/Driver/Options.td
+++ b/clang/include/clang/Driver/Options.td
@@ -1023,12 +1023,12 @@ defm offload_uniform_block : BoolFOption<"offload-uniform-block",
BothFlags<[], [ClangOption], " that kernels are launched with uniform block sizes (default true for CUDA/HIP and false otherwise)">>;
def fcomplex_arithmetic_EQ : Joined<["-"], "fcomplex-arithmetic=">, Group<f_Group>,
- Visibility<[ClangOption, CC1Option]>,
+ Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
Values<"full,improved,promoted,basic">, NormalizedValuesScope<"LangOptions">,
NormalizedValues<["CX_Full", "CX_Improved", "CX_Promoted", "CX_Basic"]>;
def complex_range_EQ : Joined<["-"], "complex-range=">, Group<f_Group>,
- Visibility<[CC1Option]>,
+ Visibility<[CC1Option, FC1Option]>,
Values<"full,improved,promoted,basic">, NormalizedValuesScope<"LangOptions">,
NormalizedValues<["CX_Full", "CX_Improved", "CX_Promoted", "CX_Basic"]>,
MarshallingInfoEnum<LangOpts<"ComplexRange">, "CX_Full">;
diff --git a/clang/lib/Driver/ToolChains/Flang.cpp b/clang/lib/Driver/ToolChains/Flang.cpp
index e4e321ba1e195..ba5db7e3aba43 100644
--- a/clang/lib/Driver/ToolChains/Flang.cpp
+++ b/clang/lib/Driver/ToolChains/Flang.cpp
@@ -595,6 +595,30 @@ void Flang::addOffloadOptions(Compilation &C, const InputInfoList &Inputs,
addOpenMPHostOffloadingArgs(C, JA, Args, CmdArgs);
}
+static std::string ComplexRangeKindToStr(LangOptions::ComplexRangeKind Range) {
+ switch (Range) {
+ case LangOptions::ComplexRangeKind::CX_Full:
+ return "full";
+ break;
+ case LangOptions::ComplexRangeKind::CX_Improved:
+ return "improved";
+ break;
+ case LangOptions::ComplexRangeKind::CX_Basic:
+ return "basic";
+ break;
+ default:
+ return "";
+ }
+}
+
+static std::string
+RenderComplexRangeOption(LangOptions::ComplexRangeKind Range) {
+ std::string ComplexRangeStr = ComplexRangeKindToStr(Range);
+ if (!ComplexRangeStr.empty())
+ return "-complex-range=" + ComplexRangeStr;
+ return ComplexRangeStr;
+}
+
static void addFloatingPointOptions(const Driver &D, const ArgList &Args,
ArgStringList &CmdArgs) {
StringRef FPContract;
@@ -605,6 +629,8 @@ static void addFloatingPointOptions(const Driver &D, const ArgList &Args,
bool AssociativeMath = false;
bool ReciprocalMath = false;
+ LangOptions::ComplexRangeKind Range = LangOptions::ComplexRangeKind::CX_None;
+
if (const Arg *A = Args.getLastArg(options::OPT_ffp_contract)) {
const StringRef Val = A->getValue();
if (Val == "fast" || Val == "off") {
@@ -629,6 +655,20 @@ static void addFloatingPointOptions(const Driver &D, const ArgList &Args,
default:
continue;
+ case options::OPT_fcomplex_arithmetic_EQ: {
+ StringRef Val = A->getValue();
+ if (Val == "full")
+ Range = LangOptions::ComplexRangeKind::CX_Full;
+ else if (Val == "improved")
+ Range = LangOptions::ComplexRangeKind::CX_Improved;
+ else if (Val == "basic")
+ Range = LangOptions::ComplexRangeKind::CX_Basic;
+ else {
+ D.Diag(diag::err_drv_unsupported_option_argument)
+ << A->getSpelling() << Val;
+ }
+ break;
+ }
case options::OPT_fhonor_infinities:
HonorINFs = true;
break;
@@ -699,6 +739,13 @@ static void addFloatingPointOptions(const Driver &D, const ArgList &Args,
if (!Recip.empty())
CmdArgs.push_back(Args.MakeArgString("-mrecip=" + Recip));
+ if (Range != LangOptions::ComplexRangeKind::CX_None) {
+ std::string ComplexRangeStr = RenderComplexRangeOption(Range);
+ CmdArgs.push_back(Args.MakeArgString(ComplexRangeStr));
+ CmdArgs.push_back(Args.MakeArgString("-fcomplex-arithmetic=" +
+ ComplexRangeKindToStr(Range)));
+ }
+
if (!HonorINFs && !HonorNaNs && AssociativeMath && ReciprocalMath &&
ApproxFunc && !SignedZeros &&
(FPContract == "fast" || FPContract.empty())) {
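The driver change in `Flang.cpp` can be modeled outside of LLVM as a small value-to-flags mapping. The sketch below is a standalone approximation, not the driver code: the enum and the functions `parseComplexArithmetic` and `renderComplexRangeArgs` are hypothetical stand-ins for `LangOptions::ComplexRangeKind` and the logic in `addFloatingPointOptions`. It reflects two details visible in the diff: `promoted` is listed in `Options.td` but falls into the unsupported-argument branch for Flang, and an accepted value is forwarded as both `-complex-range=` and `-fcomplex-arithmetic=`.

```cpp
#include <optional>
#include <string>
#include <vector>

// Stand-in for LangOptions::ComplexRangeKind, limited to the values the
// Flang driver accepts in this patch, plus None for "option not given".
enum class ComplexRangeKind { Full, Improved, Basic, None };

// Mirrors the new switch case: any other value (including "promoted")
// is reported as unsupported, modeled here as an empty optional.
std::optional<ComplexRangeKind> parseComplexArithmetic(const std::string &val) {
  if (val == "full") return ComplexRangeKind::Full;
  if (val == "improved") return ComplexRangeKind::Improved;
  if (val == "basic") return ComplexRangeKind::Basic;
  return std::nullopt;
}

std::string complexRangeKindToStr(ComplexRangeKind range) {
  switch (range) {
  case ComplexRangeKind::Full: return "full";
  case ComplexRangeKind::Improved: return "improved";
  case ComplexRangeKind::Basic: return "basic";
  case ComplexRangeKind::None: return "";
  }
  return "";
}

// Arguments forwarded to the frontend; nothing is added when the option
// was not given on the command line.
std::vector<std::string> renderComplexRangeArgs(ComplexRangeKind range) {
  if (range == ComplexRangeKind::None) return {};
  const std::string value = complexRangeKindToStr(range);
  return {"-complex-range=" + value, "-fcomplex-arithmetic=" + value};
}
```

The empty result for `None` corresponds to the `RANGE-NOT: -complex-range=` check in the driver test added by this patch.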
diff --git a/flang/include/flang/Frontend/CodeGenOptions.def b/flang/include/flang/Frontend/CodeGenOptions.def
index ae12aec518108..cdeea93c9aecb 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.def
+++ b/flang/include/flang/Frontend/CodeGenOptions.def
@@ -52,6 +52,7 @@ ENUM_CODEGENOPT(RelocationModel, llvm::Reloc::Model, 3, llvm::Reloc::PIC_) ///<
ENUM_CODEGENOPT(DebugInfo, llvm::codegenoptions::DebugInfoKind, 4, llvm::codegenoptions::NoDebugInfo) ///< Level of debug info to generate
ENUM_CODEGENOPT(VecLib, llvm::driver::VectorLibrary, 4, llvm::driver::VectorLibrary::NoLibrary) ///< Vector functions library to use
ENUM_CODEGENOPT(FramePointer, llvm::FramePointerKind, 2, llvm::FramePointerKind::None) ///< Enable the usage of frame pointers
+ENUM_CODEGENOPT(ComplexRange, ComplexRangeKind, 3, ComplexRangeKind::CX_Full) ///< Method for calculating complex number division
ENUM_CODEGENOPT(DoConcurrentMapping, DoConcurrentMappingKind, 2, DoConcurrentMappingKind::DCMK_None) ///< Map `do concurrent` to OpenMP
diff --git a/flang/include/flang/Frontend/CodeGenOptions.h b/flang/include/flang/Frontend/CodeGenOptions.h
index bad17c8309eb8..df6063cc90340 100644
--- a/flang/include/flang/Frontend/CodeGenOptions.h
+++ b/flang/include/flang/Frontend/CodeGenOptions.h
@@ -192,6 +192,31 @@ class CodeGenOptions : public CodeGenOptionsBase {
return getProfileUse() == llvm::driver::ProfileCSIRInstr;
}
+ /// Controls the various implementations for complex division.
+ enum ComplexRangeKind {
+ /// Implementation of complex division using a call to runtime library
+ /// functions. Overflow and non-finite values are handled by the library
+ /// implementation. This is the default value.
+ CX_Full,
+
+ /// Implementation of complex division offering an improved handling
+ /// for overflow in intermediate calculations. Overflow and non-finite
+ /// values are handled by MLIR's implementation of "complex.div", but this
+ /// may change in the future.
+ CX_Improved,
+
+ /// Implementation of complex division using algebraic formulas at source
+ /// precision. No special handling to avoid overflow. NaN and infinite
+ /// values are not handled.
+ CX_Basic,
+
+ /// No range rule is enabled.
+ CX_None
+
+ /// TODO: Implementation of other values as needed. In Clang, "CX_Promoted"
+ /// is implemented. (See clang/Basic/LangOptions.h)
+ };
+
// Define accessors/mutators for code generation options of enumeration type.
#define CODEGENOPT(Name, Bits, Default)
#define ENUM_CODEGENOPT(Name, Type, Bits, Default) \
diff --git a/flang/include/flang/Lower/LoweringOptions.def b/flang/include/flang/Lower/LoweringOptions.def
index 3263ab129d076..8135704971aa4 100644
--- a/flang/include/flang/Lower/LoweringOptions.def
+++ b/flang/include/flang/Lower/LoweringOptions.def
@@ -70,5 +70,9 @@ ENUM_LOWERINGOPT(CUDARuntimeCheck, unsigned, 1, 0)
/// derived types defined in other compilation units.
ENUM_LOWERINGOPT(SkipExternalRttiDefinition, unsigned, 1, 0)
+/// If true, convert complex number division to runtime on the frontend.
+/// If false, lower to the complex dialect of MLIR.
+/// On by default.
+ENUM_LOWERINGOPT(ComplexDivisionToRuntime, unsigned, 1, 1)
#undef LOWERINGOPT
#undef ENUM_LOWERINGOPT
diff --git a/flang/include/flang/Optimizer/Builder/FIRBuilder.h b/flang/include/flang/Optimizer/Builder/FIRBuilder.h
index e1eaab3346901..b1513850a9048 100644
--- a/flang/include/flang/Optimizer/Builder/FIRBuilder.h
+++ b/flang/include/flang/Optimizer/Builder/FIRBuilder.h
@@ -609,6 +609,17 @@ class FirOpBuilder : public mlir::OpBuilder, public mlir::OpBuilder::Listener {
return integerOverflowFlags;
}
+ /// Set ComplexDivisionToRuntimeFlag value for whether complex number division
+ /// is lowered to a runtime function by this builder.
+ void setComplexDivisionToRuntimeFlag(bool flag) {
+ complexDivisionToRuntimeFlag = flag;
+ }
+
+ /// Get current ComplexDivisionToRuntimeFlag value.
+ bool getComplexDivisionToRuntimeFlag() const {
+ return complexDivisionToRuntimeFlag;
+ }
+
/// Dump the current function. (debug)
LLVM_DUMP_METHOD void dumpFunc();
@@ -673,6 +684,10 @@ class FirOpBuilder : public mlir::OpBuilder, public mlir::OpBuilder::Listener {
/// mlir::arith::IntegerOverflowFlagsAttr.
mlir::arith::IntegerOverflowFlags integerOverflowFlags{};
+ /// Flag to control whether complex number division is lowered to a runtime
+ /// function or to the MLIR complex dialect.
+ bool complexDivisionToRuntimeFlag = true;
+
/// fir::GlobalOp and func::FuncOp symbol table to speed-up
/// lookups.
mlir::SymbolTable *symbolTable = nullptr;
diff --git a/flang/include/flang/Optimizer/CodeGen/CodeGen.h b/flang/include/flang/Optimizer/CodeGen/CodeGen.h
index 93f07d8d5d4d9..e9a07a8dde5cd 100644
--- a/flang/include/flang/Optimizer/CodeGen/CodeGen.h
+++ b/flang/include/flang/Optimizer/CodeGen/CodeGen.h
@@ -9,6 +9,7 @@
#ifndef FORTRAN_OPTIMIZER_CODEGEN_CODEGEN_H
#define FORTRAN_OPTIMIZER_CODEGEN_CODEGEN_H
+#include "flang/Frontend/CodeGenOptions.h"
#include "mlir/IR/BuiltinOps.h"
#include "mlir/Pass/Pass.h"
#include "mlir/Pass/PassRegistry.h"
@@ -58,6 +59,11 @@ struct FIRToLLVMPassOptions {
// the name of the global variable corresponding to a derived
// type's descriptor.
bool typeDescriptorsRenamedForAssembly = false;
+
+ // Specify the calculation method for complex number division used by the
+ // Conversion pass of the MLIR complex dialect.
+ Fortran::frontend::CodeGenOptions::ComplexRangeKind ComplexRange =
+ Fortran::frontend::CodeGenOptions::ComplexRangeKind::CX_Full;
};
/// Convert FIR to the LLVM IR dialect with default options.
diff --git a/flang/include/flang/Tools/CrossToolHelpers.h b/flang/include/flang/Tools/CrossToolHelpers.h
index 337685c82af5f..df1da27058552 100644
--- a/flang/include/flang/Tools/CrossToolHelpers.h
+++ b/flang/include/flang/Tools/CrossToolHelpers.h
@@ -140,6 +140,9 @@ struct MLIRToLLVMPassPipelineConfig : public FlangEPCallBacks {
std::string InstrumentFunctionExit =
""; ///< Name of the instrument-function that is called on each
///< function-exit
+ Fortran::frontend::CodeGenOptions::ComplexRangeKind ComplexRange =
+ Fortran::frontend::CodeGenOptions::ComplexRangeKind::
+ CX_Full; ///< Method for calculating complex number division
};
struct OffloadModuleOpts {
diff --git a/flang/lib/Frontend/CompilerInvocation.cpp b/flang/lib/Frontend/CompilerInvocation.cpp
index 30d81f3daa969..86ec410b1f70f 100644
--- a/flang/lib/Frontend/CompilerInvocation.cpp
+++ b/flang/lib/Frontend/CompilerInvocation.cpp
@@ -484,6 +484,21 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
}
parseDoConcurrentMapping(opts, args, diags);
+
+ if (const auto *arg =
+ args.getLastArg(clang::driver::options::OPT_complex_range_EQ)) {
+ llvm::StringRef argValue = llvm::StringRef(arg->getValue());
+ if (argValue == "full") {
+ opts.setComplexRange(CodeGenOptions::ComplexRangeKind::CX_Full);
+ } else if (argValue == "improved") {
+ opts.setComplexRange(CodeGenOptions::ComplexRangeKind::CX_Improved);
+ } else if (argValue == "basic") {
+ opts.setComplexRange(CodeGenOptions::ComplexRangeKind::CX_Basic);
+ } else {
+ diags.Report(clang::diag::err_drv_invalid_value)
+ << arg->getAsString(args) << arg->getValue();
+ }
+ }
}
/// Parses all target input arguments and populates the target
@@ -1811,4 +1826,10 @@ void CompilerInvocation::setLoweringOptions() {
.setNoSignedZeros(langOptions.NoSignedZeros)
.setAssociativeMath(langOptions.AssociativeMath)
.setReciprocalMath(langOptions.ReciprocalMath);
+
+ if (codegenOpts.getComplexRange() ==
+ CodeGenOptions::ComplexRangeKind::CX_Improved ||
+ codegenOpts.getComplexRange() ==
+ CodeGenOptions::ComplexRangeKind::CX_Basic)
+ loweringOpts.setComplexDivisionToRuntime(false);
}
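The two `CompilerInvocation.cpp` hunks first parse `-complex-range=` into `CodeGenOptions` and then derive the lowering flag from it. A minimal standalone sketch of that flow follows; the structs `CodeGenOpts` and `LoweringOpts` and both function names are hypothetical simplifications, and a `false` return stands in for the `err_drv_invalid_value` diagnostic.

```cpp
#include <optional>
#include <string>

enum class ComplexRangeKind { CX_Full, CX_Improved, CX_Basic, CX_None };

// Simplified stand-ins for the real option classes, with the same defaults
// as the patch: CX_Full and "lower division to the runtime".
struct CodeGenOpts {
  ComplexRangeKind complexRange = ComplexRangeKind::CX_Full;
};
struct LoweringOpts {
  bool complexDivisionToRuntime = true;
};

// Returns false for an unrecognized value, where the frontend would
// report a diagnostic. An absent option keeps the default.
bool parseComplexRange(const std::optional<std::string> &arg,
                       CodeGenOpts &opts) {
  if (!arg) return true;
  if (*arg == "full") opts.complexRange = ComplexRangeKind::CX_Full;
  else if (*arg == "improved") opts.complexRange = ComplexRangeKind::CX_Improved;
  else if (*arg == "basic") opts.complexRange = ComplexRangeKind::CX_Basic;
  else return false;
  return true;
}

// Only "improved" and "basic" turn the runtime call off.
void setLoweringOptions(const CodeGenOpts &codegen, LoweringOpts &lowering) {
  if (codegen.complexRange == ComplexRangeKind::CX_Improved ||
      codegen.complexRange == ComplexRangeKind::CX_Basic)
    lowering.complexDivisionToRuntime = false;
}
```

Because the flag defaults to true and is only ever cleared, omitting the option and passing `full` behave the same way in this model.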
diff --git a/flang/lib/Frontend/FrontendActions.cpp b/flang/lib/Frontend/FrontendActions.cpp
index bf15def3f3b2e..b5f4f9421f633 100644
--- a/flang/lib/Frontend/FrontendActions.cpp
+++ b/flang/lib/Frontend/FrontendActions.cpp
@@ -750,6 +750,8 @@ void CodeGenAction::generateLLVMIR() {
if (ci.getInvocation().getLoweringOpts().getIntegerWrapAround())
config.NSWOnLoopVarInc = false;
+ config.ComplexRange = opts.getComplexRange();
+
// Create the pass pipeline
fir::createMLIRToLLVMPassPipeline(pm, config, getCurrentFile());
(void)mlir::applyPassManagerCLOptions(pm);
diff --git a/flang/lib/Lower/Bridge.cpp b/flang/lib/Lower/Bridge.cpp
index ff35840a6668c..7e95c640e73b0 100644
--- a/flang/lib/Lower/Bridge.cpp
+++ b/flang/lib/Lower/Bridge.cpp
@@ -5746,6 +5746,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
builder =
new fir::FirOpBuilder(func, bridge.getKindMap(), &mlirSymbolTable);
assert(builder && "FirOpBuilder did not instantiate");
+ builder->setComplexDivisionToRuntimeFlag(
+ bridge.getLoweringOptions().getComplexDivisionToRuntime());
builder->setFastMathFlags(bridge.getLoweringOptions().getMathOptions());
builder->setInsertionPointToStart(&func.front());
if (funit.parent.isA<Fortran::lower::pft::FunctionLikeUnit>()) {
diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp
index df8dfbc72c030..cb338618dbf3b 100644
--- a/flang/lib/Lower/ConvertExprToHLFIR.cpp
+++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp
@@ -1066,8 +1066,16 @@ struct BinaryOp<Fortran::evaluate::Divide<
mlir::Type ty = Fortran::lower::getFIRType(
builder.getContext(), Fortran::common::TypeCategory::Complex, KIND,
/*params=*/{});
- return hlfir::EntityWithAttributes{
- fir::genDivC(builder, loc, ty, lhs, rhs)};
+
+ // TODO: Ideally, complex number division operations should always be
+ // lowered to MLIR. However, converting them to the runtime via MLIR causes
+ // ABI issues.
+ if (builder.getComplexDivisionToRuntimeFlag())
+ return hlfir::EntityWithAttributes{
+ fir::genDivC(builder, loc, ty, lhs, rhs)};
+ else
+ return hlfir::EntityWithAttributes{
+ builder.create<mlir::complex::DivOp>(loc, lhs, rhs)};
}
};
diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
index 2b018912b40e4..4cf533f195e69 100644
--- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp
+++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
@@ -4119,7 +4119,20 @@ class FIRToLLVMLowering
mathToFuncsOptions.minWidthOfFPowIExponent = 33;
mathConvertionPM.addPass(
mlir::createConvertMathToFuncs(mathToFuncsOptions));
- mathConvertionPM.addPass(mlir::createConvertComplexToStandardPass());
+
+ mlir::ConvertComplexToStandardPassOptions complexToStandardOptions{};
+ if (options.ComplexRange ==
+ Fortran::frontend::CodeGenOptions::ComplexRangeKind::CX_Basic) {
+ complexToStandardOptions.complexRange =
+ mlir::complex::ComplexRangeFlags::basic;
+ } else if (options.ComplexRange == Fortran::frontend::CodeGenOptions::
+ ComplexRangeKind::CX_Improved) {
+ complexToStandardOptions.complexRange =
+ mlir::complex::ComplexRangeFlags::improved;
+ }
+ mathConvertionPM.addPass(
+ mlir::createConvertComplexToStandardPass(complexToStandardOptions));
+
// Convert Math dialect operations into LLVM dialect operations.
// There is no way to prefer MathToLLVM patterns over MathToLibm
// patterns (applied below), so we have to run MathToLLVM conversion here.
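Read together with the `ConvertExprToHLFIR.cpp` change, this hunk means each option value resolves to a pair of decisions: whether division becomes a runtime call, and which expansion `complex.div` receives in the conversion pass. The sketch below summarizes that pairing under stated assumptions: `DivisionPlan`, `Expansion`, and `planFor` are hypothetical names, and an empty `expansion` means the pass options are left default-constructed, as the diff does for `CX_Full`.

```cpp
#include <optional>

enum class ComplexRangeKind { CX_Full, CX_Improved, CX_Basic, CX_None };

// Stand-in for the two mlir::complex::ComplexRangeFlags values the patch sets.
enum class Expansion { Basic, Improved };

struct DivisionPlan {
  bool runtimeCall;                    // true: lowering emits the runtime call
  std::optional<Expansion> expansion;  // empty: pass option left at its default
};

DivisionPlan planFor(ComplexRangeKind kind) {
  switch (kind) {
  case ComplexRangeKind::CX_Improved:
    return {false, Expansion::Improved};  // complex.div, Smith's algorithm
  case ComplexRangeKind::CX_Basic:
    return {false, Expansion::Basic};     // complex.div, algebraic formula
  default:
    return {true, std::nullopt};          // runtime library call
  }
}
```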
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 42d9e7ba2418f..d940934f9f6ad 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -113,6 +113,7 @@ void addFIRToLLVMPass(mlir::PassManager &pm,
options.forceUnifiedTBAATree = useOldAliasTags;
options.typeDescriptorsRenamedForAssembly =
!disableCompilerGeneratedNamesConversion;
+ options.ComplexRange = config.ComplexRange;
addPassConditionally(pm, disableFirToLlvmIr,
[&]() { return fir::createFIRToLLVMPass(options); });
// The dialect conversion framework may leave dead unrealized_conversion_cast
diff --git a/flang/test/Driver/complex-range.f90 b/flang/test/Driver/complex-range.f90
new file mode 100644
index 0000000000000..e5a1ba9068ac9
--- /dev/null
+++ b/flang/test/Driver/complex-range.f90
@@ -0,0 +1,23 @@
+! Test range options for complex multiplication and division.
+
+! RUN: %flang -### -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=RANGE
+
+! RUN: %flang -### -fcomplex-arithmetic=full -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=FULL
+
+! RUN: %flang -### -fcomplex-arithmetic=improved -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=IMPRVD
+
+! RUN: %flang -### -fcomplex-arithmetic=basic -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=BASIC
+
+! RUN: not %flang -### -fcomplex-arithmetic=foo -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=ERR
+
+! RANGE-NOT: -complex-range=
+! FULL: -complex-range=full
+! IMPRVD: -complex-range=improved
+! BASIC: -complex-range=basic
+
+! ERR: error: unsupported argument 'foo' to option '-fcomplex-arithmetic='
diff --git a/flang/test/Integration/complex-div-to-llvm-kind10.f90 b/flang/test/Integration/complex-div-to-llvm-kind10.f90
new file mode 100644
index 0000000000000..04d1f7ed9b024
--- /dev/null
+++ b/flang/test/Integration/complex-div-to-llvm-kind10.f90
@@ -0,0 +1,131 @@
+! Test lowering complex division to llvm ir according to options
+
+! REQUIRES: target=x86_64{{.*}}
+! RUN: %flang -fcomplex-arithmetic=improved -S -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,IMPRVD
+! RUN: %flang -fcomplex-arithmetic=basic -S -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,BASIC
+
+
+! CHECK-LABEL: @div_test_extended
+! CHECK-SAME: ptr %[[RET:.*]], ptr %[[LHS:.*]], ptr %[[RHS:.*]])
+! CHECK: %[[LOAD_LHS:.*]] = load { x86_fp80, x86_fp80 }, ptr %[[LHS]], align 16
+! CHECK: %[[LOAD_RHS:.*]] = load { x86_fp80, x86_fp80 }, ptr %[[RHS]], align 16
+! CHECK: %[[LHS_REAL:.*]] = extractvalue { x86_fp80, x86_fp80 } %[[LOAD_LHS]], 0
+! CHECK: %[[LHS_IMAG:.*]] = extractvalue { x86_fp80, x86_fp80 } %[[LOAD_LHS]], 1
+! CHECK: %[[RHS_REAL:.*]] = extractvalue { x86_fp80, x86_fp80 } %[[LOAD_RHS]], 0
+! CHECK: %[[RHS_IMAG:.*]] = extractvalue { x86_fp80, x86_fp80 } %[[LOAD_RHS]], 1
+
+! IMPRVD: %[[RHS_REAL_IMAG_RATIO:.*]] = fdiv contract x86_fp80 %[[RHS_REAL]], %[[RHS_IMAG]]
+! IMPRVD: %[[RHS_REAL_TIMES_RHS_REAL_IMAG_RATIO:.*]] = fmul contract x86_fp80 %[[RHS_REAL_IMAG_RATIO]], %[[RHS_REAL]]
+! IMPRVD: %[[RHS_REAL_IMAG_DENOM:.*]] = fadd contract x86_fp80 %[[RHS_IMAG]], %[[RHS_REAL_TIMES_RHS_REAL_IMAG_RATIO]]
+! IMPRVD: %[[LHS_REAL_TIMES_RHS_REAL_IMAG_RATIO:.*]] = fmul contract x86_fp80 %[[LHS_REAL]], %[[RHS_REAL_IMAG_RATIO]]
+! IMPRVD: %[[REAL_NUMERATOR_1:.*]] = fadd contract x86_fp80 %[[LHS_REAL_TIMES_RHS_REAL_IMAG_RATIO]], %[[LHS_IMAG]]
+! IMPRVD: %[[RESULT_REAL_1:.*]] = fdiv contract x86_fp80 %[[REAL_NUMERATOR_1]], %[[RHS_REAL_IMAG_DENOM]]
+! IMPRVD: %[[LHS_IMAG_TIMES_RHS_REAL_IMAG_RATIO:.*]] = fmul contract x86_fp80 %[[LHS_IMAG]], %[[RHS_REAL_IMAG_RATIO]]
+! IMPRVD: %[[IMAG_NUMERATOR_1:.*]] = fsub contract x86_fp80 %[[LHS_IMAG_TIMES_RHS_REAL_IMAG_RATIO]], %[[LHS_REAL]]
+! IMPRVD: %[[RESULT_IMAG_1:.*]] = fdiv contract x86_fp80 %[[IMAG_NUMERATOR_1]], %[[RHS_REAL_IMAG_DENOM]]
+! IMPRVD: %[[RHS_IMAG_REAL_RATIO:.*]] = fdiv contract x86_fp80 %[[RHS_IMAG]], %[[RHS_REAL]]
+! IMPRVD: %[[RHS_IMAG_TIMES_RHS_IMAG_REAL_RATIO:.*]] = fmul contract x86_fp80 %[[RHS_IMAG_REAL_RATIO]], %[[RHS_IMAG]]
+! IMPRVD: %[[RHS_IMAG_REAL_DENOM:.*]] = fadd contract x86_fp80 %[[RHS_REAL]], %[[RHS_IMAG_TIMES_RHS_IMAG_REAL_RATIO]]
+! IMPRVD: %[[LHS_IMAG_TIMES_RHS_IMAG_REAL_RATIO:.*]] = fmul contract x86_fp80 %[[LHS_IMAG]], %[[RHS_IMAG_REAL_RATIO]]
+! IMPRVD: %[[REAL_NUMERATOR_2:.*]] = fadd contract x86_fp80 %[[LHS_REAL]], %[[LHS_IMAG_TIMES_RHS_IMAG_REAL_RATIO]]
+! IMPRVD: %[[RESULT_REAL_2:.*]] = fdiv contract x86_fp80 %[[REAL_NUMERATOR_2]], %[[RHS_IMAG_REAL_DENOM]]
+! IMPRVD: %[[LHS_REAL_TIMES_...
[truncated]
@@ -5746,6 +5746,8 @@ class FirConverter : public Fortran::lower::AbstractConverter {
builder =
new fir::FirOpBuilder(func, bridge.getKindMap(), &mlirSymbolTable);
assert(builder && "FirOpBuilder did not instantiate");
+ builder->setComplexDivisionToRuntimeFlag(
+ bridge.getLoweringOptions().getComplexDivisionToRuntime());
builder->setFastMathFlags(bridge.getLoweringOptions().getMathOptions());
builder->setInsertionPointToStart(&func.front());
if (funit.parent.isA<Fortran::lower::pft::FunctionLikeUnit>()) {
diff --git a/flang/lib/Lower/ConvertExprToHLFIR.cpp b/flang/lib/Lower/ConvertExprToHLFIR.cpp
index df8dfbc72c030..cb338618dbf3b 100644
--- a/flang/lib/Lower/ConvertExprToHLFIR.cpp
+++ b/flang/lib/Lower/ConvertExprToHLFIR.cpp
@@ -1066,8 +1066,16 @@ struct BinaryOp<Fortran::evaluate::Divide<
mlir::Type ty = Fortran::lower::getFIRType(
builder.getContext(), Fortran::common::TypeCategory::Complex, KIND,
/*params=*/{});
- return hlfir::EntityWithAttributes{
- fir::genDivC(builder, loc, ty, lhs, rhs)};
+
+ // TODO: Ideally, complex number division operations should always be
+ // lowered to MLIR. However, converting them to the runtime via MLIR causes
+ // ABI issues.
+ if (builder.getComplexDivisionToRuntimeFlag())
+ return hlfir::EntityWithAttributes{
+ fir::genDivC(builder, loc, ty, lhs, rhs)};
+ else
+ return hlfir::EntityWithAttributes{
+ builder.create<mlir::complex::DivOp>(loc, lhs, rhs)};
}
};
diff --git a/flang/lib/Optimizer/CodeGen/CodeGen.cpp b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
index 2b018912b40e4..4cf533f195e69 100644
--- a/flang/lib/Optimizer/CodeGen/CodeGen.cpp
+++ b/flang/lib/Optimizer/CodeGen/CodeGen.cpp
@@ -4119,7 +4119,20 @@ class FIRToLLVMLowering
mathToFuncsOptions.minWidthOfFPowIExponent = 33;
mathConvertionPM.addPass(
mlir::createConvertMathToFuncs(mathToFuncsOptions));
- mathConvertionPM.addPass(mlir::createConvertComplexToStandardPass());
+
+ mlir::ConvertComplexToStandardPassOptions complexToStandardOptions{};
+ if (options.ComplexRange ==
+ Fortran::frontend::CodeGenOptions::ComplexRangeKind::CX_Basic) {
+ complexToStandardOptions.complexRange =
+ mlir::complex::ComplexRangeFlags::basic;
+ } else if (options.ComplexRange == Fortran::frontend::CodeGenOptions::
+ ComplexRangeKind::CX_Improved) {
+ complexToStandardOptions.complexRange =
+ mlir::complex::ComplexRangeFlags::improved;
+ }
+ mathConvertionPM.addPass(
+ mlir::createConvertComplexToStandardPass(complexToStandardOptions));
+
// Convert Math dialect operations into LLVM dialect operations.
// There is no way to prefer MathToLLVM patterns over MathToLibm
// patterns (applied below), so we have to run MathToLLVM conversion here.
diff --git a/flang/lib/Optimizer/Passes/Pipelines.cpp b/flang/lib/Optimizer/Passes/Pipelines.cpp
index 42d9e7ba2418f..d940934f9f6ad 100644
--- a/flang/lib/Optimizer/Passes/Pipelines.cpp
+++ b/flang/lib/Optimizer/Passes/Pipelines.cpp
@@ -113,6 +113,7 @@ void addFIRToLLVMPass(mlir::PassManager &pm,
options.forceUnifiedTBAATree = useOldAliasTags;
options.typeDescriptorsRenamedForAssembly =
!disableCompilerGeneratedNamesConversion;
+ options.ComplexRange = config.ComplexRange;
addPassConditionally(pm, disableFirToLlvmIr,
[&]() { return fir::createFIRToLLVMPass(options); });
// The dialect conversion framework may leave dead unrealized_conversion_cast
diff --git a/flang/test/Driver/complex-range.f90 b/flang/test/Driver/complex-range.f90
new file mode 100644
index 0000000000000..e5a1ba9068ac9
--- /dev/null
+++ b/flang/test/Driver/complex-range.f90
@@ -0,0 +1,23 @@
+! Test range options for complex multiplication and division.
+
+! RUN: %flang -### -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=RANGE
+
+! RUN: %flang -### -fcomplex-arithmetic=full -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=FULL
+
+! RUN: %flang -### -fcomplex-arithmetic=improved -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=IMPRVD
+
+! RUN: %flang -### -fcomplex-arithmetic=basic -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=BASIC
+
+! RUN: not %flang -### -fcomplex-arithmetic=foo -c %s 2>&1 \
+! RUN: | FileCheck %s --check-prefix=ERR
+
+! RANGE-NOT: -complex-range=
+! FULL: -complex-range=full
+! IMPRVD: -complex-range=improved
+! BASIC: -complex-range=basic
+
+! ERR: error: unsupported argument 'foo' to option '-fcomplex-arithmetic='
diff --git a/flang/test/Integration/complex-div-to-llvm-kind10.f90 b/flang/test/Integration/complex-div-to-llvm-kind10.f90
new file mode 100644
index 0000000000000..04d1f7ed9b024
--- /dev/null
+++ b/flang/test/Integration/complex-div-to-llvm-kind10.f90
@@ -0,0 +1,131 @@
+! Test lowering complex division to llvm ir according to options
+
+! REQUIRES: target=x86_64{{.*}}
+! RUN: %flang -fcomplex-arithmetic=improved -S -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,IMPRVD
+! RUN: %flang -fcomplex-arithmetic=basic -S -emit-llvm %s -o - | FileCheck %s --check-prefixes=CHECK,BASIC
+
+
+! CHECK-LABEL: @div_test_extended
+! CHECK-SAME: ptr %[[RET:.*]], ptr %[[LHS:.*]], ptr %[[RHS:.*]])
+! CHECK: %[[LOAD_LHS:.*]] = load { x86_fp80, x86_fp80 }, ptr %[[LHS]], align 16
+! CHECK: %[[LOAD_RHS:.*]] = load { x86_fp80, x86_fp80 }, ptr %[[RHS]], align 16
+! CHECK: %[[LHS_REAL:.*]] = extractvalue { x86_fp80, x86_fp80 } %[[LOAD_LHS]], 0
+! CHECK: %[[LHS_IMAG:.*]] = extractvalue { x86_fp80, x86_fp80 } %[[LOAD_LHS]], 1
+! CHECK: %[[RHS_REAL:.*]] = extractvalue { x86_fp80, x86_fp80 } %[[LOAD_RHS]], 0
+! CHECK: %[[RHS_IMAG:.*]] = extractvalue { x86_fp80, x86_fp80 } %[[LOAD_RHS]], 1
+
+! IMPRVD: %[[RHS_REAL_IMAG_RATIO:.*]] = fdiv contract x86_fp80 %[[RHS_REAL]], %[[RHS_IMAG]]
+! IMPRVD: %[[RHS_REAL_TIMES_RHS_REAL_IMAG_RATIO:.*]] = fmul contract x86_fp80 %[[RHS_REAL_IMAG_RATIO]], %[[RHS_REAL]]
+! IMPRVD: %[[RHS_REAL_IMAG_DENOM:.*]] = fadd contract x86_fp80 %[[RHS_IMAG]], %[[RHS_REAL_TIMES_RHS_REAL_IMAG_RATIO]]
+! IMPRVD: %[[LHS_REAL_TIMES_RHS_REAL_IMAG_RATIO:.*]] = fmul contract x86_fp80 %[[LHS_REAL]], %[[RHS_REAL_IMAG_RATIO]]
+! IMPRVD: %[[REAL_NUMERATOR_1:.*]] = fadd contract x86_fp80 %[[LHS_REAL_TIMES_RHS_REAL_IMAG_RATIO]], %[[LHS_IMAG]]
+! IMPRVD: %[[RESULT_REAL_1:.*]] = fdiv contract x86_fp80 %[[REAL_NUMERATOR_1]], %[[RHS_REAL_IMAG_DENOM]]
+! IMPRVD: %[[LHS_IMAG_TIMES_RHS_REAL_IMAG_RATIO:.*]] = fmul contract x86_fp80 %[[LHS_IMAG]], %[[RHS_REAL_IMAG_RATIO]]
+! IMPRVD: %[[IMAG_NUMERATOR_1:.*]] = fsub contract x86_fp80 %[[LHS_IMAG_TIMES_RHS_REAL_IMAG_RATIO]], %[[LHS_REAL]]
+! IMPRVD: %[[RESULT_IMAG_1:.*]] = fdiv contract x86_fp80 %[[IMAG_NUMERATOR_1]], %[[RHS_REAL_IMAG_DENOM]]
+! IMPRVD: %[[RHS_IMAG_REAL_RATIO:.*]] = fdiv contract x86_fp80 %[[RHS_IMAG]], %[[RHS_REAL]]
+! IMPRVD: %[[RHS_IMAG_TIMES_RHS_IMAG_REAL_RATIO:.*]] = fmul contract x86_fp80 %[[RHS_IMAG_REAL_RATIO]], %[[RHS_IMAG]]
+! IMPRVD: %[[RHS_IMAG_REAL_DENOM:.*]] = fadd contract x86_fp80 %[[RHS_REAL]], %[[RHS_IMAG_TIMES_RHS_IMAG_REAL_RATIO]]
+! IMPRVD: %[[LHS_IMAG_TIMES_RHS_IMAG_REAL_RATIO:.*]] = fmul contract x86_fp80 %[[LHS_IMAG]], %[[RHS_IMAG_REAL_RATIO]]
+! IMPRVD: %[[REAL_NUMERATOR_2:.*]] = fadd contract x86_fp80 %[[LHS_REAL]], %[[LHS_IMAG_TIMES_RHS_IMAG_REAL_RATIO]]
+! IMPRVD: %[[RESULT_REAL_2:.*]] = fdiv contract x86_fp80 %[[REAL_NUMERATOR_2]], %[[RHS_IMAG_REAL_DENOM]]
+! IMPRVD: %[[LHS_REAL_TIMES_...
[truncated]
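
For readers comparing the `improved` and `basic` expansions, here is a minimal standalone C++ sketch of the two formulas named in the PR description: Smith's algorithm and the algebraic formula. The function names `divSmith` and `divAlgebraic` are hypothetical and do not appear in the patch. The sketch works on `double`, whereas the test above exercises `x86_fp80`. It illustrates the arithmetic only and is not the code MLIR emits for `complex.div`.

```cpp
#include <cmath>
#include <complex>

// Hypothetical helper, not part of the patch.
// "basic": the textbook algebraic formula
//   (a + bi) / (c + di) = ((ac + bd) + (bc - ad)i) / (c^2 + d^2).
// The denominator c*c + d*d can overflow or underflow even when the true
// quotient is representable.
std::complex<double> divAlgebraic(std::complex<double> lhs,
                                  std::complex<double> rhs) {
  const double a = lhs.real(), b = lhs.imag();
  const double c = rhs.real(), d = rhs.imag();
  const double denom = c * c + d * d;
  return {(a * c + b * d) / denom, (b * c - a * d) / denom};
}

// Hypothetical helper, not part of the patch.
// "improved": Smith's algorithm. Dividing through by the larger component of
// the divisor keeps the ratio at magnitude <= 1, so the intermediate
// denominator avoids the squares used above.
std::complex<double> divSmith(std::complex<double> lhs,
                              std::complex<double> rhs) {
  const double a = lhs.real(), b = lhs.imag();
  const double c = rhs.real(), d = rhs.imag();
  if (std::fabs(c) >= std::fabs(d)) {
    const double ratio = d / c;          // imag / real
    const double denom = c + d * ratio;
    return {(a + b * ratio) / denom, (b - a * ratio) / denom};
  }
  const double ratio = c / d;            // real / imag
  const double denom = d + c * ratio;
  return {(a * ratio + b) / denom, (b * ratio - a) / denom};
}
```

The second branch of `divSmith` has the same shape as the first group of `IMPRVD` check lines (ratio `RHS_REAL / RHS_IMAG`, denominator `RHS_IMAG + ratio * RHS_REAL`), and the first branch matches the second group. For operands such as `(1e200, 1e200) / (1e200, 1e200)`, the algebraic form overflows in `c * c + d * d`, while Smith's form stays finite.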
Looks great! Thank you!
LGTM! Thanks for working on this @s-watanabe314. This will be very helpful to make further comparison.
Thanks!
@@ -9,6 +9,7 @@
#ifndef FORTRAN_OPTIMIZER_CODEGEN_CODEGEN_H
#define FORTRAN_OPTIMIZER_CODEGEN_CODEGEN_H

+#include "flang/Frontend/CodeGenOptions.h"
Not requesting that here, but I think CodeGenOptions should be defined in Codegen and used/set in Frontend, rather than having Codegen depend on Frontend things.
This can be refactored independently, and it is not a huge deal for a header use that adds no library linking dependency.
I think the `CodeGenOptions.h` header is included because of the `ComplexRangeKind` enum. If that is moved to `llvm/Frontend/Driver/CodeGenOptions.h`, we don't have this issue. It may be worth doing it in this PR, but I am ok with moving it in a separate PR as well.
Thank you for the reviews!

> Not requesting to do that here, but I feel the CodeGenOptions should be defined in Codegen and used/set in Frontend rather than having Codegen depends on Frontend things I think.

Does this mean that `ComplexRangeKind` should be defined in `CodeGen.h` instead of `CodeGenOptions.h`, and that CodeGenOptions should reference it?

> I think the CodeGenOptions.h header is included because of the ComplexRangeKind enum. If that is moved to llvm/Frontend/Driver/CodeGenOptions.h, we don't have this issue. It may be worth doing it in this PR, but I am ok with moving it in a separate PR as well.

Yes, I'm including `CodeGenOptions.h` to use the `ComplexRangeKind` enum. I'll also try moving it to `llvm/Frontend/Driver/CodeGenOptions.h`, but I think that would be a separate PR.
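
To make the layering question concrete, here is a rough sketch of what a frontend-independent definition could look like. It assumes a single enum plus a string parser that the driver, frontend, and codegen could all include. Every name here (`ComplexRangeKind` as a scoped enum, `parseComplexRange`, `lowersDivisionToRuntime`) is hypothetical. This is not the layout of `llvm/Frontend/Driver/CodeGenOptions.h`, and the sketch uses only the standard library rather than LLVM's own utilities.

```cpp
#include <optional>
#include <string_view>

// Hypothetical shared enum: one definition that neither side owns, so the
// codegen header would not need to include a frontend header.
enum class ComplexRangeKind { Full, Improved, Basic };

// Maps the option value to the enum. Values that Flang does not accept in
// this patch (for example "promoted") yield std::nullopt so the caller can
// report a diagnostic.
std::optional<ComplexRangeKind> parseComplexRange(std::string_view value) {
  if (value == "full")
    return ComplexRangeKind::Full;
  if (value == "improved")
    return ComplexRangeKind::Improved;
  if (value == "basic")
    return ComplexRangeKind::Basic;
  return std::nullopt;
}

// Mirrors the rule in setLoweringOptions(): only "improved" and "basic"
// switch complex division from the runtime call to complex.div.
bool lowersDivisionToRuntime(ComplexRangeKind kind) {
  return kind == ComplexRangeKind::Full;
}
```

With a definition like this, the frontend would parse the option once and pass the enum down, and the codegen pass options would depend only on the shared header.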
Dear Shunsuke,

Thank you for your thoughtful response and for sharing your approach to complex division optimization. We encountered similar performance issues while analyzing SPEC2017 workloads, particularly in the handling of complex arithmetic, and were motivated to propose a solution grounded in the current design conventions of LLVM Flang.

While our implementation aligns closely with existing patterns in the Flang complex runtime, we’ve found that your formulation provides a more principled and generalizable abstraction. It is both elegant and insightful, and indeed offers a more flexible foundation for future extensions in this area.

In our benchmarks, we observed that the performance gains achieved by your approach are comparable to those of our implementation. We have carefully studied your proposal and found it to be genuinely inspiring. Your contribution exemplifies the clarity and depth characteristic of the LLVM community. We look forward to learning more from your work and to further engaging with the community on this topic.

Enjoy the rest of your day!

Warmest regards,
Hanwen (@OpenXiangShan)
LG. Thanks for all the work and experiments here.
Couple of nit/trivial comments.
 def fcomplex_arithmetic_EQ : Joined<["-"], "fcomplex-arithmetic=">, Group<f_Group>,
-  Visibility<[ClangOption, CC1Option]>,
+  Visibility<[ClangOption, CC1Option, FlangOption, FC1Option]>,
   Values<"full,improved,promoted,basic">, NormalizedValuesScope<"LangOptions">,
   NormalizedValues<["CX_Full", "CX_Improved", "CX_Promoted", "CX_Basic"]>;
Is help text available for these options? If not, could you add it?
I added help text. Could you please review it?
Looks Good. Would you mind adding some information to https://github.com/llvm/llvm-project/blob/main/flang/docs/ComplexOperations.md as well?
The Clang documentation in `clang/docs/UsersManual.rst` is more detailed.
static std::string ComplexRangeKindToStr(LangOptions::ComplexRangeKind Range) {
  switch (Range) {
  case LangOptions::ComplexRangeKind::CX_Full:
    return "full";
    break;
  case LangOptions::ComplexRangeKind::CX_Improved:
    return "improved";
    break;
  case LangOptions::ComplexRangeKind::CX_Basic:
    return "basic";
    break;
  default:
    return "";
  }
}

static std::string
RenderComplexRangeOption(LangOptions::ComplexRangeKind Range) {
  std::string ComplexRangeStr = ComplexRangeKindToStr(Range);
  if (!ComplexRangeStr.empty())
    return "-complex-range=" + ComplexRangeStr;
  return ComplexRangeStr;
}
Can we share this code with the similar code in `Clang.cpp` by moving it to `CommonArgs.cpp` or another suitable place?
I moved both functions to `CommonArgs.cpp` and removed them from `Clang.cpp` and `Flang.cpp`.
if (builder.getComplexDivisionToRuntimeFlag())
  return hlfir::EntityWithAttributes{
      fir::genDivC(builder, loc, ty, lhs, rhs)};
else
  return hlfir::EntityWithAttributes{
      builder.create<mlir::complex::DivOp>(loc, lhs, rhs)};
Nit: braces
subroutine div_test_quad(a,b,c)
  complex(kind=16) :: a, b, c
  a = b / c
end subroutine div_test_quad
Nit: end of line/newline
Thanks. I think we should share the `ComplexRangeKind` enum between `clang` and `flang`. I am ok with doing it in a separate PR.
@@ -629,6 +655,20 @@ static void addFloatingPointOptions(const Driver &D, const ArgList &Args,
     default:
       continue;

+    case options::OPT_fcomplex_arithmetic_EQ: {
I think this can be moved to `CommonArgs.cpp` as well.
Although Clang implements `CX_Promoted`, the corresponding conversion is not implemented in Flang. Therefore I want the driver to emit an error. Since this behavior differs slightly from Clang, moving this to `CommonArgs.cpp` might be difficult.
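The difference from Clang described in this reply could be sketched roughly as follows. This is a simplified standalone illustration, not the driver code from the patch: the `ComplexRangeKind` enum below is a local stand-in, and `parseFlangComplexArithmetic` is a hypothetical helper that returns `std::nullopt` at the point where a real driver would emit a diagnostic.

```cpp
#include <optional>
#include <string_view>

// Local stand-in for the enum discussed in this thread (assumption: the
// real enum lives in the Clang/Flang option headers, not here).
enum class ComplexRangeKind { CX_Full, CX_Improved, CX_Promoted, CX_Basic };

// Hypothetical helper: maps a -fcomplex-arithmetic= value to a kind.
// Returns std::nullopt for values Flang does not support. That includes
// "promoted", which Clang accepts but for which Flang has no conversion.
std::optional<ComplexRangeKind>
parseFlangComplexArithmetic(std::string_view value) {
  if (value == "full")
    return ComplexRangeKind::CX_Full;
  if (value == "improved")
    return ComplexRangeKind::CX_Improved;
  if (value == "basic")
    return ComplexRangeKind::CX_Basic;
  return std::nullopt; // "promoted" and unknown values are rejected
}
```

A caller would treat an empty result as the signal to report an error, which is the Flang-specific step that makes sharing this logic with Clang awkward.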
@@ -192,6 +192,31 @@ class CodeGenOptions : public CodeGenOptionsBase {
     return getProfileUse() == llvm::driver::ProfileCSIRInstr;
   }

+  /// Controls the various implementations for complex division.
If this is exactly the same as the enum in `clang/Basic/LangOptions`, it could be moved to `llvm/Frontend/Driver/CodeGenOptions.h` and shared between clang and flang. There is precedent for this, most recently, here.
Thank you for providing the example! I understand that your example is about moving Clang's `CodeGenOptions` to `llvm/Frontend/Driver/CodeGenOptions.h`. Since Flang already defines `ComplexRangeKind` in `CodeGenOptions`, moving it to `llvm/Frontend/Driver/CodeGenOptions.h` might not require significant modifications. However, in Clang, it's defined in `LangOptions`, so I'm unsure how to move it and how much modification would be necessary.
@@ -9,6 +9,7 @@
 #ifndef FORTRAN_OPTIMIZER_CODEGEN_CODEGEN_H
 #define FORTRAN_OPTIMIZER_CODEGEN_CODEGEN_H

+#include "flang/Frontend/CodeGenOptions.h"
I think the `CodeGenOptions.h` header is included because of the `ComplexRangeKind` enum. If that is moved to `llvm/Frontend/Driver/CodeGenOptions.h`, we don't have this issue. It may be worth doing it in this PR, but I am ok with moving it in a separate PR as well.
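To make the layering concern concrete, here is a minimal standalone sketch of the dependency direction being suggested. Everything in it is hypothetical: the three namespaces merely stand in for a shared header, the Codegen layer, and the Frontend layer, and `describeComplexDivision` is an illustrative name rather than an existing function. The option-to-algorithm mapping follows the PR description.

```cpp
#include <string_view>

// Stand-in for a shared header (assumption: something like
// llvm/Frontend/Driver/CodeGenOptions.h would own the enum).
namespace shared {
enum class ComplexRangeKind { CX_Full, CX_Improved, CX_Promoted, CX_Basic };
} // namespace shared

// Stand-in for the Codegen layer: it sees only the shared enum and
// includes nothing from the Frontend layer.
namespace codegen {
inline std::string_view describeComplexDivision(shared::ComplexRangeKind kind) {
  switch (kind) {
  case shared::ComplexRangeKind::CX_Full:
    return "runtime call";
  case shared::ComplexRangeKind::CX_Improved:
    return "Smith's algorithm";
  case shared::ComplexRangeKind::CX_Basic:
    return "algebraic algorithm";
  case shared::ComplexRangeKind::CX_Promoted:
    return "unsupported in Flang";
  }
  return "unknown";
}
} // namespace codegen

// Stand-in for the Frontend layer: it stores the option value and hands
// the shared enum to Codegen, so the dependency points one way only.
namespace frontend {
struct CodeGenOptions {
  shared::ComplexRangeKind complexRange = shared::ComplexRangeKind::CX_Full;
};
} // namespace frontend
```

With the enum in the shared location, the Codegen stand-in compiles without knowing that `frontend::CodeGenOptions` exists, which is the property the review comment is after.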
@@ -484,6 +484,21 @@ static void parseCodeGenArgs(Fortran::frontend::CodeGenOptions &opts,
   }

   parseDoConcurrentMapping(opts, args, diags);

+  if (const auto *arg =
Could we use a concrete type instead of `auto` here?
I modified it to use `llvm::opt::Arg`.
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
This patch adds an option to select the method for computing complex number division. It uses `LoweringOptions` to determine whether to lower complex division to a runtime function call or to MLIR's `complex.div`, and `CodeGenOptions` to select the computation algorithm for `complex.div`. The available option values and their corresponding algorithms are as follows:

- `full`: Lower to a runtime function call. (Default behavior)
- `improved`: Lower to `complex.div` and expand to Smith's algorithm.
- `basic`: Lower to `complex.div` and expand to the algebraic algorithm.

See also the discussion in the following discourse post: https://discourse.llvm.org/t/optimization-of-complex-number-division/83468