[WIP] Use Enzyme.jl for constant optimization #254

MilesCranmer · 2023-08-13T18:15:25Z

Requires: SymbolicML/DynamicExpressions.jl#52

This attempts to use reverse-mode Enzyme.jl for constant optimization, which seems to give a nice speedup.

There are still some issues that I will continue debugging, like:

┌ Error: Enzyme aligned size and Julia size disagree
│   AlignedSize = 2192
│   esizeof(TT) = 2136
│   fieldtypes(TT) = (Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, @NamedTuple{1, 2, 3, 4, 5::Bool, 6::UInt64, 7::Core.LLVMPtr{UInt64, 0}, 8::UInt64, 9::UInt64}, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, @NamedTuple{1, 2, 3, 4, 5, 6::NTuple{4, Float32}, 7::NTuple{4, Float32}, 8::NTuple{4, Float32}, 9::NTuple{4, Float32}, 10, 11::Core.LLVMPtr{UInt8, 0}, 12, 13, 14::UInt64, 15::UInt64, 16::UInt64, 17::UInt64, 18::UInt64, 19::Bool, 20::Float32, 21::UInt64, 22::Core.LLVMPtr{UInt8, 0}}, @NamedTuple{1, 2, 3, 4, 5, 6::NTuple{4, Float32}, 7::NTuple{4, Float32}, 8::NTuple{4, Float32}, 9::NTuple{4, Float32}, 10, 11::Core.LLVMPtr{UInt8, 0}, 12, 13, 14::UInt64, 15::UInt64, 16::UInt64, 17::UInt64, 18::UInt64, 19::Bool, 20::Float32, 21::UInt64, 22::Core.LLVMPtr{UInt8, 0}}, @NamedTuple{1, 2, 3, 4, 5, 6::NTuple{4, Float32}, 7::NTuple{4, Float32}, 8::NTuple{4, Float32}, 9::NTuple{4, Float32}, 10, 11::Core.LLVMPtr{UInt8, 0}, 12, 13, 14::UInt64, 15::UInt64, 16::UInt64, 17::UInt64, 18::UInt64, 19::Bool, 20::Float32, 21::UInt64, 22::Core.LLVMPtr{UInt8, 0}}, @NamedTuple{1, 2, 3, 4, 5, 6::NTuple{4, Float32}, 7::NTuple{4, Float32}, 8::NTuple{4, Float32}, 9::NTuple{4, Float32}, 10, 11::Core.LLVMPtr{UInt8, 0}, 12, 13, 14::UInt64, 15::UInt64, 16::UInt64, 17::UInt64, 18::UInt64, 19::Bool, 20::Float32, 21::UInt64, 22::Core.LLVMPtr{UInt8, 0}}, @NamedTuple{1, 2::UInt64}, Tuple{Core.LLVMPtr{UInt8, 0}, NTuple{4, Float32}, NTuple{4, Float32}, NTuple{4, Float32}, NTuple{4, Float32}, UInt64, Core.LLVMPtr{UInt8, 0}, Core.LLVMPtr{NTuple{4, Float32}, 0}, Core.LLVMPtr{NTuple{4, Float32}, 0}, Core.LLVMPtr{NTuple{4, Float32}, 0}, Core.LLVMPtr{NTuple{4, Float32}, 0}, Vararg{NTuple{4, Float32}, 6}}, Tuple{Core.LLVMPtr{UInt8, 0}, NTuple{4, Float32}, NTuple{4, Float32}, NTuple{4, Float32}, NTuple{4, Float32}, UInt64, Core.LLVMPtr{UInt8, 0}, Core.LLVMPtr{NTuple{4, Float32}, 0}, Core.LLVMPtr{NTuple{4, Float32}, 0}, Core.LLVMPtr{NTuple{4, Float32}, 0}, 
Core.LLVMPtr{NTuple{4, Float32}, 0}, Vararg{NTuple{4, Float32}, 6}}, Any, Any, Any, Any, Any, Any, Any, @NamedTuple{1, 2::UInt64}, Any, Any, @NamedTuple{1, 2::UInt64}, Any, Any, @NamedTuple{1, 2::UInt64}, Any, @NamedTuple{1, 2::UInt64}, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, Any, @NamedTuple{1, 2, 3, 4, 5, 6, 7::UInt8, 8, 9, 10, 11::Bool}, @NamedTuple{1, 2, 3, 4, 5, 6, 7::UInt8, 8, 9, 10, 11::Bool}, UInt8, Any, UInt8, Any, Any, Any, Bool, Bool, UInt64, UInt64, UInt64, UInt8, UInt8, Any, Bool, UInt8, Any, Any, Bool, Bool, Bool, Any, Bool, UInt8, Bool, Bool, UInt8, Bool, Bool, Bool, Bool)
└ @ Enzyme.Compiler ~/.julia/packages/GPUCompiler/YO8Uj/src/utils.jl:56
Assertion failed: (isa<To>(Val) && "cast<Ty>() argument of incompatible type!"), function cast, file /opt/aarch64-apple-darwin20/aarch64-apple-darwin20/sys-root/usr/local/include/llvm/Support/Casting.h, line 578.

This comes from interference with LoopVectorization.jl. In principle, the -O2 pass should remove the LoopVectorization.jl code, since turbo=false is a constant.
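A later commit in this PR is titled "Use Val(turbo) instead of turbo". A minimal sketch of that general idea follows, assuming nothing about the PR's actual code: `sum_squares`, `evaluate_runtime`, and `evaluate_static` are hypothetical names, and a plain `@simd` loop stands in for the LoopVectorization.jl path. Lifting the flag into the type domain lets dispatch select one method at compile time, rather than relying on an optimization pass to prune a dead branch.

```julia
# Hypothetical kernels, not the PR's code: `sum_squares` stands in for an
# evaluation routine with a plain path and a "turbo" path.
sum_squares(x, ::Val{false}) = sum(abs2, x)

function sum_squares(x, ::Val{true})
    s = zero(eltype(x))
    @inbounds @simd for i in eachindex(x)
        s += x[i]^2
    end
    return s
end

# Runtime Bool: both paths may remain in the compiled function unless
# constant propagation happens to remove one of them.
function evaluate_runtime(x, turbo::Bool)
    return turbo ? sum_squares(x, Val(true)) : sum_squares(x, Val(false))
end

# Type-level flag: dispatch picks a single method during specialization,
# so the other path is never part of the specialized code.
evaluate_static(x, ::Val{turbo}) where {turbo} = sum_squares(x, Val(turbo))

x = Float32[1, 2, 3]
evaluate_runtime(x, false) == evaluate_static(x, Val(false))
```

Whether this is enough to keep Enzyme.jl from seeing the vectorized path depends on how the flag is threaded through the real call chain, which this sketch does not model.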

github-actions · 2023-08-13T20:22:04Z

Benchmark Results

| | master | `38c3f0c`... | t[master]/t[`38c3f0c`...] |
|---|---|---|---|
| search/multithreading | 51.4 ± 5.3 s | 0.0175 ± 0.0016 h | 0.818 |
| search/serial | 52.9 ± 1.2 s | 57.6 ± 0.94 s | 0.918 |
| time_to_load | 2.41 ± 0.0042 s | 2.52 ± 0.018 s | 0.954 |
| utils/best_of_sample | 1.4 ± 0.5 μs | 1.6 ± 0.5 μs | 0.875 |
| utils/check_constraints_x10 | 19.2 ± 4.8 μs | 19 ± 4.7 μs | 1.01 |
| utils/compute_complexity_x10/Float64 | 3.3 ± 0.1 μs | 3.2 ± 0.1 μs | 1.03 |
| utils/compute_complexity_x10/Int64 | 3.2 ± 0.1 μs | 3.4 ± 0.1 μs | 0.941 |
| utils/compute_complexity_x10/nothing | 2.5 ± 0.2 μs | 2.5 ± 0.2 μs | 1 |
| utils/optimize_constants_x10 | 0.0477 ± 0.011 s | 0.0493 ± 0.011 s | 0.967 |

Benchmark Plots

A plot of the benchmark results has been uploaded as an artifact to the workflow run for this PR.
Go to "Actions"->"Benchmark a pull request"->[the most recent run]->"Artifacts" (at the bottom).

MilesCranmer · 2023-08-17T15:24:57Z

src/LossFunctions.jl

    else
-        l(i) = loss(x[i], y[i], w[i])
-        return sum(l, eachindex(x)) / sum(w)
+        return sum(@. loss(x, y, w)) / sum(w)


Would be nice to avoid this, but it seems like sum is using the dataset array for temporary storage somehow? (Or Enzyme.jl thinks it is?)
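For reference, the two forms in the diff above compute the same value; the broadcast version materializes a temporary array of per-element losses before summing, whereas the closure version reduces over indices without one. A small sketch, assuming a hypothetical elementwise `loss` (the real one is user-supplied) and hypothetical function names:

```julia
# Hypothetical weighted elementwise loss, used only for illustration.
loss(x, y, w) = w * (x - y)^2

# Original form: closure over the arrays, reduced over indices.
function weighted_loss_closure(x, y, w)
    l(i) = loss(x[i], y[i], w[i])
    return sum(l, eachindex(x)) / sum(w)
end

# Workaround form: broadcast into a temporary array, then sum it.
function weighted_loss_broadcast(x, y, w)
    return sum(@. loss(x, y, w)) / sum(w)
end

x = Float32[1, 2, 3]
y = Float32[1.5, 2, 2]
w = Float32[1, 2, 1]

weighted_loss_closure(x, y, w) ≈ weighted_loss_broadcast(x, y, w)
```

The extra allocation is the cost of the workaround, which is presumably why it would be nice to avoid.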

MilesCranmer · 2024-07-01T15:55:27Z

Moved to #326

MilesCranmer marked this pull request as draft August 13, 2023 18:15

This was referenced Aug 13, 2023

Understanding Error: Enzyme aligned size and Julia size disagree EnzymeAD/Enzyme.jl#999

Closed

Overload Optim.optimize for ::Node SymbolicML/DynamicExpressions.jl#30

Merged

This was referenced Aug 17, 2023

Catching assertions / debugging mode? EnzymeAD/Enzyme.jl#1009

Closed

Enzyme compatibility SymbolicML/DynamicExpressions.jl#52

Merged

MilesCranmer force-pushed the enzyme2 branch from 78c1ac2 to f8b6f62 Compare August 17, 2023 23:11

MilesCranmer mentioned this pull request Aug 28, 2023

Hanging and task switch errors when differentiating DynamicExpressions.jl EnzymeAD/Enzyme.jl#1018

Closed

MilesCranmer added 13 commits August 28, 2023 15:09

Reduce repeated code in ConstantOptimization.jl

c8b627b

Create Enzyme extension for constant optimization

5c36fce

Workaround Enzyme/LossFunctions incompatibility

a7d5556

Fix use_autodiff val

57482d7

Fix Enzyme call

c4ace53

Create new storage variable

85804a3

Require DynamicExpressions.jl fix

29fce99

Ping build

489840c

Use Val(turbo) instead of turbo

1c5f66e

Turn on specialization when enzyme enabled

a4960e2

Switch turbo option back

381a047

Update backend

fed4a69

Set up fuse_level option for Enzyme compatibility

05079c8

MilesCranmer force-pushed the enzyme2 branch from 5b9e7d3 to 05079c8 Compare August 28, 2023 19:09

MilesCranmer closed this Jul 1, 2024
