[UPLC] [Optimization] Add force-ifThenElse-delay #7042

Merged
effectfully merged 2 commits into master from effectfully/optimization/force-ifThenElse-delay
Apr 16, 2025

effectfully
Contributor

This implements force-delay cancellation when delays appear under ifThenElse.

Serves the same purpose as #7040, but does it better.
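
To make the rewrite concrete, here is a minimal sketch on a toy AST. It is not the actual Plutus `Term` type or the code in this PR: `Term`, `Builtin`, `isPureWorkFree`, and `cancelForceDelay` are hypothetical names, and the purity check is a deliberately conservative stand-in. The side condition (branches must be pure and work-free) is taken from the code comment in the diff, since builtin arguments are evaluated eagerly once the `delay`s are gone.

```haskell
-- Toy UPLC-like AST for illustration only (assumption: not the real Plutus types).
data Builtin = IfThenElse | EqualsInteger
  deriving (Eq, Show)

data Term
  = Var String
  | ConstInt Integer
  | ConstBool Bool
  | BuiltinFun Builtin
  | Apply Term Term
  | Force Term
  | Delay Term
  | Error
  deriving (Eq, Show)

-- Hypothetical, conservative approximation of "pure and work-free":
-- only variables, constants, builtins, and delayed terms qualify.
isPureWorkFree :: Term -> Bool
isPureWorkFree t = case t of
  Var _        -> True
  ConstInt _   -> True
  ConstBool _  -> True
  BuiltinFun _ -> True
  Delay _      -> True
  _            -> False

-- One rewrite step at the root of a term.
cancelForceDelay :: Term -> Term
cancelForceDelay term = case term of
  -- Plain cancellation: force (delay t) ==> t
  Force (Delay t) -> t
  -- force (force ifThenElse c (delay a) (delay b)) ==> force ifThenElse c a b,
  -- but only when ifThenElse has all three arguments and both branches
  -- are safe to evaluate eagerly.
  Force (Apply (Apply (Apply ite@(Force (BuiltinFun IfThenElse)) c) (Delay a)) (Delay b))
    | isPureWorkFree a && isPureWorkFree b -> Apply (Apply (Apply ite c) a) b
  -- Anything else is left untouched.
  _ -> term
```

In this sketch a branch such as `Delay Error` blocks the rewrite, because dropping the `delay` would evaluate the error even when the other branch is selected.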

@effectfully
Contributor Author

/benchmark nofib

@effectfully
Contributor Author

/benchmark nofib

@effectfully
Contributor Author

/benchmark lists

@effectfully
Contributor Author

/benchmark lists

Comparing benchmark results of 'nofib' on '24dcdb4d21' (base) and '587a9e6523' (PR)

Results table
Script 24dcdb4 587a9e6 Change
clausify/formula1 2.169 ms 2.170 ms +0.0%
clausify/formula2 2.930 ms 2.922 ms -0.3%
clausify/formula3 8.115 ms 8.089 ms -0.3%
clausify/formula4 17.64 ms 17.36 ms -1.6%
clausify/formula5 39.67 ms 39.31 ms -0.9%
knights/4x4 15.01 ms 14.68 ms -2.2%
knights/6x6 36.52 ms 35.66 ms -2.4%
knights/8x8 63.17 ms 61.83 ms -2.1%
primetest/05digits 9.222 ms 9.030 ms -2.1%
primetest/10digits 17.83 ms 17.74 ms -0.5%
primetest/30digits 54.93 ms 54.32 ms -1.1%
primetest/50digits 90.97 ms 89.77 ms -1.3%
queens4x4/bt 4.087 ms 3.992 ms -2.3%
queens4x4/bm 5.070 ms 4.949 ms -2.4%
queens4x4/bjbt1 4.884 ms 4.781 ms -2.1%
queens4x4/bjbt2 4.609 ms 4.513 ms -2.1%
queens4x4/fc 10.06 ms 9.875 ms -1.8%
queens5x5/bt 57.18 ms 55.64 ms -2.7%
queens5x5/bm 58.23 ms 56.71 ms -2.6%
queens5x5/bjbt1 65.92 ms 64.45 ms -2.2%
queens5x5/bjbt2 64.16 ms 62.71 ms -2.3%
queens5x5/fc 127.6 ms 125.4 ms -1.7%
TOTAL 760.0 ms 745.9 ms -1.9%

@effectfully force-pushed the effectfully/optimization/force-ifThenElse-delay branch from 587a9e6 to e9ae45c on April 15, 2025 14:08
@effectfully force-pushed the effectfully/optimization/force-ifThenElse-delay branch from e9ae45c to 114b4bd on April 15, 2025 14:11

Comparing benchmark results of 'nofib' on '24dcdb4d21' (base) and '587a9e6523' (PR)

Results table
Script 24dcdb4 587a9e6 Change
clausify/formula1 2.155 ms 2.204 ms +2.3%
clausify/formula2 2.906 ms 2.973 ms +2.3%
clausify/formula3 8.067 ms 8.270 ms +2.5%
clausify/formula4 17.53 ms 17.91 ms +2.2%
clausify/formula5 39.40 ms 40.24 ms +2.1%
knights/4x4 14.90 ms 15.24 ms +2.3%
knights/6x6 36.31 ms 37.14 ms +2.3%
knights/8x8 62.77 ms 64.33 ms +2.5%
primetest/05digits 9.188 ms 9.304 ms +1.3%
primetest/10digits 17.84 ms 18.23 ms +2.2%
primetest/30digits 55.00 ms 55.88 ms +1.6%
primetest/50digits 90.72 ms 92.09 ms +1.5%
queens4x4/bt 4.080 ms 4.098 ms +0.4%
queens4x4/bm 5.054 ms 5.065 ms +0.2%
queens4x4/bjbt1 4.864 ms 4.888 ms +0.5%
queens4x4/bjbt2 4.584 ms 4.619 ms +0.8%
queens4x4/fc 10.16 ms 10.11 ms -0.5%
queens5x5/bt 56.96 ms 57.32 ms +0.6%
queens5x5/bm 57.80 ms 58.16 ms +0.6%
queens5x5/bjbt1 65.72 ms 66.17 ms +0.7%
queens5x5/bjbt2 64.04 ms 64.55 ms +0.8%
queens5x5/fc 127.1 ms 129.1 ms +1.6%
TOTAL 757.1 ms 767.9 ms +1.4%

Comparing benchmark results of 'lists' on '24dcdb4d21' (base) and '114b4bdd0d' (PR)

Results table
Script 24dcdb4 114b4bd Change
sort/ghcSort/50 172.1 μs 170.1 μs -1.2%
sort/ghcSort/100 401.9 μs 399.7 μs -0.5%
sort/ghcSort/150 692.0 μs 685.0 μs -1.0%
sort/ghcSort/200 930.6 μs 922.9 μs -0.8%
sort/ghcSort/250 1.205 ms 1.196 ms -0.7%
sort/ghcSort/300 1.585 ms 1.572 ms -0.8%
sort/insertionSort/50 585.8 μs 581.8 μs -0.7%
sort/insertionSort/100 2.341 ms 2.312 ms -1.2%
sort/insertionSort/150 5.277 ms 5.219 ms -1.1%
sort/insertionSort/200 9.441 ms 9.301 ms -1.5%
sort/insertionSort/250 14.74 ms 14.57 ms -1.2%
sort/insertionSort/300 21.31 ms 20.95 ms -1.7%
sort/mergeSort/50 527.0 μs 526.0 μs -0.2%
sort/mergeSort/100 1.211 ms 1.197 ms -1.2%
sort/mergeSort/150 1.954 ms 1.930 ms -1.2%
sort/mergeSort/200 2.742 ms 2.704 ms -1.4%
sort/mergeSort/250 3.590 ms 3.548 ms -1.2%
sort/mergeSort/300 4.372 ms 4.361 ms -0.3%
sort/quickSort/50 1.367 ms 1.345 ms -1.6%
sort/quickSort/100 5.665 ms 5.574 ms -1.6%
sort/quickSort/150 12.71 ms 12.52 ms -1.5%
sort/quickSort/200 22.55 ms 22.27 ms -1.2%
sort/quickSort/250 35.55 ms 35.06 ms -1.4%
sort/quickSort/300 50.99 ms 50.38 ms -1.2%
sum/compiled-from-Haskell/sum-right-builtin/100 80.46 μs 77.38 μs -3.8%
sum/compiled-from-Haskell/sum-right-builtin/500 423.4 μs 405.9 μs -4.1%
sum/compiled-from-Haskell/sum-right-builtin/1000 864.8 μs 860.9 μs -0.5%
sum/compiled-from-Haskell/sum-right-builtin/2500 2.677 ms 2.663 ms -0.5%
sum/compiled-from-Haskell/sum-right-builtin/5000 5.789 ms 5.757 ms -0.6%
sum/compiled-from-Haskell/sum-right-Scott/100 43.11 μs 43.01 μs -0.2%
sum/compiled-from-Haskell/sum-right-Scott/500 229.1 μs 227.2 μs -0.8%
sum/compiled-from-Haskell/sum-right-Scott/1000 484.4 μs 481.0 μs -0.7%
sum/compiled-from-Haskell/sum-right-Scott/2500 1.713 ms 1.695 ms -1.1%
sum/compiled-from-Haskell/sum-right-Scott/5000 4.128 ms 4.083 ms -1.1%
sum/compiled-from-Haskell/sum-right-data/100 257.9 μs 248.5 μs -3.6%
sum/compiled-from-Haskell/sum-right-data/500 1.419 ms 1.374 ms -3.2%
sum/compiled-from-Haskell/sum-right-data/1000 3.189 ms 3.089 ms -3.1%
sum/compiled-from-Haskell/sum-right-data/2500 8.524 ms 8.312 ms -2.5%
sum/compiled-from-Haskell/sum-right-data/5000 18.07 ms 17.25 ms -4.5%
sum/compiled-from-Haskell/sum-left-builtin/100 74.90 μs 74.97 μs +0.1%
sum/compiled-from-Haskell/sum-left-builtin/500 389.7 μs 388.2 μs -0.4%
sum/compiled-from-Haskell/sum-left-builtin/1000 838.1 μs 837.3 μs -0.1%
sum/compiled-from-Haskell/sum-left-builtin/2500 2.571 ms 2.568 ms -0.1%
sum/compiled-from-Haskell/sum-left-builtin/5000 5.681 ms 5.679 ms -0.0%
sum/compiled-from-Haskell/sum-left-Scott/100 41.88 μs 41.83 μs -0.1%
sum/compiled-from-Haskell/sum-left-Scott/500 221.8 μs 222.0 μs +0.1%
sum/compiled-from-Haskell/sum-left-Scott/1000 486.9 μs 487.2 μs +0.1%
sum/compiled-from-Haskell/sum-left-Scott/2500 1.617 ms 1.605 ms -0.7%
sum/compiled-from-Haskell/sum-left-Scott/5000 4.063 ms 3.992 ms -1.7%
sum/compiled-from-Haskell/sum-left-data/100 264.8 μs 248.9 μs -6.0%
sum/compiled-from-Haskell/sum-left-data/500 1.462 ms 1.375 ms -6.0%
sum/compiled-from-Haskell/sum-left-data/1000 3.293 ms 3.122 ms -5.2%
sum/compiled-from-Haskell/sum-left-data/2500 8.759 ms 8.288 ms -5.4%
sum/compiled-from-Haskell/sum-left-data/5000 18.54 ms 17.45 ms -5.9%
sum/hand-written-PLC/sum-right-builtin/100 50.88 μs 50.86 μs -0.0%
sum/hand-written-PLC/sum-right-builtin/500 256.8 μs 255.7 μs -0.4%
sum/hand-written-PLC/sum-right-builtin/1000 531.5 μs 527.3 μs -0.8%
sum/hand-written-PLC/sum-right-builtin/2500 1.546 ms 1.531 ms -1.0%
sum/hand-written-PLC/sum-right-builtin/5000 3.430 ms 3.403 ms -0.8%
sum/hand-written-PLC/sum-right-Scott/100 33.46 μs 33.23 μs -0.7%
sum/hand-written-PLC/sum-right-Scott/500 179.5 μs 180.0 μs +0.3%
sum/hand-written-PLC/sum-right-Scott/1000 399.4 μs 398.0 μs -0.4%
sum/hand-written-PLC/sum-right-Scott/2500 1.337 ms 1.336 ms -0.1%
sum/hand-written-PLC/sum-right-Scott/5000 4.030 ms 4.028 ms -0.0%
sum/hand-written-PLC/sum-left-builtin/100 54.32 μs 54.06 μs -0.5%
sum/hand-written-PLC/sum-left-builtin/500 270.7 μs 264.8 μs -2.2%
sum/hand-written-PLC/sum-left-builtin/1000 533.4 μs 530.3 μs -0.6%
sum/hand-written-PLC/sum-left-builtin/2500 1.330 ms 1.304 ms -2.0%
sum/hand-written-PLC/sum-left-builtin/5000 2.638 ms 2.594 ms -1.7%
sum/hand-written-PLC/sum-left-Scott/100 37.27 μs 37.06 μs -0.6%
sum/hand-written-PLC/sum-left-Scott/500 202.9 μs 202.9 μs 0.0%
sum/hand-written-PLC/sum-left-Scott/1000 449.7 μs 447.6 μs -0.5%
sum/hand-written-PLC/sum-left-Scott/2500 1.554 ms 1.547 ms -0.5%
sum/hand-written-PLC/sum-left-Scott/5000 4.363 ms 4.340 ms -0.5%
TOTAL 321.3 ms 315.3 ms -1.9%

Comparing benchmark results of 'lists' on '24dcdb4d21' (base) and '114b4bdd0d' (PR)

Results table
Script 24dcdb4 114b4bd Change
sort/ghcSort/50 172.2 μs 171.0 μs -0.7%
sort/ghcSort/100 401.2 μs 400.8 μs -0.1%
sort/ghcSort/150 693.3 μs 688.8 μs -0.6%
sort/ghcSort/200 931.8 μs 928.9 μs -0.3%
sort/ghcSort/250 1.208 ms 1.205 ms -0.2%
sort/ghcSort/300 1.594 ms 1.587 ms -0.4%
sort/insertionSort/50 585.9 μs 585.5 μs -0.1%
sort/insertionSort/100 2.346 ms 2.341 ms -0.2%
sort/insertionSort/150 5.281 ms 5.267 ms -0.3%
sort/insertionSort/200 9.402 ms 9.389 ms -0.1%
sort/insertionSort/250 14.70 ms 14.72 ms +0.1%
sort/insertionSort/300 21.31 ms 21.31 ms 0.0%
sort/mergeSort/50 528.7 μs 529.5 μs +0.2%
sort/mergeSort/100 1.214 ms 1.209 ms -0.4%
sort/mergeSort/150 1.950 ms 1.924 ms -1.3%
sort/mergeSort/200 2.743 ms 2.732 ms -0.4%
sort/mergeSort/250 3.599 ms 3.565 ms -0.9%
sort/mergeSort/300 4.397 ms 4.355 ms -1.0%
sort/quickSort/50 1.364 ms 1.352 ms -0.9%
sort/quickSort/100 5.656 ms 5.618 ms -0.7%
sort/quickSort/150 12.71 ms 12.62 ms -0.7%
sort/quickSort/200 22.58 ms 22.37 ms -0.9%
sort/quickSort/250 35.57 ms 35.29 ms -0.8%
sort/quickSort/300 51.01 ms 50.89 ms -0.2%
sum/compiled-from-Haskell/sum-right-builtin/100 77.73 μs 78.05 μs +0.4%
sum/compiled-from-Haskell/sum-right-builtin/500 407.5 μs 404.7 μs -0.7%
sum/compiled-from-Haskell/sum-right-builtin/1000 866.4 μs 862.2 μs -0.5%
sum/compiled-from-Haskell/sum-right-builtin/2500 2.666 ms 2.653 ms -0.5%
sum/compiled-from-Haskell/sum-right-builtin/5000 5.810 ms 5.759 ms -0.9%
sum/compiled-from-Haskell/sum-right-Scott/100 43.11 μs 43.15 μs +0.1%
sum/compiled-from-Haskell/sum-right-Scott/500 229.2 μs 227.8 μs -0.6%
sum/compiled-from-Haskell/sum-right-Scott/1000 485.1 μs 481.0 μs -0.8%
sum/compiled-from-Haskell/sum-right-Scott/2500 1.713 ms 1.691 ms -1.3%
sum/compiled-from-Haskell/sum-right-Scott/5000 4.122 ms 4.087 ms -0.8%
sum/compiled-from-Haskell/sum-right-data/100 250.4 μs 246.7 μs -1.5%
sum/compiled-from-Haskell/sum-right-data/500 1.385 ms 1.361 ms -1.7%
sum/compiled-from-Haskell/sum-right-data/1000 3.120 ms 3.077 ms -1.4%
sum/compiled-from-Haskell/sum-right-data/2500 8.337 ms 8.292 ms -0.5%
sum/compiled-from-Haskell/sum-right-data/5000 17.79 ms 17.18 ms -3.4%
sum/compiled-from-Haskell/sum-left-builtin/100 74.87 μs 75.13 μs +0.3%
sum/compiled-from-Haskell/sum-left-builtin/500 392.5 μs 389.8 μs -0.7%
sum/compiled-from-Haskell/sum-left-builtin/1000 835.5 μs 837.1 μs +0.2%
sum/compiled-from-Haskell/sum-left-builtin/2500 2.565 ms 2.573 ms +0.3%
sum/compiled-from-Haskell/sum-left-builtin/5000 5.676 ms 5.686 ms +0.2%
sum/compiled-from-Haskell/sum-left-Scott/100 41.69 μs 41.90 μs +0.5%
sum/compiled-from-Haskell/sum-left-Scott/500 221.9 μs 221.5 μs -0.2%
sum/compiled-from-Haskell/sum-left-Scott/1000 488.2 μs 487.0 μs -0.2%
sum/compiled-from-Haskell/sum-left-Scott/2500 1.616 ms 1.603 ms -0.8%
sum/compiled-from-Haskell/sum-left-Scott/5000 4.041 ms 4.053 ms +0.3%
sum/compiled-from-Haskell/sum-left-data/100 258.1 μs 249.2 μs -3.4%
sum/compiled-from-Haskell/sum-left-data/500 1.422 ms 1.379 ms -3.0%
sum/compiled-from-Haskell/sum-left-data/1000 3.198 ms 3.083 ms -3.6%
sum/compiled-from-Haskell/sum-left-data/2500 8.530 ms 8.265 ms -3.1%
sum/compiled-from-Haskell/sum-left-data/5000 18.11 ms 17.44 ms -3.7%
sum/hand-written-PLC/sum-right-builtin/100 51.13 μs 50.90 μs -0.4%
sum/hand-written-PLC/sum-right-builtin/500 256.1 μs 256.8 μs +0.3%
sum/hand-written-PLC/sum-right-builtin/1000 532.8 μs 528.5 μs -0.8%
sum/hand-written-PLC/sum-right-builtin/2500 1.546 ms 1.532 ms -0.9%
sum/hand-written-PLC/sum-right-builtin/5000 3.441 ms 3.405 ms -1.0%
sum/hand-written-PLC/sum-right-Scott/100 33.56 μs 33.50 μs -0.2%
sum/hand-written-PLC/sum-right-Scott/500 179.5 μs 179.9 μs +0.2%
sum/hand-written-PLC/sum-right-Scott/1000 398.7 μs 397.8 μs -0.2%
sum/hand-written-PLC/sum-right-Scott/2500 1.333 ms 1.336 ms +0.2%
sum/hand-written-PLC/sum-right-Scott/5000 4.025 ms 4.034 ms +0.2%
sum/hand-written-PLC/sum-left-builtin/100 54.70 μs 54.41 μs -0.5%
sum/hand-written-PLC/sum-left-builtin/500 268.9 μs 265.9 μs -1.1%
sum/hand-written-PLC/sum-left-builtin/1000 532.6 μs 526.8 μs -1.1%
sum/hand-written-PLC/sum-left-builtin/2500 1.330 ms 1.308 ms -1.7%
sum/hand-written-PLC/sum-left-builtin/5000 2.656 ms 2.602 ms -2.0%
sum/hand-written-PLC/sum-left-Scott/100 37.19 μs 37.09 μs -0.3%
sum/hand-written-PLC/sum-left-Scott/500 203.2 μs 203.6 μs +0.2%
sum/hand-written-PLC/sum-left-Scott/1000 450.7 μs 448.8 μs -0.4%
sum/hand-written-PLC/sum-left-Scott/2500 1.553 ms 1.550 ms -0.2%
sum/hand-written-PLC/sum-left-Scott/5000 4.349 ms 4.343 ms -0.1%
TOTAL 320.0 ms 317.0 ms -0.9%

@effectfully force-pushed the effectfully/optimization/force-ifThenElse-delay branch from 114b4bd to c11221f on April 15, 2025 21:22
@effectfully
Contributor Author

Well, definitely an improvement, but the benchmarks are less reliable than I thought they were. Here are the numbers from two different runs:

[image: benchmark numbers, first run]

[image: benchmark numbers, second run]

Force _ (Delay _ t) -> t
-- Remove @Delay@s from @ifThenElse@ branches if the latter is @Force@d and the delayed terms are
-- pure and work-free anyway.
Contributor Author

Should try the same for chooseList/chooseData/caseList/caseData, but let's start with the obvious thing and do it only for ifThenElse for now.

Comment on lines +1179 to +1191
(force ifThenElse
  (equalsInteger 0 index)
  (delay
    (constr 0 [(unBData (force headList args))]))
  (delay
    (force
      (force ifThenElse
        (equalsInteger 1 index)
        (delay
          (constr 1
            [ (unBData
                (force headList args)) ]))
        (delay (traceError "PT1")))))))
Contributor Author

Man, casing on Data is gonna be a big deal.

(force ifThenElse
  (lessThanInteger 0 x)
  (delay (constr 1 []))
  (delay (constr 0 []))))
Contributor Author

Wait, how is this not optimized?

Contributor Author

Addressed by 68a00d1.

Comment on lines 0 to 1
-1890
+1782
@effectfully (Contributor Author) Apr 15, 2025

-5.7% size.

Comment on lines 0 to 1
-3371
+3149
@effectfully (Contributor Author) Apr 15, 2025

-6.6% size.

Comment on lines 0 to 2
-({cpu: 6062960
-| mem: 30520})
+({cpu: 5742960
+| mem: 28520})
@effectfully (Contributor Author) Apr 15, 2025

-5.3% CPU and -6.6% MEM.
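
For anyone checking these percentages against the golden-file numbers, the arithmetic is plain relative change. The helper names below (`percentChange`, `roundTo1`) are hypothetical and only illustrate the calculation; they are not part of the Plutus test suite.

```haskell
-- Relative change from an old value to a new one, in percent.
percentChange :: Double -> Double -> Double
percentChange old new = (new - old) / old * 100

-- Round to one decimal place, the precision used in the review comments.
roundTo1 :: Double -> Double
roundTo1 x = fromIntegral (round (x * 10) :: Integer) / 10

-- Budget numbers copied from the diff above.
cpuChange, memChange :: Double
cpuChange = roundTo1 (percentChange 6062960 5742960)
memChange = roundTo1 (percentChange 30520 28520)
```

With the values from the diff above, `cpuChange` and `memChange` come out at -5.3 and -6.6, matching the comment.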

Comment on lines 0 to 2
-({cpu: 941965986
-| mem: 4508802})
+({cpu: 909933986
+| mem: 4308602})
@effectfully (Contributor Author) Apr 15, 2025

-3.4% CPU and -4.4% MEM.

@zliu41 (Member) left a comment

Nice, I'm surprised no one thought of doing this earlier.

Is there any part of Note [Cancelling interleaved Force-Delay pairs] that needs to be updated?

@effectfully force-pushed the effectfully/optimization/force-ifThenElse-delay branch from 68a00d1 to 9e53d84 on April 16, 2025 15:20
@effectfully enabled auto-merge (squash) on April 16, 2025 16:56
@effectfully
Contributor Author

> Is there any part of Note [Cancelling interleaved Force-Delay pairs] that needs to be updated?

Done.

@effectfully merged commit a213376 into master on Apr 16, 2025
7 checks passed
@effectfully deleted the effectfully/optimization/force-ifThenElse-delay branch on April 16, 2025 18:45
@ana-pantilie (Contributor) left a comment

Sorry for reviewing this late! I think it's fine if the ifThenElse is fully applied. @ramsay-t, we'll have to modify the certifier as well; I created an issue for you: https://github.com/IntersectMBO/plutus-private/issues/1545.
