entries/ghatem-fpc/README.md (8 additions & 0 deletions)
@@ -219,6 +219,9 @@ Some numbers for perspective:
219
219
According to `Valgrind`, one of the functions accounted for 23% of the total program execution time. **9%** of the total time was spent in `fpc_stackcheck`, which the compiler calls implicitly on function entry when stack checking is enabled, to verify that enough stack space is available. Inlining the function seems to remove this cost, since the call is replaced by the function body.
220
220
Alternatively, if a method is purely written in `asm` (assembly), the directive `nostackframe` will indicate not to generate a stack frame, but I did not venture to this level.
221
221
222
+
There is a catch, however: in FreePascal at least, if method A calls method B, the two cannot both be marked as `inline`.
223
+
For this reason, some code had to be duplicated into both methods A and B, so as to benefit from optimal inlining.
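As a rough illustration of the duplication described above, here is a minimal sketch. The names `NextHash` and `HashPair` and the hash formula are hypothetical and not taken from the actual entry. Instead of having `HashPair` call `NextHash`, the body of `NextHash` is copied into it, so each routine can carry its own `inline` directive.

```pascal
program InlineSketch;
{$mode objfpc}{$H+}

// Hypothetical helper "B": small enough to be worth inlining.
function NextHash(aHash: Cardinal; aByte: Byte): Cardinal; inline;
begin
  Result := (aHash * 31) + aByte;
end;

// Hypothetical helper "A": it would naturally call NextHash twice,
// but the body is duplicated here so that A contains no call to B.
function HashPair(aFirst, aSecond: Byte): Cardinal; inline;
var
  h: Cardinal;
begin
  h := (Cardinal(17) * 31) + aFirst;  // duplicated body of NextHash
  Result := (h * 31) + aSecond;       // duplicated body of NextHash
end;

begin
  // Both variants must agree; the duplication only changes code layout.
  WriteLn(HashPair(1, 2) = NextHash(NextHash(17, 1), 2));
end.
```

The price is that any later change to the hash formula has to be made in both places.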
224
+
222
225
## unrolling loops
223
226
224
227
Loop unrolling is another space-time tradeoff, which in this case we applied manually, wherever we knew ahead of time that a loop would execute exactly `N` times.
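To make the idea concrete, here is a minimal sketch, assuming a fixed count of `N = 4`. The routines `SumLoop` and `SumUnrolled` are hypothetical and only illustrate the transformation. The unrolled version trades a little code size for the removal of the loop counter and its branch.

```pascal
program UnrollSketch;
{$mode objfpc}{$H+}

const
  N = 4; // assumed iteration count, known at compile time

type
  TValues = array[0..N - 1] of Integer;

// Straightforward loop: one counter update and one branch per element.
function SumLoop(const aValues: TValues): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to N - 1 do
    Result := Result + aValues[i];
end;

// Manually unrolled: valid only because N is fixed at 4.
function SumUnrolled(const aValues: TValues): Integer;
begin
  Result := aValues[0] + aValues[1] + aValues[2] + aValues[3];
end;

const
  cValues: TValues = (10, 20, 30, 40);

begin
  // Both routines compute the same sum.
  WriteLn(SumLoop(cValues), ' ', SumUnrolled(cValues));
end.
```

If `N` ever changes, the unrolled routine must be rewritten by hand, which is part of the maintainability cost of this technique.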
@@ -351,4 +354,9 @@ Definitely more time than I would like to admit :)
351
354
352
355
Nonetheless, much was learned, and I leave the challenge with many open questions as to why certain optimization attempts did not yield any improvement.
353
356
357
+
In addition, one must decide, for each such optimization, whether it is appropriate for a given project, product, or team. In the OneBRC challenge there are no long-term implications, so we can optimize without restraint or compromise.
358
+
However, the same cannot be said for mature, long-lived production applications where maintainability is key: there, one must weigh each optimization against the readability it costs. At a glance, branchless code is definitely a no-go. Duplicating code for optimal inlining is probably the second.
359
+
360
+
Of course, there is no universal truth to such decisions, and the choice must be made keeping in mind the project at hand and the team responsible for its development.
361
+
354
362
Finally, I would like to thank @gcarreno for organizing this event, as well as @paweld and @abouchez for providing additional testing results and hints.