Commit 33103d5

Merge pull request #175 from georges-hatem/main
readme last thoughts
2 parents 4a03159 + 70e779e commit 33103d5

1 file changed, +8 -0 lines changed

entries/ghatem-fpc/README.md

Lines changed: 8 additions & 0 deletions
@@ -219,6 +219,9 @@ Some numbers for perspective:
219219
According to `Valgrind`, one of the functions accounted for 23% of the total program execution time. **9%** of the total time was spent executing `fpc_stackcheck`, which runs implicitly before the function call and verifies the correctness of the stack frame. Inlining the function seems to remove this cost, since the call is replaced by the function's code.
220220
Alternatively, if a method is written purely in `asm` (assembly), the `nostackframe` directive tells the compiler not to generate a stack frame, but I did not venture to that level.
221221

222+
There is a catch, however: in FreePascal at least, if method A calls method B, the two cannot both be marked as `inline`.
223+
For this reason, some code had to be duplicated in both methods A and B in order to benefit from optimal inlining.
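
To make the duplication concrete, here is a minimal sketch of the idea. The names `HashStep`, `HashPairViaCall` and `HashPairDuplicated` are hypothetical and are not taken from the actual entry; they only illustrate a method A whose call to method B is replaced by a copy of B's body, so that A can carry the `inline` directive itself.

```pascal
program InlineDuplicationSketch;

{$mode objfpc}{$H+}
{$inline on}

// Hypothetical "method B": folds one byte into a running hash.
function HashStep(aHash: Cardinal; aByte: Byte): Cardinal; inline;
begin
  Result := aHash * 31 + aByte;
end;

// Hypothetical "method A" as first written: it calls B,
// so it is left without the inline directive.
function HashPairViaCall(aHash: Cardinal; aFirst, aSecond: Byte): Cardinal;
begin
  Result := HashStep(HashStep(aHash, aFirst), aSecond);
end;

// "Method A" after duplication: the body of B is repeated by hand,
// A no longer calls B, and A is marked inline on its own.
function HashPairDuplicated(aHash: Cardinal; aFirst, aSecond: Byte): Cardinal; inline;
begin
  Result := (aHash * 31 + aFirst) * 31 + aSecond;
end;

begin
  // Both variants should compute the same value.
  WriteLn(HashPairViaCall(7, 65, 66) = HashPairDuplicated(7, 65, 66));
end.
```

The price is that the hashing formula now lives in two places, and any later change has to be applied to both.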
224+
222225
## unrolling loops
223226

224227
Loop unrolling is another space-time tradeoff, which in this case we did manually, whenever we knew ahead of time that a certain loop would execute `N` times.
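
As an illustration only, the sketch below unrolls a loop with a fixed trip count by hand. The constant `N = 8`, the type `TBlock` and the two functions are assumptions made for this example and do not come from the entry's code.

```pascal
program UnrollSketch;

{$mode objfpc}{$H+}

const
  N = 8; // hypothetical trip count, known at compile time

type
  TBlock = array[0..N - 1] of Integer;

// Straightforward version: one addition per iteration,
// plus the loop counter update and the loop-end branch.
function SumLoop(const aData: TBlock): Integer;
var
  i: Integer;
begin
  Result := 0;
  for i := 0 to N - 1 do
    Result := Result + aData[i];
end;

// Manually unrolled version: the N additions are written out,
// so no counter or loop-end branch is involved.
function SumUnrolled(const aData: TBlock): Integer;
begin
  Result := aData[0] + aData[1] + aData[2] + aData[3]
          + aData[4] + aData[5] + aData[6] + aData[7];
end;

var
  Block: TBlock = (1, 2, 3, 4, 5, 6, 7, 8);

begin
  // Both variants should compute the same sum.
  WriteLn(SumLoop(Block) = SumUnrolled(Block));
end.
```

The unrolled body takes more code space and has to be rewritten if `N` changes, which is the space side of the tradeoff.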
@@ -351,4 +354,9 @@ Definitely more time than I would like to admit :)
351354

352355
Nonetheless, much was learned, and I leave the challenge with many open questions as to why certain optimization attempts did not yield any improvement.
353356

357+
In addition, one must decide, for each such optimization, whether it is appropriate for a given project, product, or team. In the OneBRC challenge, there are no long-term implications, and we can optimize without restraint or compromise.
358+
However, the same cannot be said for mature, long-lived, production-ready applications where maintainability is key: there, one must look at each optimization and decide whether it is worth the readability compromise. At a glance, branchless code is definitely a no-go. Duplication of code for optimal inlining is probably a second.
359+
360+
Of course, there is no universal truth to such decisions, and the choice must be made keeping in mind the project at stake, and the team responsible for its development.
361+
354362
Finally, I would like to thank @gcarreno for organizing this event, as well as @paweld and @abouchez for providing additional testing results and hints.
