Using Profile-Guided Optimization (PGO) to reduce the encode overhead even more

Hi!

Similarly to https://github.com/vincent-herlemont/native_db/discussions/92 I decided to perform PGO benchmarks on `native_model`. Here are the results.

## Test environment

* Fedora 39
* Linux kernel 6.6.9
* AMD Ryzen 9 5900X
* 48 GiB RAM
* SSD Samsung 980 Pro 2 TiB
* Compiler: rustc 1.75
* native_model version: the latest `main` branch at the time of writing, commit `62b1e7cc35e64bce9feb22d6727a4f66fc9b9660`
* Disabled Turbo boost (for more stable results across benchmark runs)

## Benchmark

For benchmarking, I use the built-in benchmarks via the `cargo bench` command. For PGO, I use the [cargo-pgo](https://github.com/Kobzol/cargo-pgo) tool. The same benchmark suite served as the PGO training workload: the instrumented build was run with `cargo pgo bench`, and the PGO-optimized results were obtained with `cargo pgo optimize bench`.

All measurements were repeated multiple times to check reproducibility; the results are stable across runs.
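
For anyone who wants to reproduce this, the three steps can be sketched as a small script. It assumes `cargo-pgo` is already installed (`cargo install cargo-pgo`) together with the `llvm-tools-preview` rustup component that it needs. The `run` and `pgo_workflow` helpers and the `DRY_RUN` variable are hypothetical names used only for this sketch, not part of cargo-pgo; by default the script only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch of the benchmark workflow described above.
# Assumption: run from the root of a native_model checkout.

# Hypothetical helper: prints the command by default,
# executes it only when DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

pgo_workflow() {
    # 1. Baseline: regular Release benchmarks.
    run cargo bench
    # 2. Training: instrumented build; running the benchmarks collects profiles.
    run cargo pgo bench
    # 3. Measurement: rebuild with the collected profiles and benchmark again.
    run cargo pgo optimize bench
}

pgo_workflow
```

Running it with `DRY_RUN=0` executes the commands instead of printing them.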

## Results

I got the following results:

* Release: https://gist.github.com/zamazan4ik/1696f6135bcb6d69250cee5ec5079619
* PGO optimized compared to Release: https://gist.github.com/zamazan4ik/5c876ec7f3f326735cfcfb13af25828d
* (just for reference) PGO instrumentation compared to Release: https://gist.github.com/zamazan4ik/be67cec26d896461ad52b6f99c4e91e9

At least according to the results above, PGO improves the overall performance of native_model in these benchmarks. The PGO-optimized build may also hint at where native_model itself could be optimized more aggressively, for example by comparing the assembly of the PGO and non-PGO builds.
 
## Further steps

I can suggest the following action points:

* Perform more PGO benchmarks on native_model. If they show improvements, add a note to the documentation about the possible performance gains from PGO, so that native_model users are aware of the effect and can decide whether to enable PGO for their native_model-based applications.

Please treat this issue as a benchmark report rather than a bug report. I opened it as an issue only because Discussions are not enabled in this repo.
