Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New changelog #4384

Open
wants to merge 2 commits into
base: main
Choose a base branch
from
Open

New changelog #4384

wants to merge 2 commits into from

Conversation

geyslan
Copy link
Member

@geyslan geyslan commented Nov 11, 2024

1. Explain what the PR does

8de2e12 chore(changelog): reduce mem footprint
29fb65b chore: benchmark changelog.Set()

8de2e12 chore(changelog): reduce mem footprint

The current Changelog structure consumes a significant amount of memory
due to the allocation of metadata for each field. As the number of
fields increases, the memory usage grows linearly. Approximately 240
bytes per field were observed just for metadata, excluding the actual
data and pointers for each field.

To reduce memory consumption, the new changelog.Entries implementation
uses a flat slice of any type instead of storing metadata for each field
separately.

---

| Caches | GOGC | Branch | *Heap Use | *Heap   | Diff of | Proctree |
|        |      |        | (Avg)     | Growth  | main    |          |
|--------|------|--------|--------- -|------- -|---------|----------|
| -      | -    | main   | 28        | -       | -       | off      |
| 16384  | -    | main   | 199       | 181     | -       | on       |
| 32768  | -    | main   | 331       | 313     | -       | on       |
| -      | 5    | main   | 18        | -       | -       | off      |
| 16384  | 5    | main   | 125       | 107     | -       | on       |
| 32768  | 5    | main   | 209       | 191     | -       | on       |
|---------------------------------- --------- ----------------------|
| -      | -    | new    | 28        | -       | -       | off      |
| 16384  | -    | new    | 119       | 91      | -xxxxx% | on       |
| 32768  | -    | new    | 172       | 144     | -xxxxx% | on       |
| -      | 5    | new    | 18        | -       | -       | off      |
| 16384  | 5    | new    | 76        | 58      | -xxxxx% | on       |
| 32768  | 5    | new    | 111       | 93      | -xxxxx% | on       |

* in MB

With GOGC set to 5, the new implementation reduces average heap usage by
approximately 54% when using cache sizes of 16,384. For cache sizes of
32,768, the reduction is around 48%.

When GOGC is set to the default value, the reductions are roughly 50%
and 46% for cache sizes of 16,384 and 32,768, respectively.

The "Heap in Use" column serves as a good indicator of memory
consumption and can assist in determining optimal cache sizes.

---

Proctree Stressor Benchmark CPU time (5 threads, 700000 iterations each)

| Method | New      | New (CPU %) | Old       | Old (CPU %) | Overhead |
|--------|----------|-------------|-----------|-------------|----------|
| Set    | 0.41 min | 0.74%       | < 0.01 hs | 0.87%       | -15.0%   |
| Get    | 0.04 min | 0.07%       | < 0.01 hs | 0.87%       | -92.0%   |

2. Explain how to test it

3. Other comments

The current Changelog structure consumes a significant amount of memory
due to the allocation of metadata for each field. As the number of
fields increases, the memory usage grows linearly. Approximately 240
bytes per field were observed just for metadata, excluding the actual
data and pointers for each field.

To reduce memory consumption, the new changelog.Entries implementation
uses a flat slice of any type instead of storing metadata for each field
separately.

---

| Caches | GOGC | Branch | *Heap Use | *Heap   | Diff of | Proctree |
|        |      |        | (Avg)     | Growth  | main    |          |
|--------|------|--------|--------- -|------- -|---------|----------|
| -      | -    | main   | 28        | -       | -       | off      |
| 16384  | -    | main   | 199       | 181     | -       | on       |
| 32768  | -    | main   | 331       | 313     | -       | on       |
| -      | 5    | main   | 18        | -       | -       | off      |
| 16384  | 5    | main   | 125       | 107     | -       | on       |
| 32768  | 5    | main   | 209       | 191     | -       | on       |
|---------------------------------- --------- ----------------------|
| -      | -    | new    | 28        | -       | -       | off      |
| 16384  | -    | new    | 119       | 91      | -50.28% | on       |
| 32768  | -    | new    | 172       | 144     | -46.01% | on       |
| -      | 5    | new    | 18        | -       | -       | off      |
| 16384  | 5    | new    | 76        | 58      | -54.21% | on       |
| 32768  | 5    | new    | 111       | 93      | -48.69% | on       |

* in MB

With GOGC set to 5, the new implementation reduces average heap usage by
approximately 54% when using cache sizes of 16,384. For cache sizes of
32,768, the reduction is around 48%.

When GOGC is set to the default value, the reductions are roughly 50%
and 46% for cache sizes of 16,384 and 32,768, respectively.

The "Heap in Use" column serves as a good indicator of memory
consumption and can assist in determining optimal cache sizes.

---

Proctree Stressor Benchmark CPU time (5 threads, 700000 iterations each)

| Method | New      | New (CPU %) | Old       | Old (CPU %) | Overhead |
|--------|----------|-------------|-----------|-------------|----------|
| Set    | 0.41 min | 0.74%       | < 0.01 hs | 0.87%       | -15.0%   |
| Get    | 0.04 min | 0.07%       | < 0.01 hs | 0.87%       | -92.0%   |
Comment on lines +32 to +33
interpreter: NewFileInfo(),
interp: NewFileInfo(),
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also turns interpreters fields on, previously disabled by: #4058

So, the heap reduction already considers the interpreter fields enabled again.

@geyslan
Copy link
Member Author

geyslan commented Nov 12, 2024

@trvll FYI.

@NDStrahilevitz
Copy link
Collaborator

I will give this a full review tomorrow. I would like to ask beforehand why we gave up on generic comparable types entirely? Was there a significant memory footprint deriving from that usage?

@geyslan
Copy link
Member Author

geyslan commented Nov 13, 2024

I would like to ask beforehand why we gave up on generic comparable types entirely?

Currently I can't see any specific use case for generic implementation.

Was there a significant memory footprint deriving from that usage?

Before tackling this PR I've tested replacing all generic changelog fields with primitive ones (no-changelog branch). You can notice a huge decrement of heap in use after a stress that don't force changelog store more than one change (in other words, that stores an unique changelog value per field).

image

@geyslan
Copy link
Member Author

geyslan commented Nov 13, 2024

@NDStrahilevitz

Currently I can't see any specific use case for generic implementation.

I've got a considerable reduction by using generics (getting rid of the interface pointers), so hold on. Soon it will be pushed.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants