Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Don't use io.ReadAll while recycling indices #878

Merged
merged 7 commits into from
Sep 20, 2022

Conversation

mtojek
Copy link
Contributor

@mtojek mtojek commented Sep 20, 2022

This PR replaces io.ReadAll with json.Decode as the routine doesn't need the index content. I'm not convinced about this structure as it uses json.Unmarshaller, but I don't have an idea how to improve it.

Right now:

Zrzut ekranu 2022-09-20 o 12 19 19

If you can observe some other places to refactor, feel free to comment.

Testing (if you want to play):

go run . -feature-storage-indexer -storage-indexer-bucket-internal gs://mtojek-temp/package-storage-v2 -log-level debug -httpprof ":6060"

mtojek-temp is in the elastic-observability namespace.

@mtojek mtojek self-assigned this Sep 20, 2022
@mtojek mtojek requested a review from a team September 20, 2022 10:21
@elasticmachine
Copy link

elasticmachine commented Sep 20, 2022

💚 Build Succeeded

the below badges are clickable and redirect to their specific view in the CI or DOCS
Pipeline View Test View Changes Artifacts preview preview

Expand to view the summary

Build stats

  • Start Time: 2022-09-20T11:09:59.017+0000

  • Duration: 5 min 34 sec

Test stats 🧪

Test Results
Failed 0
Passed 213
Skipped 0
Total 213

🤖 GitHub comments

Expand to view the GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

@jsoriano
Copy link
Member

I'm not convinced about this structure as it uses json.Unmarshaller, but I don't have an idea how to improve it.

These calls to json.Unmarshaller will be done for each package JSON one after the other, this should be fine memory-wise. Have you found performance problems there?

If you can observe some other places to refactor, feel free to comment.

Other place that may be copying too many objects is transformSearchIndexAllToPackages, that seems to be copying all package manifests.
Maybe this transformation could be done directly by loadSearchIndexAll. Or searchIndexAll struct could contain pointers to manifests that can be reused in the transform step.

Copy link
Member

@jsoriano jsoriano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 only by not keeping the whole index response in memory there should be improvements in memory usage.

@mtojek
Copy link
Contributor Author

mtojek commented Sep 20, 2022

These calls to json.Unmarshaller will be done for each package JSON one after the other, this should be fine memory-wise. Have you found performance problems there?

It depends on how json manages the decode cache inside, but I'm not master of that area, so can't answer if we can spot fields to improve there.

Other place that may be copying too many objects is transformSearchIndexAllToPackages, that seems to be copying all package manifests.

Agree, but I don't see this part "hot" in the calltree graph. Only 500 KB is used there.

I don't know if there is any json marshaller that deallocates used memory. I tried to play with jsoniter, but it gave similar results (as the logic doesn't jump over fields).

Still 👍 to merge it, @jsoriano? :)

@jsoriano
Copy link
Member

Still +1 to merge it, @jsoriano? :)

Change makes sense to me, so ok to merge :)

Though not sure if it will solve all the problems. But we can iterate. Do you have a profile with the new code?

@mtojek
Copy link
Contributor Author

mtojek commented Sep 20, 2022

Do you have a profile with the new code?

Do you mean http://localhost:6060/debug/pprof/heap? Yes, I dumped it, but it's easy to generate it just in case.

@mtojek
Copy link
Contributor Author

mtojek commented Sep 20, 2022

Just in case, before introducing this change:

Zrzut ekranu 2022-09-20 o 13 54 27

@mtojek mtojek merged commit b563585 into elastic:main Sep 20, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants