Skip to content

[Storage] Add Closer to Batch interface and use it to prevent memory leak (BadgerDB & Pebble) #7257

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 5 commits into from
Apr 21, 2025

Conversation

fxamacker
Copy link
Member

@fxamacker fxamacker commented Apr 9, 2025

Closes #7258
Updates #6527

Currently, the Batch interface doesn't provide a Close function to release memory of the batch. Not calling the database function to close or cancel a batch can cause "unbounded memory consumption" according to BadgerDB v2 docs. And for Pebble, we should use Batch.Close() for both non-error and error paths.

This PR adds a Closer to Batch interface so it can be used to release memory of the batch. The Close() function can be called as a defer statement immediately after creating Batch to reduce risk of unbounded memory consumption.

This PR also updates existing code to call Close() after a new Batch is created.

I found this while getting familiar with storage code related to BadgerDB and Pebble.

Details

BadgerDB v2 docs for WriteBatch.Cancel() says:

Cancel function must be called if there's a chance that Flush might not get called. If neither Flush or Cancel is called, the transaction oracle would never get a chance to clear out the row commit timestamp map, thus causing an unbounded memory consumption. Typically, you can call Cancel as a defer statement right after NewWriteBatch is called.

For example, the error path in WithReaderBatchWriter() currently has a memory leak when using BadgerDB.

Pebble docs says for Batch.Close():

Close closes the batch without committing it.

Pebble needs their Batch.Close() function to be called for both non-error and error paths.

Currently, the error path in WithReaderBatchWriter()
has a memory leak when using BadgerDB.

This commit calls WriteBatch.Cancel() as a defer statement
right after BadgerDB WriteBatch is created.

BadgerDB v2 docs for WriteBatch.Cancel() says:

"Cancel function must be called if there's a chance that Flush
might not get called. If neither Flush or Cancel is called, the
transaction oracle would never get a chance to clear out the row
commit timestamp map, thus causing an unbounded memory consumption.
Typically, you can call Cancel as a defer statement right after
NewWriteBatch is called."
@fxamacker fxamacker requested review from zhangchiqing and a team April 9, 2025 23:52
@fxamacker fxamacker self-assigned this Apr 9, 2025
@codecov-commenter
Copy link

codecov-commenter commented Apr 9, 2025

Codecov Report

Attention: Patch coverage is 68.18182% with 14 lines in your changes missing coverage. Please review.

Project coverage is 41.32%. Comparing base (9fa5d10) to head (7ad6c30).

Files with missing lines Patch % Lines
storage/mock/batch.go 0.00% 12 Missing ⚠️
...ck-executed-height/cmd/rollback_executed_height.go 0.00% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #7257   +/-   ##
=======================================
  Coverage   41.31%   41.32%           
=======================================
  Files        2174     2174           
  Lines      190514   190558   +44     
=======================================
+ Hits        78713    78745   +32     
- Misses     105223   105236   +13     
+ Partials     6578     6577    -1     
Flag Coverage Δ
unittests 41.32% <68.18%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

This commit adds Closer to Batch interface to release memory
of th batch.  This can be called as a defer statement
immediately after creating Batch to reduce risk of unbounded
memory consumption.

This commit also updates existing code to call Close after
a new Batch is created.
@fxamacker fxamacker changed the title [Storage] Fix memory leak in WithReaderBatchWriter() when using BadgerDB [Storage] Add Close() to Batch interface and use it to prevent memory leak (BadgerDB & Pebble) Apr 10, 2025
@fxamacker fxamacker changed the title [Storage] Add Close() to Batch interface and use it to prevent memory leak (BadgerDB & Pebble) [Storage] Add Closer to Batch interface and use it to prevent memory leak (BadgerDB & Pebble) Apr 10, 2025
Copy link
Member

@zhangchiqing zhangchiqing left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the fix Faye.

@fxamacker fxamacker requested review from a team April 10, 2025 18:00
Copy link
Contributor

@janezpodhostnik janezpodhostnik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great change!

@fxamacker fxamacker added this pull request to the merge queue Apr 21, 2025
Merged via the queue into master with commit e0f788a Apr 21, 2025
56 checks passed
@fxamacker fxamacker deleted the fxamacker/release-badgerdb-writebatch branch April 21, 2025 13:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a Close() to Batch interface and use it to prevent memory leak (BadgerDB & Pebble)
4 participants