Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Run a separate in memory snapshot to reduce number of entries stored in raft memory storage #18825

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

serathius
Copy link
Member

@serathius serathius commented Nov 2, 2024

Part of #17098

Alternative to #18635

Goal: Reduce number of raft entries stored in memory

Context:

  • Etcd maintains 2 different independently updates storages. V2 - stored as snapshots every X entries (100`000 by default), V3 - memory mapped bbolt storage, flushed every 5 seconds to disk
  • Etcd maintains guarantee for the data stored on disk. V3 storage needs to be always fresher than V2 snapshot. Meaning V3 consistent index is >= index of last snapshot.
  • This guarantee requires that before writing any V2 snapshots etcd needs to flush V3 storage which is costly.
  • In etcd v3.6 no longer uses v2 storage as source of truth, however it maintains periodic snapshot generation for backward compatibility.
  • Raft in memory store needs to store last snapshot and all entries since. However, this snapshot doesn't need to be persisted to disk.

Proposal:

  • Separate disk and memory snapshotting.
  • Disk snapshotting will be used for backward compatibility. Done every --snapshot-count.
  • Memory snapshots would no longer be bound by disk overhead.

Benchmark results ./bin/tools/benchmark put --total=15000 --val-size=100000

Scenario Snapshot count Memory snapshot count CPU[%] Memory[MB] PUTs per second
Base 10000 n/a 49.4 1954.5 1029.2
Snapshot in memory 10000 1 51.1 1330.4 1006.1
Snapshot in memory 10000 10 48.9 1264.2 1046.3
Snapshot in memory 10000 100 48.1 1210.4 1051.8
Snapshot in memory 10000 1000 48.2 1360.0 1047.2
Snapshot in memory 10000 10000 49.4 2040.3 1027.4

@k8s-ci-robot
Copy link

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@k8s-ci-robot
Copy link

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: serathius

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@codecov-commenter
Copy link

codecov-commenter commented Nov 2, 2024

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 91.89189% with 3 lines in your changes missing coverage. Please review.

Project coverage is 68.75%. Comparing base (3de0018) to head (fc7dfa3).
Report is 2 commits behind head on main.

Current head fc7dfa3 differs from pull request most recent head df6ba3f

Please upload reports for the commit df6ba3f to get more accurate results.

Files with missing lines Patch % Lines
server/etcdserver/server.go 91.89% 3 Missing ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
Files with missing lines Coverage Δ
server/etcdserver/api/membership/cluster.go 88.41% <ø> (-0.13%) ⬇️
server/etcdserver/server.go 80.95% <91.89%> (-0.22%) ⬇️

... and 24 files with indirect coverage changes

@@            Coverage Diff             @@
##             main   #18825      +/-   ##
==========================================
- Coverage   68.76%   68.75%   -0.01%     
==========================================
  Files         420      420              
  Lines       35523    35528       +5     
==========================================
  Hits        24426    24426              
- Misses       9665     9670       +5     
  Partials     1432     1432              

Continue to review full report in Codecov by Sentry.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3de0018...df6ba3f. Read the comment docs.

…in raft memory storage

Signed-off-by: Marek Siarkowicz <siarkowicz@google.com>
Comment on lines +2188 to +2205
func (s *EtcdServer) snapshotToMemory(snapi uint64, confState raftpb.ConfState) {
d := GetMembershipInfoInV2Format(s.Logger(), s.cluster)

lg := s.Logger()

// For backward compatibility, generate v2 snapshot from v3 state.
snap, err := s.r.raftStorage.CreateSnapshot(snapi, &confState, d)
if err != nil {
// the snapshot was done asynchronously with the progress of raft.
// raft might have already got a newer snapshot.
if errorspkg.Is(err, raft.ErrSnapOutOfDate) {
return
}
lg.Panic("failed to create snapshot", zap.Error(err))
}

verifyConsistentIndexIsLatest(lg, snap, s.consistIndex.ConsistentIndex())
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You do not persist the snapshot to disk. Is it intentionally? You will still have the same issue as #18588?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please read the description of the PR for context.

Comment on lines -902 to -907
c.lg.Info(
"snapshot storing member",
zap.String("id", m.ID.String()),
zap.Strings("peer-urls", m.PeerURLs),
zap.Bool("is-learner", m.IsLearner),
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove this log ?

Copy link
Member Author

@serathius serathius Nov 4, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now happens too frequently and spams logs

Comment on lines -2205 to -2208
lg.Info(
"compacted Raft logs",
zap.Uint64("compact-index", compacti),
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why remove this log?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now happens too frequently and spams logs

Comment on lines -2191 to -2195
compacti := uint64(1)
if snapi > s.Cfg.SnapshotCatchUpEntries {
compacti = snapi - s.Cfg.SnapshotCatchUpEntries
if snapi <= s.Cfg.SnapshotCatchUpEntries {
return
}

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be unrelated to this PR, can we do it in a separate small PR?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, compactLog was causing some issue with when called too frequently. Need to double check.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

4 participants