Skip to content

Comments

[Store] Notify master on disk eviction to fix stale metadata #1549

Open
duhaode520 wants to merge 4 commits intokvcache-ai:mainfrom
duhaode520:feature/eviction-master-notification
Open

[Store] Notify master on disk eviction to fix stale metadata #1549
duhaode520 wants to merge 4 commits intokvcache-ai:mainfrom
duhaode520:feature/eviction-master-notification

Conversation

@duhaode520
Copy link

Description

When StorageBackend evicts local disk files via FIFO eviction (EnsureDiskSpaceEvictFile) to make space for new writes, the master is not notified. This causes stale DiskDescriptor / LocalDiskDescriptor metadata to persist in the master, pointing to files that no longer exist on disk. Subsequent Get requests are routed to non-existent files, causing metadata not found errors.

This issue was identified during PR #1028 review (comment by @ykwd: "Before removing a file, the master should be notified"). Related issues: #1078.

Problem

Put(key) → PutStart → StoreObject → EnsureDiskSpace
                                         ↓
                                    EvictFile()  ← deletes local file, does NOT notify master
                                         ↓
                                    Master retains stale DiskDescriptor for evicted key

Solution

  1. FileRecord gains a key field — tracks which object key each file belongs to, so eviction can report the affected key.

  2. StoreObject / EnsureDiskSpace return evicted keys — callers receive a std::vector<std::string> of keys whose files were evicted during the write.

  3. New EvictDiskReplica RPC — a dedicated master-side operation that removes DISK or LOCAL_DISK replicas for a key without affecting memory replicas. Unlike PutRevoke (PROCESSING-only) or Remove (deletes entire key), this works on COMPLETE replicas and preserves memory replicas.

  4. Client-side notification in both write paths:

    • PutToLocalFile: after StoreObject, iterates evicted keys and calls EvictDiskReplica(key, DISK)
    • BatchOffload: new eviction_handler callback parameter; FileStorage passes a handler calling EvictDiskReplica(key, LOCAL_DISK)

Scope

  • In scope: Runtime eviction triggered by EnsureDiskSpace during StoreObject (both PutToLocalFile and BatchOffload paths).
  • Out of scope: InitQuotaEvict (startup-time eviction) — can rely on ScanMeta to re-sync with master on restart.

Type of Change

  • Types
    • Bug fix
    • New feature
      • Transfer Engine
      • Mooncake Store
      • Mooncake EP
      • Integration
      • P2P Store
      • Python Wheel
    • Breaking change
    • CI/CD
    • Documentation update
    • Other

How Has This Been Tested?

Unit Tests (storage_backend_test.cpp)

  • StoreObjectReturnsEvictedKeys: fills 2KB quota with 2×1KB files, writes 3rd → verifies evicted key returned in FIFO order
  • StoreObjectEvictionWithEmptyKey: writes without key → verifies empty eviction list (backward compat)
  • AdaptorBatchOffload_EvictionHandlerCalled: 3KB quota, writes 4×1KB files → verifies eviction_handler called with correct evicted key

Unit Tests (master_service_ssd_test.cpp)

  • EvictDiskReplica_RemovesDiskReplica: creates MEMORY+DISK replicas, evicts DISK → verifies MEMORY preserved
  • EvictDiskReplica_NonExistentKeyReturnsError: verifies OBJECT_NOT_FOUND
  • EvictDiskReplica_InvalidReplicaTypeReturnsError: verifies INVALID_PARAMS for MEMORY type

Integration Tests (client_integration_test.cpp)

  • EvictionNotifiesMaster: starts InProcMaster with 3KB disk quota, puts 3×1KB keys (fills quota), puts 4th key → verifies key_0's DISK replica removed from master metadata while keys 1-3 retain their replicas

All 3 test suites pass: storage_backend_test (49), master_service_ssd_test (10), client_integration_test (11).

Checklist

  • I have performed a self-review of my own code.
  • I have formatted my own code using ./scripts/code_format.sh before submitting.
  • I have updated the documentation.
  • I have added tests to prove my changes are effective.

- FileRecord gains a key field to track evicted object keys
- StoreObject returns evicted key list; EvictFile returns FileRecord
- EnsureDiskSpace collects evicted keys through FIFO eviction loop
- MasterService::EvictDiskReplica(client_id, key, replica_type) with
  DISK vs LOCAL_DISK distinction and per-client filtering
- Wire through RPC layer: WrappedMasterService + MasterClient
- Unit tests for StorageBackend eviction and MasterService EvictDiskReplica
- PutToLocalFile passes key to StoreObject, calls EvictDiskReplica for
  each evicted key with ReplicaType::DISK
- Client::EvictDiskReplica public method forwarding to MasterClient
- FileStorage::OffloadObjects creates eviction_handler lambda calling
  EvictDiskReplica with ReplicaType::LOCAL_DISK, passes to BatchOffload
- StorageBackendAdaptor::BatchOffload invokes eviction_handler per
  evicted key from StoreObject
- Unit test for eviction_handler callback in BatchOffload
- Extend InProcMasterConfig with root_fs_dir, enable_disk_eviction, and
  quota_bytes to allow tests to configure disk replica behavior
- Add EvictionNotificationTest fixture with small quota (3KB) to trigger
  FIFO eviction after 3 x 1KB puts
- Verify evicted key's DISK replica is removed from master metadata via
  EvictDiskReplica RPC, and non-evicted keys retain their DISK replicas
@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello @duhaode520, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request addresses a critical issue where the master service's metadata could become stale after local disk eviction, leading to data access errors. The changes introduce a robust notification system, ensuring that the master is always aware of locally evicted disk replicas. This improves data consistency and reliability by actively synchronizing the master's view of stored objects with the actual state of local storage.

Highlights

  • Master Notification for Disk Eviction: Implemented a mechanism to notify the master service when local disk files are evicted by the StorageBackend, preventing stale metadata from persisting on the master.
  • New RPC for Disk Replica Eviction: Introduced a new EvictDiskReplica RPC on the master, allowing clients to specifically remove DISK or LOCAL_DISK replicas for a key without affecting memory replicas or deleting the entire key.
  • Storage Backend Enhancements: Modified FileRecord to track the associated object key, and updated StoreObject and EnsureDiskSpace methods in StorageBackend to return a list of keys corresponding to evicted files during write operations.
  • Client-Side Eviction Handling: Integrated client-side logic in PutToLocalFile and BatchOffload to call the new EvictDiskReplica RPC for any keys evicted by the local storage backend.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Changelog
  • mooncake-store/include/client_service.h
    • Added EvictDiskReplica method to notify the master about locally evicted disk replicas.
  • mooncake-store/include/master_client.h
    • Added EvictDiskReplica method to the MasterClient interface for notifying the master.
  • mooncake-store/include/master_config.h
    • Added root_fs_dir, enable_disk_eviction, and quota_bytes fields to InProcMasterConfig struct.
    • Added corresponding builder methods (set_root_fs_dir, set_enable_disk_eviction, set_quota_bytes) to InProcMasterConfigBuilder.
  • mooncake-store/include/master_service.h
    • Added EvictDiskReplica method to the MasterService class for handling disk replica eviction requests.
  • mooncake-store/include/rpc_service.h
    • Added EvictDiskReplica method to WrappedMasterService for RPC handling.
  • mooncake-store/include/storage_backend.h
    • Updated FileRecord struct to include a key field for tracking associated object keys.
    • Modified BatchOffload methods in StorageBackendInterface and derived classes to accept an optional eviction_handler callback.
    • Changed StoreObject methods to return tl::expected<std::vector<std::string>, ErrorCode> (list of evicted keys) and accept an optional key parameter.
    • Updated EvictFile to return a FileRecord instead of a std::string.
    • Modified AddFileToWriteQueue to accept an optional key parameter.
    • Changed EnsureDiskSpace to return tl::expected<std::vector<std::string>, ErrorCode> (list of evicted keys).
  • mooncake-store/src/client_service.cpp
    • Implemented Client::EvictDiskReplica by forwarding the call to master_client_.
    • Modified Client::PutToLocalFile to pass the object key to StoreObject and iterate over returned evicted keys to notify the master via EvictDiskReplica.
  • mooncake-store/src/file_storage.cpp
    • Modified FileStorage::OffloadObjects to provide an eviction_handler to storage_backend_->BatchOffload, which calls client_->EvictDiskReplica for LOCAL_DISK replicas.
  • mooncake-store/src/master_client.cpp
    • Added RpcNameTraits specialization for EvictDiskReplica.
    • Implemented MasterClient::EvictDiskReplica to invoke the corresponding RPC on the master.
  • mooncake-store/src/master_service.cpp
    • Implemented MasterService::EvictDiskReplica to remove DISK or LOCAL_DISK replicas from metadata based on the provided key and replica type, and to erase metadata if no valid replicas remain.
  • mooncake-store/src/rpc_service.cpp
    • Implemented WrappedMasterService::EvictDiskReplica to wrap the master service call and log RPC details.
    • Registered the EvictDiskReplica handler with the RPC server.
  • mooncake-store/src/storage_backend.cpp
    • Updated StorageBackend::Init to initialize FileRecord with an empty key when scanning existing files.
    • Modified StorageBackend::InitQuotaEvict to use the FileRecord returned by EvictFile.
    • Updated StorageBackend::StoreObject overloads to accept an optional key parameter, pass it to AddFileToWriteQueue, and return the list of evicted keys from EnsureDiskSpace.
    • Modified StorageBackend::EvictFile to return a FileRecord containing the path, size, and associated key of the evicted file.
    • Updated StorageBackend::AddFileToWriteQueue to store the provided key in the FileRecord.
    • Modified StorageBackend::EnsureDiskSpace to collect and return a vector of keys corresponding to files evicted to make space.
    • Updated StorageBackendAdaptor::BatchOffload to accept and utilize the eviction_handler callback.
    • Modified BucketStorageBackend::BatchOffload and OffsetAllocatorStorageBackend::BatchOffload to accept the eviction_handler parameter.
  • mooncake-store/tests/client_integration_test.cpp
    • Included <filesystem> header.
    • Added EvictionNotificationTest fixture to set up an in-process master with disk eviction enabled and a small quota.
    • Added DiskReplicaRemovedAfterEviction integration test to verify that the master is correctly notified and removes DISK replicas after local file eviction.
  • mooncake-store/tests/master_service_ssd_test.cpp
    • Added EvictDiskReplica_RemovesDiskReplica test to verify that the EvictDiskReplica RPC correctly removes disk replicas while preserving memory replicas.
    • Added EvictDiskReplica_NonExistentKeyReturnsError test to ensure an error is returned for non-existent keys.
    • Added EvictDiskReplica_InvalidReplicaTypeReturnsError test to confirm that an error is returned for invalid replica types (e.g., MEMORY).
  • mooncake-store/tests/storage_backend_test.cpp
    • Added StoreObjectReturnsEvictedKeys test to verify that StoreObject correctly returns keys of evicted files.
    • Added StoreObjectEvictionWithEmptyKey test to ensure that StoreObject returns an empty list of evicted keys when files without associated keys are evicted.
    • Added AdaptorBatchOffload_EvictionHandlerCalled test to confirm that the eviction_handler callback is invoked during BatchOffload when files are evicted.
  • mooncake-store/tests/test_server_helpers.h
    • Updated InProcMaster::Start to configure root_fs_dir, enable_disk_eviction, and quota_bytes for the WrappedMasterService based on the provided InProcMasterConfig.
Activity
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses the issue of stale metadata on the master after local disk eviction. The solution is well-designed, introducing a new EvictDiskReplica RPC and correctly propagating eviction information from the StorageBackend up to the client service to trigger master notification. The changes are comprehensive, covering both the Put and BatchOffload paths. The addition of unit and integration tests is commendable and significantly improves confidence in the fix. I've identified a couple of areas for improvement regarding code duplication and test clarity, which are detailed in the review comments.

Comment on lines +2669 to +2710
TEST_F(StorageBackendTest, AdaptorBatchOffload_EvictionHandlerCalled) {
// Test that the eviction_handler callback in BatchOffload is correctly
// invoked when the underlying StorageBackend evicts files during
// StoreObject. We use a direct StorageBackend with a small quota to
// guarantee eviction, then verify via StorageBackendAdaptor that
// the handler fires.
//
// Since StorageBackendAdaptor::Init doesn't forward quota to the
// underlying StorageBackend, we test at the StoreObject level (already
// covered by StoreObjectReturnsEvictedKeys) and verify the BatchOffload
// handler wiring here with a mock-like capture.

std::string test_dir = data_path + "/eviction_handler_test";
std::filesystem::create_directories(test_dir);

// Create backend with small quota (3072 bytes = room for ~3 files of 1024)
StorageBackend backend(test_dir, "", true);
auto init_result = backend.Init(3072);
ASSERT_TRUE(init_result.has_value());

// Pre-fill with keyed files
std::string data(1024, 'A');
auto r1 = backend.StoreObject(test_dir + "/f1", data, "key_1");
ASSERT_TRUE(r1.has_value());
auto r2 = backend.StoreObject(test_dir + "/f2", data, "key_2");
ASSERT_TRUE(r2.has_value());
auto r3 = backend.StoreObject(test_dir + "/f3", data, "key_3");
ASSERT_TRUE(r3.has_value());

// Now store one more, which should evict key_1
std::vector<std::string> evicted_keys;
auto r4 = backend.StoreObject(test_dir + "/f4", data, "key_4");
ASSERT_TRUE(r4.has_value());
for (const auto& ek : r4.value()) {
evicted_keys.push_back(ek);
}

EXPECT_FALSE(evicted_keys.empty())
<< "Should have evicted at least one key";
EXPECT_EQ(evicted_keys[0], "key_1")
<< "FIFO eviction should evict key_1 first";
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

This test case, AdaptorBatchOffload_EvictionHandlerCalled, appears to be redundant and its name is misleading. The implementation directly tests StorageBackend::StoreObject, which is already covered by StoreObjectReturnsEvictedKeys. It does not test StorageBackendAdaptor::BatchOffload or the eviction_handler callback as the name and comment suggest.

To improve test clarity and avoid redundancy, consider one of the following:

  1. Remove this test: Since its core logic is already tested, removing it would clean up the test suite.
  2. Refactor the test: If the goal is to test the eviction_handler wiring in StorageBackendAdaptor, the test should be refactored to instantiate a StorageBackendAdaptor and provide a mock eviction_handler to verify it's called with the correct evicted keys. This might require some adjustments to the StorageBackendAdaptor to allow injecting a StorageBackend with a specific quota for testing purposes.

As it stands, the test is confusing and duplicates existing test coverage.

@codecov-commenter
Copy link

⚠️ Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

❌ Patch coverage is 88.41463% with 38 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
mooncake-store/src/storage_backend.cpp 56.75% 16 Missing ⚠️
mooncake-store/src/file_storage.cpp 0.00% 9 Missing ⚠️
mooncake-store/src/master_service.cpp 72.72% 6 Missing ⚠️
mooncake-store/src/client_service.cpp 55.55% 4 Missing ⚠️
mooncake-store/tests/client_integration_test.cpp 96.90% 3 Missing ⚠️

📢 Thoughts on this report? Let us know!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants