-
Notifications
You must be signed in to change notification settings - Fork 6
sm: imagemanager: add handling outdated items #496
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: feature_unification
Are you sure you want to change the base?
Conversation
Add list of currently processing layers and blobs and do not process same blobs in parallel. Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
We add update item to outdated item but only id is not enough to identify update item id. Add version param to identify update item unique. Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
StorageItf has two functions: GetItemsInfo and GetItemsInfos which is confusing. Rename GetItemsInfo to GetAllItemsInfos and GetItemsInfos to GetItemInfos. Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This PR adds handling of outdated items in the SM (Service Manager) image manager module. It introduces a storage interface for persisting update item data, implements background processing for removing expired items, and adds a new database module to provide storage implementations.
Changes:
- Adds storage interface and outdated item management to SM image manager
- Introduces new database module for SM with storage interface implementations
- Updates space allocator API to include version parameter for item tracking
- Adds comprehensive tests for outdated item handling, integrity validation, and orphan cleanup
Reviewed changes
Copilot reviewed 23 out of 23 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| src/core/sm/imagemanager/itf/storage.hpp | New storage interface for image manager with methods to add, update, remove, and query update items |
| src/core/sm/imagemanager/tests/stubs/storagestub.hpp | Test stub implementation of storage interface with future-based synchronization for tests |
| src/core/sm/imagemanager/imagemanager.hpp | Added Start/Stop methods, storage interface, background thread for outdated item processing, and updated allocator size |
| src/core/sm/imagemanager/imagemanager.cpp | Implemented outdated item handling, integrity validation, orphan cleanup, and thread-safe blob installation |
| src/core/sm/imagemanager/tests/imagemanager.cpp | Added comprehensive tests for outdated items, max versions, max stored items, integrity validation, and orphan cleanup |
| src/core/sm/launcher/launcher.cpp | Moved debug log statement into task lambda to avoid duplicate logging |
| src/core/sm/database/ | New database module with interface that aggregates storage interfaces from multiple SM modules |
| src/core/common/spaceallocator/ | Updated AddOutdatedItem and RestoreOutdatedItem API to include version parameter |
| src/core/cm/ | Updated CM code to match new space allocator API signatures |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| class DatabaseItf ["aos::cm::database::DatabaseItf"] { | ||
| <<interface>> | ||
| } | ||
|
|
||
| class ImageManagerStorageItf ["aos::cm::imagemanager::StorageItf"] { | ||
| <<interface>> | ||
| } | ||
|
|
||
| class LauncherStorageItf ["aos::cm::launcher::StorageItf"] { | ||
| <<interface>> | ||
| } | ||
|
|
||
| class NetworkManagerStorageItf ["aos::cm::networkmanager::StorageItf"] { | ||
| <<interface>> | ||
| } | ||
|
|
||
| DatabaseItf ..|> ImageManagerStorageItf | ||
| DatabaseItf ..|> LauncherStorageItf | ||
| DatabaseItf ..|> NetworkManagerStorageItf |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The mermaid diagram references "aos::cm::database::DatabaseItf" but this is the SM (Service Manager) module, not CM (Communication Manager). All references in the diagram should use "aos::sm::" namespace instead of "aos::cm::".
| } | ||
|
|
||
| /** | ||
| * Returns feature to wait when item removed. |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "Returns feature" should be "Returns future" based on the return type std::future.
| * Returns feature to wait when item removed. | |
| * Returns future to wait when item removed. |
| void ImageManager::ProcessOutdatedItems() | ||
| { | ||
| while (true) { | ||
| UniqueLock lock {mMutex}; | ||
|
|
||
| if (auto err | ||
| = mCV.Wait(lock, [&]() { return (mClose || mProcessOutdatedItems) && mInProgressBlobs.IsEmpty(); }); | ||
| !err.IsNone()) { | ||
| LOG_ERR() << "Wait failed" << Log::Field(err); | ||
| continue; | ||
| } | ||
|
|
||
| if (mClose) { | ||
| break; | ||
| } | ||
|
|
||
| if (!mProcessOutdatedItems) { | ||
| continue; | ||
| } | ||
|
|
||
| mProcessOutdatedItems = false; | ||
|
|
||
| if (auto err = HandleOutdatedItems(); !err.IsNone()) { | ||
| LOG_ERR() << "Can't handle outdated items" << Log::Field(err); | ||
| } | ||
|
|
||
| if (auto err = HandleItemsIntegrity(); !err.IsNone()) { | ||
| LOG_ERR() << "Can't handle items integrity" << Log::Field(err); | ||
| } | ||
|
|
||
| auto [size, err] = RemoveOrphans(); | ||
| if (!err.IsNone()) { | ||
| LOG_ERR() << "Remove orphans failed" << Log::Field(err); | ||
| } | ||
|
|
||
| mSpaceAllocator->FreeSpace(size); | ||
| } |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential deadlock issue: ProcessOutdatedItems holds the mutex lock while calling HandleOutdatedItems, HandleItemsIntegrity, and RemoveOrphans. These functions may perform long-running operations (file I/O, validation, etc.) while holding the lock. This could block other operations that need the mutex, such as InstallUpdateItem or RemoveUpdateItem. Consider releasing the lock before calling these functions and reacquiring it when needed, or using a separate lock for the background processing thread.
| RetWithError<size_t> ImageManager::RemoveItem(const String& id, const String& version) | ||
| { | ||
| LOG_DBG() << "Remove item" << Log::Field("id", id); | ||
| LockGuard lock {mMutex}; | ||
|
|
||
| LOG_DBG() << "Remove item" << Log::Field("id", id) << Log::Field("version", version); | ||
|
|
||
| if (auto err = mStorage->RemoveUpdateItem(id, version); !err.IsNone()) { | ||
| LOG_ERR() << "Failed to remove update item" << Log::Field("itemID", id) << Log::Field("version", version) | ||
| << Log::Field(err); | ||
| } | ||
|
|
||
| auto [size, err] = RemoveOrphans(); | ||
| if (!err.IsNone()) { | ||
| return RetWithError<size_t>(0, AOS_ERROR_WRAP(err)); | ||
| } | ||
|
|
||
| return 0; | ||
| return size; | ||
| } |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential deadlock: RemoveItem (line 336) is called by the space allocator while holding its own lock. RemoveItem acquires mMutex and calls RemoveOrphans. RemoveOrphans calls GetAllUpdateItems which may also acquire internal storage locks. Meanwhile, ProcessOutdatedItems holds mMutex and may call space allocator methods (FreeSpace, AddOutdatedItem, RestoreOutdatedItem). This creates a potential deadlock scenario where RemoveItem waits for mMutex while ProcessOutdatedItems holds mMutex and waits for space allocator operations. Consider using a different synchronization strategy or ensuring that no space allocator operations are called while holding mMutex.
| auto manifest = MakeUnique<oci::ImageManifest>(&mAllocator); | ||
|
|
||
| if (auto err = mOCISpec->LoadImageManifest(path, *manifest); !err.IsNone()) { | ||
| return AOS_ERROR_WRAP(err); | ||
| } |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check: After creating manifest with MakeUnique at line 940, the code does not check if manifest is nullptr before dereferencing it at line 942. If memory allocation fails, this will cause a nullptr dereference. Add a null check similar to the pattern used in InstallUpdateItem (lines 184-187).
| struct ItemIdent { | ||
| StaticString<cIDLen> mID; | ||
| StaticString<cVersionLen> mVersion; | ||
|
|
||
| bool operator==(const ItemIdent& other) const { return mID == other.mID && mVersion == other.mVersion; } | ||
| bool operator!=(const ItemIdent& other) const { return !operator==(other); } | ||
| }; | ||
|
|
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Dead code: The ItemIdent struct (lines 150-156) is defined but never used in the StorageStub class. Consider removing it or using it if it was intended for a specific purpose.
| struct ItemIdent { | |
| StaticString<cIDLen> mID; | |
| StaticString<cVersionLen> mVersion; | |
| bool operator==(const ItemIdent& other) const { return mID == other.mID && mVersion == other.mVersion; } | |
| bool operator!=(const ItemIdent& other) const { return !operator==(other); } | |
| }; |
| * Database class. | ||
| */ | ||
| class Database : public imagemanager::StorageItf, public launcher::StorageItf, public networkmanager::StorageItf { }; |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Inconsistent naming: In CM the class is named "DatabaseItf" (line 23 of src/core/cm/database/itf/database.hpp) but in SM it's named "Database". For consistency across the codebase, the SM version should also be named "DatabaseItf" to match the CM implementation.
| * Database class. | |
| */ | |
| class Database : public imagemanager::StorageItf, public launcher::StorageItf, public networkmanager::StorageItf { }; | |
| * DatabaseItf class. | |
| */ | |
| class DatabaseItf : public imagemanager::StorageItf, public launcher::StorageItf, public networkmanager::StorageItf { }; |
| } | ||
|
|
||
| /** | ||
| * Add or updates update item in storage. |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo in comment: "Add or updates" should be "Adds or updates" to be grammatically correct.
| * Add or updates update item in storage. | |
| * Adds or updates an update item in storage. |
| auto config = MakeUnique<oci::ImageConfig>(&mAllocator); | ||
|
|
||
| if (auto err = mOCISpec->LoadImageConfig(path, *config); !err.IsNone()) { | ||
| return AOS_ERROR_WRAP(err); | ||
| } |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Missing null check: After creating config with MakeUnique at line 955, the code does not check if config is nullptr before dereferencing it at line 957. If memory allocation fails, this will cause a nullptr dereference. Add a null check similar to the pattern used in InstallUpdateItem.
| // oci::cMaxNumLayers + 3 (layers + manifest + image config + aos service) | ||
| static constexpr auto cMaxNumInstalledBlobs = cMaxNumUpdateItems * (oci::cMaxNumLayers + 3); | ||
| static constexpr auto cMaxNumInstalledLayers = cMaxNumUpdateItems * oci::cMaxNumLayers; |
Copilot
AI
Jan 30, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Potential allocator size miscalculation: The comment on line 115 says "layers + manifest + image config + aos service" equals 3 extra items, but this should be 3 if aos service is optional. However, the calculation assumes all items will have aos service. Consider whether cMaxNumInstalledBlobs should account for the optional nature of aos service, or if the current calculation is intentionally conservative. The calculation should use cMaxNumStoredUpdateItems (which is cMaxNumUpdateItems * 2) instead of cMaxNumUpdateItems if items can have multiple versions stored, otherwise only the installed items should be counted.
| // oci::cMaxNumLayers + 3 (layers + manifest + image config + aos service) | |
| static constexpr auto cMaxNumInstalledBlobs = cMaxNumUpdateItems * (oci::cMaxNumLayers + 3); | |
| static constexpr auto cMaxNumInstalledLayers = cMaxNumUpdateItems * oci::cMaxNumLayers; | |
| // Total stored update items, accounting for multiple versions per item. | |
| static constexpr auto cMaxNumStoredUpdateItems = cMaxNumUpdateItems * cMaxNumItemVersions; | |
| // oci::cMaxNumLayers + 3 (layers + manifest + image config + optional aos service) | |
| static constexpr auto cMaxNumInstalledBlobs = cMaxNumStoredUpdateItems * (oci::cMaxNumLayers + 3); | |
| static constexpr auto cMaxNumInstalledLayers = cMaxNumStoredUpdateItems * oci::cMaxNumLayers; |
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
In order to do not overflow maximum allowed items, keep only two versions of items for revert purpose, and remove the oldest removed or oldest installed item before adding new one when max item limits reached. Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
Signed-off-by: Oleksandr Grytsov <oleksandr_grytsov@epam.com>
mlohvynenko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed-by: Mykhailo Lohvynenko <mykhailo_lohvynenko@epam.com>
MykolaSuperman
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed-by: Mykola Solianko <mykola_solianko@epam.com>
| if (auto err = mStorage->RemoveUpdateItem(id, version); !err.IsNone()) { | ||
| LOG_ERR() << "Failed to remove update item" << Log::Field("itemID", id) << Log::Field("version", version) | ||
| << Log::Field(err); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we add here RestoreOutdatedItem from space allocator?
No description provided.