Commit 782ee7a

Partial Loading PR 3.5: Fix premature model drops from the RAM cache (#7522)

## Summary

This is an unplanned fix between PR3 and PR4 in the sequence of partial loading (i.e. low-VRAM) PRs. This PR restores the 'Current Workaround' documented in #7513. In other words, to work around a flaw in the model cache API, this fix allows models to be loaded into VRAM _even if_ they have been dropped from the RAM cache. This PR also adds an info log each time that this workaround is hit. In a future PR (#7509), we will eliminate the places in the application code that are capable of triggering this condition.

## Related Issues / Discussions

- #7492
- #7494
- #7500
- #7513

## QA Instructions

- Set the RAM cache limit to a small value, e.g. `ram: 4`.
- Run FLUX text-to-image with the full T5 encoder, which exceeds 4GB. This will trigger the error condition.
- Before the fix, this test configuration would cause a `KeyError`. After the fix, we should see an info-level log explaining that the condition was hit, and generation should continue successfully.

## Merge Plan

No special instructions.

## Checklist

- [x] _The PR has a short but descriptive title, suitable for a changelog_
- [x] _Tests added / updated (if applicable)_
- [x] _Documentation added / updated (if applicable)_
- [ ] _Updated `What's New` copy (if doing a release after this PR)_
2 parents f4f7415 + c579a21 commit 782ee7a
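
To make the cache flaw concrete, here is a minimal, self-contained sketch of the failure mode (toy names only, not InvokeAI's actual classes): a record is fetched, the entry is then evicted from the RAM cache, and a later lookup by key fails, while the record the caller already holds still references the model.

```python
# Toy sketch of the flaw described in the Summary. ToyRecord and `records` are
# illustrative stand-ins, not InvokeAI code.
from dataclasses import dataclass


@dataclass
class ToyRecord:
    key: str
    model: object  # the record keeps its own reference to the model


records = {"t5_encoder": ToyRecord("t5_encoder", model=object())}

record = records["t5_encoder"]  # caller fetches the record (analogous to ModelCache.get)
del records["t5_encoder"]       # the RAM cache evicts the entry to make room for another model

# Old behaviour: lock() looked the entry up again by key -> KeyError once evicted.
try:
    records[record.key]
except KeyError:
    print("lock-by-key fails after the entry has been dropped")

# New behaviour: lock() receives the record itself, so it can log an info message
# and proceed, because the record still holds a reference to the model.
print(f"lock-by-record still works: {record.key} -> {type(record.model).__name__}")
```

The diffs below implement exactly this shift: callers hand the `CacheRecord` itself to `lock`/`unlock`, and the cache logs an info message instead of raising when the entry is already gone.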

File tree

2 files changed: +20 -8 lines changed


invokeai/backend/model_manager/load/load_base.py

Lines changed: 4 additions & 4 deletions
@@ -57,20 +57,20 @@ def __init__(self, cache_record: CacheRecord, cache: ModelCache):
         self._cache = cache
 
     def __enter__(self) -> AnyModel:
-        self._cache.lock(self._cache_record.key)
+        self._cache.lock(self._cache_record)
         return self.model
 
     def __exit__(self, *args: Any, **kwargs: Any) -> None:
-        self._cache.unlock(self._cache_record.key)
+        self._cache.unlock(self._cache_record)
 
     @contextmanager
     def model_on_device(self) -> Generator[Tuple[Optional[Dict[str, torch.Tensor]], AnyModel], None, None]:
         """Return a tuple consisting of the model's state dict (if it exists) and the locked model on execution device."""
-        self._cache.lock(self._cache_record.key)
+        self._cache.lock(self._cache_record)
         try:
             yield (self._cache_record.state_dict, self._cache_record.model)
         finally:
-            self._cache.unlock(self._cache_record.key)
+            self._cache.unlock(self._cache_record)
 
     @property
     def model(self) -> AnyModel:
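
The hunk above only changes what gets passed to the cache (the `CacheRecord` itself rather than its key); the enter/exit locking pattern stays the same. A rough, self-contained sketch of that pattern, with toy classes standing in for `ModelCache` and the loaded-model wrapper (not the real implementation):

```python
# A sketch of the lock-on-enter / unlock-on-exit pattern shown above, with toy
# stand-ins for the real classes. Only the structure mirrors load_base.py.
from contextlib import contextmanager
from types import SimpleNamespace
from typing import Any, Generator


class ToyCache:
    """Stand-in for ModelCache: lock/unlock receive the cache record itself."""

    def lock(self, record: Any) -> None:
        print(f"lock {record.key}")

    def unlock(self, record: Any) -> None:
        print(f"unlock {record.key}")


class ToyLoadedModel:
    """Stand-in for the loaded-model wrapper patched in the diff above."""

    def __init__(self, cache_record: Any, cache: ToyCache) -> None:
        self._cache_record = cache_record
        self._cache = cache

    def __enter__(self) -> Any:
        self._cache.lock(self._cache_record)  # pass the record, not the key
        return self._cache_record.model

    def __exit__(self, *args: Any) -> None:
        self._cache.unlock(self._cache_record)

    @contextmanager
    def model_on_device(self) -> Generator[Any, None, None]:
        """Yield (state_dict, model) while the lock is held, mirroring the real method."""
        self._cache.lock(self._cache_record)
        try:
            yield (self._cache_record.state_dict, self._cache_record.model)
        finally:
            self._cache.unlock(self._cache_record)


record = SimpleNamespace(key="unet", model=object(), state_dict=None)
with ToyLoadedModel(record, ToyCache()) as model:
    pass  # the model remains locked for the duration of the block
```

Callers that only use the wrapper's context managers are unaffected, which is consistent with the change touching just these two files.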

invokeai/backend/model_manager/load/model_cache/model_cache.py

Lines changed: 16 additions & 4 deletions
@@ -194,9 +194,15 @@ def get(
 
         return cache_entry
 
-    def lock(self, key: str) -> None:
+    def lock(self, cache_entry: CacheRecord) -> None:
         """Lock a model for use and move it into VRAM."""
-        cache_entry = self._cached_models[key]
+        if cache_entry.key not in self._cached_models:
+            self._logger.info(
+                f"Locking model cache entry {cache_entry.key} ({cache_entry.model.__class__.__name__}), but it has "
+                "already been dropped from the RAM cache. This is a sign that the model loading order is non-optimal "
+                "in the invocation code (See https://github.com/invoke-ai/InvokeAI/issues/7513)."
+            )
+        # cache_entry = self._cached_models[key]
         cache_entry.lock()
 
         try:
@@ -214,9 +220,15 @@ def lock(self, key: str) -> None:
             cache_entry.unlock()
             raise
 
-    def unlock(self, key: str) -> None:
+    def unlock(self, cache_entry: CacheRecord) -> None:
         """Unlock a model."""
-        cache_entry = self._cached_models[key]
+        if cache_entry.key not in self._cached_models:
+            self._logger.info(
+                f"Unlocking model cache entry {cache_entry.key} ({cache_entry.model.__class__.__name__}), but it has "
+                "already been dropped from the RAM cache. This is a sign that the model loading order is non-optimal "
+                "in the invocation code (See https://github.com/invoke-ai/InvokeAI/issues/7513)."
+            )
+        # cache_entry = self._cached_models[key]
         cache_entry.unlock()
         if not self._lazy_offloading:
             self._offload_unlocked_models(0)
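
Putting the two pieces together, here is a rough end-to-end sketch of the new behaviour exercised by the QA steps (toy stand-ins again, not the real `ModelCache`): the entry is evicted before locking, an info-level message is emitted, and execution continues instead of raising `KeyError`.

```python
# Toy end-to-end illustration of the guard added above: eviction before lock no
# longer raises; it logs at info level and proceeds. Names are illustrative only.
import logging
from dataclasses import dataclass, field
from typing import Dict

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger("toy_model_cache")


@dataclass
class ToyRecord:
    key: str
    model: object
    locks: int = 0

    def lock(self) -> None:
        self.locks += 1

    def unlock(self) -> None:
        self.locks -= 1


@dataclass
class ToyModelCache:
    _cached_models: Dict[str, ToyRecord] = field(default_factory=dict)

    def put(self, key: str, model: object) -> ToyRecord:
        self._cached_models[key] = ToyRecord(key, model)
        return self._cached_models[key]

    def drop(self, key: str) -> None:
        """Simulate the RAM cache evicting an entry to stay under its size limit."""
        del self._cached_models[key]

    def lock(self, cache_entry: ToyRecord) -> None:
        if cache_entry.key not in self._cached_models:
            logger.info("Locking %s, but it was already dropped from the RAM cache.", cache_entry.key)
        cache_entry.lock()

    def unlock(self, cache_entry: ToyRecord) -> None:
        if cache_entry.key not in self._cached_models:
            logger.info("Unlocking %s, but it was already dropped from the RAM cache.", cache_entry.key)
        cache_entry.unlock()


cache = ToyModelCache()
record = cache.put("t5_encoder", model=object())
cache.drop("t5_encoder")  # e.g. a small `ram: 4` limit forces eviction before locking
cache.lock(record)        # info log, no KeyError
cache.unlock(record)      # info log, work completes successfully
```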
