Skip to content

Refactor scheduler retriever#1

Merged
tangg555 merged 7 commits intorefactor-schedulerfrom
refactor-scheduler-Retriever
Jan 23, 2026
Merged

Refactor scheduler retriever#1
tangg555 merged 7 commits intorefactor-schedulerfrom
refactor-scheduler-Retriever

Conversation

@glin93
Copy link
Collaborator

@glin93 glin93 commented Jan 22, 2026

Description

Summary: (summary)

Fix: #(issue)

Docs Issue/PR: (docs-issue-or-pr-link)

Reviewer: @(reviewer)

Checklist:

  • I have performed a self-review of my own code | 我已自行检查了自己的代码
  • I have commented my code in hard-to-understand areas | 我已在难以理解的地方对代码进行了注释
  • I have added tests that prove my fix is effective or that my feature works | 我已添加测试以证明我的修复有效或功能正常
  • I have created related documentation issue/PR in MemOS-Docs (if applicable) | 我已在 MemOS-Docs 中创建了相关的文档 issue/PR(如果适用)
  • I have linked the issue to this PR (if applicable) | 我已将 issue 链接到此 PR(如果适用)
  • I have mentioned the person who will review this PR | 我已提及将审查此 PR 的人

Note

Major refactor to decouple retrieval and post-processing, and to standardize search.

  • Replace SchedulerRetriever with SchedulerSearchService (unified search) and MemoryPostProcessor (filter + rerank); update schedulers to use post_processor and search_service
  • Unify search mode handling (map search_methodSearchMode) and route searches through search_service (with NaiveTextMemory fallback)
  • Fix critical result count bug by querying memory_type="All" in one call (avoids returning 2*top_k); preserve searcher internet toggle state
  • Move memory enhancement and recall (enhance_memories_with_query, recall_for_missing_memories) into AdvancedSearcher
  • Consolidate JSON/list extraction helpers into memos.utils; update imports accordingly
  • Adjust optimized_scheduler, general_scheduler, and eval path to new services; minor signature cleanup
  • Add unit tests: test_search_service.py, test_search_service_bug_fix.py, test_post_processor.py

Written by Cursor Bugbot for commit 1663831. This will update automatically on new commits. Configure here.

glin1993@outlook.com added 5 commits January 22, 2026 17:29
…iever

- Split SchedulerRetriever into SchedulerSearchService and MemoryPostProcessor
- SchedulerSearchService: Unified search interface delegating to Searcher
- MemoryPostProcessor: Handles all post-retrieval processing (enhance, filter, rerank)
- Fix: Use memory_type='All' to prevent 2*top_k bug in search results
- Add robust search_method conversion with logging for unknown methods
- Update all scheduler call sites (base, general, optimized, eval)
- Add comprehensive unit tests for both new services

Breaking changes: None (backward compatible)
Fixes: Search result count bug (was returning up to 2*top_k)
Problem:
- enhance_memories_with_query and recall_for_missing_memories were in MemoryPostProcessor
- This caused SearchHandler to depend on Scheduler (cross-layer coupling)
- These are general search enhancement features, not scheduler-specific

Solution:
- Move both methods from MemoryPostProcessor to AdvancedSearcher
- Update SingleCubeView to use self.searcher instead of self.mem_scheduler.post_processor
- Simplify MemoryPostProcessor to only handle filtering and reranking

Benefits:
- Better separation of concerns (search enhancement belongs to Searcher)
- Removes cross-layer dependency (API layer no longer depends on Scheduler)
- Enables reuse (both Scheduler and SearchHandler share same implementation)
- Improves testability (AdvancedSearcher can be tested independently)

Changes:
- advanced_searcher.py: +291 lines (added enhancement methods)
- post_processor.py: -320 lines (removed enhancement, kept filter/rerank)
- single_cube.py: 6 changes (updated call sites)
- test_post_processor.py: -57 lines (removed obsolete tests)

Type safety: Verified via 'AdvancedSearcher as Searcher' alias in tree.py
Fixes 3 critical issues identified in code review:

1. Configuration behavior change (HIGH RISK)
   Problem: enhance_memories_with_query called without batch_size/retries
   Impact: Behavior inconsistent with historical settings
   Fix: Pass DEFAULT_SCHEDULER_RETRIEVER_BATCH_SIZE and RETRIES explicitly

2. Module dependency inversion (ARCHITECTURE REGRESSION)
   Problem: AdvancedSearcher depended on memos.mem_scheduler.utils
   Impact: Violated decoupling goal (memories should not depend on scheduler)
   Fix: Move extract_json_obj/extract_list_items_in_answer to memos.utils
   Changes:
   - src/memos/utils.py: +163 lines (add utility functions)
   - misc_utils.py: -163 lines, +3 lines (re-export for backward compat)
   - Update imports in 6 files

3. Test coverage migration (NOTED)
   Note: _split_batches tests removed but not migrated
   Action: Defer to future test enhancement

Benefits:
- Maintains historical batch/retry behavior
- Restores proper layering (no memories→scheduler dependency)
- Backward compatible (misc_utils re-exports from memos.utils)

Verification:
✓ No scheduler imports in AdvancedSearcher
✓ Config params passed to enhance_memories_with_query
✓ All existing imports work via re-export
Critical fix for misc_utils.py:
- Removed empty 'def extract_json_obj(text: str):' declaration
- This was leftover from sed deletion and caused IndentationError
- Functions are properly re-exported from memos.utils

Verification: python3 -m py_compile passes
- Fix test_search_service.py: call_count assertion (2 -> 1)
  SearchService now uses memory_type='All' for single search call
- Fix test_post_processor.py: remove incorrect 'with Mock()' usage
- Fix ruff E712: use truthiness check instead of '== True'
- Apply ruff formatting to all modified files

All 23 tests now pass successfully.
@glin93 glin93 requested a review from tangg555 January 22, 2026 12:45
cursor[bot]

This comment was marked as outdated.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 3 potential issues.

Bugbot Autofix is OFF. To automatically fix reported issues with Cloud Agents, enable Autofix in the Cursor dashboard.

prompt_name = "memory_rewrite_enhancement"
else:
text_memories = "\n".join([f"- {mem}" for i, mem in enumerate(batch_texts)])
prompt_name = "memory_recreate_enhancement"
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing prompt templates in AdvancedSearcher will cause KeyError

High Severity

The enhance_memories_with_query and recall_for_missing_memories methods were moved to AdvancedSearcher, but the required prompt templates are missing from advanced_search_prompts.py. The PROMPT_MAPPING in that file doesn't include "memory_rewrite_enhancement" (used when FINE_STRATEGY == FineStrategy.REWRITE) or "enlarge_recall". When load_template attempts PROMPT_MAPPING[template_name] for these keys, a KeyError is raised at runtime.

Additional Locations (1)

Fix in Cursor Fix in Web

finally:
# Restore original setting
if original_manual_close is not None:
self.searcher.manual_close_internet = original_manual_close
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Race condition when modifying shared searcher state

Medium Severity

The _search_with_searcher method modifies self.searcher.manual_close_internet and restores it in a finally block. Since the Searcher instance is shared across all concurrent requests, two threads calling search() with different internet_search values can interfere with each other. One thread's setting could affect another thread's search, causing incorrect internet search behavior. The old code path via TreeTextMemory.search() created a new Searcher per call, avoiding this issue.

Fix in Cursor Fix in Web

session_id=target_session_id,
internet_search=search_req.internet_search,
chat_history=search_req.chat_history,
mem_cube_id=user_context.mem_cube_id,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Search fails silently for NaiveTextMemory in optimized scheduler

Medium Severity

The search_memories method accepts NaiveMemCube (which may use NaiveTextMemory), but NaiveTextMemory lacks get_searcher, so self.searcher is None. When search_service.search() falls back to _search_with_text_mem, an assertion requires TreeTextMemory, causing AssertionError. The exception is caught and returns an empty list, silently failing searches that previously worked when the old code called mem_cube.text_mem.search() directly.

Additional Locations (1)

Fix in Cursor Fix in Web

@tangg555 tangg555 merged commit e5358f3 into refactor-scheduler Jan 23, 2026
1 check passed
tangg555 added a commit that referenced this pull request Feb 9, 2026
…nsor#1004)

* refactor(scheduler): modularize handlers and search

- extract scheduler handlers into dedicated modules

- split retriever into pipelines (search/enhance/rerank/filter)

- centralize text search logic for API and scheduler

Refs MemTensor#1003

* fix: resolve ruff lint errors and address Copilot review feedback

- Fix TC001/TC002/TC003: move type-only imports into TYPE_CHECKING blocks
- Fix RUF059: prefix unused variables with underscore
- Fix typos: "Memorires" -> "Memories", "exeption" -> "exception"
- Remove self-assignment: `text_mem_base = text_mem_base`
- Remove unused `user_context` param from `build_search_context`
- Restore original `QUERY_TASK_LABEL` in activation memory update
- Apply ruff format to all modified files

* fix(redis): serialize schedule messages for streams

- json-encode list/dict fields for Redis XADD

- decode chat_history safely when reading from streams

* fix(examples): use scheduler handlers

- avoid private _memory_update_consumer call

- delegate mem update handler to built-in handler

* style(ruff): fix isinstance union syntax

- apply X | Y form to satisfy UP038

* style(ruff): format message schemas

- align datetime line breaks with ruff format

* refactor: integrate architectural improvements from refactor-scheduler-stage2
This commit merges key modularization benefits and bug fixes from the
refactor-scheduler-stage2 branch into fancy-scheduler:
Modularize activation memory logic into ActivationMemoryManager
Introduce SchedulerSearchService for unified memory search coordination
Extract filtering and reranking logic into MemoryPostProcessor
Maintain intentional search scope of LongTermMemory and UserMemory in SearchPipeline and SchedulerSearchService
Update BaseScheduler to initialize and manage lifecycle of new modules
Refactor BaseSchedulerMemoryMixin to delegate tasks to specialized managers

* refactor: extract common logic to BaseSchedulerHandler and fix import paths

* refactor: rename ctx to scheduler_context in mem_scheduler

Rename abbreviation 'ctx' to 'scheduler_context' in GeneralScheduler and SchedulerHandlerRegistry to improve code readability and clarity.

* refactor: sync orchestrator config and fix scheduler imports

- Sync orchestrator config removal when unregistering handlers in dispatcher
- Fix missing TaskPriorityLevel import in dispatcher
- Fix register_handlers signature in BaseSchedulerQueueMixin
- Fix handler registry imports and initialization map
- Fix relative imports in handlers package

* fix: resolve PR #1 review issues (P0 imports, P3 types/init)

* chore(ruff): fix unused unpacked vars

- prefix unused unpacked vars with underscore

- apply ruff format changes

* perf(search): include embeddings for mmr

* fix: Pass user_context in mem_read and pref_add handlers

- Update MemReadMessageHandler to extract user_context from message and pass it to _process_memories_with_reader and transfer_mem.
- Update PrefAddMessageHandler to extract user_context from message and pass it to pref_mem.add.
- This ensures user context information is available during memory reading and preference adding operations.

---------

Co-authored-by: chentang <travistang@foxmail.com>
Co-authored-by: glin1993@outlook.com <>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants