feat(autocontextmemory): compress prompt optimize#457
feat(autocontextmemory): compress prompt optimize#457AlbumenJ merged 2 commits intoagentscope-ai:mainfrom
Conversation
Change-Id: Ibc1fc1c1dde5ef316d0ec1ee46e2959ae39aea23
Change-Id: If5da4bf0a658b94aa9f6a6bbcebdd7041f484634
There was a problem hiding this comment.
Pull request overview
This pull request enhances the AutoContext memory compression system with improved configuration options, better handling of compressed messages, and refined compression logic. The changes focus on making compression more efficient and user-configurable.
Key Changes:
- Introduced
minCompressionTokenThresholdconfiguration to skip compression for small message batches (< 5000 tokens by default) - Added plan-related tool filtering to avoid compressing plan management operations
- Standardized offload hints to use
CONTEXT_OFFLOADtag format consistently
Reviewed changes
Copilot reviewed 8 out of 8 changed files in this pull request and generated 9 comments.
Show a summary per file
| File | Description |
|---|---|
| AutoContextConfig.java | Added minCompressionTokenThreshold configuration field with builder support |
| Prompts.java | Updated compression prompts with clearer instructions and introduced CONTEXT_OFFLOAD_TAG_FORMAT |
| PromptConfig.java | Removed outdated documentation about non-configurable format templates |
| MsgUtils.java | Added isCompressedMessage() method and plan-related tool filtering utilities |
| AutoContextMemory.java | Integrated token threshold checks, plan-tool filtering, and simplified plan state context |
| AutoContextHook.java | Modified to always inject system prompt with CONTEXT_OFFLOAD instructions |
| AutoContextMemoryTest.java | Added comprehensive tests for new token threshold feature and plan-tool filtering |
| AutoContextHookTest.java | Updated tests to account for system prompt always being added |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| break; | ||
| } | ||
| } | ||
| if (allPlanRelated && toolUseBlocks.size() > 0) { |
There was a problem hiding this comment.
The condition toolUseBlocks.size() > 0 is redundant. The check !toolUseBlocks.isEmpty() on line 471 already ensures the list is not empty. This extra check adds unnecessary complexity.
| if (allPlanRelated && toolUseBlocks.size() > 0) { | |
| if (allPlanRelated) { |
| * @param minCompressionTokenThreshold the minimum token count threshold | ||
| * @return this builder instance for method chaining | ||
| */ | ||
| public Builder minCompressionTokenThreshold(int minCompressionTokenThreshold) { |
There was a problem hiding this comment.
Missing validation for the minCompressionTokenThreshold parameter. The method should validate that the threshold is non-negative (>= 0) to prevent invalid configurations. Negative values would cause unexpected behavior in the compression logic.
| public Builder minCompressionTokenThreshold(int minCompressionTokenThreshold) { | |
| public Builder minCompressionTokenThreshold(int minCompressionTokenThreshold) { | |
| if (minCompressionTokenThreshold < 0) { | |
| throw new IllegalArgumentException("minCompressionTokenThreshold must be greater than or equal to 0"); | |
| } |
| + " [prior context (if any)]\\n" | ||
| + "[new tool summaries]\n" |
There was a problem hiding this comment.
Line 183 has inconsistent formatting with extra newline notation. The string shows [prior context (if any)]\\n with escaped newline followed by actual newline on the next line. This creates confusing formatting that makes it unclear what the actual output format should be.
| + " [prior context (if any)]\\n" | |
| + "[new tool summaries]\n" | |
| + " [prior context (if any)]\n" | |
| + " [new tool summaries]\n" |
| // Include user message in messagesToSummarize for context, but keep it in the final | ||
| // list |
There was a problem hiding this comment.
The comment on line 961 says "Include user message in messagesToSummarize for context, but keep it in the final list", but the code on line 1024 (Step 7) shows that messages between user and assistant (including assistant) are removed and replaced with summary. This creates confusion about whether the user message is kept or removed. The comment should clarify that the user message itself (at userIndex) is NOT removed, only the messages after it.
|
|
||
| // Check if _compress_meta exists in metadata | ||
| Object compressMeta = metadata.get("_compress_meta"); | ||
| return compressMeta != null && compressMeta instanceof Map; |
There was a problem hiding this comment.
The instanceof check should use pattern matching (instanceof Map) without the type check, or better yet, check if it's specifically a Map to ensure type safety. The current implementation allows any Map type, but doesn't verify the expected structure of the compress metadata.
| // Check compression events - should NOT have TOOL_INVOCATION_COMPRESS event | ||
| List<CompressionEvent> compressionEvents = skipCompressionMemory.getCompressionEvents(); | ||
| boolean hasToolCompressionEvent = | ||
| compressionEvents.stream() | ||
| .anyMatch( | ||
| event -> | ||
| CompressionEvent.TOOL_INVOCATION_COMPRESS.equals( | ||
| event.getEventType())); | ||
| assertFalse( | ||
| hasToolCompressionEvent, | ||
| "Should not have tool compression event when token count is below threshold"); |
There was a problem hiding this comment.
The test verifies that the model was NOT called for tool compression, but doesn't actually verify that the skip logic was triggered. Consider adding a debug log verification or checking compression events more explicitly to ensure the minCompressionTokenThreshold logic was actually executed and caused the skip.
| String offloadTag = | ||
| offloadUuid != null | ||
| ? String.format(Prompts.CONTEXT_OFFLOAD_TAG_FORMAT, offloadUuid) | ||
| : ""; | ||
|
|
||
| // Combine: summary content + newline + UUID tag | ||
| String finalContent = summaryContent; | ||
| if (!offloadTag.isEmpty()) { | ||
| finalContent = finalContent + "\n" + offloadTag; |
There was a problem hiding this comment.
The condition check !offloadTag.isEmpty() is unnecessary. Since offloadTag is constructed from either a formatted string or an empty string, the condition could be simplified to just check if offloadUuid is not null, which was already done during construction. This creates redundant logic.
| String offloadTag = | |
| offloadUuid != null | |
| ? String.format(Prompts.CONTEXT_OFFLOAD_TAG_FORMAT, offloadUuid) | |
| : ""; | |
| // Combine: summary content + newline + UUID tag | |
| String finalContent = summaryContent; | |
| if (!offloadTag.isEmpty()) { | |
| finalContent = finalContent + "\n" + offloadTag; | |
| // Combine: summary content + newline + UUID tag | |
| String finalContent = summaryContent; | |
| if (offloadUuid != null) { | |
| finalContent = | |
| finalContent | |
| + "\n" | |
| + String.format(Prompts.CONTEXT_OFFLOAD_TAG_FORMAT, offloadUuid); |
| public static final String PREVIOUS_ROUND_CONVERSATION_SUMMARY_PROMPT = | ||
| "You are an expert content compression specialist. Your task is to intelligently" | ||
| + " summarize the following conversation history from a previous round. The content" | ||
| + " includes a user question, tool invocations and their results, and the" | ||
| + " assistant's final response for that round.\n" | ||
| "You are an expert dialogue compressor for autonomous agents. Your task is to rewrite" | ||
| + " the assistant's final response from the previous round as a self-contained," | ||
| + " concise reply that incorporates all essential facts learned during the" | ||
| + " round—without referencing tools, functions, or internal execution steps.\n" | ||
| + "\n" | ||
| + "Please provide a concise summary that:\n" | ||
| + " - Preserves important decisions, conclusions, and key information\n" | ||
| + " - Maintains context that would be needed for future interactions\n" | ||
| + " - Consolidates repeated or similar information\n" | ||
| + " - Highlights any important outcomes or results\n" | ||
| + " - Special attention for write/change operations: If tool invocations involve" | ||
| + " write or modification operations (such as file writing, data updates, state" | ||
| + " changes, etc.), pay extra attention to preserve detailed information about what" | ||
| + " was written, modified, or changed, including file paths, content summaries, and" | ||
| + " modification results\n" | ||
| + " - Provide a clear summary of the assistant's final response in that round," | ||
| + " highlighting the key points, conclusions, or actions taken"; | ||
|
|
||
| /** Format for previous round conversation summary. */ | ||
| public static final String PREVIOUS_ROUND_CONVERSATION_SUMMARY_FORMAT = | ||
| "<conversation_summary>%s</conversation_summary>\n" | ||
| + "<hint> This is a summary of previous conversation rounds. You can use this" | ||
| + " information as historical context for future reference.\n"; | ||
| + "Input includes: the user's original question, the assistant's original response," | ||
| + " and the results of any tool executions that informed that response.\n" | ||
| + "\n" | ||
| + "Your output will REPLACE the original assistant message in the conversation" | ||
| + " history, forming a clean USER -> ASSISTANT pair for future context.\n" | ||
| + "\n" | ||
| + "Guidelines:\n" | ||
| + " - NEVER mention tools, functions, API calls, or execution steps (e.g., avoid" | ||
| + " 'I called...', 'The system returned...', 'After running X...').\n" | ||
| + " - INSTEAD, state all findings as direct, factual knowledge the assistant now" | ||
| + " possesses.\n" | ||
| + " - PRESERVE CRITICAL FACTS from tool results, especially:\n" | ||
| + " • File paths and their contents, changes, or creation (e.g.," | ||
| + " '/etc/app.conf sets port=8080')\n" | ||
| + " • Exact error messages when diagnostic (e.g., 'Permission denied (errno" | ||
| + " 13)', 'timeout after 30s')\n" | ||
| + " • IDs, URLs, ports, status codes, configuration values, and data keys\n" | ||
| + " • Outcomes of write/change operations (e.g., 'Wrote maintenance flag to" | ||
| + " /tmp/status', 'Updated user_id=789 email in database')\n" | ||
| + " • Service states or process info (e.g., 'auth-service is stopped'," | ||
| + " 'PID=4567')\n" | ||
| + " - If an action was performed (e.g., file written, service restarted), clearly" | ||
| + " state WHAT changed and WHERE.\n" | ||
| + " - If something failed or was incomplete, specify the limitation (e.g., 'Could" | ||
| + " not restart: permission denied').\n" | ||
| + " - Consolidate redundant information; omit generic success messages with no" | ||
| + " actionable detail.\n" | ||
| + " - Use clear, informative language—avoid meta-phrases like 'Based on logs...'" | ||
| + " or 'As observed...'.\n" | ||
| + " - Output must be plain text: no markdown, bullets, JSON, XML, or section" | ||
| + " headers."; |
There was a problem hiding this comment.
The prompt uses a narrative style with phrases like "Your task is to...", "Your output will REPLACE...", which could be more direct and actionable. Consider using imperative mood consistently (e.g., "Rewrite the assistant's final response..." instead of "Your task is to rewrite...") for clearer, more concise instructions.
| String appendedInstruction = | ||
| "\n\n" | ||
| + "You may see compressed messages containing <!-- CONTEXT_OFFLOAD" | ||
| + " uuid=... -->.\n" | ||
| + "- Use the UUID to call context_reload if you need full details.\n" | ||
| + "- NEVER mention, quote, or refer to UUIDs, offload tags, or internal" | ||
| + " metadata in your response."; | ||
|
|
||
| String newSystemText = | ||
| originalSystemText != null | ||
| ? originalSystemText + appendedInstruction | ||
| : appendedInstruction.trim(); | ||
|
|
||
| Msg updatedSystemMsg = | ||
| Msg.builder() | ||
| .role(MsgRole.SYSTEM) | ||
| .name(originalSystemMsg.getName()) | ||
| .content(TextBlock.builder().text(newSystemText).build()) | ||
| .metadata(originalSystemMsg.getMetadata()) | ||
| .build(); | ||
|
|
||
| newInputMessages.add(updatedSystemMsg); | ||
| } else { | ||
| // No system message exists, create a new one with the instruction | ||
| String instruction = | ||
| "You may see compressed messages containing <!-- CONTEXT_OFFLOAD uuid=..." | ||
| + " -->.\n" | ||
| + "- Use the UUID to call context_reload if you need full details.\n" | ||
| + "- NEVER mention, quote, or refer to UUIDs, offload tags, or internal" | ||
| + " metadata in your response."; |
There was a problem hiding this comment.
The system prompt instruction text contains duplicate wording. Lines 244-246 say "NEVER mention, quote, or refer to UUIDs, offload tags, or internal metadata" and lines 268-269 repeat "NEVER mention, quote, or refer to UUIDs, offload tags, or internal metadata". Consider extracting this as a constant to avoid duplication and ensure consistency.
Codecov Report❌ Patch coverage is 📢 Thoughts on this report? Let us know! |
This pull request introduces several improvements and refinements to the AutoContext memory system and its integration in the advanced example. The main changes include enhanced configuration options for compression, better handling and annotation of compressed messages, and improved logic for when and how message compression is applied. These updates aim to make the memory compression process more robust, configurable, and user-friendly.
AutoContext Memory Compression Improvements:
minCompressionTokenThresholdtoAutoContextConfig, allowing compression to be skipped if the token count is below a certain threshold (default: 5000 tokens). This helps avoid unnecessary compression overhead for small tool invocation batches. [1] [2] [3] [4] [5]CONTEXT_OFFLOADformat, and improved how these hints are appended to summary messages. [1] [2] [3]User/System Message Handling:
CONTEXT_OFFLOADUUIDs, regardless of whether compression occurred in the current round. This ensures the model has the necessary context for handling compressed messages.Advanced Example Enhancements:
AutoMemoryExample, providing better integration with the Studio dashboard for monitoring and debugging. [1] [2]tokenRatioparameter inAutoMemoryExampleto 0.3, making compression more aggressive and potentially reducing memory usage.These changes collectively enhance the flexibility, reliability, and transparency of the AutoContext memory compression system.Change-Id: Ibc1fc1c1dde5ef316d0ec1ee46e2959ae39aea23
AgentScope-Java Version
[The version of AgentScope-Java you are working on, e.g. 1.0.6, check your pom.xml dependency version or run
mvn dependency:tree | grep agentscope-parent:pom(only mac/linux)]Description
[Please describe the background, purpose, changes made, and how to test this PR]
Checklist
Please check the following items before code is ready to be reviewed.
mvn spotless:applymvn test)