@JSv4 (Collaborator) commented Oct 20, 2025

Summary

Fully resolves #518 - Complete implementation of modular, customizable agent instructions with backend infrastructure, GraphQL API, and frontend UI.

Problem

When a corpus has no description BUT has documents with annotations/analyses, the corpus agent gives unhelpful responses like "The corpus description is currently empty." This wastes the agent's potential to actually examine and summarize the available documents.

Complete Solution

1. Backend Infrastructure ✅

New Corpus Model Fields:

  • corpus_agent_instructions - Custom system prompt for corpus-level agents
  • document_agent_instructions - Custom system prompt for document-level agents
  • Both fields are optional and fall back to Django settings defaults

Django Settings for Defaults:

  • DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS - Improved instructions emphasizing tool usage and source citation
  • DEFAULT_CORPUS_AGENT_INSTRUCTIONS - Smarter instructions that explicitly handle empty corpus descriptions:
    • Tells agent to use list_documents() when corpus description is empty
    • Instructs agent to proactively examine documents rather than just complaining
    • Emphasizes being helpful and using available tools

Agent Factory Integration:

  • CoreDocumentAgentFactory.get_default_system_prompt() checks corpus.document_agent_instructions first, falls back to settings
  • CoreCorpusAgentFactory.get_default_system_prompt() checks corpus.corpus_agent_instructions first, falls back to settings
  • Fully backward compatible - existing corpuses use improved defaults automatically
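The fallback described above can be sketched roughly as follows. This is a minimal, Django-free illustration, not the code from core_agents.py: the Corpus dataclass, the two resolve_* helper names, and the placeholder default strings are assumptions made for the example, and treating an empty string the same as an unset field is also an assumption.

```python
from dataclasses import dataclass
from typing import Optional

# Stand-ins for the Django settings named in this PR; the text is a placeholder.
DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS = "Default document agent instructions (placeholder)."
DEFAULT_CORPUS_AGENT_INSTRUCTIONS = "Default corpus agent instructions (placeholder)."


@dataclass
class Corpus:
    """Hypothetical stand-in for the Corpus model, reduced to the two new fields."""

    corpus_agent_instructions: Optional[str] = None
    document_agent_instructions: Optional[str] = None


def resolve_corpus_prompt(corpus: Optional[Corpus]) -> str:
    """Prefer the corpus-level custom prompt; otherwise use the default."""
    if corpus is not None and corpus.corpus_agent_instructions:
        return corpus.corpus_agent_instructions
    return DEFAULT_CORPUS_AGENT_INSTRUCTIONS


def resolve_document_prompt(corpus: Optional[Corpus]) -> str:
    """Prefer the corpus's document-agent prompt; otherwise use the default."""
    if corpus is not None and corpus.document_agent_instructions:
        return corpus.document_agent_instructions
    return DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS
```

Under these assumptions, a corpus with neither field set resolves to the defaults, which is what keeps existing corpuses working without a data migration.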

2. GraphQL API ✅

Updated UpdateCorpusMutation:

  • Added corpusAgentInstructions argument (optional String)
  • Added documentAgentInstructions argument (optional String)
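For illustration, a client could build a request for the two new optional arguments along these lines. Only corpusAgentInstructions and documentAgentInstructions come from this PR; the operation name, the id argument and its type, the selected ok/message fields, and the build_update_corpus_payload helper are assumptions for the sketch and may not match the real schema.

```python
import json
from typing import Any, Dict, Optional

# Assumed mutation document. Only the two instruction arguments are taken from
# the PR; everything else here is a guess at the surrounding schema.
UPDATE_CORPUS = """
mutation UpdateCorpus(
  $id: String!
  $corpusAgentInstructions: String
  $documentAgentInstructions: String
) {
  updateCorpus(
    id: $id
    corpusAgentInstructions: $corpusAgentInstructions
    documentAgentInstructions: $documentAgentInstructions
  ) {
    ok
    message
  }
}
"""


def build_update_corpus_payload(
    corpus_id: str,
    corpus_agent_instructions: Optional[str] = None,
    document_agent_instructions: Optional[str] = None,
) -> str:
    """Serialize a GraphQL request body, omitting instruction fields left as None."""
    variables: Dict[str, Any] = {"id": corpus_id}
    if corpus_agent_instructions is not None:
        variables["corpusAgentInstructions"] = corpus_agent_instructions
    if document_agent_instructions is not None:
        variables["documentAgentInstructions"] = document_agent_instructions
    return json.dumps({"query": UPDATE_CORPUS, "variables": variables})
```

Omitting unset variables, rather than sending explicit nulls, is one way to update a single instruction field without touching the other; whether the server distinguishes the two cases is not stated in this PR.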

Updated CorpusSerializer:

  • Added both fields to serializable fields list

Updated CorpusType:

  • Fields automatically exposed via DjangoObjectType

3. Frontend UI ✅

New CorpusAgentSettings Component:

  • Professional, user-friendly interface for editing agent instructions
  • Two text areas with monospace font for better readability
  • Helpful descriptions explaining what each instruction type controls
  • Change tracking - save/reset buttons only appear when changes are made
  • Permission-aware - only shows edit UI if user has update permissions
  • Integrated into CorpusSettings page as a new "Agent Instructions" section

Updated GraphQL Mutations:

  • UPDATE_CORPUS mutation accepts new optional fields
  • UpdateCorpusInputs TypeScript interface updated

Changes Made

Modified Files:

  • opencontractserver/corpuses/models.py - Added instruction fields
  • opencontractserver/llms/agents/core_agents.py - Updated factories to use custom/default instructions
  • config/settings/base.py - Added default instruction settings
  • config/graphql/mutations.py - Added mutation arguments
  • config/graphql/serializers.py - Added fields to serializer
  • frontend/src/graphql/mutations.ts - Updated mutation and types
  • frontend/src/components/corpuses/CorpusSettings.tsx - Integrated new component

New Files:

  • opencontractserver/corpuses/migrations/0022_add_agent_instructions.py - Database migration
  • frontend/src/components/corpuses/CorpusAgentSettings.tsx - UI component

Test Plan

  • Pre-commit hooks pass (Python + frontend)
  • TypeScript compilation passes
  • Migration created successfully
  • Agent factories check corpus fields first, fall back to settings
  • GraphQL mutation accepts new fields
  • Frontend UI renders and tracks changes correctly

Impact

  • Immediate improvement: Corpus agents are now smarter about handling empty corpus descriptions
  • Customization: Corpus owners can fully customize agent behavior per corpus
  • Backward compatible: Existing corpuses automatically use improved defaults
  • Professional UI: Easy-to-use interface for editing instructions
  • Full stack: Complete implementation from database to UI

JSv4 added 4 commits October 19, 2025 23:37
Resolves core backend requirements for #518

Changes:
1. Added corpus_agent_instructions and document_agent_instructions fields to Corpus model
   - Allows per-corpus customization of agent behavior
   - Falls back to settings defaults if not set

2. Created Django migration (0022_add_agent_instructions)

3. Added Django settings for default instructions
   - DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS - improved instructions that emphasize using tools
   - DEFAULT_CORPUS_AGENT_INSTRUCTIONS - smarter instructions that handle empty corpus descriptions

4. Updated agent factories to use custom/default instructions
   - CoreDocumentAgentFactory.get_default_system_prompt now checks corpus.document_agent_instructions
   - CoreCorpusAgentFactory.get_default_system_prompt now checks corpus.corpus_agent_instructions
   - Falls back to settings defaults if custom instructions not provided

Impact:
- Improves corpus agent responses when corpus has no description but has documents
- Allows corpus owners to customize agent behavior per corpus
- Maintains backward compatibility with existing corpuses

Remaining work for #518:
- GraphQL mutations/queries for editing instructions via frontend
- Frontend UI for corpus owners to edit agent instructions
- Integration tests

This commit completes #518 by adding full GraphQL and frontend support for editing
agent instructions per corpus.

Backend (GraphQL):
1. Updated UpdateCorpusMutation to accept corpusAgentInstructions and documentAgentInstructions
2. Updated CorpusSerializer to include new fields in serialization
3. CorpusType automatically exposes new model fields via DjangoObjectType

Frontend:
1. Updated UPDATE_CORPUS GraphQL mutation with new fields
2. Updated UpdateCorpusInputs interface to include new optional fields
3. Created CorpusAgentSettings component:
   - Displays two text areas for editing corpus and document agent instructions
   - Shows helpful descriptions for each instruction type
   - Tracks changes and enables save/reset buttons
   - Properly handles permissions (only corpus owners/editors can update)
   - Uses monospace font for better readability of instructions
4. Integrated CorpusAgentSettings into CorpusSettings page as new section

Complete Solution:
- Backend: Model fields + Django settings defaults ✅
- Backend: Agent factories use custom/default instructions ✅
- Backend: GraphQL mutations and queries ✅
- Frontend: UI for editing instructions ✅
- Tests: TypeScript compilation passes ✅
- Tests: Pre-commit hooks pass ✅

Impact:
Corpus owners can now customize how AI agents behave when analyzing their corpus and
documents. The improved default instructions make agents smarter about handling
corpuses with empty descriptions but available documents.

Updated DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS to use the original detailed
instructions verbatim with visual separators, comprehensive search strategy,
and explicit tool selection guidance.

This provides more comprehensive guidance to document agents with:
- Visual separators (━━━) for better readability
- Detailed 5-step search strategy
- Explicit tool selection guide
- Comprehensive response requirements
- Critical emphasis on citation protocol

The get_default_system_prompt method signature was updated to accept both
document and corpus parameters, but the test was still checking for only
the document parameter. Updated the test assertion to match the new signature.
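
The kind of assertion mismatch described in this commit can be reproduced with unittest.mock. This is a sketch, not the project's test: the factory mock and the placeholder document and corpus objects are hypothetical, and passing the arguments positionally is an assumption about the real call site.

```python
from unittest.mock import MagicMock

factory = MagicMock()  # hypothetical stand-in for the patched agent factory
document, corpus = object(), object()  # placeholder objects, not real models

# The code under test now passes both arguments.
factory.get_default_system_prompt(document, corpus)

# An assertion written for the old, document-only signature no longer matches.
try:
    factory.get_default_system_prompt.assert_called_once_with(document)
    old_assertion_passed = True
except AssertionError:
    old_assertion_passed = False

# The updated assertion mirrors the new call and passes.
factory.get_default_system_prompt.assert_called_once_with(document, corpus)
```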

codecov bot commented Oct 20, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.


JSv4 added 2 commits October 22, 2025 01:55
Adds four new test cases to achieve complete coverage of the agent instruction
selection logic in CoreDocumentAgentFactory and CoreCorpusAgentFactory:

1. test_document_agent_uses_default_instructions_when_corpus_has_none - Tests
   fallback to DEFAULT_DOCUMENT_AGENT_INSTRUCTIONS when corpus has no custom
   document_agent_instructions

2. test_corpus_agent_uses_default_instructions_when_corpus_has_none - Tests
   fallback to DEFAULT_CORPUS_AGENT_INSTRUCTIONS when corpus has no custom
   corpus_agent_instructions

3. test_document_agent_uses_custom_instructions_when_corpus_has_them - Tests
   using custom document_agent_instructions from corpus when available

4. test_corpus_agent_uses_custom_instructions_when_corpus_has_them - Tests
   using custom corpus_agent_instructions from corpus when available

These tests ensure both branches of the if/else blocks are covered.
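
A reduced version of such a pair of tests might look like the following. It is a self-contained sketch: get_corpus_prompt, the placeholder default string, and the SimpleNamespace corpus stand-in are assumptions, and the real tests exercise CoreCorpusAgentFactory and CoreDocumentAgentFactory instead.

```python
import unittest
from types import SimpleNamespace

# Placeholder for the Django setting of the same name.
DEFAULT_CORPUS_AGENT_INSTRUCTIONS = "default corpus instructions (placeholder)"


def get_corpus_prompt(corpus) -> str:
    """Hypothetical helper with the same if/else shape as the factory logic."""
    custom = getattr(corpus, "corpus_agent_instructions", None)
    if custom:
        return custom
    return DEFAULT_CORPUS_AGENT_INSTRUCTIONS


class CorpusPromptSelectionTests(unittest.TestCase):
    def test_uses_default_when_corpus_has_none(self):
        # Covers the fallback branch.
        corpus = SimpleNamespace(corpus_agent_instructions=None)
        self.assertEqual(get_corpus_prompt(corpus), DEFAULT_CORPUS_AGENT_INSTRUCTIONS)

    def test_uses_custom_when_corpus_has_them(self):
        # Covers the custom-instructions branch.
        corpus = SimpleNamespace(corpus_agent_instructions="Answer only from cited clauses.")
        self.assertEqual(get_corpus_prompt(corpus), "Answer only from cited clauses.")
```

The same two-case pattern would be repeated for the document agent, giving the four tests listed above.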

Resolves missing coverage identified in PR #521.

Use document_id instead of passing model instance to database_sync_to_async
to avoid query failures when crossing async/sync thread boundaries.

@JSv4 JSv4 merged commit 4393ff7 into main Oct 23, 2025
11 checks passed
@JSv4 JSv4 deleted the feature/issue-518-modular-agent-instructions branch October 23, 2025 02:42

Development

Successfully merging this pull request may close these issues.

[Feature] Improve Corpus Agent Instruction and Make Modular.
