Generalize toolkit from ARDA-specific to general-purpose #1

Merged
MSevey merged 1 commit into main from feat/generalize-toolkit-remove-arda-specifics
Jan 30, 2026

MSevey commented Jan 30, 2026

Summary

Transforms the code-ingest toolkit from an ARDA-opinionated tool into a general-purpose, config-driven code ingestion and search system. All hard-coded ARDA references have been replaced with config-driven or generic patterns.

Key Changes

📚 Documentation & Examples

  • Updated README to focus on generic "configure repos → vectorize → run MCP" workflow
  • Replaced all ARDA project references with generic placeholders (my-backend, myproject_code_rust, your-org--embed.modal.run)
  • Added prominent customization comments to config files

⚙️ Configuration

  • config/collections.yaml: Added guidance to customize prefix and collection names
  • config/repositories.yaml: Marked as example schema to replace with your repos
  • Collection names now fully config-driven with generic fallbacks
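As a rough illustration of config-driven naming with generic fallbacks, the sketch below resolves a collection name from an already-parsed config dict. Only `collection_prefix` and the fallback names (`code_rust`, `frontend`, `api_contracts`) come from this PR; the function name `resolve_collection_name`, the `collections` key, the lookup keys, and the underscore-joining rule are assumptions made for the example, not the toolkit's actual schema.

```python
from typing import Mapping, Optional

# Generic fallback names mentioned in this PR; the lookup keys are hypothetical.
FALLBACK_COLLECTIONS = {
    "rust": "code_rust",
    "frontend": "frontend",
    "api_contracts": "api_contracts",
}


def resolve_collection_name(key: str, config: Optional[Mapping] = None) -> str:
    """Return the collection name for `key`, preferring config over fallbacks."""
    config = config or {}
    prefix = config.get("collection_prefix", "")
    # Hypothetical "collections" mapping that overrides individual names.
    configured = config.get("collections", {}).get(key)
    name = configured or FALLBACK_COLLECTIONS[key]
    # Assumed rule: a non-empty prefix is joined with an underscore.
    return f"{prefix}_{name}" if prefix else name
```

With no config the helper returns the bare fallback (`code_rust`); with `{"collection_prefix": "myproject"}` it returns `myproject_code_rust`, matching the placeholder style used in the updated docs.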

🔄 Ingestion Pipeline

  • Generic fallback collection names: code_rust, frontend, api_contracts (no arda_ prefix)
  • Config-driven helpers: determine_service_collection() and determine_concern_collections() now accept config parameters
  • Removed path branches on arda-credit-app; now use generic patterns (/api/, /frontend/)
  • Dependency analyzer: _is_arda_package() → _is_internal_package() (checks against all configured repos)

🔌 MCP Server

  • Docstrings updated: "Code Ingestion MCP" instead of "Arda Vector Database"
  • Query router: generic service patterns instead of hard-coded ARDA repo list
  • Domain tools: all collections loaded dynamically from _collections_config
  • GitHub utils: generalized to support REPO_URL_1, REPO_URL_2, etc.
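The numbered `REPO_URL_1`, `REPO_URL_2`, ... variables could be collected along these lines. This is a minimal sketch: the helper name `load_repo_urls`, the optional `env` parameter, and the choice to stop at the first missing index are assumptions, not the behavior of the toolkit's GitHub utils.

```python
import os
from typing import List, Mapping, Optional


def load_repo_urls(env: Optional[Mapping[str, str]] = None) -> List[str]:
    """Collect REPO_URL_1, REPO_URL_2, ... until the first missing index."""
    env = os.environ if env is None else env
    urls = []
    index = 1
    while True:
        value = env.get(f"REPO_URL_{index}")
        if not value:
            # Assumed policy: a gap in the numbering ends the scan.
            break
        urls.append(value)
        index += 1
    return urls
```

Passing a plain dict as `env` keeps the helper easy to test without touching the real process environment.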

Result

The toolkit is now a general-purpose solution that:

  • ✅ Works with any GitHub organization's repositories
  • ✅ Controls all naming through config/collections.yaml and config/repositories.yaml
  • ✅ Falls back to generic names when config is missing
  • ✅ Exposes clear customization points with examples
  • ✅ Maintains backward compatibility

Files Changed

44 files across documentation, configuration, ingestion pipeline, and MCP server.

Test Plan

  • Verify make health works with existing ARDA config
  • Test with modified collection_prefix in config/collections.yaml
  • Test with minimal config/repositories.yaml (single repo)
  • Verify MCP server starts and searches work
  • Check that documentation is clear for new users

Transforms code-ingest from an ARDA-opinionated tool to a generic,
config-driven code ingestion and search system. All hard-coded ARDA
references replaced with config-driven or generic patterns.

Documentation:
- README: generic use case focus, removed ARDA project references
- All docs: replaced arda-credit, arda_code_rust with {prefix}/generic examples
- Config files: added prominent customization comments

Ingestion Pipeline:
- Fallback collections: generic names (code_rust, frontend, api_contracts)
- Fallback repos: single generic example with warning
- Config helpers: now accept config dict parameters for dynamic resolution
- Services: removed arda-credit-app path branches, use generic patterns
- Parsers: generic component detection, removed repo-specific logic
- Dependency analyzer: _is_arda_package → _is_internal_package

MCP Server:
- Docstrings: "Code Ingestion MCP" instead of "Arda Vector Database"
- Query router: generic service patterns, no hard-coded arda- repos
- Domain tools: all collections loaded from _collections_config dynamically
- GitHub utils: generalized to REPO_URL_1..N environment variables
- Resource URIs: confirmed vector:// throughout (arda:// removed from docs)

All collection and repository names now flow from config/collections.yaml
and config/repositories.yaml, with generic fallbacks when YAML is missing.
MSevey merged commit 831fca6 into main Jan 30, 2026
MSevey deleted the feat/generalize-toolkit-remove-arda-specifics branch January 30, 2026 18:39