Cleanup #191

Merged
m1rl0k merged 2 commits into test from cleanup on Jan 20, 2026
@m1rl0k (Collaborator) commented Jan 20, 2026

No description provided.

Deleted the _cleanup_answer and _answer_style_guidance functions from context_answer.py, consolidating answer cleanup and style guidance logic elsewhere or removing unused code.
Adds mapping from symbol_path to point_id for more accurate graph edge extraction and linking in the ingest pipeline. Improves edge upsert logic to include caller_point_id, updates backfill to avoid edge deletion, and adds dynamic backend detection for edge upserts. Also improves symbol matching in Neo4j queries to support suffix matches for fully-qualified names and adds throttling for PageRank computation.
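The suffix matching mentioned above can be illustrated in plain Python. This is only a sketch of the idea, not the PR's Neo4j query: the function name `matches_symbol`, the separator set, and the assumption that the stored name is the fully-qualified one are all hypothetical. In Cypher, a check like this is usually written with the `ENDS WITH` operator.

```python
from typing import Iterable


def matches_symbol(stored_name: str, query: str,
                   separators: Iterable[str] = (".", "::")) -> bool:
    """Return True on an exact match, or when stored_name ends with
    a separator followed by query (hypothetical helper)."""
    if stored_name == query:
        return True
    # Require a separator before the suffix so "foobar" does not match "bar".
    return any(stored_name.endswith(sep + query) for sep in separators)
```

Requiring the separator in front of the suffix is the design point: a bare `endswith(query)` would also match unrelated symbols that merely share trailing characters.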
@m1rl0k m1rl0k merged commit b19bb6a into test Jan 20, 2026
1 check passed
augmentcode bot commented Jan 20, 2026

🤖 Augment PR Summary

Summary: This PR streamlines graph ingestion/query behavior and removes some expensive default work during indexing.

Changes:

  • Neo4j backend: throttles/guards PageRank recomputation behind env flags (off by default) and adds a per-collection run interval with locking.
  • Neo4j backend: uses asyncio.get_running_loop() (with a defensive fallback) when executing sync queries in an executor.
  • Neo4j backend: ensures auto-backfill checks still run for new collections even if the hint DB was already initialized.
  • Neo4j backend: broadens symbol matching in callers/callees/importers queries to include suffix matches for fully-qualified names.
  • Ingest pipeline: tracks symbol_path -> point_id and passes caller_point_id into call/import edge extraction so edges can link back to source chunks.
  • Smart reindexing: computes point IDs earlier to reuse them consistently across reused/new points and edge extraction.
  • Graph backfill: switches to upsert-only during backfill (no per-path edge deletion) to avoid accidental data loss across multiple chunks per file.
  • Graph backfill: adds runtime backend detection and adapts edge batches to Neo4j’s GraphEdge format when enabled.
  • Context answering: removes redundant answer-cleanup/style helper code to simplify the module.
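The ingest-pipeline change in the list above (tracking symbol_path -> point_id and passing caller_point_id into edge extraction) might look roughly like the following. This is a minimal sketch under stated assumptions: the `Chunk` and `Edge` types and both function names are hypothetical and are not taken from the repository.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Chunk:
    """Hypothetical indexed chunk: one symbol plus the names it calls."""
    point_id: str
    symbol_path: str
    calls: List[str] = field(default_factory=list)


@dataclass
class Edge:
    """Hypothetical call edge that can link back to its source chunk."""
    caller: str
    callee: str
    caller_point_id: Optional[str]


def build_symbol_index(chunks: List[Chunk]) -> Dict[str, str]:
    # Map each non-empty symbol_path to the point_id of its chunk.
    return {c.symbol_path: c.point_id for c in chunks if c.symbol_path}


def extract_call_edges(chunks: List[Chunk]) -> List[Edge]:
    index = build_symbol_index(chunks)
    edges: List[Edge] = []
    for chunk in chunks:
        for callee in chunk.calls:
            # caller_point_id is None when the caller has no indexed symbol.
            edges.append(Edge(chunk.symbol_path, callee,
                              index.get(chunk.symbol_path)))
    return edges
```

Building the index once before extraction means every edge from the same symbol carries the same point ID, which is presumably why the summary also mentions computing point IDs earlier during smart reindexing.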

Technical Notes: PageRank refresh can be enabled via NEO4J_PAGERANK_ON_UPSERT=1, with optional throttling controlled by NEO4J_PAGERANK_ON_UPSERT_MIN_INTERVAL.
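One way to picture the opt-in flag combined with a per-collection interval and a lock is the sketch below. It is an assumption-laden illustration, not the PR's code: `should_run_pagerank`, the module-level `_last_run` dictionary, and the use of `threading.Lock` are hypothetical. Only the environment variable name and the truthy values come from the snippet under review.

```python
import os
import threading
import time
from typing import Dict, Optional

_TRUTHY = {"1", "true", "yes", "on"}
_last_run: Dict[str, float] = {}   # collection -> time of last PageRank run
_lock = threading.Lock()           # guards _last_run across threads


def pagerank_enabled() -> bool:
    # Off by default: the flag must be explicitly set to a truthy value.
    return os.environ.get("NEO4J_PAGERANK_ON_UPSERT", "").strip().lower() in _TRUTHY


def should_run_pagerank(collection: str, min_interval: float,
                        now: Optional[float] = None) -> bool:
    """Return True at most once per min_interval seconds per collection."""
    if not pagerank_enabled():
        return False
    now = time.monotonic() if now is None else now
    with _lock:
        last = _last_run.get(collection)
        if last is not None and now - last < min_interval:
            return False
        # Record the run before releasing the lock so concurrent callers skip.
        _last_run[collection] = now
        return True
```

The `now` parameter exists only to make the throttle testable; real callers would omit it and rely on `time.monotonic()`.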


@augmentcode augmentcode bot left a comment


Review completed. 1 suggestion posted.


Comment "augment review" to trigger a new review at any time.

# or enable `NEO4J_PAGERANK_ON_UPSERT=1` for periodic refreshes.
if total > 0 and os.environ.get("NEO4J_PAGERANK_ON_UPSERT", "").strip().lower() in {"1", "true", "yes", "on"}:
    import time as _time
    min_interval = float(os.environ.get("NEO4J_PAGERANK_ON_UPSERT_MIN_INTERVAL", "300") or 300)

min_interval = float(os.environ.get(...)) will raise ValueError if NEO4J_PAGERANK_ON_UPSERT_MIN_INTERVAL is set to a non-numeric string, which would abort edge upserts; consider guarding the parse and falling back to a default.
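A guarded parse along the lines suggested here could look like this. It is a sketch rather than the fix that was applied: the helper name `read_min_interval` is hypothetical, and treating negative, NaN, or infinite values as invalid is an extra assumption beyond what the comment asks for.

```python
import math
import os

DEFAULT_MIN_INTERVAL = 300.0  # seconds; same default as the reviewed line


def read_min_interval(default: float = DEFAULT_MIN_INTERVAL) -> float:
    """Parse NEO4J_PAGERANK_ON_UPSERT_MIN_INTERVAL without ever raising."""
    raw = (os.environ.get("NEO4J_PAGERANK_ON_UPSERT_MIN_INTERVAL") or "").strip()
    if not raw:
        return default
    try:
        value = float(raw)
    except ValueError:
        # Non-numeric input such as "5m" falls back instead of aborting upserts.
        return default
    # Assumption: negative, NaN, and infinite intervals are also invalid.
    if not math.isfinite(value) or value < 0:
        return default
    return value
```

With this helper, a misconfigured interval degrades to the default throttle instead of raising inside the edge upsert path.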
