Conversation
Remove embedded PostgreSQL (pg0) as a hard dependency since this fork targets external PostgreSQL with vchord_bm25. The pg0 import is now conditional with a clear RuntimeError guiding users to set HINDSIGHT_API_DATABASE_URL. Fix CREATE EXTENSION error handling in migrations: after a failed CREATE EXTENSION, PostgreSQL aborts the transaction, causing subsequent queries in the except handler to fail with InFailedSqlTransaction. Use SAVEPOINTs to isolate the extension creation so the transaction remains usable in the fallback path. Also remove the .pg0 data directory creation from the Dockerfile since embedded PostgreSQL is no longer used in container builds. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughThis PR makes the pg0-embedded dependency optional by removing it from project dependencies, updating Docker configuration to not create pg0-specific directories, and adding graceful fallback handling for the pg0 module import with runtime error messaging. Changes
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
hindsight-api/hindsight_api/alembic/versions/n9i0j1k2l3m4_learnings_and_pinned_reflections.py (1)
105-114: Same correct pattern forpg_textsearch.Consistent with the
vchord_bm25branch and the initial_schema migration. If you extract the helper suggested in the initial_schema review, this code can also be simplified.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hindsight-api/hindsight_api/alembic/versions/n9i0j1k2l3m4_learnings_and_pinned_reflections.py` around lines 105 - 114, The SAVEPOINT/release/rollback pattern around creating the pg_textsearch extension should match the vchord_bm25/initial_schema migrations: wrap op.execute("CREATE EXTENSION IF NOT EXISTS pg_textsearch CASCADE") with conn.execute(text("SAVEPOINT create_ext_pg_textsearch")) and conn.execute(text("RELEASE SAVEPOINT create_ext_pg_textsearch")) in try, and on Exception ROLLBACK TO SAVEPOINT then query pg_extension (via conn.execute(text("SELECT 1 FROM pg_extension WHERE extname = 'pg_textsearch'")).fetchone()) and re-raise only if the extension still isn't present; optionally refactor this logic into the shared helper used by initial_schema to reduce duplication (referencing conn, op, and the savepoint name create_ext_pg_textsearch).
🧹 Nitpick comments (4)
hindsight-api/hindsight_api/alembic/versions/n9i0j1k2l3m4_learnings_and_pinned_reflections.py (1)
82-121:_detect_text_search_extension()is duplicated verbatim across migration files.Both this file and
5a366d414dce_initial_schema.pycontain identical copies of_detect_text_search_extension()and_detect_vector_extension(). While Alembic migrations are typically self-contained, these helpers are pure detection logic with no schema-version-specific behavior — they'd be good candidates for a shared module (e.g., amigration_utils.pyalongside the versions directory). This would also make the savepoint helper extraction from the other review comment more impactful.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hindsight-api/hindsight_api/alembic/versions/n9i0j1k2l3m4_learnings_and_pinned_reflections.py` around lines 82 - 121, The duplicated pure helpers _detect_text_search_extension and _detect_vector_extension should be extracted into a shared migration utility module (e.g., migration_utils.py placed alongside the Alembic versions) and both migration scripts should import and call those functions instead of containing verbatim copies; create migration_utils.py with the detection logic (including the savepoint/create-extension sequences), update the migration files to remove the local definitions and import _detect_text_search_extension and _detect_vector_extension, and ensure the migrations still call the imported functions where they previously referenced the local ones.hindsight-api/hindsight_api/alembic/versions/5a366d414dce_initial_schema.py (2)
126-135: Savepoint for vector extension fallback looks correct.Same pattern as the text-search extensions — properly guards the fallback
CREATE EXTENSION IF NOT EXISTS vectorso the transaction survives a permission-denied error when the extension is already installed.One nit: the bare
except Exceptionsilently swallows the original error when the extension does already exist. Consider logging the caught exception atDEBUGlevel so operators can diagnose unexpected failures during migrations, rather than only discovering the swallow via savepoint rollback.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hindsight-api/hindsight_api/alembic/versions/5a366d414dce_initial_schema.py` around lines 126 - 135, In the except Exception block guarding the vector extension creation, capture the exception (e.g., except Exception as e) and log it at DEBUG before (or immediately after) performing the SAVEPOINT rollback so operators can diagnose failures; use a module logger (logging.getLogger(__name__) or an existing logger) and include exception details (exc_info=True or similar) in the log call, referencing the existing symbols conn, op.execute("CREATE EXTENSION IF NOT EXISTS vector"), and the savepoint/rollback logic to locate where to add the debug logging.
85-94: Consider extracting the repeated savepoint-guarded extension creation into a helper.The same SAVEPOINT → CREATE EXTENSION → RELEASE / ROLLBACK + existence-check pattern appears three times in this file and is duplicated again in
n9i0j1k2l3m4_learnings_and_pinned_reflections.py. A small helper would reduce duplication and make the intent clearer.♻️ Example helper extraction
def _create_extension_safe(ext_name: str, savepoint_name: str | None = None) -> None: """Create a PostgreSQL extension within a savepoint, tolerating permission errors if already installed.""" sp = savepoint_name or f"create_ext_{ext_name}" conn = op.get_bind() try: conn.execute(text(f"SAVEPOINT {sp}")) op.execute(f"CREATE EXTENSION IF NOT EXISTS {ext_name} CASCADE") conn.execute(text(f"RELEASE SAVEPOINT {sp}")) except Exception: conn.execute(text(f"ROLLBACK TO SAVEPOINT {sp}")) result = conn.execute( text("SELECT 1 FROM pg_extension WHERE extname = :name"), {"name": ext_name}, ).fetchone() if not result: raiseThis can then be shared across migration files (e.g., placed in a common utils module alongside
_get_schema_prefix).Also applies to: 98-107, 126-135
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hindsight-api/hindsight_api/alembic/versions/5a366d414dce_initial_schema.py` around lines 85 - 94, Extract the repeated SAVEPOINT→CREATE EXTENSION→RELEASE/ROLLBACK pattern into a reusable helper (e.g., _create_extension_safe) that encapsulates op.get_bind(), the SAVEPOINT/RELEASE/ROLLBACK logic, the CREATE EXTENSION IF NOT EXISTS call, and the post-error pg_extension existence check; replace each duplicated block in the migration modules (the initial_schema migration and the n9i0j1k2l3m4_learnings_and_pinned_reflections migration) with calls to this helper, and place the helper in a shared utils module (or a small local helper at top of the migration files) so the migrations call _create_extension_safe(ext_name) instead of repeating the block.hindsight-api/hindsight_api/pg0.py (1)
36-40: Clear guard with actionable error message.The guard correctly prevents any
Pg0(...)call when the module is absent. The error message directs users to theHINDSIGHT_API_DATABASE_URLalternative, which aligns with the PR objective.One minor observation: when
Pg0 is None, the type annotation on Line 33 (self._pg0: Pg0 | None) degrades toNoneType | None, and the return annotation on Line 35 (-> Pg0) becomes-> None. This is semantically consistent (the function raises before returning), so it won't cause runtime issues — but static type checkers like mypy may flag it. Consider guarding the annotations withTYPE_CHECKING:💡 Optional: type-checking-safe annotation
+from __future__ import annotations +from typing import TYPE_CHECKING + +if TYPE_CHECKING: + from pg0 import Pg0 as Pg0Type +else: + Pg0Type = None + try: from pg0 import Pg0 except ImportError: Pg0 = NoneThen use
Pg0Typein annotations andPg0for runtime checks. This is only relevant if you run mypy/pyright on this codebase.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@hindsight-api/hindsight_api/pg0.py` around lines 36 - 40, Wrap runtime import/usage of Pg0 with a TYPE_CHECKING-safe annotation alias so static type checkers don't see Pg0 as None: import TYPE_CHECKING from typing, and inside "if TYPE_CHECKING:" import Pg0 (e.g. "from pg0_embedded import Pg0 as Pg0Type") or declare a protocol/alias, then use Pg0Type in annotations (replace self._pg0: Pg0 | None and the method return annotation -> Pg0 with Pg0Type | None and -> Pg0Type). Keep the runtime guard that checks "if Pg0 is None: raise RuntimeError(...)" and continue to use the runtime Pg0 identifier for construction/usage while annotations reference Pg0Type.
ℹ️ Review info
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (5)
docker/standalone/Dockerfilehindsight-api/hindsight_api/alembic/versions/5a366d414dce_initial_schema.pyhindsight-api/hindsight_api/alembic/versions/n9i0j1k2l3m4_learnings_and_pinned_reflections.pyhindsight-api/hindsight_api/pg0.pyhindsight-api/pyproject.toml
💤 Files with no reviewable changes (2)
- docker/standalone/Dockerfile
- hindsight-api/pyproject.toml
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In
`@hindsight-api/hindsight_api/alembic/versions/n9i0j1k2l3m4_learnings_and_pinned_reflections.py`:
- Around line 105-114: The SAVEPOINT/release/rollback pattern around creating
the pg_textsearch extension should match the vchord_bm25/initial_schema
migrations: wrap op.execute("CREATE EXTENSION IF NOT EXISTS pg_textsearch
CASCADE") with conn.execute(text("SAVEPOINT create_ext_pg_textsearch")) and
conn.execute(text("RELEASE SAVEPOINT create_ext_pg_textsearch")) in try, and on
Exception ROLLBACK TO SAVEPOINT then query pg_extension (via
conn.execute(text("SELECT 1 FROM pg_extension WHERE extname =
'pg_textsearch'")).fetchone()) and re-raise only if the extension still isn't
present; optionally refactor this logic into the shared helper used by
initial_schema to reduce duplication (referencing conn, op, and the savepoint
name create_ext_pg_textsearch).
---
Nitpick comments:
In `@hindsight-api/hindsight_api/alembic/versions/5a366d414dce_initial_schema.py`:
- Around line 126-135: In the except Exception block guarding the vector
extension creation, capture the exception (e.g., except Exception as e) and log
it at DEBUG before (or immediately after) performing the SAVEPOINT rollback so
operators can diagnose failures; use a module logger
(logging.getLogger(__name__) or an existing logger) and include exception
details (exc_info=True or similar) in the log call, referencing the existing
symbols conn, op.execute("CREATE EXTENSION IF NOT EXISTS vector"), and the
savepoint/rollback logic to locate where to add the debug logging.
- Around line 85-94: Extract the repeated SAVEPOINT→CREATE
EXTENSION→RELEASE/ROLLBACK pattern into a reusable helper (e.g.,
_create_extension_safe) that encapsulates op.get_bind(), the
SAVEPOINT/RELEASE/ROLLBACK logic, the CREATE EXTENSION IF NOT EXISTS call, and
the post-error pg_extension existence check; replace each duplicated block in
the migration modules (the initial_schema migration and the
n9i0j1k2l3m4_learnings_and_pinned_reflections migration) with calls to this
helper, and place the helper in a shared utils module (or a small local helper
at top of the migration files) so the migrations call
_create_extension_safe(ext_name) instead of repeating the block.
In
`@hindsight-api/hindsight_api/alembic/versions/n9i0j1k2l3m4_learnings_and_pinned_reflections.py`:
- Around line 82-121: The duplicated pure helpers _detect_text_search_extension
and _detect_vector_extension should be extracted into a shared migration utility
module (e.g., migration_utils.py placed alongside the Alembic versions) and both
migration scripts should import and call those functions instead of containing
verbatim copies; create migration_utils.py with the detection logic (including
the savepoint/create-extension sequences), update the migration files to remove
the local definitions and import _detect_text_search_extension and
_detect_vector_extension, and ensure the migrations still call the imported
functions where they previously referenced the local ones.
In `@hindsight-api/hindsight_api/pg0.py`:
- Around line 36-40: Wrap runtime import/usage of Pg0 with a TYPE_CHECKING-safe
annotation alias so static type checkers don't see Pg0 as None: import
TYPE_CHECKING from typing, and inside "if TYPE_CHECKING:" import Pg0 (e.g. "from
pg0_embedded import Pg0 as Pg0Type") or declare a protocol/alias, then use
Pg0Type in annotations (replace self._pg0: Pg0 | None and the method return
annotation -> Pg0 with Pg0Type | None and -> Pg0Type). Keep the runtime guard
that checks "if Pg0 is None: raise RuntimeError(...)" and continue to use the
runtime Pg0 identifier for construction/usage while annotations reference
Pg0Type.
Summary
pg0import is now conditional with a clearRuntimeErrorguiding users to setHINDSIGHT_API_DATABASE_URL.InFailedSqlTransactionbug in migrations — after a failedCREATE EXTENSION, PostgreSQL aborts the transaction, but theexcepthandler tried to query in the same aborted transaction. Now usesSAVEPOINT/ROLLBACK TO SAVEPOINTto keep the transaction alive for the fallback check..pg0data directory creation from Dockerfile — no longer needed since embedded PostgreSQL isn't used in container builds.Test plan
uv syncsucceeds without pg0-embedded as a direct dependencyparse_pg0_url()works without pg0 installedEmbeddedPostgres._get_pg0()raises clearRuntimeErrorwhen pg0 is missing./scripts/hooks/lint.shpassesdocker build --target standalone --build-arg INCLUDE_LOCAL_MODELS=false --build-arg PRELOAD_ML_MODELS=false -f docker/standalone/Dockerfile .🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Chores