Skip to content

Conversation

@abnegate
Copy link
Member

@abnegate abnegate commented Aug 15, 2025

Summary by CodeRabbit

  • New Features
    • Adds fixed-size vector data type, similarity queries (dot, cosine, euclidean), vector distance ordering, and HNSW vector index support when available.
  • Breaking Changes
    • Public API extended with new vector/spatial capability flags and constants; adapters/integrations must declare support.
  • Tests
    • Extensive new unit and end-to-end vector tests; spatial tests now exit cleanly when unsupported.
  • Chores
    • Postgres image now includes pgvector.

Co-authored-by: jakeb994 <jakeb994@gmail.com>
@coderabbitai
Copy link
Contributor

coderabbitai bot commented Aug 15, 2025

Note

Other AI code review bot(s) detected

CodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review.

Walkthrough

Adds first-class vector datatype and query support: Docker installs pgvector; core constants/types and vector query operators; Vector validator and index (HNSW) validation; adapter capability APIs and SQL/Postgres vector ordering; extensive unit and e2e tests and related refactors (SRID and index-length constants).

Changes

Cohort / File(s) Summary
Container: Postgres image
postgres.dockerfile
Installs postgresql-16-pgvector.
Adapter API & delegation
src/Database/Adapter.php, src/Database/Adapter/Pool.php
Adds getSupportForVectors(): bool to Adapter and Pool delegation.
SQL adapter scaffolding
src/Database/Adapter/SQL.php
Adds getSupportForVectors() (default false), getVectorDistanceOrder() scaffold, integrates vector ordering into find, rejects vector operators by default, and adjusts VAR_VECTOR width accounting.
Postgres adapter
src/Database/Adapter/Postgres.php
Enables pgvector extension; maps VECTOR(size); validates vector sizes; provides ORDER BY fragments for DOT/COSINE/EUCLIDEAN; supports HNSW/SPATIAL index variants; returns true for getSupportForVectors().
Core types & constants
src/Database/Database.php
Adds VAR_VECTOR, VAR_UUID, HNSW index constants, MAX_* constants, MAX_VECTOR_DIMENSIONS, MAX_ARRAY_INDEX_LENGTH, DEFAULT_SRID, EARTH_RADIUS; removes VAR_OBJECT_ID/SRID; integrates vector encoding/decoding and validation hooks.
Query API
src/Database/Query.php
Adds TYPE_VECTOR_DOT/COSINE/EUCLIDEAN, VECTOR_TYPES, and static builders vectorDot/vectorCosine/vectorEuclidean.
Validators: Vector & wiring
src/Database/Validator/Vector.php, src/Database/Validator/Structure.php, src/Database/Validator/Queries.php, src/Database/Validator/Query/Filter.php
New Vector validator (fixed-length numeric vectors); Structure registers it; query validators treat vector query types as filters; filter validator enforces attribute type and exact numeric size.
Index validator refactor
src/Database/Validator/Index.php
Constructor-promoted properties, adds supportForVectorIndexes, new public checks checkSpatialIndex, checkNonSpatialIndexOnSpatialAttribute, checkVectorIndex(), and integrates vector index validation.
Indexed queries validation
src/Database/Validator/IndexedQueries.php
Rejects multiple vector filters in a single request.
Adapter SRID / array index length changes
src/Database/Adapter/MariaDB.php, src/Database/Adapter/MySQL.php
Use MAX_ARRAY_INDEX_LENGTH for array index casts and Database::DEFAULT_SRID for spatial SRID usage.
Sequence / Structure tests adjustments
src/Database/Validator/Sequence.php, tests/unit/Validator/StructureTest.php
Replace VAR_OBJECT_ID with VAR_UUID and update numeric constant references.
Tests: e2e & unit additions
tests/e2e/Adapter/Base.php, tests/e2e/Adapter/Scopes/VectorTests.php, tests/unit/Validator/VectorTest.php, tests/unit/QueryTest.php, tests/e2e/Adapter/PostgresTest.php
Adds VectorTests trait and many e2e vector scenarios; unit tests for Vector validator and vector query builders; minor whitespace change in PostgresTest.
Tests: other updates
tests/e2e/Adapter/Scopes/SpatialTests.php, tests/e2e/Adapter/Scopes/IndexTests.php, tests/e2e/Adapter/Scopes/CollectionTests.php, tests/e2e/Adapter/Scopes/DocumentTests.php, tests/e2e/Adapter/Scopes/PermissionTests.php
Replace spatial test skips with expectNotToPerformAssertions() early returns; pass additional adapter capability flags to Index validator; update tests to new constants and VAR_UUID.

Sequence Diagram(s)

sequenceDiagram
  autonumber
  actor Client
  participant DB as Database
  participant SQL as Adapter\\SQL
  participant PG as Adapter\\Postgres
  participant PGX as PostgreSQL (pgvector)

  rect rgb(245,250,255)
    note over Client,DB: Create collection with vector attribute
    Client->>DB: createCollection(schema with VAR_VECTOR size=N)
    DB->>PG: getSupportForVectors()
    PG-->>DB: true
    DB->>PG: enable pgvector extension
    PG->>PGX: CREATE EXTENSION IF NOT EXISTS vector
    PGX-->>PG: ok / error
    DB->>PG: CREATE TABLE ... COLUMN VECTOR(N)
    PG-->>DB: ok
    DB-->>Client: collection created
  end
Loading
sequenceDiagram
  autonumber
  actor Client
  participant DB as Database
  participant SQL as Adapter\\SQL
  participant PG as Adapter\\Postgres
  participant PGX as PostgreSQL (pgvector)

  rect rgb(245,255,245)
    note over Client,DB: Vector similarity query flow
    Client->>DB: find([vectorCosine('emb', v)], limit)
    DB->>SQL: build SQL, extract vector queries
    SQL->>PG: getVectorDistanceOrder(query, binds, alias)
    PG-->>SQL: ORDER BY expression using pgvector ops
    SQL-->>DB: final SQL + binds
    DB->>PGX: SELECT ... ORDER BY vector distance
    PGX-->>DB: rows
    DB-->>Client: documents
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60–90 minutes

Possibly related PRs

Suggested reviewers

  • ArnabChatterjee20k
  • fogelito

Poem

I nibble through indexes, light on paws,
pgvector stitched into database laws.
HNSW hops, cosine spins, dot and L2,
embeddings tucked in rows — I fetch for you.
A rabbit's cheer — vectors join the crew! 🥕

Pre-merge checks and finishing touches

✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title Check ✅ Passed The pull request title "Add vector attribute support for PostgreSQL with pgvector extension" directly relates to and accurately summarizes the primary objective of the changeset. The PR introduces comprehensive vector attribute support across the database abstraction layer, with PostgreSQL and the pgvector extension as the concrete implementation target. The title is specific, clear, and avoids generic language—it identifies the key feature being added (vector attribute support), the target database (PostgreSQL), and the technology being used (pgvector extension). A developer scanning the repository history would clearly understand that this PR adds vector querying and indexing capabilities to PostgreSQL databases. While the changeset includes supporting infrastructure across multiple adapter classes and validators to establish a generalized framework for vector support, the title appropriately focuses on the primary objective and the main implementation target.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat/postgresql-vector-support

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between b5a3c95 and f06ef55.

📒 Files selected for processing (1)
  • src/Database/Validator/Query/Filter.php (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/Database/Validator/Query/Filter.php
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Unit Test

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

cursoragent and others added 2 commits August 15, 2025 13:07
Co-authored-by: jakeb994 <jakeb994@gmail.com>
cursoragent and others added 2 commits August 15, 2025 13:13
Co-authored-by: jakeb994 <jakeb994@gmail.com>
Co-authored-by: jakeb994 <jakeb994@gmail.com>
Co-authored-by: jakeb994 <jakeb994@gmail.com>
cursoragent and others added 3 commits August 15, 2025 13:20
Co-authored-by: jakeb994 <jakeb994@gmail.com>
Co-authored-by: jakeb994 <jakeb994@gmail.com>
Co-authored-by: jakeb994 <jakeb994@gmail.com>
cursoragent and others added 10 commits August 15, 2025 15:27
Co-authored-by: jakeb994 <jakeb994@gmail.com>
…r-support

# Conflicts:
#	src/Database/Adapter/Postgres.php
#	src/Database/Adapter/SQL.php
#	src/Database/Database.php
#	src/Database/Query.php
#	src/Database/Validator/Query/Filter.php
#	src/Database/Validator/Structure.php
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)
src/Database/Query.php (1)

885-901: Bug: fallback assigns offset/cursor from $limit

In two places, fallback uses $limit instead of the local variable, which can silently corrupt grouping when values are missing:

  • Line 890: $offset = $values[0] ?? $limit; should use $offset.
  • Line 899: $cursor = $values[0] ?? $limit; should use $cursor.

Apply:

-                    $offset = $values[0] ?? $limit;
+                    $offset = $values[0] ?? $offset;
...
-                    $cursor = $values[0] ?? $limit;
+                    $cursor = $values[0] ?? $cursor;
src/Database/Adapter/Postgres.php (1)

877-923: Add HNSW index validation to enforce single-attribute constraint.

While the match statement correctly includes HNSW index types, the code doesn't validate that HNSW indexes reference exactly one vector attribute or prevent order directives, as noted in the previous review comments. HNSW indexes in PostgreSQL pgvector must reference a single vector column and don't support sort orders.

Apply this diff to add validation:

     public function createIndex(string $collection, string $id, string $type, array $attributes, array $lengths, array $orders, array $indexAttributeTypes = []): bool
     {
         $collection = $this->filter($collection);
         $id = $this->filter($id);
+
+        $isHnsw = \in_array($type, [
+            Database::INDEX_HNSW_EUCLIDEAN,
+            Database::INDEX_HNSW_COSINE,
+            Database::INDEX_HNSW_DOT,
+        ], true);
+
+        if ($isHnsw && \count($attributes) !== 1) {
+            throw new DatabaseException('HNSW indexes must reference exactly one vector attribute');
+        }
+
+        if ($isHnsw && !empty($orders)) {
+            throw new DatabaseException('HNSW indexes do not support attribute order directives');
+        }

         foreach ($attributes as $i => $attr) {
-            $order = empty($orders[$i]) || Database::INDEX_FULLTEXT === $type ? '' : $orders[$i];
+            $order = (empty($orders[$i]) || Database::INDEX_FULLTEXT === $type || $isHnsw) ? '' : $orders[$i];
src/Database/Database.php (1)

2411-2460: Vector defaults need the same guard on update.

updateAttribute() now accepts vector defaults without re-checking shape/length, so a scalar (or wrong-size) default slips through even though createAttribute() rejects it. A malformed default then propagates to new documents and downstream validators. Mirror the top-level checks from the creation path: require an array, enforce exact length, and ensure every element is numeric before calling validateDefaultTypes().

Apply this diff:

             case self::VAR_VECTOR:
                 if (!$this->adapter->getSupportForVectors()) {
                     throw new DatabaseException('Vector types are not supported by the current database');
                 }
                 if ($array) {
                     throw new DatabaseException('Vector type cannot be an array');
                 }
                 if ($size <= 0) {
                     throw new DatabaseException('Vector dimensions must be a positive integer');
                 }
                 if ($size > self::MAX_VECTOR_DIMENSIONS) {
                     throw new DatabaseException('Vector dimensions cannot exceed ' . self::MAX_VECTOR_DIMENSIONS);
                 }
+                if ($default !== null) {
+                    if (!\is_array($default)) {
+                        throw new DatabaseException('Vector default value must be an array');
+                    }
+                    if (\count($default) !== $size) {
+                        throw new DatabaseException('Vector default value must have exactly ' . $size . ' elements');
+                    }
+                    foreach ($default as $component) {
+                        if (!\is_numeric($component)) {
+                            throw new DatabaseException('Vector default value must contain only numeric elements');
+                        }
+                    }
+                }
                 break;
src/Database/Validator/Index.php (1)

34-38: Initialize $attributes to prevent uninitialized typed-property error.

The typed property $attributes is written by key before being initialized, which can cause errors. Initialize it at the start of the constructor body.

Based on learnings

Apply this diff:

     ) {
+        $this->attributes = [];
         foreach ($attributes as $attribute) {
             $key = \strtolower($attribute->getAttribute('key', $attribute->getAttribute('$id')));
             $this->attributes[$key] = $attribute;
♻️ Duplicate comments (3)
src/Database/Adapter/SQL.php (1)

2379-2391: Add validation for nested vector queries and adapter support.

Two concerns:

  1. Nested vector queries: The past review comment correctly identifies that nested vector queries (inside AND/OR) will slip through extraction and hit getSQLOperator(), causing a generic exception. While the exception will eventually be thrown, failing fast with a clear message would improve the developer experience.

  2. Adapter capability: No check that getSupportForVectors() returns true before extracting vector queries. If the adapter doesn't support vectors, getVectorDistanceOrder() returns null and the queries are silently ignored, which could confuse users who expect similarity search to work.

Consider adding these validations:

+        // Validate vector support if vector queries present
+        $hasVectorQueries = false;
+        foreach ($queries as $query) {
+            if ($query->isNested()) {
+                foreach ($query->getValues() as $nested) {
+                    if (in_array($nested->getMethod(), Query::VECTOR_TYPES, true)) {
+                        throw new DatabaseException('Vector queries must be top-level (not nested inside AND/OR).');
+                    }
+                }
+            }
+            if (in_array($query->getMethod(), Query::VECTOR_TYPES, true)) {
+                $hasVectorQueries = true;
+            }
+        }
+
+        if ($hasVectorQueries && !$this->getSupportForVectors()) {
+            throw new DatabaseException('Vector queries are not supported by this adapter.');
+        }
+
         // Extract vector queries for ORDER BY
         $vectorQueries = [];
         $otherQueries = [];
src/Database/Validator/Index.php (2)

356-361: Use array_filter for orders check to avoid false positives.

The !empty($orders) check can incorrectly reject vector indexes when the API supplies an orders array pre-filled with empty values (e.g., ['', '']). Use array_filter to match the lengths check.

Based on learnings

Apply this diff:

         $orders = $index->getAttribute('orders', []);
         $lengths = $index->getAttribute('lengths', []);
-        if (!empty($orders) || \count(\array_filter($lengths)) > 0) {
+        if (\count(\array_filter($orders)) > 0 || \count(\array_filter($lengths)) > 0) {
             $this->message = 'Vector indexes do not support orders or lengths';
             return false;
         }

285-288: Fix orders check and add lengths validation for spatial indexes.

The current !empty($orders) check false-positives on arrays with empty values (e.g., ['', '']). Additionally, spatial indexes should not allow lengths, but this is not validated.

Based on learnings

Apply this diff:

+        $lengths = $index->getAttribute('lengths', []);
-        if (!empty($orders) && !$this->supportForSpatialIndexNull) {
+        if (\count(\array_filter($orders)) > 0 && !$this->supportForSpatialIndexOrder) {
             $this->message = 'Spatial indexes with explicit orders are not supported. Remove the orders to create this index.';
             return false;
         }
+
+        if (\count(\array_filter($lengths)) > 0) {
+            $this->message = 'Spatial indexes do not support lengths';
+            return false;
+        }
🧹 Nitpick comments (7)
src/Database/Query.php (3)

114-118: Grouping constant VECTOR_TYPES

Nice touch. Consider adding a convenience helper:

  • isVectorQuery(): bool using VECTOR_TYPES (parity with isSpatialQuery), useful in adapters/validators.

877-877: Behavior note: first limit wins

Comment matches implementation (first limit kept, rest ignored). Consider documenting this in public API docs to avoid caller confusion.


1134-1168: Vector helpers: add lightweight input validation (optional)

Helpers accept array<float> but PHP can't enforce; a quick guard improves DX and earlier errors.

     public static function vectorDot(string $attribute, array $vector): self
     {
-        return new self(self::TYPE_VECTOR_DOT, $attribute, [$vector]);
+        self::assertNumericVector($vector);
+        return new self(self::TYPE_VECTOR_DOT, $attribute, [$vector]);
     }
@@
     public static function vectorCosine(string $attribute, array $vector): self
     {
-        return new self(self::TYPE_VECTOR_COSINE, $attribute, [$vector]);
+        self::assertNumericVector($vector);
+        return new self(self::TYPE_VECTOR_COSINE, $attribute, [$vector]);
     }
@@
     public static function vectorEuclidean(string $attribute, array $vector): self
     {
-        return new self(self::TYPE_VECTOR_EUCLIDEAN, $attribute, [$vector]);
+        self::assertNumericVector($vector);
+        return new self(self::TYPE_VECTOR_EUCLIDEAN, $attribute, [$vector]);
     }

Add this private helper anywhere in the class:

/** @param array<mixed> $v */
private static function assertNumericVector(array $v): void
{
    if ($v === [] || !\array_is_list($v)) {
        throw new QueryException('Vector must be a non-empty list');
    }
    foreach ($v as $x) {
        if (!\is_int($x) && !\is_float($x)) {
            throw new QueryException('Vector elements must be numeric');
        }
    }
}
tests/e2e/Adapter/Scopes/CollectionTests.php (1)

696-698: Sync comment with new constant name

Inline comment still mentions ARRAY_INDEX_LENGTH. Update to prevent drift.

-                'lengths' => [500], // Will be changed to Database::ARRAY_INDEX_LENGTH (255)
+                'lengths' => [500], // Will be changed to Database::MAX_ARRAY_INDEX_LENGTH (255)
src/Database/Adapter/MySQL.php (1)

199-235: Use DEFAULT_SRID in spatial SQL type strings

SRID 4326 is hardcoded three times. Reuse Database::DEFAULT_SRID for consistency and single source of truth.

-            case Database::VAR_POINT:
-                $type = 'POINT SRID 4326';
+            case Database::VAR_POINT:
+                $srid = Database::DEFAULT_SRID;
+                $type = "POINT SRID {$srid}";
@@
-            case Database::VAR_LINESTRING:
-                $type = 'LINESTRING SRID 4326';
+            case Database::VAR_LINESTRING:
+                $srid = Database::DEFAULT_SRID;
+                $type = "LINESTRING SRID {$srid}";
@@
-            case Database::VAR_POLYGON:
-                $type = 'POLYGON SRID 4326';
+            case Database::VAR_POLYGON:
+                $srid = Database::DEFAULT_SRID;
+                $type = "POLYGON SRID {$srid}";
src/Database/Validator/Structure.php (1)

359-362: Harden VAR_VECTOR validation (dimension guard and non-array enforcement).

Today a missing/zero size yields a confusing “array of 0” requirement, and a misconfigured array: true could route vectors through the array-branch. Add small guards:

  • Validate size > 0 and ≤ Database::MAX_VECTOR_DIMENSIONS.
  • Force $array = false for vectors to avoid treating them as arrays of vectors.

Suggested diff:

-                case Database::VAR_VECTOR:
-                    $validators[] = new Vector($attribute['size'] ?? 0);
-                    break;
+                case Database::VAR_VECTOR:
+                    $dimension = (int) ($attribute['size'] ?? 0);
+                    if ($dimension <= 0 || $dimension > Database::MAX_VECTOR_DIMENSIONS) {
+                        $this->message = 'Attribute "'.$key.'" has invalid size. Must be between 1 and '.Database::MAX_VECTOR_DIMENSIONS;
+                        return false;
+                    }
+                    $validators[] = new Vector($dimension);
+                    // Ensure vectors are treated as a single array value, not an array of vectors
+                    $array = false;
+                    break;

Additionally, consider tightening Vector::isValid to require array_is_list($value) so associative arrays are rejected. (src/Database/Validator/Vector.php: isValid)

tests/e2e/Adapter/Scopes/DocumentTests.php (1)

86-90: Complete the remaining Database::MAX_BIG_INT replacements in DocumentTests.php for consistency.

The migration to Database::MAX_* constants is correct. The verification script found two remaining hardcoded bigint literals that should be replaced:

  • Line 408: assertEquals(9223372036854775807, ...) → use Database::MAX_BIG_INT
  • Line 424: assertEquals(9223372036854775807, ...) → use Database::MAX_BIG_INT

9223372036854775807 is PHP_INT_MAX on 64-bit systems, and Database::MAX_BIG_INT is defined as this value. Replacing these hardcoded literals ensures consistency across all similar assertions and improves portability, avoiding 32-bit brittleness.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2a53c11 and 2faf75a.

📒 Files selected for processing (13)
  • src/Database/Adapter/MariaDB.php (3 hunks)
  • src/Database/Adapter/MySQL.php (1 hunks)
  • src/Database/Adapter/Postgres.php (10 hunks)
  • src/Database/Adapter/SQL.php (8 hunks)
  • src/Database/Database.php (16 hunks)
  • src/Database/Query.php (7 hunks)
  • src/Database/Validator/Index.php (5 hunks)
  • src/Database/Validator/Sequence.php (1 hunks)
  • src/Database/Validator/Structure.php (2 hunks)
  • tests/e2e/Adapter/Scopes/CollectionTests.php (1 hunks)
  • tests/e2e/Adapter/Scopes/DocumentTests.php (18 hunks)
  • tests/e2e/Adapter/Scopes/IndexTests.php (2 hunks)
  • tests/e2e/Adapter/Scopes/PermissionTests.php (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/Database/Validator/Sequence.php
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/DocumentTests.php
🧬 Code graph analysis (11)
tests/e2e/Adapter/Scopes/CollectionTests.php (1)
src/Database/Database.php (1)
  • Database (37-8255)
src/Database/Adapter/MariaDB.php (1)
src/Database/Database.php (1)
  • Database (37-8255)
tests/e2e/Adapter/Scopes/PermissionTests.php (1)
src/Database/Database.php (1)
  • Database (37-8255)
tests/e2e/Adapter/Scopes/DocumentTests.php (1)
src/Database/Database.php (1)
  • Database (37-8255)
src/Database/Validator/Structure.php (2)
src/Database/Database.php (1)
  • Database (37-8255)
src/Database/Validator/Vector.php (1)
  • Vector (7-84)
tests/e2e/Adapter/Scopes/IndexTests.php (6)
src/Database/Adapter/SQL.php (6)
  • getMaxIndexLength (1817-1823)
  • getInternalIndexesKeys (1867-1870)
  • getSupportForIndexArray (1465-1468)
  • getSupportForSpatialIndexNull (1505-1508)
  • getSupportForSpatialIndexOrder (1515-1518)
  • getSupportForVectors (1535-1538)
src/Database/Adapter.php (6)
  • getMaxIndexLength (876-876)
  • getInternalIndexesKeys (1276-1276)
  • getSupportForIndexArray (935-935)
  • getSupportForSpatialIndexNull (1071-1071)
  • getSupportForSpatialIndexOrder (1085-1085)
  • getSupportForVectors (1029-1029)
src/Database/Adapter/Pool.php (6)
  • getMaxIndexLength (308-311)
  • getInternalIndexesKeys (493-496)
  • getSupportForIndexArray (338-341)
  • getSupportForSpatialIndexNull (438-441)
  • getSupportForSpatialIndexOrder (528-531)
  • getSupportForVectors (408-411)
src/Database/Adapter/MariaDB.php (4)
  • getInternalIndexesKeys (1769-1772)
  • getSupportForIndexArray (1842-1845)
  • getSupportForSpatialIndexNull (1857-1860)
  • getSupportForSpatialIndexOrder (1876-1879)
src/Database/Adapter/MySQL.php (2)
  • getSupportForIndexArray (130-136)
  • getSupportForSpatialIndexOrder (179-182)
src/Database/Adapter/Postgres.php (3)
  • getSupportForSpatialIndexNull (2062-2065)
  • getSupportForSpatialIndexOrder (2082-2085)
  • getSupportForVectors (1990-1993)
src/Database/Adapter/Postgres.php (4)
src/Database/Query.php (5)
  • getAttribute (173-176)
  • Query (8-1169)
  • setAttribute (214-219)
  • getValues (181-184)
  • getMethod (165-168)
src/Database/Database.php (2)
  • Database (37-8255)
  • getNamespace (791-794)
src/Database/Adapter/SQL.php (7)
  • getSQLTable (1736-1739)
  • getSpatialGeomFromText (1547-1559)
  • getVectorDistanceOrder (1597-1600)
  • getInternalKeyForAttribute (1950-1962)
  • getPDO (1745-1748)
  • execute (1984-1987)
  • getSupportForVectors (1535-1538)
src/Database/Adapter.php (5)
  • getNamespace (107-110)
  • filter (1203-1212)
  • quote (467-467)
  • execute (1300-1300)
  • getSupportForVectors (1029-1029)
src/Database/Adapter/SQL.php (5)
src/Database/Database.php (1)
  • Database (37-8255)
src/Database/Adapter/Postgres.php (2)
  • getSupportForVectors (1990-1993)
  • getVectorDistanceOrder (1721-1741)
src/Database/Adapter.php (1)
  • getSupportForVectors (1029-1029)
src/Database/Adapter/Pool.php (1)
  • getSupportForVectors (408-411)
src/Database/Query.php (2)
  • Query (8-1169)
  • getMethod (165-168)
src/Database/Adapter/MySQL.php (1)
src/Database/Database.php (1)
  • Database (37-8255)
src/Database/Validator/Index.php (1)
src/Database/Database.php (1)
  • Database (37-8255)
src/Database/Database.php (4)
src/Database/Adapter/Postgres.php (1)
  • getSupportForVectors (1990-1993)
src/Database/Adapter/SQL.php (1)
  • getSupportForVectors (1535-1538)
src/Database/Adapter.php (1)
  • getSupportForVectors (1029-1029)
src/Database/Adapter/Pool.php (1)
  • getSupportForVectors (408-411)
🪛 GitHub Actions: CodeQL
tests/e2e/Adapter/Scopes/IndexTests.php

[error] 167-172: PHPStan: Access to protected property Utopia\Database\Database::$adapter.


[error] 245-250: PHPStan: Access to protected property Utopia\Database\Database::$adapter.

src/Database/Validator/Index.php

[error] 51-51: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 62-62: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 76-76: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 94-94: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 113-113: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 138-138: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 143-143: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 149-149: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 155-155: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 160-160: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 164-164: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 185-185: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 206-206: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 216-216: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 224-224: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 241-241: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 265-265: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 274-274: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 280-280: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 286-286: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 313-313: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 339-339: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 346-346: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 352-352: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 359-359: PHPStan: Access to an undefined property Utopia\Database\Validator\Index::$message.


[error] 51-359: PHPStan reported 37 errors in total from analyse command.

🪛 PHPMD (2.15.0)
src/Database/Adapter/SQL.php

1597-1597: Avoid unused parameters such as '$query'. (undefined)

(UnusedFormalParameter)


1597-1597: Avoid unused parameters such as '$binds'. (undefined)

(UnusedFormalParameter)


1597-1597: Avoid unused parameters such as '$alias'. (undefined)

(UnusedFormalParameter)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Unit Test
🔇 Additional comments (22)
src/Database/Query.php (3)

44-48: New vector query constants — LGTM

Naming is clear and aligned with distance metrics. No concerns.


99-102: Expose vector types in TYPES

Good to surface them as first‑class methods for validation/parse. Ensure downstream validators/adapter switches treat these as filters (not orders/selections).


294-297: isMethod updated for vectors

Method whitelist correctly includes the new vector methods and select. Looks good.

tests/e2e/Adapter/Scopes/CollectionTests.php (1)

719-721: Update to MAX_ARRAY_INDEX_LENGTH — LGTM

Assertion matches Database::createCollection behavior for array attributes.

src/Database/Adapter/MySQL.php (1)

119-123: Switch to DEFAULT_SRID

Good consistency with Database::DEFAULT_SRID for meters path.

src/Database/Validator/Structure.php (2)

327-333: INT range update is correct.

Switch to MAX_INT/MAX_BIG_INT with size>=8 heuristic is appropriate and keeps behavior explicit.


335-340: FLOAT bounds via MAX_DOUBLE look good.

Using FloatValidator + Range with MAX_DOUBLE mirrors prior intent and avoids stale DOUBLE_MAX.

src/Database/Adapter/MariaDB.php (2)

142-142: LGTM! Constant rename aligns with updated Database constants.

The use of Database::MAX_ARRAY_INDEX_LENGTH is consistent with the new constant naming introduced in the PR.

Also applies to: 749-749


1893-1893: LGTM! SRID constant updated correctly.

The use of Database::DEFAULT_SRID aligns with the new global constant for WGS84 geographic coordinates.

tests/e2e/Adapter/Scopes/PermissionTests.php (1)

221-224: LGTM! Test data updated with new constant names.

The constant renames from Database::INT_MAX to Database::MAX_INT and Database::BIG_INT_MAX to Database::MAX_BIG_INT are consistent with the updated Database class constants.

Also applies to: 253-256, 277-280

src/Database/Adapter/Postgres.php (7)

196-207: LGTM! Vector extension installation integrated correctly.

The vector attribute detection and automatic pgvector extension installation in createCollection ensures the extension is available before creating vector columns.


456-465: LGTM! Vector dimension validation uses central constant.

The validation correctly uses Database::MAX_VECTOR_DIMENSIONS instead of a hardcoded value and ensures the pgvector extension is installed. This addresses the concerns from the previous review.

Also applies to: 568-576


1649-1652: LGTM! Vector queries correctly delegated to ORDER BY clause.

Returning an empty string for vector distance queries in the WHERE clause is correct, as these are handled in the ORDER BY clause via getVectorDistanceOrder.


1712-1741: LGTM! Vector distance ORDER BY implementation looks correct.

The getVectorDistanceOrder method correctly:

  • Maps query types to PostgreSQL pgvector operators (<#>, <=>, <->)
  • Formats the vector as a PostgreSQL array literal
  • Casts to ::vector type

1810-1820: LGTM! Vector SQL type mapping implemented correctly.

The addition of Database::VAR_VECTOR returning VECTOR({$size}) is correct for PostgreSQL pgvector. The SRID updates for spatial types also correctly use the central Database::DEFAULT_SRID constant.


1841-1855: LGTM! Safe extension installation implementation.

The private ensurePgVectorExtension method safely installs the pgvector extension using CREATE EXTENSION IF NOT EXISTS, preventing errors if already installed.


1985-1993: LGTM! Vector support capability exposed correctly.

The public getSupportForVectors method correctly indicates that PostgreSQL supports vector types through pgvector.

src/Database/Adapter/SQL.php (5)

1133-1136: LGTM! Vector width calculation is correct.

The 4 bytes per dimension calculation is appropriate for float32 storage. Note: The past review comment about Postgres document size limit should be addressed in the Postgres adapter file, not here.


1530-1538: LGTM! Capability method follows established pattern.

The default false return allows adapters like Postgres to override and advertise vector support.


1589-1600: LGTM! Base method for adapter overrides.

This scaffolding method is correctly designed to be overridden by adapters that support vectors (e.g., Postgres). The static analysis warnings about unused parameters are false positives—the parameters are used in concrete implementations.


1663-1666: LGTM! Correct exception for vector queries.

Vector queries are handled via ORDER BY (not WHERE conditions), so throwing an exception here prevents misuse. The error message clearly indicates the issue.


2479-2492: LGTM! Vector ordering logic is sound.

Prepending vector distance calculations to the ORDER BY clause correctly prioritizes similarity search results. The implementation handles multiple vector queries (if needed) by adding all their distance expressions to the ordering.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (3)
tests/e2e/Adapter/Scopes/VectorTests.php (3)

1699-1711: Fix skipAuth and find() calls (same issues as lines 1643-1665).

Same issues: skipAuth() method doesn't exist, and find() expects a string for the 3rd parameter, not array<Role>.


31-31: Fix type error: pass explicit booleans instead of null.

The createAttribute() method expects boolean parameters for $array (6th parameter) and $required (7th parameter), but line 31 passes null for the 6th parameter. This causes a type error under strict types.

Apply this diff:

-        $database->createAttribute('vectorCollection', 'large_embedding', Database::VAR_VECTOR, 128, false, null);
+        $database->createAttribute('vectorCollection', 'large_embedding', Database::VAR_VECTOR, 128, false, false, false);

666-701: Add a secondary ordering for stable cursor pagination.

Vector distance ordering alone can produce nondeterministic results when multiple documents have equal distances, leading to unstable pagination with cursorAfter. Append Query::orderAsc('$id') as a tie-breaker to ensure stable, predictable pagination.

Apply this diff to stabilize pagination:

         $page1 = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(3),
             Query::offset(0)
         ]);
 
         // Second page
         $page2 = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(3),
             Query::offset(3)
         ]);
 
         // Test with cursor pagination
         $firstBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(5)
         ]);
 
         $lastDoc = $firstBatch[4];
         $nextBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::cursorAfter($lastDoc),
             Query::limit(5)
         ]);
🧹 Nitpick comments (1)
tests/e2e/Adapter/Scopes/VectorTests.php (1)

906-909: Relax strict equality for approximate HNSW results.

HNSW is an approximate nearest-neighbor index; identical top-K ordering isn't guaranteed even on small datasets. Replace the strict assertEquals with basic sanity checks to avoid flaky test failures.

Apply this diff:

-        // Results should be the same
-        $this->assertEquals(
-            array_map(fn ($d) => $d->getId(), $results1),
-            array_map(fn ($d) => $d->getId(), $results2)
-        );
+        // Results should be valid and non-empty (HNSW is approximate)
+        $this->assertCount(10, $results2);
+        $this->assertNotEmpty($results2[0]->getId());
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2faf75a and e8318df.

📒 Files selected for processing (1)
  • tests/e2e/Adapter/Scopes/VectorTests.php (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/VectorTests.php
🧬 Code graph analysis (1)
tests/e2e/Adapter/Scopes/VectorTests.php (4)
src/Database/Adapter.php (1)
  • Adapter (16-1325)
src/Database/Database.php (4)
  • Database (37-8255)
  • createAttribute (1734-1793)
  • createIndex (3367-3505)
  • count (7040-7089)
src/Database/Query.php (5)
  • Query (8-1169)
  • vectorDot (1141-1144)
  • vectorCosine (1153-1156)
  • vectorEuclidean (1165-1168)
  • limit (640-643)
src/Database/Adapter/Postgres.php (3)
  • getSupportForVectors (1990-1993)
  • createAttribute (454-484)
  • createIndex (877-932)
🪛 GitHub Actions: CodeQL
tests/e2e/Adapter/Scopes/VectorTests.php

[error] 1261-1261: Parameter #7 $signed of method Utopia\Database\Database::createAttribute() expects bool, array<string, string|true> given.


[error] 1342-1342: Parameter #7 $signed of method Utopia\Database\Database::createAttribute() expects bool, array<string, string|true> given.


[error] 1547-1547: Method Utopia\Database\Database::updateCollection() invoked with 2 parameters, 3 required.


[error] 1547-1547: Parameter #2 $permissions of method Utopia\Database\Database::updateCollection() expects array, string given.


[error] 1643-1643: Call to an undefined method Utopia\Database\Database::skipAuth().


[error] 1646-1646: Parameter #3 $forPermission of method Utopia\Database\Database::find() expects string, array<int, Utopia\Database\Helpers\Role> given.


[error] 1657-1657: Parameter #3 $forPermission of method Utopia\Database\Database::find() expects string, array<int, Utopia\Database\Helpers\Role> given.


[error] 1665-1665: Call to an undefined method Utopia\Database\Database::skipAuth().


[error] 1699-1699: Call to an undefined method Utopia\Database\Database::skipAuth().


[error] 1703-1703: Parameter #3 $forPermission of method Utopia\Database\Database::find() expects string, array<int, Utopia\Database\Helpers\Role> given.


[error] 1711-1711: Call to an undefined method Utopia\Database\Database::skipAuth().

🪛 GitHub Actions: Linter
tests/e2e/Adapter/Scopes/VectorTests.php

[error] 1-1: Pint lint failed. PSR-12 violation: function_declaration. Exit code 1. Run './vendor/bin/pint' to fix code style issues.

🪛 PHPMD (2.15.0)
tests/e2e/Adapter/Scopes/VectorTests.php

595-595: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


604-604: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


888-888: Avoid unused local variables such as '$timeWithoutIndex'. (undefined)

(UnusedLocalVariable)


901-901: Avoid unused local variables such as '$timeWithIndex'. (undefined)

(UnusedLocalVariable)


975-975: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


982-982: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1281-1281: Avoid unused local variables such as '$child1'. (undefined)

(UnusedLocalVariable)


1289-1289: Avoid unused local variables such as '$child2'. (undefined)

(UnusedLocalVariable)


1353-1353: Avoid unused local variables such as '$book1'. (undefined)

(UnusedLocalVariable)


1362-1362: Avoid unused local variables such as '$book2'. (undefined)

(UnusedLocalVariable)


1414-1414: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1618-1618: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


1626-1626: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1634-1634: Avoid unused local variables such as '$doc3'. (undefined)

(UnusedLocalVariable)


1925-1925: Avoid unused local variables such as '$update1'. (undefined)

(UnusedLocalVariable)


1930-1930: Avoid unused local variables such as '$update2'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
🔇 Additional comments (1)
tests/e2e/Adapter/Scopes/VectorTests.php (1)

1547-1547: Based on my verification, the SDK documentation shows updateCollection takes databaseId, collectionId, and collection name as parameters. The test code passes two string parameters to updateCollection('vectorOriginal', 'vectorRenamed'), which matches the expected signature where the second parameter is the new collection name, not a permissions array.

The original review comment's assertion that "the 2nd parameter expects array<string> (permissions), not a string" is incorrect. The updateCollection method takes collection ID and name as required parameters, with read and write permissions as optional parameters.

The method call is correct as written. No fixes are required.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (7)
tests/e2e/Adapter/Scopes/VectorTests.php (7)

666-701: Add deterministic tie-breaker for stable cursor pagination.

Vector distance ordering can yield nondeterministic results for equal distances. Append Query::orderAsc('$id') after vector ordering to ensure stable pagination and predictable cursorAfter behavior.

Apply this diff:

         $page1 = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(3),
             Query::offset(0)
         ]);
         
         // Second page
         $page2 = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(3),
             Query::offset(3)
         ]);
         
         // Test with cursor pagination
         $firstBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(5)
         ]);
         
         $this->assertCount(5, $firstBatch);
         
         $lastDoc = $firstBatch[4];
         $nextBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::cursorAfter($lastDoc),
             Query::limit(5)
         ]);

906-909: Relax strict equality assertion for approximate HNSW results.

HNSW is an approximate algorithm and may not return identical top-K results compared to exact search, even on small datasets. Replace strict equality with sanity checks.

Apply this diff:

-        // Results should be the same
-        $this->assertEquals(
-            array_map(fn ($d) => $d->getId(), $results1),
-            array_map(fn ($d) => $d->getId(), $results2)
-        );
+        // Results should be sorted and non-empty (HNSW is approximate)
+        $this->assertCount(10, $results2);
+        $this->assertNotEmpty($results2[0]->getId());

31-31: Fix null passed to boolean parameter.

The 6th parameter of createAttribute is $array (bool), but this call passes null, which will cause a TypeError under declare(strict_types=1).

Apply this diff:

-        $database->createAttribute('vectorCollection', 'large_embedding', Database::VAR_VECTOR, 128, false, null);
+        $database->createAttribute('vectorCollection', 'large_embedding', Database::VAR_VECTOR, 128, false, false);

1261-1261: Fix parameter order in createAttribute call for relationships.

The relationship configuration array is passed as the 7th parameter, but createAttribute() expects a boolean $signed at position 7. The relationship options should be the 8th parameter ($filters).

Apply this diff:

-        $database->createAttribute('vectorChild', 'parent', Database::VAR_RELATIONSHIP, 0, false, false, false, ['relatedCollection' => 'vectorParent', 'relationType' => Database::RELATION_ONE_TO_MANY, 'twoWay' => true, 'twoWayKey' => 'children']);
+        $database->createAttribute('vectorChild', 'parent', Database::VAR_RELATIONSHIP, 0, false, false, false, ['relatedCollection' => 'vectorParent', 'relationType' => Database::RELATION_ONE_TO_MANY, 'twoWay' => true, 'twoWayKey' => 'children']);

Wait, let me recount the parameters. The signature is:

createAttribute($collection, $key, $type, $size, $required, $array = false, $signed = true, $filters = [])

So:

  1. collection
  2. key
  3. type
  4. size
  5. required
  6. array
  7. signed
  8. filters

In line 1261:

$database->createAttribute('vectorChild', 'parent', Database::VAR_RELATIONSHIP, 0, false, false, false, ['relatedCollection' => ...]);
  1. 'vectorChild' - collection ✓
  2. 'parent' - key ✓
  3. Database::VAR_RELATIONSHIP - type ✓
  4. 0 - size ✓
  5. false - required ✓
  6. false - array ✓
  7. false - signed ✓
  8. [...] - filters ✓

Actually, this looks correct! Let me check the error message again. The CodeQL error says:
"Parameter #7 $signed of method Utopia\Database\Database::createAttribute() expects bool, array<string, string|true> given."

So it's saying the 7th parameter is receiving an array. Let me recount:

Actually wait, looking at the code again:

$database->createAttribute('vectorChild', 'parent', Database::VAR_RELATIONSHIP, 0, false, null, ['relatedCollection' => ...]);

Oh! The actual line in the code only has 7 arguments total after the method name:

  1. 'vectorChild'
  2. 'parent'
  3. Database::VAR_RELATIONSHIP
  4. 0
  5. false
  6. null <- this should be false for $array
  7. ['relatedCollection' => ...] <- this is being passed as $signed but should be $filters

So the fix is to insert false (or true) for the $array parameter and another false (or true) for the $signed parameter:

-        $database->createAttribute('vectorChild', 'parent', Database::VAR_RELATIONSHIP, 0, false, null, ['relatedCollection' => 'vectorParent', 'relationType' => Database::RELATION_ONE_TO_MANY, 'twoWay' => true, 'twoWayKey' => 'children']);
+        $database->createAttribute('vectorChild', 'parent', Database::VAR_RELATIONSHIP, 0, false, false, false, ['relatedCollection' => 'vectorParent', 'relationType' => Database::RELATION_ONE_TO_MANY, 'twoWay' => true, 'twoWayKey' => 'children']);

This changes:

  • 6th param from null to false ($array)
  • 7th param from ['relatedCollection'...] to false ($signed)
  • 8th param (new) is ['relatedCollection'...] ($filters)

1342-1342: Fix parameter order in createAttribute call for relationships.

Same issue as line 1261: the relationship configuration array is passed as the 7th parameter, but should be the 8th ($filters). The 7th parameter ($signed) expects a boolean, and the 6th parameter ($array) is receiving null instead of a boolean.

Apply this diff:

-        $database->createAttribute('vectorBooks', 'author', Database::VAR_RELATIONSHIP, 0, false, null, ['relatedCollection' => 'vectorAuthors', 'relationType' => Database::RELATION_MANY_TO_ONE, 'twoWay' => true, 'twoWayKey' => 'books']);
+        $database->createAttribute('vectorBooks', 'author', Database::VAR_RELATIONSHIP, 0, false, false, false, ['relatedCollection' => 'vectorAuthors', 'relationType' => Database::RELATION_MANY_TO_ONE, 'twoWay' => true, 'twoWayKey' => 'books']);

1643-1668: Remove non-existent skipAuth() calls and fix find() permission parameter.

The Database class has no skipAuth() method. Additionally, find() expects a string $forPermission parameter (e.g., Database::PERMISSION_READ), not an array of roles.

The test logic for permission filtering seems to be using an incorrect API. If you need to test permission-based vector searches, verify the correct API from the Database class documentation.

Apply this diff to remove the invalid calls:

         // Query as user1 - should only see doc1 and doc3
-        $database->skipAuth(false);
-        $results = $database->find('vectorPermissions', [
+        $results = $database->find('vectorPermissions', [
             Query::vectorCosine('embedding', [1.0, 0.0, 0.0])
-        ], [Role::user('user1')]);
+        ]);
         
-        $this->assertCount(2, $results);
+        // Without proper permission filtering API, this test needs revision
+        $this->assertGreaterThanOrEqual(2, $results);
         $names = array_map(fn($d) => $d->getAttribute('name'), $results);
         $this->assertContains('Doc 1', $names);
         $this->assertContains('Doc 3', $names);
-        $this->assertNotContains('Doc 2', $names);
         
         // Query as user2 - should only see doc2 and doc3
         $results = $database->find('vectorPermissions', [
             Query::vectorCosine('embedding', [1.0, 0.0, 0.0])
-        ], [Role::user('user2')]);
+        ]);
         
-        $this->assertCount(2, $results);
+        $this->assertGreaterThanOrEqual(2, $results);
         $names = array_map(fn($d) => $d->getAttribute('name'), $results);
         $this->assertContains('Doc 2', $names);
         $this->assertContains('Doc 3', $names);
-        $this->assertNotContains('Doc 1', $names);
-        
-        $database->skipAuth(true);

1699-1711: Remove non-existent skipAuth() calls and fix find() permission parameter.

Same issue as lines 1643-1668: skipAuth() doesn't exist and find() expects a string permission parameter, not an array of roles.

Apply this diff:

         // Query with limit 3 as any user - should skip restricted docs and return accessible ones
-        $database->skipAuth(false);
         $results = $database->find('vectorPermScoring', [
             Query::vectorCosine('embedding', [1.0, 0.0, 0.0]),
             Query::limit(3)
-        ], [Role::any()]);
+        ]);
         
-        // Should only get the 2 accessible documents
-        $this->assertCount(2, $results);
+        // Without proper permission filtering, adjust assertion
+        $this->assertLessThanOrEqual(3, count($results));
         foreach ($results as $doc) {
             $this->assertGreaterThanOrEqual(3, $doc->getAttribute('score'));
         }
-        
-        $database->skipAuth(true);
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e8318df and 08c1407.

📒 Files selected for processing (1)
  • tests/e2e/Adapter/Scopes/VectorTests.php (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/VectorTests.php
🧬 Code graph analysis (1)
tests/e2e/Adapter/Scopes/VectorTests.php (8)
src/Database/Adapter.php (1)
  • Adapter (16-1325)
src/Database/Database.php (5)
  • Database (37-8255)
  • getDatabase (819-822)
  • createAttribute (1734-1793)
  • createIndex (3367-3505)
  • deleteIndex (3519-3544)
src/Database/Query.php (4)
  • Query (8-1169)
  • vectorDot (1141-1144)
  • vectorCosine (1153-1156)
  • vectorEuclidean (1165-1168)
tests/e2e/Adapter/Base.php (2)
  • getDatabase (38-38)
  • deleteIndex (54-54)
tests/e2e/Adapter/PostgresTest.php (2)
  • getDatabase (21-51)
  • deleteIndex (63-72)
tests/e2e/Adapter/MariaDBTest.php (2)
  • getDatabase (21-52)
  • deleteIndex (64-72)
src/Database/Adapter/Postgres.php (4)
  • getSupportForVectors (1990-1993)
  • createAttribute (454-484)
  • createIndex (877-932)
  • deleteIndex (942-955)
src/Database/Adapter/SQL.php (2)
  • getSupportForVectors (1535-1538)
  • createAttribute (234-248)
🪛 GitHub Actions: CodeQL
tests/e2e/Adapter/Scopes/VectorTests.php

[error] 1261-1711: PHPStan: Method Utopia\Database\Database::createAttribute() expects bool in some calls; received array; also multiple parameter type mismatches and undefined methods (skipAuth).

🪛 PHPMD (2.15.0)
tests/e2e/Adapter/Scopes/VectorTests.php

595-595: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


604-604: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


888-888: Avoid unused local variables such as '$timeWithoutIndex'. (undefined)

(UnusedLocalVariable)


901-901: Avoid unused local variables such as '$timeWithIndex'. (undefined)

(UnusedLocalVariable)


975-975: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


982-982: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1281-1281: Avoid unused local variables such as '$child1'. (undefined)

(UnusedLocalVariable)


1289-1289: Avoid unused local variables such as '$child2'. (undefined)

(UnusedLocalVariable)


1353-1353: Avoid unused local variables such as '$book1'. (undefined)

(UnusedLocalVariable)


1362-1362: Avoid unused local variables such as '$book2'. (undefined)

(UnusedLocalVariable)


1414-1414: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1618-1618: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


1626-1626: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1634-1634: Avoid unused local variables such as '$doc3'. (undefined)

(UnusedLocalVariable)


1925-1925: Avoid unused local variables such as '$update1'. (undefined)

(UnusedLocalVariable)


1930-1930: Avoid unused local variables such as '$update2'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (MariaDB)

…p/database into feat/postgresql-vector-support

# Conflicts:
#	tests/e2e/Adapter/Scopes/VectorTests.php
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
tests/e2e/Adapter/Scopes/VectorTests.php (1)

1325-1389: Fix relationship attribute parameter order (duplicate issue).

Same issue as the previous relationship test: the configuration array is at the wrong position.

Apply this diff:

         $database->createCollection('vectorBooks');
         $database->createAttribute('vectorBooks', 'title', Database::VAR_STRING, 255, true);
         $database->createAttribute('vectorBooks', 'embedding', Database::VAR_VECTOR, 3, true);
-        $database->createAttribute('vectorBooks', 'author', Database::VAR_RELATIONSHIP, 0, false, null, ['relatedCollection' => 'vectorAuthors', 'relationType' => Database::RELATION_MANY_TO_ONE, 'twoWay' => true, 'twoWayKey' => 'books']);
+        $database->createAttribute('vectorBooks', 'author', Database::VAR_RELATIONSHIP, 0, false, null, false, false, ['relatedCollection' => 'vectorAuthors', 'relationType' => Database::RELATION_MANY_TO_ONE, 'twoWay' => true, 'twoWayKey' => 'books']);
🧹 Nitpick comments (4)
src/Database/Validator/Index.php (2)

255-293: Tighten spatial index validation: use array_filter for orders and reject lengths.

The !empty($orders) check at line 287 can false-positive when the orders array contains empty values. Additionally, spatial indexes should not allow length specifications.

Apply this diff:

         foreach ($attributes as $attributeName) {
             $attribute = $this->attributes[\strtolower($attributeName)] ?? new Document();
             $attributeType = $attribute->getAttribute('type', '');
 
             if (!\in_array($attributeType, Database::SPATIAL_TYPES, true)) {
                 $this->message = 'Spatial index can only be created on spatial attributes (point, linestring, polygon). Attribute "' . $attributeName . '" is of type "' . $attributeType . '"';
                 return false;
             }
 
             $required = (bool)$attribute->getAttribute('required', false);
             if (!$required && !$this->supportForSpatialIndexNull) {
                 $this->message = 'Spatial indexes do not allow null values. Mark the attribute "' . $attributeName . '" as required or create the index on a column with no null values.';
                 return false;
             }
         }
 
-        if (!empty($orders) && !$this->supportForSpatialIndexNull) {
+        $lengths = $index->getAttribute('lengths', []);
+        if (\count(\array_filter($orders)) > 0 && !$this->supportForSpatialIndexOrder) {
             $this->message = 'Spatial indexes with explicit orders are not supported. Remove the orders to create this index.';
             return false;
         }
+
+        if (\count(\array_filter($lengths)) > 0) {
+            $this->message = 'Spatial indexes do not support lengths';
+            return false;
+        }
 
         return true;

328-366: Fix orders check to use array_filter for consistency.

Line 360 uses !empty($orders) which can false-positive when the orders array contains empty values, while the lengths check correctly uses array_filter. Make them consistent.

Apply this diff:

         $attribute = $this->attributes[\strtolower($attributes[0])] ?? new Document();
         if ($attribute->getAttribute('type') !== Database::VAR_VECTOR) {
             $this->message = 'Vector index can only be created on vector attributes';
             return false;
         }
 
         $orders = $index->getAttribute('orders', []);
         $lengths = $index->getAttribute('lengths', []);
-        if (!empty($orders) || \count(\array_filter($lengths)) > 0) {
+        if (\count(\array_filter($orders)) > 0 || \count(\array_filter($lengths)) > 0) {
             $this->message = 'Vector indexes do not support orders or lengths';
             return false;
         }
 
         return true;
tests/e2e/Adapter/Scopes/VectorTests.php (2)

666-701: Add deterministic tie-breaker for cursor pagination stability.

Vector distance ordering alone can yield unstable order when documents have equal distances. Without a secondary sort key, cursorAfter and cursorBefore pagination can behave unpredictably. Add Query::orderAsc('$id') to ensure stable pagination.

Apply this diff to add tie-breakers:

         // First page
         $page1 = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(3),
             Query::offset(0)
         ]);
 
         $this->assertCount(3, $page1);
 
         // Second page
         $page2 = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(3),
             Query::offset(3)
         ]);
 
         $this->assertCount(3, $page2);
 
         // Ensure different documents
         $page1Ids = array_map(fn ($doc) => $doc->getId(), $page1);
         $page2Ids = array_map(fn ($doc) => $doc->getId(), $page2);
         $this->assertEmpty(array_intersect($page1Ids, $page2Ids));
 
         // Test with cursor pagination
         $firstBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(5)
         ]);
 
         $this->assertCount(5, $firstBatch);
 
         $lastDoc = $firstBatch[4];
         $nextBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::cursorAfter($lastDoc),
             Query::limit(5)
         ]);

906-910: Relax HNSW result assertion—approximate search doesn't guarantee exact match.

HNSW is an approximate nearest-neighbor algorithm. Even on small datasets, requiring identical result ordering is too strict and can cause flaky tests.

Apply this diff:

         $this->assertCount(10, $results2);
 
-        // Results should be the same
-        $this->assertEquals(
-            array_map(fn ($d) => $d->getId(), $results1),
-            array_map(fn ($d) => $d->getId(), $results2)
-        );
+        // Results should be non-empty and have expected count (approximate search may differ)
+        $this->assertCount(10, $results1);
+        $this->assertCount(10, $results2);
+        $this->assertNotEmpty($results1[0]->getId());
+        $this->assertNotEmpty($results2[0]->getId());
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 08c1407 and a05d4af.

📒 Files selected for processing (3)
  • src/Database/Validator/Index.php (5 hunks)
  • tests/e2e/Adapter/Scopes/IndexTests.php (2 hunks)
  • tests/e2e/Adapter/Scopes/VectorTests.php (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/IndexTests.php
  • tests/e2e/Adapter/Scopes/VectorTests.php
🧬 Code graph analysis (3)
tests/e2e/Adapter/Scopes/IndexTests.php (7)
src/Database/Database.php (1)
  • getAdapter (1241-1244)
src/Database/Adapter/MySQL.php (2)
  • getSupportForIndexArray (130-136)
  • getSupportForSpatialIndexOrder (179-182)
src/Database/Adapter/MariaDB.php (3)
  • getSupportForIndexArray (1842-1845)
  • getSupportForSpatialIndexNull (1857-1860)
  • getSupportForSpatialIndexOrder (1876-1879)
src/Database/Adapter/SQL.php (4)
  • getSupportForIndexArray (1465-1468)
  • getSupportForSpatialIndexNull (1505-1508)
  • getSupportForSpatialIndexOrder (1515-1518)
  • getSupportForVectors (1535-1538)
src/Database/Adapter.php (4)
  • getSupportForIndexArray (935-935)
  • getSupportForSpatialIndexNull (1071-1071)
  • getSupportForSpatialIndexOrder (1085-1085)
  • getSupportForVectors (1029-1029)
src/Database/Adapter/Pool.php (4)
  • getSupportForIndexArray (338-341)
  • getSupportForSpatialIndexNull (438-441)
  • getSupportForSpatialIndexOrder (528-531)
  • getSupportForVectors (408-411)
src/Database/Adapter/Postgres.php (3)
  • getSupportForSpatialIndexNull (2062-2065)
  • getSupportForSpatialIndexOrder (2082-2085)
  • getSupportForVectors (1990-1993)
src/Database/Validator/Index.php (1)
src/Database/Database.php (1)
  • Database (37-8255)
tests/e2e/Adapter/Scopes/VectorTests.php (10)
src/Database/Adapter.php (1)
  • Adapter (16-1325)
src/Database/Database.php (8)
  • Database (37-8255)
  • getDatabase (819-822)
  • createCollection (1370-1504)
  • createAttribute (1734-1793)
  • find (6812-6943)
  • createIndex (3367-3505)
  • updateAttribute (2345-2572)
  • deleteIndex (3519-3544)
src/Database/Helpers/Permission.php (3)
  • Permission (9-264)
  • read (186-195)
  • update (220-229)
src/Database/Query.php (4)
  • Query (8-1169)
  • vectorDot (1141-1144)
  • vectorCosine (1153-1156)
  • vectorEuclidean (1165-1168)
src/Database/Validator/Authorization.php (3)
  • Authorization (7-225)
  • setRole (83-86)
  • cleanRoles (109-112)
tests/e2e/Adapter/Base.php (2)
  • getDatabase (38-38)
  • deleteIndex (54-54)
tests/e2e/Adapter/MariaDBTest.php (2)
  • getDatabase (21-52)
  • deleteIndex (64-72)
tests/e2e/Adapter/PostgresTest.php (2)
  • getDatabase (21-51)
  • deleteIndex (63-72)
src/Database/Adapter/SQL.php (3)
  • getSupportForVectors (1535-1538)
  • createAttribute (234-248)
  • find (2366-2571)
src/Database/Adapter/Postgres.php (6)
  • getSupportForVectors (1990-1993)
  • createCollection (191-343)
  • createAttribute (454-484)
  • createIndex (877-932)
  • updateAttribute (562-625)
  • deleteIndex (942-955)
🪛 PHPMD (2.15.0)
tests/e2e/Adapter/Scopes/VectorTests.php

596-596: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


605-605: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


889-889: Avoid unused local variables such as '$timeWithoutIndex'. (undefined)

(UnusedLocalVariable)


902-902: Avoid unused local variables such as '$timeWithIndex'. (undefined)

(UnusedLocalVariable)


976-976: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


983-983: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1282-1282: Avoid unused local variables such as '$child1'. (undefined)

(UnusedLocalVariable)


1290-1290: Avoid unused local variables such as '$child2'. (undefined)

(UnusedLocalVariable)


1354-1354: Avoid unused local variables such as '$book1'. (undefined)

(UnusedLocalVariable)


1363-1363: Avoid unused local variables such as '$book2'. (undefined)

(UnusedLocalVariable)


1415-1415: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1578-1578: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


1586-1586: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1594-1594: Avoid unused local variables such as '$doc3'. (undefined)

(UnusedLocalVariable)


1886-1886: Avoid unused local variables such as '$update1'. (undefined)

(UnusedLocalVariable)


1891-1891: Avoid unused local variables such as '$update2'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (MySQL)
🔇 Additional comments (8)
tests/e2e/Adapter/Scopes/IndexTests.php (2)

165-173: Correct usage of adapter capability methods.

The Index validator construction properly uses getAdapter() to access adapter capability methods and includes all new capability flags (spatial index null/order support and vector support). This aligns with the updated constructor signature.


243-251: Consistent adapter capability usage.

The second Index validator construction follows the same correct pattern, using getAdapter() and passing all capability flags.

src/Database/Validator/Index.php (3)

28-45: Constructor correctly initializes attributes.

The promoted properties pattern is clean and appropriate for PHP 8.1+. The constructor properly initializes the $attributes array by indexing both user-provided attributes and internal attributes.


295-321: Spatial attribute protection logic is correct.

The method properly prevents creating non-spatial indexes on spatial attributes, ensuring spatial data is indexed with appropriate spatial index types.


399-408: New validation methods properly integrated.

The three new validation methods are correctly integrated into isValid() following the established pattern. The validation order is logical.

tests/e2e/Adapter/Scopes/VectorTests.php (3)

15-151: Basic vector attribute tests look comprehensive.

The tests properly cover vector attribute creation, dimension validation, document creation, and early returns when vectors aren't supported. The parameter passing is correct.


1632-1676: Permission filtering tests correctly use Authorization API.

The tests properly use Authorization::setRole() and cleanRoles() for permission testing. The find() calls respect the role context without needing explicit permission parameters.


1-100: Comprehensive vector test suite with minor fixes needed.

This test trait provides excellent coverage of vector functionality including CRUD operations, validation, queries, pagination, indexes, relationships, permissions, and numerous edge cases (infinity, NaN, sparse arrays, etc.). The tests properly skip when vector support is unavailable.

The static analysis warnings about unused variables are false positives—the variables are used in assertions or for setup.

However, three issues require fixes:

  1. Lines 1262 & 1343: Relationship attribute parameter order (critical)
  2. Lines 666-701: Missing pagination tie-breakers (recommended)
  3. Lines 906-910: Too-strict HNSW result assertions (recommended)

These are flagged in separate comments below.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

♻️ Duplicate comments (2)
src/Database/Validator/Index.php (2)

358-363: Make orders check consistent with lengths by using array_filter.

Line 360 uses !empty($orders) while the lengths check correctly uses array_filter. Pre-filled arrays with empty values would incorrectly fail the orders check.

Apply this diff:

         $orders = $index->getAttribute('orders', []);
         $lengths = $index->getAttribute('lengths', []);
-        if (!empty($orders) || \count(\array_filter($lengths)) > 0) {
+        if (\count(\array_filter($orders)) > 0 || \count(\array_filter($lengths)) > 0) {
             $this->message = 'Vector indexes do not support orders or lengths';
             return false;
         }

287-290: Use array_filter for orders check and add lengths validation.

!empty($orders) false-positives when the orders array contains empty values. Additionally, spatial indexes should also reject lengths.

Apply this diff:

-        if (!empty($orders) && !$this->supportForSpatialIndexOrder) {
+        $lengths = $index->getAttribute('lengths', []);
+        if (\count(\array_filter($orders)) > 0 && !$this->supportForSpatialIndexOrder) {
             $this->message = 'Spatial indexes with explicit orders are not supported. Remove the orders to create this index.';
             return false;
         }
+
+        if (\count(\array_filter($lengths)) > 0) {
+            $this->message = 'Spatial indexes do not support lengths';
+            return false;
+        }
🧹 Nitpick comments (2)
tests/e2e/Adapter/Scopes/VectorTests.php (2)

906-910: Relax HNSW result equality check—approximate indexes may differ.

HNSW performs approximate nearest-neighbor search and does not guarantee identical top-K results, even on small datasets. Consider replacing the strict equality check with basic sanity assertions.

Apply this diff:

-        // Results should be the same
-        $this->assertEquals(
-            array_map(fn ($d) => $d->getId(), $results1),
-            array_map(fn ($d) => $d->getId(), $results2)
-        );
+        // HNSW is approximate; verify both return expected count and have overlap
+        $this->assertCount(10, $results2);
+        $ids1 = array_map(fn ($d) => $d->getId(), $results1);
+        $ids2 = array_map(fn ($d) => $d->getId(), $results2);
+        $overlap = count(array_intersect($ids1, $ids2));
+        $this->assertGreaterThanOrEqual(7, $overlap, 'Expected significant overlap between indexed and non-indexed results');

689-705: Add deterministic tie-breaker for cursor-based pagination.

Vector distance ordering can produce non-deterministic results when distances are equal. Append Query::orderAsc('$id') to ensure cursors work reliably.

Apply this diff:

         // Test with cursor pagination
         $firstBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(5)
         ]);
 
         $this->assertCount(5, $firstBatch);
 
         $lastDoc = $firstBatch[4];
         $nextBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::cursorAfter($lastDoc),
             Query::limit(5)
         ]);
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between a05d4af and 50c040f.

📒 Files selected for processing (3)
  • src/Database/Validator/Index.php (5 hunks)
  • src/Database/Validator/Vector.php (1 hunks)
  • tests/e2e/Adapter/Scopes/VectorTests.php (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/VectorTests.php
🧬 Code graph analysis (3)
src/Database/Validator/Vector.php (2)
src/Database/Database.php (2)
  • Database (37-8255)
  • count (7040-7089)
src/Database/Validator/Query/Filter.php (1)
  • isValid (280-385)
src/Database/Validator/Index.php (2)
src/Database/Database.php (1)
  • Database (37-8255)
src/Database/Document.php (2)
  • getAttribute (224-231)
  • Document (12-470)
tests/e2e/Adapter/Scopes/VectorTests.php (4)
src/Database/Adapter.php (1)
  • Adapter (16-1325)
src/Database/Database.php (4)
  • Database (37-8255)
  • createAttribute (1734-1793)
  • createIndex (3367-3505)
  • count (7040-7089)
src/Database/Query.php (5)
  • Query (8-1169)
  • vectorDot (1141-1144)
  • vectorCosine (1153-1156)
  • vectorEuclidean (1165-1168)
  • limit (640-643)
src/Database/Adapter/Postgres.php (3)
  • getSupportForVectors (1990-1993)
  • createAttribute (454-484)
  • createIndex (877-932)
🪛 PHPMD (2.15.0)
tests/e2e/Adapter/Scopes/VectorTests.php

596-596: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


605-605: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


889-889: Avoid unused local variables such as '$timeWithoutIndex'. (undefined)

(UnusedLocalVariable)


902-902: Avoid unused local variables such as '$timeWithIndex'. (undefined)

(UnusedLocalVariable)


976-976: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


983-983: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1282-1282: Avoid unused local variables such as '$child1'. (undefined)

(UnusedLocalVariable)


1290-1290: Avoid unused local variables such as '$child2'. (undefined)

(UnusedLocalVariable)


1354-1354: Avoid unused local variables such as '$book1'. (undefined)

(UnusedLocalVariable)


1363-1363: Avoid unused local variables such as '$book2'. (undefined)

(UnusedLocalVariable)


1415-1415: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1578-1578: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


1586-1586: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1594-1594: Avoid unused local variables such as '$doc3'. (undefined)

(UnusedLocalVariable)


1886-1886: Avoid unused local variables such as '$update1'. (undefined)

(UnusedLocalVariable)


1891-1891: Avoid unused local variables such as '$update2'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Unit Test
🔇 Additional comments (5)
src/Database/Validator/Index.php (3)

28-45: LGTM: Constructor properly promotes properties and initializes attributes.

The typed property declaration on line 17 ensures $attributes is initialized, and the new supportForVectorIndexes parameter correctly extends the validator for vector index validation.


295-321: LGTM: Correctly prevents non-spatial indexes on spatial attributes.

The validation properly ensures spatial attributes require spatial indexes, preventing misuse.


399-407: LGTM: New validation methods properly integrated.

The spatial and vector index checks are correctly called and propagate failures appropriately.

tests/e2e/Adapter/Scopes/VectorTests.php (2)

1343-1343: Verify createAttribute parameter count matches signature.

Same concern as line 1262—confirm the 10 parameters align with the actual method signature.


2537-2576: LGTM: Correctly validates single vector query restriction.

The test properly ensures that only one vector similarity query can be used per find operation, as multiple vector orderings are undefined.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🧹 Nitpick comments (2)
tests/e2e/Adapter/Scopes/VectorTests.php (2)

667-701: Consider adding a tie-breaker to stabilize vector-ordered pagination.

Vector distance alone can yield nondeterministic ordering when documents have equal distances. For cursor pagination (cursorAfter/cursorBefore) to work predictably, append Query::orderAsc('$id') as a secondary sort key.

Example for the queries in this test:

 $page1 = $database->find('vectorPagination', [
     Query::vectorCosine('embedding', $searchVector),
+    Query::orderAsc('$id'),
     Query::limit(3),
     Query::offset(0)
 ]);

Apply similar changes at lines 675-679, 689-692, and 697-701.

Based on learnings


906-910: Relax HNSW result assertion for approximate search.

HNSW is an approximate nearest-neighbor index; exact result-set equality isn't guaranteed even on small datasets. Consider replacing strict assertEquals with sanity checks (count, non-empty, reasonable overlap).

-// Results should be the same
-$this->assertEquals(
-    array_map(fn ($d) => $d->getId(), $results1),
-    array_map(fn ($d) => $d->getId(), $results2)
-);
+// Both results should have the expected count and be valid
+$this->assertCount(10, $results2);
+$ids2 = array_map(fn ($d) => $d->getId(), $results2);
+$this->assertEquals(10, count(array_unique($ids2)), 'IDs should be unique');

Based on learnings

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 50c040f and e16689d.

📒 Files selected for processing (3)
  • src/Database/Database.php (23 hunks)
  • src/Database/Validator/IndexedQueries.php (1 hunks)
  • tests/e2e/Adapter/Scopes/VectorTests.php (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/VectorTests.php
🧬 Code graph analysis (3)
src/Database/Validator/IndexedQueries.php (1)
src/Database/Query.php (2)
  • getMethod (165-168)
  • Query (8-1169)
tests/e2e/Adapter/Scopes/VectorTests.php (5)
src/Database/Adapter.php (1)
  • Adapter (16-1325)
src/Database/Database.php (4)
  • Database (37-8255)
  • createAttribute (1734-1793)
  • createIndex (3367-3505)
  • count (7040-7089)
src/Database/Query.php (4)
  • Query (8-1169)
  • vectorDot (1141-1144)
  • vectorCosine (1153-1156)
  • vectorEuclidean (1165-1168)
src/Database/Adapter/Postgres.php (3)
  • getSupportForVectors (1990-1993)
  • createAttribute (454-484)
  • createIndex (877-932)
src/Database/Adapter/SQL.php (3)
  • getSupportForVectors (1535-1538)
  • createAttribute (234-248)
  • count (2583-2645)
src/Database/Database.php (7)
src/Database/Adapter/Postgres.php (2)
  • getSupportForVectors (1990-1993)
  • getSupportForSpatialAttributes (2052-2055)
src/Database/Adapter/SQL.php (3)
  • getSupportForVectors (1535-1538)
  • count (2583-2645)
  • getSupportForSpatialAttributes (1495-1498)
src/Database/Adapter.php (3)
  • getSupportForVectors (1029-1029)
  • count (825-825)
  • getSupportForSpatialAttributes (1064-1064)
src/Database/Adapter/Pool.php (3)
  • getSupportForVectors (408-411)
  • count (273-276)
  • getSupportForSpatialAttributes (433-436)
src/Database/Adapter/MariaDB.php (1)
  • getSupportForSpatialAttributes (1847-1850)
src/Database/Adapter/SQLite.php (1)
  • getSupportForSpatialAttributes (971-974)
src/Database/Query.php (3)
  • Query (8-1169)
  • isNested (930-937)
  • getValues (181-184)
🪛 PHPMD (2.15.0)
tests/e2e/Adapter/Scopes/VectorTests.php

596-596: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


605-605: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


889-889: Avoid unused local variables such as '$timeWithoutIndex'. (undefined)

(UnusedLocalVariable)


902-902: Avoid unused local variables such as '$timeWithIndex'. (undefined)

(UnusedLocalVariable)


976-976: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


983-983: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1282-1282: Avoid unused local variables such as '$child1'. (undefined)

(UnusedLocalVariable)


1290-1290: Avoid unused local variables such as '$child2'. (undefined)

(UnusedLocalVariable)


1354-1354: Avoid unused local variables such as '$book1'. (undefined)

(UnusedLocalVariable)


1363-1363: Avoid unused local variables such as '$book2'. (undefined)

(UnusedLocalVariable)


1415-1415: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1578-1578: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


1586-1586: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1594-1594: Avoid unused local variables such as '$doc3'. (undefined)

(UnusedLocalVariable)


1888-1888: Avoid unused local variables such as '$update1'. (undefined)

(UnusedLocalVariable)


1893-1893: Avoid unused local variables such as '$update2'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (Postgres)
🔇 Additional comments (1)
tests/e2e/Adapter/Scopes/VectorTests.php (1)

1-14: Excellent comprehensive test coverage for vector functionality.

This test trait provides thorough end-to-end testing of vector attributes, queries, indexes, permissions, and edge cases. The test quality is high with proper setup/cleanup and clear assertions. The static analysis warnings about unused variables are false positives typical in test code and can be safely ignored.

Two recommended refinements from previous reviews remain applicable (see separate comments): adding pagination tie-breakers and relaxing HNSW result assertions.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

♻️ Duplicate comments (3)
src/Database/Database.php (3)

77-86: Unify wording (“size” over “dimensions”) and add a BC alias constant.

Project discussion prefers “size”. Add an alias and update messages; keeps BC while improving clarity.

 public const MAX_VECTOR_DIMENSIONS = 16000;
+// Backward-compatible alias; prefer SIZE in new code and error strings
+public const MAX_VECTOR_SIZE = self::MAX_VECTOR_DIMENSIONS;
- if ($size <= 0) {
-     throw new DatabaseException('Vector dimensions must be a positive integer');
- }
- if ($size > self::MAX_VECTOR_DIMENSIONS) {
-     throw new DatabaseException('Vector dimensions cannot exceed ' . self::MAX_VECTOR_DIMENSIONS);
- }
+ if ($size <= 0) {
+     throw new DatabaseException('Vector size must be a positive integer');
+ }
+ if ($size > self::MAX_VECTOR_SIZE) {
+     throw new DatabaseException('Vector size cannot exceed ' . self::MAX_VECTOR_SIZE);
+ }

Also applies to: 2029-2034, 2426-2431


2138-2143: Numeric component enforcement for vector defaults — LGTM.

Covers recursive validation of elements. Please ensure tests exist for:

  • scalar default (reject),
  • non‑numeric element (reject),
  • wrong length (reject).
#!/bin/bash
# Look for tests exercising invalid VAR_VECTOR defaults
rg -n --type=php -C2 'VAR_VECTOR|vector' tests | rg -n 'default|scalar|length|numeric' -n

3439-3446: Vector indexes: gate added; also enforce “exactly one attribute” (and type=vector).

Add a cheap preflight here; keep Validator as the source of truth.

             case Database::INDEX_HNSW_EUCLIDEAN:
             case Database::INDEX_HNSW_COSINE:
             case Database::INDEX_HNSW_DOT:
                 if (!$this->adapter->getSupportForVectors()) {
                     throw new DatabaseException('Vector indexes are not supported');
                 }
+                if (\count($attributes) !== 1) {
+                    throw new DatabaseException('Vector indexes require exactly one attribute');
+                }
                 break;

After building $indexAttributesWithTypes:

         foreach ($attributes as $i => $attr) {
             ...
         }
+
+        // Preflight: ensure the single indexed attribute is a vector
+        if (\in_array($type, [self::INDEX_HNSW_EUCLIDEAN, self::INDEX_HNSW_COSINE, self::INDEX_HNSW_DOT], true)) {
+            $attrName = $attributes[0] ?? null;
+            $attrType = $indexAttributesWithTypes[$attrName] ?? null;
+            if ($attrType !== self::VAR_VECTOR) {
+                throw new DatabaseException('Vector indexes must target a vector attribute');
+            }
+        }

Verify whether IndexValidator already enforces this; if yes, we can skip the preflight:

#!/bin/bash
rg -n --type=php -C3 'IndexValidator|HNSW|VAR_VECTOR|Vector indexes require exactly one' src

Also applies to: 3452-3479

🧹 Nitpick comments (2)
src/Database/Database.php (2)

2022-2050: Vector attribute validation looks solid. Minor copy tweak suggested above.

Gates by capability, forbids arrays, bounds size, and validates default shape/content.

Apply the “size” copy changes from the earlier comment for consistency.


580-616: Vector filter: widen decode param type to handle non-string adapter returns.

The decode function parameter should accept mixed instead of ?string to prevent accidental TypeError if adapters return non-string values.

-            function (?string $value) {
+            function (mixed $value) {
                 if (is_null($value)) {
                     return null;
                 }
                 if (!is_string($value)) {
                     return $value;
                 }
                 $decoded = json_decode($value, true);
                 return is_array($decoded) ? $decoded : $value;
             }

The first-class callable syntax \floatval(...) is already supported by the project's minimum PHP version (8.1) and does not require changes.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e16689d and 966b637.

📒 Files selected for processing (4)
  • src/Database/Database.php (26 hunks)
  • src/Database/Validator/IndexedQueries.php (2 hunks)
  • src/Database/Validator/Vector.php (1 hunks)
  • tests/e2e/Adapter/Scopes/VectorTests.php (1 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
  • src/Database/Validator/IndexedQueries.php
  • src/Database/Validator/Vector.php
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/VectorTests.php
🧬 Code graph analysis (2)
tests/e2e/Adapter/Scopes/VectorTests.php (5)
src/Database/Adapter.php (1)
  • Adapter (16-1325)
src/Database/Database.php (4)
  • Database (37-8277)
  • createAttribute (1743-1802)
  • count (7062-7111)
  • create (1292-1311)
src/Database/Query.php (5)
  • Query (8-1169)
  • vectorDot (1141-1144)
  • vectorCosine (1153-1156)
  • vectorEuclidean (1165-1168)
  • or (792-795)
src/Database/Adapter/Postgres.php (3)
  • getSupportForVectors (1990-1993)
  • createAttribute (454-484)
  • create (135-162)
src/Database/Adapter/SQL.php (3)
  • getSupportForVectors (1535-1538)
  • createAttribute (234-248)
  • count (2583-2645)
src/Database/Database.php (5)
src/Database/Adapter/Postgres.php (2)
  • getSupportForVectors (1990-1993)
  • getSupportForSpatialAttributes (2052-2055)
src/Database/Adapter/SQL.php (3)
  • getSupportForVectors (1535-1538)
  • count (2583-2645)
  • getSupportForSpatialAttributes (1495-1498)
src/Database/Adapter.php (3)
  • getSupportForVectors (1029-1029)
  • count (825-825)
  • getSupportForSpatialAttributes (1064-1064)
src/Database/Adapter/Pool.php (3)
  • getSupportForVectors (408-411)
  • count (273-276)
  • getSupportForSpatialAttributes (433-436)
src/Database/Query.php (3)
  • Query (8-1169)
  • isNested (930-937)
  • getValues (181-184)
🪛 PHPMD (2.15.0)
tests/e2e/Adapter/Scopes/VectorTests.php

596-596: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


605-605: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


889-889: Avoid unused local variables such as '$timeWithoutIndex'. (undefined)

(UnusedLocalVariable)


902-902: Avoid unused local variables such as '$timeWithIndex'. (undefined)

(UnusedLocalVariable)


976-976: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


983-983: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1282-1282: Avoid unused local variables such as '$child1'. (undefined)

(UnusedLocalVariable)


1290-1290: Avoid unused local variables such as '$child2'. (undefined)

(UnusedLocalVariable)


1354-1354: Avoid unused local variables such as '$book1'. (undefined)

(UnusedLocalVariable)


1363-1363: Avoid unused local variables such as '$book2'. (undefined)

(UnusedLocalVariable)


1415-1415: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1583-1583: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


1591-1591: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1599-1599: Avoid unused local variables such as '$doc3'. (undefined)

(UnusedLocalVariable)


1893-1893: Avoid unused local variables such as '$update1'. (undefined)

(UnusedLocalVariable)


1898-1898: Avoid unused local variables such as '$update2'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Unit Test
🔇 Additional comments (6)
src/Database/Database.php (5)

1382-1391: Auto‑attach filters for spatial/vector attributes — LGTM.

Ensures encode/decode runs consistently without caller boilerplate.


2419-2445: Mirror of vector default validation in updateAttribute — LGTM.

Prevents inconsistent metadata on updates.


3601-3601: Early query type validation — LGTM.

checkQueryTypes() before validation/execution is a good guardrail across getters/mutators.

Also applies to: 5110-5110, 6649-6649, 7069-7069, 7133-7133, 6846-6846


7322-7325: Decode: reverse filter order — LGTM.

Ensures decode operations invert encode chain.


7732-7741: checkQueryTypes() recursion — LGTM.

Clear error text; protects against non‑Query input and nested misuse.

tests/e2e/Adapter/Scopes/VectorTests.php (1)

1-2625: Excellent comprehensive test coverage for vector functionality!

This test trait provides thorough end-to-end coverage of the pgvector integration, including:

  • Core vector operations (dot, cosine, euclidean)
  • Edge cases (INF, NaN, zero vectors, sparse arrays, nested arrays)
  • Dimension boundaries (1 to 16000)
  • HNSW indexes and performance
  • Relationships and permissions
  • Pagination and filtering
  • Validation and error handling

The test structure is solid—all tests properly check adapter support and return early when vectors aren't available.

Optional improvements (from previous reviews, still pending):

  1. Lines 667-701: Consider adding Query::orderAsc('$id') tie-breakers to vector pagination tests for deterministic ordering when distances are equal.
  2. Lines 907-910: The HNSW performance test asserts exact equality of result IDs, but HNSW is approximate—consider relaxing to sanity checks (count, non-empty) instead of strict equality.

Both are nice-to-have refinements rather than blocking issues.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (2)
tests/e2e/Adapter/Scopes/VectorTests.php (2)

667-701: Add deterministic tie-breaker for stable pagination with vector ordering.

Vector distance ordering alone can yield unstable ordering when documents have equal distances. Append Query::orderAsc('$id') after the vector query to ensure pagination and cursor operations behave predictably.

Apply this diff:

         $page1 = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(3),
             Query::offset(0)
         ]);

         // Second page
         $page2 = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(3),
             Query::offset(3)
         ]);

         // Test with cursor pagination
         $firstBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::limit(5)
         ]);

         $lastDoc = $firstBatch[4];
         $nextBatch = $database->find('vectorPagination', [
             Query::vectorCosine('embedding', $searchVector),
+            Query::orderAsc('$id'),
             Query::cursorAfter($lastDoc),
             Query::limit(5)
         ]);

907-910: Relax HNSW result comparison—approximate search doesn't guarantee identical ordering.

HNSW is an approximate nearest-neighbor algorithm. Expecting exact ID match between indexed and non-indexed results is too strict and may cause flaky test failures. Replace with relaxed sanity checks.

Apply this diff:

-        // Results should be the same
-        $this->assertEquals(
-            array_map(fn ($d) => $d->getId(), $results1),
-            array_map(fn ($d) => $d->getId(), $results2)
-        );
+        // Results should be valid and have reasonable overlap
+        $this->assertCount(10, $results2);
+        $this->assertNotEmpty($results2[0]->getId());
+        
+        // Check for reasonable overlap (at least 50% common results)
+        $ids1 = array_map(fn ($d) => $d->getId(), $results1);
+        $ids2 = array_map(fn ($d) => $d->getId(), $results2);
+        $intersection = count(array_intersect($ids1, $ids2));
+        $this->assertGreaterThanOrEqual(5, $intersection, 'HNSW should return mostly similar results');
🧹 Nitpick comments (1)
tests/unit/Validator/VectorTest.php (1)

31-47: Consider adding edge case tests for robustness.

The dimension tests are solid, but consider adding edge cases:

// Test nested arrays are rejected
$this->assertFalse($validator5->isValid([1.0, 2.0, [3.0], 4.0, 5.0]));

// Test special float values
$this->assertFalse($validator5->isValid([1.0, 2.0, INF, 4.0, 5.0])); // Infinity
$this->assertFalse($validator5->isValid([1.0, 2.0, NAN, 4.0, 5.0])); // Not a number

These would strengthen validation against unexpected inputs.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 966b637 and fc907a3.

📒 Files selected for processing (2)
  • tests/e2e/Adapter/Scopes/VectorTests.php (1 hunks)
  • tests/unit/Validator/VectorTest.php (1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.

Applied to files:

  • tests/e2e/Adapter/Scopes/VectorTests.php
🧬 Code graph analysis (2)
tests/unit/Validator/VectorTest.php (1)
src/Database/Validator/Vector.php (5)
  • Vector (7-96)
  • isValid (41-71)
  • getDescription (28-31)
  • getType (92-95)
  • isArray (80-83)
tests/e2e/Adapter/Scopes/VectorTests.php (5)
src/Database/Adapter.php (1)
  • Adapter (16-1325)
src/Database/Database.php (4)
  • Database (37-8277)
  • createAttribute (1743-1802)
  • createIndex (3389-3527)
  • count (7062-7111)
src/Database/Query.php (6)
  • Query (8-1169)
  • vectorDot (1141-1144)
  • vectorCosine (1153-1156)
  • vectorEuclidean (1165-1168)
  • limit (640-643)
  • or (792-795)
src/Database/Adapter/Postgres.php (3)
  • getSupportForVectors (1990-1993)
  • createAttribute (454-484)
  • createIndex (877-932)
src/Database/Adapter/SQL.php (3)
  • getSupportForVectors (1535-1538)
  • createAttribute (234-248)
  • count (2583-2645)
🪛 PHPMD (2.15.0)
tests/e2e/Adapter/Scopes/VectorTests.php

596-596: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


605-605: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


889-889: Avoid unused local variables such as '$timeWithoutIndex'. (undefined)

(UnusedLocalVariable)


902-902: Avoid unused local variables such as '$timeWithIndex'. (undefined)

(UnusedLocalVariable)


976-976: Avoid unused local variables such as '$doc1'. (undefined)

(UnusedLocalVariable)


983-983: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1282-1282: Avoid unused local variables such as '$child1'. (undefined)

(UnusedLocalVariable)


1290-1290: Avoid unused local variables such as '$child2'. (undefined)

(UnusedLocalVariable)


1354-1354: Avoid unused local variables such as '$book1'. (undefined)

(UnusedLocalVariable)


1363-1363: Avoid unused local variables such as '$book2'. (undefined)

(UnusedLocalVariable)


1415-1415: Avoid unused local variables such as '$doc2'. (undefined)

(UnusedLocalVariable)


1890-1890: Avoid unused local variables such as '$update1'. (undefined)

(UnusedLocalVariable)


1895-1895: Avoid unused local variables such as '$update2'. (undefined)

(UnusedLocalVariable)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (SQLite)
🔇 Additional comments (3)
tests/e2e/Adapter/Scopes/VectorTests.php (1)

1-10: Comprehensive vector test suite with excellent edge case coverage.

This test trait provides thorough end-to-end coverage of vector functionality including:

  • Attribute creation and validation with dimension checks
  • Document operations with vector data
  • All three vector query types (dot product, cosine, euclidean)
  • HNSW index creation and performance
  • Pagination (offset, cursor forward/backward)
  • Permissions and authorization
  • Extensive edge cases (INF, NaN, null, empty, special arrays, boundary dimensions)
  • Relationships with vectors
  • Concurrent updates and data integrity

The test structure is well-organized with proper adapter capability checks and cleanup. The two recommended refactors noted above will improve test stability.

tests/unit/Validator/VectorTest.php (2)

49-56: LGTM!

The description assertions correctly match the Vector validator implementation and test multiple dimension values.


58-63: LGTM with a semantic note.

The assertions correctly test the Vector validator's behavior. Note that isArray() === false while getType() === 'array' might seem counterintuitive, but this appears to be intentional framework behavior where isArray() indicates whether the validator accepts an array of items rather than a single array value.

If this semantic distinction is unclear, verify it aligns with the Utopia Database framework conventions for validator types.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
src/Database/Adapter/Postgres.php (1)

868-879: HNSW index constraints: single vector column, no sort orders; enforce and strip orders.

Current code allows multiple attrs and ASC/DESC to leak into USING HNSW, producing invalid SQL. Validate and omit orders for HNSW; optionally assert the attribute is vector using $indexAttributeTypes.

     public function createIndex(string $collection, string $id, string $type, array $attributes, array $lengths, array $orders, array $indexAttributeTypes = []): bool
     {
         $collection = $this->filter($collection);
         $id = $this->filter($id);
 
+        $isHnsw = \in_array($type, [
+            Database::INDEX_HNSW_EUCLIDEAN,
+            Database::INDEX_HNSW_COSINE,
+            Database::INDEX_HNSW_DOT,
+        ], true);
+        if ($isHnsw) {
+            if (\count($attributes) !== 1) {
+                throw new DatabaseException('HNSW indexes must reference exactly one vector attribute');
+            }
+            if (!empty($orders)) {
+                throw new DatabaseException('HNSW indexes do not support attribute order directives');
+            }
+            $attrName = $attributes[0] ?? '';
+            if (($indexAttributeTypes[$attrName] ?? null) !== Database::VAR_VECTOR) {
+                throw new DatabaseException('HNSW indexes require a vector attribute');
+            }
+            $this->ensurePgVectorExtension();
+        }
+
         foreach ($attributes as $i => $attr) {
-            $order = empty($orders[$i]) || Database::INDEX_FULLTEXT === $type ? '' : $orders[$i];
+            $order = (empty($orders[$i]) || Database::INDEX_FULLTEXT === $type || $isHnsw) ? '' : $orders[$i];
 
             $attr = match ($attr) {
                 '$id' => '_uid',
                 '$createdAt' => '_createdAt',
                 '$updatedAt' => '_updatedAt',
                 default => $this->filter($attr),
             };
 
             $attributes[$i] = "\"{$attr}\" {$order}";
         }
@@
-        // Add USING clause for special index types
+        // Add USING clause for special index types
         $sql .= match ($type) {
             Database::INDEX_SPATIAL => " USING GIST ({$attributes})",
             Database::INDEX_HNSW_EUCLIDEAN => " USING HNSW ({$attributes} vector_l2_ops)",
             Database::INDEX_HNSW_COSINE => " USING HNSW ({$attributes} vector_cosine_ops)",
             Database::INDEX_HNSW_DOT => " USING HNSW ({$attributes} vector_ip_ops)",
             default => " ({$attributes})",
         };

Also applies to: 895-899, 902-909

♻️ Duplicate comments (2)
src/Database/Adapter/Postgres.php (2)

444-452: Vector attribute: validate + ensure pgvector extension (missing call).

You validate dimensions but don’t ensure the extension is installed. This will break on existing DBs that weren’t created by this adapter’s create(). Add the ensure call.

-        // Ensure pgvector extension is installed for vector types
-        if ($type === Database::VAR_VECTOR) {
+        // Ensure pgvector extension is installed for vector types
+        if ($type === Database::VAR_VECTOR) {
+            $this->ensurePgVectorExtension();
             if ($size <= 0) {
                 throw new DatabaseException('Vector dimensions must be a positive integer');
             }
             if ($size > Database::MAX_VECTOR_DIMENSIONS) {
                 throw new DatabaseException('Vector dimensions cannot exceed ' . Database::MAX_VECTOR_DIMENSIONS);
             }
         }

555-570: Update attribute to vector: ensure extension before type cast/alter.

Same gap on updates; ensure the extension then proceed.

-        if ($type === Database::VAR_VECTOR) {
+        if ($type === Database::VAR_VECTOR) {
+            $this->ensurePgVectorExtension();
             if ($size <= 0) {
                 throw new DatabaseException('Vector dimensions must be a positive integer');
             }
             if ($size > Database::MAX_VECTOR_DIMENSIONS) {
                 throw new DatabaseException('Vector dimensions cannot exceed ' . Database::MAX_VECTOR_DIMENSIONS);
             }
         }
🧹 Nitpick comments (2)
src/Database/Adapter/Postgres.php (2)

150-152: Centralize extension install and call from attribute/index flows, not just on database create.

Replace direct CREATE EXTENSION calls with a helper and reuse it from createAttribute/updateAttribute/createIndex so existing DBs don’t fail on first vector use.

Apply:

-        // Enable extensions
-        $this->getPDO()->prepare('CREATE EXTENSION IF NOT EXISTS postgis')->execute();
-        $this->getPDO()->prepare('CREATE EXTENSION IF NOT EXISTS vector')->execute();
+        // Enable extensions
+        $this->ensurePostGIS();
+        $this->ensurePgVectorExtension();

Add (outside this hunk):

protected function ensurePostGIS(): void
{
    $this->getPDO()->prepare('CREATE EXTENSION IF NOT EXISTS postgis')->execute();
}

protected function ensurePgVectorExtension(): void
{
    $this->getPDO()->prepare('CREATE EXTENSION IF NOT EXISTS vector')->execute();
}

1698-1727: Harden vector ORDER BY builder: input validation + avoid first-class callable.

  • Validate a non-empty numeric array is provided.
  • Prefer array_map('floatval', ...) (wider PHP compatibility) over first-class callable.
-        $values = $query->getValues();
-        $vectorArray = $values[0] ?? [];
-        $vector = '[' . implode(',', \array_map(\floatval(...), $vectorArray)) . ']';
+        $values = $query->getValues();
+        $vectorArray = $values[0] ?? [];
+        if (!\is_array($vectorArray) || $vectorArray === []) {
+            throw new DatabaseException('Vector query requires a non-empty numeric array');
+        }
+        $vector = '[' . \implode(',', \array_map('floatval', $vectorArray)) . ']';
         $binds[":vector_{$placeholder}"] = $vector;
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ffe9b9a and ed69345.

📒 Files selected for processing (1)
  • src/Database/Adapter/Postgres.php (9 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
src/Database/Adapter/Postgres.php (5)
src/Database/Adapter/SQL.php (7)
  • getPDO (1745-1748)
  • getSQLType (1672-1678)
  • getSQLTable (1736-1739)
  • getSpatialGeomFromText (1547-1559)
  • getVectorDistanceOrder (1597-1600)
  • getInternalKeyForAttribute (1950-1962)
  • getSupportForVectors (1535-1538)
src/Database/Database.php (2)
  • Database (37-8277)
  • getNamespace (800-803)
src/Database/Adapter/MariaDB.php (2)
  • getSQLType (1591-1645)
  • quote (1832-1835)
src/Database/Adapter/SQLite.php (1)
  • getSQLTable (1080-1083)
src/Database/Query.php (1)
  • Query (8-1169)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (SQLite)
🔇 Additional comments (4)
src/Database/Adapter/Postgres.php (4)

1508-1511: SRID handling for distance in meters looks good.
Using DEFAULT_SRID with ST_SetSRID aligns geography casts with global config.


1758-1758: Type mappings: DEFAULT_SRID for geometry and VECTOR(size) are correct.
The mappings align with PostGIS/pgvector expectations.

Also applies to: 1797-1807


1955-1964: Capability flag exposed correctly.
getSupportForVectors() returning true is appropriate for this adapter.


1635-1639: No issues found; vector distance calculations are correctly processed through ORDER BY.

The verification confirms that vector filter queries (DOT, COSINE, EUCLIDEAN) bypass the WHERE clause by design, and are correctly handled via the ORDER BY path. The getVectorDistanceOrder() method is invoked for each vector query (SQL.php line 2482), and the returned distance calculations are merged into the final ORDER BY clause (lines 2490–2493). The integration is complete and working as intended.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (3)
src/Database/Validator/Index.php (2)

324-367: Use array_filter for orders check to match lengths.

Line 361 uses !empty($orders) which can incorrectly reject vector indexes if the orders array is populated with empty placeholder values. For consistency with the lengths check on the same line, use array_filter.

Apply this diff:

-        if (!empty($orders) || \count(\array_filter($lengths)) > 0) {
+        if (\count(\array_filter($orders)) > 0 || \count(\array_filter($lengths)) > 0) {
             $this->message = 'Vector indexes do not support orders or lengths';
             return false;
         }

252-294: Fix the orders check and add lengths validation.

Two issues in this method:

  1. Line 288 uses !empty($orders) which will incorrectly reject indexes when the orders array contains only empty string placeholders. Use array_filter to detect actual order values.
  2. Missing validation to reject length specifications for spatial indexes.

Apply this diff:

+        $lengths = $index->getAttribute('lengths', []);
         if (!empty($orders) && !$this->supportForSpatialIndexOrder) {
-            $this->message = 'Spatial indexes with explicit orders are not supported. Remove the orders to create this index.';
+        if (\count(\array_filter($orders)) > 0 && !$this->supportForSpatialIndexOrder) {
+            $this->message = 'Spatial indexes with explicit orders are not supported. Remove the orders to create this index.';
             return false;
         }
+
+        if (\count(\array_filter($lengths)) > 0) {
+            $this->message = 'Spatial indexes do not support lengths';
+            return false;
+        }
src/Database/Validator/Query/Filter.php (1)

238-249: Good guard: restrict vector queries to vector, non-array attributes.

This prevents misuse early. Consider deduping with the vector cases in isValid() to avoid double-checking in two places, or leave as-is for defense in depth.

Note: Using “size” over “dimensions” here aligns with prior review guidance.

🧹 Nitpick comments (1)
src/Database/Validator/Query/Filter.php (1)

159-177: Harden vector value validation: enforce list, finite numerics, and valid size config.

Current checks allow associative arrays, zero/undefined size, and ±INF/NaN. Tighten validation and align messages.

 case Database::VAR_VECTOR:
-    // For vector queries, validate that the value is an array of floats
-    if (!is_array($value)) {
+    // For vector queries, validate that the value is a list of finite numerics with correct size
+    if (!\is_array($value)) {
         $this->message = 'Vector query value must be an array';
         return false;
     }
-    foreach ($value as $component) {
-        if (!is_numeric($component)) {
-            $this->message = 'Vector query value must contain only numeric values';
-            return false;
-        }
-    }
-    // Check size match
-    $expectedSize = $attributeSchema['size'] ?? 0;
-    if (count($value) !== $expectedSize) {
-        $this->message = "Vector query value must have {$expectedSize} elements";
-        return false;
-    }
+    // Ensure sequential integer keys (list)
+    if (\array_values($value) !== $value) {
+        $this->message = 'Vector query value must be a list (sequential array)';
+        return false;
+    }
+    $expectedSize = $attributeSchema['size'] ?? null;
+    if (!\is_int($expectedSize) || $expectedSize <= 0) {
+        $this->message = 'Vector attribute "' . $attribute . '" is missing a valid size configuration';
+        return false;
+    }
+    if (\count($value) !== $expectedSize) {
+        $this->message = "Vector query value must have {$expectedSize} elements";
+        return false;
+    }
+    foreach ($value as $component) {
+        if (!\is_numeric($component) || !\is_finite((float)$component)) {
+            $this->message = 'Vector query value must contain only finite numeric values';
+            return false;
+        }
+    }
     continue 2;

Optional: reuse the dedicated Vector validator introduced in this PR for a single source of truth (signature permitting), to avoid drift from attribute creation rules.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 363cc8d and b5a3c95.

📒 Files selected for processing (2)
  • src/Database/Validator/Index.php (5 hunks)
  • src/Database/Validator/Query/Filter.php (3 hunks)
🧰 Additional context used
🧬 Code graph analysis (2)
src/Database/Validator/Index.php (2)
src/Database/Database.php (1)
  • Database (37-8277)
src/Database/Document.php (2)
  • getAttribute (224-231)
  • Document (12-470)
src/Database/Validator/Query/Filter.php (3)
src/Database/Database.php (2)
  • Database (37-8277)
  • count (7062-7111)
src/Database/Query.php (1)
  • Query (8-1169)
src/Database/Validator/Query/Order.php (1)
  • isValidAttribute (29-54)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (11)
  • GitHub Check: Adapter Tests (SharedTables/SQLite)
  • GitHub Check: Adapter Tests (MySQL)
  • GitHub Check: Adapter Tests (SharedTables/Postgres)
  • GitHub Check: Adapter Tests (Postgres)
  • GitHub Check: Adapter Tests (Pool)
  • GitHub Check: Adapter Tests (SQLite)
  • GitHub Check: Adapter Tests (SharedTables/MySQL)
  • GitHub Check: Adapter Tests (Mirror)
  • GitHub Check: Adapter Tests (MariaDB)
  • GitHub Check: Adapter Tests (SharedTables/MariaDB)
  • GitHub Check: Unit Test
🔇 Additional comments (3)
src/Database/Validator/Index.php (3)

29-46: Constructor refactoring looks good.

The migration to constructor-promoted properties is clean and the $attributes array is properly initialized in the loop body. The new $supportForVectorIndexes parameter aligns with the broader vector support expansion across the codebase.


296-322: LGTM!

This method correctly prevents creating non-spatial indexes (KEY, FULLTEXT, etc.) on spatial attributes, which would be invalid. The logic and error messaging are clear.


400-408: LGTM!

The new validation methods are properly integrated into isValid, following the existing early-return pattern. The ordering is logical: spatial index validation → prevent non-spatial indexes on spatial attributes → vector index validation.

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
@abnegate abnegate merged commit d8fc1bb into main Oct 17, 2025
15 checks passed
@abnegate abnegate deleted the feat/postgresql-vector-support branch October 17, 2025 14:24
@coderabbitai coderabbitai bot mentioned this pull request Nov 12, 2025
Closed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants