-
Notifications
You must be signed in to change notification settings - Fork 52
Fix validation #749
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix validation #749
Conversation
WalkthroughAdds a CLI Changes
Sequence Diagram(s)sequenceDiagram
participant CLI as CLI (operators task)
participant Task as Task Action
participant Env as setupTestEnvironment
participant DB as Database
participant Seed as seedDocuments
participant Results as displayResults
CLI->>Task: parse args (adapter, iterations, seed, name)
Task->>Env: setupTestEnvironment(database, name, seed)
Env->>Seed: if seed > 0 -> seedDocuments(database, seed)
Seed-->>Env: seeded count, time
Task->>DB: run benchmark iterations
DB-->>Task: results
Task->>Results: displayResults(results, adapter, iterations, seed)
Results-->>CLI: log results table (includes Seeded documents)
sequenceDiagram
participant DB as Database
participant Struct as Structure
participant OpVal as OperatorValidator
participant OpFactory as DatabaseOperator
DB->>Struct: validate attribute value(s) (may include operators)
Struct->>OpFactory: detect/parse value as Operator
alt value is Operator
OpFactory-->>Struct: Operator instance
Struct->>OpVal: construct with currentDocument
Struct->>OpVal: isValid(operator)
alt valid
OpVal-->>Struct: OK (skip standard validation)
else invalid
OpVal-->>Struct: error -> Structure returns false
end
else not Operator
Struct->>Struct: perform standard attribute-type validation
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes
Possibly related PRs
Suggested reviewers
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (3)
💤 Files with no reviewable changes (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
🔇 Additional comments (1)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/Database/Database.php (1)
30-30: Fix linter: remove unused import.
use Utopia\Database\Validator\Operator as OperatorValidator;is unused after the refactor and triggers the PSR-12 no_unused_imports failure. Remove it.-use Utopia\Database\Validator\Operator as OperatorValidator;
🧹 Nitpick comments (3)
src/Database/Database.php (2)
4955-4955: Merge overlay is fine; ensure $id can’t be mutated.
array_merge($old, $new)is correct, but consider explicitly unsetting/ignoring$idfrom user input to prevent accidental id mutation on write.Does your adapter ignore
$idupdates? If not, add:- $document = \array_merge($old->getArrayCopy(), $document->getArrayCopy()); + $incoming = $document->getArrayCopy(); + unset($incoming['$id']); + $document = \array_merge($old->getArrayCopy(), $incoming);
5123-5134: Refetch-on-operator is correct; consider deduping detection.The refetch logic is solid. To avoid duplicating operator scans in multiple paths, extract a small helper (e.g.,
private function hasOperators(Document|array $data): bool) and reuse.bin/tasks/operators.php (1)
195-248: LGTM! Well-implemented seeding function with good batching and progress feedback.The function effectively seeds documents in batches, provides progress updates, and uses realistic test data. The deterministic ID scheme (
seed_0,seed_1, etc.) is good for reproducibility.Optional nitpick (Line 203): The
(int)cast is redundant sinceceil()returns a float that will be implicitly converted to int in theforloop context:- $batches = (int) ceil($count / $batchSize); + $batches = ceil($count / $batchSize);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
bin/tasks/operators.php(7 hunks)src/Database/Database.php(8 hunks)src/Database/Validator/Operator.php(1 hunks)src/Database/Validator/Structure.php(3 hunks)tests/unit/Validator/OperatorTest.php(1 hunks)tests/unit/Validator/StructureTest.php(2 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-07-30T19:17:53.630Z
Learnt from: ArnabChatterjee20k
PR: utopia-php/database#642
File: src/Database/Validator/PartialStructure.php:43-52
Timestamp: 2025-07-30T19:17:53.630Z
Learning: In PartialStructure validator, when filtering for required attributes validation using the $requiredAttributes parameter, $this->attributes should be used instead of the merged $attributes array because this validation is specifically for internal attributes like $createdAt and $updatedAt that are defined in the base Structure class, not collection-specific attributes.
Applied to files:
tests/unit/Validator/StructureTest.phpsrc/Database/Validator/Structure.phpsrc/Database/Database.php
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.
Applied to files:
tests/unit/Validator/StructureTest.phpsrc/Database/Database.php
📚 Learning: 2025-10-29T12:27:57.071Z
Learnt from: ArnabChatterjee20k
PR: utopia-php/database#747
File: src/Database/Adapter/Mongo.php:1449-1453
Timestamp: 2025-10-29T12:27:57.071Z
Learning: In src/Database/Adapter/Mongo.php, when getSupportForAttributes() returns false (schemaless mode), the updateDocument method intentionally uses a raw document without $set operator for replacement-style updates, as confirmed by the repository maintainer ArnabChatterjee20k.
Applied to files:
src/Database/Database.php
🧬 Code graph analysis (6)
tests/unit/Validator/OperatorTest.php (3)
src/Database/Database.php (1)
Database(38-8506)src/Database/Validator/Operator.php (3)
Operator(10-387)isValid(60-89)getDescription(46-49)src/Database/Operator.php (16)
increment(409-416)decrement(425-432)multiply(510-517)divide(527-537)modulo(589-595)arrayAppend(441-444)arrayUnique(619-622)arrayIntersect(630-633)arrayDiff(641-644)arrayFilter(653-656)stringConcat(486-489)stringReplace(498-501)toggle(544-547)dateAddDays(556-559)dateSubDays(567-570)dateSetNow(577-580)
tests/unit/Validator/StructureTest.php (2)
src/Database/Validator/Operator.php (3)
Operator(10-387)isValid(60-89)getDescription(46-49)src/Database/Validator/Structure.php (3)
Structure(20-460)isValid(211-245)getDescription(197-200)
src/Database/Validator/Operator.php (1)
src/Database/Operator.php (1)
parse(311-324)
src/Database/Validator/Structure.php (3)
src/Database/Validator/Operator.php (3)
Operator(10-387)isValid(60-89)getDescription(46-49)src/Database/Document.php (2)
Document(12-470)setAttribute(244-261)src/Database/Operator.php (1)
isOperator(664-667)
src/Database/Database.php (3)
src/Database/Document.php (2)
getArrayCopy(423-458)isEmpty(396-399)src/Database/Validator/Operator.php (1)
Operator(10-387)src/Database/Operator.php (2)
Operator(14-698)isOperator(664-667)
bin/tasks/operators.php (3)
src/Database/Database.php (2)
Database(38-8506)createDocuments(4485-4588)src/Database/Adapter/SQL.php (1)
createDocuments(2411-2544)src/Database/Adapter/Mongo.php (1)
createDocuments(1343-1382)
🪛 GitHub Actions: Linter
src/Database/Database.php
[error] 1-1: Pint lint failure: PSR-12 no_unused_imports detected in src/Database/Database.php.
🪛 PHPMD (2.15.0)
tests/unit/Validator/OperatorTest.php
147-147: Avoid unused local variables such as '$validator'. (undefined)
(UnusedLocalVariable)
153-153: Avoid unused local variables such as '$operator'. (undefined)
(UnusedLocalVariable)
158-158: Avoid unused local variables such as '$validator'. (undefined)
(UnusedLocalVariable)
164-164: Avoid unused local variables such as '$operator'. (undefined)
(UnusedLocalVariable)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Setup & Build Docker Image
🔇 Additional comments (10)
src/Database/Database.php (5)
4979-4984: Good: early operator detection to decide update.Scanning for
Operator::isOperator($value)to set$shouldUpdateis clear and efficient.
5103-5105: Passing old document into Structure validator.LGTM, but confirm the constructor signature was updated to accept
$oldand that all call sites match.Run your test suite touching Structure/PartialStructure to ensure no signature mismatch.
5254-5261: PartialStructure with historical context (null in bulk).Looks good. Null old-doc is reasonable for bulk updates; just confirm PartialStructure safely handles null and doesn’t apply old-dependent checks.
Also applies to: 5263-5265
6028-6033: Structure validator with$oldin upsert path.Good improvement; enables validations that depend on prior state.
5350-5356: Review comment is incorrect.The concern assumes
castingBefore()modifies or removes Operator objects. This is unfounded:
castingBefore()in Mongo.php only processes attributes defined in the collection schema (viagetAttribute('attributes', [])). Operator instances passed in$updatesare not schema-defined attributes and pass through unmodified.castingBefore()in SQL.php is a no-op.- Type conversions only apply to specific schema types (e.g., VAR_DATETIME → UTCDateTime). Operator instances don't match any conversion and remain intact.
Operator::isOperator()simply checksinstanceof, which succeeds whether the detection happens before or after casting.The operator detection after
castingBefore()works correctly—there is no false negative risk.Likely an incorrect or invalid review comment.
bin/tasks/operators.php (5)
11-15: LGTM! Clear documentation with helpful examples.The documentation effectively explains the new seed capability and provides practical usage examples for different adapters and seed values.
44-46: LGTM! CLI parameter properly integrated.The seed parameter is well-defined with appropriate validation, sensible default (0 = no seeding), and the action signature is correctly updated to accept it.
141-141: LGTM! Function signature appropriately extended.The signature change is clean and the conditional seeding logic (lines 187-190) appropriately skips seeding when
$seedis 0.
916-924: LGTM! Results display consistently updated.The function signature and display output are appropriately extended to show the seeded document count, providing complete benchmark context.
44-248: Excellent integration of seed capability for benchmark testing.The seed feature is comprehensively implemented with:
- Clear documentation and usage examples
- Appropriate CLI parameter with sensible defaults
- Efficient batching strategy for large datasets
- Consistent updates across all function signatures and call sites
- Good user feedback via progress logging
This enhancement enables testing operator performance under varying data conditions, which is valuable for realistic benchmarking scenarios.
| if (Operator::isOperator($value)) { | ||
| // Set the attribute name on the operator for validation | ||
| $value->setAttribute($key); | ||
|
|
||
| $operatorValidator = new OperatorValidator($this->collection, $this->currentDocument); | ||
| if (!$operatorValidator->isValid($value)) { | ||
| $this->message = $operatorValidator->getDescription(); | ||
| return false; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Handle array-backed attribute metadata before invoking OperatorValidator
OperatorValidator still builds its attribute map by calling getAttribute()/getId() on each entry returned from $collection->getAttribute('attributes'). In many code paths (including this test fixture) the collection metadata is hydrated from plain arrays, so as soon as we hit this new branch we instantiate OperatorValidator with array entries and trigger a fatal Call to a member function getAttribute() on array. That turns operator-based updates into hard crashes instead of validation errors. Please normalise the collection attributes to Document instances (or update OperatorValidator to accept arrays) before constructing it here so this code path works for both array- and Document-backed metadata.
🤖 Prompt for AI Agents
In src/Database/Validator/Structure.php around lines 311 to 319, the new branch
that constructs OperatorValidator assumes collection attribute metadata are
Document objects and crashes when attributes are plain arrays; before creating
OperatorValidator, normalize the collection's 'attributes' entries into Document
instances (or otherwise convert array entries to the same object type
OperatorValidator expects) so getAttribute()/getId() calls are safe — detect
array entries and wrap/convert them into Document objects (preserving original
keys/values and any IDs) and then instantiate OperatorValidator with the
normalized collection metadata; alternatively, if preferred, adjust
OperatorValidator to accept arrays, but the simplest fix here is to coerce the
collection attributes to Document instances prior to constructing
OperatorValidator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/Database/Database.php (1)
5253-5260: PartialStructure on updates: ensure operator-aware validation.Validating $updates is right, but if Operator instances are present, PartialStructure/Structure must tolerate them (skip value-type checks for operator payloads). Please confirm or guard.
If not operator-aware, consider short-circuiting when operators exist:
@@ - $validator = new PartialStructure( + $containsOps = !empty(\Utopia\Database\Operator::extractOperators($updates->getArrayCopy())['operators']); + $validator = new PartialStructure( $collection, $this->adapter->getIdAttributeType(), $this->adapter->getMinDateTime(), $this->adapter->getMaxDateTime(), $this->adapter->getSupportForAttributes(), null // No old document available in bulk updates ); - if (!$validator->isValid($updates)) { + if (!$containsOps && !$validator->isValid($updates)) { throw new StructureException($validator->getDescription()); }Also applies to: 5262-5264
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.github/workflows/codeql-analysis.yml(1 hunks).github/workflows/linter.yml(1 hunks)src/Database/Database.php(8 hunks)
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-10-29T12:27:57.071Z
Learnt from: ArnabChatterjee20k
PR: utopia-php/database#747
File: src/Database/Adapter/Mongo.php:1449-1453
Timestamp: 2025-10-29T12:27:57.071Z
Learning: In src/Database/Adapter/Mongo.php, when getSupportForAttributes() returns false (schemaless mode), the updateDocument method intentionally uses a raw document without $set operator for replacement-style updates, as confirmed by the repository maintainer ArnabChatterjee20k.
Applied to files:
src/Database/Database.php
📚 Learning: 2025-07-30T19:17:53.630Z
Learnt from: ArnabChatterjee20k
PR: utopia-php/database#642
File: src/Database/Validator/PartialStructure.php:43-52
Timestamp: 2025-07-30T19:17:53.630Z
Learning: In PartialStructure validator, when filtering for required attributes validation using the $requiredAttributes parameter, $this->attributes should be used instead of the merged $attributes array because this validation is specifically for internal attributes like $createdAt and $updatedAt that are defined in the base Structure class, not collection-specific attributes.
Applied to files:
src/Database/Database.php
📚 Learning: 2025-10-03T02:04:17.803Z
Learnt from: abnegate
PR: utopia-php/database#721
File: tests/e2e/Adapter/Scopes/DocumentTests.php:6418-6439
Timestamp: 2025-10-03T02:04:17.803Z
Learning: In tests/e2e/Adapter/Scopes/DocumentTests::testSchemalessDocumentInvalidInteralAttributeValidation (PHP), when the adapter reports getSupportForAttributes() === false (schemaless), the test should not expect exceptions from createDocuments for “invalid” internal attributes; remove try/catch and ensure the test passes without exceptions, keeping at least one assertion.
Applied to files:
src/Database/Database.php
🧬 Code graph analysis (1)
src/Database/Database.php (3)
src/Database/Validator/Operator.php (2)
Operator(10-387)isValid(60-89)src/Database/Operator.php (2)
Operator(14-698)isOperator(664-667)src/Database/Validator/PartialStructure.php (2)
PartialStructure(8-62)isValid(19-61)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Adapter Tests (Schemaless/MongoDB)
- GitHub Check: Adapter Tests (SharedTables/MongoDB)
- GitHub Check: Adapter Tests (SharedTables/SQLite)
- GitHub Check: Adapter Tests (SharedTables/MySQL)
- GitHub Check: Adapter Tests (Mirror)
- GitHub Check: Adapter Tests (SharedTables/MariaDB)
- GitHub Check: Adapter Tests (Pool)
- GitHub Check: Adapter Tests (SharedTables/Postgres)
- GitHub Check: Adapter Tests (SQLite)
- GitHub Check: Adapter Tests (MariaDB)
- GitHub Check: Adapter Tests (Postgres)
- GitHub Check: Adapter Tests (MySQL)
- GitHub Check: Adapter Tests (MongoDB)
- GitHub Check: Unit Test
🔇 Additional comments (5)
.github/workflows/linter.yml (1)
19-20: Verify necessity and compatibility of the swoole Docker image for linting.The Docker image has been changed to
phpswoole/swoole:5.1.8-php8.3-alpine, which is a specialized async networking framework. This choice seems unusual for a database library's linter and CodeQL CI job unless the project explicitly uses swoole as a runtime dependency.Please clarify:
- Why is a swoole-specific image needed for linting and code analysis?
- Does this project depend on or use swoole?
- If this is a dependency, are there any compatibility concerns with this specific version (5.1.8) and PHP 8.3?
If swoole is not required for linting, consider reverting to a standard PHP 8.3 Alpine image (e.g.,
php:8.3-alpine) to reduce unnecessary image complexity and potential supply chain risk..github/workflows/codeql-analysis.yml (1)
19-20: Verify necessity and compatibility of the swoole Docker image for CodeQL analysis.The Docker image has been changed to
phpswoole/swoole:5.1.8-php8.3-alpine, mirroring the change inlinter.yml. The same concerns apply: a swoole-specific image seems unusual for code analysis of a database library unless the project explicitly depends on swoole as a runtime requirement.Please verify the same points as raised for
linter.yml:
- Does the project require or use swoole?
- Are there compatibility concerns with swoole 5.1.8 and PHP 8.3?
If swoole is not a runtime dependency, consider using a standard PHP 8.3 Alpine image instead to keep CI images minimal and focused.
src/Database/Database.php (3)
4954-4954: Shallow merge behavior: confirm intent for nulls and unknown keys.array_merge keeps new nulls (good for clearing), and preserves unknown user keys until validation. Just confirm this matches desired semantics for updates (especially with schemaless adapters).
5102-5103: Good: pass old document context into Structure.This enables validators to reason about transitions (e.g., bounds, immutability).
5299-5299: Side effect via reference: safe but ensure adapter honors $skipPermissionsUpdate only.Passing and mutating $batch by reference is fine; validate that no other transient fields (like $skipPermissionsUpdate) leak past castingAfter/cleanup.
Summary by CodeRabbit
New Features
Improvements
Tests
Chores