-
Notifications
You must be signed in to change notification settings - Fork 0
PHPORM-33: Add Query::toMql()
and tests on query syntax
#4
Conversation
055cfb1
to
ec40454
Compare
*/ | ||
public function getFresh($columns = [], $returnLazy = false) | ||
public function toMql($columns = []): array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want toMql
to return a MQL string like db.users.find({})
, I can move this code in an intermediary private method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want toMql to return a MQL string like
db.users.find({})
What is an "MQL string" in that case? db.users.find({})
returns a cursor, which the shell's REPL iterates automatically to display a batch of documents. I'm not sure what that has to do with MQL.
Also, I cannot for the life of me find an actual definition of MQL in the server documentation. "MongoDB Query Language (MQL)" only seems to be mentioned in some Atlas docs and blog posts.
I used to think it referred to the query document syntax but given some of the usage in the above I'm not so sure that's the case anymore.
I certainly wouldn't have expected this method's return value (operation name followed by its arguments) to be considered MQL. That said, I can see how the match syntax alone might be insufficient to convey the query behavior, since it omits options like skip
and limit
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The term MQL is a vague concept, it represent the calls to the high-level library. Ideally we could have an "export query for mongosh" feature in a profiler panel, but we don't have a 100% match in PHP and mongosh method signatures. That would work for the find
, countDocuments
, distinct
, aggregate
.
That said, I can see how the match syntax alone might be insufficient to convey the query behavior, since it omits options like
skip
andlimit
.
They can be formatted as find
options or chained method calls.
Options:
[
'find' => [[], ['skip' => 5, 'limit' => 10]],
]
Chained method calls:
[
'find' => [],
'skip' => [5],
'take' => [10],
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could use the driver events to track the query, but it too deep and may change with server version and maybe the connection string.
In this case, we'd track outgoing events (so CommandStartedEvent
) to verify the command being sent to the server. While it is possible that it changes based on the connection string, such instances are limited (e.g. adding $readPreference
when sending a query to mongos). With that in mind, it would indeed be feasible to use the event system to inspect queries, but that would still require a live database server to run the tests. IMO checking query generation without a MongoDB server has some advantages, but we shouldn't limit our testing to it.
composer.json
Outdated
@@ -23,7 +23,8 @@ | |||
"illuminate/container": "^10.0", | |||
"illuminate/database": "^10.0", | |||
"illuminate/events": "^10.0", | |||
"mongodb/mongodb": "^1.15" | |||
"mongodb/mongodb": "^1.15", | |||
"ext-mongodb": "*" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍, but I would move this to the top of the list.
src/Query/Builder.php
Outdated
@@ -279,21 +278,12 @@ public function getFresh($columns = [], $returnLazy = false) | |||
$aggregations = blank($this->aggregate['columns']) ? [] : $this->aggregate['columns']; | |||
|
|||
if (in_array('*', $aggregations) && $function == 'count') { | |||
// When ORM is paginating, count doesnt need a aggregation, just a cursor operation | |||
// When ORM is paginating, count doesn't need an aggregation, just a cursor operation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correction here: it isn't a cursor operation.
For context, the legacy driver used to have a Cursor::count
method, but that would send off a separate count
command which can yield different results. That is the reason why this was omitted from the Cursor
class in the new driver.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not clear to me, is there something to change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @alcaeus was commenting on the "just a cursor operation" phrase in this comment. The link to https://docs.mongodb.com/manual/reference/method/cursor.count/ a few lines below is related I'm sure.
Since Collection::count()
is deprecated, I wonder if this should instead be using countDocuments()
or estimatedDocumentCount()
depending on whether a query filter is used or not, respectively.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll change to use countDocuments
. The name estimatedDocumentCount
make me feel that we shouldn't use it silently.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Imported a lot of tests from Eloquent, and fixed som bugs spotted by this tests.
Should be added as part of a new feature:
whereDate
,whereDay
,whereMonth
,whereYear
,whereTime
- Operations accepting "raw" fields.
whereFulltext
using$text
indexesunion
andunionAll
using aggregationssubselect
using aggregationsinRandomOrder
having
in conjonction with groups using aggregationswhereExists
- Jointures using aggregations
To be tested
- Aggregations
- Like
src/Query/Builder.php
Outdated
$direction = -1; | ||
break; | ||
default: | ||
throw new \InvalidArgumentException('Order direction must be "asc" or "desc".'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bugfix to throw an exception when using an incorrect value for $direction
. Covered by tests imported from Illuminate\Tests\Database\DatabaseQueryBuilderTest::testOrderByInvalidDirectionParam()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if we have CS setup for this project yet, but should there be a use
statement for InvalidArgumentException to avoid the global reference here?
On a separate note, I think we can omit the period after an exception message if there's only one sentence/phrase. That's what is done in PHPC/PHPLIB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is php-cs-fixer, the top-level classes doesn't require to be in use statements. We can decide to change the CS. https://jira.mongodb.org/browse/PHPORM-52
$item = new UTCDateTime($item->format('Uv')); | ||
} | ||
}); | ||
if (is_array($where['values'])) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bugfix to not convert Carbon\CarbonPeriod
properties $startDate
and $endDate
into MongoDB\BSON\UTCDateTime
. CarbonCarbonPeriod::getStartDate()
calls $this->startDate->avoidMutation()
which does not exist on the MongoDB class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
AFAICT, the originally array_walk_recursive()
call was just wrapped in an is_array($where['values'])
condition. How does that fix the bug?
It's not clear to me how the original code would have converted anything other than a DateTimeInterface into UTCDateTime.
I see that Carbon/Carbon
extends DateTime (and thus implements the interface), but it's not clear to me where Carbon/CarbonPeriod
comes into play or why its internal logic would be relevant during this query compilation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure that fixed the bug as described in the comment. I'll digg why the CarbonPeriod
properties were updated.
@@ -997,6 +1028,10 @@ protected function compileWheres(): array | |||
$method = "compileWhere{$where['type']}"; | |||
$result = $this->{$method}($where); | |||
|
|||
if ($not) { | |||
$result = ['$not' => $result]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bugfix to support the $not
operations! whereNot
and orWhereNot
.
@@ -1101,7 +1136,7 @@ protected function compileWhereIn(array $where): array | |||
{ | |||
extract($where); | |||
|
|||
return [$column => ['$in' => array_values($values)]]; | |||
return [$column => ['$in' => Arr::flatten($values)]]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a bugfix to support nested arrays. Test imported from Illuminate\Tests\Database\DatabaseQueryBuilderTest::testBasicWhereIns
if ($values instanceof CarbonPeriod) { | ||
$values = [new UTCDateTime($values->start->format('Uv')), new UTCDateTime($values->end->format('Uv'))]; | ||
} else { | ||
$values = Arr::flatten($values); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Bugfix to accept nested arrays in whereBetween
. Test imported from Illuminate\Tests\Database\DatabaseQueryBuilderTest::testWhereBetweens
} | ||
|
||
/** @internal This method is not supported by MongoDB. */ | ||
public function whereColumn($first, $operator = null, $second = null, $boolean = 'and') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All this methods from the base Eloquent builder silently do nothing. By overloading I can add the @internal
annotation so that they are not documented and appears strikethrough in good IDEs.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noted that these are tested in testEloquentMethodsNotSupported()
. Good job.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sharing some feedback from an initial, incomplete pass through the PR. Some of this query builder syntax is concerning, to say the least.
*/ | ||
public function getFresh($columns = [], $returnLazy = false) | ||
public function toMql($columns = []): array |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want toMql to return a MQL string like
db.users.find({})
What is an "MQL string" in that case? db.users.find({})
returns a cursor, which the shell's REPL iterates automatically to display a batch of documents. I'm not sure what that has to do with MQL.
Also, I cannot for the life of me find an actual definition of MQL in the server documentation. "MongoDB Query Language (MQL)" only seems to be mentioned in some Atlas docs and blog posts.
I used to think it referred to the query document syntax but given some of the usage in the above I'm not so sure that's the case anymore.
I certainly wouldn't have expected this method's return value (operation name followed by its arguments) to be considered MQL. That said, I can see how the match syntax alone might be insufficient to convey the query behavior, since it omits options like skip
and limit
.
src/Query/Builder.php
Outdated
@@ -279,21 +278,12 @@ public function getFresh($columns = [], $returnLazy = false) | |||
$aggregations = blank($this->aggregate['columns']) ? [] : $this->aggregate['columns']; | |||
|
|||
if (in_array('*', $aggregations) && $function == 'count') { | |||
// When ORM is paginating, count doesnt need a aggregation, just a cursor operation | |||
// When ORM is paginating, count doesn't need an aggregation, just a cursor operation |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think @alcaeus was commenting on the "just a cursor operation" phrase in this comment. The link to https://docs.mongodb.com/manual/reference/method/cursor.count/ a few lines below is related I'm sure.
Since Collection::count()
is deprecated, I wonder if this should instead be using countDocuments()
or estimatedDocumentCount()
depending on whether a query filter is used or not, respectively.
|
||
// Return results | ||
return new Collection($results); | ||
return ['aggregate' => [$pipeline, $options]]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may be outside the scope of the PR, but I'm reading through the existing code above and it looks like this constructs a pipeline in a fixed order based on query builder options. I could see that being relevant for $this->groups
(assuming it's mimicking a group operation from the deprecated group command) but it seems odd for $this->aggregate
since there's no flexibility to construct a pipeline.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
$this->aggregate
can only contain function
and columns
. I'll try to add tests on this feature.
$builder = $this->getBuilder(); | ||
$builder->where('_id', '=', 1)->orWhereNull(['_id', 'expires_at']); | ||
$this->assertSame(['find' => [ | ||
['$or' => [['_id' => 1], ['_id' => null], ['expires_at' => null]]], |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As with my comment below, this query seems unintuitive. I would have expected the orWhereNull()
conditions to be grouped within an $and
operator.
// can accept some nested arrays as values. | ||
$builder = $this->getBuilder(); | ||
$builder->whereIn('id', [ | ||
['issue' => 45582], | ||
['id' => 2], | ||
[3], | ||
]); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems wrong. Nested arrays are actually matched by $in
. Consider:
Enterprise rs0 [direct: primary] test> db.coll.insertOne({x:1})
{
acknowledged: true,
insertedId: ObjectId("64acb00e70dc0350a6d117ac")
}
Enterprise rs0 [direct: primary] test> db.coll.insertOne({x:[1,2,3]})
{
acknowledged: true,
insertedId: ObjectId("64acb01370dc0350a6d117ad")
}
Enterprise rs0 [direct: primary] test> db.coll.find({x:{$in:[1,2,3,4]}})
[
{ _id: ObjectId("64acb00e70dc0350a6d117ac"), x: 1 },
{ _id: ObjectId("64acb01370dc0350a6d117ad"), x: [ 1, 2, 3 ] }
]
Enterprise rs0 [direct: primary] test> db.coll.find({x:{$in:[[1,2,3]]}})
[ { _id: ObjectId("64acb01370dc0350a6d117ad"), x: [ 1, 2, 3 ] } ]
Enterprise rs0 [direct: primary] test> db.coll.find({x:{$in:[[1,2,3,4]]}})
In the first query, both documents match because x
either is or has a value within the array [1,2,3,4]
.
In the second query, the [1,2,3]
array within the $in
array is treated as an exact match for only a single document.
The third query matches nothing, since neither document has the value [1,2,3,4]
for x
.
Even if array keys were to be discarded here, it seems wrong that:
[
['issue' => 45582],
['id' => 2],
[3],
]
...would compile to the same query as the previous test:
['issue' => 45582, 'id' => 2, 3]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, I won't put Arr::flatten()
and leave the nested arrays.
} | ||
|
||
/** @internal This method is not supported by MongoDB. */ | ||
public function whereColumn($first, $operator = null, $second = null, $boolean = 'and') |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noted that these are tested in testEloquentMethodsNotSupported()
. Good job.
Co-authored-by: Jeremy Mikola <jmikola@gmail.com>
@@ -1147,6 +1173,12 @@ protected function compileWhereBetween(array $where): array | |||
{ | |||
extract($where); | |||
|
|||
if ($values instanceof CarbonPeriod) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fix PHPORM-33
This is the preliminary work before writing all the tests. It presents 2 ways of writing the tests:
We could use the driver events to track the query, but it too deep and may change with server version and maybe the connection string.
toMql()
method for that.In this PR
Mongodb\Query\Builder::toMql()
to mimic theIlluminate\Database\Query::toSql()
. No need for atoRawSql()
equivalent because there is no data binding.Illuminate\Tests\Database\DatabaseQueryBuilderTest
The MQL format (my proposition)
MQL is a javascript-like format.
The collection name is not in the dumped format.
Or in split version:
When chaining, add several calls to the array. This does not accept calls to the same method. But the query builder never use method chaining.