Fetch model from ID when filling results #55

hipsterjazzbo · 2017-01-26T06:19:42Z

At the moment, if I specify only certain fields to be indexed (with $searchable for example), only those fields will be filled in search results.

EloquentFiller::fill() should re-retrieve the models from the DB fresh, and use those to populate the results.

An example of such a modification:

public function fill(Model $model, Result $result)
{
    // Get all the _id values from the elastic results
    $ids = $result->hits()->pluck('_id')->toArray();

    // Re-fetch all those models in a single query
    $hits = $model->newQueryWithoutScopes()->whereIn($model->getQualifiedKeyName(), $ids)->get();

    // Populate the results
    $result->setHits($hits);
}

The reason I haven't made this a PR is because I can see that EloquentFiller is doing a number of other things besides just spinning up Model instances. If you could enumerate what else it's doing for me I am happy to do the work. I'm using this package quite extensively :)

The text was updated successfully, but these errors were encountered:

sleimanx2 · 2017-01-26T17:23:50Z

Having the data fetched from the relational database will help with the consistency of the data however we might want to look at the drawbacks that this approach might cause.

1- at each search request we are calling both elasticsearch and rdb which is avoidable in the majority of the scenarios.

2- there are a lot of scenarios when you want to store a relationship value with the parent object in elastic search for instance a post tags. In those scenarios we won't be able eager load the relationships so we will face n+1 problem.

3- in a read only application where the data is stored by another application only in elasticsearch. In this case we won't be able to use the package because we will try to hit the rdb !

I think having something like that could help

$result = Book::search()->match('title','foo')->get();

$resultFresh = $result->hits()->fresh()->with('tags','author')->get();

what do you think ?

hipsterjazzbo · 2017-01-26T21:07:45Z

Those are good points. My thoughts:

IMHO, selecting from the DB isn't a big deal — Laravel itself does it all the time (there's even a trait for it), and selecting by ID is nice and fast.
Regarding relationships, we absolutely could handle that if I understand you correctly. Given your example, if the raw elastic results include a tags property, we could simply use lazy eager loading:
```
// ... looping through elastic result properties
if ($property instanceof Relationship)
    // Eager load relationship from DB in a single query
    $model->load($property);
}
```
Again, IMHO, that fact the Plastic always returns Model instances implicitly links them to the DB. I believe the correct approach would be to fetch from the DB by default, and to continue to offer raw results on demand, via the direct access to the DSL that Plastic offers.

In other words, I don't believe that read-only application example is valid, because the application would already have to have Model classes defined which would by definition break without a DB anyway.

Again, if you agree on these points, I'm more than happy to contribute a PR :)

sleimanx2 · 2017-01-28T15:57:36Z

@hipsterjazzbo I think we are getting closer to a solution.

The lazy-eager loading idea could solve a lot of performance issues when re-fetching the document from the DB.

On the other side the performance difference can still be significant the SerializesModels that you mentioned is used in queues which totally makes sense in that context as for search requests where you need your app to be as interactive as possible you need to reduce unnecessary IO take advantage of the speed of ES.

Moreover, since we are going to hit the database another time why not give the user the ability to add additional query filters before executing the query?

What are your concerns for the proposed API?

As as side note the models dont break unless you try to persist so read-only is okey and convenient. I know a couple of projects where they use it as read only.

Finally the most important thing is that in production most probably you will have ES clustered as opposed to MYSQL where you might have only one node. if you force retrieval from DB you are discarding all the benefits of having a cluster (Load, Speed, Max Requests Number ...).

hipsterjazzbo · 2017-02-02T07:24:40Z

I still feel like it wouldn't be an issue, in terms of hitting db. It'd be one fast select statement.

Also, the whole reason this came up is that my models are in fact broken — I customise the data I store in elastic for performance reasons. One of the things I don't store is a team_id column, which means I can no longer load that relationship for models that come from Plastic.

nmkr · 2017-04-04T09:30:50Z

+1 for an option to fetch fresh values..
i have a trait and a custom method for outputting an image from storage, it gets detected as realtionship and we cannot output the right format.

hipsterjazzbo · 2017-04-13T05:23:31Z

I've been thinking about this more.

The core issue here is that hits() returns model instances that are not real model instances. If one customises the data that is stored in the index, instead of just using the default, the models are then broken.

I propose this:

hits() returns elastic documents directly, Basically just the json_decode()-ed results.
A new method on the results class, models(), which does similar to my suggestion above: Fetches the models on demand from the DB, in a single query.

This keeps the advantages of the current implementation, in that is stays fast, and doesn't force models to be selected, while also 1. making models available very easily if needed, and 2. not presenting models that don't actually relate to the database like they should.

Thoughts, @sleimanx2?

hipsterjazzbo · 2017-04-13T05:39:27Z

Further thoughts:

Doing it this way also opens up the door for future, more advanced usage. For example:

In additon to hits() and models(), we could add something like query() which would return a Illuminate\Database\Eloquent\Query like I put above:

$model->newQueryWithoutScopes()->whereIn($model->getQualifiedKeyName(), $ids);

Which could then be built on if needed.

sleimanx2 added enhancement feature help wanted labels Apr 5, 2017

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fetch model from ID when filling results #55

Fetch model from ID when filling results #55

hipsterjazzbo commented Jan 26, 2017

sleimanx2 commented Jan 26, 2017 •

edited

Loading

hipsterjazzbo commented Jan 26, 2017

sleimanx2 commented Jan 28, 2017

hipsterjazzbo commented Feb 2, 2017

nmkr commented Apr 4, 2017

hipsterjazzbo commented Apr 13, 2017

hipsterjazzbo commented Apr 13, 2017

Fetch model from ID when filling results #55

Fetch model from ID when filling results #55

Comments

hipsterjazzbo commented Jan 26, 2017

sleimanx2 commented Jan 26, 2017 • edited Loading

hipsterjazzbo commented Jan 26, 2017

sleimanx2 commented Jan 28, 2017

hipsterjazzbo commented Feb 2, 2017

nmkr commented Apr 4, 2017

hipsterjazzbo commented Apr 13, 2017

hipsterjazzbo commented Apr 13, 2017

sleimanx2 commented Jan 26, 2017 •

edited

Loading