Skip to content

Add findByFields method #50

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 10 commits into from
May 5, 2021
Merged

Add findByFields method #50

merged 10 commits into from
May 5, 2021

Conversation

ryangoree
Copy link
Contributor

@ryangoree ryangoree commented Mar 4, 2021

Fixes #48

Added a findByFields method and refactored the DataLoader batch fn and tests to make it work.

This new method makes it possible to batch and cache calls for arbitrary fields on the collection.

example:

// Find by a single field
await this.findByFields({ tags: ['gaming', 'games'] }, { ttl: 1 })
// Resulting app level cache key (if the collection was "users"):
//   - 'mongo-users-{"tags":["gaming","games"]}'
// Resulting dataloader memoization cache keys:
//   - '{"tags":"gaming"}'
//   - '{"tags":"games"}'
//     .load() is called with each tag individually, so subsequent calls for either or both tags will load from memoization cache
// Resulting query:
//   - collection.find({ tags: { $in: ['gaming', 'games'] } })
// 

// Find by multiple fields
await this.findByFields({
  tags: ['gaming', 'games'],
  userName: 'testUser'
}, { ttl: 1 })
// Resulting app level cache key (if the collection was "users"):
//   - 'mongo-users-{"tags":["gaming","games"],"userName":["testUser"]}'
// Resulting dataloader memoization cache key:
//   - '{"tags":["gaming","games"],"userName":["testUser"]}'
//     Since the results need to match both the tags AND the userName, .load() is only called once with all the fields and values.
// resulting query:
//   - collection.find({
//       tags: { $in: ['gaming', 'games'] },
//       userName: { $in: ['testUser'] }
//     })

apollo-datasource-mongodb-tests-passed

@ryangoree
Copy link
Contributor Author

ryangoree commented Mar 4, 2021

Forgot to mention, this will also use $or to merge multiple pending requests with different fields.

Examples

// 2 calls for completely different fields -> merged with $or
const pendingUsers1 = this.findByFields({ userName: 'ryan' })
const pendingUsers2 = this.findByFields({ tags: 'gaming' })
await Promise.all([pendingUsers1, pendingUsers2]);
// resulting query:
//   - collection.find({
//       $or: [
//         { userName: { $in: ['ryan'] } },
//         { tags: { $in: ['gaming'] } }
//       ]
//     })

// 2 calls for partially different fields -> merged with $or
const pendingUsers1 = this.findByFields({ userName: 'ryan' })
const pendingUsers2 = this.findByFields({ userName: 'ryan', tags: 'gaming' })
await Promise.all([pendingUsers1, pendingUsers2]);
// resulting query:
//   - collection.find({
//       $or: [
//         { userName: { $in: ['ryan'] } },
//         {
//           userName: { $in: ['ryan'] },
//           tags: { $in: ['gaming'] } }
//         }
//       ]
//     })

// 2 calls for the same fields -> merged without $or
const pendingUsers1 = this.findByFields({ userName: 'ryan' })
const pendingUsers2 = this.findByFields({ userName: 'loren' })
await Promise.all([pendingUsers1, pendingUsers2]);
// resulting query:
//   - collection.find({ userName: { $in: ['ryan', 'loren'] } })

// Mixed
const pendingUsers1 = this.findByFields({ userName: 'ryan', tags: 'games' })
const pendingUsers2 = this.findByFields({ userName: 'loren', tags: 'gaming' })
const pendingUsers3 = this.findByFields({ tags: ['games', 'gaming'] })
await Promise.all([pendingUsers1, pendingUsers2, pendingUsers3]);
// resulting query:
//   - collection.find({
//       $or: [
//         {
//           userName: { $in: ['ryan', 'loren'] },
//           tags: { $in: ['games', 'gaming'] } }
//         },
//         { tags: { $in: ['games', 'gaming']  } }
//       ]
//     })

Copy link
Member

@lorensr lorensr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great, thanks! My efficiency brain wants to go straight to a #1 general solution instead of adding $in queries, but this is a definite improvement over the current feature set :)

return loader.load(JSON.stringify(filter))
})
)
docs = [].concat(...docsArray)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docs = [...docsArray] clones array, but why necessary?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

docsArray ends up being an array of arrays in this case ([ [{}, {}], [{}, {}] ]).

[].concat(...docsArray) is just a succinct way of merging them all into a single array.

@@ -89,48 +132,114 @@ describe('createCachingMethods', () => {
expect(collection.find.mock.calls.length).toBe(1)
})

// TODO why doesn't this pass?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like this to remain in the file until solved

Copy link
Contributor Author

@ryangoree ryangoree Mar 12, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.find() is only called once because the dataloader saves the result from the first call. The second call doesn't find anything in the InMemoryLRUCache since the ttl option wasn't passed in the first call, but it ends up calling the dataloader's .load() again with the same key as the first call which returns from memoization cache rather than calling .find() again.

it(`dataloader caches each value individually when finding by a single field`, async () => {
await api.findByFields({ tags: ['foo', 'baz'] }, { ttl: 1 })

const fooBazCacheValue = await cache.get(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cache is separate from dataloader. i suggest removing "dataloader" from the test title and creating other tests for dataloader (which should do memoization caching: https://github.com/graphql/dataloader#caching)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is testing the memoization caching. The desired behavior is that if findByFields is used with a single field and multiple values, each individual value should get cached by the dataloaders memoization cache (tested by calling with 2 values, 'foo' and 'baz', on line 185, then calling with a single value, 'foo', on line 195, then expecting find to have been called only once on line 196). However, the InMemoryLRUCache cache should not be affected by this (tested by ensuring the cache has a result for both values on line 192, but not for a single value, on line 193)

ryangoree and others added 2 commits March 12, 2021 03:02
Co-authored-by: Aditya Thakral <9at8@users.noreply.github.com>
@ryangoree ryangoree requested a review from lorensr March 15, 2021 16:59
@lorensr lorensr merged commit 135f8b8 into GraphQLGuide:master May 5, 2021
@lorensr
Copy link
Member

lorensr commented May 5, 2021

Sorry for the delay! And thanks again for contributing. Published in 0.4.0

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add api to handle other than id field
3 participants