Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimise iteration API #280

Merged
merged 3 commits into from
Jul 14, 2022
Merged

Optimise iteration API #280

merged 3 commits into from
Jul 14, 2022

Conversation

Pimm
Copy link
Collaborator

@Pimm Pimm commented Jul 12, 2022

Basic

This pull requests attempts to optimise the iterate method.

In this scenario, exactly 130 values are requested from the list payments endpoint, as the implementation correctly guesses that any additional values would be "wasted":

for await (let payment of mollieClient.payments.iterate().drop(30).take(100)) {
  console.log(payment);
}

In this scenario, a single request does not satisfy the demand. The Mollie API does not return pages of over 250 values. Therefore, the implementation requests two pages with 150 values each, rather than requesting 250 values twice "wasting" 200 values in the process:

for await (let payment of mollieClient.payments.iterate().take(300)) {
  console.log(payment);
}

Advanced

The iterate methods now return a lazy iterator: a container which creates and holds an upstream iterator when it is needed. As such, the implementation can watch for calls to take, drop, and filter and use that information to make an educated guess about the number of values which are to be consumed when the upstream iterator is created.

A demand of infinity is initially guessed. The chain is then traversed from right to left:

  • take reduces the guess to the lowest value out of the current guess and the limit passed to take.
  • drop increases the guess by the limit passed to drop.
  • filter resets the guess to infinity.

Limitations

Limits applied through other means than take cannot be detected. The implementation does not optimise the following snippet:

const payments = [];
for await (let payment of mollieClient.payments.iterate()) {
  payments.push(payment);
  if (payments.length == 10) {
    break;
  }
}

Because there is no way to predict how many values will satisfy a certain filter, filter before take is not optimised:

// (There is no way to predict how many payments are required to end up with 10.)
for await (let payment of mollieClient.payments.iterate().filter(payment => Math.random() > .5).take(10)) {
  console.log(payment);
}

filter after take is OK, of course.

Because any number of new iterators can be created from any iterator (through drop, filter, map, and take), those iterators together form a (rooted) tree. The educated guess is derived from the path between the root and the leaf which triggers the creation of the upstream iterator.

This does mean that in the following example, this class incorrectly guesses that only 10 values are required when in actuality 12 are required:

const iterator = mollieClient.payments.iterate();
for await (let payment of iterator.take(10)) {
  console.log(payment);
}
// (The endpoint has already been called, therefore this take is ignored.)
for await (let payment of iterator.take(2)) {
  console.log(payment);
}

@Pimm Pimm changed the title Pimm/demand Optimise iteration API Jul 12, 2022
Pimm added 2 commits July 13, 2022 12:35
The iterate methods now return a lazy iterator ‒ a container which creates and holds an upstream iterator when it is needed ‒ which makes an educated guess about the amount of values which are to be consumed.
When using the iterate methods, the DemandingIterator makes an educated guess about the number of values which are to be consumed. This guess is then used to request an optimal number of values from the Mollie API.
@Pimm Pimm marked this pull request as draft July 13, 2022 11:10
Now that the iteration API attempts to request the exact amount of pages required, it should not blindly request a new page just because the current page has been consumed. In the future, it might take the guessed demand into account.
@Pimm Pimm marked this pull request as ready for review July 13, 2022 14:16
Copy link
Collaborator

@dsempel dsempel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clever optimization!

@Pimm Pimm merged commit f5f5d7e into master Jul 14, 2022
@Pimm Pimm deleted the pimm/demand branch July 14, 2022 10:31
@Pimm Pimm mentioned this pull request Mar 8, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants