-
Notifications
You must be signed in to change notification settings - Fork 0
ExtendedIterable v2 #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looking excellent!
|
is there this looks great btw 🎉 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This seems to be fundamentally missing the entire functionality for calling return() (or throw()) on the source iterator. I added a couple tests in this branch https://github.com/HarperDB/extended-iterable/tree/test-return to verify this, but this seems to have been lost from the original version that handled this correctly. Am I missing something here?
I am also curious about the philosophy of using TypeScript (or enforcing types) for testing. Doesn't this limit the tests to type-safe JavaScript? We want this to work with regular JavaScript, and this seems like it is fundamentally limiting the breadth of the testing, when we actually want tests to be doing things that defy expectations (and defy type constraints) and verify that everything this works.
Anyway, there is certainly a lot of awesome stuff, great job with all the new methods!
@kriszyp it wasn't my intention to lose support for
I don't believe this is anything to be concerned about. I have tests that pass in invalid data of which I needed to cast it |
|
@kriszyp I've added full support for @Ethan-Arrowood I've removed the unnecessary transformer logic. It was an ill-designed AI suggestion that I wasn't smart enough at the time to recognize. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking great! On Monday I'm going to check this out locally and see if all the type stuff makes sense. Type casting is generally a code smell for me. But it has its place so I'll confirm on a second pass.
Otherwise I appreciate the exhaustiveness of the tests.
This is definitely close to done!
| * `callback`: `(value: T, index: number) => boolean | Promise<boolean>` - The | ||
| callback function to call for each result. | ||
|
|
||
| ##### Return value | ||
|
|
||
| * `T | Promise<T> | undefined | Promise<undefined>` - The first item of the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the return type become a Promise if the callback parameter returns a Promise? If so I think you may want to consider overloaded types. Right now, this could be interpreted as "regardless if callback returns boolean or Promise<boolean>, the return type could be any one of T | Promise<T> | undefined | Promise<undefined>".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Or does it have to do with the items in the iterable already?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And thus, this type is actually correct and the user should always consider that an iteration could be a promise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This ExtendedIterable support sync and async iterators as well as sync and async callbacks for the few methods that accept a callback. That's the type I came up with that made TypeScript happy, but certainly open to doing it better.
I actually use function overloading for reduce(). I just played around with function overloading for find() and I can define a sync version, async version, but I still need the combo version to make TS happy like this:
find(callback: (value: T, index: number) => boolean): T | undefined;
find(callback: (value: T, index: number) => Promise<boolean>): Promise<T | undefined>;
find(callback: (value: T, index: number) => boolean | Promise<boolean>): T | Promise<T | undefined> | undefined {I dunno how to make this situation better. The code is too flexible.
The general rule of thumb is you don't need to await the iterable if you know that both the iterator and the callback(s) are all sync. In rocksdb-js, both the iterator and callbacks are sync. I'm not sure where the requirement to support both sync and async things came from, but we have it now and it may come in handy.
| while (!result.done) { | ||
| if (currentIndex === index) { | ||
| if (iterator.return) { | ||
| return (iterator.return(result.value) as IteratorReturnResult<T>).value; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing a type cast here seems like a code smell to me. I'll try pulling this locally another time and taking a closer look to understand it better. You don't have to change anything here necessarily, I trust you have a good reason for it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another TypeScript issues. iterator.return() can return T or Promise<T> depending if we're chained to a sync Iterator or AsyncIterator. It's impossible for an async iterator to reach this line as it branches above to the #asyncAt() flow. The problem is I couldn't find a way to convince TypeScript that result.value is indeed T. There's no way to detect if iterator is sync or async until you call next(), return(), or throw().
Consumers won't need to worry about this as for-of and for-await-of will get the appropriate iterator.
So it was either cast it as IteratorReturnResult<T> or capture the return value from return(), then test to see if it's a promise and then await it and return the value. I opted to avoid the bloat.
| } | ||
| } | ||
|
|
||
| next(): IteratorResult<T> | Promise<IteratorResult<T>> | any { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the | any - i think that basically neutralizes the other types you've listed here. But I'll have to follow up after I run this locally and inspect the type system a bit closer
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I concur. I need the any to make TypeScript happy because this is both an Iterator and AsyncIterator. JavaScript has no problem with this, but in TypeScript, Iterator and AsyncIterator conflict. You're not supposed to do both. The best solution I could come up with is any.
src/iterators/drop-iterator.ts
Outdated
| import { BaseIterator, DONE } from './base-iterator.js'; | ||
|
|
||
| export class DropIterator<T> extends BaseIterator<T> { | ||
| #count!: number; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why is !: it seems like you assign it definitely in the constructor
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do this in 3 places: drop(), take(), and in concat(). I move the assignment above the validation in both drop() and take().
I don't want to/cant't do that for concat(other) as I would need to make the property that holds the other iterable be undefined, which it will never be, then I'd have assert the iterable is valid every where when I know it will be post constructor.
| #iterator: (Iterator<T> | AsyncIterator<T>) & { async?: boolean }; | ||
|
|
||
| constructor(iterator: IterableLike<T>) { | ||
| this.#iterator = resolveIterator(iterator); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to resolveIterator() to lazily create the iterator.
|
I looked through some of this PR. Some of the issues I found with what I have looked at so far:
I think we have discussed this before, so sorry if I am rehashing this, but this does not have support for the SKIP constant for single step map/filtering like done here: https://github.com/HarperDB/harperdb/blob/main/resources/search.ts#L332-L340 Generally this an increased amount of code to maintain. V1 accomplished everything we currently need with over 50% less code, AFAICT.
So currently if someone finds a bug and wants to submit a PR for a new unit test for JS their code that is failing, if their code is rejected by TS and lefthook's commit hooks, their commit will be rejected? Why this level of strictness when tests should be as broad? This seems kind of hostile to encouraging broader testing to me. This PR is extremely difficult to review in terms of how the existing code was modified and updated to support new functionality. As noted, thre are regressions in functionality which are very difficult to understand and correlate with the underlying changes in the code. This PR conflates module refactors, JS -> TS conversions, behavioral/logical changes, workflow changes, directory structural changes, and more all in the same PR.
Anyway, I certainly understand and appreciate that you have put significant time and effort into these upgrades. I believe there is indeed significant value here and I have no intention of seeing any of this go to waste. However, this needs to properly planned (with appropriate time to review and test outside of unit tests) and this is not the right time to move forward with these changes, as we want to focus efforts on higher priorities. I will go ahead and create the appropriate Jira tickets for this work, so we can properly track and plan these efforts. |
|
@kriszyp thank you for finding time to review this PR!
Correct, I do not have tests for all methods that return an iterable. These will need to be added. Regarding the bug, the code was relying on the iterator's
We agreed last week that the iterator should only be created when
It was my understanding that we were going to break this up into a map and filter in Harper.
I recognize there's a bit of code in v2. Much of it exists to handle both sync and async iterables along with error handling. My code also includes types, includes docs, and whitespace. I think there is opportunities to combine result/error handling and reduce the amount of code.
Casting parameters with invalid values as
I agree, but remember that the extended-iterable v2 code originated in the key ranges PR as apart of CORE-2646. We never had a ticket for extended-iterable.
In the spirit of your previous paragraph, I strongly feel we should not have a single ticket CORE-2807 to add all the remaining methods and a single ticket CORE-2808 to add the tests. This is backwards. We should have a ticket for each method and it's tests. You say you have no intention of seeing any of v2 go to waste, but I don't see a path forward with v2 especially given all of those tickets are a low priority. |
Hmm, I could be wrong about this, but I don't think this is quite right. From what I recall, the challenge here is that you have to determine if the transaction gets closed before propagating the error to the handlers (you can't make this decision when catching the upstream error), because there is no subsequent return() or anything to close. In trying to compare your code with the original code, I think you have kinda renamed
That's fine, but we do need to be cognizant of the extra work that is entailed in this so we can plan for it.
TypeScript says that
These tickets are marked as version 5.1, so they are scheduled, and 5.1 will happen. I changed these to medium priority. |
In what scenario would |
This
ExtendedIterableis derived from theRangeIterablefrom therocksdb-jsproject. The goals of this implementation were to maximize performance while maintaining code readability and extensibility. In addition, the implementation uses TypeScript with generics and comes with a complete test suite with 749 tests and 100% code coverage.The
ExtendedIterablecan be initialized with any iterable-like type including arrays, strings, maps, sets, iterators, async iterators, iterable objects, generators, and async generators.Supported methods include:
asArrayat()concat()drop()every()filter()find()flatMap()map()mapError()reduce()slice()some()take()toArray()These methods can be chained together and values are lazy evaluated as the consumer calls the
next()method.All methods that accept a callback support both sync and async callbacks. The control flow is sync until it detects a Promise in the iterable-like stream or callback result, then switches to async mode.
Errors are bubbled up to the caller. If in async mode, a rejected promise is returned. The
mapError()method can be added to the chain to catch errors and allow iteration to continue.