-
Notifications
You must be signed in to change notification settings - Fork 99
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: adds substring method for Text with tests #523
base: master
Are you sure you want to change the base?
Conversation
src/Text.mo
Outdated
/// Text.substring("This is a sentence.", 5, 4) // "is a" | ||
/// Text.substring("This is a sentence.", 0, 0) // "" | ||
/// ``` | ||
public func substring(t : Text, start : Int, len : Int) : Text { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless you've got something more clever planned for negative args, I'd use Nat
for start
and len
and then just two (explicit) loops over the iterator's elements (using iter.next() explicitly). One to loop advance the iterator by count, the second to append the chars. The other question is whether you want to trap when out-of-bounds or return null
with return type ?Text
, not Text
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, that makes sense in terms of Motoko conventions. I was basing this off of a script I was writing where I was using it in conjunction with indexOf
, which returned a -1
if the pattern wasn't found, but both can simply return options and use Nats instead
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This was based off of the JavaScript patterns, where you are allowed to use negative indexes to count backward from the end of iterable structures. There's no need to introduce that to Motoko, however
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think in my opinion it's nicer to return an empty string over null
. I've defined those cases in the doc comment, where start
or length
exceeds the length of a base string, it will
- return "" if start is beyond input.size()
- return the substring from the starting position to the end of the input if length exceeds the rest
@crusso Do you think it makes sense to expose |
Correct index-based access is a linear-time operation on Unicode strings, which immediately makes any looping based on it quadratic. It hence was a very deliberate decision to favour iterators and not to support random access on Text in Motoko, nor any other index-based operations. We have discussed in the past that a subtext operator(*) should hence take two iterators as boundaries (which also eliminates the out-of-bounds error case), i.e., something like:
where The open problem, as I remember it, was how to ensure that the iterators belong to the text string in question. That will have to be a dynamic check, but that and extracting the position would require piercing the closure representing the iterator. (Maybe a more bearable solution is to introduce a subtype (*) |
I agree that the substring name may not be fitting, but I think that passing Could we call this |
There isn't really a useful alternative to an iterators-based API. Because how would you compute an index? The only available way is by iterating over the text -- with an iterator. What would be the point in having to convert that into an index first? It's perhaps unusual from a JS perspective, but there are other container libraries that work that way as well. |
How do iterators as boundaries work? How do I have to define them to get the boundaries I want? |
Yeah, I'm curious about that too. In the past, I've want to represent a Text slice as a pair of private and cloneable text iterator, plus length. Iterating the slice clones the iterator and enumerates (up to) length values. But our text iterators don't support cloning at the moment, though I don't think that would be too hard to add. |
No description provided.