-
-
Notifications
You must be signed in to change notification settings - Fork 727
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
add std.string.splitterLines() #2987
Conversation
83f16d1
to
003ec29
Compare
|
||
@property Range front() | ||
{ | ||
if (iStart == _unComputed) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a fan of superlong if
statements - this is virtually the entire function. Have it call a helper function? Or make an early return?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of branch prediction, the 'then' is the hot case, and this is why it is organized as it is. Performance! The 'then' is complex enough that as a helper function, it would not get inlined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You won't be able to measure a difference for this one test in such a large function.
Functions consisting of essentially one if
are the pits. The reader sees the beginning of a bulky if
and goes, "boy that's some good long if
statement, surely this function will do many wonderful interesting things long after this if
is done". So the reader puts the if
on the mental stack and goes on to read the large body of code inside the if
, to just realize much later that all that was for naught - there's nothing after if
so it needn't be remembered in the first place!
This is not good style, pure and simple. It's not reason enough to reject this code, but you'd do good to refactor it and anyone coming later would do good to refactor it.
003ec29
to
3338812
Compare
can you transform the nogc test into a doc unittest. Examples always help. |
Does not throw on invalid UTF; such is simply passed unchanged | ||
to the output. | ||
|
||
Does not allocate memory. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not a native english speaker, but I believe both "Does .." sentences need a subject.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's actually a common usage when describing how something works. The subject is implied -- though we don't use it in conversation.
LGTM. |
{ | ||
auto s = "\rpeter\n\rpaul\r\njerry\u2028ice\u2029cream\n\nsunday\nmon\u2030day\n"; | ||
auto lines = s.splitterLines(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An iteration would be nice here - after that we can pull this.
static immutable witness = ["peter", "paul", "jerry", "\u2028ice\u2029cream", "sunday", "mon\u2030day"];
uint i;
foreach (line; lines)
{
assert(line == witness[i++]);
}
assert(i == 6);
83967e8
to
18336ad
Compare
Auto-merge toggled on |
add std.string.splitterLines()
I see it's already merged, but I'd like to bikeshed: |
Have to agree, the name really isn't great. Can we overload |
I also think that |
👍 for |
Requiring length seriously limits this algorithm. It can't support infinite ranges, nor ranges with unknown length, such as data coming from a socket. |
It requires slicing. Are there ranges without length that support slicing? Without allocating memory? There is |
Also, given the functions |
@kuettler Infinite ranges can support slicing, but they don't have a (finite) length. |
@kuettler We already have
Maybe, but |
{ | ||
private: | ||
Range _input; | ||
alias IndexType = typeof(unsigned(_input.length)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@WalterBright what is the reason for this alias? is size_t not suitable here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's murky territory. For example, the length
property is allowed to always return ulong
(according to hasLength
), which breaks code that assumes size_t
when the target is a 32-bit platform. Yet many range algorithms and higher-order ranges treat the length
property as being size_t
. I don't think it's a major problem, but in the long term we need to settle on which way to go.
Melds together string.splitLines() and algorithm.splitter() to make a line splitter that doesn't allocate memory and accepts ranges as input instead of just arrays.