Catch edge cases were delimiters are wrongfully caught#312
Catch edge cases were delimiters are wrongfully caught#312abitrolly merged 2 commits intochubin:masterfrom
Conversation
|
This part of code badly needs some test table with expected inputs and outputs. I am having a hard time trying to juggle them in my head. I spent some time juggling with In [4]: delim = re.escape('+')
In [5]: re.split(delim, '/g++ whoa')
Out[5]: ['/g', '', ' whoa']
In [6]: re.split("%s+" % delim, '/g++ whoa')
Out[6]: ['/g', ' whoa']
In [7]: re.split("%s+" % delim, '/g++')
Out[7]: ['/g', '']
...
In [21]: re.split("%s(?:[^%s])" % (delim, delim), '/ge+ll33+++')
Out[21]: ['/ge', 'l33+++']
... |
|
I don't have a complete list of would be inputs, but here are a few test cases that should correctly split query and section:
|
|
@Jan200101 how about moving that function to the root scope without modification in a separate commit? This way we will be able to test previous behavior in automated fashion with bisect. |
|
sure thing |
|
Well, I actually thought about moving only one function, as it is the only thing which behavior needs to be tested right now. Otherwise the diff becomes quite unreadable. And the move commit needs to be code modification commit for the old behavior to be testable. |
|
I had taken the initiative to move out all functions but reversed that the first commit now moves the nested function out of the scope |
|
So the way to go from here. I need to setup a development environment to fix tests in |
|
@Jan200101 I've merged one of your commits and added tests on top to the |
|
I've added some tests that have + in their name that should be unchanged. I had added another test case locally but kept it out because it fails the test. |
|
I've reworked the logic slightly to deal with the test case above as well as to ease the exception handling a little bit. |
|
CI failure seems unrelated to PR. |
|
CI errors are strange. Need to investigate them before going on. |
|
Fixed tests in #317. Should become green after rebase. |
This replaces cycle logic in #312 with regex.
|
While going through the logic I found that it could still strip ++ in lines, so that |
A followup to chubin#312
This replaces cycle logic in chubin#312 with regex.
related to #308
implements more robust logic that splits on the first delim it finds that isn't followed by the same delim