-
Notifications
You must be signed in to change notification settings - Fork 224
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
starlark: fix bugs in str.{title,capitalize} #3
Conversation
Behavior now matches the spec (and Python3). + Tests. Fixes google/skylark#140 Change-Id: I02f1f9775f4b64095d5e2aeb33f39db94046b50e
doc/spec.md
Outdated
|
||
```python | ||
"hello, world!".capitalize() # "Hello, World!" | ||
"hElLo, wOrLd!".capitalize() # "Hello, World!" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the two comments look wrong (lowercase "w")
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well spotted! Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Behavior now matches the spec (and Python3). + Tests. Fixes google/skylark#140 Change-Id: I02f1f9775f4b64095d5e2aeb33f39db94046b50e
@@ -3505,11 +3505,14 @@ See also: `string·elems`. | |||
<a id='string·capitalize'></a> | |||
### string·capitalize | |||
|
|||
`S.capitalize()` returns a copy of string S with all Unicode letters | |||
that begin words changed to their title case. | |||
`S.capitalize()` returns a copy of string S with its first code point |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Python says "first character" instead of "first code point". Is the difference intended?
In Python:
>>> s = '\uFB03'
>>> s
'ffi'
>>> s.capitalize()
'FFI'
>>> len(s.capitalize())
3
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, this comment was too nitpicky. I don't think the Python behavior actually makes sense: given the wording, I'd expect it to return three characters: F f i.
I think having the spec say this applies to code points, not characters makes sense. It may be good to point out somewhere that this is different than Python though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Python function is even more underspecified than ours: it seems to attempt to uppercase (not capitalize) the decomposition of a lower case letter that has no corresponding uppercase letter. I don't want to go there.
Thanks for educating me about this particular naughty string. :)
…o string-capitalize Change-Id: I6fe54df6cb48d46b5fa0fc194f2ef2276dd464e1
Scratch the surface of any program that uses Unicode and you find bugs---I found some more. Please take another look. |
2c5c207
to
150f739
Compare
* implement set.update method The `update` method on the `set` is defined in the Starlark specification [1] but is not present in the starlark-go implementation. This change introduces this functionality. [1] https://github.com/bazelbuild/starlark/blob/master/spec.md#setupdate * implement set.update method Fixes from @adonovan in PR * implement set.update method Fixes #2 from @adonovan in PR * implement set.update method Fixes #3 from @adonovan in PR
Behavior now matches the spec (and Python3).
Fixes google/skylark#140