Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for characterset in mb_strtoupper-call #5

Merged
merged 1 commit into from
Apr 5, 2013

Conversation

fschaeffer
Copy link
Collaborator

As the internal encoding is not always used in mb_strtoupper I added an option to set the characterset in _options and use it in the _strtoupper-call.

@mtibben
Copy link
Owner

mtibben commented Apr 4, 2013

If we add this as a configurable option, I think we should apply the encoding across all functions that support it. e.g. htmlspecialchars, html_entity_decode

But from what I can see, I don't think this library is even going to work in a charset which doesn't have the ASCII 1-128 characters - heavy use of regular expressions, etc. To support other character sets properly, we would need to convert to UTF-8 right at the beginning

Alternatively, you could just hardcode 'UTF-8' into the mb_strtoupper function, (as the other functions default to UTF-8 also). That way we can say that this library only supports UTF-8, and if you want other encodings you need to do the conversion yourself.

@fschaeffer
Copy link
Collaborator Author

I first thought of hardcoding the characterset into the mb_strtoupper-call but then wanted to give the user the option to override it if necessary. But you're right, the other functions rely on UTF-8 aswell so we could even hardcode it... We're all on the UTF-8 road nowadays, aren't we :-).
During my tests I found no glitches when using romanian, german (or other western latin characters), bulgarian (that is cyrillic) or greek $str-contents. So your class works pretty good with them (didn't test hebrew or kanji though). I'll push another commit ....

@mtibben
Copy link
Owner

mtibben commented Apr 5, 2013

Cool. Can you squash this, and then I'll merge

@fschaeffer
Copy link
Collaborator Author

I would if I knew how? As the commits are already pushed I get into a diverged state if I rebase them into one commit. Any directions how to get this done?

@mtibben
Copy link
Owner

mtibben commented Apr 5, 2013

Yup sure. You can rebase and squash into one commit. Then force push

@fschaeffer
Copy link
Collaborator Author

Ha, learning each day something new about git, always struggled with "detached state", --force did the trick... thanks a ton.

As the mb_strtoupper-function relies on the internal encoding when used
without the second parameter and this does not work all the time (e.g.
for me on Windows äöü were not uppercased) the mb_strtoupper-call
wasforced to UTF-8.
mtibben added a commit that referenced this pull request Apr 5, 2013
Force UTF-8 in mb_strtoupper-call
@mtibben mtibben merged commit 77f8096 into mtibben:master Apr 5, 2013
@mtibben
Copy link
Owner

mtibben commented Apr 5, 2013

No probs :)

@fschaeffer fschaeffer deleted the mbStrToUpper branch April 5, 2013 07:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants