punycode.encode('♡.com') should return 'xn--c6h.com' but instead it returns '.com-ku3b' #117

Open
jasonkhanlar opened this issue Mar 20, 2022 · 7 comments

Comments


jasonkhanlar commented Mar 20, 2022

punycode.encode('♡.com') should return 'xn--c6h.com' but instead it returns '.com-ku3b'

jasonkhanlar changed the title from '"♡.com" should return "xn--c6h.com to punycode.encode('♡.com') should return 'xn--c6h.com' but instead it returns '.com-ku3b' Mar 20, 2022
jasonkhanlar reopened this Mar 20, 2022

jasonkhanlar commented Mar 20, 2022

Correction: the shell was using the POSIX locale:

LANG=
LANGUAGE=
LC_CTYPE="POSIX"
LC_NUMERIC="POSIX"
LC_TIME="POSIX"
LC_COLLATE=C
LC_MONETARY="POSIX"
LC_MESSAGES="POSIX"
LC_PAPER="POSIX"
LC_NAME="POSIX"
LC_ADDRESS="POSIX"
LC_TELEPHONE="POSIX"
LC_MEASUREMENT="POSIX"
LC_IDENTIFICATION="POSIX"
LC_ALL=

I have now switched to en_US.UTF-8, and the output is "c6h", which is close, but why doesn't it show as "xn--c6h" instead? I also found the tr46 module, which seems to be better.
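
On the "c6h" versus "xn--c6h" question: `punycode.encode` appears to apply only the raw Punycode transformation to a single string, whereas `xn--` is the ACE prefix that IDNA adds to each non-ASCII label of a domain name. Below is a minimal sketch of that label-by-label step. It assumes `punycode` resolves to punycode.js or to Node's bundled (deprecated) copy of it; the helper name `toAsciiSketch` is hypothetical, and the sketch skips the mapping, normalization, and validation that a full implementation such as tr46 performs.

```javascript
// Assumption: 'punycode' resolves to punycode.js or Node's bundled copy of it.
const punycode = require('punycode');

// Hypothetical helper: encode each non-ASCII label and add the ACE prefix.
function toAsciiSketch(domain) {
  return domain
    .split('.')
    .map((label) =>
      // Pure-ASCII labels such as "com" are passed through unchanged.
      /[^\x00-\x7F]/.test(label) ? 'xn--' + punycode.encode(label) : label
    )
    .join('.');
}

console.log(punycode.encode('♡'));   // 'c6h' (raw Punycode, no prefix)
console.log(toAsciiSketch('♡.com')); // 'xn--c6h.com'
```

Passing the whole string '♡.com' to `encode` treats the dot and "com" as part of one label, which is where the '.com-ku3b' result comes from.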

@alveshelio

I'm quite new to the world of "punycode", so my comment here might not make much sense; I'm sorry for that.

I see the same behaviour when using it in a TypeScript project on the frontend. I've checked, and my file is indeed in UTF-8.
I have a contact form, and when I enter an email address such as 张伟@example.com, it returns @example.com-0u7sn60p.

I'm not sure if it should return 0u7sn60p@example.com but I'm pretty sure that @example.com-0u7sn60p isn't the right output.

@jasonkhanlar were you able to get it to work? Were you using it on the command line or in a Node.js application?


jasonkhanlar commented Apr 29, 2022

@alveshelio I don't remember, but in my project I switched to using xmlbuilder, despite encountering oozcitak/xmlbuilder2#117 and ending up with my own workaround for my use case in oozcitak/xmlbuilder2#131. Other than that minor hiccup, I found it to be a beautifully wonderful library, and I appreciate the devs who made it!

@alveshelio

Hey Jason,

Thank you for getting back. I guess I'll have to search for something else :) Cheers


AlttiRi commented Jun 23, 2022

new URL("https://♡.com").href

"https://xn--c6h.com/"

@milewskibogumil

@AlttiRi Great solution! Is there any tricky way to reverse this function? I mean from punycode to Unicode?
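
For the reverse direction, Node.js ships `url.domainToUnicode` alongside `url.domainToASCII`. A small sketch, assuming a Node.js runtime (these two helpers are Node-specific and are not available in browsers):

```javascript
const { domainToASCII, domainToUnicode } = require('node:url');

// Unicode -> ASCII (Punycode) form of a host name.
console.log(domainToASCII('♡.com'));            // 'xn--c6h.com'

// ASCII (Punycode) -> Unicode form, i.e. the reverse.
console.log(domainToUnicode('xn--c6h.com'));    // '♡.com'

// The WHATWG URL object exposes the ASCII form of the host.
console.log(new URL('https://♡.com').hostname); // 'xn--c6h.com'
```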

@silverwind

You should use .toASCII to encode domain names:

> (await import("punycode")).toASCII("♡.com")
'xn--c6h.com'
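
To make the difference from `encode` concrete, here is a short sketch, assuming `punycode` resolves to punycode.js or to Node's bundled (deprecated) copy of it. `toASCII` and `toUnicode` work label by label on a domain name or email address, while `encode` treats the whole input as a single string:

```javascript
// Assumption: 'punycode' resolves to punycode.js or Node's bundled copy of it.
const punycode = require('punycode');

// encode() treats the whole input as one label, dot included.
console.log(punycode.encode('♡.com'));          // '.com-ku3b'

// toASCII() converts only the non-ASCII labels and adds the 'xn--' prefix.
console.log(punycode.toASCII('♡.com'));         // 'xn--c6h.com'

// toUnicode() is the reverse of toASCII().
console.log(punycode.toUnicode('xn--c6h.com')); // '♡.com'

// For an email address, only the domain part after '@' is converted.
console.log(punycode.toASCII('user@♡.com'));    // 'user@xn--c6h.com'
```

Because only the domain part is converted, a non-ASCII local part such as the one in 张伟@example.com is returned unchanged by `toASCII`.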
