-
Notifications
You must be signed in to change notification settings - Fork 939
Apply patches from kuroda/iso-2022-jp #534
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
These patches introduce some special treatments for ISO-2022-JP encoding (so-called JIS-code), which is the de facto standard when we send an email written in Japanese language. These patches have effects only when a Mail object is instanciated with :charset => 'ISO-2022-JP' option. With this option, the header values and body text are automatically converted from UTF-8 to ISO-2022-JP. Note that some characters are replaced with other characters before this conversion takes place in order to solve problems on some platforms. See https://github.com/kuroda/mail-iso-2022-jp for details.
Oops! Travis failed on |
Great work, I'm working on the JRuby issues |
+1 to getting this in! |
@kuroda ok, so I'm playing with this now, I like how you've simplified the header field classes... but from my review, the biggest issue is that we are scattering a lot of IF statements through the code base, do you have any good ideas on how we would simplify this? |
@mikel Are you intend to clean up the |
@@ -114,6 +136,7 @@ def do_encode(field_name) | |||
|
|||
def do_decode | |||
return nil if value.blank? | |||
return value if charset.to_s.downcase == 'iso-2022-jp' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this return early?
Nice work @kuroda! I think the many For most charsets, there's no extra work, so we can use a trivial codec. For iso-2022-jp, we need to remap certain characters and handle kanji code conversion. |
else | ||
value = Mail::Preprocessor.process(value) | ||
value = NKF.nkf(NKF_OPTIONS, value) | ||
"=?ISO-2022-JP?B?#{Base64.encode64(value).gsub("\n", "")}?=" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May contain \r
too? We handle Base64 transfer-encoding elsewhere, too. Why are we duplicating it here?
Thanks for your comments. Admittedly, I want to help to improve the How do you think? Should we go ahead merging two gems, or should we keep our gem separately? |
We could merge the specs and mark them as pending, first. Then introduce some charset abstraction to cleanly handle these cases. As far as I can tell, this boils down to doing Kanji code conversion on fields and MIME parts with the iso-2022-jp charset. That means we'll want to associate a "charset converter" with fields and MIME parts. Any activity on these objects would take this additional conversion into account, in addition to transfer encoding (base64, quoted printable, etc). My first thought is that a good start is to consider Kanji code conversion as part of transfer encoding/decoding duties. |
New pull at #595 (Note, you can force-push a git branch to update a pull request, no need to open new pull requests ❤️) |
I am maintainer of the
mail-iso-2022-jp
gem, a set of patches againstmail
gem forISO-2022-JP
encoding.I applied these patches on the
mail
source tree with @mikel.Note that these patches have effects only when a
Mail
object is instanciated with:charset => 'ISO-2022-JP'
option. With this option, the header values and body text are automatically converted fromUTF-8
toISO-2022-JP
.I am afraid that someone does not like this automatic conversion feature arguing that it is not scope of this library.
I submit this pull request as a start point of discussion. Please let me know if there is a room for improvement.