Closed Bug 405444 Opened 17 years ago Closed 17 years ago

FormatDouble and FormatTriple mangle multi-byte strings in email

Categories

(Bugzilla :: Email Notifications, defect)

3.1.2
defect
Not set
normal

RESOLVED FIXED
Bugzilla 3.2

People

(Reporter: himorin, Assigned: mkanat)

Details

(Keywords: intl)

Attachments

(2 files)

After applying the patch from bug 405362, I get bugmail like the following:

http://bugzilla-trunk.test.mozilla.gr.jp/show_bug.cgi?id=3D5952

           Summary: =E3=83=86=E3=82=B9=E3=83=88=E3=83=90=E3=82=B0
                    =B9=E3=83=88=E3=83=90=E3=82=B0
                    =E3=83=90=E3=82=B0
                    =90=E3=82=B0
                    =C2=82=C2=B0
    Classification: Unclassified
           Product: =E6=97=A5=E6=9C=AC=E8=AA=9E=E5=8C=96
                    =9C=AC=E8=AA=9E=E5=8C=96
                    =9E=E5=8C=96
                    =C2=8C=C2=96
           Version: unspecified

This might be caused by the following:
* The encoded characters are garbled.
* FormatDouble fails (it should take multi-byte UTF-8 into account when cutting strings).
  I've written some subroutines just for this in bugzilla-ja...

I have the utf8 parameter set to 1.
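
To illustrate the report above, here is a minimal Python sketch (not Bugzilla's Perl code) that reproduces the Summary lines of the sample bugmail by slicing the UTF-8 bytes of the summary at fixed byte offsets. The offsets (0, 5, 9, 11, 13) are read off the sample itself. The Latin-1 re-encoding step for the last line is an assumption that happens to match the observed bytes, not a confirmed description of what FormatDouble does internally. The helper names `qp` and `is_valid_utf8` are hypothetical.

```python
import quopri


def qp(raw: bytes) -> str:
    """Quoted-printable encode raw bytes, as seen in the mail body."""
    return quopri.encodestring(raw).decode("ascii")


def is_valid_utf8(chunk: bytes) -> bool:
    """Return True if the chunk decodes cleanly as UTF-8."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


summary = "テストバグ"          # five characters, three UTF-8 bytes each
raw = summary.encode("utf-8")   # 15 bytes

# Byte offsets taken from the sample mail; 5, 11 and 13 fall in the
# middle of a character, so those slices start with continuation bytes.
for offset in (0, 5, 9, 11):
    chunk = raw[offset:]
    print(offset, is_valid_utf8(chunk), qp(chunk))

# Assumption: the last two bytes were treated as Latin-1 characters and
# then encoded to UTF-8 again, which doubles each byte with a C2 prefix.
tail = raw[13:].decode("latin-1").encode("utf-8")
print(13, qp(tail))
```

Cutting by characters instead of bytes (for example `summary[2:]` in Python) always lands on a character boundary, which is the behaviour the report asks for.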
The patch from bug 405362 hasn't been reviewed yet. If it breaks something, you should comment in the other bug instead.
Status: UNCONFIRMED → RESOLVED
Closed: 17 years ago
Resolution: --- → INVALID
I can confirm this on landfill tip, as a separate issue from bug 405362.
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
Summary: garbaged? charactors in encoded → FormatDouble and FormatTriple mangle multi-byte strings in email
Target Milestone: --- → Bugzilla 3.2
Attached image Screenshot of bad email
Just so everybody else has an idea of what this bug is about, here's a screenshot of a sample email.

I copied some text from the bugzilla-cn project and made it the summary of a bug:

欢迎访问Bugzilla简体中文本地化开源项目网站

That's what *should* be showing up as the summary in this screenshot, instead of the gibberish you see.

The normal body of the message is fine--it's just a problem with things that run through FormatDouble or FormatTriple.
Assignee: email-notifications → mkanat
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Okay, I have a solution for this using sprintf and splitting the strings, I just have to finish writing it.
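
For readers following along, the "sprintf and splitting the strings" idea can be sketched roughly as below. This is a Python illustration under stated assumptions, not the patch itself: the function names `wrap_hard` and `multiline_format`, the format string, and the column widths are all hypothetical. The sketch splits each column by characters (code points) rather than bytes, then formats one output line per row.

```python
def wrap_hard(text: str, width: int) -> list[str]:
    """Split text into pieces of at most `width` characters (code points)."""
    pieces = [text[i:i + width] for i in range(0, len(text), width)]
    return pieces or [""]


def multiline_format(fmt: str, columns: list[str], widths: list[int]) -> str:
    """Format several columns side by side, wrapping each to its own width.

    `fmt` is a printf-style format with one %s conversion per column.
    """
    wrapped = [wrap_hard(col, w) for col, w in zip(columns, widths)]
    row_count = max(len(pieces) for pieces in wrapped)
    lines = []
    for row in range(row_count):
        # Columns that have run out of text contribute an empty cell.
        cells = tuple(p[row] if row < len(p) else "" for p in wrapped)
        # Strip trailing spaces so they do not turn into =20 in
        # quoted-printable mail.
        lines.append((fmt % cells).rstrip())
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical layout: a 10-character label column and a
    # 4-character value column.
    print(multiline_format("%10s %-4s", ["Summary:", "テストバグ"], [10, 4]), end="")
```

Because every piece is cut on a character boundary, each output line is valid text on its own, whatever encoding the mail is sent in.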
Keywords: intl
Attached patch v1
Okay, here's a patch that fixes it for me. It was kind of tricky! :-)
Attachment #293239 - Flags: review?(shimono)
Attachment #293239 - Flags: review?(justdave)
Thanks. It worked well.

I'll note one thing for the record, for anyone who files a related bug later:
this patch ignores glyph widths, so the formatting will break for languages whose glyphs are wider than US-ASCII.
# I think this should be included in the release notes?

Also, I know I don't have reviewer status for Bugzilla patches, but I'd like to leave one comment.
             my $desc = $fielddescription{$f};
-            $head .= FormatDouble($desc, $value);
+            $head .= multiline_sprintf(FORMAT_DOUBLE, ["$desc:", $value], 
+                                       FORMAT_2_SIZE);
For performance, shouldn't we change this part as below?
-            my $desc = $fielddescription{$f};
-            $head .= FormatDouble($desc, $value);
+            my $desc = $fielddescription{$f} . ':';
+            $head .= multiline_sprintf(FORMAT_DOUBLE, [$desc, $value], 
+                                       FORMAT_2_SIZE);
Comment on attachment 293239
v1

Sorry for the late review! :)
Attachment #293239 - Flags: review?(shimono) → review+
(In reply to comment #6)
> We ignore widths of glyphs in this patch. So, the format would be broken for
> the languages whose glyphs are wider than us-ascii.
> # i think this should be included in rel-note??

  I don't think it needs a relnote. The width of glyphs depends on the font being used, as far as I know, so it'll always be different.

> for performance, shouldn't we change this part as below?
> [snip]

  There's no performance difference there, and there's also not a performance problem that needs to be fixed there, as far as I know.
Attachment #293239 - Flags: review?(justdave)
I'd be interested to know if our new method is faster than the old FormatDouble and FormatTriple, after all this. :-) I wonder!

Checking in Bugzilla/BugMail.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/BugMail.pm,v  <--  BugMail.pm
new revision: 1.114; previous revision: 1.113
done
Checking in Bugzilla/Util.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Util.pm,v  <--  Util.pm
new revision: 1.66; previous revision: 1.65
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
(In reply to comment #8)
> > We ignore widths of glyphs in this patch. So, the format would be broken for
> > the languages whose glyphs are wider than us-ascii.
> > # i think this should be included in rel-note??
> 
>   I don't think it needs a relnote. The width of glyphs depends on the font
> being used, as far as I know, so it'll always be different.

Quoting from Unicode Standard Annex #11 (http://unicode.org/reports/tr11/):
> 5 Recommendations
> When mapping Unicode to East Asian legacy character encodings
>  * Wide Unicode characters always map to fullwidth characters.
>  * Narrow (and neutral) Unicode characters always map to halfwidth characters.
>  * Halfwidth Unicode characters always map to halfwidth characters.
>  * Ambiguous Unicode characters always map to fullwidth characters.
See also the following two glossary entries:
http://unicode.org/glossary/#halfwidth
http://unicode.org/glossary/#fullwidth
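
As a rough illustration of the width point, the East_Asian_Width property from UAX #11 can be used to estimate how many terminal columns a string takes. The Python sketch below uses the standard library's `unicodedata.east_asian_width`. The helper names `display_width` and `pad_to_width` are hypothetical, and counting Ambiguous characters as one column is an assumption made here for simplicity. Combining characters and font differences are ignored.

```python
import unicodedata


def display_width(text: str) -> int:
    """Estimate display columns: Wide and Fullwidth characters count as 2."""
    return sum(
        2 if unicodedata.east_asian_width(ch) in ("W", "F") else 1
        for ch in text
    )


def pad_to_width(text: str, width: int) -> str:
    """Left-justify text in a field of `width` display columns."""
    return text + " " * max(0, width - display_width(text))


if __name__ == "__main__":
    for sample in ("Summary:", "テストバグ", "日本語化"):
        print(repr(sample), len(sample), display_width(sample))
```

A character-count based format pads "テストバグ" as if it were 5 columns wide, while in a fixed-width East Asian font it typically occupies about 10, which is the misalignment described above.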

> > for performance, shouldn't we change this part as below?
> > [snip]
> 
>   There's no performance difference there, and there's also not a performance
> problem that needs to be fixed there, as far as I know.

I agree with you.
Blocks: 410521
For the record, this bug needs your a+, mkanat. ;)
Flags: approval?
Oh, thanks LpSolit! :-)
Flags: approval? → approval+