Closed Bug 317926 Opened 19 years ago Closed 19 years ago

[l10n] Bookmark Manager's localized title is not encoded correctly.

Categories

(Camino Graveyard :: Bookmarks, defect)

PowerPC
macOS
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: waveridervsnrz, Assigned: mark)

Details

(Keywords: fixed1.8)

Attachments

(1 file, 1 obsolete file)

User-Agent:       Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8) Gecko/20051107 Camino/1.0b1
Build Identifier: Mozilla/5.0 (Macintosh; U; PPC Mac OS X Mach-O; en-US; rv:1.8) Gecko/20051107 Camino/1.0b1

At present, the Bookmark Manager's title (NSLocalizedString "BookmarksWindowTitle") is UTF-16.
But Gecko (XPCOM) handles it as UTF-8, so the localized title is displayed incorrectly.



Reproducible: Always

Steps to Reproduce:
Attached patch: convert UTF-16 string to UTF-8. (obsolete)
Are you saying that Gecko treats the input stream as UTF-8, even though it's a UCS2 string?
Status: UNCONFIRMED → NEW
Ever confirmed: true
Yes.

For example, when BookmarksWindowTitle is localized to the Japanese string "ブックマーク",
  UTF-16 code : 30D6, 30C3, 30AF, 30DE, 30FC, 30AF
the title is displayed as "ÖÃ¯Þü¯". This incorrect string is
  UTF-16 code : 00D6, 00C3, 00AF, 00DE, 00FC, 00AF

I think Gecko handles it as UTF-8 and loses the upper byte of each UTF-16 character.
In UTF-16, the upper byte of an ASCII letter is "00", so alphabetic strings look correct (fortunately).
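
The effect described above can be reproduced outside Gecko. This is only a minimal model in Python, not Gecko code: the helper name chop_to_low_bytes is hypothetical, and it assumes the narrowing simply keeps the low byte of each 16-bit code unit.

```python
# Hypothetical model of a lossy 16-bit -> 8-bit narrowing (not Gecko code):
# keep only the low byte of every UTF-16 code unit.
def chop_to_low_bytes(text: str) -> str:
    utf16 = text.encode("utf-16-be")      # high byte, low byte for each code unit
    low_bytes = utf16[1::2]               # discard every high byte
    return low_bytes.decode("latin-1")    # show each remaining byte as one character

title = "ブックマーク"
units = [f"{ord(c):04X}" for c in title]  # UTF-16 code units of the original title
garbled = chop_to_low_bytes(title)        # what the window title ends up showing
ascii_title = chop_to_low_bytes("Bookmarks")  # high bytes are 00, so nothing is lost

print(units)
print(garbled)
print(ascii_title)
```

Under that assumption the Japanese title collapses to six Latin-1 characters, while an ASCII title passes through unchanged, which would explain why the problem only shows up in localized builds.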
> Are you saying that Gecko treats the input stream as UTF-8, even though it's a
> UCS2 string?

AFAICT NS_NewStringInputStream uses ToNewCString, which

* Performs a lossy encoding conversion by chopping 16-bit wide characters down
  to 8-bits wide while copying |aSource| to your new buffer.
* This conversion is not well defined; but it reproduces legacy string behavior.

http://lxr.mozilla.org/seamonkey/source/xpcom/string/public/nsReadableUtils.h#97
The version of NS_NewCStringInputStream that takes a wide string should be removed. nsIInputStream produces bytes, not characters.
Comment on attachment 204287
convert UTF-16 string to UTF-8.

I don't think this patch actually works.  These are being displayed as raw 1-byte C strings, regardless of before-the-fact conversions.

Unpatched, with the Japanese string, the bookmarks manager has a six-character title, with the high bytes stripped.  This is U+00D6 U+00C3 U+00AF U+00DE U+00FC U+00AF or "Ö
Something ate my comment.  I went on to say that the patched version produced a UTF-8 representation of the UTF-16 input, but that the UTF-8 would be treated as an 18-character C string.
You should pass UTF-8 as the charset to NS_NewInputStreamChannel.
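
To illustrate why declaring the charset matters, here is a rough Python sketch. It is a model only and does not use the Gecko API: read_as_text is a hypothetical stand-in for a consumer of the byte stream, and the Latin-1 default is an assumption standing in for "raw 1-byte C string" handling.

```python
# Hypothetical consumer of a byte stream (not the Gecko API).
# With no declared charset it is assumed to treat every byte as one character.
def read_as_text(data: bytes, charset: str = "latin-1") -> str:
    return data.decode(charset)

title = "ブックマーク"
stream_bytes = title.encode("utf-8")                 # 6 characters, 3 bytes each

without_charset = read_as_text(stream_bytes)         # bytes shown one by one
with_charset = read_as_text(stream_bytes, "utf-8")   # charset declared up front

print(len(stream_bytes))
print(len(without_charset))
print(with_charset == title)
```

In this model, converting to UTF-8 alone still yields an 18-character garbled title; the original six characters come back only once the reader is told the bytes are UTF-8.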
Attached patch: Fix
Oh, good, this works.
Assignee: mikepinkerton → mark
Attachment #204287 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #204373 - Flags: review?(joshmoz)
Attachment #204373 - Flags: review?(joshmoz) → review+
Re: comment #9

I tested this patch and it works fine. Thanks a lot.
Checked in, t&b.
Status: ASSIGNED → RESOLVED
Closed: 19 years ago
Keywords: fixed1.8
Resolution: --- → FIXED