Closed
Bug 161479
Opened 22 years ago
Closed 22 years ago
[trunk]JavaScript string always converted to UTF-8 inside a Windows-1251 page
Categories
(Core :: Internationalization, defect)
VERIFIED
FIXED
mozilla1.2alpha
People
(Reporter: Maniac, Assigned: nhottanscp)
References
Details
(Keywords: intl, regression)
Attachments
(4 files, 2 obsolete files)
161 bytes, text/html
20.67 KB, image/png
16.41 KB, image/png
6.59 KB, patch (nhottanscp: review+, jst: superreview+, dbaron: approval+)
In a document written in Windows-1251, strings that appear as JavaScript string constants are converted to UTF-8, unlike the rest of the page.
Reporter
Comment 1•22 years ago
1. Save the testcase to local disk (so that <META HTTP-EQUIV> takes effect).
2. Look at the source: the strings in the HREF attribute and between <A></A> should look identical (though they may be unreadable to some, because they are in Russian).
3. Open the testcase in the browser and click the link.
4. Compare the two strings: the one in the alert window looks as if it was converted to UTF-8.
Comment 2•22 years ago
WFM 2002072204 Linux. The alert looks identical to both the source and the displayed text in the browser window (less font size differences, of course).
Comment 3•22 years ago
It seems to work for me too; I don't see much difference in how the Russian characters are displayed between the HTML file and the alert window. Reporter: could you please attach a screenshot of the problem? Thanks!
Comment 4•22 years ago
I'm seeing this in Linux build 2002-08-05-08... both locally and from bugzilla.
Reporter
Comment 5•22 years ago
On my system (Win98) it looks pretty much like Boris' screenshot. Sorry, I forgot to specify my build ID: 2002080108
Reporter
Comment 6•22 years ago
Comment 7•22 years ago
Hmm, I don't have any problem on my WinXP, nor on WinME, although on WinME the characters in the alert window look wider than they do in the HTML file. I tried it on Linux RH7.2; the result is similar to WinME, except that the last character "y" is missing in the alert window. But in neither case did I see the garbage display shown in the two earlier screenshots. I'm going to confirm it now so that it gets more investigation.
Status: UNCONFIRMED → NEW
Ever confirmed: true
On Windows XP, build 1.0 (2002053012) doesn't show this problem; the testcase works fine. On build 2002081409 running on the same computer, the testcase alert produces garbage.
Comment 9•22 years ago
I saw the garbage display in 08-15 trunk build but not branch build / WinME.
Summary: JavaScript string always converted to UTF-8 inside a Windows-1251 page → [trunk]JavaScript string always converted to UTF-8 inside a Windows-1251 page
Comment 10•22 years ago
Can this be related to the renaming of nsISupportsWString / nsISupportsString in bug 157624? I had other JavaScript problems with strings after this was checked in.
Comment 12•22 years ago
*** Bug 166368 has been marked as a duplicate of this bug. ***
Comment 13•22 years ago
The problem is definitely with JavaScript string constants in the href attribute of the <a> tag. I mean, the problem is simply with the href attribute, but the only way to encounter problems with it is by using inline JavaScript functions.
Reporter
Comment 15•22 years ago
BTW, on Win98 2002090208 the testcase now refuses to work :-(. It ignores the JavaScript: URL in <a href=...> and just reloads the same page instead. Should I file a new bug or am I missing something?
Target Milestone: --- → mozilla1.2alpha
Reporter
Comment 16•22 years ago
Oh... another 'BTW'. This bug looks very similar to bug 147991, which was filed earlier. Maybe we should resolve this as a duplicate?
Reporter
Comment 17•22 years ago
Followup to comment #15: The testcase malfunction can be traced in the status bar, where javascript:window.alert('...'); looks like javascript:///window.alert('...'); But! This appears only if window.alert contains a parameter spelled in Cyrillic letters. If I change this to, say, javascript:window.alert('Test'); then the popup appears instantly upon clicking the link.
Comment 18•22 years ago
Because this bug has more comments and a testcase, it is worth marking bug 147991 as a dup of this one.
It seems like there's weird stuff happening in nsJSProtocolHandler::EnsureUTF8Spec. I'm seeing (using \uNN for nonprintable characters) it be called some of the time with the input:
Input spec (charset=windows-1251): JavaScript:window.alert('\uD0\u9F\uD0\uBE\uD1\u87\uD0\uB5\uD0\uBC\uD1\u83 UTF-8?');
which leads it to the first early return. However, sometimes it's called with the input:
Input spec (charset=windows-1251): javascript:window.alert('%D0%9F%D0%BE%D1%87%D0%B5%D0%BC%D1%83 UTF-8?');
which leads it to run all the way through and produce the doubly-escaped output:
javascript:window.alert('%D0%A0%D1%9F%D0%A0%D1%95%D0%A1%E2%80%A1%D0%A0%C2%B5%D0%A0%D1%98%D0%A1%D1%93 UTF-8?');
which ends up being used.
From a low-level point of view, the percent-escaped input should have had |aCharset| as UTF-8, not windows-1251. That said, all this conversion strikes me as awfully messy.
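The doubly-escaped output quoted above can be reproduced outside the browser. The following is only a minimal Python sketch of the byte-level effect, not the actual nsJSProtocolHandler code; the variable names are made up for illustration. It unescapes the percent-escaped UTF-8 bytes, misreads them as windows-1251, and re-encodes the result as UTF-8.

```python
from urllib.parse import quote, unquote_to_bytes

# Percent-escaped UTF-8 bytes of the Russian word from the testcase,
# copied from the second input spec in the comment above.
escaped = "%D0%9F%D0%BE%D1%87%D0%B5%D0%BC%D1%83"

# Step 1: undo the percent-escaping to get the raw bytes (already UTF-8).
raw = unquote_to_bytes(escaped)

# Step 2: the mistake. Treat those UTF-8 bytes as if they were windows-1251.
misread = raw.decode("cp1251")

# Step 3: convert the misread text to UTF-8 and escape it again.
double_escaped = quote(misread.encode("utf-8"), safe="")

print(raw.decode("utf-8"))  # the intended text
print(misread)              # the garbage shown in the alert
print(double_escaped)       # matches the doubly-escaped output in the comment
```

Decoding the same bytes as UTF-8 in step 2 yields the original word, which is consistent with the remark that the escaped input should have been labelled UTF-8.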
Assignee
Comment 21•22 years ago
> From a low-level point of view, the percent-escaped input should have had
> |aCharset| as UTF-8, not windows-1251. That said, all this conversion strikes
> me as awfully messy.
This is actually about the non-escaped case. The current code assumes the URI
is UTF-8 if it is not percent-escaped, while many existing documents use raw
8-bit data in URIs without escaping. In order to support them, we assume UTF-8
only if the string is valid UTF-8, and otherwise use the document charset. As
for the conversion, there is a utility function I can use, so it can be simplified.
Status: NEW → ASSIGNED
Assignee
Comment 22•22 years ago
*** Bug 162958 has been marked as a duplicate of this bug. ***
Assignee
Comment 23•22 years ago
Assignee
Comment 24•22 years ago
>This is actually about non escaped case.
This was actually wrong; the data is escaped, as dbaron mentioned. The current
patch checks whether the URI is UTF-8 in both the escaped and the non-escaped
case, so it fixes the problem.
I will take a look at the patch again tomorrow and then ask for reviews.
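The approach described in comments 21 and 24 (treat the URI as UTF-8 only if its bytes are valid UTF-8, whether or not they were percent-escaped, and otherwise fall back to the document charset) could look roughly like the Python sketch below. This is an illustration of the heuristic only, not the patch itself; `decode_spec` and its parameters are hypothetical names.

```python
from urllib.parse import unquote_to_bytes


def decode_spec(spec: bytes, document_charset: str) -> str:
    """Decode URI bytes, preferring UTF-8 and falling back to the page charset.

    Hypothetical helper: `spec` may contain percent-escapes, raw 8-bit
    bytes, or both.
    """
    # Undo any percent-escaping first, so escaped and raw input are
    # handled the same way.
    raw = unquote_to_bytes(spec)
    try:
        # Valid UTF-8: assume the author meant UTF-8.
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Not valid UTF-8: assume raw bytes in the document charset.
        return raw.decode(document_charset, errors="replace")


# Escaped UTF-8 inside a windows-1251 page (the case from this bug).
print(decode_spec(b"%D0%9F%D0%BE%D1%87%D0%B5%D0%BC%D1%83", "cp1251"))

# Raw 8-bit windows-1251 bytes, as found in many legacy pages.
print(decode_spec("Почему".encode("cp1251"), "cp1251"))
```

A heuristic like this can still guess wrong when legacy bytes happen to form valid UTF-8, which may be part of why the code is flagged later in the thread for another look.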
Comment 25•22 years ago
Drivers are interested in this for 1.2a; adding to the 1.2 dependency list.
Blocks: 1.2a
Assignee
Comment 26•22 years ago
Attachment #98048 -
Attachment is obsolete: true
Comment 27•22 years ago
Add "HZ" to the stateful charset set and then mark it as r=shanjian.
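One plausible reason a 7-bit stateful charset such as HZ needs separate handling is that its encoded bytes are all ASCII, so a "is this valid UTF-8?" check always succeeds and cannot tell the two apart. The Python sketch below illustrates that property under this assumption; it is not taken from the patch. `STATEFUL_7BIT` and `choose_charset` are hypothetical names, and only "hz" is mentioned in this bug.

```python
# Hypothetical set of 7-bit stateful charsets; only "hz" comes from this bug.
STATEFUL_7BIT = {"hz"}


def looks_like_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


def choose_charset(data: bytes, document_charset: str) -> str:
    """Pick the charset used to interpret URI bytes (illustrative only)."""
    if document_charset in STATEFUL_7BIT:
        # 7-bit encodings always pass the UTF-8 check, so trust the page.
        return document_charset
    return "utf-8" if looks_like_utf8(data) else document_charset


hz_bytes = "中文".encode("hz")

print(all(b < 0x80 for b in hz_bytes))  # every byte is 7-bit ASCII
print(looks_like_utf8(hz_bytes))        # so the UTF-8 check passes anyway
print(choose_charset(hz_bytes, "hz"))   # the page charset wins for HZ
```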
Assignee
Comment 28•22 years ago
Attachment #98157 -
Attachment is obsolete: true
Assignee
Comment 29•22 years ago
Comment on attachment 98214 [details] [diff] [review] Added HZ as 7bit encoding r=shanjian
Attachment #98214 -
Flags: review+
Comment 30•22 years ago
Comment on attachment 98214 [details] [diff] [review] Added HZ as 7bit encoding sr=jst
Attachment #98214 -
Flags: superreview+
Comment 31•22 years ago
This seems reasonable for now... once bug 166996 is fixed, we should revisit this code, though....
Comment on attachment 98214 [details] [diff] [review] Added HZ as 7bit encoding a=dbaron for trunk checkin
Attachment #98214 -
Flags: approval+
Assignee
Comment 33•22 years ago
Checked in to the trunk.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Comment 34•22 years ago
Verified fixed in 09-10 trunk build on windows and linux.
Status: RESOLVED → VERIFIED
Comment 35•22 years ago
*** Bug 171521 has been marked as a duplicate of this bug. ***