Closed
Bug 200984
Opened 21 years ago
Closed 21 years ago
[FIX]Arabic text in Javascript "unescape" function returns the wrong output
Categories
(Core :: Internationalization, defect, P1)
Core
Internationalization
Tracking
()
VERIFIED
FIXED
mozilla1.4beta
People
(Reporter: neokuwait, Assigned: bzbarsky)
References
Details
Attachments
(3 files, 2 obsolete files)
Gecko/20030328 TO REPRODUCE: 1. view testcase ACTUAL: the unescape function produces Latin characters instead of Arabic EXPECTED: only characters in their %xx.. hexadecimal form should be affected? (see IE6)
Reporter | ||
Comment 1•21 years ago
|
||
Assignee | ||
Comment 2•21 years ago
|
||
Intl -- not arabic specific.
Assignee: mkaply → smontagu
Status: UNCONFIRMED → NEW
Component: BiDi Hebrew & Arabic → Internationalization
Ever confirmed: true
OS: Windows XP → All
QA Contact: zach → ylong
Hardware: PC → All
Assignee | ||
Comment 3•21 years ago
|
||
Assignee | ||
Comment 4•21 years ago
|
||
Attachment #119646 -
Attachment is obsolete: true
Assignee | ||
Comment 5•21 years ago
|
||
Comment on attachment 119647 [details] [diff] [review] Oops, that had an extra chunk in it... Would you review? The key here is that ToNewCString does lossy "conversion" to a CString.... This patch makes the conversion use the proper charset encoder, does a bit of cleanup, and removes the unnecessary call to Reset() (new decoders already come blank).
Attachment #119647 -
Flags: superreview?(jst)
Attachment #119647 -
Flags: review?(smontagu)
Assignee | ||
Comment 6•21 years ago
|
||
mine.
Assignee: smontagu → bzbarsky
Priority: -- → P1
Summary: Arabic text in Javascript "unescape" function returns the wrong output → [FIX]Arabic text in Javascript "unescape" function returns the wrong output
Target Milestone: --- → mozilla1.4beta
Comment 7•21 years ago
|
||
Comment on attachment 119647 [details] [diff] [review] Oops, that had an extra chunk in it... // Allocate a buffer of the maximum length PRUnichar *dest = (PRUnichar*)nsMemory::Alloc(sizeof(PRUnichar) * maxLength); + NS_ENSURE_TRUE(dest, NS_ERROR_OUT_OF_MEMORY); + ... + rv = decoder->Convert(encodedData, &unescapedByteCount, dest, &destLen); + if (NS_FAILED(rv)) { nsMemory::Free(dest); - return result; + return rv; } aReturn.Assign(dest, destLen); nsMemory::Free(dest); Could you make dest an nsAutoPtr<PRUnichar *> here? If not, you may want to flip the code around so that you need only one call to nsMemory::Free(dest)... sr=jst
Attachment #119647 -
Flags: superreview?(jst) → superreview+
Assignee | ||
Comment 8•21 years ago
|
||
nsAutoPtr will call "delete" on the pointer, so that's no good here (unless I actually do a "new PRUnichar[]" and cast or something like that.... I'll flip the test around; good catch.
Comment 9•21 years ago
|
||
Before I review, can you quiet my suspicions that this is a non-problem? Specifically, is non-ASCII text legal input to unescape(), and does calling unescape() with such text have defined output?
Comment 10•21 years ago
|
||
... and another question (possibly for another bug) : is %DA%D1%C8%ED the correct escaping of the string in the testcase? Why not %0639%0631%0628%064A? Or possibly %D8%B9%D8%B1%D8%A8%D9%89?
Comment 11•21 years ago
|
||
Comment on attachment 119647 [details] [diff] [review] Oops, that had an extra chunk in it... OK, I accept that we need this for compatibility, although we need a new bug to consider what the correct behaviour is, especially of escape(). Opera, IE, and Konqueror all render the second line of the testcase as: escaped: %u0639%u0631%u0628%u064A%20%u0639%u0631%u0628%u064A >+ // To gracefully deal with encoding issues, we have to do the following: Nit: don't split infinitives r=smontagu
Attachment #119647 -
Flags: review?(smontagu) → review+
Assignee | ||
Comment 12•21 years ago
|
||
Comment on attachment 119647 [details] [diff] [review] Oops, that had an extra chunk in it... Waldemar, Simon asked me to make sure you were OK with this patch.. could you take a look, please?
Attachment #119647 -
Flags: review+ → review?(waldemar)
Comment 13•21 years ago
|
||
The "new bug" I requested in comment 11 would be a dupe of bug 44272 :-)
Comment 14•21 years ago
|
||
I'll take a look at what the patch does. I'm currently building Mozilla so I can try it.
Comment 15•21 years ago
|
||
I'm not familiar with the character set used in the first test case, so I used a UTF-16 test case. encodeURI/decodeURI work as expected, but escape and unescape don't produce visible output with the patch applied. Hmmmm.....
Assignee | ||
Comment 16•21 years ago
|
||
Comment on attachment 120504 [details]
UTF-16 test case
With this testcase, "escape" produces no output even without my patch...
The problem is the use of flat char* buffers to handle UTF-16 text, of course.
Things die at the first null, which is the first byte if your endianness is
right.
I suppose I could go ahead and rewrite all that stuff in nsGlobalWindow to not
be broken like that....
Attachment #120504 -
Attachment mime type: text/utf16 → text/html
Assignee | ||
Comment 17•21 years ago
|
||
ccing some intl people for opinions
Comment 18•21 years ago
|
||
Asking the more general question: why do the JavaScript 'escape' and 'unescape' functions care what the document character set is? They are always passed UTF-16 from the JS engine and should send UTF-16 back to it. 'encodeURI' and 'decodeURI' don't care and work properly.
Assignee | ||
Comment 19•21 years ago
|
||
> why do the JavaScript 'escape' and 'unescape' functions care what the document
> character set is?
Because that's the behavior real web sites expect. In particular, one will
often have a site escape() a string, then on the server unescape it and expect
it to be in the document encoding.... Also, some sites will escape a string
server-side, converting a string in document encoding to %-escapes. When
unescaped, we get a byte sequence that's in the document encoding; to give it
back to the JS engine we have to convert it to UTF-16.
Assignee | ||
Comment 20•21 years ago
|
||
Waldemar, please see comment 19?
Assignee | ||
Comment 21•21 years ago
|
||
Attachment #119647 -
Attachment is obsolete: true
Assignee | ||
Updated•21 years ago
|
Attachment #119647 -
Flags: review?(waldemar)
Assignee | ||
Updated•21 years ago
|
Attachment #125691 -
Attachment filename: 琀攀猀琀⸀瀀愀琀挀栀 → test.patch
Attachment #125691 -
Flags: review?(waldemar)
Updated•21 years ago
|
Attachment #125691 -
Flags: review?(waldemar) → review+
Assignee | ||
Comment 22•21 years ago
|
||
Checked in.
Status: NEW → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
Comment 23•21 years ago
|
||
Verified fixed in 06-17 trunk build on WinXP except the problem in bug 44272.
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•