Closed Bug 137182 Opened 24 years ago Closed 23 years ago

javascript: URL's with non-UTF8 characters not working.

Categories

(Core :: DOM: Core & HTML, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla1.1beta

People

(Reporter: jlquinn, Assigned: nhottanscp)

References

()

Details

Attachments

(2 files, 4 obsolete files)

Version is win32 0.9.9+ Gecko 2002412 Click on Processor. This pops up a selection box. In Mozilla you cannot select any of the entries. IE 5 works OK.
Attached file Reduced HTML testcase
Confirming bug using Mozilla trunk binary 20020411xx WinNT. Bring up the child window as described above, and click on a link. Nothing happens. The links in the child window fire this function: function SaveChoice(CategoryDataID,CategoryDataName,isVendor) { document.location.href = 'CategorySelectorSubmit.asp' + '?CategoryID=1' + '&CategoryDataID=' + CategoryDataID + '&CategoryDataName=' + CategoryDataName + '&IsVendor=' + isVendor; } Typical values for the three parameters to this function are: 60, 'Intel%AE+Xeon%99+processor+%28400MHz+FSB%29', 'false' It is the string 'Intel%AE+Xeon%99+processor+%28400MHz+FSB%29' that is causing the problem.
Assignee: rogerl → khanson
Status: UNCONFIRMED → NEW
Ever confirmed: true
Here is the HTML for the reduced testcase above: <html><head><title>Bug 137182</title> <script> function SaveChoice() { alert('Hello'); } </script></head><body> <a ref=javascript:SaveChoice('Intel%AE+Xeon%99+processor+%28400MHz+FSB%29')> Click this to get an alertbox </a> </body></html> In Mozilla the test fails: no alertbox comes up. If you remove the four '%' signs from the string, the testcase succeeds. With the '%' characters present, however, the alertbox does not come up. No errors in the JS Console. The testcase always works in NN4.7 and in IE6, whether there are '%' characters in the string or not -
This standalone test for the JS shell succeeds: test(); function test() { SaveChoice('IntelAE+Xeon99+processor+28400MHz+FSB29'); } function SaveChoice() { print('Hello'); } This shows the problem is not with JS Engine. Reassigning to DOM Level 0 for further analysis -
Assignee: khanson → jst
Component: JavaScript Engine → DOM Level 0
QA Contact: pschwartau → desale
Comment #4 has four typos. HOW DID I DO THAT?!??? The standalone JS testcase I'm using __HAS__ the '%' signs in it: test(); function test() { SaveChoice('Intel%AE+Xeon%99+processor+%28400MHz+FSB%29'); } function SaveChoice() { print('Hello'); } It passes in the JS shell as I stated above -
The problem here is that we unescape the javascript: URI and then assume that it's UTF8, but in this case it's not, and the UTF8 to UCS2 conversion fails. The result of this is that the whole script is lost so we never end up calling alert()...
Status: NEW → ASSIGNED
OS: Windows 2000 → All
Priority: -- → P3
Hardware: PC → All
Target Milestone: --- → mozilla1.1alpha
Summary: Javascript selection popup is broken → javascript: URL's with non-UTF8 characters not working.
*** Bug 149296 has been marked as a duplicate of this bug. ***
Naoki, we need to use the document charset and do the correct conversions here when converting the javascript: URL into unicode, could you have a look at this, I know you've done similar things in other places. If not, hand this back...
Assignee: jst → nhotta
Status: ASSIGNED → NEW
Target Milestone: mozilla1.1alpha → mozilla1.1beta
jst, any idea which file I should look at?
Status: NEW → ASSIGNED
Yeah, the problem is in the method nsJSThunk::EvaluateScript() starting at: http://lxr.mozilla.org/seamonkey/source/dom/src/jsurl/nsJSProtocolHandler.cpp#122
nsJSProtocolHandler::NewURI takes a charset, so we can store it. http://lxr.mozilla.org/seamonkey/source/dom/src/jsurl/nsJSProtocolHandler.cpp#782 But I don't know how to access nsJSProtocolHandler in nsJSThunk::EvaluateScript(). jst, is that possible?
We could make the nsJSProtocolHandler that created the nsJSThunk accessable from the nsJSThunk, but there's no guarantee that the nsJSProtocolHandler that created the nsJSThunk is the same one that ::NewURI() was called on, so I'm afraid that won't work. We could however the the conversion in ::NewURI() where we do have the charset, i.e. we could convert the incoming string into UTF8 and then once we get the nsJSThunk::EvaluateScript() we'd know that the JS in the URI is UTF8. Does that sound reasonable?
Yes, doing the conversion in ::NewURI() is good. There is already a code to convert escape style to JS \uXXXX in the code. But that is not executed if the URI is escaped. So the URI has to be unescaped first. Then convert from 'aCharset' to Uncode and apply \uXXXX escape. Not sure we need to apply URL espcae again to the result. Also the callers have to be changed to pass a document charset instead of the default empty string.
We don't need any JS style escaping here, all we need is to convert the 8-bit URI spec into unicode using the document's charset and then convert that into UTF8 and do normal URI escaping on the UTF8 string. Then once we get into nsJSThunk::EvaluateString() we'll take that URI escaped string and unescape it and convert the result from UTF8 into unicode again.
Attachment #88669 - Attachment is obsolete: true
Attachment #91691 - Attachment is obsolete: true
jst, could you review the patch?
*** Bug 158102 has been marked as a duplicate of this bug. ***
*** Bug 158241 has been marked as a duplicate of this bug. ***
Comment on attachment 91692 [details] [diff] [review] forgot to include a header change sr=jst
Attachment #91692 - Flags: superreview+
Comment on attachment 91692 [details] [diff] [review] forgot to include a header change >Index: dom/src/jsurl/nsJSProtocolHandler.cpp >=================================================================== >+NS_METHOD >+nsJSProtocolHandler::EnsureUTF8Spec(const nsACString &aSpec, const char *aCharset, >+ nsACString &aUTF8Spec) >+{ >+ aUTF8Spec.Truncate(0); No need for the 0 there. just Truncate() will do. Why don't we just check IsAscii(aSpec) here ? http://lxr.mozilla.org/mozilla/source/mailnews/compose/src/nsSmtpService.cpp#91 does it that way and it seems like the right thing to do... >+ nsCAutoString unescapedSpec; >+ NS_UnescapeURL(PromiseFlatCString(aSpec).get(), aSpec.Length(), >+ esc_OnlyNonASCII, unescapedSpec); >+ >+ if (IsASCII(unescapedSpec)) >+ return NS_OK; >+ >+ nsresult rv; >+ if (!mCharsetConverterManager) { >+ mCharsetConverterManager = do_GetService(NS_CHARSETCONVERTERMANAGER_CONTRACTID, &rv); >+ NS_ENSURE_SUCCESS(rv, rv); >+ } >+ nsCOMPtr<nsIAtom> charsetAtom; >+ rv = mCharsetConverterManager->GetCharsetAtom2(aCharset, getter_AddRefs(charsetAtom)); >+ NS_ENSURE_SUCCESS(rv, rv); >+ >+ if (mCharsetAtom != charsetAtom) { >+ rv = mCharsetConverterManager->GetUnicodeDecoder(charsetAtom, >+ getter_AddRefs(mUnicodeDecoder)); >+ NS_ENSURE_SUCCESS(rv, rv); >+ mCharsetAtom = charsetAtom; >+ } >+ >+ PRInt32 srcLen = unescapedSpec.Length(); >+ PRInt32 dstLen; >+ rv = mUnicodeDecoder->GetMaxLength(unescapedSpec.get(), srcLen, &dstLen); >+ NS_ENSURE_SUCCESS(rv, rv); >+ >+ PRUnichar *ustr = (PRUnichar *) nsMemory::Alloc(dstLen * sizeof(PRUnichar)); Hm, do we need to also allocate room for |sizeof(PRUnichar('\0'))| here? The string isn't null terminated, but it would be a good thing to check... >+ NS_ENSURE_TRUE(ustr, NS_ERROR_OUT_OF_MEMORY); >+ >+ rv = mUnicodeDecoder->Convert(unescapedSpec.get(), &srcLen, ustr, &dstLen); >+ if (NS_SUCCEEDED(rv)) { >+ NS_ConvertUCS2toUTF8 rawUTF8Spec(ustr, dstLen); >+ nsMemory::Free(ustr); Why is this free inside the if? This will leak |ustr| if Convert() fails. >+ NS_EscapeURL(rawUTF8Spec, esc_AlwaysCopy | esc_OnlyNonASCII, aUTF8Spec); >+ } >+ >+ return rv; >+} >+ >Index: docshell/base/nsWebShell.cpp >=================================================================== >RCS file: /cvsroot/mozilla/docshell/base/nsWebShell.cpp,v >retrieving revision 1.584 >diff -u -r1.584 nsWebShell.cpp >--- docshell/base/nsWebShell.cpp 15 Jul 2002 22:03:33 -0000 1.584 >+++ docshell/base/nsWebShell.cpp 17 Jul 2002 20:22:30 -0000 >@@ -575,11 +575,27 @@ > // Fall through, this seems like the most reasonable action > case eLinkVerb_Replace: > { >+ // get a charset of the document and use is as originCharset >+ nsAutoString docCharset; >+ nsCOMPtr<nsIPresShell> presShell; >+ nsDocShell::GetPresShell(getter_AddRefs(presShell)); >+ if (presShell) >+ { >+ nsCOMPtr<nsIDocument> doc; >+ presShell->GetDocument(getter_AddRefs(doc)); >+ if (doc) >+ { >+ if (NS_FAILED(doc->GetDocumentCharacterSet(docCharset))) >+ docCharset.Truncate(0); Again, you can lose the 0 and just call Truncate(). Also why the extra level of if? Just do if (doc && NS_FAILED(...)) >+ } >+ } >+ > // for now, just hack the verb to be view-link-clicked > // and down in the load document code we'll detect this and > // set the correct uri loader command > nsCOMPtr<nsIURI> uri; >- NS_NewURI(getter_AddRefs(uri), nsDependentString(aURLSpec), nsnull); >+ NS_NewURI(getter_AddRefs(uri), nsDependentString(aURLSpec), >+ docCharset.IsEmpty() ? nsnull : NS_LossyConvertUCS2toASCII(docCharset).get()); Please break up the ?: onto separate lines to make this easier to read. E.g. NS_NewURI(getter_AddRefs(uri), nsDependentString(aURLSpec), docCharset.IsEmpty() ? nsnull : NS_LossyConvertUCS2toASCII(docCharset).get()); > > // No URI object? This may indicate the URLspec is for an > // unrecognized protocol. Embedders might still be interested
*** Bug 158241 has been marked as a duplicate of this bug. ***
uconv does not use null terminated string, so no allocation for that
Attachment #91692 - Attachment is obsolete: true
caillon, could you review the patch? Are you working on dom/content area? Since jst looked at it, I can assume it as a module owner's review and I need one more review.
Comment on attachment 92244 [details] [diff] [review] Changed according to caillon's comment. A couple of minor nits I didn't catch last time I looked at this: + NS_METHOD EnsureUTF8Spec(const nsACString &aSpec, const char *aCharset, + nsACString &aUTF8Spec); Since this is an internal helper, no need to use the NS_METHOD macro for declaring the return type here, simply "nsresult" will do, both here and in the implementation of the method. +nsJSProtocolHandler::EnsureUTF8Spec(const nsACString &aSpec, const char *aCharset, + nsACString &aUTF8Spec) +{ + aUTF8Spec.Truncate(); + + // assume UTF-8 if the spec contains unescaped non ASCII + if (!nsCRT::IsAscii(PromiseFlatCString(aSpec).get())) + return NS_OK; If you'd change the signature of this method to take aSpec as an const nsAFlatCString& you wouldn't need to use PromiseFlatCString() at all here, aSpec.get() would work directly. Slightly less code, either way works... sr=jst
Attachment #92244 - Flags: superreview+
Comment on attachment 92244 [details] [diff] [review] Changed according to caillon's comment. r=caillon with jst's changes.
Attachment #92244 - Flags: review+
Really expect this bug is closed as soon as possible! hehe
Attachment #92244 - Attachment is obsolete: true
Comment on attachment 92414 [details] [diff] [review] Changed return type and argument type of EnsureUTF8Spec. copy r/sr
Attachment #92414 - Flags: superreview+
Attachment #92414 - Flags: review+
Comment on attachment 92414 [details] [diff] [review] Changed return type and argument type of EnsureUTF8Spec. a=asa (on behalf of drivers) for checkin to 1.1
Attachment #92414 - Flags: approval+
checked in to the trunk (yesterday)
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
I still can't see news list in the middle of page pkunews.pku.edu.cn Really want this bug resolved ealier, thanks for all your working!
Oh, I can see my school new now! thanks !!! very very much!
There are something strang, I report it here, but if it's not the bug for here, I will create a new one. The problem is, when I update to 2002072408 version, I can see the right page (pkunews.pku.edu.cn) when I click the link on home page(www.pku.edu.cn). But if I direct to the page pkunew.pku.edu.cn, the mozilla show the wrong page as before, WHY?? Because this website is for chinese, let me teach you how go reproduce the bug. Go www.pku.edu.cn, then click on the &#21271;&#22823;&#26032;&#38395;&#32593;, then you will get the right page, but type pkunews.pku.edu.cn in mozilla, you got wrong one! Really amaing!
I can see the problem by typing pkunews.pku.edu.cn to the URL location bar. The lists in the center of the page do not expand. I think this is a separate problem, at least the jsurl code is not executed for that test case. Please file a separate bug.
The change for nsGenericElement.cpp caused a regression bug 159434.
Ok, I will report a new bug for it!
add topembed.
Hansoo request this to go into m1.0.1
All, Sorry about the fault request on putting the patch on 1.0.1. I looked at the code a little closely and I think we can fix the problem in our hands in our side. Thank you very much, and once again, sorry about the fault request. Can anyone remove the KWs ?
Blocks: 155569
*** Bug 155199 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: