After publishing to a site whose publishing url includes any characters that are "escaped" by nsIURI during publishing, publishing that page again will not find the previously-enterred site information. E.g: Users with Prodigy or SBC Global Net have usernames that are their email address (see bug 138254 for more information about publishing to sbcglobal), e.g., "email@example.com". After publishing to this site, the next time you use the Publish dialog, the username from the document URL will be "robinc%40sbcglobal.net" (the "@" was escaped by nsIURI) and thus we won't find the entry in the publish prefs data, which was saved as the unescaped "firstname.lastname@example.org".
Then the username should be stored escaped or should be escaped before being compared, do not unescape the url it might get unusable because of that.
Created attachment 82093 [details] [diff] [review] patch v1 This patch unescapes the docurl and filename in very selective places: when we get them from the document URL we are editing. The publish UI and data saved in prefs are all unescaped, so we avoid the problems of making the user enter escaped values and the whole escaping/unescaping balancing nightmare. We know for sure that the nsIURI object that is passed to nsIWebBrowserPersist (then on to netwerk code) will always escape properly. It also seems that the Password Manager will contain unescaped values, since they don't escape strings supplied by users before storing them. Thus it seems best to keep all UI and stored data in the unescaped form. Andreas: I don't think we can do it as you suggested.
Created attachment 82094 [details] [diff] [review] patch v2 Small tweak: shouldn't have "var" in + var docUrl = unescape(docUrl);
I'm concerned that StripUsernamePassword will not work correctly if there are certain passwords in login or password: @, :, etc. The var on this line looks wrong: + var docUrl = unescape(docUrl); Can you be sure to fix any JS errors/warnings? I need more time to do a thorough review.
I don't like it. Don't unescape the url and the parse it to find username/password and so on. Escaping it is just what we do to get the parser to do the right thing. If you have to unescape (and it seems you have to) then first parse the url into its components and then unescape the components.
Created attachment 82169 [details] [diff] [review] patch v3 Update from reviewer's comments: We shouldn't unescape the url before we extract the username and password. Thus we have to unescape the parts separately.
Comment on attachment 82169 [details] [diff] [review] patch v3 + pubUrl = unescape(pubUrl.slice(0, lastSlash+1)); + I'm not convinced that we should be doing the unescape here. Also, please don't change the blank line below it (you are adding spaces needlessly). This comment is not clear to me: + // Note: FindSiteIndexAndDocDir unescapes docUrl There is a typo in a comment: embeded should be embedded The same line also contains "As of 5/5/02" which seems impossible to know at this present date. I don't understand this comment at all. I thought StripUsernamePassword didn't have problems with parsing. more later...
Using today's branch build: 20002-05-06-08-1.0.0 (Build ID 2002050606) on Win 32, Windows 2000 professional, I'm not seeing this bug for my publishing site sbcglobal.net. I'm able to repeatedly publish to my site (using ftp://pages.sbcglobal.net/ as the publishing URL) and I don't see duplicate site entries.
Does that mean this bug is fixed Charley/Kathy ?
*** Bug 139781 has been marked as a duplicate of this bug. ***
Created attachment 83365 [details] [diff] [review] patch v4 Corrected extra space, spelling errors, and clarified comments per brade's review. I still think it's a good idea to unescape all parts of the document URL when extracting strings to use for publishing.
Here are some snippets from conversations I've had with Darin about escaping in urls: * window.unescape == nsGlobalWindow::Unescape which is lossy * non-ASCII characters will be interpreted relative to the document charset * it could easily result in truncation or dataloss * don't unescape the URL strings/pieces in JS unless it's known for certain that the unescaped chars belong to the document charset (or UTF-8 if there is no document charset)
After talking to Darin about this issue, there are definitely problems with simply using the JS "unescape" method. This can result in errors, especially with "higer ascii" characters. We are deffering this issue until we have a smarter unescape utility implemented in network code.
darin: any chance of implementing what we discussed about putting better unescape support in necko?
cmanske: nhotta is working on adding a method to uconv that can be used to safely unescape URL strings.
mozilla/intl/uconv/idl/nsITextToSubURI.idl rev1.9 AString unEscapeURIForUI(in ACString aCharset, in AUTF8String aURIFragment); That was checked in to the trunk today.
Composer triage team: nsbeta1+/adt3
MASS-CHANGE: This bug report is registered in the SeaMonkey product, but has been without a comment since the inception of the SeaMonkey project. This means that it was logged against the old Mozilla suite and we cannot determine that it's still valid for the current SeaMonkey suite. Because of this, we are setting it to an UNCONFIRMED state. If you can confirm that this report still applies to current SeaMonkey 2.x nightly builds, please set it back to the NEW state along with a comment on how you reproduced it on what Build ID, or if it's an enhancement request, why it's still worth implementing and in what way. If you can confirm that the report doesn't apply to current SeaMonkey 2.x nightly builds, please set it to the appropriate RESOLVED state (WORKSFORME, INVALID, WONTFIX, or similar). If no action happens within the next few months, we move this bug report to an EXPIRED state. Query tag for this change: mass-UNCONFIRM-20090614
MASS-CHANGE: This bug report is registered in the SeaMonkey product, but still has no comment since the inception of the SeaMonkey project 5 years ago. Because of this, we're resolving the bug as EXPIRED. If you still can reproduce the bug on SeaMonkey 2 or otherwise think it's still valid, please REOPEN it and if it is a platform or toolkit issue, move it to the according component. Query tag for this change: EXPIRED-20100420