After publishing to a site whose publishing URL includes any characters that
are "escaped" by nsIURI during publishing, publishing that page again will not
find the previously entered site information.
E.g.: Users with Prodigy or SBC Global Net have usernames that are their email
address (see bug 138254 for more information about publishing to sbcglobal).
After publishing to this site, the next time you use the Publish dialog, the
username from the document URL will be "robinc%40sbcglobal.net" (the "@" was
escaped by nsIURI) and thus we won't find the entry in the publish prefs data,
which was saved as the unescaped "email@example.com".
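To make the mismatch concrete, here is a minimal sketch in plain JavaScript. The savedSites array and the findSiteIndex helper are hypothetical stand-ins for the publish prefs data and its lookup, not the actual Composer code, and the unescaped username is simply the decoded form of the escaped string quoted above.

```javascript
// Hypothetical stand-in for the publish prefs data: the username is stored
// exactly as the user typed it, i.e. unescaped.
const savedSites = [
  { siteName: "sbcglobal", username: "robinc@sbcglobal.net" },
];

// Hypothetical lookup helper: exact string comparison on the username.
function findSiteIndex(sites, username) {
  return sites.findIndex((site) => site.username === username);
}

// What the document URL yields after the "@" has been escaped.
const usernameFromDocUrl = "robinc%40sbcglobal.net";

const escapedLookup = findSiteIndex(savedSites, usernameFromDocUrl);
const unescapedLookup = findSiteIndex(
  savedSites,
  decodeURIComponent(usernameFromDocUrl)
);

console.log(escapedLookup);   // -1: the saved entry is not found
console.log(unescapedLookup); // 0: both sides are compared in the same form
```

Either side could be converted; the point is only that both strings must be in the same form before they are compared.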
Then the username should either be stored escaped or be escaped before it is
compared. Do not unescape the URL; it might become unusable because of that.
Created attachment 82093
This patch unescapes the docurl and filename in very selective places: when we
get them from the document URL we are editing. The publish UI and data saved in
prefs are all unescaped, so we avoid the problems of making the user enter
escaped values and the whole escaping/unescaping balancing nightmare.
We know for sure that the nsIURI object that is passed to nsIWebBrowserPersist
(then on to netwerk code) will always escape properly. It also seems that the
Password Manager will contain unescaped values, since they don't escape
strings supplied by users before storing them.
Thus it seems best to keep all UI and stored data in the unescaped form.
Andreas: I don't think we can do it as you suggested.
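As a rough illustration of "keep UI and stored data unescaped, escape only on the way to the network", the sketch below stores plain values and escapes them at the single point where a URL string is assembled. The site record, its field names, and buildPublishUrl are hypothetical; in the real code the escaping is done by nsIURI rather than by encodeURIComponent.

```javascript
// Hypothetical site record as it would sit in prefs: everything unescaped.
const site = {
  username: "robinc@sbcglobal.net",
  host: "pages.sbcglobal.net",
  dir: "my docs",
};

// Escape each piece only when assembling the URL handed to the network layer.
function buildPublishUrl(s) {
  return (
    "ftp://" + encodeURIComponent(s.username) + "@" + s.host + "/" +
    encodeURIComponent(s.dir) + "/"
  );
}

const publishUrl = buildPublishUrl(site);
console.log(publishUrl);
// ftp://robinc%40sbcglobal.net@pages.sbcglobal.net/my%20docs/
```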
Created attachment 82094
Small tweak: shouldn't have "var" in
+ var docUrl = unescape(docUrl);
I'm concerned that StripUsernamePassword will not work correctly if there are
certain characters in the login or password: @, :, etc.
The var on this line looks wrong: + var docUrl = unescape(docUrl);
Can you be sure to fix any JS errors/warnings?
I need more time to do a thorough review.
I don't like it. Don't unescape the URL and then parse it to find the
username/password and so on. Escaping it is just what we do to get the parser to
do the right thing. If you have to unescape (and it seems you have to) then
first parse the URL into its components and then unescape the components.
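A small sketch of the "parse first, then unescape the components" order, using the standard URL class and decodeURIComponent as stand-ins for nsIURI and whatever unescape routine is finally chosen. The password in the sample URL and the parseThenUnescape helper are made up for illustration, and decodeURIComponent assumes the escaped bytes are UTF-8.

```javascript
// Parse the still-escaped URL first, then decode each component on its own.
function parseThenUnescape(escapedUrl) {
  const parsed = new URL(escapedUrl);
  return {
    username: decodeURIComponent(parsed.username),
    password: decodeURIComponent(parsed.password),
    host: parsed.host,
    path: decodeURIComponent(parsed.pathname),
  };
}

const escapedUrl =
  "ftp://robinc%40sbcglobal.net:p%3Aw@pages.sbcglobal.net/site/index.html";

const parts = parseThenUnescape(escapedUrl);
console.log(parts.username); // robinc@sbcglobal.net
console.log(parts.password); // p:w
console.log(parts.host);     // pages.sbcglobal.net

// Decoding the whole string first puts literal "@" and ":" back into the
// authority, so the delimiters can no longer be told apart from the data.
const decodedFirst = decodeURIComponent(escapedUrl);
console.log(decodedFirst);
// ftp://robinc@sbcglobal.net:p:w@pages.sbcglobal.net/site/index.html
```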
Created attachment 82169
Update from reviewer's comments: We shouldn't unescape the url before we
extract the username and password. Thus we have to unescape the parts
Comment on attachment 82169
+ pubUrl = unescape(pubUrl.slice(0, lastSlash+1));
I'm not convinced that we should be doing the unescape here. Also, please
don't change the blank line below it (you are adding spaces needlessly).
This comment is not clear to me:
+ // Note: FindSiteIndexAndDocDir unescapes docUrl
There is a typo in a comment: embeded should be embedded
The same line also contains "As of 5/5/02" which seems impossible to know at
this present date.
I don't understand this comment at all. I thought StripUsernamePassword didn't
have problems with parsing.
Using today's branch build: 2002-05-06-08-1.0.0 (Build ID 2002050606) on Win32,
Windows 2000 professional, I'm not seeing this bug for my publishing site
sbcglobal.net. I'm able to repeatedly publish to my site (using
ftp://pages.sbcglobal.net/ as the publishing URL) and I don't see duplicate site
Does that mean this bug is fixed, Charley/Kathy?
*** Bug 139781 has been marked as a duplicate of this bug. ***
Created attachment 83365
Corrected extra space, spelling errors, and clarified comments per brade's
I still think it's a good idea to unescape all parts of the document URL when
extracting strings to use for publishing.
Here are some snippets from conversations I've had with Darin about escaping in
* window.unescape == nsGlobalWindow::Unescape which is lossy
* non-ASCII characters will be interpreted relative to the document charset
* it could easily result in truncation or data loss
* don't unescape the URL strings/pieces in JS unless it's known for certain
that the unescaped chars belong to the document charset (or UTF-8 if there is no
After talking to Darin about this issue, there are definitely problems with
simply using the JS "unescape" method. This can result in errors, especially with
"higher ASCII" characters. We are deferring this issue until we have a smarter
unescape utility implemented in network code.
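The charset problem can be seen with the standard JavaScript globals alone. This sketch uses today's unescape and decodeURIComponent, which may not behave exactly like nsGlobalWindow::Unescape did at the time, so treat it as an illustration of the failure mode rather than a reproduction.

```javascript
// "é" percent-encoded as UTF-8 (two bytes) and as Latin-1 (one byte).
const utf8Escaped = "%C3%A9";
const latin1Escaped = "%E9";

// unescape() turns every %XX into the code unit XX, regardless of charset,
// so UTF-8 input comes back as two wrong characters.
console.log(unescape(utf8Escaped) === "\u00C3\u00A9"); // true
console.log(unescape(latin1Escaped) === "\u00E9");     // true

// decodeURIComponent() insists on valid UTF-8.
console.log(decodeURIComponent(utf8Escaped) === "\u00E9"); // true

// A lone Latin-1 byte is not valid UTF-8, so decoding throws.
let latin1Error = null;
try {
  decodeURIComponent(latin1Escaped);
} catch (e) {
  latin1Error = e;
}
console.log(latin1Error instanceof URIError); // true
```

Neither function knows which charset the escaped bytes were written in, which is why the result depends on guessing correctly.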
darin: any chance of implementing what we discussed about putting better
unescape support in necko?
cmanske: nhotta is working on adding a method to uconv that can be used to
safely unescape URL strings.
AString unEscapeURIForUI(in ACString aCharset, in AUTF8String aURIFragment);
That was checked in to the trunk today.
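For readers without the uconv method at hand, the general idea of a "safe" unescape can be sketched in plain JavaScript. The unescapeForDisplay helper below is hypothetical and is not the uconv implementation: it handles only UTF-8, ignores the charset argument the real method takes, and simply leaves the fragment escaped when it cannot be decoded cleanly.

```javascript
// Hypothetical helper, not the uconv implementation: decode a fragment as
// UTF-8 for display, and keep the escaped text if it is not valid UTF-8.
function unescapeForDisplay(fragment) {
  try {
    return decodeURIComponent(fragment);
  } catch (e) {
    // Better to show the escaped form than to show corrupted characters.
    return fragment;
  }
}

console.log(unescapeForDisplay("robinc%40sbcglobal.net")); // robinc@sbcglobal.net
console.log(unescapeForDisplay("caf%E9"));                 // caf%E9 (left as-is)
```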
Composer triage team: nsbeta1+/adt3
This bug report is registered in the SeaMonkey product, but has been without a comment since the inception of the SeaMonkey project. This means that it was logged against the old Mozilla suite and we cannot determine that it's still valid for the current SeaMonkey suite. Because of this, we are setting it to an UNCONFIRMED state.
If you can confirm that this report still applies to current SeaMonkey 2.x nightly builds, please set it back to the NEW state along with a comment on how you reproduced it on what Build ID, or if it's an enhancement request, why it's still worth implementing and in what way.
If you can confirm that the report doesn't apply to current SeaMonkey 2.x nightly builds, please set it to the appropriate RESOLVED state (WORKSFORME, INVALID, WONTFIX, or similar).
If no action happens within the next few months, we move this bug report to an EXPIRED state.
Query tag for this change: mass-UNCONFIRM-20090614
This bug report is registered in the SeaMonkey product, but still has no comment since the inception of the SeaMonkey project 5 years ago.
Because of this, we're resolving the bug as EXPIRED.
If you can still reproduce the bug on SeaMonkey 2 or otherwise think it's still valid, please REOPEN it and, if it is a platform or toolkit issue, move it to the appropriate component.
Query tag for this change: EXPIRED-20100420