Bug 141819 - Publish URL, filename, username, and password need to be unescaped to avoid creating duplicate site entries
[adt3] publish,custrtm-
Product: SeaMonkey
Classification: Client Software
Component: Composer
Version: Trunk
Platform: x86 Windows 2000
Importance: -- normal with 3 votes
Target Milestone: ---
Assigned To: jag (Peter Annema)
Duplicates: 139781
Depends on:
Blocks: 232560
Reported: 2002-05-02 12:03 PDT by Charles Manske
Modified: 2010-04-28 13:13 PDT (History)
CC List: 9 users
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---

patch v1 (5.06 KB, patch)
2002-05-02 13:51 PDT, Charles Manske
no flags
patch v2 (5.05 KB, patch)
2002-05-02 13:55 PDT, Charles Manske
no flags
patch v3 (6.53 KB, patch)
2002-05-02 22:36 PDT, Charles Manske
no flags
patch v4 (7.27 KB, patch)
2002-05-13 10:59 PDT, Charles Manske
no flags

Description Charles Manske 2002-05-02 12:03:44 PDT
After publishing to a site whose publishing URL includes any characters that 
are "escaped" by nsIURI during publishing, publishing that page again will not
find the previously entered site information.
For example, users with Prodigy or SBC Global Net have usernames that are their email 
address (see bug 138254 for more information about publishing to sbcglobal),
e.g., "".
After publishing to this site, the next time you use the Publish dialog, the 
username from the document URL will be "" (the "@" was
escaped by nsIURI) and thus we won't find the entry in the publish prefs data,
which was saved as the unescaped "".
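The mismatch described above can be sketched in a few lines of JavaScript. This is only an illustration: the `savedSites` layout, the `findSiteIndex` helper, and the address `jane@example.net` are hypothetical, not the actual publish prefs code or data. `decodeURIComponent` stands in for whatever unescape routine is eventually used, and it assumes the escaped bytes are UTF-8.

```javascript
// Hypothetical saved publish-site entry; the real prefs layout is not shown in this report.
const savedSites = [
  { siteName: "My Site", publishUrl: "ftp://ftp.example.net/", username: "jane@example.net" },
];

// Username as it would come back from an escaped document URL ("@" became "%40").
const usernameFromDocUrl = "jane%40example.net";

// Hypothetical lookup: compare usernames after passing both through `normalize`.
function findSiteIndex(sites, username, normalize) {
  return sites.findIndex((site) => normalize(site.username) === normalize(username));
}

// Comparing the raw strings misses the saved entry.
const rawIndex = findSiteIndex(savedSites, usernameFromDocUrl, (s) => s);

// Bringing both sides to the same (unescaped) form finds it.
const normalizedIndex = findSiteIndex(savedSites, usernameFromDocUrl, decodeURIComponent);

console.log(rawIndex, normalizedIndex);
```

Either side could be normalized (escape the stored value or unescape the URL value); what matters for the lookup is that both are compared in the same form.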
Comment 1 Andreas Otte 2002-05-02 12:59:20 PDT
Then the username should be stored escaped, or should be escaped before being
compared. Do not unescape the URL; it might become unusable because of that.
Comment 2 Charles Manske 2002-05-02 13:51:38 PDT
Created attachment 82093
patch v1

This patch unescapes the docurl and filename in very selective places: when we
get them from the document URL we are editing. The publish UI and data saved in
prefs are all unescaped, so we avoid the problems of making the user enter
escaped values and the whole escaping/unescaping balancing nightmare.
We know for sure that the nsIURI object that is passed to nsIWebBrowserPersist
(then on to netwerk code) will always escape properly. It also seems that the
Password Manager will contain unescaped values, since they don't escape
strings supplied by users before storing them.
Thus it seems best to keep all UI and stored data in the unescaped form.
Andreas: I don't think we can do it as you suggested.
Comment 3 Charles Manske 2002-05-02 13:55:32 PDT
Created attachment 82094
patch v2

Small tweak: shouldn't have "var" in  
+  var docUrl = unescape(docUrl);
Comment 4 Kathleen Brade 2002-05-02 14:07:13 PDT
I'm concerned that StripUsernamePassword will not work correctly if there are
certain characters in the login or password: @, :, etc.

The var on this line looks wrong: +  var docUrl = unescape(docUrl);
Can you be sure to fix any JS errors/warnings?

I need more time to do a thorough review.
Comment 5 Andreas Otte 2002-05-02 14:49:42 PDT
I don't like it. Don't unescape the URL and then parse it to find
username/password and so on. Escaping it is just what we do to get the parser to
do the right thing. If you have to unescape (and it seems you have to) then
first parse the URL into its components and then unescape the components.
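A minimal sketch of the ordering argued for above, assuming a simplified URL shape (`scheme://user:pass@host/path`). The `parseThenUnescape` helper and the sample URL are hypothetical and are not the Composer `StripUsernamePassword` code; `decodeURIComponent` is used as a stand-in unescape and assumes UTF-8 escapes.

```javascript
// Hypothetical helper: split an *escaped* URL into parts first, then unescape each part.
function parseThenUnescape(escapedUrl) {
  const pattern =
    /^([a-z][a-z0-9+.-]*):\/\/(?:([^:@\/]*)(?::([^@\/]*))?@)?([^\/]*)(\/.*)?$/i;
  const match = pattern.exec(escapedUrl);
  if (!match) return null;
  const [, scheme, user = "", pass = "", host, path = "/"] = match;
  return {
    scheme,
    username: decodeURIComponent(user), // unescaped only after it has been isolated
    password: decodeURIComponent(pass),
    host,
    path, // left escaped in this sketch
  };
}

// Username contains "@" and password contains ":"; both are escaped in the URL.
const escaped = "ftp://jane%40example.net:p%3Ass@ftp.example.net/web/index.html";

// Parse first, then unescape: the delimiters are unambiguous.
const good = parseThenUnescape(escaped);

// Unescape first, then parse: the literal "@" and ":" now look like delimiters.
const bad = parseThenUnescape(decodeURIComponent(escaped));

console.log(good.username, good.password, good.host);
console.log(bad.username, bad.password, bad.host);
```

With the escaped input the username, password, and host come out as intended. After unescaping the whole URL first, the same parser stops at the first "@" and treats the rest of the username and the password as part of the host.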
Comment 6 Charles Manske 2002-05-02 22:36:34 PDT
Created attachment 82169
patch v3

Update from reviewer's comments: We shouldn't unescape the url before we
extract the username and password. Thus we have to unescape the parts 
Comment 7 Kathleen Brade 2002-05-03 10:43:56 PDT
Comment on attachment 82169
patch v3

+  pubUrl = unescape(pubUrl.slice(0, lastSlash+1));

I'm not convinced that we should be doing the unescape here.  Also, please
don't change the blank line below it (you are adding spaces needlessly).

This comment is not clear to me:
+    // Note: FindSiteIndexAndDocDir unescapes docUrl

There is a typo in a comment: embeded should be embedded

The same line also contains "As of 5/5/02" which seems impossible to know at
this present date.

I don't understand this comment at all.  I thought StripUsernamePassword didn't
have problems with parsing.

more later...
Comment 8 robinf 2002-05-06 11:21:17 PDT
Using today's branch build: 2002-05-06-08-1.0.0 (Build ID 2002050606) on Win 32,
Windows 2000 Professional, I'm not seeing this bug for my publishing site. I'm able to repeatedly publish to my site (using as the publishing URL) and I don't see duplicate site
Comment 9 sujay 2002-05-06 12:11:42 PDT
Does that mean this bug is fixed, Charley/Kathy?
Comment 10 Charles Manske 2002-05-07 17:41:53 PDT
*** Bug 139781 has been marked as a duplicate of this bug. ***
Comment 11 Charles Manske 2002-05-13 10:59:23 PDT
Created attachment 83365
patch v4

Corrected extra space, spelling errors, and clarified comments per brade's
I still think it's a good idea to unescape all parts of the document URL when 
extracting strings to use for publishing.
Comment 12 Kathleen Brade 2002-05-13 12:21:44 PDT
Here are some snippets from conversations I've had with Darin about escaping in
 * window.unescape == nsGlobalWindow::Unescape which is lossy
 * non-ASCII characters will be interpreted relative to the document charset
 * it could easily result in truncation or data loss
 * don't unescape the URL strings/pieces in JS unless it's known for certain
that the unescaped chars belong to the document charset (or UTF-8 if there is no
document charset)
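The charset concern in these notes can be illustrated with the standard ECMAScript functions. This is a sketch under stated assumptions: it uses today's global `unescape` and `decodeURIComponent`, not the 2002 `nsGlobalWindow::Unescape` implementation mentioned above, and `unescapeIfUtf8` is a hypothetical helper name.

```javascript
const utf8Escaped = "%C3%A9"; // "é" encoded as UTF-8 (two bytes)
const latin1Escaped = "%E9";  // "é" encoded as ISO-8859-1 (one byte)

// unescape() maps each %XX to a single character, so the two UTF-8 bytes
// come back as two separate characters instead of one "é".
const viaUnescape = unescape(utf8Escaped);

// decodeURIComponent() treats the bytes as UTF-8 and yields a single "é".
const viaDecode = decodeURIComponent(utf8Escaped);

// Hypothetical helper: unescape only when the bytes are valid UTF-8,
// otherwise leave the fragment escaped rather than guess at a charset.
function unescapeIfUtf8(fragment) {
  try {
    return decodeURIComponent(fragment);
  } catch (e) {
    if (e instanceof URIError) return fragment;
    throw e;
  }
}

console.log(viaUnescape.length, viaDecode.length);
console.log(unescapeIfUtf8(latin1Escaped));
```

Leaving an undecodable fragment escaped is one conservative choice; it avoids the truncation and data loss the notes warn about, at the cost of showing escaped text to the user.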
Comment 13 Charles Manske 2002-06-12 15:22:54 PDT
After talking to Darin about this issue, there are definitely problems with
simply using the JS "unescape" method. This can result in errors, especially with
"higher ASCII" characters. We are deferring this issue until we have a smarter
unescape utility implemented in network code.
Comment 14 Charles Manske 2002-08-12 13:59:56 PDT
darin: any chance of implementing what we discussed about putting better 
unescape support in necko?
Comment 15 Darin Fisher 2002-08-12 14:17:41 PDT
cmanske: nhotta is working on adding a method to uconv that can be used to
safely unescape URL strings.
Comment 16 nhottanscp 2002-08-12 14:41:40 PDT
mozilla/intl/uconv/idl/nsITextToSubURI.idl rev1.9
  AString unEscapeURIForUI(in ACString aCharset, in AUTF8String aURIFragment);  

That was checked in to the trunk today.
Comment 17 jag (Peter Annema) 2003-01-24 09:32:37 PST
-> me
Comment 18 Samir Gehani 2003-03-11 15:05:11 PST
Composer triage team: nsbeta1+/adt3
Comment 19 Robert Kaiser 2009-06-14 14:53:23 PDT
This bug report is registered in the SeaMonkey product, but has been without a comment since the inception of the SeaMonkey project. This means that it was logged against the old Mozilla suite and we cannot determine that it's still valid for the current SeaMonkey suite. Because of this, we are setting it to an UNCONFIRMED state.

If you can confirm that this report still applies to current SeaMonkey 2.x nightly builds, please set it back to the NEW state along with a comment on how you reproduced it on what Build ID, or if it's an enhancement request, why it's still worth implementing and in what way.
If you can confirm that the report doesn't apply to current SeaMonkey 2.x nightly builds, please set it to the appropriate RESOLVED state (WORKSFORME, INVALID, WONTFIX, or similar).
If no action happens within the next few months, we move this bug report to an EXPIRED state.

Query tag for this change: mass-UNCONFIRM-20090614
Comment 20 Robert Kaiser 2010-04-28 13:13:22 PDT
This bug report is registered in the SeaMonkey product, but still has no comment since the inception of the SeaMonkey project 5 years ago.

Because of this, we're resolving the bug as EXPIRED.

If you still can reproduce the bug on SeaMonkey 2 or otherwise think it's still valid, please REOPEN it and if it is a platform or toolkit issue, move it to the according component.

Query tag for this change: EXPIRED-20100420
