Last Comment Bug 408599 - FeedProtocolHandler creates horrifying nsIStandardURLs
: FeedProtocolHandler creates horrifying nsIStandardURLs
Status: RESOLVED FIXED
:
Product: Firefox
Classification: Client Software
Component: RSS Discovery and Preview (show other bugs)
: Trunk
: All All
: -- normal (vote)
: ---
Assigned To: neil@parkwaycc.co.uk
:
Mentors:
: 511294 (view as bug list)
Depends on: 571619
Blocks: 367432
  Show dependency treegraph
 
Reported: 2007-12-16 13:32 PST by Phil Ringnalda (:philor, back in August)
Modified: 2015-04-07 23:06 PDT (History)
14 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Basic patch (1.58 KB, patch)
2009-03-29 16:12 PDT, neil@parkwaycc.co.uk
cbiesinger: review+
bzbarsky: superreview+
Details | Diff | Splinter Review
Untested browser part of the fix (5.09 KB, patch)
2010-06-13 15:19 PDT, neil@parkwaycc.co.uk
asaf: review-
Details | Diff | Splinter Review
updated to tip (without the test) (3.67 KB, patch)
2010-08-05 15:45 PDT, Mano (::mano, needinfo? for any questions; not reading general bugmail)
no flags Details | Diff | Splinter Review
SeaMonkey version of the patch (3.26 KB, patch)
2010-08-08 14:57 PDT, neil@parkwaycc.co.uk
bzbarsky: feedback+
Details | Diff | Splinter Review
Updated for bitrot (5.73 KB, patch)
2011-05-08 04:37 PDT, neil@parkwaycc.co.uk
gavin.sharp: review-
Details | Diff | Splinter Review
Addressed review comments (5.69 KB, patch)
2011-05-10 01:33 PDT, neil@parkwaycc.co.uk
gavin.sharp: review+
Details | Diff | Splinter Review
SeaMonkey changes (3.27 KB, patch)
2011-05-13 13:35 PDT, neil@parkwaycc.co.uk
iann_bugzilla: review+
Details | Diff | Splinter Review

Description Phil Ringnalda (:philor, back in August) 2007-12-16 13:32:22 PST
Four "valid" feed: "URI"s walk into a bar:

feed://example.org/feed/
feed:http://example.org/feed/
feed:https://example.org/feed/
feed:ftp://example.org/feed/

Because FeedProtocolHandler.newURI treats them as nsIStandardURLs, failing to note that the "spec" author didn't read the part of the Generic URI Syntax RFC that makes ":" an invalid character in a scheme, it turns them into the following URIs:

host: example.org path: /feed/
host: http path: //example.org/feed/
host: https path: //example.org/feed/
host: ftp path: //example.org/feed/

Because FeedProtocolHandler.newChannel knows what newURI has done in the case of feed:http:// and feed:https:// and undoes it, that only shows up in the UI when you try to load feed:foo://example.org/feed/, which fails by loading http://www.foo.com/example.org//feed/ - particularly unfortunate when foo == file, and you tell www.file.com that you just clicked a link to feed:file:///Users/philringnalda/nastythings/reallysickstuff/

I rather doubt that nsIStandardURI wants to add a fourth type to http://bonsai.mozilla.org/cvsblame.cgi?file=/mozilla/netwerk/base/public/nsIStandardURL.idl&rev=1.5&mark=55-77#50 for URLTYPE_SPEC_AUTHOR_FAILED_GENERIC_SYNTAX_101, so I see three options, and don't have any idea how to choose among them:

* strip off the broken stupid feed: stuff in newURI, by creating real URIs with the real scheme - is it allowed to have newURI(myspec, ...).spec != myspec to that extent?

* create nsISimpleURIs instead - I think that would mean losing the originCharset, but I'm not sure whether that's a good or a bad thing

* keep creating totally bogus URIs, but teach newChannel to better deal - I suspect that's impossible in the edge cases, since feed:foo://bar/ and feed://foo//bar/ create identical URIs
Comment 1 Boris Zbarsky [:bz] 2007-12-17 22:00:58 PST
Would creating a simple nested URI work (with the nested URI being just whatever the "right" URI here is)?
Comment 2 Boris Zbarsky [:bz] 2007-12-18 07:40:29 PST
I guess simple nested URIs are not really exposed to XPCOM nicely... maybe the should be.
Comment 3 Phil Ringnalda (:philor, back in August) 2007-12-19 01:29:23 PST
Too bad: other than the well-beyond-my-abilities part, that looks like exactly what feed: really is.
Comment 4 Boris Zbarsky [:bz] 2007-12-19 07:24:59 PST
Well.  Do you want to expose them as a start?  Give them a contract and an init function, and remove the assumptions about the inner URI always being non-null (with errors thrown if it's null)?
Comment 5 Phil Ringnalda (:philor, back in August) 2007-12-20 19:51:01 PST
I'd like to, yes, but while you wouldn't have to push my fingers down on the keys, you would have to position them over the correct key for every keypress, and I strongly doubt that Gecko can afford to have that much of your time spent on something which apparently nobody but me has even noticed.
Comment 6 neil@parkwaycc.co.uk 2009-03-24 05:37:07 PDT
How about a newNestedURI factory method on the IO service?
Comment 7 Boris Zbarsky [:bz] 2009-03-24 06:48:14 PDT
IO service is a frozen interface.  We could toss in on nsINetUtil, I suppose... But really, why is that better that just being able to createInstance them?
Comment 8 neil@parkwaycc.co.uk 2009-03-25 09:04:51 PDT
I was just wondering which interface the Init method should go on.

An alternative would be to make SetSpec also create the nested URI.

Would you want view-source to use this?
Comment 9 Boris Zbarsky [:bz] 2009-03-25 09:09:22 PDT
view-source already uses nested URIs.  It doesn't have a problem doing that, since it's part of necko and can just use nsSimpleNestedURI directly.
Comment 10 neil@parkwaycc.co.uk 2009-03-29 16:12:55 PDT
Created attachment 369928 [details] [diff] [review]
Basic patch

I say "Basic patch" because the other idea I had in mind was to change the nsSimpleNestedURI constructor to take a scheme and path as well, and make newSimpleNestedURI take a spec, which would then be sanity-checked, while viewSourceHandler would pass in "view-source" and the inner ASCII spec.
Comment 11 Boris Zbarsky [:bz] 2009-03-30 08:50:00 PDT
Comment on attachment 369928 [details] [diff] [review]
Basic patch

rev the iid, and looks good.
Comment 12 Christian :Biesinger (don't email me, ping me on IRC) 2009-08-22 08:55:52 PDT
Comment on attachment 369928 [details] [diff] [review]
Basic patch

what bz said
Comment 13 neil@parkwaycc.co.uk 2009-08-23 15:54:25 PDT
Comment on attachment 369928 [details] [diff] [review]
Basic patch

Pushed changeset 81eb53fed585 to mozilla-central.
Comment 14 neil@parkwaycc.co.uk 2010-06-13 15:19:53 PDT
Created attachment 450954 [details] [diff] [review]
Untested browser part of the fix

This makes FeedConverter.js use the new API. I wasn't sure what to do about the tests, though; obviously newURI will start working in those cases.

This is a port of changes I made to suite's FeedConverter.js which I did test, but I may have broken something when trying to port it. Sorry about that.
Comment 15 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2010-08-05 15:44:29 PDT
Comment on attachment 450954 [details] [diff] [review]
Untested browser part of the fix

For some reason, this breaks the feed at planet.mozilla.org (I got an empty page and had to click back to see the feed(.
Comment 16 Mano (::mano, needinfo? for any questions; not reading general bugmail) 2010-08-05 15:45:18 PDT
Created attachment 463337 [details] [diff] [review]
updated to tip (without the test)
Comment 17 neil@parkwaycc.co.uk 2010-08-08 14:57:18 PDT
Created attachment 463971 [details] [diff] [review]
SeaMonkey version of the patch

Well, I thought I'd ported it correctly, but just in case, this is the SeaMonkey version of the patch and I haven't noticed any issues with it.
Comment 18 Boris Zbarsky [:bz] 2011-04-29 12:06:53 PDT
Comment on attachment 463971 [details] [diff] [review]
SeaMonkey version of the patch

Looks good.
Comment 19 Shawn Wilsher :sdwilsh 2011-05-02 13:01:38 PDT
Comment on attachment 450954 [details] [diff] [review]
Untested browser part of the fix

Clearing this request since it already has r-.  Feel free to send it my way when you have an updated patch.
Comment 20 Justin Wood (:Callek) 2011-05-03 15:28:47 PDT
(In reply to comment #15)
> Comment on attachment 450954 [details] [diff] [review] [review]
> Untested browser part of the fix
> 
> For some reason, this breaks the feed at planet.mozilla.org (I got an empty
> page and had to click back to see the feed(.

Mano, did you test along with the Basic patch here?
Comment 21 neil@parkwaycc.co.uk 2011-05-08 04:37:24 PDT
Created attachment 530908 [details] [diff] [review]
Updated for bitrot

I had no problems viewing Planet's feed with this patch.
Comment 22 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-05-09 17:07:22 PDT
Comment on attachment 530908 [details] [diff] [review]
Updated for bitrot

>diff --git a/browser/components/feeds/src/FeedConverter.js b/browser/components/feeds/src/FeedConverter.js

>   newURI: function GPH_newURI(spec, originalCharset, baseURI) {

>+    spec = spec.replace(/^\s+|[\r\n\t]+|\s+$/g, "");

The previous code was less lenient than this, so I don't think this is necessary, and so I'd rather avoid doing it (consistency with nsStandardURL quirks doesn't seem valuable).

>+    var prefix = spec.substr(scheme.length, 2) == "//" ? "http:" : "";
>+    var url = Cc["@mozilla.org/network/io-service;1"].
>+              getService(Ci.nsIIOService).newURI(spec.replace(scheme, prefix),
>+                                                 originalCharset, baseURI);

nit: call this "innerURI"?

>+    var uri = Cc["@mozilla.org/network/util;1"].
>+              getService(Ci.nsINetUtil).newSimpleNestedURI(url);
>+    uri.spec = url.spec.replace(prefix, scheme);

This produces a URI whose spec is "feed::", which doesn't seem ideal, even though it works (because the inner URI is correct). Can we do something about that?
Comment 23 neil@parkwaycc.co.uk 2011-05-10 00:40:55 PDT
(In reply to comment #22)
>(From update of attachment 530908 [details] [diff] [review])
>>+    spec = spec.replace(/^\s+|[\r\n\t]+|\s+$/g, "");
>The previous code was less lenient than this, so I don't think this is
>necessary, and so I'd rather avoid doing it (consistency with nsStandardURL
>quirks doesn't seem valuable).
It was only slightly less lenient, since it sends the spec to nsStandardURL anyway, so you still get the middle/end whitespace trimming.

>>+    uri.spec = url.spec.replace(prefix, scheme);
>This produces a URI whose spec is "feed::", which doesn't seem ideal, even
>though it works (because the inner URI is correct). Can we do something
>about that?
Maybe it will be easier to read once I've named uri to innerURI:
innerURI.spec = url.spec.replace(prefix, scheme);
Comment 24 neil@parkwaycc.co.uk 2011-05-10 01:07:55 PDT
(In reply to comment #23)
> Maybe it will be easier to read once I've named uri to innerURI:
> innerURI.spec = url.spec.replace(prefix, scheme);
That's the wrong replacement. Let me just spin up a new patch ;-)
Comment 25 neil@parkwaycc.co.uk 2011-05-10 01:33:02 PDT
Created attachment 531266 [details] [diff] [review]
Addressed review comments
Comment 26 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-05-10 11:43:00 PDT
(In reply to comment #23)
> It was only slightly less lenient, since it sends the spec to nsStandardURL
> anyway, so you still get the middle/end whitespace trimming.

But before doing that it checked the passed-in spec against 3 specific prefixes, and threw if it didn't match any of those, so it was definitely much less lenient.
Comment 27 neil@parkwaycc.co.uk 2011-05-11 00:24:24 PDT
(In reply to comment #26)
> (In reply to comment #23)
> > It was only slightly less lenient, since it sends the spec to nsStandardURL
> > anyway, so you still get the middle/end whitespace trimming.
> But before doing that it checked the passed-in spec against 3 specific
> prefixes, and threw if it didn't match any of those, so it was definitely
> much less lenient.
It was only less lenient against whitespace at the beginning of the string.

On a possibly unrelated note the whitespace trimming allows the patch to accept feed://example.org/feed/ URIs in the same way nsStandardURL would.
Comment 28 :Gavin Sharp [email: gavin@gavinsharp.com] 2011-05-11 11:22:25 PDT
(In reply to comment #27)
> It was only less lenient against whitespace at the beginning of the string.
> 
> On a possibly unrelated note the whitespace trimming allows the patch to
> accept feed://example.org/feed/ URIs in the same way nsStandardURL would.

I don't understand any of these comments. The strict comparisons against 3 exact strings in the previous code is definitely more strict than the scheme check in the most recent patch. After that, the spec gets passed to nsIOService::newURI(), which calls net_ExtractURLScheme, which does leading whitespace stripping, and then just passes it to nsHttpHandler::newURI which boils down to basically the same nsStandardURL code that was in the previous code you're replacing here.

So there's no point to doing any whitespace stripping in this newURI implementation. The code in the r+ed patch handles feed://example.org/feed/ just fine.
Comment 29 neil@parkwaycc.co.uk 2011-05-11 12:03:15 PDT
(In reply to comment #28)
> I don't understand any of these comments.
Join the club. It usually takes me a few goes to make myself understandable.

> The strict comparisons against 3 exact strings in the previous code is
> definitely more strict than the scheme check in the most recent patch.
I think I was confused as to which check you were referring to. The existing code only accepts 3 exact prefixes. All the patches relax this restriction. Furthermore, the existing code does not allow leading whitespace or tab or newline characters anywhere inside the 3 prefixes. The reviewed patch retains the whitespace restrictions. This is what I was referring to when I said it was only slightly less lenient. The original patch did not because I wanted to be able to accept the same range of strings that nsStandardURL would accept.
Comment 30 neil@parkwaycc.co.uk 2011-05-11 16:02:17 PDT
Comment on attachment 531266 [details] [diff] [review]
Addressed review comments

Pushed changeset 73bbec8621c2 to mozilla-central.
Comment 31 neil@parkwaycc.co.uk 2011-05-13 13:35:56 PDT
Created attachment 532335 [details] [diff] [review]
SeaMonkey changes

With variable renaming.
Comment 32 Ian Neal 2011-05-13 15:20:44 PDT
Comment on attachment 532335 [details] [diff] [review]
SeaMonkey changes

>   newChannel: function newChannel(aUri) {
>+    var url = aUri.QueryInterface(Components.interfaces.nsINestedURI).innerURI;
>+    var channel = this._ioSvc.newChannelFromURI(url, null);
Why the change from uri to url here? Do you actually need to pass a second argument to newChannelFromURI?

r=me with the above addressed/commented on.
Comment 33 neil@parkwaycc.co.uk 2011-05-13 16:11:36 PDT
Pushed changeset 9e9715602ab6 to comm-central.
Comment 34 Phil Ringnalda (:philor, back in August) 2015-04-07 23:06:31 PDT
*** Bug 511294 has been marked as a duplicate of this bug. ***

Note You need to log in before you can comment on or make changes to this bug.