Last Comment Bug 677643 - crash nsTHashtable with HTTPS everywhere extension
: crash nsTHashtable with HTTPS everywhere extension
Status: RESOLVED FIXED
[verified-beta][qa+]
: addon-compat, crash, regression
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: All All
: -- critical with 6 votes (vote)
: mozilla9
Assigned To: Boris Zbarsky [:bz]
:
Mentors:
www.jahandiamondimports.com/bridal/we...
: 679471 684167 (view as bug list)
Depends on: 692843
Blocks: 631005 548685 691288
  Show dependency treegraph
 
Reported: 2011-08-09 12:16 PDT by Paul O'Shannessy [:zpao] (not reading much bugmail, email directly)
Modified: 2011-11-08 11:06 PST (History)
29 users (show)
bzbarsky: in‑testsuite?
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
+
+
fixed


Attachments
Apple's crash dump of crash reporter crashing (52.59 KB, text/plain)
2011-09-01 14:51 PDT, bcode
no flags Details
Like so, say. Testing this against the steps in comment 20 now (1.81 KB, patch)
2011-09-26 12:28 PDT, Boris Zbarsky [:bz]
bugs: review+
asa: approval‑mozilla‑beta+
Details | Diff | Splinter Review
part 2. Clone the URI argument when loading external stylesheets from a <link> element to work around content policies mutating the URI. (1.62 KB, patch)
2011-09-26 14:56 PDT, Boris Zbarsky [:bz]
justin.lebar+bug: review+
asa: approval‑mozilla‑beta+
Details | Diff | Splinter Review

Description Paul O'Shannessy [:zpao] (not reading much bugmail, email directly) 2011-08-09 12:16:38 PDT
This bug was filed from the Socorro interface and is 
report bp-babc0f56-3c33-442a-87d0-457bf2110809 .
============================================================= 

I can consistently crash Nightly browsing http://www.jahandiamondimports.com/ - just browse around for a little bit and boom. I haven't tested safe mode so I'm not 100% sure it's not an addon, but the stack seems to indicate not.
Comment 1 Marcia Knous [:marcia - use ni] 2011-08-09 17:09:23 PDT
Haven't been able to repro yet using Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0a1) Gecko/20110809 Firefox/8.0a1. Are you running 10.6 or 10.7?
Comment 2 Paul O'Shannessy [:zpao] (not reading much bugmail, email directly) 2011-08-09 17:37:21 PDT
I also haven't been able to repro in a debug build with a mostly clean profile, so I'm leaning towards an extension (though nothing is jumping out as interacting with history/hashtable). Sync isn't enabled for this test profile but it is in the crashing one, so it's possibly related to that. When I get more of a chance I'll do some testing.

I'm on 10.7 - Mozilla/5.0 (Macintosh; Intel Mac OS X 10.7; rv:8.0a1) Gecko/20110809 Firefox/8.0a1
Comment 3 Justin Scott [:fligtar] 2011-08-24 23:13:23 PDT
I've been hitting this crash a lot lately -- it seems to happen to me very frequently on surveygizmo.com -- trying to log in and loading survey editors. Let me know if there's something I can help debug.

Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0a2) Gecko/20110824 Firefox/8.0a2
Comment 4 bcode 2011-08-31 12:41:21 PDT
It might be related to the https-everywhere@eff.org extension, as I also get it while using http://www.interviewstreet.com and http://www.jahandiamondimports.com/ with https-everywhere extension. It might be related to getting the hash of a URL if https-ev is installed.

Another cash report:
https://crash-stats.mozilla.com/report/index/bp-4176c52b-0c8b-4189-9497-df26c2110831
Comment 5 bcode 2011-08-31 13:01:07 PDT
Yes, I tried it with another profile and I am pretty certain it happens (at least) in concert with https-everywhere.com. I think it happens on sites enabled with olark.com instant chat service.
Comment 6 Justin Scott [:fligtar] 2011-08-31 13:23:54 PDT
I also have HTTPS Everywhere, so it definitely seems related.
Comment 7 Jorge Villalobos [:jorgev] 2011-08-31 15:36:29 PDT
I can confirm this happens in Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0a2) Gecko/20110831 Firefox/8.0a2. I can't reproduce the crashes in version 0.9.3 of the add-on, but I can reproduce with 1.0 and 1.0.1. I filed this bug in their system:
https://trac.torproject.org/projects/tor/ticket/3882
Comment 8 bcode 2011-08-31 20:11:57 PDT
Should a faulty Add-On be able to crash Firefox, or is this a bug in the Firefox code as well, in addition to HTTPS Everywhere?
Comment 9 bcode 2011-09-01 14:51:56 PDT
Created attachment 557653 [details]
Apple's crash dump of crash reporter crashing

This time, even the crash reporter crashed. I am attaching Apple's crash report of the crash reporter crashing.
Comment 10 Jorge Villalobos [:jorgev] 2011-09-01 15:32:52 PDT
yo dawg...
Comment 11 bcode 2011-09-01 22:08:27 PDT
Something similar also happens under linux:
https://bugzilla.mozilla.org/show_bug.cgi?id=684167
Comment 12 Matthias Versen [:Matti] 2011-09-02 15:46:43 PDT
*** Bug 684167 has been marked as a duplicate of this bug. ***
Comment 13 Emanuel Hoogeveen [:ehoogeveen] 2011-09-02 16:39:48 PDT
Should bug 679471 (which I filed) be duped or otherwise linked to this one?
Comment 14 Robert 2011-09-02 19:56:55 PDT
Is this a variation of bug 548685?
Comment 15 Mats Palmgren (:mats) 2011-09-02 21:01:45 PDT
Looks like it is to me, but perhaps it's better to keep them apart for now,
since this bug has info on how to reproduce the crash.

Installing https-everywhere and browsing http://www.jahandiamondimports.com/
for a 30 seconds or so crashed my debug mozilla-beta Linux64 build.
Prior to the crash I got these assertion failures:

###!!! ASSERTION: Trying to unregister for a URI that wasn't registered!: 'Error', file toolkit/components/places/History.cpp, line 1762
###!!! ASSERTION: This should only fail if we misuse the API!: 'NS_SUCCEEDED(rv)', file content/base/src/Link.cpp, line 505
###!!! ASSERTION: mRegistered is true, but we have no cached URI?!: 'mCachedURI', file content/base/src/Link.cpp, line 501
###!!! ASSERTION: Must pass a non-null URI!: 'aURI', file toolkit/components/places/History.cpp, line 1756

which is what dbaron predicted in bug 548685 comment 1.
Comment 16 Mats Palmgren (:mats) 2011-09-02 21:06:34 PDT
*** Bug 679471 has been marked as a duplicate of this bug. ***
Comment 17 Mats Palmgren (:mats) 2011-09-02 23:02:52 PDT
It seems the http -> https rewriting that HTTPS Everywhere does confuses
the Link class.  In Link::LinkState() we call mHistory->RegisterVisitedCallback
for the URI http://static.olark.com/css/f/9/f9d8acd997e04404ad5b183004607a7d.css?http
while Link::UnregisterFromHistory() calls mHistory->UnregisterVisitedCallback
with https://static.olark.com/css/f/9/f9d8acd997e04404ad5b183004607a7d.css?http

Tracking this URL through the nsStandardURL::SetSpec it appears JS code
in the HTTP Everywhere add-on calls it for the specific https URI.

I did a quick-n-dirty wallpaper that Clones the URI when we register it,
and using that clone to unregister fixed the crash.
I have no clue what the real fix is though.
Comment 18 Boris Zbarsky [:bz] 2011-09-03 00:05:43 PDT
Gah.  The real fix is for us to just make URIs immutable so HTTPS Everywhere can't backdoor things like that.... :(

The other option is for the HTTP channel to clone the URI it's given.  That might be simplest for now...
Comment 19 Marco Bonardo [::mak] 2011-09-05 02:22:17 PDT
I won't be able to look at this before leaving, may someone take it, implement comment 18 and reproduce what https everywhere does in a crash test?
Comment 20 Vic 2011-09-06 10:57:21 PDT
Copied my STR here (STR is also on Bug 548685 ). Hope its okay.

I have a faster STR or more accurate (for sure crash on Linux/WinXP/Win7) STR you can say:
0. Remember to keep the Flash plugin enabled for this STR.
1. Install HTTPS Everywhere.
2. Install HTTPS Finder (HTTPS Finder 0.75) @ https://code.google.com/p/https-finder/downloads/list
3. Browse to http://ihaveaplan.ca and http://www.ihaveaplan.ca
4. You should get prompt to save new HTTPS rules for each of the site above.
5. Save the 2 rules and restart.
6. Go to ihaveaplan.ca and under "List of student associations" click "UFV - University of the Fraser Valley (SUS)"
7. On the third box in the middle (under the video) it says "HOW MUCH DOES 
IT COST?" and there is hyperlink to "Your annual fee is $159.92 for coverage including health, dental, vision, and travel. ". 
8. Click the above hyperlink and it should take you to a 404 site (https) if you followed the above steps properly. Wait a few seconds (there is something on that page that tries to redirect you to the home page automatically in a few seconds) and Firefox will crash.

If you did NOT follow the steps properly it will take you here: http://ihaveaplan.ca/rte/en/UniversityCollegeoftheFraserValleySUS_Health_HowMuchDoesItCost (which does not crash)

The Crash URL, last URL before it crashes firefox, is = https://www.ihaveaplan.ca/UniversityCollegeoftheFraserValleySUS_Health_HowMuchDoesItCost

The weird thing is if you go directly to (crash url) https://www.ihaveaplan.ca/UniversityCollegeoftheFraserValleySUS_Health_HowMuchDoesItCost without following the exact STR in Comment 16 after approx 15 seconds you are redirected (without crashing) to https://www.ihaveaplan.ca/
Comment 21 Marcia Knous [:marcia - use ni] 2011-09-06 14:09:11 PDT
#8 top Mac crash in Firefox 7.0b2.
Comment 22 Vic 2011-09-06 20:16:52 PDT
Can I make simple fixes to the bug such as:

x86 Mac OS X ==> All All

My STR was tested on Linux and Windows also.
Comment 23 christian 2011-09-09 10:57:44 PDT
(In reply to Vic from comment #22)
> Can I make simple fixes to the bug such as:
> 
> x86 Mac OS X ==> All All
> 
> My STR was tested on Linux and Windows also.

Yes, if you are are able to do so you are allowed to do so. Thanks for checking though!
Comment 24 Wayne Mery (:wsmwk, NI for questions) 2011-09-11 15:45:44 PDT
nsTHashtable<nsBaseHashtableET<nsURIHashKey, nsRefPtr<nsCSSStyleSheet> > >::s_HashKey(PLDHashTable*, void const*) seems to be another example, bp-12801951-8873-45f3-ae71-dbd572110907

however, not all such crashes list the addon.
Comment 25 Vic 2011-09-12 08:06:56 PDT
(In reply to Christian Legnitto [:LegNeato] from comment #23)
> (In reply to Vic from comment #22)
> > Can I make simple fixes to the bug such as:
> > 
> > x86 Mac OS X ==> All All
> > 
> > My STR was tested on Linux and Windows also.
> 
> Yes, if you are are able to do so you are allowed to do so. Thanks for
> checking though!

How do I get the ability so I am allowed to do so?
Comment 26 Peter Eckersley 2011-09-12 16:14:56 PDT
We HTTPS Everywhere developers would love some help in working out what to do about this.  We inherited a lot of XPCOM black magic for HTTP->HTTPS request rewriting from NoScript (which may also be affected by this bug for sites that set the HSTS header).

The code paths we have are as follows:

1. we observe http-on-modify-request, and try perform a channel replacement operation on nsIHTTPChannels we get there

2. we do the same thing for nsIChannelEventSink / onChannelRedirect()

3. In the nsIContentPolicy shouldLoad() path we don't have a channel.  This is where we try to modify nsIURLs in place to replace the "http" scheme with "https".

If there is a simpler, complete and coherent API we could be using to do our rewrites we would be happy to migrate to it.  If there isn't, perhaps someone on the Mozilla side could help by identifying which commit broke our code, or a way we can work around this.
Comment 27 Boris Zbarsky [:bz] 2011-09-12 17:20:15 PDT
> This is where we try to modify nsIURLs in place to replace the "http" scheme with "https".

Yes, that's what causes the problem...

> If there is a simpler, complete and coherent API

We really need one, yes.

I seem to recall NoScript running into something similar and fixing it...  ccing Giorgio to make sure.

The commit(s) that broke your code were the rewrite of the way visited link tracking is done.  It's now done by using the URI of the link as an index into a hashtable... but you're mutating the URI being used as a hashtable key after the entry has been added, so it can no longer be removed properly (because it hashes to a different bucket now).

Given the fact that URIs are mutable, I think we should just use the solution I suggest in comment 18 for now, unless someone has a brighter idea... and hope that no one does URI pointer equality checks on channel URIs.
Comment 28 Marco Bonardo [::mak] 2011-09-22 09:30:42 PDT
I am trying to make a test for this, but I have to figure more consistent steps for it, since probably involves garbage collection.
I have reproduced the crash in the debugger following mats steps, the first call to UnregisterVisitedCallback happens in resetLinkState (invoked during cycle collection), in this call we try to unregister the uri, we fail to do so (since different) and thus we don't set mRegistered to false, but regardless we set mCachedURI to nsnull. The second call to UnregisterVisitedCallback is in the link destructor, mRegistered is still true, so we proceed and try to unregister the null mCachedURI.

Looks like anything able to invoke GetURI and change the uri may end up down this path, even if we clone the uri in the channel wouldn't this be still an open flaw?
Mats idea to keep an "immutable" cached uri in the link object seems easy to do, we may just replace mRegistered with mRegisteredURI, but I wonder if that may increase memory usage on pages with lots of links (I suspect some string buffers may be shared though?).
Comment 29 Boris Zbarsky [:bz] 2011-09-22 10:14:29 PDT
> but I wonder if that may increase memory usage on pages with lots of links

Yes, it will....

> (I suspect some string buffers may be shared though?)

In most cases, not really.  :(

I agree that anyone else who modifies the link URI could run into similar issues.  We could just make the URI immutable to start with, I suppose, since a priori anyone changing it is doing it wrong, right?
Comment 30 Boris Zbarsky [:bz] 2011-09-22 10:15:00 PDT
But note that on its own, without a clone on the channel, that would break https everywhere, of course.
Comment 31 Peter Eckersley 2011-09-23 11:33:07 PDT
(In reply to Boris Zbarsky (:bz) from comment #27)

> > If there is a simpler, complete and coherent API
> 
> We really need one, yes.
> 

Anyone know if there's already a bug for this, or whether I should open one?
Comment 32 Boris Zbarsky [:bz] 2011-09-23 11:53:15 PDT
Dunno about a bug, but there was recent discussion about the content policy API in .platform that's relevant here.
Comment 33 Peter Eckersley 2011-09-26 10:49:42 PDT
As of last week, we declared the stable branch of HTTPS Everywhere to be incompatible with FF 7 and 8.  That means you should see a  decline in the frequency of this crash report (we have order a million users of our stable branch but fewer than 100K users of our development branch, which will still install on FF7/8).
Comment 34 Boris Zbarsky [:bz] 2011-09-26 11:08:46 PDT
Peter, thanks.

Have you guys gotten in touch with Giorgio yet about how he dealt with this in NoScript?
Comment 35 Boris Zbarsky [:bz] 2011-09-26 11:13:00 PDT
To be precise, we can try what I suggested in comment 18, but that has compat risks and is a certain performance hit.  But that change will _definitely_ not land in Fx7 and most likely not in Fx8, and possibly not even in Fx9 at this point.  So if you can stop the behavior that's triggering the problem on your end, that would be really good...
Comment 36 Boris Zbarsky [:bz] 2011-09-26 11:24:40 PDT
And now that I think about it again, it would fix this crash, but possibly not other things like memory leaks or other misbehavior that directly munging the channel URI can cause; in particular if consumers use the URI as a hashtable key then changing it without issuing a redirect _will_ break things.

So I'm being very tempted to just make channel URIs immutable and be done here.
Comment 37 Giorgio Maone [:mao] 2011-09-26 11:42:53 PDT
I'm not sure I understand clearly the terms of this problem. Does modifying the URI spec in a nsIContentPolicy::shouldLoad() callback (before any channel exist yet) cause this bug to be triggered?

I ask because most comments seem to assume the wrong behavior is specifically touching nsIChannel.URI, which AFAIK neither HTTPS Everywhere nor NoScript do (we should always emulate a redirection, instead, creating a new channel with its own new URI).
Comment 38 Boris Zbarsky [:bz] 2011-09-26 11:58:15 PDT
> Does modifying the URI spec in a nsIContentPolicy::shouldLoad() callback (before any
> channel exist yet) cause this bug to be triggered?

Er.... yes.  If that's what's going on here then cloning on channel init will absolutely not help.  Cloning on link traversal would, however.  That should be even safer, actually.
Comment 40 Boris Zbarsky [:bz] 2011-09-26 12:28:15 PDT
Created attachment 562509 [details] [diff] [review]
Like so, say.  Testing this against the steps in comment 20 now
Comment 41 Peter Eckersley 2011-09-26 12:32:51 PDT
The NoScript code we based forceURI on is:

http://pastebin.com/UviueVGT

(that's from NoScript 2.1.0.5)
Comment 42 Olli Pettay [:smaug] (vacation Aug 25-28) 2011-09-26 12:34:44 PDT
Comment on attachment 562509 [details] [diff] [review]
Like so, say.  Testing this against the steps in comment 20 now

Could you rename the variable. clone -> cloneURI or clonedURI.

We really should make URIs immutable.
Comment 43 Boris Zbarsky [:bz] 2011-09-26 14:02:58 PDT
I've confirmed that I can reproduce the crash using the excellent detailed steps in comment 20 and that the attached patch fixes it with those steps.

The attached patch is not enough, because it doesn't cover CSS loads, and I'm getting the crash for a <link> element...
Comment 44 Jason Duell [:jduell] (needinfo me) 2011-09-26 14:15:46 PDT
> The immediate problem is that in some cases the HTTPS Everywhere extensions
> munges HTTP channel URIs directly, something like "channel.uri.scheme =
> 'https'".

> This causes an issue where a channel consumer passed in a URI that other
> things are also holding references to, so mutating it causes those other
> things to get confused.

> But it can also cause issues where a channel consumer uses channel URIs as any
> sort of key (e.g. in a hashtable) even if the version _they_ are holding is
> not mutated, because the actual URI on the channel won't match what they used
> to have. 

Just to be clear, given that channel.uri is mutable, these are technically programming errors, correct?

It does seem like making URIs immutable is the best long-term solution, but it sounds like a stretch to do that today for FF 9.  We could do it for 10, though.

Until then, it seems like you're doing the right thing--plugging any places we know of where an implicit assumption of immutability is made, like in your docshell patch.
Comment 45 Boris Zbarsky [:bz] 2011-09-26 14:20:23 PDT
One other thing.  As far as I know this problem dates back to the link rewrite that happened for Firefox 4.  So it's been a problem in release builds since March, and in betas for a lot longer than that.  Changing compat settings for Fx7 and Fx8 is not likely to affect anything, unless there's something else going on here that I don't know about.
Comment 47 Giorgio Maone [:mao] 2011-09-26 14:48:31 PDT
Suggested work-around for HTTPS Everywhere: rather than trying forceURI() for every single load, just call it for TYPE_OTHER, TYPE_IMAGE, TYPE_OBJECT and TYPE_OBJECT_SUBREQUEST, which are the load types which may be problematic / unreliable to handle with a redirection in http-on-start-request.
Comment 48 Boris Zbarsky [:bz] 2011-09-26 14:56:47 PDT
Created attachment 562547 [details] [diff] [review]
part 2.  Clone the URI argument when loading external stylesheets from a <link> element to work around content policies mutating the URI.
Comment 49 Boris Zbarsky [:bz] 2011-09-26 15:08:54 PDT
Part 2 fixes the steps in comment 20 to no longer crash.

Pushed both parts:
  https://hg.mozilla.org/integration/mozilla-inbound/rev/cadb12b21740
  https://hg.mozilla.org/integration/mozilla-inbound/rev/952017f5f62b
Comment 50 Boris Zbarsky [:bz] 2011-09-26 15:10:13 PDT
Comment on attachment 562509 [details] [diff] [review]
Like so, say.  Testing this against the steps in comment 20 now

Requesting both aurora and beta approval, since changes are Fx8 will be on beta before this gets triaged.  This is a very safe fix, I think, that should fix crashes for users who are using HTTPS Everywhere or NoScript.
Comment 51 Peter Eckersley 2011-09-26 15:37:14 PDT
(In reply to Giorgio Maone from comment #47)
> Suggested work-around for HTTPS Everywhere: rather than trying forceURI()
> for every single load, just call it for TYPE_OTHER, TYPE_IMAGE, TYPE_OBJECT
> and TYPE_OBJECT_SUBREQUEST, which are the load types which may be
> problematic / unreliable to handle with a redirection in
> http-on-start-request.

Do you mean http-on-modify-request?  http-on-start-request doesn't seem to be documented anywhere, and neither HTTPS Everywhere nor NoScript seem to observe it...
Comment 52 Giorgio Maone [:mao] 2011-09-26 15:47:21 PDT
(In reply to Peter Eckersley from comment #51)

> Do you mean http-on-modify-request?  http-on-start-request doesn't seem to
> be documented anywhere, and neither HTTPS Everywhere nor NoScript seem to
> observe it...

Of course, yes.
Comment 53 Peter Eckersley 2011-09-26 22:51:51 PDT
After an hour or two of tests with Wireshark, I haven't been able to find any definite instances of channel replacement in on-http-modify-request failing to rewrite requests that shouldLoad() URI modification method would have succeeded with.

For tomorrow's Firefox 7 release, I'm going to release HTTPS Everywhere stable 1.0.3 without the nsIContentPolicy path active.  We may run into issues in the cases Giorgio mentioned, but I'm going to err on the side of not crashing for the moment.
Comment 54 Boris Zbarsky [:bz] 2011-09-26 23:01:17 PDT
Peter, thanks!
Comment 56 Asa Dotzler [:asa] 2011-09-27 14:25:44 PDT
Comment on attachment 562509 [details] [diff] [review]
Like so, say.  Testing this against the steps in comment 20 now

Looks like this made aurora in this morning's uplift. Approving for beta.
Comment 57 Asa Dotzler [:asa] 2011-09-27 14:25:54 PDT
Comment on attachment 562547 [details] [diff] [review]
part 2.  Clone the URI argument when loading external stylesheets from a <link> element to work around content policies mutating the URI.

Looks like this made aurora in this morning's uplift. Approving for beta.
Comment 59 Ioana (away) 2011-10-05 05:37:45 PDT
Verified as fixed on:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (X11; Linux x86_64; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (Windows NT 6.1; rv:8.0) Gecko/20100101 Firefox/8.0
Mozilla/5.0 (Windows NT 5.1; rv:8.0) Gecko/20100101 Firefox/8.0

I used the STR from comment #2. Firefox doesn't crash anymore. A page not found error is displayed since there is no https version of the page the user is trying to open.

This fix will be verified on Aurora (9a2) and Central (10a1) when a version of HTTPS Finder compatible with them will be launched.
Comment 60 Ioana (away) 2011-10-05 05:46:20 PDT
A correction for the previous comment: I used the STR from comment #20 not #2.
Comment 61 Peter Eckersley 2011-10-06 16:40:07 PDT
We have received several bug reports that seem to have stemmed from our removal of the nsIContentPolicy::shoulLoad() path.  Interestingly, none of them have thus far involved the content types that Giorgio mentioned below.

One was this bug: https://trac.torproject.org/projects/tor/ticket/4194
which can be fixed via shouldLoad() for TYPE_FONT ;

another is this bug: https://trac.torproject.org/projects/tor/ticket/4149
which (at least in the Reader/YouTube case) is fixable via shouldLoad() for TYPE_SUBDOCUMENT

I think this means we will be issuing further releases of HTTPS Everywhere with URL mangling reenabled from within nsIContentPolicy::shouldLoad().  I'm not sure if we'll do just TYPE_FONT | TYPE_SUBDOCUMENT, or those plus Giorgio's suggestions.  Probably the latter.

It's also possible that there's some way to improve the way we do nsIChannel replacement to fix these two bugs, because we do seem to get the http-on-modify-request event in these bug cases, it's just that channel replacement goes wrong in some way.  But I presently do not know whether or how that's possible.

(In reply to Giorgio Maone from comment #47)
> Suggested work-around for HTTPS Everywhere: rather than trying forceURI()
> for every single load, just call it for TYPE_OTHER, TYPE_IMAGE, TYPE_OBJECT
> and TYPE_OBJECT_SUBREQUEST, which are the load types which may be
> problematic / unreliable to handle with a redirection in
> http-on-start-request.
Comment 62 Boris Zbarsky [:bz] 2011-10-06 19:16:54 PDT
Fonts might be running into CORS handling of redirects.  How exactly were you doing your channel replacement thing?

TYPE_SUBDOCUMENT, on the other hand, should just work...  Again, how are you doing the channel replacement?
Comment 63 Giorgio Maone [:mao] 2011-10-07 02:40:33 PDT
(In reply to Peter Eckersley from comment #61)
> One was this bug: https://trac.torproject.org/projects/tor/ticket/4194
> which can be fixed via shouldLoad() for TYPE_FONT ;

In this case I get "[Exception... "Access to restricted URI denied"  code: "1012" nsresult: "0x805303f4 (NS_ERROR_DOM_BAD_URI)" while calling nsIChannelEventSink.asyncOnChannelRedirect() on the old channel's notificationCallbacks. Suspect: should I better skip the sink redirection notifications and mimic a content policy calls round instead, just as it was an original request? Can I even do that from JavaScript land?
 
> another is this bug: https://trac.torproject.org/projects/tor/ticket/4149
> which (at least in the Reader/YouTube case) is fixable via shouldLoad() for
> TYPE_SUBDOCUMENT

In this case, as far as I can see, the redirected request is missing the Cookie header which the original one had. Of course, my redirection mimic code purposely avoid copying sensitive headers such as Cookie or Authorization, because we potentially redirect cross-origin (like in this case, http->https), but I suppose the correct headers should be automatically set by the HTTPChannel/Manager machinery when the new channel is open. Shouldn't they, and if they should, why aren't they? 

(In reply to Boris Zbarsky (:bz) from comment #62)
> How exactly were
> you doing your channel replacement thing?

As usual, through the ChannelReplacement stuff here:

https://addons.mozilla.org/en-US/firefox/files/browse/133908/file/chrome/noscript.jar/content/noscript/ChannelReplacement.js#L253

We (me and you, Boris) had already to touch it several times in the context of various bugs, and I always admitted it's a gigantic hack. Wouldn't be time for a proper API?
Comment 64 Giorgio Maone [:mao] 2011-10-07 03:06:17 PDT
(In reply to Giorgio Maone from comment #63)
> Suspect: should I better skip the sink redirection
> notifications and mimic a content policy calls round instead, just as it was
> an original request? Can I even do that from JavaScript land?

Answer to myself: yes I can, http://mxr.mozilla.org/mozilla-central/source/browser/base/content/browser.js#3275

Boris: should I?
Comment 65 Boris Zbarsky [:bz] 2011-10-07 07:06:57 PDT
> In this case I get "[Exception... "Access to restricted URI denied" 

Yes, CORS. In general, redirecting CORS requests is not allowed; we make an exception for proxy redirects, which require that the old and new URI be equal (which they are not in your case, of course).  We could fix that on our end by adding a new redirect flag that says to allow the redirect in CORS or something, but it'd require some Gecko changes.

> In this case, as far as I can see, the redirected request is missing the Cookie header

That's odd.  HTTP should certainly be setting cookies on the new HTTP channel...  I'd love a minimalish extension + steps to reproduce that show the problem there, in a separate bug.

> As usual, through the ChannelReplacement stuff here:

And in particular, you're passing REDIRECT_INTERNAL, but the "uri is the same" checks in CORS break the font case anyway....

Jonas, can we just add a flag to make that work for now?

> Wouldn't be time for a proper API?

Sure.  The new content policy API proposal should make all this easier, no?  It allows changing the URI to be loaded, and will be automatically called on redirects...

> Boris: should I?

Do which?  Just calling content policy from JS is fine, of course, but what do you plan to do with the channels if you do that?
Comment 66 Giorgio Maone [:mao] 2011-10-07 07:28:17 PDT
(In reply to Boris Zbarsky (:bz) from comment #65)

> 
> > Boris: should I?
> 
> Do which?  Just calling content policy from JS is fine, of course, but what
> do you plan to do with the channels if you do that?

I meant replacing the various sink.asyncOnChannelRedirect() calls with the content policy calls (leaving the rest of the redirect-like channel preparation in place) and then, if the load is not vetoed by content policies, going on with newChannel.asyncOpen() passing it the old channel listener. Wouldn't this just work? Are asyncOnChannelRedirect() callbacks supposed to have side effects, or are they just security/sanity checks which we can safely skip (in this context) if we call shouldLoad()?
Comment 67 Boris Zbarsky [:bz] 2011-10-07 08:17:13 PDT
> Are asyncOnChannelRedirect() callbacks supposed to have side effects

Yes.  A number of consumers need to have access to the actual channel they're using (so they can cancel it, check its status partway through the load, get response headers in the case of XHR, and so forth).  So if you change which channel is actually being used and don't call asyncOnChannelRedirect, all sorts of stuff would break.
Comment 68 Giorgio Maone [:mao] 2011-10-07 09:20:27 PDT
(In reply to Boris Zbarsky (:bz) from comment #67)
> > Are asyncOnChannelRedirect() callbacks supposed to have side effects
> 
> Yes.  A number of consumers need to have access to the actual channel
> they're using (so they can cancel it, check its status partway through the
> load, get response headers in the case of XHR, and so forth).  So if you
> change which channel is actually being used and don't call
> asyncOnChannelRedirect, all sorts of stuff would break.

That's what I suspected, and the reason why I was asking, thanks.
So we totally depend on you for fixing this :(
Comment 69 Jonas Sicking (:sicking) No longer reading bugmail consistently 2011-10-07 09:29:51 PDT
Adding a new flag for CORS sounds ok to me.
Comment 70 Boris Zbarsky [:bz] 2011-10-07 09:51:48 PDT
OK,  Giorgio, could you file bugs on the two issues there (the font thing and the subdocument thing)?  Feel free to assign the font bug to me; for the other I'd love a simple way to reproduce...
Comment 71 Peter Eckersley 2011-10-07 12:36:45 PDT
(In reply to Boris Zbarsky (:bz) from comment #70)
> OK,  Giorgio, could you file bugs on the two issues there (the font thing
> and the subdocument thing)?  Feel free to assign the font bug to me; for the
> other I'd love a simple way to reproduce...

I don't know if this counts as "simple" but my reproduction for the subdocument bug was as follows:

Setup
1. Log into google.com/reader
2. Look through your contacts' reader posts, until you find one with an embedded Youtube video.  Star it.

Reproduction (with HTTPS Everywhere 1.0.3 or 2.0development.2):
1. Log into google.com/reader
2. Click on "starred items"
3. Click on the title of the starred post twice, once to unfurl it and once to close it again.  A new tab will open.
Comment 72 Peter Eckersley 2011-10-07 12:43:57 PDT
(In reply to Peter Eckersley from comment #71)

> Reproduction (with HTTPS Everywhere 1.0.3 or 2.0development.2):

Actually, just use 2.0development.2:

https://www.eff.org/files/https-everywhere-2.0development.2.xpi
Comment 73 Peter Eckersley 2011-10-07 13:09:30 PDT
(In reply to Boris Zbarsky (:bz) from comment #67)
> > Are asyncOnChannelRedirect() callbacks supposed to have side effects
> 
> Yes.  A number of consumers need to have access to the actual channel
> they're using (so they can cancel it, check its status partway through the
> load, get response headers in the case of XHR, and so forth).  So if you
> change which channel is actually being used and don't call
> asyncOnChannelRedirect, all sorts of stuff would break.

It certainly does.  This is slightly off-topic, but:

https://trac.torproject.org/projects/tor/ticket/3190

We have been struggling to work around this breakage by turning of rewrites of requests that other extensions tend to make (eg StumbleUpon, LastPass).  In the case of RequestPolicy, Justin Samuel added an explicit signalling mechanism between the two extensions.

I only recently understood that all of these incompatibility bugs were essentially the same thing, and concluded that we needed the new request rewriting API we keep talking about to fix this :) :\
Comment 74 Peter Eckersley 2011-10-07 13:41:11 PDT
I opened an enhancement bug for the API we need:

https://bugzilla.mozilla.org/show_bug.cgi?id=692915
Comment 75 Giorgio Maone [:mao] 2011-10-07 13:59:36 PDT
Quick and dirty work-around until bug 692843 is fixed:

ChannelReplacement.prototype = {

[...]

_callSink: function(sink, oldChan, newChan, flags) {
    try { 
      if ("onChannelRedirect" in sink) sink.onChannelRedirect(oldChan, newChan, flags);
      else sink.asyncOnChannelRedirect(oldChan, newChan, flags, this._redirectCallback);
    } catch(e) {
      if (e.toString().indexOf("(NS_ERROR_DOM_BAD_URI)") !== -1 && oldChan.URI.spec !== newChan.URI.spec) {
        let oldURL = oldChan.URI.spec;
        try {
          oldChan.URI.spec = newChan.URI.spec;
          this._callSink(sink, oldChan, newChan, flags);
        } catch(e1) {
          throw e;
        } finally {
          oldChan.URI.spec = oldURL;
        }
      } else if (e.message.indexOf("(NS_ERROR_NOT_AVAILABLE)") === -1) throw e;
    }
  }

[...]

}

It may format your HD or exterminate your family, but it seems to "fix" https://trac.torproject.org/projects/tor/ticket/3190
Comment 76 Boris Zbarsky [:bz] 2011-10-07 19:46:11 PDT
Peter, for my purposes "simple" would be with as few moving parts as possible.  So as simple an extension as possible and as simple a web page as possible.  If we have to use google reader for the latter, then we have to, but that makes a _lot_ of HTTP requests, so tracking things down becomes a big pain.

I can probably live with a complicated extension and page if there's at least a clear description of which of the bazillions of HTTP channels I should be looking at...

Again, separate bug on this please!
Comment 77 Vic 2011-10-08 06:35:55 PDT
I am a little confused. Boris you fixed this already right? Now why did Peter do this for the development versions of HTTPS Everywhere also?: (i.e. Comment 53)

"After an hour or two of tests with Wireshark, I haven't been able to find any definite instances of channel replacement in on-http-modify-request failing to rewrite requests that shouldLoad() URI modification method would have succeeded with.

For tomorrow's Firefox 7 release, I'm going to release HTTPS Everywhere stable 1.0.3 without the nsIContentPolicy path active.  We may run into issues in the cases Giorgio mentioned, but I'm going to err on the side of not crashing for the moment."

(Below is copied from HTTPS Everywhere bug report which was filed because Firefox crashed (https://trac.torproject.org/projects/tor/ticket/3882) )

vic:Why do you need to disable the nsIContentPolicy::shouldLoad / forceURI path?


pde: Because the Firefox patch isn't in Firefox 4-7, and we'd prefer not to crash those browsers?


vic:The removal was in the main branch of HTTPS Everywhere. Right now Firefox 4-6 has no security updates.

My suggestion is to keep the nsIContentPolicy::shouldLoad / forceURI path in HTTPS Everywhere v2.0.xDev and release version 2 on the same day that Firefox 8 ships.




vic:Please restore the nsIContentPolicy::shouldLoad / forceURI path in stable version.


pde:Despite Giorgio's concerns, we haven't yet found any cases in which disabling nsIContentyPolicy caused an insecure HTTP load. If we find any, we'll try to turn it back on just for those cases.

vic:Really would appreciate if you left it on for Everything, just in case.

I mean come to think of it, the patch released @  https://bugzilla.mozilla.org/show_bug.cgi?id=677643 for Firefox 8+ is now useless? The patch was because we were using the nsIContentPolicy::shouldLoad / forceURI path right?
Comment 78 Boris Zbarsky [:bz] 2011-10-08 20:46:16 PDT
> Boris you fixed this already right?

Yes; the fix will appear in Firefox 8.
Comment 79 Eric Shepherd [:sheppy] 2011-11-08 06:32:00 PST
These patches are very small, and there are a great many comments on this bug. What exactly am I documenting here? There seems to be no new API, so it's not obvious.
Comment 80 Boris Zbarsky [:bz] 2011-11-08 08:58:05 PST
The change was to create clones of URI objects when loading in a docshell and when loading a stylesheet link...

I'm not quite sure what needs documenting either, honestly.

Note You need to log in before you can comment on or make changes to this bug.