1313595 - Lower timeout on HSTS Priming channels

Assignee

Description

•

9 years ago

In certain cases where the HSTS Priming request times out, there is a 30 second delay before the load proceeds for loads which are not blocked. If there is a redirect to another HTTP resource which also times out, an additional 30 seconds are required to resolve the priming request. Suggested timeout is 10 seconds. (Source: https://goo.gl/S5ARGX 95% of HTTP sub-loads happen in ~ 8.82s)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 1

•

9 years ago

This may not be possible; see https://bugzilla.mozilla.org/show_bug.cgi?id=407190#c45

Christoph Kerschbaumer [:ckerschb}

Updated

•

9 years ago

Priority: -- → P3

Whiteboard: [domsecurity-backlog1]

Comment hidden (mozreview-request)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 3

•

9 years ago

It looks like there is a way to set the timeout on a socket level (nsISocketTransport). Very open to different suggestions, or if there might be a better place to put it.

Christoph Kerschbaumer [:ckerschb}

Comment 4

•

9 years ago

Comment on attachment 8806247 [details] Bug 1313595 - Lower HSTS priming timeout r=me on the loadInfo bits. I suppose Honza can tell you whether Necko bits are ok.

Attachment #8806247 - Flags: review?(ckerschb) → review+

Kate McKinley [:kmckinley, :Kate]

Assignee

Updated

•

9 years ago

Whiteboard: [domsecurity-backlog1] → [domsecurity-backlog1] [hsts-priming]

Honza Bambas (:mayhemer)

Comment 5

•

9 years ago

mozreview-review

Comment on attachment 8806247 [details] Bug 1313595 - Lower HSTS priming timeout https://reviewboard.mozilla.org/r/89744/#review89696 I may rethink my assertion in https://bugzilla.mozilla.org/show_bug.cgi?id=407190#c45 As I understand, the priming request actually blocks the page load, right? first of all, the priority of the channel should be set to "highest" (see nsISupportsPriority) and I believe the timeout should rather be measured on the top most level, ignoring all http transaction queues, connection and socket timeouts. user more cares about the page to load than to be super-secure. I know, it sucks, but that's the reality. so I'd implement it rather simple as: if you don't make it from AsyncOpen in 10 seconds (I would agree even with just 3 seconds here) you fail the priming request with the result "I don't know" and unblock the page load as if there were no priming. make it a preference. you can use higher values for tests to make them less volatile. ::: netwerk/base/nsILoadInfo.idl:665 (Diff revision 1) > */ > [noscript, infallible] readonly attribute boolean mixedContentWouldBlock; > + /** > + * Whether this is a priming channel, so lower timeouts can be enforced. > + */ > + [noscript, infallible] attribute boolean isHSTSPrimingChannel; if you need to carry this to the channel, I think you should add it on nsIHttpChannelInternal rather than on nsILoadInfo ::: netwerk/protocol/http/nsHttpChannel.cpp:6996 (Diff revision 1) > + if (mLoadInfo) { > + bool isPrimingChannel = false; > + rv = mLoadInfo->GetIsHSTSPrimingChannel(&isPrimingChannel); > + NS_ENSURE_SUCCESS(rv, rv); > + if (isPrimingChannel) { > + socketTransport->SetTimeout(nsISocketTransport::TIMEOUT_CONNECT, 10); I believe this is set too late.

Attachment #8806247 - Flags: review?(honzab.moz) → review-

Comment hidden (mozreview-request)

Gerry Chang [:gchang]

Comment 7

•

9 years ago

Track 51+ to reduce the timeout for bug 1311807.

tracking-firefox51: --- → +

Comment hidden (mozreview-request)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 9

•

9 years ago

Kate, do we have a follow up on comment 5, given that this is meant to uplift to 51?

Flags: needinfo?(kmckinley)

Honza Bambas (:mayhemer)

Comment 10

•

9 years ago

mozreview-review

Comment on attachment 8806247 [details] Bug 1313595 - Lower HSTS priming timeout https://reviewboard.mozilla.org/r/89744/#review91608 ::: netwerk/protocol/http/HSTSPrimerListener.cpp:43 (Diff revisions 1 - 3) > +HSTSPrimingListener::HSTSPrimingListener(nsIHstsPrimingCallback* aCallback) > + : mCallback(aCallback) > +{ > + if (sHSTSPrimingTimeout == 0) { > + Preferences::AddUintVarCache(&sHSTSPrimingTimeout, > + "security.mixed_content.hsts_priming_request_timeout"); what if the preference is set to 0? maybe do: static nsresult rv = Preferences::AddUintVarCache(...); Unused << rv; (include "mozilla/Unused.h") ::: netwerk/protocol/http/HSTSPrimerListener.cpp:287 (Diff revisions 1 - 3) > - // Set up listener which will start the original channel > - nsCOMPtr<nsIStreamListener> primingListener(new HSTSPrimingListener(aCallback)); > + // Image channels are loaded by default with reduced priority. > + nsCOMPtr<nsISupportsPriority> p = do_QueryInterface(primingChannel); > + if (p) { > + uint32_t priority = nsISupportsPriority::PRIORITY_HIGHEST; > > + p->AdjustPriority(priority); use SetPriority ::: netwerk/protocol/http/HSTSPrimerListener.cpp:294 (Diff revisions 1 - 3) > + > + // Set up listener which will start the original channel > + HSTSPrimingListener* listener = new HSTSPrimingListener(aCallback); > + nsCOMPtr<nsIStreamListener> primingListener(listener); > // Start priming > rv = primingChannel->AsyncOpen2(primingListener); nit: you can just pass listener directly? ::: netwerk/protocol/http/HSTSPrimerListener.cpp:303 (Diff revisions 1 - 3) > + NS_ENSURE_STATE(timer); > + > + rv = timer->InitWithFuncCallback(OnHSTSPrimingTimeout, > + primingChannel, > + sHSTSPrimingTimeout, > + nsITimer::TYPE_ONE_SHOT); since you are providing a referencable closure (the channel) you must use either nsITimer.init(nsIObserver) or nsITimer.initWithCallback(nsITimerCallback) to handle referencing well. problem with a static function is that it may be called after the channel is already gone. I think HSTSPrimingListener should implement the observer/timer-callback interface instead. also, when the channel makes it sooner than the timer fires, you must cancel the timer, which I don't see in the patch ::: netwerk/protocol/http/HSTSPrimerListener.cpp:306 (Diff revisions 1 - 3) > + primingChannel, > + sHSTSPrimingTimeout, > + nsITimer::TYPE_ONE_SHOT); > + NS_ENSURE_SUCCESS(rv, rv); > + > + listener->mHSTSPrimingTimeout = timer; s/mHSTSPrimingTimeout/mHSTSPrimingTimer/ ::: netwerk/protocol/http/nsHttpChannel.cpp:6996 (Diff revisions 1 - 3) > nsCOMPtr<nsISocketTransport> socketTransport = > do_QueryInterface(trans); > if (socketTransport) { > if (mLoadInfo) { > bool isPrimingChannel = false; > - rv = mLoadInfo->GetIsHSTSPrimingChannel(&isPrimingChannel); > + rv = GetIsHSTSPrimingChannel(&isPrimingChannel); I think you can just use the mIsPrimingChannel member? ::: netwerk/protocol/http/nsHttpChannel.cpp:6999 (Diff revisions 1 - 3) > if (mLoadInfo) { > bool isPrimingChannel = false; > - rv = mLoadInfo->GetIsHSTSPrimingChannel(&isPrimingChannel); > + rv = GetIsHSTSPrimingChannel(&isPrimingChannel); > NS_ENSURE_SUCCESS(rv, rv); > if (isPrimingChannel) { > socketTransport->SetTimeout(nsISocketTransport::TIMEOUT_CONNECT, 10); OTOH, I don't think you need this code when you measure timeout on the upper layer. ::: netwerk/protocol/http/nsHttpChannel.cpp:8039 (Diff revisions 1 - 3) > +{ > + bool wouldBlock = mLoadInfo->GetMixedContentWouldBlock(); > + LOG(("HSTS Priming timed out [this=%p], %s the load", this, > + (wouldBlock) ? "blocking" : "allowing")); > + // A priming request was sent, and no HSTS header was found that allows > + // the upgrade. is this comment right? ::: netwerk/protocol/http/nsHttpChannel.cpp:8044 (Diff revisions 1 - 3) > + // the upgrade. > + Telemetry::Accumulate(Telemetry::MIXED_CONTENT_HSTS_PRIMING_RESULT, > + (wouldBlock) ? HSTSPrimingResult::eHSTS_PRIMING_TIMEOUT_BLOCK : > + HSTSPrimingResult::eHSTS_PRIMING_TIMEOUT_ACCEPT); > + > + return CacheHSTSPrimingAndContinue(NS_ERROR_CONTENT_BLOCKED, wouldBlock); is the priming channel canceled? because when it makes it later it will call other priming callbacks on the request channel and it will totally break.. other reason to move the timer notification to the listener class where you also throw the callback reference away (to ensure that any of the hsts priming callbacks on the request channel can only be called once) ::: netwerk/protocol/http/HSTSPrimerListener.h:112 (Diff revision 3) > + /** > + * Keep a handle to the timer around > + */ > + nsCOMPtr<nsITimer> mHSTSPrimingTimeout; > + > + static uint32_t sHSTSPrimingTimeout; some comment please

Attachment #8806247 - Flags: review?(honzab.moz) → review-

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 11

•

9 years ago

(In reply to Milan Sreckovic [:milan] from comment #9) > Kate, do we have a follow up on comment 5, given that this is meant to > uplift to 51? It's in progress.

Flags: needinfo?(kmckinley)

Comment hidden (mozreview-request)

Honza Bambas (:mayhemer)

Comment 15

•

9 years ago

mozreview-review

Comment on attachment 8806247 [details] Bug 1313595 - Lower HSTS priming timeout https://reviewboard.mozilla.org/r/89744/#review91928 OK, stiff few nits to fix, then r+ with everything fixed. ::: modules/libpref/init/all.js:5521 (Diff revision 6) > pref("security.mixed_content.use_hsts", true); > #endif > // Approximately 1 week default cache for HSTS priming failures > pref ("security.mixed_content.hsts_priming_cache_timeout", 10080); > +// Force the channel to timeout in 3 seconds if we have not received > +pref ("security.mixed_content.hsts_priming_request_timeout", 3000); please add to the coment a note that the preference expects being set to number of milliseconds, just to make sure the units are clear. ::: netwerk/protocol/http/HSTSPrimerListener.h:113 (Diff revision 6) > * the nsIHttpChannel to notify with the result of HSTS priming. > */ > nsCOMPtr<nsIHstsPrimingCallback> mCallback; > + > + /** > + * Keep a handle to the timer around nice comment, but also please say why :) ::: netwerk/protocol/http/HSTSPrimerListener.cpp:30 (Diff revision 6) > using namespace mozilla; > > NS_IMPL_ISUPPORTS(HSTSPrimingListener, nsIStreamListener, > nsIRequestObserver, nsIInterfaceRequestor) > > +uint32_t HSTSPrimingListener::sHSTSPrimingTimeout = 0; nit: sync this with the default you've added in all.js, i.e. this should be 3000. ::: netwerk/protocol/http/HSTSPrimerListener.cpp:72 (Diff revision 6) > +NS_IMETHODIMP > +HSTSPrimingListener::OnStartRequest(nsIRequest *aRequest, > + nsISupports *aContext) > +{ > + nsCOMPtr<nsIHstsPrimingCallback> callback(mCallback); > + mCallback = nullptr; a bit more optimal is: nsCOMPtr<nsIHstsPrimingCallback> callback; callback.swap(mCallback); this moves the pointer from the member to the local comptr and leaves the member empty (what we want), saves one addref/release ::: netwerk/protocol/http/HSTSPrimerListener.cpp:82 (Diff revision 6) > + nsresult primingResult = CheckHSTSPrimingRequestStatus(aRequest); > + ReportTiming(primingResult); > + > + if (mHSTSPrimingTimer) { > + nsresult rv = mHSTSPrimingTimer->Cancel(); > + Unused << rv; first, please don't declare nsresult rv inside a block, always as the first variable in a function. then, just do (if necessary) Unused << mHSTSPrimingTimer->Cancel(); and, maybe cancel the timer before you check the callback? ::: netwerk/protocol/http/HSTSPrimerListener.cpp:177 (Diff revision 6) > +{ > + nsCOMPtr<nsIHstsPrimingCallback> callback(mCallback); > + mCallback = nullptr; > + NS_ENSURE_STATE(callback); > + ReportTiming(NS_ERROR_HSTS_PRIMING_TIMEOUT); > + return callback->OnHSTSPrimingFailed(NS_ERROR_HSTS_PRIMING_TIMEOUT, false); the result actually goes nowhere and, who cancels the priming channel? ::: netwerk/protocol/http/HSTSPrimerListener.cpp:322 (Diff revision 6) > + NS_ENSURE_STATE(timer); > + > + rv = timer->InitWithCallback(listener, > + sHSTSPrimingTimeout, > + nsITimer::TYPE_ONE_SHOT); > + NS_ENSURE_SUCCESS(rv, rv); turn this to be non-fatal please. because you have already opened the priming channel anyway ::: netwerk/protocol/http/HSTSPrimerListener.cpp:324 (Diff revision 6) > + rv = timer->InitWithCallback(listener, > + sHSTSPrimingTimeout, > + nsITimer::TYPE_ONE_SHOT); > + NS_ENSURE_SUCCESS(rv, rv); > + > + listener->mHSTSPrimingTimer = timer; .swap() will be better here too ::: xpcom/base/ErrorList.h:773 (Diff revision 6) > /* Error code for XBL */ > ERROR(NS_ERROR_XBL_BLOCKED, FAILURE(15)), > /* Error code for when the content process crashed */ > ERROR(NS_ERROR_CONTENT_CRASHED, FAILURE(16)), > + /* Error code for when HSTS priming request times out */ > + ERROR(NS_ERROR_HSTS_PRIMING_TIMEOUT, FAILURE(17)), place this to the network module please

Attachment #8806247 - Flags: review?(honzab.moz) → review+

heidi

Comment 16

•

9 years ago

There are still a lot of issues with HSTS priming even with the timeout reduction, and I'm getting tons of complaints from visitors running the dev build regarding performance being ridiculously slow. For sites that rely on loading content from external hosts where none of the workarounds can be applied, this makes the site literally unusable. 1) Why was the decision made to hijack an existing established header for this mechanism? If a site wanted this functionality, it should be able to signal it somehow, or at the very least signal an opt-out. 2) The mechanism seems to be taking place even if a site is no longer sending the Strict-Transport-Security header, so with the typical long cache periods there is not even a way to back out of it when it inevitably breaks site functionality. If an opt-out signal is not added, the mechanism should absolutely only take place if the origin request is *currently* sending the header. 3) The mechanism is even attempted for a HTTP request to a non-standard port. I would think that the cases where HTTP content retrieved from a port except for 80, and it will be able to successfully request the same content from port 443, are few and far between. This absolutely needs a lot more work to prevent breaking existing sites in ways that cannot be fixed, and cannot be deployed live in any fashion in the current state.

Comment hidden (mozreview-request)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 20

•

9 years ago

mozreview-review-reply

Comment on attachment 8806247 [details] Bug 1313595 - Lower HSTS priming timeout https://reviewboard.mozilla.org/r/89744/#review91928 > the result actually goes nowhere > > and, who cancels the priming channel? Can you take a look at the current diff? Updated to cancel the priming channel, then fail priming, but open to swapping order. Since the result goes nowhere, returning NS_OK.

Honza Bambas (:mayhemer)

Comment 21

•

9 years ago

mozreview-review

Comment on attachment 8806247 [details] Bug 1313595 - Lower HSTS priming timeout https://reviewboard.mozilla.org/r/89744/#review92278 ::: netwerk/protocol/http/HSTSPrimerListener.cpp:75 (Diff revisions 6 - 9) > nsISupports *aContext) > { > - nsCOMPtr<nsIHstsPrimingCallback> callback(mCallback); > - mCallback = nullptr; > + nsCOMPtr<nsIHstsPrimingCallback> callback; > + callback.swap(mCallback); > + if (mHSTSPrimingTimer) { > + Unused << mHSTSPrimingTimer->Cancel(); and drop reference to the timer as well please ::: netwerk/protocol/http/HSTSPrimerListener.cpp:327 (Diff revisions 6 - 9) > p->SetPriority(priority); > } > > // Set up listener which will start the original channel > HSTSPrimingListener* listener = new HSTSPrimingListener(aCallback); > //nsCOMPtr<nsIStreamListener> primingListener(listener); remove this line?

Honza Bambas (:mayhemer)

Comment 22

•

9 years ago

Comment 16 ^

Flags: needinfo?(kmckinley)

Comment hidden (mozreview-request)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 24

•

9 years ago

Hi Heidi, thank you for your feedback. I would like to note that this patch is not yet landed, so the lower timeouts are not yet in effect. (In reply to heidi from comment #16) > There are still a lot of issues with HSTS priming even with the timeout > reduction, and I'm getting tons of complaints from visitors running the dev > build regarding performance being ridiculously slow. For sites that rely on > loading content from external hosts where none of the workarounds can be > applied, this makes the site literally unusable. Can you provide an example? > 1) Why was the decision made to hijack an existing established header for > this mechanism? If a site wanted this functionality, it should be able to > signal it somehow, or at the very least signal an opt-out. Sites can opt-out of HSTS by sending the Strict-Transport-Security header with a max-age of 0 over a secure channel. HSTS is one of several signal sites can provide to indicate they want to be upgraded to HTTPS. At the moment, HSTS upgrades do not happen prior to blocking for mixed-content, and HSTS Priming is an experiment to be able to change that. The issue with only swapping the enforcement of mixed-content with HSTS is that there is no reliable way for resources specified over HTTP to signal they want that upgrade, and HSTS Priming allows sites to opt-in. > 2) The mechanism seems to be taking place even if a site is no longer > sending the Strict-Transport-Security header, so with the typical long cache > periods there is not even a way to back out of it when it inevitably breaks > site functionality. If an opt-out signal is not added, the mechanism should > absolutely only take place if the origin request is *currently* sending the > header. Best practices for sites who want to turn HSTS on is to start with a shorter timeout rather than going to a long 18 week timeout. A site which needs to back out of HSTS can send the header with max-age of 0, which will clear the browser's cache. > 3) The mechanism is even attempted for a HTTP request to a non-standard > port. I would think that the cases where HTTP content retrieved from a port > except for 80, and it will be able to successfully request the same content > from port 443, are few and far between. Thank you for this, we will take a look at it. > This absolutely needs a lot more work to prevent breaking existing sites in > ways that cannot be fixed, and cannot be deployed live in any fashion in the > current state. It is not currently slated to be turned on in Release.

Flags: needinfo?(kmckinley)

Priority: P3 → P2

heidi

Comment 25

•

9 years ago

(In reply to Kate McKinley [:kmckinley] from comment #24) > Can you provide an example? The problems arise when a site loads resources from diverse external hosts, for example if you do CDN on the application level rather than the DNS level, as the efficiency of the priming cache is hampered. A three second timeout is obviously much better than a 30-60 second timeout, but it would still slow down the load for many of these resources by three seconds, especially if the TTL of the priming cache is as low as the suggested one day. > Sites can opt-out of HSTS by sending the Strict-Transport-Security header > with a max-age of 0 over a secure channel. HSTS is one of several signal > sites can provide to indicate they want to be upgraded to HTTPS. At the > moment, HSTS upgrades do not happen prior to blocking for mixed-content, and > HSTS Priming is an experiment to be able to change that. The issue with only > swapping the enforcement of mixed-content with HSTS is that there is no > reliable way for resources specified over HTTP to signal they want that > upgrade, and HSTS Priming allows sites to opt-in. Thanks for the tip about backing out of HSTS. My suggestion is not to trust any signaling sent over insecure HTTP connections. There should however be a way for the site itself to signal over the secure HTTPS connection, possibly via additional arguments to the Strict-Transport-Security header, whether the insecure HTTP resources can be upgraded or not. So for example, if the site knows that the HTTP resources cannot be upgraded, it could send something like: Strict-Transport-Security: max-age=604800; preload; no-priming It could still default to attempt HSTS priming, but this would preempt any issues from slowdowns caused by timeouts of resources that are known to not be HTTPS-retrievable. > Best practices for sites who want to turn HSTS on is to start with a shorter > timeout rather than going to a long 18 week timeout. A site which needs to > back out of HSTS can send the header with max-age of 0, which will clear the > browser's cache. This makes sense, but the issue here arises if browsers start to interpret the header differently at some point in the future. It is however not a significant problem if you can indeed clear the cache.

Pulsebot

Comment 26

•

9 years ago

Pushed by kmckinley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/a8be4ebc85cf Reduce timeout for HSTS priming channels r=mayhemer

Carsten Book [:Tomcat]

Comment 27

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/a8be4ebc85cf

Status: ASSIGNED → RESOLVED

Closed: 9 years ago

status-firefox52: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla52

Phil Ringnalda (:philor)

Updated

•

9 years ago

Depends on: 1317115

Pulsebot

Comment 28

•

9 years ago

Backout by philringnalda@gmail.com: https://hg.mozilla.org/mozilla-central/rev/5e7676832766 Backed out changeset a8be4ebc85cf for permaorange unexpected assertion in test_referrerdirective.html, a=backout

Phil Ringnalda (:philor)

Updated

•

9 years ago

•

9 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/3472d9d9dd47

Status: REOPENED → RESOLVED

Closed: 9 years ago → 9 years ago

status-firefox53: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → mozilla53

Sankha Narayan Guria [:sankha]

Comment 33

•

9 years ago

Attached patch Backed out changeset a8be4ebc85cf () for permaorange unexpected assertion in test_referrerdirective.html, (obsolete) — Details — Splinter Review

MozReview-Commit-ID: GxBqDrHHg7z

Sankha Narayan Guria [:sankha]

Updated

•

9 years ago

Assignee: kmckinley → sankha93

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 34

•

9 years ago

(In reply to Sankha Narayan Guria [:sankha] from comment #33) > Created attachment 8811139 [details] [diff] [review] > Backed out changeset a8be4ebc85cf () for permaorange unexpected assertion in > test_referrerdirective.html, > > MozReview-Commit-ID: GxBqDrHHg7z Why was this backed out and reassigned?

Sankha Narayan Guria [:sankha]

Comment 35

•

9 years ago

> Why was this backed out and reassigned? I am sorry, I was struggling to set up the bzexport extension today morning on my machine and probably this got pushed from my account because of it. Nothing has been backed out, just that changeset was attached here as an attachment. Sorry for the noise.

Sankha Narayan Guria [:sankha]

•

9 years ago

Hi Kate, Since this feature won't be enabled in 51, I would like this to ride the train. Can we turn both prefs off in 51?

Flags: qe-verify+

Flags: needinfo?(kmckinley)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 38

•

9 years ago

That's fine, we can tell people to use 52 or higher to test.

Flags: needinfo?(kmckinley)

Gerry Chang [:gchang]

Comment 39

•

9 years ago

Comment on attachment 8806247 [details] Bug 1313595 - Lower HSTS priming timeout Let's let this ride the train on 52. Mark 51 won't fix.

Attachment #8806247 - Flags: approval-mozilla-beta? → approval-mozilla-beta-

Gerry Chang [:gchang]

Updated

•

9 years ago

status-firefox51: --- → wontfix

Honza Bambas (:mayhemer)

Updated

•

8 years ago

Depends on: 1317517

Honza Bambas (:mayhemer)

Updated

•

8 years ago

Depends on: 1319606

Honza Bambas (:mayhemer)

Comment 40

•

8 years ago

I personally vote for backing this out now from all branches. Kate, please write automated tests for this functionality, for both timeout and non-timeout case. Also, add traceable channel on channels you are about to test with so that we find out why this is crashing. If not super-complicated, preferably, the test should be a mochitest, so that all the docshell/error page at al machinery is being ran. Note that I will not r+ the next version of the patch if tests are not added. It was a mistake from me the first time.

Flags: needinfo?(kmckinley)

Comment hidden (mozreview-request)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 44

•

8 years ago

(In reply to Honza Bambas (:mayhemer) from comment #40) > I personally vote for backing this out now from all branches. > > Kate, please write automated tests for this functionality, for both timeout > and non-timeout case. Also, add traceable channel on channels you are about > to test with so that we find out why this is crashing. If not > super-complicated, preferably, the test should be a mochitest, so that all > the docshell/error page at al machinery is being ran. > > Note that I will not r+ the next version of the patch if tests are not > added. It was a mistake from me the first time. I don't understand what you mean by add traceable channel here. I see the interface, but it isn't clear what is supposed to be done with it. HttpBaseChannel implements nsITraceableChannel. Do you have any examples you can point to?

Flags: needinfo?(kmckinley)

Honza Bambas (:mayhemer)

Comment 45

•

8 years ago

(In reply to Kate McKinley [:kmckinley] from comment #44) > I don't understand what you mean by add traceable channel here. I see the > interface, but it isn't clear what is supposed to be done with it. > HttpBaseChannel implements nsITraceableChannel. Do you have any examples you > can point to? Ah, OK. nsITraceableChannel is used to catch (and potentially interfere with) OnStart/OnData/OnStop notifications coming directly from a channel. It's mainly used by diagnostic tools such as our DevTools (specifically the Network pane) or Firebug and some other extensions. It works the way that an intercepting listener is added between the channel and the current listener (regular consumer of the channel such as script loader, image parser, html parser etc). This intercepting listener can then examine response headers and the response body and is responsible to forward the notifications to the original target listener of the channel. One of the crashes that this patch seems to cause is having nsStreamListenerWrapper on the top frame. That class is used when the intercepting listener is added on the channel via the nsITraceableChannel interface (SetNewListener) wrapping the current channel's listener. See [1]. Tho, it's also used for fetch() at [2]. I unfortunately can't say from the stacks in bug 1319606 which use of nsStreamListenerWrapper leads to the crash for sure. But my bet is more on DevTools than fetch(). Hence, the suggestion to exercise the traceable channel. The reason I insist on adding a test for this is that the priming feature is already causing some intermittent test failures and seems to me overall immature and unstable so far. So I want to make sure that more changes (even attempts to fix those intermittents) don't cause more regressions. It's probably OK if you only modify existing tests too to add prove we don't crash or so with this patch. Anyway, I'd like you to also focus on the already existing tests that are intermittently failing and overall stability improvements here, to find the true causes and fix either the tests or the code. Maybe you already did that, and I might only miss some analyzes and conclusions added as bugzilla comments. And one last question, is this fix actually fixing any of the test failures in bugs it blocks? Does it improve anything? Thanks. [1] https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/HttpBaseChannel.cpp#2698 [2] https://dxr.mozilla.org/mozilla-central/rev/bad312aefb42982f492ad2cf36f4c6c3d698f4f7/dom/xhr/XMLHttpRequestMainThread.cpp#2755

Flags: needinfo?(kmckinley)

Nicholas Nethercote [inactive]

Comment 46

•

8 years ago

mozreview-review

Comment on attachment 8814315 [details] Bug 1313595 - Missing interface in HSTSPrimerListener https://reviewboard.mozilla.org/r/95584/#review95686 I know I did a drive-by formatting nit, but otherwise I'm not qualified to review this :)

Attachment #8814315 - Flags: review?(n.nethercote)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 47

•

8 years ago

(In reply to Honza Bambas (:mayhemer) from comment #45) This is a tough question to answer. Without nsITimerCallback in NS_IMPL_ISUPPORTS, I reliably get *a* crash when the timer fires, but I'm having a hard time to reproduce the specific crashes in 1319606 and 1317517, so I can't say for sure. With the patch applied, there are a number of flaky things probably caused by priming that are simply not occurring in this test run. For example, I could reliably reproduce the assertion in bug 1317298 before, and with the patch, I can't. There is a treeherder run here with the most current version, fixing some test flakiness. These new timeout tests need a bit more tuning. https://treeherder.mozilla.org/#/jobs?repo=try&revision=5575a14d72dffb75b287807434ec7976fb74e22b I'm still not sure what you mean by adding a traceable channel. Do you mean to wrap the HSTSPrimerListener in an nsStreamListenerWrapper? Sorry, it just isn't clear form the examples or the class documentation what needs to be done here. If all it is is the nsStreamListenerWrapper, there is a patch on try that adds that and hopefully fixes the test fails. https://treeherder.mozilla.org/#/jobs?repo=try&revision=15564dfe8583bafb7fb613c10b7e5d9ac9410cab > (In reply to Kate McKinley [:kmckinley] from comment #44) > > I don't understand what you mean by add traceable channel here. I see the > > interface, but it isn't clear what is supposed to be done with it. > > HttpBaseChannel implements nsITraceableChannel. Do you have any examples you > > can point to? > > Ah, OK. nsITraceableChannel is used to catch (and potentially interfere > with) OnStart/OnData/OnStop notifications coming directly from a channel. > It's mainly used by diagnostic tools such as our DevTools (specifically the > Network pane) or Firebug and some other extensions. > > It works the way that an intercepting listener is added between the channel > and the current listener (regular consumer of the channel such as script > loader, image parser, html parser etc). This intercepting listener can then > examine response headers and the response body and is responsible to forward > the notifications to the original target listener of the channel. > > One of the crashes that this patch seems to cause is having > nsStreamListenerWrapper on the top frame. That class is used when the > intercepting listener is added on the channel via the nsITraceableChannel > interface (SetNewListener) wrapping the current channel's listener. See [1]. > > Tho, it's also used for fetch() at [2]. I unfortunately can't say from the > stacks in bug 1319606 which use of nsStreamListenerWrapper leads to the > crash for sure. But my bet is more on DevTools than fetch(). Hence, the > suggestion to exercise the traceable channel. > > > The reason I insist on adding a test for this is that the priming feature is > already causing some intermittent test failures and seems to me overall > immature and unstable so far. So I want to make sure that more changes > (even attempts to fix those intermittents) don't cause more regressions. > > It's probably OK if you only modify existing tests too to add prove we don't > crash or so with this patch. > > Anyway, I'd like you to also focus on the already existing tests that are > intermittently failing and overall stability improvements here, to find the > true causes and fix either the tests or the code. Maybe you already did > that, and I might only miss some analyzes and conclusions added as bugzilla > comments. > > > And one last question, is this fix actually fixing any of the test failures > in bugs it blocks? Does it improve anything? > > Thanks. > > > [1] > https://dxr.mozilla.org/mozilla-central/source/netwerk/protocol/http/ > HttpBaseChannel.cpp#2698 > [2] > https://dxr.mozilla.org/mozilla-central/rev/ > bad312aefb42982f492ad2cf36f4c6c3d698f4f7/dom/xhr/XMLHttpRequestMainThread. > cpp#2755

Flags: needinfo?(kmckinley)

Honza Bambas (:mayhemer)

Comment 48

•

8 years ago

(In reply to Kate McKinley [:kmckinley] from comment #47) > (In reply to Honza Bambas (:mayhemer) from comment #45) > > This is a tough question to answer. Without nsITimerCallback in > NS_IMPL_ISUPPORTS, I reliably get *a* crash when the timer fires, It's just an assertion failing when QI (QueryInterface) doesn't return what is expected (having an interface in hands already, but the object's QI doesn't return a result for it). > but I'm > having a hard time to reproduce the specific crashes in 1319606 and 1317517, OK, then we don't know if you've found a fix or not for them. But it sometimes may be too hard to try to reproduce such crashes. > so I can't say for sure. With the patch applied, there are a number of flaky > things probably caused by priming that are simply not occurring in this test > run. For example, I could reliably reproduce the assertion in bug 1317298 > before, and with the patch, I can't. That's the assertion I describe above. It's fixed with this patch (adding the iface to the impl list). > > There is a treeherder run here with the most current version, fixing some > test flakiness. Which exactly? > These new timeout tests need a bit more tuning. OK. > > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=5575a14d72dffb75b287807434ec7976fb74e22b > > I'm still not sure what you mean by adding a traceable channel. Do you mean > to wrap the HSTSPrimerListener in an nsStreamListenerWrapper? Sorry, it just > isn't clear form the examples or the class documentation what needs to be > done here. No. HTTP channel are implementing nsITraceableChannel which allows you to insert an interception listener (SetNewListener - see its IDL docs). Maybe try using that in your tests. Best is to listen for on-modify-request and attach a listener via that interface (just a dummy that forwards to the original one). > > If all it is is the nsStreamListenerWrapper, there is a patch on try that > adds that and hopefully fixes the test fails. > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=15564dfe8583bafb7fb613c10b7e5d9ac9410cab Will look at it.

Honza Bambas (:mayhemer)

Comment 49

•

8 years ago

(In reply to Kate McKinley [:kmckinley] from comment #47) > If all it is is the nsStreamListenerWrapper, there is a patch on try that > adds that and hopefully fixes the test fails. > https://treeherder.mozilla.org/#/ > jobs?repo=try&revision=15564dfe8583bafb7fb613c10b7e5d9ac9410cab The net::nsStreamListenerWrapper is used when the channel is implemented in JS. I don't think you need to do the following: - rv = primingChannel->AsyncOpen2(listener); + nsCOMPtr<nsIStreamListener> wrappedListener = + new net::nsStreamListenerWrapper(listener); + rv = primingChannel->AsyncOpen2(wrappedListener); Also: nsCOMPtr<nsIHstsPrimingCallback> callback; callback.swap(mCallback); + mCallback = nullptr; + mCallback already is null. swap actually swaps - as the name suggests - the two comptrs. hence, if in |callback| was nothing, that "nothing" ends up in mCallback

Honza Bambas (:mayhemer)

Comment 50

•

8 years ago

Kate: - if the patches that originally resolved:fixed this bug have been backed out please reopen this bug - if not, then please open a new bug (to track the progress correctly - this is a mess!) and put the new patches there - when you submit a patch and Review Board doesn't provide an interdiff, please be so kind and write down what has changed from the last patch ; actually, please do it every time, directly in bugzilla since now - I don't understand why https://reviewboard.mozilla.org/r/95582/diff/#index_header again needs my review

Flags: needinfo?(kmckinley)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 51

•

8 years ago

Attached patch Bug 1313595 - Backout for Aurora a=crashes (obsolete) — Details — Splinter Review

Back out HSTS priming lower timeouts on Aurora due to ongoing crashes.

Attachment #8806247 - Attachment is obsolete: true

Flags: needinfo?(kmckinley)

Carsten Book [:Tomcat]

Comment 52

•

8 years ago

https://hg.mozilla.org/releases/mozilla-aurora/rev/94dac74c74a6e462fbbf9a1e1cad9fda3024122e backout landed on aurora

Julien Cristau [:jcristau]

Updated

•

8 years ago

Depends on: 1320877

Carsten Book [:Tomcat]

Comment 53

•

8 years ago

also backed out from central in https://hg.mozilla.org/mozilla-central/rev/15b774db7eab7fc4c9489db9c9f77a2e73536e22

Carsten Book [:Tomcat]

Updated

•

8 years ago

Status: RESOLVED → REOPENED

Flags: needinfo?(kmckinley)

Resolution: FIXED → ---

Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)

Updated

•

8 years ago

status-firefox52: fixed → affected

status-firefox53: fixed → affected

Honza Bambas (:mayhemer)

Updated

•

8 years ago

No longer depends on: 1319606

Virtual_ManPL [:Virtual] 🇵🇱 - (please needinfo? me - so I will see your comment/reply/question/etc.)

Comment 54

•

8 years ago

[Tracking Requested - why for this release]: Regression

Severity: normal → major

tracking-firefox52: --- → ?

tracking-firefox53: --- → ?

Keywords: perf, regression, topperf

Liz Henry (:lizzard) (relman/hg->git project)

Comment 55

•

8 years ago

I don't think relman needs to track this for now. Once we have a viable fix, if it seems like a good idea to uplift, you can request uplift.

tracking-firefox52: ? → -

tracking-firefox53: ? → -

Honza Bambas (:mayhemer)

Comment 56

•

8 years ago

mozreview-review

Comment on attachment 8814315 [details] Bug 1313595 - Missing interface in HSTSPrimerListener https://reviewboard.mozilla.org/r/95584/#review95994 I didn't check the tests deeply. conditioned by a green try run (preferable two or three to catch intermittents this time) ::: dom/security/test/hsts/file_timeout_server.sjs:23 (Diff revision 1) > + } > + // avoid confusing cache behaviors > + response.setHeader("Cache-Control", "no-cache", false); > + response.setHeader("Content-Type", "text/javascript", false); > + // never sends Strict-Transport-Security > + response.write('console.log("timeout: "+'+timeout+');'); spaces around signs ::: dom/security/test/hsts/file_timeout_server.sjs:25 (Diff revision 1) > + response.setHeader("Cache-Control", "no-cache", false); > + response.setHeader("Content-Type", "text/javascript", false); > + // never sends Strict-Transport-Security > + response.write('console.log("timeout: "+'+timeout+');'); > + response.finish(); > + }, timeout, Components.interfaces.nsITimer.TYPE_ONE_SHOT); trailing ws ::: netwerk/protocol/http/HSTSPrimerListener.cpp:75 (Diff revision 1) > HSTSPrimingListener::OnStartRequest(nsIRequest *aRequest, > nsISupports *aContext) > { > nsCOMPtr<nsIHstsPrimingCallback> callback; > callback.swap(mCallback); > + mCallback = nullptr; assigning null not needed ::: netwerk/protocol/http/HSTSPrimerListener.cpp:182 (Diff revision 1) > HSTSPrimingListener::Notify(nsITimer* timer) > { > nsresult rv; > nsCOMPtr<nsIHstsPrimingCallback> callback; > callback.swap(mCallback); > - NS_ENSURE_STATE(callback); > + mCallback = nullptr; not needed

Attachment #8814315 - Flags: review?(honzab.moz) → review+

Comment hidden (mozreview-request)

Kate McKinley [:kmckinley, :Kate]

Assignee

Updated

•

8 years ago

Attachment #8814314 - Attachment is obsolete: true

Attachment #8814314 - Flags: review?(honzab.moz)

Kate McKinley [:kmckinley, :Kate]

Assignee

Updated

•

8 years ago

Attachment #8814315 - Attachment is obsolete: true

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 58

•

8 years ago

I moved the tests from plain mochitests to browser mochitests, so all tests use the same framework. Treeherder with --repeat=20 (filtered) https://treeherder.mozilla.org/#/jobs?repo=try&revision=3bc5b20c8791b5fdd8e38734f036ea458f1fe67a&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-classifiedState=unclassified

Flags: needinfo?(kmckinley)

Kate McKinley [:kmckinley, :Kate]

Assignee

Updated

•

8 years ago

Flags: needinfo?(honzab.moz)

Kate McKinley [:kmckinley, :Kate]

Assignee

Updated

•

8 years ago

Attachment #8815186 - Attachment is obsolete: true

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 59

•

8 years ago

Honza, I think this try run shows that this patch fixes the problems with the prior version. I moved the unreliable plain mochitests to browser mochitests.

Comment hidden (mozreview-request)

Pulsebot

Comment 62

•

8 years ago

Pushed by kmckinley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/677c659aed2c Lower HSTS priming timeout r=mayhemer

Carsten Book [:Tomcat]

Comment 63

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/677c659aed2c

Status: REOPENED → RESOLVED

Closed: 9 years ago → 8 years ago

status-firefox53: affected → fixed

Resolution: --- → FIXED

Honza Bambas (:mayhemer)

Comment 64

•

8 years ago

No longer blocks: 1308612

Iulia Cristescu, QA [:JuliaC] (away, please needinfo? cornel.ionce@softvision.ro)

Comment 71

•

8 years ago

Hello, Kate! Can you please give me a hint about how should I reproduce and verify this issue? Should I consider using the links provided in https://bugzilla.mozilla.org/show_bug.cgi?id=1311807#c0 and https://bugzilla.mozilla.org/show_bug.cgi?id=1311807#c17? Is there anything else I should focus on, beside the loading time?

Flags: needinfo?(kmckinley)

Kate McKinley [:kmckinley, :Kate]

Assignee

Comment 72

•

8 years ago

Hi Iulia, Yes, either of those links should work. The loading time should be < 10 seconds since there are two timeouts. It should be cached for a week, so subsequent loads should be quicker. That is all, I think.

Flags: needinfo?(kmckinley)

Iulia Cristescu, QA [:JuliaC] (away, please needinfo? cornel.ionce@softvision.ro)

Comment 73

•

8 years ago

Thank you, Kate, for your instructions! I managed to reproduce the issue on 52.0a1 (2016-10-22). The bug is verified fixed on 53.0b6 (20170323090023), using Windows 10 x64, Ubuntu 16.04 x64 and Mac OS X 10.11.6.

Status: RESOLVED → VERIFIED

status-firefox53: fixed → verified

Bug 1313595 - Lower HSTS priming timeout 9 years ago Kate McKinley [:kmckinley, :Kate] 58 bytes, text/x-review-board-request	mayhemer : review+ gchang : approval-mozilla-beta-	Details
Backed out changeset a8be4ebc85cf () for permaorange unexpected assertion in test_referrerdirective.html, 9 years ago Sankha Narayan Guria [:sankha] 18.70 KB, patch		Details \| Diff \| Splinter Review
Bug 1313595 - Lower timeouts for HSTS priming 8 years ago Kate McKinley [:kmckinley, :Kate] 58 bytes, text/x-review-board-request		Details
Bug 1313595 - Missing interface in HSTSPrimerListener 8 years ago Kate McKinley [:kmckinley, :Kate] 58 bytes, text/x-review-board-request	mayhemer : review+	Details
Bug 1313595 - Backout for Aurora a=crashes 8 years ago Kate McKinley [:kmckinley, :Kate] 19.30 KB, patch		Details \| Diff \| Splinter Review