Closed Bug 1307122 Opened 8 years ago Closed 8 years ago

barmer.de: Clicking login button causes navigation to stop working

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
major

Tracking

()

VERIFIED FIXED
mozilla52
Tracking Status
firefox49 --- wontfix
firefox50 --- wontfix
firefox51 --- fixed
firefox52 --- fixed

People

(Reporter: Tobbi, Assigned: baku)

References

Details

(Keywords: regression)

Attachments

(3 files, 2 obsolete files)

Trying to login to barmer-gek.de currently hangs the content process.

Steps to reproduce:
1. Go to https://www.barmer-gek.de/ and click the Login button at the top right.
2. Notice that this hangs the content process and you cannot navigate away from this tab / click any links etc..
Component: General → Untriaged
Product: Core → Firefox
Summary: barmer.de: Clicking login button hangs content process → barmer.de: Clicking login button causes navigation to stop working
The issue doesn't happen if I disable JavaScript via Developer Tools options.
Has Regression Range: --- → yes
Has STR: --- → yes
Attached file minimal testcase
the page does sync XMLHttpRequest to maybe-nonexistent host in beforeunload and unload event handlers.
so the handler doesn't return forever.

looks like almost a dupe of bug 705617,
but the regression range doesn't match.
Component: Untriaged → DOM
Product: Firefox → Core
See Also: → 705617
in the testcase, I used https://webenfe.barmer-gek.de/ that is copied from the comment #0's website's case, the web server doesn't respond.
the testcase might become invalid if the server become online.
The testcase has gone invalid already. Can anyone find a replacement site?
baku, could you please take a quick look here?
Flags: needinfo?(amarchesini)
I cannot reproduce the issue... Tobias, can you check archive.org?
NI me if we are able to reproduce the issue.
Flags: needinfo?(amarchesini)
the issue is caused by the server down or something.
so archive.org won't be a solution.

we need either
  * webserver that doesn't respond (and firefox doesn't close connection)
  * webpage that accepts connection but doesn't return body forever
here's standalone testcase, that uses webserver running on node.js.
the webserver accepts the connection, but doesn't respond.

test.html in it opens XHR to the webserver in beforeunload and unload handler.
Flags: needinfo?(amarchesini)
Assignee: nobody → amarchesini
Flags: needinfo?(amarchesini)
Our implementation follows the spec. I don't see why we should close the connection if a sync XHR is used in unload or in beforeunload.
Flags: needinfo?(bugs)
(sync XHR during unload were the reason to add sendBeacon).

What do other browsers do here?

We could add some ugly timeout hack here to cancel the XHR eventually. If the testcase is really what the website is doing, it is behaving badly, but perhaps we should have some mitigation.
We could possible use the timeout property of XHR. It is only for async XHR when JS uses it, but perhaps internally we could set it when document is being unloaded or beforeunloaded.
Flags: needinfo?(bugs)
This seems a reasonable approach.
annevk, do you have any suggestion here?
Flags: needinfo?(annevk)
Not really. Also somewhat curious what other browsers do.
Flags: needinfo?(annevk)
Chromium's back button navigation also appears to not be working. However, at least address bar navigation works.

I filed https://bugs.chromium.org/p/chromium/issues/detail?id=652318 on Chromium's bugtracker.
I would like to go for what smaug suggested in comment 11.
I wonder if Chrome can load the next page only if it ends up using a new process.
In cases it can't use (say, you have been opened by another tab so you have .opener and you're trying to load a same origin page), does it perhaps break the same way as FF.
Attached patch timeout.patch (obsolete) — Splinter Review
This fixes the testcase but it doesn't fix a simple:

for (;;) {
  var xhr = new XMLHttpRequest();
  xhr.open("GET", something, false);
  try { xhr.send(); } catch(e) {}
}
Attachment #8801034 - Flags: review?(bugs)
Actually, that script is blocked in non-e10s. I'm debugging how we manage the slow scripts in e10s mode.
Comment on attachment 8801034 [details] [diff] [review]
timeout.patch

>+    mUnloadEventsTimeStamp(0)
Could you call the variable something like
mBeforeUnloadOrUnloadEventTimeStamp
(but see a comment later)

>+  // This helper class must be set when we dispatch beforeunload and unload
>+  // events in order to avoid unterminate sync XHRs.
>+  class MOZ_RAII WhenUnloadEventsHelper
odd name, IMO. Perhaps BeforeUnloadOrUnloadEventTimeStamp or so something like that
(but see a comment later)

>+    {
>+      MOZ_ASSERT(aDocument);
>+      mDocument->SetUnloadEventsTimeStamp(PR_Now());
This kind of stuff really should use TimeStamp, and NowLoRes() would be fine for this.

>+  PRTime UnloadEventsTimeStamp() const
So this should return TimeStamp()


>-    {
>+    SyncTimeoutType syncTimeoutType = MaybeStartSyncTimeoutTimer();
>+    if (syncTimeoutType == eErrorOrExpired) {
>+      CloseRequestWithError(ProgressEventType::abort);
I think you should just explicitly call Abort()

>   //   these items are specific to markup documents (html and xml)
>   //   may consider splitting these out into a subclass
>   unsigned      mIsSticky : 1;
>-  unsigned      mInPermitUnload : 1;
What has this change to do with this patch?

> nsDocumentViewer::GetBeforeUnloadFiring(bool* aInEvent)
> {
>-  *aInEvent = mInPermitUnload;
>+  *aInEvent = mDocument && mDocument->UnloadEventsTimeStamp();
This is wrong since UnloadEventsTimeStamp is set also when doing unload, not only beforeunload.

>@@ -1313,16 +1312,18 @@ nsDocumentViewer::PageHide(bool aIsUnloa
Ahaa, this method name reminds... we should handle also pagehide event.
That is dispatched in nsDocument.cpp

So, need to think about naming a bit.
Perhaps WhenUnloadEventsHelper should become PageUnloadingEventTimeStamp or some such.
And similar naming elsewhere.
Attachment #8801034 - Flags: review?(bugs) → review-
Attached patch timeout.patch (obsolete) — Splinter Review
Attachment #8801034 - Attachment is obsolete: true
Attachment #8801116 - Flags: review?(bugs)
Comment on attachment 8801116 [details] [diff] [review]
timeout.patch

based on irc, I'm expecting to see yet another patch.
Attachment #8801116 - Flags: review?(bugs)
Attached patch timeout.patchSplinter Review
Attachment #8801116 - Attachment is obsolete: true
Attachment #8801173 - Flags: review?(bugs)
Comment on attachment 8801173 [details] [diff] [review]
timeout.patch

>-  DispatchPageTransition(target, NS_LITERAL_STRING("pagehide"), aPersisted);
>+  {
>+    PageUnloadingEventTimeStamp(this);
This shouldn't be a temporary variable, but you need PageUnloadingEventTimeStamp timeStamp(this);
or so.

>+  mozilla::TimeStamp UnloadEventsTimeStamp() const
Call this GetPageUnloadingEventTimeStamp
or something to be more consistent with naming

>+  void SetUnloadEventsTimeStamp()
SetPageUnloadingEventTimeStamp



>+  mozilla::TimeStamp mBeforeUnloadOrUnloadEventTimeStamp;
mPageUnloadingEventTimeStamp

> #define NS_PROGRESS_EVENT_INTERVAL 50
>+#define MAX_SYNC_TIMEOUT_WHEN_UNLOADING 5000
Feels still a bit low on slow connections. 10000 perhaps? This is after all something which well behaving websites shouldn't trigger.

>+    SyncTimeoutType syncTimeoutType = MaybeStartSyncTimeoutTimer();
>+    if (syncTimeoutType == eErrorOrExpired) {
>+      CloseRequestWithError(ProgressEventType::abort);
Why not Abort()?

>+XMLHttpRequestMainThread::HandleSyncTimeoutTimer()
>+{
>+  MOZ_ASSERT(mSyncTimeoutTimer);
>+  MOZ_ASSERT(mFlagSyncLooping);
>+
>+  CancelSyncTimeoutTimer();
>+  CloseRequestWithError(ProgressEventType::abort);
Why not Abort() ?

those fixed, r+
Attachment #8801173 - Flags: review?(bugs) → review+
Could we get some test for this? (I would expect .sjs could be used for the test)
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/94f47e107744
Introducing a timeout for sync XHR when unload events are dispatched, r=smaug
https://hg.mozilla.org/mozilla-central/rev/94f47e107744
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla52
Hi :Tobbi,
could you please verify this issue is fixed as expected on a latest Nightly build? Thanks!
Flags: needinfo?(tobbi.bugs)
(In reply to Gerry Chang [:gchang] from comment #27)
> Hi :Tobbi,
> could you please verify this issue is fixed as expected on a latest Nightly
> build? Thanks!

I was able to verify the fix with the attached testcase. Thanks!
Status: RESOLVED → VERIFIED
Flags: needinfo?(tobbi.bugs)
Hi :baku,
Since this bug is a regression, do you consider to uplift this for 51 aurora at least?
Flags: needinfo?(amarchesini)
I don't think this is a regression, actually, but still, we could uplift this for aurora.
Flags: needinfo?(amarchesini)
Comment on attachment 8801173 [details] [diff] [review]
timeout.patch

Approval Request Comment
[Feature/regressing bug #]: Sync XHR
[User impact if declined]: The tab can freeze if a sync XHR is done in unload event and the server doesn't answer.
[Describe test coverage new/current, TreeHerder]:green on try. Test is part of the patch.
[Risks and why]: not too much. The patch introduces a concept of timer for doing sync XHRs when unloading.
[String/UUID change made/needed]: none
Attachment #8801173 - Flags: approval-mozilla-aurora?
Comment on attachment 8801173 [details] [diff] [review]
timeout.patch

Fix an issue relate to sync XHR and was verified. Take it in 51 aurora.
Attachment #8801173 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Component: DOM → DOM: Core & HTML
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: