Figure out how to improve the user experience/reduce the code complexity for the beforeunload timeout

RESOLVED DUPLICATE of bug 1336763

Status

()

Firefox
Tabbed Browser
--
enhancement
RESOLVED DUPLICATE of bug 1336763
11 months ago
7 months ago

People

(Reporter: sworddragon2, Unassigned)

Tracking

50 Branch
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

11 months ago
Forked from bug #1325527:

If e10s is enabled and the user closes a site the related tab/window is being requested to respond if there is a beforeunload handler. If the site does not respond it will be closed after a timeout of 5 seconds. This can cause an UI-race which is handled at the linked ticket and this ticket here tries to figure out if we can enhance this by further increasing the usability and/or reducing the code complexity.

My idea was instead of having the main process to ask the child process for beforeunload handlers on site closing the child process could tell the main process as soon as he adds/removes beforeunload handlers. The main process keeps then track of this information for all sites which avoids the need to get this information if the user closes the site which I currently think would eliminate the 5 second timeout.


(In reply to :Gijs from comment #9)
> Well no, because we still have to do the timeout dance if there is a
> beforeunload handler (which unfortunately is more common than one would
> like...). Having two codepaths instead of 1 is more complex.

If I'm causing an infinite loop in the beforeunload handler the tab instead of being automatically closed after 5 seconds the "A web page is slowing down your browser. What would you like to do?" dialog appears after dom.max_script_run_time seconds. This also requires to keep the code complexity for the timeout dance this ticket tries to reduce?

Comment 1

11 months ago
(In reply to sworddragon2 from comment #0)
> (In reply to :Gijs from comment #9)
> > Well no, because we still have to do the timeout dance if there is a
> > beforeunload handler (which unfortunately is more common than one would
> > like...). Having two codepaths instead of 1 is more complex.
> 
> If I'm causing an infinite loop in the beforeunload handler the tab instead
> of being automatically closed after 5 seconds the "A web page is slowing
> down your browser. What would you like to do?" dialog appears after
> dom.max_script_run_time seconds. This also requires to keep the code
> complexity for the timeout dance this ticket tries to reduce?

Yes, because we'll still need to keep a timeout (now based on the pref you cite) and then we'd need to implement alternative UI to display a notification.

These could be two separate bugs, but I'm curious what Bill thinks before we file #2.
Flags: needinfo?(wmccloskey)
(Reporter)

Comment 2

11 months ago
(In reply to :Gijs from comment #1)
> Yes, because we'll still need to keep a timeout (now based on the pref you
> cite)

So still the same codepath is needed. I have no idea how this specific case can be improved then as this scenario should be always able to cause a potential closing delay.

But in case a site has no beforeunload handler it should at least improve the user experience as the tab would then always close immediately. Currently such sites may still delay the closing which will at least give feedback in case bug #1325527 gets resolved.

I also remember that you have been sayed in the linked ticket that beforeunload handlers are often used in ads/trackers. Are beforeunload handlers that much widespread? Is there a vague information/estimation how many of them are actually causing a dialog to show up?


(In reply to :Gijs from comment #1)
> and then we'd need to implement alternative UI to display a
> notification.

The dom.max_script_run_time timeout does already show its UI. Or what was in your mind? :)

Comment 3

11 months ago
I'm having trouble understanding the question I'm supposed to answer. I'll try my best to give general information.

(In reply to sworddragon2 from comment #0)
> My idea was instead of having the main process to ask the child process for
> beforeunload handlers on site closing the child process could tell the main
> process as soon as he adds/removes beforeunload handlers. The main process
> keeps then track of this information for all sites which avoids the need to
> get this information if the user closes the site which I currently think
> would eliminate the 5 second timeout.

This would sort of work, but it's racy. The danger is that the page could add an onbeforeunload handler right around the time that the user clicks the close box. Then, when we close the tab, we would run the unload handler without running the beforeunload (assuming the parent doesn't receive the notification of a new handler in time). Which is not spec compliant.

We could decide that we don't care about violating the spec in that case. It's likely to be rare. But that's the reason why I did it the way I did.

> (In reply to :Gijs from comment #9)
> > Well no, because we still have to do the timeout dance if there is a
> > beforeunload handler (which unfortunately is more common than one would
> > like...). Having two codepaths instead of 1 is more complex.
> 
> If I'm causing an infinite loop in the beforeunload handler the tab instead
> of being automatically closed after 5 seconds the "A web page is slowing
> down your browser. What would you like to do?" dialog appears after
> dom.max_script_run_time seconds. This also requires to keep the code
> complexity for the timeout dance this ticket tries to reduce?

I don't really understand the argument about code complexity. The beforeunload code is totally different from the max_script_run_time code, so removing the former would not affect the latter. Note that, even if we close the tab, other tabs in the same content process will still show the "Web page is slowing down" infobar. So we don't necessarily need any new UI if we did allow the close to happen.

There is a corner case (mostly in the e10s-multi world) where the tab being closed was the last one in the content process. In that case no UI could be shown for a hung content process. However, we already need to deal with that in other situations (by killing the process). So that's not really an issue either.
Flags: needinfo?(wmccloskey)

Comment 4

11 months ago
(In reply to Bill McCloskey (:billm) from comment #3)
> > (In reply to :Gijs from comment #9)
> > > Well no, because we still have to do the timeout dance if there is a
> > > beforeunload handler (which unfortunately is more common than one would
> > > like...). Having two codepaths instead of 1 is more complex.

My point was about having 1 codepath for the case where we believe there is no beforeunload handler, and 1 codepath for the case where we know there *is* a beforeunload handler. The latter already exists, the former we would have to add (though you've added some more reasons why that might not be a good idea). We'd need to keep the latter even if we added the former, because we still have to keep dealing with extant beforeunload handlers. Hence, more code to deal with the same thing. Even besides the race condition, I'm not sure it's worth investing in.

> > If I'm causing an infinite loop in the beforeunload handler the tab instead
> > of being automatically closed after 5 seconds the "A web page is slowing
> > down your browser. What would you like to do?" dialog appears after
> > dom.max_script_run_time seconds. This also requires to keep the code
> > complexity for the timeout dance this ticket tries to reduce?
> 
> I don't really understand the argument about code complexity. The
> beforeunload code is totally different from the max_script_run_time code, so
> removing the former would not affect the latter. Note that, even if we close
> the tab, other tabs in the same content process will still show the "Web
> page is slowing down" infobar. So we don't necessarily need any new UI if we
> did allow the close to happen.
> 
> There is a corner case (mostly in the e10s-multi world) where the tab being
> closed was the last one in the content process. In that case no UI could be
> shown for a hung content process. However, we already need to deal with that
> in other situations (by killing the process). So that's not really an issue
> either.

I think the suggestion from sworddragon2 is that we use the slow script timeout instead of the timeout for the beforeunload code to 'resolve' the tab that is trying to navigate/close but is 'stuck', and that would allow removing the special 'timeout' case for the beforeunload code. So we would 'never' time out the request, and always just keep the tab open, trusting that after we kill the script that's slowing things down in the client process, the beforeunload request messaging would eventually return. It's not clear to me if that would always work and/or if it would be a better experience.

Comment 5

11 months ago
(In reply to :Gijs from comment #4)
> My point was about having 1 codepath for the case where we believe there is
> no beforeunload handler, and 1 codepath for the case where we know there
> *is* a beforeunload handler. The latter already exists, the former we would
> have to add (though you've added some more reasons why that might not be a
> good idea). We'd need to keep the latter even if we added the former,
> because we still have to keep dealing with extant beforeunload handlers.
> Hence, more code to deal with the same thing. Even besides the race
> condition, I'm not sure it's worth investing in.

Well, I was thinking that the "no beforeunload handler" path would just close the tab immediately. That seems pretty simple. I guess we would have to test both cases, but we should probably be doing that anyway.

> I think the suggestion from sworddragon2 is that we use the slow script
> timeout instead of the timeout for the beforeunload code to 'resolve' the
> tab that is trying to navigate/close but is 'stuck', and that would allow
> removing the special 'timeout' case for the beforeunload code. So we would
> 'never' time out the request, and always just keep the tab open, trusting
> that after we kill the script that's slowing things down in the client
> process, the beforeunload request messaging would eventually return. It's
> not clear to me if that would always work and/or if it would be a better
> experience.

The slow script timeout unfortunately doesn't cover other forms of content slowness. For example, loading the code for sqlite3.c in Searchfox will hang your browser for at least 10 seconds, but it's all in layout. (The code is 200,000 lines in one file.) I think we still need the timeout for that case.

Arguably we should have some sort of general content timer to replace the slow script timer. But that's more work.

Updated

11 months ago
Duplicate of this bug: 1326002
(Reporter)

Comment 7

11 months ago
(In reply to Bill McCloskey (:billm) from comment #3)
> This would sort of work, but it's racy. The danger is that the page could
> add an onbeforeunload handler right around the time that the user clicks the
> close box. Then, when we close the tab, we would run the unload handler
> without running the beforeunload (assuming the parent doesn't receive the
> notification of a new handler in time). Which is not spec compliant.

Is there any way to avoid this race condition? In the worst case can we do some sort of IPC locking in the e10s context to avoid it? Do we eventually even have such IPC locking capabilities already implemented or are there maybe other issues that would eventually require them?


(In reply to Bill McCloskey (:billm) from comment #3)
> I don't really understand the argument about code complexity.

If I'm not wrong in the previous ticket one of Gijs claims was that solving this issue might increase the code complexity. I was thinking that changing the behavior from "request information from the child process on tab closing" to "letting the child process give the host process all needed information as early as possible" would maybe eliminate a code path or at least remove a significant amount of unique code. While changing this behavior might also add additional code I'm hoping that this keeps at least the balance.

In case changing this behavior will increase the code complexity significantly there would be a trade-off between it and enhanced user experience.
Component: Untriaged → Tabbed Browser

Comment 8

7 months ago
I believe this was addressed by mconley in bug 1336763, and we accepted that to improve the UX here we needed more code (and thus slightly more complexity), so we track which tabs have beforeunload handlers now. As a result, I'm duping this bug there.

bug 1350060 is about tracking long-running unload handlers which are hang-y (x-ref also bug 1103323). We could still implement bug 1325527 and/or reduce the timeout. I've heard from Chrome folks that their timeout is a lot shorter (1 second vs. our 5 seconds - see this thread, which may be interesting for some more context: https://groups.google.com/a/chromium.org/d/msg/blink-dev/MRWIaGY4Txg/oXN37OH2DQAJ ). Bill, how would you feel about aligning with Chrome on the timeout duration?
Status: UNCONFIRMED → RESOLVED
Last Resolved: 7 months ago
Flags: needinfo?(wmccloskey)
Resolution: --- → DUPLICATE
Duplicate of bug: 1336763
I'm fine with using a 1 second timeout.
Flags: needinfo?(wmccloskey)

Comment 10

7 months ago
(In reply to Bill McCloskey (:billm) from comment #9)
> I'm fine with using a 1 second timeout.

Excellent, filed bug 1361650.
You need to log in before you can comment on or make changes to this bug.