Open Bug 1441955 Opened 6 years ago Updated 2 years ago

Investigate whether the async tab switcher needs a re-entrancy guard

Categories

(Firefox :: Tabbed Browser, enhancement, P3)

enhancement

Tracking

()

Tracking Status
firefox60 --- affected

People

(Reporter: alexical, Unassigned)

References

(Blocks 1 open bug)

Details

We wrap handleEvent in the async tab switcher with a guard that checks and sets a _processing boolean to determine if it's re-entered itself. If it has, it pushes itself to the end of the event queue with an nsITimer. If the tab switcher is re-entrant, this is incorrect because it could lead to a reordering of its events, and also because the _processing flag is not set in its other methods which call postActions.

Visually inspecting the code, I can't seem to find a way in which it could re-enter itself. Chatting with Gijs, it may have been possible when this was in XBL, but it shouldn't be possible now that it's in tabbrowser.js. However, I don't want to unwittingly introduce bugs by just removing the check. Is there a common procedure for removing code that seems to be useless but may not be, mconley?
Flags: needinfo?(mconley)
I just saw a stack in the debugger where we re-entered into a handleEvent with a MozAfterPaint event when checking minimizedOrFullyOccluded in onSizeModeOrOcclusionStateChange[1]. Gijs, is this the micro-task thing you were referring to? Or some oddity of the browser tools debugger? I couldn't find anything digging through the backend of windowState or isFullyOccluded that could account for this.

[1]: https://searchfox.org/mozilla-central/rev/769222fadff46164f8cc0dc7a0bae5a60dc2f335/browser/base/content/tabbrowser.js#4756
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Doug Thayer [:dthayer] from comment #0)
> Is there a common procedure for removing code that seems to be
> useless but may not be, mconley?

The async tab switcher is kind of a funny case because of its fragility. I'd go back to basics here, and re-examine the bug that added the re-entrancy guard. It was added for a reason there, to fix a real bug. If you remove the guard, does the bug return?

If it doesn't, and all we have is speculation on what the guard could protect against, I'm for removing it and preparing to put  it back if / when regressions surface. At least that way, we can record _why_ it's still necessary.

Removing code from the async tab switcher, especially code that has dubious value or purpose, is righteous.
Flags: needinfo?(mconley)
(In reply to Doug Thayer [:dthayer] from comment #1)
> I just saw a stack in the debugger where we re-entered into a handleEvent
> with a MozAfterPaint event when checking minimizedOrFullyOccluded in
> onSizeModeOrOcclusionStateChange[1]. Gijs, is this the micro-task thing you
> were referring to? Or some oddity of the browser tools debugger? I couldn't
> find anything digging through the backend of windowState or isFullyOccluded
> that could account for this.
> 
> [1]:
> https://searchfox.org/mozilla-central/rev/
> 769222fadff46164f8cc0dc7a0bae5a60dc2f335/browser/base/content/tabbrowser.
> js#4756

Possibly... :arai can explain this in more detail. See https://bugzilla.mozilla.org/show_bug.cgi?id=1426599#c5 and https://bugzilla.mozilla.org/show_bug.cgi?id=1416153 comment 3 and later discussion with Bevis and Olli.


If this is indeed a XBL-induced microtask checkpoint then it may have gone away now that tabbrowser has moved into JS (which landed on central sometime yesterday).

In any case, based on https://bugzilla.mozilla.org/show_bug.cgi?id=1441872#c17 I would expect this to only happen when there's no JS on the stack. Clearly this is buggy for the XBL case per https://bugzilla.mozilla.org/show_bug.cgi?id=1416153#c3 . This is a little bit scary because, while it's easy to say that this won't affect web content directly (because no XBL) it sure does affect browser UI code if we get promise callbacks firing (and/or async functions continuing once the 'await' succeeds) seemingly-willy-nilly while other code is trying to run, just because we happened to touch a XBL property...

On the upside, as we're moving away from XBL these problems will become smaller.

It would be helpful if there was a document explaining in what precise situations this is going to affect scheduling *specifically for chrome code*. Because certainly the stack in that comment does not follow from the explanations in https://docs.google.com/presentation/d/1momsC3suU8m-CrdZyYD_6QATATehjzZHbkGmL6KsmSk/edit#slide=id.g28ecd2197a_0_238 which was linked from the intent to ship in terms of "look here if you need to adapt your code". Bevis/Arai, does something like that exist? (Does this only happen for XBL setters or also getters, for instance?)
Flags: needinfo?(gijskruitbosch+bugs)
Flags: needinfo?(bevistseng)
Flags: needinfo?(arai.unmht)
(and based on bz's comment, xpconnect C++ -> JS calls don't get a microtask checkpoint -- what about the inverse? I would assume not, but the XBL thing has me worried because most (all) our web-exposed C++-implemented stuff is now webidl anyway.)
> what about the inverse?

I would think not.  Nothing in there to do a checkpoint.  

Fundamentally, I am not aware of anything in our codebase that implements the actual spec behavior for microtask checkpoints (skipping them when there is JS on the stack, etc).  We should really do that....
(In reply to Mike Conley (:mconley) (:⚙️) (Totally backlogged on reviews and needinfos) from comment #2)
> The async tab switcher is kind of a funny case because of its fragility. I'd
> go back to basics here, and re-examine the bug that added the re-entrancy
> guard. It was added for a reason there, to fix a real bug. If you remove the
> guard, does the bug return?

The bug is Bug 1192720 - unfortunately I don't seem to have a device capable of reproducing it / I don't know how to get the right on screen keyboard to produce it. (Not too familiar with tablet mode - but neither the regular ease of access on screen keyboard or the on screen keyboard accessed via the icon in the bottom right of the screen yield the bug.)

If anyone has any suggestions for producing it or access to devices that can produce it I'll put up a patch for testing.
> (In reply to Doug Thayer [:dthayer] from comment #1)
> I just saw a stack in the debugger where we re-entered into a handleEvent
> with a MozAfterPaint event when checking minimizedOrFullyOccluded in
> onSizeModeOrOcclusionStateChange[1]. Gijs, is this the micro-task thing you
> were referring to? Or some oddity of the browser tools debugger? I couldn't
> find anything digging through the backend of windowState or isFullyOccluded
> that could account for this.

do you mean JS debugger, or gdb/lldb ?

if you mean JS debugger, I'm not sure if scheduling preserves there.
I'd forward the question to devtools debugger people in that case.

Then, if you mean gdb/lldb,
how do you confirm that the issue happens when checking minimizedOrFullyOccluded ?
I'm wondering if there's still some XBL things deeply inside minimizedOrFullyOccluded accessor body.


(In reply to :Gijs from comment #3)
> Bevis/Arai, does something like that exist? (Does this only
> happen for XBL setters or also getters, for instance?)

I don't have documentation prepared.
then, as far as I know (I confirmed), the strange thing happens only with XBL accessors, both setters and getters.
Flags: needinfo?(arai.unmht)
Priority: -- → P3
(In reply to Tooru Fujisawa [:arai] from comment #7)
> > (In reply to Doug Thayer [:dthayer] from comment #1)
> > I just saw a stack in the debugger where we re-entered into a handleEvent
> > with a MozAfterPaint event when checking minimizedOrFullyOccluded in
> > onSizeModeOrOcclusionStateChange[1]. Gijs, is this the micro-task thing you
> > were referring to? Or some oddity of the browser tools debugger? I couldn't
> > find anything digging through the backend of windowState or isFullyOccluded
> > that could account for this.
> 
> do you mean JS debugger, or gdb/lldb ?
> 
> if you mean JS debugger, I'm not sure if scheduling preserves there.
> I'd forward the question to devtools debugger people in that case.

I do mean the JS debugger. Jim, do you know of anything that could cause funky stacks like that I observed?

I believe the stack was:

handleEvent (with a MozAfterPaint event)
handleEvent (which set the timeout with the re-entrancy guard)
onSizeModeOrOcclusionStateChange
handleEvent
Flags: needinfo?(jimb)
:arai has answered the question in comment 7.
Flags: needinfo?(bevistseng)
See Also: → 1523946
Flags: needinfo?(jimb)

Devtools did at one point try to include asynchronous calls in stack traces - that is, if a function F's returning was going to resolve a promise, and that promise had an async function G suspended awaiting it, then we would show G as F's caller. But we did make a visual distinction between synchronous and asynchronous callers, so you would have been able to tell, if that's what was going on.

Other than that, devtools should always be showing an accurate presentation of the JS on the stack.

Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.