Closed Bug 1229429 Opened 8 years ago Closed 3 years ago

[e10s] Windowed Flash may display with wrong tab after rapid tab switching

Categories

(Core Graveyard :: Plug-ins, defect, P3)

48 Branch
defect

Tracking

(e10s+, firefox45 wontfix, firefox46 disabled, firefox47 disabled, firefox48 wontfix, firefox49 wontfix, firefox50 wontfix, firefox51 ?, firefox52 ?)

RESOLVED WONTFIX
Tracking Status
e10s + ---
firefox45 --- wontfix
firefox46 --- disabled
firefox47 --- disabled
firefox48 --- wontfix
firefox49 --- wontfix
firefox50 --- wontfix
firefox51 --- ?
firefox52 --- ?

People

(Reporter: arni2033, Unassigned)

References

Details

(Keywords: regression, Whiteboard: tpi:+)

Attachments

(3 files, 2 obsolete files)

>>>   My Info:   Win7_64, Nightly 45, 32bit, ID 20151201030226
STR:
1. Open "testcase 1 (flash)" in a new window
2. Open "testcase 2 (no flash)" in that window
3. Make sure that there's only 2 opened tabs in that window
4. Switch to one of those 2 tabs, then hold Ctrl+Tab for a few seconds (~3s)

Result:       
 Either (A) or (B) happens
 A) "testcase 1 (flash)"    display no plugins on the page
 B) "testcase 2 (no flash)" displays flash on the page

Expectations: 
 There should be flash content displayed on "testcase 1 (flash)"
 No plugins should be displayed on "testcase 2 (no flash)"

Note:
 This bug is NOT a regression after bug 1227666. It's reproducible on previous builds.
Correction (step 4 should be modified):
>   4. Quickly press Ctrl+Tab (or Ctrl+PageDown) twice
Blocks: 1152049
No longer blocks: 1152049
Depends on: 1152049
tracking-e10s: --- → +
Flags: needinfo?(jmathies)
Flags: needinfo?(jmathies)
I think this might have something to do with the way the compositor currently waits on confirmation for plugin window updates from CompositorChild. It should wait on a sequence number here [1] vs. a boolean. I'm guessing with the rapid tab switching this triggers that flag is getting out of sync.
 
[1] http://mxr.mozilla.org/mozilla-central/search?string=mPluginUpdateResponsePending
Blocks: e10s-plugins
Summary: [e10s] Flash is (not) displayed on wrong web pages if I quickly switch between tabs → [e10s] Windowed Flash may display with wrong tab after rapid tab switching
Has STR: --- → yes
@Reporter: Not reproducible in this version. Are you still seeing this error?
Name 	Firefox
Version 	46.0a2
Build ID 	20160205004003
User Agent 	Mozilla/5.0 (Windows NT 6.1; WOW64; rv:46.0) Gecko/20100101 Firefox/46.0
Flags: needinfo?(arni2033)
Yes, as I told in bug 1229455, this now requires either new testcase OR
setting pref "plugins.rewrite_youtube_embeds" to false.
Flags: needinfo?(arni2033)
User Agent 	Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:46.0) Gecko/20100101 Firefox/46.0
Build ID 	20160210004006

Tested on Windows 7 x64 on the Aurora 46.0a2 build. 
I set the "plugins.rewrite_youtube_embeds" to false (in about:config) and this issue reproduces with e10s enabled. With e10s disabled, I couldn't reproduce this issue.
Assignee: nobody → jmathies
The cause is that during these rapid tab switches, CrossProcessCompositorParent::ShadowLayersUpdated() never gets called, and that's the call that delivers plugin state changes.

Note I'm only seeing this on really basic test cases. More complex content almost always has layer updates coming across which flush the plugin data out.
(In reply to Jim Mathies [:jimm] from comment #8)
> I'm only seeing this on really basic test cases.
I just reproduced this on the latest Nightly with:
url [1] - Flash (requires disabling "rewrite youtube embeds" pref), and url [2] - no_flash.
[1] http://www.stalker.pl/forum/viewtopic.php?f=9&t=134&start=6276
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1210197
I only get result (B). NI me if you have more complex examples or if you want to see the screencast.
Attached patch patch (obsolete) — — Splinter Review
Blocks: 1229455
I'm tempted to remove this check and the variable completely but I'm worrying about side effects. Doing so would address a couple corner case bugs that are currently on file related to updated clipping with the same layer tree id in use.

I'll investigate doing so in another bug though.
Comment on attachment 8719786 [details]
MozReview Request: Bug 1229429 - Refresh plugin window state during composition if we detect a change of layer trees. Fixes issues with plugin window visibility during rapid tab switching. r?roc

https://reviewboard.mozilla.org/r/35117/#review31743
Attachment #8719786 - Flags: review?(roc) → review+
Keywords: checkin-needed
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/58b7b92c3ad9
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla47
This bug is fixed on:   Win7_64, Nightly 47, 32bit, ID 20160218030349
Status: RESOLVED → VERIFIED
Comment on attachment 8719786 [details]
MozReview Request: Bug 1229429 - Refresh plugin window state during composition if we detect a change of layer trees. Fixes issues with plugin window visibility during rapid tab switching. r?roc

Approval Request Comment
[Feature/regressing bug #]:
e10s plugins
[User impact if declined]:
minor tab switching bug with plugin windows.
[Describe test coverage new/current, TreeHerder]:
on nightly
[Risks and why]: 
low, well understood area of the code.
[String/UUID change made/needed]:
none.
Attachment #8719786 - Flags: approval-mozilla-aurora?
Marking affected for 46.
Comment on attachment 8719786 [details]
MozReview Request: Bug 1229429 - Refresh plugin window state during composition if we detect a change of layer trees. Fixes issues with plugin window visibility during rapid tab switching. r?roc

Fixes an e10s regression, tested on m-c, ok to uplift to aurora.
Attachment #8719786 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Depends on: 1253688
This was backed out in bug 1253688.
Status: VERIFIED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla47 → ---
The problem in bug 1253688 was caused by having two remote browsers composited in the same layer tree. This broke an assumption in UpdatePluginWindowState about current remote layer ids.

This bug is still an issue and at this point I'm not sure how to fix it. With heavy tab switching on simple pages, we fail to get layer tree updates for the final tab the user lands on. Plugin window state will remain as it was when the last remote layer tree update came over.

As a result of my work in bug 1253688 I also filed bug 1253748 which deals with a situation where both remote layer trees have windowed plugins. This is not known to occur but could happen when we release web extensions.
Blocks: 1242273
So I read through the bug, maybe I'm misunderstanding the problem but it seems like your suggestion in comment 4 should fix it, no?

AIUI the rapid tab switching causes mPluginUpdateResponsePending to get set to true multiple times and then it gets set back to false with the response for one of the earlier tab switches. So changing it to a sequence number and only allowing the composite once a response arrives with the most recent sequence number should fix the problem.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #23)
> So I read through the bug, maybe I'm misunderstanding the problem but it
> seems like your suggestion in comment 4 should fix it, no?

No, I put that fix together as an experiment and it didn't help. It's a separate issue I need to file a new bug on. It has some draw backs, spurious paints that leak through this actually make scrolling smoother!

I've debugged what's going on here using the plugin logging and understand the cause - no plugin updates get sent to the main thread on the last tab switch. Plugin updates are triggered by a remote layer tree update that modifies plugin window metrics which get picked up in UpdatePluginWindowState.

I think the fix here involves figuring out when a remote layer tree gets swapped out so we can force a flush of plugin updates to the main thread. My first landing here did this using a cached "last remote layer tree id" check. This fixed the problem but the id caching check broke with Loops chat window which introduced a second remote layer tree into the top level 'chrome' tree.

Detecting the change needs to happen when we walk the tree via AutoResolveRefLayers [1]. I'm not sure how to go about detecting this kind of change. We discard the old relationship shortly after composition through another call to AutoResolveRefLayers.

[1] http://mxr.mozilla.org/mozilla-central/source/gfx/layers/ipc/CompositorParent.cpp#1212
Priority: -- → P1
So I think I understand the problem better now. Restating to make sure I get it: the remote layer tree update sets the mUpdatedPluginDataAvailable flag on the LayerTreeState, and when we go to composite, we check the flag and flush plugin window state to the main-thread if it's set. While tab-switching furiously, we can composite the last tab without having gotten a remote layer tree update for it, and so lts.mUpdatedPluginDataAvailable is false and we skip the flush of the plugin window state.

Looking at the patch you landed earlier, I can see why it would fix the bug (it catches the case where we switch tabs, by detecting the change in mLastPluginUpdateLayerTreeId). It's not as clear to me why it broke Loop. My understanding is you have two different remote layer trees attached to the parent tree, then UpdatePluginWindowState should get called twice (once for each layer id) per composite. With your patch in place, this would cause mLastPluginUpdateLayerTreeId to flip-flop between the two ids, and this code would basically flush plugin state to the main-thread twice per composite, possibly waiting for the response from the main thread, and generally cause the compositor to slow way down. Is that accurate?

In terms of fixing this, I think you could do it by:

1) Add a counter that increments on each composite (or each time you do the AutoResolveRefLayers with plugin resolution).

2) Remove mLastPluginUpdateLayerTreeId and replace it with a per-layer-tree "last composited" member. This member would be updated when that layer tree is touched in UpdatePluginWindowState. The code in UpdatePluginWindowState would look sort of like this:

{
  ...

  if (!lts.mUpdatedPluginDataAvailable && lts.mLastCompositedOn == mCompositeCounter) {
    // no layer tree updates, and this layer tree was part of the last composite,
    // so we're good to go. Make sure to update the mLastCompositedOn counter so that
    // it stays in sync with mCompositeCounter
    lts.mLastCompositedOn++;
    return false;
  }

  // else, compare mCachedPluginData to
  // lts.mPluginData, and send the main-thread flush if anything changed

  lts.mLastCompositedOn = mCompositeCounter + 1;
  ...
}

The above implementation assumes that mCompositeCounter would get incremented after the AutoResolveRefLayers happened, so that at the time UpdatePluginWindowState runs mCompositeCounter reflects the number of the last composite that happened. You could do the increment first too and adjust the code accordingly.

Does that sound like it might work?

One other thing that might be a good idea is to move mCachedPluginData to be on the LayerTreeState, so that it is per-layer-tree rather than "global" to the CompositorParent. Although I'm not entirely sure if that makes sense with the "hide all plugins" behaviour, since that's global and not per-layer-tree. Regardless, I think this is probably not required for the fix as I imagine it.
(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #25)
> So I think I understand the problem better now. Restating to make sure I get
> it: the remote layer tree update sets the mUpdatedPluginDataAvailable flag
> on the LayerTreeState, and when we go to composite, we check the flag and
> flush plugin window state to the main-thread if it's set. While
> tab-switching furiously, we can composite the last tab without having gotten
> a remote layer tree update for it, and so lts.mUpdatedPluginDataAvailable is
> false and we skip the flush of the plugin window state.
> 
> Looking at the patch you landed earlier, I can see why it would fix the bug
> (it catches the case where we switch tabs, by detecting the change in
> mLastPluginUpdateLayerTreeId). It's not as clear to me why it broke Loop. My
> understanding is you have two different remote layer trees attached to the
> parent tree, then UpdatePluginWindowState should get called twice (once for
> each layer id) per composite. With your patch in place, this would cause
> mLastPluginUpdateLayerTreeId to flip-flop between the two ids, and this code
> would basically flush plugin state to the main-thread twice per composite,
> possibly waiting for the response from the main thread, and generally cause
> the compositor to slow way down. Is that accurate?

Yes. Plus since we block composition waiting on an async message sent from CompositorChild confirming plugins were updated, and trigger a composite when we receive it, which triggers another plugin up, the compositor thread grinds things down.

Note the waiting on confirmation stuff is going away soon via bug 1148978.

> In terms of fixing this, I think you could do it by:
> 
> 1) Add a counter that increments on each composite (or each time you do the
> AutoResolveRefLayers with plugin resolution).
> 
> 2) Remove mLastPluginUpdateLayerTreeId and replace it with a per-layer-tree
> "last composited" member. This member would be updated when that layer tree
> is touched in UpdatePluginWindowState. The code in UpdatePluginWindowState
> would look sort of like this:
> 
> {
>   ...
> 
>   if (!lts.mUpdatedPluginDataAvailable && lts.mLastCompositedOn ==
> mCompositeCounter) {
>     // no layer tree updates, and this layer tree was part of the last
> composite,
>     // so we're good to go. Make sure to update the mLastCompositedOn
> counter so that
>     // it stays in sync with mCompositeCounter
>     lts.mLastCompositedOn++;
>     return false;
>   }
> 
>   // else, compare mCachedPluginData to
>   // lts.mPluginData, and send the main-thread flush if anything changed
> 
>   lts.mLastCompositedOn = mCompositeCounter + 1;
>   ...
> }
> 
> The above implementation assumes that mCompositeCounter would get
> incremented after the AutoResolveRefLayers happened, so that at the time
> UpdatePluginWindowState runs mCompositeCounter reflects the number of the
> last composite that happened. You could do the increment first too and
> adjust the code accordingly.
> 
> Does that sound like it might work?

Yeah that looks like it might do it. I'll put something together to test.
Attachment #8719783 - Attachment is obsolete: true
Attachment #8719786 - Attachment is obsolete: true
Whiteboard: e10st?
Should this be tracked for 48?
Making it show up as a 48 regression as that's where e10s is.
Version: Trunk → 48 Branch
This really only first "shows up" in 48 with e10s.
Let's call it "disabled" because it's a feature flag.
Assignee: jmathies → nobody
Priority: P1 → P2
Whiteboard: e10st? → e10st?, tpi:+
Whiteboard: e10st?, tpi:+ → tpi:+
Jim, any updates on this? Can this re-land now that Loop regressions aren't cause for concern?
Flags: needinfo?(jmathies)
(In reply to Ryan VanderMeulen [:RyanVM] from comment #32)
> Jim, any updates on this? Can this re-land now that Loop regressions aren't
> cause for concern?

I don't think so, flash 23 solves a lot of these issues by removing child windows.
Flags: needinfo?(jmathies)
Carrying over regression keyword from bug 1242273.
Keywords: regression
Does this still affect current versions of Firefox, but only for Flash versions older than 23? 
We can ask QE for help if you think that would be useful.
If we are not going to do anything on this issue, do you want to wontfix the bug?
Flags: needinfo?(jmathies)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #36)
> Does this still affect current versions of Firefox

Yes.

> but only for Flash versions older than 23? 

s/flash 23/async painting/ which is currently being held on nightly/aurora while we track down a bunch of bugs. So the windowed plugin bugs associated with e10s we know will get fixed when we get async painting out on release.

> We can ask QE for help if you think that would be useful.
> If we are not going to do anything on this issue, do you want to wontfix the
> bug?

I think it should remain open so people can find it. We'll wontfix it once we know the issue is addressed in a future flash release.
Flags: needinfo?(jmathies)
Moving to p3 because no activity for at least 1 year(s).
See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P2 → P3
Resolving as wont fix, plugin support deprecated in Firefox 85.
Status: REOPENED → RESOLVED
Closed: 8 years ago3 years ago
Resolution: --- → WONTFIX
Product: Core → Core Graveyard
You need to log in before you can comment on or make changes to this bug.