Permafail linux64-qr X(X6) opt | xpcshell-remote.ini:toolkit/components/extensions/test/xpcshell/test_ext_i18n.js | xpcshell return code: -11 or debug: toolkit/components/extensions/test/xpcshell/test_ext_native_messaging_perf.js

RESOLVED FIXED in Firefox 55

Status

()

P3
normal
RESOLVED FIXED
2 years ago
a year ago

People

(Reporter: kats, Assigned: kats)

Tracking

({intermittent-failure})

Other Branch
mozilla55
x86_64
Linux
intermittent-failure
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox55 fixed)

Details

(Whiteboard: [gfx-noted])

Attachments

(4 attachments)

I can reproduce this pretty easily locally running, for example:

./mach xpcshell-test toolkit/components/extensions/test/xpcshell/test_ext_contentscript.js

What seems to be happening is that we get a parent process with a PuppetWidget and a BasicLayerManager. The check at [1] fails for the BasicLayerManager, so the child layers id is never registered with the compositor (which makes sense, because there is no compositor in this scenario).

Then later the child process tries to create a WebRenderLayerManager which tries to create a WebRenderBridge over the CrossProcessCompositorBridge, finds that the layers id it has doesn't have a corresponding CompositorBridgeParent, and crashes. Prior to bug 1351384 we would create a ClientLayerManager instead of a WebRenderLayerManager, which would go through the fallback code at [2] when it tried to create the PLayerTransaction, and so it would carry on without crashing.

I suspect the correct thing to do in this case might actually be to have the child process create a BasicLayerManager, so that it can still draw itself if it needs to (dvander says this scenario might occur with webextensions which don't get composited, but need to be able to DrawWindow themselves). The tricky part is detecting this scenario in the child process and creating the BasicLayerManager appropriately. I might need to add a new flag to InitRenderingState to make that happen, or maybe send over a sentinel layers id.

[1] http://searchfox.org/mozilla-central/rev/7419b368156a6efa24777b21b0e5706be89a9c2f/layout/ipc/RenderFrameParent.cpp#120
[2] http://searchfox.org/mozilla-central/rev/7419b368156a6efa24777b21b0e5706be89a9c2f/gfx/layers/ipc/CrossProcessCompositorBridgeParent.cpp#100
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
The first three patches are just minor cleanup that I found while poking around. The fourth patch is the fix. I tried using the mParentBackend in the TextureFactoryIdentifier as a hint for this but it looks a lot of the call sites just pass in LAYERS_NONE for that and in general it's not that reliable. I figured adding a new flag specifically to catch this scenario is probably better.

Try pushes: [1] for webrender stuff [2] for everything else.

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=6e28251b684b4cbb068c6058cc42c6d54c39efd4
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=949c969d097bce38ed6c2d573e32d3b47e9fd44c
I pushed a 5th patch to try, which removes that fallback code in CrossProcessCompositorBridgeParent::AllocPLayerTransaction, to see if that still gets hit now: https://treeherder.mozilla.org/#/jobs?repo=try&revision=11baa2b114d7c7a8905bce25979c001c023aaa94
Looks like that 5th patch causes failures, so we can skip that for now.
Comment on attachment 8853128 [details]
Bug 1351777 - Remove unnecessary cast and more tightly scope a local var.

https://reviewboard.mozilla.org/r/125202/#review128118
Attachment #8853128 - Flags: review?(dvander) → review+
Comment on attachment 8853129 [details]
Bug 1351777 - Remove aSuccess argument to RenderFrame constructor.

https://reviewboard.mozilla.org/r/125204/#review128120
Attachment #8853129 - Flags: review?(dvander) → review+
Comment on attachment 8853130 [details]
Bug 1351777 - Use a BasicLayerManager in the content process if the corresponding parent-side layer manager isn't connected to the compositor.

https://reviewboard.mozilla.org/r/125206/#review128114

::: commit-message-fa1d3:4
(Diff revision 1)
> +Bug 1351777 - Use a BasicLayerManager in the content process if the corresponding parent-side layer manager isn't connected to the compositor. r?dvander
> +
> +There are scenarios where we have a TabParent in the UI process hooked up to
> +a PuppetWidget with a BasicLayerManager. Supposedly webextensions fall into this

WebExtensions do this by design, so don't need  "supposedly".
Attachment #8853130 - Flags: review?(dvander) → review+
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 17

2 years ago
Pushed by kgupta@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/376e69e256d4
Remove trivial function. r=dvander
https://hg.mozilla.org/integration/autoland/rev/2e6663fe3045
Remove unnecessary cast and more tightly scope a local var. r=dvander
https://hg.mozilla.org/integration/autoland/rev/46df244233fc
Remove aSuccess argument to RenderFrame constructor. r=dvander
https://hg.mozilla.org/integration/autoland/rev/27b88a5451f3
Use a BasicLayerManager in the content process if the corresponding parent-side layer manager isn't connected to the compositor. r=dvander

Comment 18

2 years ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/376e69e256d4
https://hg.mozilla.org/mozilla-central/rev/2e6663fe3045
https://hg.mozilla.org/mozilla-central/rev/46df244233fc
https://hg.mozilla.org/mozilla-central/rev/27b88a5451f3
Status: NEW → RESOLVED
Last Resolved: 2 years ago
status-firefox55: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
18 failures in 845 pushes (0.021 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* graphics: 14
* try: 4

Platform breakdown:
* linux64-qr: 18

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1351777&startday=2017-03-27&endday=2017-04-02&tree=all

Updated

2 years ago
Duplicate of this bug: 1352287
Duplicate of this bug: 1352545
Depends on: 1356262
1 failures in 1008 pushes (0.001 failures/push) were associated with this bug in the last 7 days.   

Repository breakdown:
* mozilla-inbound: 1

Platform breakdown:
* windows10-64: 1

For more details, see:
https://brasstacks.mozilla.com/orangefactor/?display=Bug&bugid=1351777&startday=2017-07-24&endday=2017-07-30&tree=all
You need to log in before you can comment on or make changes to this bug.