Closed
Bug 1363347
Opened 8 years ago
Closed 8 years ago
Crash [@ mozalloc_abort | abort | webrender::frame::Frame::flatten_items ]
Categories
(Core :: Graphics: WebRender, defect, P3)
Tracking
()
VERIFIED
FIXED
mozilla55
Tracking | Status | |
---|---|---|
firefox-esr52 | --- | unaffected |
firefox53 | --- | unaffected |
firefox54 | --- | unaffected |
firefox55 | --- | fixed |
People
(Reporter: jan, Assigned: aosmond)
References
Details
(Keywords: crash, Whiteboard: gfx-noted)
Crash Data
Attachments
(2 files, 1 obsolete file)
82.98 KB,
text/x-log
|
Details | |
3.93 KB,
patch
|
sotaro
:
review+
|
Details | Diff | Splinter Review |
User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:55.0) Gecko/20100101 Firefox/55.0
Build ID: 20170509100208
Steps to reproduce:
bp-cb24c36d-8e1d-4b5c-9ed4-f296d0170509
STR of bug 1360613
set gfx.webrender.enabled to true
open any youtube video (/watch?v=xxxxxxxxx)
click on the fullscreen button on the bottom right of the video
Firefox crashes, I hear the audio playing for the few seconds even the firefox window is already gone.
With webrender disabled, everything works as expected.
Reporter | ||
Updated•8 years ago
|
Blocks: 1360613
Crash Signature: [@ mozalloc_abort | abort | webrender::frame::Frame::flatten_items ]
Has STR: --- → yes
OS: Unspecified → Linux
Hardware: Unspecified → x86_64
Comment 1•8 years ago
|
||
Thanks for filing this!
@kvark, this looks like potentially an issue in WR. Do we have a strategy for tackling these kinds of problems? If needed I can try to repro this and get a recording, which might help determine if it's a bug in WR or in the gecko code talking to WR. Or if you want you can try to debug it directly.
Flags: needinfo?(kvark)
See Also: → 1344190
Comment 2•8 years ago
|
||
@kats I don't think we have a strategy about this just yet. If you can repro it and reduce the case to something local to WR, that would be terrific. In any case, we may ask @Gankro to add some asserts to the new flattening code in order to catch the problem earlier.
Updated•8 years ago
|
Flags: needinfo?(kvark) → needinfo?(a.beingessner)
Reporter | ||
Comment 3•8 years ago
|
||
On that day I downloaded Servo for Linux64, openend some websites, even youtube, and at some point Servo crashed with the same reason Firefox crashed here: "called `Option::unwrap()` on a `None` value".
I feel ashamed that I didn't immediately take a screenshot. I was unable to click Servo's crash report button, everything was buggy.
Assignee | ||
Comment 4•8 years ago
|
||
Darkspirit, thanks for the STR in bug 1365009 comment 2. I am consistently able to reproduce this now. I don't think this is on the rust side -- the error just before the rust panic suggests we didn't have a TextureHost for the given external image ID.
[GPU 1476] ###!!! ASSERTION: TextureHost does not exist: 'Error', file /home/aosmond/dev/gecko-dev/gfx/layers/wr/WebRenderBridgeParent.cpp, line 438
thread 'RenderBackend' panicked at 'called `Option::unwrap()` on a `None` value', /checkout/src/libcore/option.rs:323
stack backtrace:
[GPU 1476] ###!!! ASSERTION: TextureHost does not exist: 'Error', file /home/aosmond/dev/gecko-dev/gfx/layers/wr/WebRenderBridgeParent.cpp, line 438
[GPU 1476] ###!!! ASSERTION: TextureHost does not exist: 'Error', file /home/aosmond/dev/gecko-dev/gfx/layers/wr/WebRenderBridgeParent.cpp, line 438
[GPU 1476] ###!!! ASSERTION: TextureHost does not exist: 'Error', file /home/aosmond/dev/gecko-dev/gfx/layers/wr/WebRenderBridgeParent.cpp, line 438
[GPU 1476] ###!!! ASSERTION: TextureHost does not exist: 'Error', file /home/aosmond/dev/gecko-dev/gfx/layers/wr/WebRenderBridgeParent.cpp, line 438
++DOCSHELL 0x7f95146d7800 == 3 [pid = 1555] [id = {553b9546-7a3d-445e-a828-f474bd17a944}]
++DOMWINDOW == 6 (0x7f95146da000) [pid = 1555] [serial = 6] [outer = (nil)]
++DOMWINDOW == 7 (0x7f95146de800) [pid = 1555] [serial = 7] [outer = 0x7f95146da000]
0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
1: std::sys_common::backtrace::_print
at /checkout/src/libstd/sys_common/backtrace.rs:71
2: std::panicking::default_hook::{{closure}}
at /checkout/src/libstd/sys_common/backtrace.rs:60
at /checkout/src/libstd/panicking.rs:355
3: std::panicking::default_hook
at /checkout/src/libstd/panicking.rs:371
4: std::panicking::rust_panic_with_hook
at /checkout/src/libstd/panicking.rs:549
5: std::panicking::begin_panic
at /checkout/src/libstd/panicking.rs:511
6: std::panicking::begin_panic_fmt
at /checkout/src/libstd/panicking.rs:495
7: rust_begin_unwind
at /checkout/src/libstd/panicking.rs:471
8: core::panicking::panic_fmt
at /checkout/src/libcore/panicking.rs:69
9: core::panicking::panic
at /checkout/src/libcore/panicking.rs:49
10: <core::option::Option<T>>::unwrap
at /checkout/src/libcore/macros.rs:21
11: webrender::resource_cache::ResourceCache::get_image_properties
at ./gfx/webrender/src/resource_cache.rs:602
12: webrender::frame::Frame::flatten_items
at ./gfx/webrender/src/frame.rs:522
13: webrender::frame::Frame::flatten_stacking_context
at ./gfx/webrender/src/frame.rs:394
14: webrender::frame::Frame::flatten_items
at ./gfx/webrender/src/frame.rs:645
15: webrender::frame::Frame::flatten_stacking_context
at ./gfx/webrender/src/frame.rs:394
16: webrender::frame::Frame::flatten_iframe
at ./gfx/webrender/src/frame.rs:475
17: webrender::frame::Frame::flatten_items
at ./gfx/webrender/src/frame.rs:657
18: webrender::frame::Frame::flatten_stacking_context
at ./gfx/webrender/src/frame.rs:394
19: webrender::frame::Frame::create
at ./gfx/webrender/src/frame.rs:269
20: webrender::render_backend::RenderBackend::build_scene
at ./gfx/webrender/src/render_backend.rs:474
21: webrender::profiler::TimeProfileCounter::profile
at ./gfx/webrender/src/profiler.rs:154
22: webrender::render_backend::RenderBackend::run
at ./gfx/webrender/src/render_backend.rs:218
23: webrender::renderer::Renderer::new::{{closure}}
at ./gfx/webrender/src/renderer.rs:1072
24: <std::panic::AssertUnwindSafe<F> as core::ops::FnOnce<()>>::call_once
at /checkout/src/libstd/panic.rs:296
25: std::panicking::try::do_call
at /checkout/src/libstd/panicking.rs:454
26: <unknown>
at /checkout/src/libpanic_abort/lib.rs:40
Redirecting call to abort() to mozalloc_abort
Assignee | ||
Updated•8 years ago
|
Assignee | ||
Comment 5•8 years ago
|
||
Looks like WebRenderImageHost::mWrCompositableHolder is cleared, so that when it tries to get the next image/texture, the timestamp is null (since there is no holder to get it from), and the frame/producers in its queue don't match the last frame (because its a video, we don't keep them around).
Flags: needinfo?(a.beingessner)
Assignee | ||
Comment 6•8 years ago
|
||
It appears the following occurs:
1) External image ID "A" gets bound to handle 1 via WebRenderBridgeParent::RecvAddExternalImageId. Sets mWrCompositableHolder to that of the WRBP.
2) External image ID "B" gets bound to handle 1 via WebRenderBridgeParent::RecvAddExternalImageId. Sets mWrCompositableHolder to that of the WRBP (noop, since it was already set).
3) Remove external image ID "A". Clears mWrCompositableHolder.
4) Use external image ID "B" in rendering. mWrCompositableHolder is null when it should not be.
Assignee | ||
Updated•8 years ago
|
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Assignee | ||
Updated•8 years ago
|
Priority: -- → P3
Whiteboard: gfx-noted
Assignee | ||
Comment 7•8 years ago
|
||
To summarize, the video's CompositableHandle is created via ImageBridgeChild rather than WebRenderBridgeChild. In WebRenderImageLayer::RenderLayer, we typically create a new ImageClient for each layer, which creates its own CompositableHandle to pair with an external image ID, which effectively makes them one-to-one. However in the video / PImageBridge case, it needs to reuse the CompositableHandle that the video already has. This becomes a problem when a second image layer gets created containing the same video due to moving to full screen (overwriting WebRenderImageHost::mWrCompositableHolder, but it is the same in this case so it is "okay"), followed by the releasing of the first image layer (clearing WebRenderImageHost::mWrCompositableHolder when it releases its own external image ID and CompositableHandle pairing).
Assignee | ||
Comment 8•8 years ago
|
||
The GPU process also crashed on Windows 10 for me, although it did not produce a crash report for that, so I'm not sure if it is the same problem -- instead the crash report that was generated was for the reinit failure (see bug 1365009).
Assignee | ||
Comment 9•8 years ago
|
||
Note that bug 1364302 changed the details (WebRenderImageHost now holds a pointer to WebRenderBridgeParent rather than WebRenderCompositableHolder) but it still crashes.
Assignee | ||
Comment 10•8 years ago
|
||
Reporter | ||
Comment 11•8 years ago
|
||
(In reply to Andrew Osmond [:aosmond] from comment #10)
> Created attachment 8868129 [details] [diff] [review]
> Allow multiple external image IDs to bind to same image host if for same
> WRBridge, v1
>
> try:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=83e174ce540b024c9d426ac494684b4c8963fc65
I tried "Linux x64 debug" with webrender enabled and both gpu-process enabled+disabled: With that build I am able to open videos, switch tabs and make them fullscreen (and back to normal) as crazy. This seems to fix the STR of bug 1360613. I don't know if your patch from bug 1365009 comment 5 was also included in this try build.
Reporter | ||
Comment 12•8 years ago
|
||
(In reply to Darkspirit from comment #11)
> with webrender enabled and both gpu-process enabled
layers.gpu-process.enabled;true
layers.gpu-process.force-enabled;false
layers.gpu-process.max_restarts;1
The GPU process seems to not crash and doesn't get disabled by runtime anymore.
Assignee | ||
Comment 13•8 years ago
|
||
(In reply to Darkspirit from comment #11)
> (In reply to Andrew Osmond [:aosmond] from comment #10)
> > Created attachment 8868129 [details] [diff] [review]
> > Allow multiple external image IDs to bind to same image host if for same
> > WRBridge, v1
> >
> > try:
> > https://treeherder.mozilla.org/#/
> > jobs?repo=try&revision=83e174ce540b024c9d426ac494684b4c8963fc65
>
> I tried "Linux x64 debug" with webrender enabled and both gpu-process
> enabled+disabled: With that build I am able to open videos, switch tabs and
> make them fullscreen (and back to normal) as crazy. This seems to fix the
> STR of bug 1360613. I don't know if your patch from bug 1365009 comment 5
> was also included in this try build.
Great! Yes, the patch from bug 1365009 was included in this build, but it shouldn't matter, as it is only relevant if the GPU process crashes. With this bug's patch, we avoid that.
Reporter | ||
Comment 14•8 years ago
|
||
Extreme test: you can even make a <div> fullscreen, ... with a video in it, then make the video fullscreen, then close video fullscreen, you are now back in the fullscreen <div>, then click on "toggle fullscreen" and you are back on your desktop. http://html5-demos.appspot.com/static/fullscreen.html (the left example). No crashes, nothing disabled by runtime.
Assignee | ||
Comment 15•8 years ago
|
||
Comment on attachment 8868129 [details] [diff] [review]
Allow multiple external image IDs to bind to same image host if for same WRBridge, v1
Sotaro, I'm not sure if this is the best way to fix it, but it does work. An alternative would be to see if the layer manager can free the original layer sooner since it appears to happen in succession after creating the new layer, but I wasn't up for untangling that last night (assuming it is feasible). Let me know if prefer another approach.
Attachment #8868129 -
Flags: review?(sotaro.ikeda.g)
Comment 16•8 years ago
|
||
It seems that we could handle the situation simper way like ImageHost::Composite() does.
https://dxr.mozilla.org/mozilla-central/source/gfx/layers/composite/ImageHost.cpp#213
By the way, gecko media framework permits that async ImageContainer is used by multiple ImageLayers at the same time.
Comment 17•8 years ago
|
||
Comment 18•8 years ago
|
||
Comment on attachment 8868376 [details] [diff] [review]
patch - Call SetWrBridge() before GetAsTextureHostForComposite()
:aosmond, how about this patch?
Attachment #8868376 -
Flags: feedback?(aosmond)
Comment 19•8 years ago
|
||
(In reply to Sotaro Ikeda [:sotaro] from comment #16)
> By the way, gecko media framework permits that async ImageContainer is used
> by multiple ImageLayers at the same time.
Sorry, correction:
gecko media framework permits that TextureHost(Image) is used by multiple ImageLayers at the same time.
Assignee | ||
Comment 20•8 years ago
|
||
Comment on attachment 8868376 [details] [diff] [review]
patch - Call SetWrBridge() before GetAsTextureHostForComposite()
Review of attachment 8868376 [details] [diff] [review]:
-----------------------------------------------------------------
Sotaro, but what about this call path:
ImageBridgeParent::RecvUpdate
CompositableParentManager::ReceiveCompositableUpdate
WebRenderImageHost::UseTextureHost
mWrCompositableHolder->CompositeUntil
Combining this patch with the race happening in this bug means we have a period where mWrBridge is nullptr (after the second external image ID is created, and the first external image ID being removed, but before we call ProcessWebRenderCommands), and we may attempt to update the video.
This doesn't seem terribly important right now, as WebRenderCompositableHolder::GetCompositeUntilTime is unused (thus making mCompositeUntilTime effectively write only, as its value can only impact its own state through CompositeUtil and SetCompositionTime). However I note in bug 1359206 you added:
http://searchfox.org/mozilla-central/diff/288058eab96f2700e5bf5278721cefc5ce1a88e3/gfx/layers/wr/WebRenderBridgeParent.cpp#724
Which may become important later?
Attachment #8868376 -
Flags: feedback?(aosmond)
Assignee | ||
Updated•8 years ago
|
Flags: needinfo?(sotaro.ikeda.g)
Comment 21•8 years ago
|
||
(In reply to Andrew Osmond [:aosmond] from comment #20)
>
> This doesn't seem terribly important right now, as
> WebRenderCompositableHolder::GetCompositeUntilTime is unused (thus making
> mCompositeUntilTime effectively write only, as its value can only impact its
> own state through CompositeUtil and SetCompositionTime). However I note in
> bug 1359206 you added:
>
> http://searchfox.org/mozilla-central/diff/
> 288058eab96f2700e5bf5278721cefc5ce1a88e3/gfx/layers/wr/WebRenderBridgeParent.
> cpp#724
>
> Which may become important later?
Yes, I forgot to thinks about it. Thanks!
Flags: needinfo?(sotaro.ikeda.g)
Comment 22•8 years ago
|
||
It seems that there seems no use case that WebRenderImageHost::mWrBridge is changed to another valid WebRenderBridgeParent for now. Then attachment 8868129 [details] [diff] [review] works good:)
Updated•8 years ago
|
Attachment #8868376 -
Attachment is obsolete: true
Updated•8 years ago
|
Attachment #8868129 -
Flags: review?(sotaro.ikeda.g) → review+
Comment 23•8 years ago
|
||
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/projects/graphics/rev/954c86fc79b0
Allow multiple external image IDs to be bound to the same image host if owned by the same WRBridge. r=sotaro
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Updated•8 years ago
|
status-firefox53:
--- → unaffected
status-firefox54:
--- → unaffected
status-firefox55:
--- → fixed
status-firefox-esr52:
--- → unaffected
Target Milestone: --- → mozilla55
Comment 24•8 years ago
|
||
bugherder |
Reporter | ||
Updated•8 years ago
|
Status: RESOLVED → VERIFIED
You need to log in
before you can comment on or make changes to this bug.
Description
•