Closed Bug 1432375 Opened 6 years ago Closed 6 years ago

image.mem.shared: black bookmarks folder

Categories

(Core :: Graphics: WebRender, defect, P1)

x86_64
Linux
defect

Tracking

()

RESOLVED FIXED
mozilla60
Tracking Status
firefox-esr52 --- unaffected
firefox58 --- unaffected
firefox59 --- unaffected
firefox60 --- disabled

People

(Reporter: jan, Assigned: aosmond)

References

(Blocks 1 open bug)

Details

(Keywords: crash, nightly-community)

Attachments

(5 files)

Nightly 60 x64 20180122220231 de_DE @ Debian Testing (KDE, Radeon RX480)
main profile

I opened a bookmarks folder and it was black. I refreshed about:support and opened another bookmarks folder. This is not new, but it is more painful without the gpu process.

I'll try to summarize it again.

See attached screenshot.

If I click into the black area and open a new tab afterwards, I get bug 1418201.

If I don't click into it, I can get a crash anyway if I open a new tab.
bp-7b88257c-6450-41b6-93c7-bb08a0180123.

If I hover over other folders and then come back to it, I see its entries, but often some icons are missing. The missing icons are not always the same ones. bug 1421556

I think I overlooked "Failed to lock new back buffer" in the past, or it is rather new. So far I have been unable to reproduce this with a fresh profile. Maybe that is just because I have so many bookmarks.

Without the gpu process, my main window can freeze completely in the bookmarks menu. I have a screenshot from when I tried to open my bookmarks menu via the bookmarks menu button and the main window froze. In my process list I saw a "WRRenderBackend" process. Right now I am also not using a gpu process and don't see such a "WRRenderBackend" process.

(I think I get similar reports when restoring a (perhaps multiple window) session after a crash or in some situations when opening multiple tabs fast. I don't know.)

bp-7b88257c-6450-41b6-93c7-bb08a0180123 23.01.18 04:41 [@ mozilla::ipc::FatalError | mozilla::dom::ContentChild::FatalErrorIfNotUsingGPUProcess | mozilla::layers::PCompositorBridgeChild::SendPWebRenderBridgeConstructor ]
> IPDL error [PCompositorBridgeChild]: "constructor for actor failed". abort()ing as a result.
bp-7aab8d44-2af3-4263-b6d8-ac06e0180123 23.01.18 04:38 [@ @0x7f90fc2791c3 ]
> MOZ_CRASH(Failed to create top level actor for PProfiler!)
> |[0][GFX1-]: Failed to lock new back buffer. (t=2772.99) |[211][GFX1-]: Failed to lock new back buffer. (t=2848) |[212][GFX1-]: Failed to lock new back buffer. (t=2852.49) |[213][GFX1-]: Failed to lock new back buffer. (t=2852.5) |[214][GFX1-]: Failed to lock new back buffer. (t=2852.5) |[215][GFX1-]: Failed to lock new back buffer. (t=2852.52) |[201][GFX1-]: Failed to lock new back buffer. (t=2844.51) |[202][GFX1-]: Failed to lock new back buffer. (t=2847.26) |[203][GFX1-]: Failed to lock new back buffer. (t=2847.27) |[204][GFX1-]: Failed to lock new back buffer. (t=2847.28) |[205][GFX1-]: Failed to lock new back buffer. (t=2847.58) |[206][GFX1-]: Failed to lock new back buffer. (t=2847.58) |[207][GFX1-]: Failed to lock new back buffer. (t=2847.91) |[208][GFX1-]: Failed to lock new back buffer. (t=2847.96) |[209][GFX1-]: Failed to lock new back buffer. (t=2847.97) |[210][GFX1-]: Failed to lock new back buffer. (t=2848) 
https://dxr.mozilla.org/mozilla-central/source/gfx/layers/client/ContentClient.cpp#263

-----------------------------------------------------------------------------------------
bp-18b2291b-7120-428c-bfc0-4f42f0180123 23.01.18 03:46 bug 1431448 [@ mozalloc_abort | abort | rayon_core::job::{{impl}}::execute<T> ]
> index out of bounds: the len is 0 but the index is 0
bp-a37b7049-9eaa-42e2-b19f-3195b0180123 23.01.18 03:46 [@ @0x40829e ]
> index out of bounds: the len is 0 but the index is 0
-----------------------------------------------------------------------------------------
bp-5c461c56-b160-4487-9385-9bd280180122 23.01.18 00:30 [@ mozilla::ipc::FatalError | mozilla::dom::ContentChild::FatalErrorIfNotUsingGPUProcess | mozilla::layers::PCompositorBridgeChild::SendPWebRenderBridgeConstructor ]
> IPDL error [PCompositorBridgeChild]: "constructor for actor failed". abort()ing as a result.
-----------------------------------------------------------------------------------------
bp-d048a234-9b93-443a-8a88-e69890180122 23.01.18 00:25 bug 1425181 [@ mozalloc_abort | abort | webrender::prim_store::PrimitiveStore::prepare_prim_for_render_inner ]
-----------------------------------------------------------------------------------------
bp-2f9a3d0f-401d-4704-8074-c96770180122 23.01.18 00:24 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> assertion failed: self.font_contexts.lock_shared_context().has_font(&font.font_key)
-----------------------------------------------------------------------------------------
bp-4458370a-2af2-48ef-ad70-ac8000180122 23.01.18 00:20
bp-8117ec2b-e1bb-4d04-8ac1-bb6390180122 23.01.18 00:20 [@ @0x408304 ]
> index out of bounds: the len is 0 but the index is 0
-----------------------------------------------------------------------------------------
bp-4f2f1d14-d94f-4c2a-81b8-9d6650180122 23.01.18 00:14
bp-be6d8d2b-e5a0-4a63-aef9-545710180122 23.01.18 00:14
bp-2ed29d5a-4937-4430-b1a3-a21560180122 23.01.18 00:14
bp-e1c9d94b-28e2-49ad-8dea-325bd0180122 23.01.18 00:14
bp-09105584-fd1c-459c-bfde-c993f0180122 23.01.18 00:14
bp-ec938712-caf7-4677-a639-d4e630180122 23.01.18 00:14
bp-3a683353-d293-41cb-8a23-e54810180122 23.01.18 00:14
bp-8864f569-a6e7-4559-850b-f4af00180122 23.01.18 00:14
bp-f9ac3fbd-01f0-47d6-8dfc-e21660180122 23.01.18 00:14
bp-b0352f1d-d42b-404f-a0ba-4fcf90180122 23.01.18 00:14
bp-e708a5d9-e08e-40ef-8072-f17210180122 23.01.18 00:14 bug 1425181 [@ mozalloc_abort | abort | webrender::prim_store::PrimitiveStore::prepare_prim_for_render_inner ]
> |[0][GFX1-]: Failed buffer for 0, 0, 263, 322 (t=23.1343) |[76]CP+[GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.8976) |[77]CP+[GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.8977) |[78]CP+[GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.8979) |[79][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.8992) |[80][GFX1-]: Could not create content compositor bridge: 0x80610002 (t=70.9862) |[81][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.9866) |[82][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.9869) |[83][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.9871) |[84][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.9875) |[85][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.9877) |[71]CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=70.8747) |[72]CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=70.8748) |[73]CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=70.8748) |[74]CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=70.8749) |[75]CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=70.8749) 
> assertion failed: self.font_contexts.lock_shared_context().has_font(&font.font_key)

bp-9525e94c-69dd-4693-b988-30a100180122 23.01.18 00:14 bug 1418201 [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]

bp-d0406f78-9c0f-4b5b-83a0-2aa750180122 23.01.18 00:14
bp-25fa6704-658a-4853-8ebf-5718c0180122 23.01.18 00:14
bp-f5f0631c-8561-4d26-8edf-0ea620180122 23.01.18 00:14
bp-121cce78-299e-41cd-8fd8-bf9680180122 23.01.18 00:14 bug 1425181 [@ mozalloc_abort | abort | webrender::prim_store::PrimitiveStore::prepare_prim_for_render_inner ]
> assertion failed: self.font_contexts.lock_shared_context().has_font(&font.font_key)

bp-01b77ed8-ad60-4589-bc70-1517a0180122 23.01.18 00:14 bug 1418201 [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
> |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=65.1923)
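The graphics critical-log strings quoted in these reports are long and arrive out of order, which makes it hard to see which failure came first. The following is a minimal sketch of a helper that groups such a string by message and orders the groups by their first timestamp. The entry pattern is inferred only from the lines quoted in this bug (an index, an optional prefix such as CP+, the [GFX1-] tag, a message and a t= value); `ENTRY` and `summarize` are hypothetical names and not part of any Mozilla tool.

```python
import re
from collections import Counter

# Pattern inferred from the strings quoted above, for example:
#   |[76]CP+[GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.8976)
ENTRY = re.compile(r"\|\[(C?\d+)\]([A-Z]+\+)?\[GFX1-\]: (.*?) \(t=([\d.]+)\)")


def summarize(critical_log):
    """Return (message, count, first_time) rows ordered by first_time."""
    counts = Counter()
    first_seen = {}
    for match in ENTRY.finditer(critical_log):
        message = match.group(3)
        timestamp = float(match.group(4))
        counts[message] += 1
        # Remember the earliest timestamp at which each message appeared.
        first_seen[message] = min(timestamp, first_seen.get(message, timestamp))
    rows = [(msg, counts[msg], first_seen[msg]) for msg in counts]
    return sorted(rows, key=lambda row: row[2])


if __name__ == "__main__":
    sample = (
        "|[0][GFX1-]: Failed buffer for 0, 0, 263, 322 (t=23.1343) "
        "|[76]CP+[GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=70.8976) "
        "|[71]CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=70.8747) "
    )
    for message, count, first_time in summarize(sample):
        print(f"{first_time:>10.4f}  x{count}  {message}")
```

Sorting by first timestamp puts the allocation failures ahead of the IPC shutdown messages in the sample, which is the order that matters when looking for a root cause.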
(includes the latest webrender update)
bp-9b09445c-7904-4210-ac96-81ce30180131 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> assertion failed: self.font_contexts.lock_shared_context().has_font(&font.font_key)
(clicked into black bookmarks folder after switching webrender.blob-images to 0)
bp-1a1eb9b3-ef32-4b69-b8d3-9170b0180131 bug 1425181 [@ mozalloc_abort | abort | webrender::prim_store::PrimitiveStore::prepare_prim_for_render_inner ]
> assertion failed: self.font_contexts.lock_shared_context().has_font(&font.font_key)
See Also: → 1425181, 1418201, 1421556, 1431448
This got worse because there is less information:
* Once I got a crash without any other information when clicking into a black bookmarks folder:
  bp-9788063f-f555-4e28-a67d-1d1980180201 01.02.18 18:29
* About 50% of the time the crash reporter is not able to make a crash report at all.
* About 50% of the time Nightly freezes completely.

(As I'm not using the GPU process for some days now, I'm talking about browser crashes. I will re-enable the GPU process now.)
Conclusion: It only got worse without the gpu process. With the gpu process everything seems to be the same.

-----

With gpu process after hovering some black bookmarks folders:

about:support:
> (#0) Error	Failed buffer for 0, 0, 429, 443
> (#52) Error	Failed buffer for 0, 0, 429, 160
> (#53) Error	Failed buffer for 0, 0, 429, 403
> (#54) Error	Failed buffer for 0, 0, 429, 160
> (#55) Error	Failed buffer for 0, 0, 429, 160
> (#56) Error	Failed buffer for 0, 0, 429, 403
> (#57) Error	Failed buffer for 0, 0, 429, 160
> (#58) Error	Failed buffer for 0, 0, 429, 403
> (#59) Error	Failed buffer for 0, 0, 429, 403
> (#60) Error	Failed buffer for 0, 0, 429, 403
> (#61) Error	Failed buffer for 0, 0, 429, 403
> (#62) Error	Failed buffer for 0, 0, 429, 403
> (#63) Error	Failed buffer for 0, 0, 429, 403
> (#64) Error	Failed buffer for 0, 0, 429, 403
> (#65) Error	Failed buffer for 0, 0, 429, 403
> (#66) Error	Failed buffer for 0, 0, 429, 403

-----

After clicking on a black bookmarks item:
about:support
> (#0) Error	Failed buffer for 0, 0, 429, 443
> (#99) Error	Receive IPC close with reason=AbnormalShutdown
> (#100) Error	Receive IPC close with reason=AbnormalShutdown
> (#101) Error	Receive IPC close with reason=AbnormalShutdown
> (#102) Error	Receive IPC close with reason=AbnormalShutdown
> (#103) Error	Receive IPC close with reason=AbnormalShutdown
> (#104) Error	Receive IPC close with reason=AbnormalShutdown
> (#105) Error	Receive IPC close with reason=AbnormalShutdown
> (#106) Error	Receive IPC close with reason=AbnormalShutdown
> (#107) Error	Receive IPC close with reason=AbnormalShutdown
> (#108) Error	Receive IPC close with reason=AbnormalShutdown
> (#109) Error	Receive IPC close with reason=AbnormalShutdown
> (#110) Error	Receive IPC close with reason=AbnormalShutdown
> (#111) Error	Receive IPC close with reason=AbnormalShutdown
> (#112) Error	Receive IPC close with reason=AbnormalShutdown
> (#113) 	CP+[GFX1-]: Receive IPC close with reason=AbnormalShutdown

I got an HTTP Basic auth dialog from that bookmark (as intended): It was black for 2 seconds.
No crash report.

-----

When clicking on a black bookmarks item after restarting Nightly:

Error 404 (that website is gone): Nearly everything in that tab is invisible: I just see 3 <li> bullets and a blue button.
All app tabs shine blue on their bottom and have a "World" icon.
I opened a new tab and got a tab crash. I opened another tab with about:support and saw this:

> (#0) Error	Failed buffer for 0, 0, 429, 1440
> (#199) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#200) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#201) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#202) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#203) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#204) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#205) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#206) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#207) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#208) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#209) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#210) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#211) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#212) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0
> (#213) 	CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0

There are 2 unsent crash reports. (Click, click, to send them. Browser crash. One got sent, the other did not. This has been happening for months.)
Reports are:
bp-29f6a7e5-ae4d-4c78-bf00-9e14e0180201 01.02.18 19:46 bug 1425181 [@ mozalloc_abort | abort | webrender::prim_store::PrimitiveStore::prepare_prim_for_render_inner ]
> assertion failed: self.font_contexts.lock_shared_context().has_font(&font.font_key)
bp-6c31ac17-1dce-4764-842d-7f16e0180201 01.02.18 19:45 [@ InvalidArrayIndex_CRASH | mozilla::dom::ContentChild::RecvReinitRendering ]
> |[C0][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.462) |[C46][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4765) |[C47][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4767) |[C48][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4769) |[C49][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4771) |[C50][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4773) |[C51][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4776) |[C52][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4778) |[C53][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.478) |[C54][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4782) |[C55][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4784) |[C56][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4786) |[C57][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4789) |[C58][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4791) |[C59][GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=13.4794) |[C60][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=13.5878) 
= bug 1418201
(Now with gpu process like comment 4 again because there wouldn't be any information without it like in comment 3.)

I think the crash reason has changed:
The font thing seems to be gone and this is new:
MOZ_RELEASE_ASSERT(result.mFd.fd != -1) (DuplicateDescriptor failed)

bp-3c2b691d-e86c-4506-9041-176390180202 02.02.18 06:33 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> browser crash. clicked on black bookmarks folder item
> MOZ_RELEASE_ASSERT(result.mFd.fd != -1) (DuplicateDescriptor failed)
> |[0][GFX1-]: Failed buffer for 0, 0, 429, 781 (t=62.7417) 
https://dxr.mozilla.org/mozilla-central/rev/bda9adefe73902685d6689a205e7114ae9df7f83/ipc/glue/Transport_posix.cpp#82
https://dxr.mozilla.org/mozilla-central/rev/bda9adefe73902685d6689a205e7114ae9df7f83/ipc/glue/ProtocolUtils.h#832

bp-aeed6d22-9ccb-4e6a-8407-4c1d10180202 02.02.18 06:32
bp-ed76d4e6-a8f5-440c-8504-1528e0180202 02.02.18 06:32
bp-1a6ed643-0836-4ab6-8f2b-af3d00180202 02.02.18 06:32
bp-9fbda480-17a6-477a-801b-6d7de0180202 02.02.18 06:32
bp-9658962b-0d69-40b8-8b04-9754c0180202 02.02.18 06:31 bug 1425181 [@ mozalloc_abort | abort | webrender::prim_store::PrimitiveStore::prepare_prim_for_render_inner ]
> |[0][GFX1-]: Failed buffer for 0, 0, 429, 214 (t=331.599) |[241][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=362.711) |[227]CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=362.612) |[228]CP+[GFX1-]:

bp-e8857352-420f-4bac-89b4-c081c0180202 02.02.18 06:25 [@ EMPTY: no crashing thread identified; ERROR_NO_THREAD_LIST ]
> hovered black bookmarks folder
> MOZ_RELEASE_ASSERT(result.mFd.fd != -1) (DuplicateDescriptor failed)
> |[0][GFX1-]: Failed buffer for 0, 0, 413, 916 (t=10.8699) |[76]CP+[GFX1-]: ShmSegmentsWriter failed to allocate chunk #0 (t=12.6176)
I've set gfx.webrender.blob-images to 0 and saw the self.font_contexts.lock_shared_context().has_font(&font.font_key) thing again.

But with image.mem.shared;0 I could not reproduce so far. I've re-enabled blob-images and it seems to be still unreproducible.
Omg, so embarrassing that I hadn't tested this. I will leave it disabled.
Now that I suspect image.mem.shared, I wanted to explore this a bit more:

The bookmarks menu could be completely white in the past, sometimes it was just slow.
I am ignoring crashes and focusing only on black bookmark menus; a white menu counts as good.

mozregression --good 2017-10-20 --bad 2018-01-10 --profile ~/main-profile-copy --pref layers.acceleration.force-enabled:true gfx.webrender.enabled:true gfx.webrendest.enabled:true gfx.webrender.layers-free:true gfx.webrender.blob-images:false gfx.webrender.all:false image.mem.shared:true layout.display-list.retain:false
> 14:41.28 INFO: Last good revision: 6b8a627984d8e475fb3f24421f6edd9a7d41b5a8
> 14:41.28 INFO: First bad revision: 699d482c86c9fab9ed2e5b51dd1369e6bee90a5c
> 14:41.28 INFO: Pushlog:
> https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=6b8a627984d8e475fb3f24421f6edd9a7d41b5a8&tochange=699d482c86c9fab9ed2e5b51dd1369e6bee90a5c

> 699d482c86c9	Kevin Chen — Bug 1414853 - Ensure LayerManager's backend type is LAYERS_WR in CanUseAdvancedLayer since BasicCompositor might be used for remote extension process; r=sotaro

Before the patch: sometimes white empty menus
After the patch: sometimes black empty menus

testing "last good", but with image.mem.shared:FALSE:
mozregression --repo autoland --launch 6b8a627984d8e475fb3f24421f6edd9a7d41b5a8 --profile ~/main-profile-copy --pref layers.acceleration.force-enabled:true gfx.webrender.enabled:true gfx.webrendest.enabled:true gfx.webrender.layers-free:true gfx.webrender.blob-images:false gfx.webrender.all:false image.mem.shared:false layout.display-list.retain:false

-> Neither black nor white menus. Not reproducible. But many bookmark icons don't show up. Some appear after a delay. The Youtube icon is completely missing.


testing "first bad", but with image.mem.shared:FALSE:
mozregression --repo autoland --launch 699d482c86c9fab9ed2e5b51dd1369e6bee90a5c --profile ~/main-profile-copy --pref layers.acceleration.force-enabled:true gfx.webrender.enabled:true gfx.webrendest.enabled:true gfx.webrender.layers-free:true gfx.webrender.blob-images:false gfx.webrender.all:false image.mem.shared:false layout.display-list.retain:false

-> Neither black nor white menus. Not reproducible. But nearly all bookmark icons show up immediately.

I repeated the last two tests 3 times and results were the same. I also ran them again with image.mem.shared:true and confirmed the color change.

Summary:
Above patch changed the color of this issue from white to black when image.mem.shared is enabled.
I can't reproduce with image.mem.shared:false (or 0).
With image.mem.shared disabled, many bookmark icons did not show up before the patch, but they do after it.
I feel totally uncomfortable describing my possibly wrong perception. But that's just what I've perceived.
With image.mem.shared;0 I still couldn't reproduce the black bookmarks menu issue.

OT: But the video from https://www.zdf.de/gesellschaft/plan-b/plan-b-weniger-ist-mehr-100.html suddenly stopped playing and I saw "ShmSegmentsWriter failed to allocate chunk #0" (and "Receive IPC close with reason=AbnormalShutdown") on about:support, but without any crash. So this message is not restricted to this bookmarks menu bug and the old history sidebar bug (bug 1418201).
OT: I discovered XRender, and that it doesn't like the gpu process: bug 1435586 (So I had to disable the gpu process for the following test.)

This is a screenshot of how it looks with webrender.all + xrender (main profile).
No failures on about:support, WebRender is active, and I get a browser crash without any report after opening a tab. So it seems to be the same behavior as without xrender in comment 3.

Now I will set image.mem.shared back to 0.
finally some reports with image.mem.shared;2 + xrender:

bp-8d5b0ae9-0f3c-4098-8607-4605f0180204 04.02.18 05:49 [@ @0x40851c ]
> assertion failed: self.font_contexts.lock_shared_context().has_font(&font.font_key)
> |[0][GFX1-]: Failed to lock new back buffer. [...]

bp-80e0c943-a5af-4ecb-a209-115100180204 04.02.18 05:58 [@ mozilla::ipc::FatalError | mozilla::dom::ContentChild::FatalErrorIfNotUsingGPUProcess | mozilla::layers::PCompositorBridgeChild::SendPWebRenderBridgeConstructor ]
> |[C0][GFX1-]: Receive IPC close with reason=AbnormalShutdown (t=36.2158) 

Flipped xrender back to false and the visual issue was a black rectangle again.
Summary: black bookmarks folder → image.mem.shared: black bookmarks folder
I saw this once:
> GP+[GFX1-]: CompositableHost does not exist for extId:25769803821
https://searchfox.org/mozilla-central/rev/e06af9c36a73a27864302cd2f829e6200dee8541/gfx/layers/wr/WebRenderBridgeParent.cpp#400
Attached file firefox.debug.log
Made with:
mozregression --launch 2018-02-04 -B debug --profile ~/main-profile-copy --profile-persistence reuse --pref gfx.webrender.all:true image.mem.shared:2 > ~/firefox.debug.log
Attachment #8948300 - Attachment mime type: text/x-log → text/plain
Could you have a look, please? You implemented bug 1339202.

> With image.mem.shared;0 I still couldn't reproduce the black bookmarks menu issue.

from attachment 8948300 [details]
> 0:54.52 INFO: [Parent 4526, ImgDecoder #2] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/shared_memory_posix.cc, line 212
> 0:54.52 INFO: [Parent 4526, ImgDecoder #2] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/shared_memory_posix.cc, line 216
> 0:54.52 INFO: [Parent 4526, ImgDecoder #2] WARNING: '!mBuf->Map(len)', file /builds/worker/workspace/build/src/gfx/layers/SourceSurfaceSharedData.cpp, line 64
> 0:54.52 INFO: [Parent 4526, ImgDecoder #2] WARNING: imgFrame::Init should succeed: file /builds/worker/workspace/build/src/image/Decoder.cpp, line 343
> 0:54.52 INFO: [Parent 4526, ImgDecoder #4] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/shared_memory_posix.cc, line 212
> 0:54.52 INFO: [Parent 4526, ImgDecoder #4] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/shared_memory_posix.cc, line 216
> 0:54.53 INFO: [Parent 4526, ImgDecoder #1] WARNING: '!mBuf->Create(len)', file /builds/worker/workspace/build/src/gfx/layers/SourceSurfaceSharedData.cpp, line 63

> 0:54.77 INFO: [Parent 4526, Main Thread] WARNING: Attempt to Lock a texture that is being read by the compositor!: file /builds/worker/workspace/build/src/gfx/layers/client/TextureClient.cpp, line 510
> 0:54.77 INFO: [GFX1-]: Failed to lock new back buffer.

> 0:55.66 INFO: Sandbox: Unexpected EOF, op 0 flags 00 path /dev/urandom
> 0:55.66 INFO: [Child 4605, Socket Thread] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/rand_util_posix.cc, line 21
> 0:55.66 INFO: [Child 4605, Socket Thread] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/rand_util_posix.cc, line 25
> 0:55.66 INFO: Sandbox: bad read from pid 4605: Message too long
> 0:55.66 INFO: [Child 4605, Chrome_ChildThread] WARNING: pipe error (41): Connection reset by peer: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 353
> 0:55.66 INFO: [Parent 4526, Gecko_IOThread] WARNING: Message needs unreceived descriptors channel:7f527072f000 message-type:2883684 header()->num_fds:1 num_fds:0 fds_i:0: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 515
> 0:55.66 INFO: Hit MOZ_CRASH(Aborting on channel error.) at /builds/worker/workspace/build/src/ipc/glue/MessageChannel.cpp:2534

> 0:55.66 INFO: #11: ??? (???:???)
> 0:55.73 INFO: [Parent 4526, Main Thread] WARNING: 'NS_FAILED(rv)', file /builds/worker/workspace/build/src/dom/indexedDB/ActorsParent.cpp, line 21370
> 0:55.79 INFO: [Parent 4526, ImgDecoder #1] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/shared_memory_posix.cc, line 212
> 0:55.79 INFO: [Parent 4526, ImgDecoder #1] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/shared_memory_posix.cc, line 216
> 0:55.79 INFO: [Parent 4526, ImgDecoder #1] WARNING: '!mBuf->Map(len)', file /builds/worker/workspace/build/src/gfx/layers/SourceSurfaceSharedData.cpp, line 64
> 0:55.79 INFO: [Parent 4526, ImgDecoder #1] WARNING: imgFrame::Init should succeed: file /builds/worker/workspace/build/src/image/Decoder.cpp, line 343
> 0:55.79 INFO: 
> 0:55.79 INFO: ###!!! [Parent][RunMessage] Error: Channel error: cannot send/recv

> 1:03.09 INFO: ###!!! [Parent][MessageChannel] Error: (msgtype=0x2C0041,name=PContent::Msg_LoadProcessScript) Channel error: cannot send/recv
> 1:03.09 INFO: 
> 1:03.09 INFO: [Parent 4526, Main Thread] WARNING: NS_ENSURE_TRUE(mCallback->DoLoadMessageManagerScript(aURL, aRunInGlobalScope)) failed: file /builds/worker/workspace/build/src/dom/base/nsFrameMessageManager.cpp, line 377
> 1:03.09 INFO: 

> 1:03.09 INFO: [Parent 4526, Main Thread] WARNING: Unable to create pipe named "4526.7.846737469" in server mode error(Too many open files).: file /builds/worker/workspace/build/src/ipc/chromium/src/chrome/common/ipc_channel_posix.cc, line 203
Flags: needinfo?(aosmond)
Assignee: nobody → aosmond
I'm going to assume for the moment that it is not shared surfaces that is causing the problem directly. Instead shared surfaces caused us to consume too many file handles at once for shared memory due to how it was implemented. Running out of handles inspires all sorts of weird behaviour for me, although I am unable to reproduce the exact behaviour reported (the icons go missing, but nothing goes black).

With that in mind, shared surfaces currently keep the file handle around until the image has finished decoding. For the simple test case of setting a low file handle limit via "ulimit -Sn <num>" in the shell, going to images.google.ca and searching for cats, I need a minimum limit of around 200 to start. Coercing shared surfaces to share themselves right away and then close the handle brings that down to 20.
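To see what hitting the limit looks like outside the browser, here is a minimal sketch that lowers the soft descriptor limit from inside a process (the same limit that "ulimit -Sn" sets for a shell) and opens descriptors until the kernel refuses. It only illustrates the failure mode and says nothing about how Gecko allocates handles; `open_until_exhausted` is a hypothetical helper name and the limit of 64 is an arbitrary value for the demo.

```python
import errno
import os
import resource


def open_until_exhausted(soft_limit):
    """Lower the soft RLIMIT_NOFILE, then open descriptors until it fails.

    Returns (number_of_descriptors_opened, errno_of_the_failure).
    """
    old_soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    if hard != resource.RLIM_INFINITY:
        soft_limit = min(soft_limit, hard)  # soft may not exceed hard
    resource.setrlimit(resource.RLIMIT_NOFILE, (soft_limit, hard))
    fds = []
    failure = None
    try:
        while True:
            fds.append(os.open(os.devnull, os.O_RDONLY))
    except OSError as exc:
        failure = exc.errno
    finally:
        # Release everything and restore the previous soft limit.
        for fd in fds:
            os.close(fd)
        resource.setrlimit(resource.RLIMIT_NOFILE, (old_soft, hard))
    return len(fds), failure


if __name__ == "__main__":
    opened, failure = open_until_exhausted(64)
    print(opened, errno.errorcode.get(failure), os.strerror(failure))
```

The error reported here is EMFILE, which the C library renders as "Too many open files". The number of descriptors opened is below the limit because stdin, stdout, stderr and anything else the process already holds count against it too.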

I'll be attaching patches which address this shortly. They make recovering from a crashed GPU process slightly worse (more memcpys), but it does make the code simpler.

Additionally, I noted that bookmark icons are all decoded in the UI process. If it is a combined UI/GPU process, then using shared surfaces is a bit overkill. Ideally imagelib would know to decode into the right kind of buffer based on the location of the compositor thread and the process from which it is decoding. That is just an optimization on top of the fix, however, so I'll defer it until later.
Status: NEW → ASSIGNED
Flags: needinfo?(aosmond)
Add SharedSurfacesChild::Share variant which allows non-main thread callers to request a shared surface be shared and nothing else (no image key generation, etc). This will be useful for the decoder threads which create the buffer, but have no idea when it will be used, and for what WebRenderBridgeChild, etc.
Attachment #8949508 - Flags: review?(nical.bugzilla)
Make the decoder threads force a share when the surface is created, to minimize the time we have a handle open. Also close a handle if we fail to share the surface for any reason; we already did if we got to creating a HandleLock on the stack in SharedSurfacesChild::Share(Internal), but didn't if we couldn't get the CompositorManagerChild instance. This lets us remove some other plumbing in imagelib to ensure we released the handle if the image was used in fallback decoding instead.
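The handle discipline described above (share as soon as the surface exists, and drop the handle whether or not the share succeeded) can be illustrated with a small sketch. This is not the Gecko code: `SharedSurface`, `share` and the `send_handle` callback are hypothetical stand-ins, and a temporary file plays the role of the shared memory segment.

```python
import os
import tempfile


class SharedSurface:
    """Hypothetical stand-in for an image surface backed by a file handle."""

    def __init__(self, size):
        fd, path = tempfile.mkstemp()
        os.unlink(path)          # keep the file alive only through the handle
        os.ftruncate(fd, size)
        self.handle = fd         # stays open until shared or dropped

    def close_handle(self):
        if self.handle is not None:
            os.close(self.handle)
            self.handle = None


def share(surface, send_handle):
    """Hand the surface's handle to the other side, then always release it.

    send_handle is a hypothetical callback standing in for the IPC call;
    it may raise OSError, for example when no compositor is available.
    """
    try:
        send_handle(surface.handle)
        return True
    except OSError:
        return False
    finally:
        # Runs on success and on failure, so no path leaves the handle open.
        surface.close_handle()


if __name__ == "__main__":
    def no_compositor(fd):
        raise OSError("compositor not available")

    ok = SharedSurface(4096)
    failed = SharedSurface(4096)
    print(share(ok, lambda fd: None), ok.handle)
    print(share(failed, no_compositor), failed.handle)
```

Putting the release in a single finally block is the design choice that matters here: the caller no longer needs separate cleanup plumbing for the failure and fallback paths.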

try (normal): https://treeherder.mozilla.org/#/jobs?repo=try&revision=4b4caf3bf0a51b7e05370867c7d6494ea71000aa
try (force image.mem.shared w/ WR): https://treeherder.mozilla.org/#/jobs?repo=try&revision=7e0967445b305acfb7482ba74fc1d8667117c4a2
Attachment #8949509 - Flags: review?(nical.bugzilla)
(In reply to Andrew Osmond [:aosmond] from comment #16)
> try (force image.mem.shared w/ WR):
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=7e0967445b305acfb7482ba74fc1d8667117c4a2

Hmm, this revealed an existing failure (since tests don't normally run with it turned on). It looks to be a false alarm though: we were freeing a surface on the compositor thread while trying to shut down the compositor thread (hence the assert failing). I'll see about fixing this in another bug.
(In reply to Andrew Osmond [:aosmond] from comment #16)
> try (normal):
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=4b4caf3bf0a51b7e05370867c7d6494ea71000aa

mozregression --repo try --launch 4b4caf3bf0a51b7e05370867c7d6494ea71000aa -B debug --profile ~/main-profile-copy --profile-persistence reuse --pref gfx.webrender.all:true image.mem.shared:2 > ~/try.debug.log

-> everything good. just saw this (probably offtopic):

> 2:41.14 INFO: [Parent 5805, Main Thread] WARNING: Extra shutdown CC: 'i < NORMAL_SHUTDOWN_COLLECTIONS', file /builds/worker/workspace/build/src/xpcom/base/nsCycleCollector.cpp, line 3691
> 2:41.26 INFO: [Parent 5805, Main Thread] WARNING: Fonts still alive while shutting down gfxFontCache: 'mFonts.Count() == 0', file /builds/worker/workspace/build/src/gfx/thebes/gfxFont.cpp, line 217
> 2:41.27 INFO: Assertion failed at /builds/worker/workspace/build/src/gfx/cairo/cairo/src/cairo-hash.c:196: hash_table->live_entries == 0
> 2:41.27 INFO: WARNING: YOU ARE LEAKING THE WORLD (at least one JSRuntime and everything alive inside it, that is) AT JS_ShutDown TIME.  FIX THIS!

-----
with gpu process:
mozregression --repo try --launch 4b4caf3bf0a51b7e05370867c7d6494ea71000aa -B debug --profile ~/main-profile-copy --profile-persistence reuse --pref gfx.webrender.all:true image.mem.shared:2 layers.gpu-process.force-enabled:true > ~/try.gpu.debug.log
-> nothing bad
-----

Icons sometimes load with a short delay of maybe half a second. Looks normal and good. Can't reproduce the black menu issue. Thanks! :)
Sorry I haven't had time to look into this today but I will do the review Monday.
Attachment #8949508 - Flags: review?(nical.bugzilla) → review+
Attachment #8949509 - Flags: review?(nical.bugzilla) → review+
Pushed by aosmond@gmail.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/07e679fc2e70
Part 1. Add ability for SharedSurfacesChild callers besides display list building to share surfaces. r=nical
https://hg.mozilla.org/integration/mozilla-inbound/rev/87ba2465c82e
Part 2. Images decoded into an SourceSurfaceSharedData should be shared immediately. r=nical
https://hg.mozilla.org/mozilla-central/rev/07e679fc2e70
https://hg.mozilla.org/mozilla-central/rev/87ba2465c82e
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla60
(In reply to Jan Andre Ikenmeyer [:darkspirit] from comment #13)
> > 0:55.66 INFO: Sandbox: Unexpected EOF, op 0 flags 00 path /dev/urandom
> > 0:55.66 INFO: [Child 4605, Socket Thread] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/rand_util_posix.cc, line 21
> > 0:55.66 INFO: [Child 4605, Socket Thread] WARNING: file /builds/worker/workspace/build/src/ipc/chromium/src/base/rand_util_posix.cc, line 25
> > 0:55.66 INFO: Sandbox: bad read from pid 4605: Message too long

This looks like file descriptor exhaustion.
See Also: → 1401776
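One way to check a suspicion of file descriptor exhaustion on Linux is to compare how many descriptors a process holds with its soft limit. The sketch below reads both from /proc; it assumes a Linux system and permission to inspect the target pid, and the helper names `count_open_fds` and `soft_fd_limit` are made up for this example.

```python
import os


def count_open_fds(pid):
    """Count the entries in /proc/<pid>/fd (Linux only)."""
    return len(os.listdir(f"/proc/{pid}/fd"))


def soft_fd_limit(pid):
    """Return the soft 'Max open files' value from /proc/<pid>/limits.

    The value is returned as a string because it can be 'unlimited'.
    """
    with open(f"/proc/{pid}/limits") as limits:
        for line in limits:
            if line.startswith("Max open files"):
                # Columns: Max, open, files, <soft>, <hard>, <units>
                return line.split()[3]
    return None


if __name__ == "__main__":
    pid = os.getpid()  # replace with the pid of the process to inspect
    print(f"pid {pid}: {count_open_fds(pid)} open, soft limit {soft_fd_limit(pid)}")
```

Running this against the parent process while hovering the bookmarks menu would show whether the count climbs toward the limit.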
(In reply to Jed Davis [:jld] (⏰UTC-6) from comment #23)
> This looks like file descriptor exhaustion.

= comment 14 ("consume too many file handles at once for shared memory") & https://www.evernote.com/pub/msreckovic/gfxdailies#st=p&x=inadvertently%2520increased%2520our%2520min%2520file%2520handles&n=a12d044f-33d8-4390-8836-b7dd1945514a

This is fixed now. (At least for me and I am not aware of other reports.)