crash in mozilla::layers::TileClient::GetBackBuffer(nsIntRegion const&, gfxContentType, mozilla::layers::SurfaceMode, bool*, nsIntRegion&, bool, mozilla::RefPtr<mozilla::layers::TextureClient>*)

RESOLVED DUPLICATE of bug 1041744

Status


Product: Core
Component: Graphics: Layers
Priority: --
Severity: critical
Status: RESOLVED DUPLICATE of bug 1041744
Reported: 3 years ago
Modified: 3 years ago

People

(Reporter: aaronmt, Unassigned)

Tracking

({crash, reproducible, topcrash-android-armv7})

Version: 34 Branch
Hardware: All
OS: Android
Keywords: crash, reproducible, topcrash-android-armv7
Points: ---

Firefox Tracking Flags

(e10s?, firefox35- verified)

Details

(crash signature)

(Reporter)

Description

3 years ago
This bug was filed from the Socorro interface and is 
report bp-709d1494-9122-4605-9c9e-7897a2140829.
=============================================================

Comment 1

3 years ago
https://crash-stats.mozilla.com/report/list?product=FennecAndroid&signature=mozilla%3A%3Alayers%3A%3ATileClient%3A%3AGetBackBuffer%28nsIntRegion+const%26%2C+gfxContentType%2C+mozilla%3A%3Alayers%3A%3ASurfaceMode%2C+bool%2A%2C+nsIntRegion%26%2C+bool%2C+mozilla%3A%3ARefPtr%3Cmozilla%3A%3Alayers%3A%3ATextureClient%3E%2A%29 - All those crashes seem to be in 34 Nightly.
Is it too much of a coincidence that the tile-related fix for bug 1016538 landed on the 27th and these crashes started on the 28th?
Flags: needinfo?(nical.bugzilla)
Hmm, or perhaps it was bug 1047945; no longer leaking things can make us crash, since object lifetimes are now shorter...
Flags: needinfo?(matt.woodrow)
Destroying the tile cache should only happen on shutdown, so unless this is a shutdown crash, bug 1047945 seems unlikely.

The crash looks like pool->GetTextureClient() returned nullptr.
Flags: needinfo?(matt.woodrow)
(In reply to Matt Woodrow (:mattwoodrow) from comment #4)
> The crash looks like pool->GetTextureClient() returned nullptr.

Yes it very much looks like it.

We could fix our tiling code to recover from this, but it's not just a matter of early-returning and using a PlaceHolder tile where we failed to allocate, because we have a lot of assertions about where we expect tiles to be PlaceHolders or not. It's not a trivial fix, but it is definitely doable (and we probably should do it). I tried to do this a few weeks ago but stopped when I found out that it wasn't going to be a two-hour job. The question remains: if we aren't going to be able to render the page correctly, what should we do (render something slightly broken, crash, fall back to some other way to render content, render nothing but not crash either...)?
Flags: needinfo?(nical.bugzilla)
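
To make the recovery idea above concrete, here is a minimal, self-contained sketch of the "check for nullptr and leave the tile as a placeholder" pattern. It is not the actual Gecko tiling code: FakeTexture, FakeTexturePool, FakeTile, EnsureBackBuffer and ValidateTiles are hypothetical names invented for illustration, and the capacity-limited pool is only an assumption used to simulate an allocation failure. It also does not address the assertions about placeholder tiles mentioned above.

```cpp
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a texture; NOT the real TextureClient.
struct FakeTexture {
  int id;
};

// Hypothetical pool with a hard cap, so that allocation can fail and
// return nullptr, as pool->GetTextureClient() appears to do in the crash.
class FakeTexturePool {
public:
  explicit FakeTexturePool(std::size_t capacity) : mRemaining(capacity) {}

  std::shared_ptr<FakeTexture> GetTextureClient() {
    if (mRemaining == 0) {
      return nullptr;  // simulated allocation failure
    }
    --mRemaining;
    return std::make_shared<FakeTexture>(FakeTexture{mNextId++});
  }

private:
  std::size_t mRemaining;
  int mNextId = 0;
};

// Hypothetical tile: a tile without a back buffer counts as a placeholder.
struct FakeTile {
  std::shared_ptr<FakeTexture> mBackBuffer;
  bool IsPlaceholder() const { return !mBackBuffer; }
};

// Returns true if the tile has a usable back buffer afterwards.
// On allocation failure the tile is left untouched (still a placeholder)
// instead of dereferencing a null texture.
bool EnsureBackBuffer(FakeTile& tile, FakeTexturePool& pool) {
  if (tile.mBackBuffer) {
    return true;
  }
  std::shared_ptr<FakeTexture> texture = pool.GetTextureClient();
  if (!texture) {
    return false;
  }
  tile.mBackBuffer = std::move(texture);
  return true;
}

// Tries to give every tile a back buffer and returns how many tiles
// had to be left as placeholders.
std::size_t ValidateTiles(std::vector<FakeTile>& tiles, FakeTexturePool& pool) {
  std::size_t failed = 0;
  for (FakeTile& tile : tiles) {
    if (!EnsureBackBuffer(tile, pool)) {
      ++failed;
    }
  }
  return failed;
}
```

With a pool of capacity 2 and three tiles, ValidateTiles reports one failure and the third tile stays a placeholder. What the caller then does with that count (skip painting, retry later, or fall back to another rendering path) is exactly the open question raised in the comment.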
So... is this a regression from bug 1016538, or just something that showed up around the same time?  If it is a regression from that bug, we should probably back it out, but from comment 5 it may be something else?
Based on the crash, any thoughts on putting together STR (even on a local build) that would reproduce this?
I can reproduce this crash on OSX: https://crash-stats.mozilla.com/report/index/0234e3b2-de5f-4096-9a51-ea2392140904

STR:
 - Open an e10s window.
 - Open 3 tabs (e.g. google.com).
 - Hold down Ctrl+Tab to make the browser switch quickly between tabs.

After a second or two of Ctrl+Tabbing, the content process will crash.
(Reporter)

Updated

3 years ago
Keywords: reproducible
Tagging this as topcrash since it's currently #3 in Nightly @ 4.96%, although it has dropped 5.98% over the last 7 days.
Keywords: topcrash-android-armv7
[Tracking Requested - why for this release]: currently one of our top stability issues in Nightly.
status-firefox35: --- → affected
tracking-firefox35: --- → ?
This is the topcrash for e10s content processes.
Blocks: 899758
tracking-e10s: --- → ?
I've crashed 9 times today with this and 2 other seemingly tile-related signatures.
[@ mozilla::layers::ClientTiledLayerBuffer::ValidateTile(mozilla::layers::TileClient, nsIntPoint const&, nsIntRegion const&) ] https://crash-stats.mozilla.com/report/index/a6ba9825-4e59-4b7d-81f1-697f32140911
[@ mozalloc_abort(char const*) | Abort | NS_DebugBreak | mozilla::layers::PLayerTransactionChild::SendPTextureConstructor(mozilla::layers::PTextureChild*, mozilla::layers::SurfaceDescriptor const&, mozilla::layers::TextureFlags const&) ] https://crash-stats.mozilla.com/report/index/f713a997-ca23-442e-bb39-d97632140910

Do we think it's all the same root cause? Or should I file bugs for those signatures as well?
Those seem like two separate bugs, both distinct from this one.

Are those happening on OSX without e10s? Any ideas on how to reproduce?
Actually, the first one might be the same as this bug, but where we failed to allocate the 'on white' buffer instead of the default buffer.
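
As an illustration of the two-buffer case described above, the following sketch allocates a default buffer plus an optional 'on white' buffer and treats a failure of either one as a failure of the whole tile. It is a sketch under stated assumptions, not the real tiling code: SketchTexture, SketchPool, SketchBuffers and AllocateBuffers are hypothetical names, and the needsOnWhite flag is an assumed way of saying that the second buffer is required.

```cpp
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical texture type; NOT the real TextureClient.
struct SketchTexture {};

// Hypothetical capacity-limited pool used to simulate allocation failure.
class SketchPool {
public:
  explicit SketchPool(std::size_t capacity) : mRemaining(capacity) {}

  std::unique_ptr<SketchTexture> Allocate() {
    if (mRemaining == 0) {
      return nullptr;  // simulated allocation failure
    }
    --mRemaining;
    return std::unique_ptr<SketchTexture>(new SketchTexture());
  }

  // Hands a texture back to the pool so its slot can be reused.
  void Return(std::unique_ptr<SketchTexture> texture) {
    if (texture) {
      ++mRemaining;
    }
  }

  std::size_t Remaining() const { return mRemaining; }

private:
  std::size_t mRemaining;
};

// The default buffer plus the optional 'on white' buffer.
struct SketchBuffers {
  std::unique_ptr<SketchTexture> defaultBuffer;
  std::unique_ptr<SketchTexture> onWhite;
};

// Allocates the default buffer and, if requested, the 'on white' buffer.
// Either both requested buffers are stored in `out`, or `out` is left
// unchanged and anything already allocated is returned to the pool.
bool AllocateBuffers(SketchPool& pool, bool needsOnWhite, SketchBuffers& out) {
  std::unique_ptr<SketchTexture> defaultBuffer = pool.Allocate();
  if (!defaultBuffer) {
    return false;
  }

  std::unique_ptr<SketchTexture> onWhite;
  if (needsOnWhite) {
    onWhite = pool.Allocate();
    if (!onWhite) {
      // The second allocation failed: roll back the first one rather than
      // leaving a half-initialized tile behind.
      pool.Return(std::move(defaultBuffer));
      return false;
    }
  }

  out.defaultBuffer = std::move(defaultBuffer);
  out.onWhite = std::move(onWhite);
  return true;
}
```

The point of the rollback is that a caller only ever sees a tile with all of its required buffers or none of them, so a null 'on white' buffer cannot be reached through a tile that otherwise looks valid.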
The second error is because MessageChannel::Send returned false. This should dump a message to stderr; finding out what that message is would be useful.
Status: NEW → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → DUPLICATE
Duplicate of bug: 1041744
tracking-firefox35: ? → -
There are 0 reports of this signature in Fennec 35.0a1 over the last week.
status-firefox35: affected → verified