694964 - crash [@ gfxSharedImageSurface::Open] when a layer is painted twice in the same transaction

Reporter

Description

•

13 years ago

This bug was filed from the Socorro interface and is 
report bp-e1096e16-7bc6-458c-9ef3-a527b2111017 .
============================================================= 

Top frames:

0 	libxul.so 	gfxSharedImageSurface::Open 	BaseSize.h:55
1 	libxul.so 	mozilla::layers::ShadowLayerForwarder::OpenDescriptor 	gfx/layers/ipc/ShadowLayers.cpp:464
2 	libxul.so 	mozilla::layers::BasicShadowThebesLayer::Swap 	gfx/layers/basic/BasicLayers.cpp:2792
3 	libxul.so 	mozilla::layers::ShadowLayersParent::RecvUpdate 	PLayers.h:1199
4 	libxul.so 	mozilla::layers::PLayersParent::OnMessageReceived 	obj-firefox/ipc/ipdl/PLayersParent.cpp:220
5 	libxul.so 	mozilla::dom::PContentParent::OnMessageReceived 	obj-firefox/ipc/ipdl/PContentParent.cpp:1464
6 	libxul.so 	mozilla::ipc::SyncChannel::OnDispatchMessage 	ipc/glue/SyncChannel.cpp:175
7 	libxul.so 	mozilla::ipc::RPCChannel::OnMaybeDequeueOne 	ipc/glue/RPCChannel.cpp:431
8 	libxul.so 	RunnableMethod<mozilla::ipc::RPCChannel, bool , Tuple0>::Run 	ipc/chromium/src/base/task.h:308
9 	libxul.so 	mozilla::ipc::RPCChannel::DequeueTask::Run 	RPCChannel.h:487
10 	libxul.so 	MessageLoop::RunTask 	ipc/chromium/src/base/message_loop.cc:319


This crash started popping up on October 12 with that day's Nightly Fennec builds. Looking at https://crash-stats.mozilla.com/report/list?signature=gfxSharedImageSurface%3A%3AOpen, it seems to happen on different Android versions (at least that's what I derive from seeing different kernel versions; Naoki or someone might correct me here) and on all Nightly builds since then, but only on Fennec and only on Nightly.

This also seems to be the #1 crash on Fennec Trunk right now - at least in my own stats it's the only one of which we have 10 or more crashes every day.

Ali Juma [:ajuma]

Assignee

Comment 1

•

13 years ago

This crash is happening in code that was recently added by Bug 690469.

Naoki Hirata :nhirata (please use needinfo instead of cc)

Updated

•

13 years ago

Summary: crash gfxSharedImageSurface::Open → crash [@ gfxSharedImageSurface::Open]

Oleg Romashin (:romaxa)

Comment 2

•

13 years ago

Is there some way to reproduce this crash? Could it be related to the same problem we worked around for the OGL port (the order of ThebesLayer front/back sync)? See
BasicShadowableThebesLayer::SetBackBufferAndAttrs
I'm not sure whether we should just call SyncFrontBufferToBackBuffer from that place for all backends.

Oleg Romashin (:romaxa)

Comment 3

•

13 years ago

On the other hand, that problem could not end up in a crash... and opening the descriptor could fail only if the Shmem was given to the Chrome process and destroyed at the same time on the Child process side.

Robert Kaiser

Reporter

Comment 4

•

13 years ago

(In reply to Oleg Romashin (:romaxa) from comment #2)
> Is there are some way to reproduce this crash?

No idea, I only saw crash reports coming in frequently enough that it's worth filing this.

Martijn Wargers (dead)

Comment 5

•

13 years ago

I hit this crash once in a current trunk build on the EEE Transformer, while playing around with the file input and the camera. Fennec crashed a while after I closed that page:
https://crash-stats.mozilla.com/report/index/bp-da19582c-6b42-4d88-abcc-0a0e92111019

Ludovic Hirlimann [:Usul]

Comment 6

•

13 years ago

I got the crash while loading www.leparisien.fr. The graphics on the page started to flicker a lot and then Fennec crashed.

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 7

•

13 years ago

Ludovic, I couldn't reproduce on a Galaxy S II Epic 4G on today's (10/23) nightly.  What phone were you using?

Can anyone else repro?

Chris Jones [:cjones] inactive; ni?/f?/r? if you need me

Comment 8

•

13 years ago

(In reply to Chris Jones [:cjones] [:warhammer] from comment #7)
> Ludovic, I couldn't reproduce on a Galaxy S II Epic 4G on today's (10/23)
> nightly.  What phone were you using?

Er, phone *and* build.

Ludovic Hirlimann [:Usul]

Comment 9

•

13 years ago

A Galaxy S; the build was from the day I posted on Bugzilla. When I visited the page again it didn't crash. I haven't seen this crash since.

Benoit Girard (:BenWa)

Comment 10

•

13 years ago

The crash volume is significant considering how small the user base for Fennec Nightly is. I think we should back out bug 690469 until it's fixed.

Oleg Romashin (:romaxa)

Comment 11

•

13 years ago

Sounds like yes we should do that.

Oleg Romashin (:romaxa)

Comment 12

•

13 years ago

One possible solution that keeps the change smaller is to make this condition
http://mxr.mozilla.org/mozilla-central/source/gfx/layers/basic/BasicLayers.cpp#2262
always true, so we switch back to the previous Swap/Sync/Create order and keep the API change.
We had a similar problem in bug 694140, but it did not lead to a crash; it only caused some rendering problems... It sounds like in software mode we have unreproducible crashes...

Naoki Hirata :nhirata (please use needinfo instead of cc)

Updated

•

13 years ago

Whiteboard: [mobile-crash]

Josh Matthews [:jdm]

Comment 13

•

13 years ago

Apparently bug 701617 contains rock-steady STR.

John Hammink

Comment 15

•

13 years ago

I am also still able to reproduce this crash by surfing to my battery crash page:
people.mozilla.com/~jhammink/webapi_test_pages/BatteryCrashv2.html

Actual crash report is here:
https://crash-stats.mozilla.com/report/index/bp-4e92cbbf-54fc-4874-af32-034262111122

Scoobidiver (away)

Updated

•

13 years ago

OS: Linux → Android

Scoobidiver (away)

Comment 16

•

13 years ago

It's #2 top crasher in 10.0a2 (only about 700 ADU).

Scoobidiver (away)

Comment 17

•

13 years ago

It's #1 top crasher in 10.0b2 with about 12,000 ADU.

Keywords: topcrash

Scoobidiver (away)

Comment 18

•

13 years ago

It's #1 top crasher in 10.0b3 (about 11,000 ADU) with 16% of all crashes.
There have been no crashes in 11.0a2 and 12.0a1 (about 500 ADU).

Blocks: 690469

Keywords: regression, reproducible

Whiteboard: [mobile-crash] → [mobile-crash], [STR in comment 13 and 15]

Version: Trunk → 10 Branch

Scoobidiver (away)

Updated

•

13 years ago

tracking-fennec: --- → ?

Joe Drew (not getting mail)

Comment 19

•

13 years ago

Perhaps George can take a look at this when he has a moment.

Oleg Romashin (:romaxa)

Comment 20

•

13 years ago

If it is easy to reproduce, then it would be nice to get it to crash on a debug build with NSPR_LOG_MODULES=Layers:5 exported... so we can see the order of layer transactions and ctor/dtor calls.

Oleg Romashin (:romaxa)

Comment 21

•

13 years ago

Here we have mFrontBuffer, which comes from the child process, sticks around in the UI process, and somehow gets destroyed in the child process in such a way that the UI process does not know about it...
I see one way this might happen:
BasicShadowableThebesLayer::PaintBuffer pushes mBackBuffer into the transaction queue, but doesn't move that buffer into a read-only state... While the transaction is pending, in the time between ::PaintBuffer and BasicShadowableThebesLayer::SetBackBufferAndAttrs, we receive another BasicShadowableThebesLayer::CreateBuffer call with a different size... After that, mBackBuffer is destroyed without notifying the UI process.

I guess the right proposal here is to set mBackBuffer = NULL right after the PaintedThebesBuffer call, or move it to mROFrontBuffer... so there is no chance that mBuffer will be destroyed after that.
But in that case we should handle the situation where we can have 3 buffers, and destroy the 3rd buffer correctly.
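
The hypothesis in this comment can be sketched with a toy model. This is only an illustration under stated assumptions, not the Gecko code: `ToyChildLayer`, `SurfaceId`, `ParentCanOpenAll` and `Scenario` are hypothetical names, shared memory is modelled as a set of integer ids, and `releaseAfterQueue` stands in for the proposed "set mBackBuffer = NULL after queuing" behaviour.

```cpp
#include <cassert>
#include <set>
#include <vector>

using SurfaceId = int;  // toy handle; 0 means "no surface"

// Hypothetical stand-in for the content-side (shadowable) layer.
struct ToyChildLayer {
  std::set<SurfaceId> live;       // surfaces that still exist
  std::vector<SurfaceId> queued;  // descriptors queued for the UI process
  SurfaceId back = 0;             // plays the role of mBackBuffer
  SurfaceId next = 1;             // next id to allocate
  bool releaseAfterQueue;         // true models the proposed hand-off

  explicit ToyChildLayer(bool release) : releaseAfterQueue(release) {}

  void CreateBuffer() {
    // Hypothesised hazard: the old back buffer is destroyed here even if
    // a queued edit still refers to it.
    if (back != 0) live.erase(back);
    back = next++;
    live.insert(back);
  }

  void PaintBuffer() {
    if (back == 0) CreateBuffer();
    assert(back != 0);
    queued.push_back(back);           // edit now refers to this surface
    if (releaseAfterQueue) back = 0;  // child gives up ownership
  }
};

// UI-process side: opening a descriptor fails if the surface is gone.
inline bool ParentCanOpenAll(const ToyChildLayer& c) {
  for (SurfaceId id : c.queued) {
    if (c.live.count(id) == 0) return false;
  }
  return true;
}

// Paint, then receive a resize before the transaction has been applied.
inline bool Scenario(bool releaseAfterQueue) {
  ToyChildLayer c(releaseAfterQueue);
  c.PaintBuffer();
  c.CreateBuffer();
  return ParentCanOpenAll(c);
}
```

In this model `Scenario(false)` leaves the queued descriptor pointing at a destroyed surface, while `Scenario(true)` keeps it alive. The surface that stays alive in the second case is the extra "3rd buffer" that would still have to be destroyed at some later point.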

Oleg Romashin (:romaxa)

Comment 22

•

13 years ago

Attached patch: Crash fix on assumption from comment 21 (obsolete)

This is kind of a blind fix, because I'm not able to reproduce this crash. I will create a try build; perhaps someone could run a manual test round and check whether it is still reproducible, or we can just push it and check whether the crash has disappeared.

Attachment #587937 - Flags: review?(joe)

Brad Lassey [:blassey] (use needinfo?)

Updated

•

13 years ago

tracking-fennec: ? → -

Joe Drew (not getting mail)

Comment 23

•

13 years ago

Comment on attachment 587937 [details] [diff] [review]
Crash fix on assumption from comment 21

This looks perfectly reasonable to me, but I don't know whether this is a thing that should actually happen in the wild. I'm going to cede review to Chris and/or Ali, who have more recent experience with cross-process things.

Attachment #587937 - Flags: review?(jones.chris.g)

Attachment #587937 - Flags: review?(joe)

Attachment #587937 - Flags: review?(ajuma)

Oleg Romashin (:romaxa)

Comment 24

•

13 years ago

Forgot to attach try build:
https://tbpl.mozilla.org/?tree=Try&rev=ed468272c38e

Oleg Romashin (:romaxa)

Comment 25

•

13 years ago

Comment on attachment 587937 [details] [diff] [review]
Crash fix on assumption from comment 21


>+  } else {
>+    BasicManager()->ShadowLayerForwarder::DestroySharedSurface(&mBackBuffer);

Oh, here we should destroy aBuffer.

>   }

Attachment #587937 - Flags: review?(jones.chris.g) → review-

Oleg Romashin (:romaxa)

Comment 26

•

13 years ago

Attached patch: Updated patch, drop obsolete buffer carefully (obsolete)

Attachment #587937 - Attachment is obsolete: true

Attachment #587937 - Flags: review?(ajuma)

Attachment #588206 - Flags: review?(ajuma)

Ali Juma [:ajuma]

Assignee

Updated

•

13 years ago

Attachment #588206 - Flags: review?(ajuma) → review+

Oleg Romashin (:romaxa)

Comment 27

•

13 years ago

Pushed into:
https://hg.mozilla.org/mozilla-central/rev/964b118ac852

hope it will help.
if not then we should reopen and investigate again

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → FIXED

Scoobidiver (away)

Comment 28

•

13 years ago

(In reply to Oleg Romashin (:romaxa) from comment #27)
> hope it will help.
As it doesn't happen in Nightly or Aurora (small sample group of about 800 ADU), we don't know if it's fixed using crash stats.
It's still #1 top crasher in Fennec 10.0b4 (about 10,000 ADU).

status-firefox10: --- → affected

Oleg Romashin (:romaxa)

Comment 29

•

13 years ago

but, aurora does not have that fix... any crashes from nightly?

Scoobidiver (away)

Comment 30

•

13 years ago

(In reply to Oleg Romashin (:romaxa) from comment #29)
> but, aurora does not have that fix... any crashes from nightly?
It has never crashed in Fennec Native Nightly and Aurora.
It crashed in Fennec Aurora 11.0a2/20120111.
People using Nightly and Aurora are not statistically representative.

It seems risky to land this patch in the latest Beta as it also affects the desktop version.

Oleg Romashin (:romaxa)

Comment 31

•

13 years ago

OK, let's wait until 12 is representative.

Ali Juma [:ajuma]

Assignee

Comment 32

•

13 years ago

(In reply to Scoobidiver from comment #30) 
> It seems risky to land this patch in the latest Beta as it also affects the
> desktop version.

Note that this patch only affects shadow layers, which aren't used by default anywhere on desktop right now. (Shadow layers are currently only used for e10s and for off-main-thread compositing.)

I'd advocate at least taking this on Aurora, so that the next Beta for tablets (which, as I understand it, will still be XUL Fennec rather than native Fennec) has this.

Landing this on Beta right away is indeed risky (given Comment 22), but might be worth doing if there's enough time to land it, measure the effect it's having, and back out if it's making things worse.

Scoobidiver (away)

Updated

•

13 years ago

status-firefox11: --- → affected

Ali Juma [:ajuma]

Assignee

Comment 33

•

13 years ago

I can consistently reproduce this crash on Nightly prior to this landing, but not on the latest Nightly. So I feel reasonably confident that the patch that landed fixed the issue.

Steps to reproduce:
1) Visit http://people.mozilla.com/~ajuma/fennec/test.html
2) Click on "one"
3) Click on OK

status-firefox11: affected → ---

Ali Juma [:ajuma]

Assignee

Updated

•

13 years ago

status-firefox11: --- → affected

status-firefox12: --- → unaffected

Scoobidiver (away)

Updated

•

13 years ago

status-firefox12: unaffected → fixed

Target Milestone: --- → mozilla12

Oleg Romashin (:romaxa)

Comment 34

•

13 years ago

> Steps to reproduce:
> 1) Visit http://people.mozilla.com/~ajuma/fennec/test.html
> 2) Click on "one"
> 3) Click on OK
Oh, great, we have a test for that crash.
Then we definitely should land it on 11, and ajuma is right: this affects only ShadowLayers, which are currently used only in e10s Fennec.

Scoobidiver (away)

Comment 35

•

13 years ago

(In reply to Ali Juma [:ajuma] from comment #32)
> might be worth doing if there's enough time to land it, measure the effect
> it's having, and back out if it's making things worse.
As it's reproducible, impacts only the mobile version, and there will be a Beta 5 (tonight) and a Beta 6 (in one week), it's worth asking for Beta approval ASAP.

Ali Juma [:ajuma]

Assignee

Comment 36

•

13 years ago

Comment on attachment 588206 [details] [diff] [review]
Updated patch, drop obsolete buffer carefully

[Approval Request Comment]
Regression caused by (bug #): 690469

User impact if declined: This is the top crash on XUL Fennec, both on Beta and on Aurora.

Testing completed (on m-c, etc.): This has been on m-c since Jan. 12, but this is of limited relevance given the small number of XUL Fennec users we have on Nightly -- the number of users is small enough that we weren't even seeing this crash before the patch landed. In my own testing on a Galaxy Tab, the patch does fix the crash (see Comment 33).

Risk to taking this patch (and alternatives if risky): The patch is sufficiently non-trivial that there is some risk here. The risk is only to XUL Fennec -- this code isn't used by default anywhere else right now. We should definitely take this on Aurora. Given that this is the top crash, we should also consider taking it on Beta despite the risk (ideally though, we should only do this if there's still time to back it out if it causes regressions).

Attachment #588206 - Flags: approval-mozilla-beta?

Attachment #588206 - Flags: approval-mozilla-aurora?

Alex Keybl [:akeybl]

Comment 37

•

12 years ago

Comment on attachment 588206 [details] [diff] [review]
Updated patch, drop obsolete buffer carefully

[Triage Comment]
Our understanding of this issue is that although this is in shared code, XUL Fennec is the only client of this change. Approving for Aurora/Beta.

Attachment #588206 - Flags: approval-mozilla-beta?

Attachment #588206 - Flags: approval-mozilla-beta+

Attachment #588206 - Flags: approval-mozilla-aurora?

Attachment #588206 - Flags: approval-mozilla-aurora+

Ali Juma [:ajuma]

Assignee

Comment 38

•

12 years ago

https://hg.mozilla.org/releases/mozilla-beta/rev/3659d0f64868
https://hg.mozilla.org/releases/mozilla-aurora/rev/462805496704

status-firefox10: affected → fixed

status-firefox11: affected → fixed

Ali Juma [:ajuma]

Assignee

Updated

•

12 years ago

Depends on: 720353

Ali Juma [:ajuma]

Assignee

Comment 39

•

12 years ago

This looks to be the most likely cause of Bug 720353, which is causing more crashes on beta (0.52 crashes/ADU) than the number of crashes fixed by this bug (0.24 crashes/ADU).

Ali Juma [:ajuma]

Assignee

Comment 40

•

12 years ago

Attached patch: Backout Attachment 588206

Since the landing of this bug on mozilla-beta is the most likely cause of Bug 720353, we should back out this bug from beta.

[Approval Request Comment]
Regression caused by (bug #): The landing of this bug (Attachment 588206) on beta.
User impact if declined: Attachment 588206 is suspected of causing more crashes on beta (0.52 crashes/ADU) than it resolved (0.24 crashes/ADU), so declining to back it out would mean a net increase of 0.28 crashes/ADU.
Testing completed (on m-c, etc.): Bug 720353 has only been seen on beta, likely due to the small number of XUL Fennec users on Aurora and Nightly, so landing on m-c wouldn't give us any useful information.
Risk to taking this patch (and alternatives if risky): Low-risk. This is just a backout of a patch that landed on beta last week.

Attachment #590713 - Flags: approval-mozilla-beta?

Johnathan Nightingale [:johnath]

Updated

•

12 years ago

Attachment #590713 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Ali Juma [:ajuma]

Assignee

Comment 41

•

12 years ago

Backout landed on beta:
https://hg.mozilla.org/releases/mozilla-beta/rev/ebe77c75acab

status-firefox10: fixed → affected

Alex Keybl [:akeybl]

Comment 42

•

12 years ago

Backed out for beta 11 as well: https://hg.mozilla.org/releases/mozilla-beta/rev/788ea1ef610b

Scoobidiver (away)

Updated

•

12 years ago

status-firefox11: fixed → affected

Scoobidiver (away)

Comment 43

•

12 years ago

It's #2 top crasher in 10.0.

Alex Keybl [:akeybl]

Comment 44

•

12 years ago

(In reply to Scoobidiver from comment #43)
> It's #2 top crasher in 10.0.

I think this should be re-opened since it was backed out of Beta 10 and Beta 11. Joe - can you find somebody to take a look?

Assignee: nobody → joe

Status: RESOLVED → REOPENED

Resolution: FIXED → ---

Ali Juma [:ajuma]

Assignee

Comment 45

•

12 years ago

It's conceivable that this crash was really just the problem described in Bug 718150 (or at least closely related). Boris/Chris/Oleg, do you think that patch there might have fixed this crash?

Oleg Romashin (:romaxa)

Comment 46

•

12 years ago

In the Thebes layers case we do not have internal mSize or mBounds variables, and we check the FrontSurface size using the actual buffer structures.
I've noticed one possible problem: BasicShadowThebesLayer::Swap checks only the surface size, but not the ContentType of the buffer.
Another possible problem is described in bug 720353#c6.
It would be nice to find a testcase which reproduces that problem.
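
To make the first concern concrete, here is a minimal sketch of the difference between a size-only comparison and one that also compares the content type. The types and function names (`ToyContentType`, `ToyBufferInfo`, `SameSize`, `SameSizeAndContent`) are hypothetical and do not correspond to the real gfx classes; the sketch only shows the shape of the check being suggested.

```cpp
#include <cassert>

// Hypothetical stand-in for a surface content type.
enum class ToyContentType { Color, Alpha, ColorAlpha };

// Hypothetical summary of the properties being compared.
struct ToyBufferInfo {
  int width;
  int height;
  ToyContentType content;
};

// Size-only comparison, the kind of check the comment says Swap performs.
inline bool SameSize(const ToyBufferInfo& a, const ToyBufferInfo& b) {
  return a.width == b.width && a.height == b.height;
}

// Stricter comparison that also requires matching content types.
inline bool SameSizeAndContent(const ToyBufferInfo& a, const ToyBufferInfo& b) {
  return SameSize(a, b) && a.content == b.content;
}
```

Two buffers of identical dimensions but different content types pass the first check and fail the second, which is the case the comment worries about.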

Boris Zbarsky [:bzbarsky]

Comment 47

•

12 years ago

Bug 718150 should not have led to crashes if I understand the code correctly...

Joe Drew (not getting mail)

Comment 48

•

12 years ago

Oleg, can you proceed with your idea in comment 46, and maybe we can resolve the crash in bug 720353?

Alternately, someone able to reproduce that crash would be helpful...

Assignee: joe → romaxa

Scoobidiver (away)

Comment 49

•

12 years ago

(In reply to Joe Drew (:JOEDREW!) from comment #48)
> Alternately, someone able to reproduce that crash would be helpful...
There are STR in comment 13 and comment 15.

Is it reproducible in Native Fennec?

Ali Juma [:ajuma]

Assignee

Comment 50

•

12 years ago

(In reply to Scoobidiver from comment #49)
> (In reply to Joe Drew (:JOEDREW!) from comment #48)
> > Alternately, someone able to reproduce that crash would be helpful...
> There are STR in comment 13 and comment 15.
> 
> Is it reproducible in Native Fennec?

To be clear, there are two crashes being discussed here -- the crash for this bug, and the crash in Bug 720353.

The crash for this bug has STR in comments 13, 15, and 33. It's not reproducible in the current Native Fennec, since the current Native Fennec doesn't use shadow layers. Native Fennec with OMTC does use shadow layers, but the crash for this bug happens in Basic shadow layers while Native Fennec with OMTC uses GL shadow layers. There may turn out to be an analogous crash though, if the root cause is on the content side (since both XUL Fennec and Native Fennec with OMTC use Basic layers on the content side).

The crash in Bug 720353 doesn't have STR. It's also a shadow layers crash, but on the content side, so it may well be possible to reproduce in Native Fennec with OMTC.

Ali Juma [:ajuma]

Assignee

Comment 51

•

12 years ago

Backed out (see Bug 720353):
https://hg.mozilla.org/integration/mozilla-inbound/rev/af19e5ada310

Naoki Hirata :nhirata (please use needinfo instead of cc)

Comment 52

•

12 years ago

I got a different crash with the steps from comment 33 on a Galaxy Nexus, doing a one-off:
> Steps to reproduce:
> 1) Visit http://people.mozilla.com/~ajuma/fennec/test.html
> 2) Click on "one"
> 3) Click on OK
4) Rotate device

It looks like an OOM crash (03-19 15:06:23.419: E/GeckoApp(3020): low memory). I cannot currently duplicate the bug based on the repro steps. Removing reproducible for now.

Keywords: reproducible

Scoobidiver (away)

Updated

•

12 years ago

status-firefox13: --- → fixed

status-firefox14: --- → affected

Robert Kaiser

Reporter

Comment 53

•

12 years ago

This signature is still high in XUL releases (10 ESR). Should we land this on the ESR branch for the upcoming XUL releases?

tracking-firefox-esr10: --- → ?

Scoobidiver (away)

Comment 54

•

12 years ago

(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #53)
> This signature is still high in XUL releases (10 ESR). Should we land this
> on the ESR branch for the upcoming XUL releases?
Fixing this bug with the current patch causes bug 720353 to appear.

Naoki Hirata :nhirata (please use needinfo instead of cc)

Comment 55

•

12 years ago

https://bugzilla.mozilla.org/show_bug.cgi?id=694964#c52 is with nightly build.
Does not crash with nightly.

Doesn't seem to crash for Galaxy Nexus with the ESR release using the steps to repro given here: https://bugzilla.mozilla.org/show_bug.cgi?id=694964#c33

Daniel Veditz [:dveditz]

Comment 56

•

12 years ago

We're going to minus this for ESR10 because this patch caused a worse regression, and there's no patch that fixes both.

status-firefox-esr10: --- → affected

tracking-firefox-esr10: ? → -

Ali Juma [:ajuma]

Assignee

Comment 57

•

12 years ago

I've been able to reproduce this on Native Fennec on an HTC Desire, by rotating the phone back and forth while a page is loading (the crash doesn't always happen while doing this, but happens eventually when I keep repeating this).

Here's what seems to be the problem:

The crash happens when a Thebes layer gets painted twice within a single transaction. This happens when a transaction is set as incomplete (via BasicLayerManager::SetTransactionIncomplete). The result of painting the Thebes layer twice is that we have two TOpPaintThebesBuffer Edits for the same layer within the same transaction. Both of these have the same newFrontBuffer. This means that the ShadowThebesLayer's Swap function gets called twice with the same value of aNewFront. After the first such call, the ShadowThebesLayer's mFrontBufferDescriptor is aNewFront.buffer(). During the second such call, aNewBack's buffer is set to mFrontBufferDescriptor, but mFrontBufferDescriptor's value doesn't change (since it gets set again to the same aNewFront.buffer()). The result is that after this call and the corresponding SetBackBufferAndAttrs on the shadowable layer, the ShadowableThebesLayer's mBackBuffer is the same as ShadowThebesLayer's mFrontBufferDescriptor. The crash then happens when the Shadowable layer decides to destroy mBackBuffer (during a call to CreateBuffer), and the Shadow layer subsequently tries to open mFrontBufferDescriptor.

Another thing that can go wrong when we have two paints of the same layer in the same transaction is mBackBuffer getting destroyed during the second paint. In this case, after the first Swap, mFrontBufferDescriptor will be an already destroyed surface.
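
The first sequence described above can be replayed in a small model. This is a sketch under simplifying assumptions, not the actual layers code: surfaces are plain integer ids, `ToyShadowLayer::Swap` only mimics the front/back exchange, and `ToyResult` and `ReplayTransaction` are hypothetical names introduced for the illustration.

```cpp
#include <cassert>
#include <set>

using SurfaceId = int;  // toy handle; 0 means "no surface"

// Hypothetical stand-in for the compositor-side shadow layer.
struct ToyShadowLayer {
  SurfaceId frontDescriptor = 0;  // plays the role of mFrontBufferDescriptor

  // Takes the new front surface and returns the surface that is handed
  // back to the content side as its new back buffer.
  SurfaceId Swap(SurfaceId newFront) {
    SurfaceId newBack = frontDescriptor;
    frontDescriptor = newFront;
    return newBack;
  }
};

struct ToyResult {
  SurfaceId childBack;      // content side's back buffer after the transaction
  SurfaceId parentFront;    // shadow layer's front descriptor
  bool parentCanOpenFront;  // false models the failed open
};

// Replays one transaction containing `paintEdits` paint edits for the same
// layer, all carrying the same new front buffer, followed by a resize.
inline ToyResult ReplayTransaction(int paintEdits) {
  assert(paintEdits >= 1);
  std::set<SurfaceId> live = {1, 2};
  ToyShadowLayer shadow;
  shadow.frontDescriptor = 1;  // compositor currently shows surface 1
  SurfaceId childBack = 2;     // content side painted into surface 2

  const SurfaceId newFront = childBack;  // identical in every edit
  for (int i = 0; i < paintEdits; ++i) {
    // Swap on the shadow layer, then SetBackBufferAndAttrs on the child.
    childBack = shadow.Swap(newFront);
  }

  // Later, CreateBuffer on the content side destroys its back buffer.
  live.erase(childBack);

  return {childBack, shadow.frontDescriptor,
          live.count(shadow.frontDescriptor) > 0};
}
```

With a single paint edit the two sides end up holding different surfaces and the front descriptor stays valid. With two edits both sides hold surface 2, so destroying the back buffer also invalidates the descriptor that the shadow layer later tries to open.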

Scoobidiver (away)

Updated

•

12 years ago

Whiteboard: [mobile-crash], [STR in comment 13 and 15] → [mobile-crash][native-crash][STR in comment 13 and 15]

Ali Juma [:ajuma]

Assignee

Updated

•

12 years ago

Summary: crash [@ gfxSharedImageSurface::Open] → crash [@ gfxSharedImageSurface::Open] when a layer is painted twice in the same transaction

Benoit Girard (:BenWa)

Updated

•

12 years ago

Assignee: romaxa → ajuma

Cristian Nicolae (:xti)

Comment 58

•

12 years ago

I reproduced this crash on today's Nightly build by going to Awesomebar and back into main view repeatedly:

https://crash-stats.mozilla.com/report/index/bp-e89ea478-0b13-40ec-8665-874a22120327

--
Firefox 14.0a1 (2012-03-27)
Device: Samsung Galaxy S
OS: Android 2.2

Version: 10 Branch → Trunk

Scoobidiver (away)

Comment 59

•

12 years ago

It's a regression from Fx 10.

Version: Trunk → 10 Branch

Naoki Hirata :nhirata (please use needinfo instead of cc)

Comment 60

•

12 years ago

I was able to reproduce this in Fennec Native on Galaxy Nexus by going to http://www.kevs3d.co.uk/dev/ and rotating from portrait to landscape.

Ali Juma [:ajuma]

Assignee

Updated

•

12 years ago

Attachment #588206 - Attachment is obsolete: true

Ali Juma [:ajuma]

Assignee

Comment 61

•

12 years ago

Attached patch: Don't generate a Thebes Paint edit in an incomplete transaction.

The problem is actually that we're generating a TOpPaintThebesBuffer edit when we shouldn't be. Here's how this happens. When BasicShadowableThebesLayer::PaintBuffer gets called, it in turn calls BasicThebesLayer::PaintBuffer with the same arguments. When argument aCallback is null, BasicThebesLayer::PaintBuffer sets the transaction as incomplete, and returns immediately without actually painting anything. But BasicShadowableThebesLayer::PaintBuffer nevertheless goes ahead and generates a spurious TOpPaintThebesBuffer edit (by calling PaintedThebesBuffer). This patch fixes that by making BasicShadowableThebesLayer::PaintBuffer check if the transaction got set as incomplete.
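
The shape of that guard can be sketched as follows. This is an illustrative model rather than the patch itself: `ToyManager`, `BasePaintBuffer`, `ShadowablePaintBuffer` and the `guard` flag are hypothetical, and the queue of paint edits is reduced to a list of layer ids. Only the idea of checking the incomplete flag before queuing an edit is taken from the description above.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for the layer manager's transaction state.
struct ToyManager {
  bool transactionIncomplete = false;
  std::vector<int> paintEdits;  // layer ids that have a queued paint edit

  void SetTransactionIncomplete() { transactionIncomplete = true; }
  bool IsTransactionIncomplete() const { return transactionIncomplete; }
};

using ToyCallback = void (*)();

// Models the base-class paint: with no callback it marks the transaction
// incomplete and returns without painting anything.
inline void BasePaintBuffer(ToyManager& m, ToyCallback cb) {
  if (!cb) {
    m.SetTransactionIncomplete();
    return;
  }
  cb();
}

// Models the shadowable paint. With guard == false it always queues an
// edit (the behaviour described as the bug); with guard == true it skips
// the edit when the transaction was marked incomplete.
inline void ShadowablePaintBuffer(ToyManager& m, int layerId, ToyCallback cb,
                                  bool guard) {
  BasePaintBuffer(m, cb);
  if (guard && m.IsTransactionIncomplete()) {
    return;  // nothing was painted, so no edit should be forwarded
  }
  m.paintEdits.push_back(layerId);
}

inline void NoOpPaint() {}
```

In this model, calling `ShadowablePaintBuffer` with a null callback queues a spurious edit when the guard is off and queues nothing when it is on, while a call with a real callback still queues exactly one edit.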

Attachment #610636 - Flags: review?(bgirard)

Benoit Girard (:BenWa)

Comment 62

•

12 years ago

Comment on attachment 610636 [details] [diff] [review]
Don't generate a Thebes Paint edit in an incomplete transaction.

ajuma++!

Attachment #610636 - Flags: review?(bgirard) → review+

Ali Juma [:ajuma]

Assignee

Comment 63

•

12 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/d5a7628eed87

Target Milestone: mozilla12 → mozilla14

Martijn Wargers (dead)

Comment 64

•

12 years ago

I got this crash twice on the HTC Desire HD in current trunk build, while having these 2 tabs open:
http://people.mozilla.com/~mwargers/tests/positonabsolute_redraw_crash.htm
http://planet.mozilla.org/
And while I was performing some panning/zooming actions in the first tab.

Martijn Wargers (dead)

Comment 65

•

12 years ago

Ok, I think I can reproduce this crash (sort of) with only http://people.mozilla.com/~mwargers/tests/positonabsolute_redraw_crash.htm
While on that page, I first pinch-zoom out as far as possible, then I pan to the right like crazy to scroll to the left, so as to make some scroll overflow area visible on the left (this is difficult, because the page has JavaScript that scrolls to the right).

Martijn Wargers (dead)

Comment 66

•

12 years ago

Video here: http://www.youtube.com/watch?v=JCtwByGwgmk

Naoki Hirata :nhirata (please use needinfo instead of cc)

Updated

•

12 years ago

Keywords: reproducible

Naoki Hirata :nhirata (please use needinfo instead of cc)

Comment 67

•

12 years ago

Another STR:
1. Set Fennec -> Settings -> Plugins -> Enabled
2. restart Fennec
3. go to http://people.mozilla.org/~mwargers/tests/flashcrash_ics/buttonopencrash1.htm
4. Select Always Show for popups
5. During the time that it's doing the test, go to menu-> settings

Crash should occur.

Martijn Wargers (dead)

Comment 68

•

12 years ago

I seem to reproducibly crash on the Samsung Galaxy Nexus on this url: http://people.mozilla.org/~mwargers/tests/plugins/flash/scroll.html

Scoobidiver (away)

Updated

•

12 years ago

Whiteboard: [mobile-crash][native-crash][STR in comment 13 and 15] → [mobile-crash][native-crash][STR in comment 13, 15, 58, 60, 65, 67, and 68]

Ed Morley [:emorley]

Comment 69

•

12 years ago

(In reply to Benoit Girard (:BenWa) from comment #62)
> Comment on attachment 610636 [details] [diff] [review]
> Don't generate a Thebes Paint edit in an incomplete transaction.
> 
> ajuma++!

https://hg.mozilla.org/mozilla-central/rev/d5a7628eed87

Status: REOPENED → RESOLVED

Closed: 13 years ago → 12 years ago

Resolution: --- → FIXED

Robert Kaiser

Reporter

Comment 70

•

12 years ago

This is still one of the major topcrashes in the current Fennec "releases", which are from ESR10. Should the backout or the fix be ported there?

Ali Juma [:ajuma]

Assignee

Comment 71

•

12 years ago

Comment on attachment 610636 [details] [diff] [review]
Don't generate a Thebes Paint edit in an incomplete transaction.

[Approval Request Comment]
Regression caused by (bug #): Bug 690469
User impact if declined: This is a Fennec top crasher (see Comment 70).
Testing completed (on m-c, etc.): This has been on m-c since March 31 and has fixed the crash there.
Risk to taking this patch (and alternatives if risky): Low risk to Fennec, no risk to desktop (since ShadowLayers aren't used on desktop).
String changes made by this patch: None.

Attachment #610636 - Flags: approval-mozilla-esr10?

Alex Keybl [:akeybl]

Comment 72

•

12 years ago

Comment on attachment 610636 [details] [diff] [review]
Don't generate a Thebes Paint edit in an incomplete transaction.

(In reply to Ali Juma [:ajuma] from comment #71)
> Comment on attachment 610636 [details] [diff] [review]
> Don't generate a Thebes Paint edit in an incomplete transaction.
> 
> [Approval Request Comment]
> Regression caused by (bug #): Bug 690469
> User impact if declined: This is a Fennec top crasher (see Comment 70).
> Testing completed (on m-c, etc.): This has been on m-c since March 31 and
> has fixed the crash there.
> Risk to taking this patch (and alternatives if risky): Low risk to Fennec,
> no risk to desktop (since ShadowLayers aren't used on desktop).
> String changes made by this patch: None.

Approved for the ESR branch. Does bug 720353 need to come along for the ride?

Attachment #610636 - Flags: approval-mozilla-esr10? → approval-mozilla-esr10+

Alex Keybl [:akeybl]

Updated

•

12 years ago

tracking-firefox-esr10: - → 12+

Ali Juma [:ajuma]

Assignee

Comment 73

•

12 years ago

(In reply to Alex Keybl [:akeybl] from comment #72)
 
> Approved for the ESR branch. Does bug 720353 need to come along for the ride?

No, Bug 720353 backs out on 14 a patch that had already been backed out on 10 (see Comment 41 above).

Alex Keybl [:akeybl]

Comment 74

•

12 years ago

(In reply to Ali Juma [:ajuma] from comment #73)
> (In reply to Alex Keybl [:akeybl] from comment #72)
>  
> > Approved for the ESR branch. Does bug 720353 need to come along for the ride?
> 
> No, Bug 720353 backs out on 14 a patch that had already been backed out on
> 10 (see Comment 41 above).

Thanks Ali. We spent a few minutes trying to untangle the status flags and backouts without success during triage, so we appreciate your confirmation.

Ali Juma [:ajuma]

Assignee

Comment 75

•

12 years ago

Landed on esr10:
https://hg.mozilla.org/releases/mozilla-esr10/rev/52183a4ceb7c

along with a follow-up bustage fix (BasicLayerManager::IsTransactionIncomplete wasn't already defined on esr10):
https://hg.mozilla.org/releases/mozilla-esr10/rev/2329d4a9d3d1

status-firefox-esr10: affected → fixed

Scoobidiver (away)

Updated

•

12 years ago

status-firefox14: affected → ---

Scoobidiver (away)

Comment 76

•

12 years ago

There are crashes at a lower volume in XUL Fennec 10.0.4esr (#31 top crasher), 10.0.5esr (#37) and 14.0b6 (#42) that contain the fix: https://crash-stats.mozilla.com/report/list?signature=gfxSharedImageSurface%3A%3AOpen

Crash fix on assumption from comment 21; 13 years ago; Oleg Romashin (:romaxa); 1.17 KB, patch; romaxa: review-
Updated patch, drop obsolete buffer carefully; 13 years ago; Oleg Romashin (:romaxa); 2.49 KB, patch; ajuma: review+, akeybl: approval-mozilla-aurora+, akeybl: approval-mozilla-beta+
Backout Attachment 588206; 12 years ago; Ali Juma [:ajuma]; 2.49 KB, patch; johnath: approval-mozilla-beta+
Don't generate a Thebes Paint edit in an incomplete transaction.; 12 years ago; Ali Juma [:ajuma]; 1.21 KB, patch; BenWa: review+, akeybl: approval-mozilla-esr10+