Closed Bug 1286437 Opened 8 years ago Closed 8 years ago

Crash in mozilla::ipc::FatalError | mozilla::layers::PLayerTransactionParent::Read

Status:

RESOLVED FIXED

Tracking Flags:

                Tracking   Status
firefox47       ---        unaffected
firefox48       ---        unaffected
firefox49       ---        unaffected
firefox-esr45   ---        unaffected
firefox50       ---        fixed

People

(Reporter: kanru, Assigned: nical)

Details

(Keywords: crash, regression, Whiteboard: [gfx-noted])

Attachments

(1 file)

Check that the textures we are sending are on the same channel on the child side. (8 years ago, Nicolas Silva [:nical], 3.77 KB, patch; dvander: review+)

Kan-Ru Chen [:kanru] (UTC+9)

Reporter

Description

•

8 years ago

This bug was filed from the Socorro interface and is 
report bp-ebd9f239-05f7-4e33-b800-843e12160704.
=============================================================

#6 crash for nightly on windows build 20160711034039.

The crash appears to have started on 2016-07-04.
https://crash-stats.mozilla.com/signature/?product=Firefox&release_channel=nightly&platform=Windows&date=%3E%3D2016-04-04&signature=mozilla%3A%3Aipc%3A%3AFatalError%20%7C%20mozilla%3A%3Alayers%3A%3APLayerTransactionParent%3A%3ARead&_columns=date&_columns=product&_columns=version&_columns=build_id&_columns=platform&_columns=reason&_columns=address&_sort=&page=1#graphs

Possible regression window:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=39dffbba764210b25bfc1e749b4f16db77fa0d46&tochange=c9a70b64f2faa264296f0cc90d68a2ee2bac6ac5

Flags: needinfo?(nical.bugzilla)

Marcia Knous [:marcia]

Comment 1

•

8 years ago

Bug 1208226#c63 reports this crash occurring when the commenter tore away a tab, and another comment in that bug also mentions tab tearing causing this crash - https://crash-stats.mozilla.com/report/index/dd2c131a-332c-4f5a-909f-af9742160706

David Anderson [:dvander] - inactive, e-mail if emergency

Updated

•

8 years ago

Whiteboard: [gfx-noted]

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

8 years ago

Version: unspecified → 50 Branch

Nicolas Silva [:nical]

Assignee

Comment 2

•

8 years ago

I have been looking into this for quite a while now and it's hard to tell what the root cause is, beyond the fact that a texture is added to the update list of a LayerTransactionChild (see UseTextures in ShadowLayers.cpp) and then destroyed (seemingly outside of the transaction), causing the parent side to receive the destruction of the texture before the OpUseTexture/Update message.
A band-aid that is probably not unreasonable would be to keep a list of strong references to all TextureClients queued for update in ShadowLayers.cpp, so that a TextureClient is not destroyed before the transaction is sent.
Ideally I'd find out whether the texture is indeed destroyed outside of the transaction, and how that is even possible given that queuing textures for updates happens in the transaction. Or maybe figure out why the destruction of the texture is not added to the current transaction if the texture is in fact destroyed during a transaction.
I'm still digging.
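
To make the band-aid idea concrete, here is a minimal sketch of a transaction that holds strong references to queued textures until it is sent. It is an illustration only, not the Gecko code: `Texture`, `Transaction`, `UseTexture`, `Send`, and `PendingCount` are hypothetical names, and `std::shared_ptr` stands in for whatever reference counting the real TextureClient uses.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <utility>
#include <vector>

// Hypothetical stand-in for a texture shared with the compositor.
struct Texture {
  explicit Texture(int aId) : id(aId) {}
  int id;
};

// Hypothetical transaction that queues texture updates until it is sent.
class Transaction {
 public:
  // Queue a texture for update. The transaction takes a strong reference,
  // so the texture cannot be destroyed before Send() runs.
  void UseTexture(std::shared_ptr<Texture> aTexture) {
    assert(aTexture != nullptr);
    mPending.push_back(std::move(aTexture));
  }

  // "Send" the transaction: collect the ids that would be serialized, then
  // drop the strong references now that the message has been built.
  std::vector<int> Send() {
    std::vector<int> ids;
    for (const auto& texture : mPending) {
      ids.push_back(texture->id);
    }
    mPending.clear();
    return ids;
  }

  std::size_t PendingCount() const { return mPending.size(); }

 private:
  std::vector<std::shared_ptr<Texture>> mPending;
};
```

With this shape, a caller that drops its own reference right after queuing the texture no longer triggers destruction ahead of the update message; the texture only goes away once `Send()` has cleared the pending list.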

Flags: needinfo?(nical.bugzilla)

Milan Sreckovic [:milan] (needinfo for best results)

Comment 3

•

8 years ago

(In reply to Nicolas Silva [:nical] from comment #2)
> ...
> I'm still digging.

Thanks.  I'd rather this didn't make it to aurora (we have until the end of the month.)

Assignee: nobody → nical.bugzilla

David Anderson [:dvander] - inactive, e-mail if emergency

Comment 4

•

8 years ago

(In reply to Nicolas Silva [:nical] from comment #2)
> I have been looking into this for quite a while now and it's hard to tell
> what the root cause is, beyond that a texture is added to the update list
> of a LayerTransactionChild (see UseTextures in ShadowLayers.cpp) and
> destroyed (seemingly outside of the transaction) causing the parent side to
> receive the destruction of the texture before the OpUseTexture/Update
> message.
> A band-aid that is probably not unreasonable would be to keep a list of
> strong references to all textureclients queued for update in
> ShadowLayers.cpp so that the TextureClient is not destroyed before the
> transaction is sent.
> Ideally I'd find out if the texture is indeed destroyed out of the
> transaction and how it is even possible since queuing textures for updates
> happens in the transaction. Or maybe figure out why the destruction of the
> texture is not added to the current transaction if the texture is in fact
> destroyed during a transaction.
> I'm still digging.

What's confusing is that this looks like a deserialization error. Generally those should be impossible. Sending a bad actor can cause one, but here it's not an actor that fails to deserialize: it's failing to decode a timestamp (PLayerTransactionParent.cpp:3223). MessageChannel only dispatches after receiving the full length of a message, so we would only expect this to happen if something were severely wrong, like the message dispatching on the wrong top-level actor, or a compromised child process, or pickle/unpickle code not being symmetric.
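
As a rough illustration of that failure mode (not the real IPC code), the sketch below serializes one message type and decodes it as another, the way a message dispatched to the wrong protocol would be. The `Writer`, `Reader`, `PingMsg`, and `PaintMsg` names are all hypothetical; the only point is that the read fails partway through, on the timestamp field.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical byte-buffer writer, loosely in the spirit of a pickle.
class Writer {
 public:
  template <typename T>
  void Write(const T& aValue) {
    static_assert(std::is_trivially_copyable<T>::value, "plain data only");
    const auto* bytes = reinterpret_cast<const unsigned char*>(&aValue);
    mBytes.insert(mBytes.end(), bytes, bytes + sizeof(T));
  }
  const std::vector<unsigned char>& Bytes() const { return mBytes; }

 private:
  std::vector<unsigned char> mBytes;
};

// Hypothetical reader: every Read() checks that enough bytes remain.
class Reader {
 public:
  explicit Reader(std::vector<unsigned char> aBytes)
      : mBytes(std::move(aBytes)) {}

  template <typename T>
  bool Read(T* aOut) {
    static_assert(std::is_trivially_copyable<T>::value, "plain data only");
    if (mBytes.size() - mOffset < sizeof(T)) {
      return false;  // Ran out of data: the deserialization error.
    }
    std::memcpy(aOut, mBytes.data() + mOffset, sizeof(T));
    mOffset += sizeof(T);
    return true;
  }

 private:
  std::vector<unsigned char> mBytes;
  std::size_t mOffset = 0;
};

// Two hypothetical messages belonging to different protocols.
struct PingMsg {
  std::uint32_t id;
};
struct PaintMsg {
  std::uint32_t id;
  std::uint64_t timestamp;
};

void WritePing(Writer& aWriter, const PingMsg& aMsg) { aWriter.Write(aMsg.id); }

void WritePaint(Writer& aWriter, const PaintMsg& aMsg) {
  aWriter.Write(aMsg.id);
  aWriter.Write(aMsg.timestamp);
}

// Decoder for PaintMsg. Fed a PingMsg, it reads the id and then fails on the
// timestamp, even though the sender wrote a perfectly valid message.
bool ReadPaint(Reader& aReader, PaintMsg* aOut) {
  assert(aOut != nullptr);
  return aReader.Read(&aOut->id) && aReader.Read(&aOut->timestamp);
}
```

Neither side is buggy in isolation here: writer and reader are symmetric for each message type, and the failure only appears when the bytes reach a decoder that expects a different layout.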

Nicolas Silva [:nical]

Assignee

Comment 5

•

8 years ago

(In reply to David Anderson [:dvander] from comment #4)
> [...] like the message dispatching on the
> wrong top-level actor

Oooooooh! Somehow I was convinced that the child process would check that, and that a crash in the parent process could therefore not be caused by the child process sending with the wrong top-level actor.
I just hacked something up to check and indeed, the symptoms are the same when that happens. I'll add some checks to ShadowLayers.cpp and hopefully we'll get more interesting crash information.

Nicolas Silva [:nical]

Assignee

Comment 6

•

8 years ago

Attached patch: Check that the textures we are sending are on the same channel on the child side.

This is not the ideal fix. I first tried to get the IPDL generator to emit checks that actors have the proper top-level protocol when serializing their handles, but modifying the IPDL generator is ridiculously hard and I want to get better crash data asap, so this will do in the short term.
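
The kind of child-side check described here could look roughly like the sketch below. It is not the attached patch: `Channel`, `TextureActor`, `LayerTransaction`, and `Queued` are hypothetical names, and where the landed change crashes the content process (per the push message), this sketch simply refuses the texture and returns false so the behaviour is easy to test.

```cpp
#include <cassert>
#include <vector>

// Hypothetical top-level IPC channel, identified by a number.
struct Channel {
  int id;
};

// Hypothetical texture actor that remembers the channel it was created on.
struct TextureActor {
  int handle;
  const Channel* channel;
};

// Hypothetical child-side transaction bound to a single channel.
class LayerTransaction {
 public:
  explicit LayerTransaction(const Channel* aChannel) : mChannel(aChannel) {
    assert(aChannel != nullptr);
  }

  // Queue a texture only if it lives on the same channel as this
  // transaction. A mismatch is rejected on the sending side, before
  // anything is serialized, instead of surfacing later as a decoding
  // failure in the parent process.
  bool UseTexture(const TextureActor& aTexture) {
    if (aTexture.channel != mChannel) {
      return false;
    }
    mQueued.push_back(aTexture.handle);
    return true;
  }

  const std::vector<int>& Queued() const { return mQueued; }

 private:
  const Channel* mChannel;
  std::vector<int> mQueued;
};
```

Doing the comparison at queue time keeps the failure close to the code that made the mistake, which is what makes the resulting crash reports more informative.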

Attachment #8773705 - Flags: review?(dvander)

David Anderson [:dvander] - inactive, e-mail if emergency

Updated

•

8 years ago

Attachment #8773705 - Flags: review?(dvander) → review+

Pulsebot

Comment 7

•

8 years ago

Pushed by nsilva@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/8c4e144e372b
Crash the content process instead of the parent when an actor is sent on the wrong channel. r=dvander

Nicolas Silva [:nical]

Assignee

Updated

•

8 years ago

Whiteboard: [gfx-noted] → [gfx-noted][leave-open]

Carsten Book [:Tomcat]

Comment 8

•

8 years ago

bugherder

https://hg.mozilla.org/mozilla-central/rev/8c4e144e372b

Milan Sreckovic [:milan] (needinfo for best results)

Comment 9

•

8 years ago

Is this being followed up in bug 1291296 now?  It seems this assert is hitting there, and we have some patches there.  Should we close this?

Flags: needinfo?(nical.bugzilla)

Nicolas Silva [:nical]

Assignee

Comment 10

•

8 years ago

Yes, let's close it now that we know that the crash has moved.

Flags: needinfo?(nical.bugzilla)

Nicolas Silva [:nical]

Assignee

Updated

•

8 years ago

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → FIXED

Whiteboard: [gfx-noted][leave-open] → [gfx-noted]

Nicolas Silva [:nical]

Assignee

Comment 11

•

8 years ago

Fixed in 50 as part of bug 1291296.

status-firefox50: affected → fixed

Categories

(Core :: Graphics: Layers, defect)

Security

(public)
