Closed Bug 966284 Opened 11 years ago Closed 11 years ago

Fix gfx IPDL protocol shutdown

Tracking

()

Status:

RESOLVED FIXED

Milestone:

mozilla36

People

(Reporter: nical, Assigned: nical)

References

Details

Attachments

(1 file, 1 obsolete file)

Fix the shut down sequence of PLayerTransaction 11 years ago Nicolas Silva [:nical] 10.78 KB, patch	sotaro : review+	Details \| Diff \| Splinter Review
Fix the shutdown sequence of PLayerTransaction (carrying r=sotaro) 11 years ago Nicolas Silva [:nical] 6.68 KB, patch		Details \| Diff \| Splinter Review

Nicolas Silva [:nical]

Assignee

Description

•

11 years ago

The shutdown the synchronous child-to-parent shutdown of some protocols (such as PLayerTransaction) can race with async parent-to-child messages of their managed protocols (such as PTexture). We need to redesign tear down logic making sure the shutdown of a manager protocol never race with its sub protocols.

Nicolas Silva [:nical]

Assignee

Updated

•

11 years ago

Blocks: 952404

Nicolas Silva [:nical]

Assignee

Comment 1

•

11 years ago

I propose that we do the following: 1) At the protocol level: Almost all of the gfx protocol deletion is triggered by the child side. To make sure that deletion doesn't cross a message coming form the other side, we need a simple hand shake such as FooChild::SendShutdown -> FooParent::RecvShutdown -> FooParent::Send__delete__ Plus carefully keeping a state in FooChild so it knows that after it has sent the destroy message it can't send any other message (although it might receive some) This is how PTexture already works. 2) At the protocol hierarchy level: To avoid that a manager protocol races with it managees, we also need the manager to shutdown with a similar hand shake. The manager must also iterate over it's managees to trigger their destruction sequence as well as in the diagram below: FooManager::Destroy() | Go through all of PFooManager's managed | protocols, let them send their last | potential message, and mark them so | that they can't send any more message. PFooManager::SendShutDown --. \ \ ... can receive and send ... ... \ `--> PFooManager::RecvShutDown can receive | Go through all of PFooManager's managed but not send | protocols, let them send their last | potential message, and mark them so ... | that they can't send any more message. .-- PFooManager::Send__delete__ / / ... can't receive nor send ... / PFooManager::Recv__delete__ <--' 3) Shutting down the channel: When we shut down the channel we need to destroy the top level protocol synchronously with respect to the main thread and also need to accommodate with the managed protocols having an asynchronous shutdown. Right now we do that with two synchronous messages (WillStop followed by Stop). the first triggers all the deletion from the other side (after it returns the other side can't send message anymore), the second is just there to ensure that all of the messages generated by the destruction on the other side has reached the child side's event loop. In order to make it work with 1) and 2) we just need to iterate over our managees and trigger their shutdown before sending WillStop. So there is almost no change to be made there.

Nicolas Silva [:nical]

Assignee

Comment 2

•

11 years ago

Attached patch Fix the shut down sequence of PLayerTransaction (obsolete) — Details — Splinter Review

This patch fixes the shutdown problem by making PlayerTransaction do what I proposed earlier in this bug. This makes sure that destroying PLayerTransaction will not race with PTexture messages.

Assignee: nobody → nical.bugzilla

Attachment #8375544 - Flags: review?(sotaro.ikeda.g)

Sotaro Ikeda [:sotaro]

Comment 3

•

11 years ago

Comment on attachment 8375544 [details] [diff] [review] Fix the shut down sequence of PLayerTransaction Review of attachment 8375544 [details] [diff] [review]: ----------------------------------------------------------------- Looks good!

Attachment #8375544 - Flags: review?(sotaro.ikeda.g) → review+

Nicolas Silva [:nical]

Assignee

Updated

•

11 years ago

Blocks: 972444

Nicolas Silva [:nical]

Assignee

Comment 4

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/9e7cf0b1d80c

Nicolas Silva [:nical]

Assignee

Comment 5

•

11 years ago

(In reply to Nicolas Silva [:nical] from comment #4) > https://hg.mozilla.org/integration/mozilla-inbound/rev/9e7cf0b1d80c sorry that was for another bug

Nicolas Silva [:nical]

Assignee

Comment 6

•

11 years ago

try push https://tbpl.mozilla.org/?tree=Try&rev=2980d34d00fd

Nicolas Silva [:nical]

Assignee

Comment 7

•

11 years ago

another try push https://tbpl.mozilla.org/?tree=Try&rev=fa569276a9ce

Nicolas Silva [:nical]

Assignee

Comment 8

•

11 years ago

I haven't looked at this for a while but it looks like it's close to be landable. The problem is that it doesn't seem to be working well on the non-omtc platforms (surprisingly), as this try push shows https://tbpl.mozilla.org/?tree=Try&rev=78b0483db68c We get some MsgRouteError in PContentChild which this patch doesn't modify directly. Ben, does that ring any bell? What this patch does is mostly have PLayerTransaction send a Shutdown message from the client side and have the parent side send __delete__ rather than haing the child side send __delete__ directly.

Flags: needinfo?(bent.mozilla)

Ben Turner (not reading bugmail, use the needinfo flag!)

Comment 9

•

11 years ago

Hrm. Usually this means that the child had a message in its queue for an actor that has been deleted by the time the message is processed. The only real way to figure out which message it is is to catch it in a debugger and look at the message type directly... It's possible that your patch just changed timing somehow and exposed a latent bug. Or maybe the parent layer actors hold strong refs to other PContent things, and this patch changes the order of releases?

Flags: needinfo?(bent.mozilla)

Nicolas Silva [:nical]

Assignee

Comment 10

•

11 years ago

This try push doesn't have the mysterious PContent issues, although bc1 is failing with some ipc related stuff. https://tbpl.mozilla.org/?tree=Try&rev=fa301b907093

Nicolas Silva [:nical]

Assignee

Comment 11

•

11 years ago

quick fix for bc1, this push looks neat (not all tests are done as I write this): https://tbpl.mozilla.org/?tree=Try&rev=d195ff8212ed

Nicolas Silva [:nical]

Assignee

Comment 12

•

11 years ago

Attached patch Fix the shutdown sequence of PLayerTransaction (carrying r=sotaro) — Details — Splinter Review

patch with the bc1 fix.

Attachment #8375544 - Attachment is obsolete: true

Nicolas Silva [:nical]

Assignee

Comment 13

•

11 years ago

https://hg.mozilla.org/integration/mozilla-inbound/rev/9d614a6431ae

Wes Kocher (:KWierso) (Not reading bugmail; email directly if needed)

Comment 14

•

11 years ago

https://hg.mozilla.org/mozilla-central/rev/9d614a6431ae

Status: NEW → RESOLVED

Closed: 11 years ago

Resolution: --- → FIXED

Target Milestone: --- → mozilla36

Milan Sreckovic [:milan] (needinfo for best results)

Updated

•

10 years ago

Bugzilla

Fix gfx IPDL protocol shutdown

Categories

(Core :: Graphics: Layers, defect)

Tracking

()

People

(Reporter: nical, Assigned: nical)

References

Details

Crash Data

Security

(public)

User Story

Attachments

(1 file, 1 obsolete file)

Description

Updated

Comment 1

Comment 2

Comment 3

Updated

Comment 4

Comment 5

Comment 6

Comment 7

Comment 8

Comment 9

Comment 10

Comment 11

Comment 12

Comment 13

Comment 14

Updated

Attachment

General

Description

File Name

Content Type