Closed Bug 1125030 Opened 5 years ago Closed 5 years ago

Got abort in PVsync.cpp in b2g-ics-debug emulator mochitest

Categories

(Core :: Graphics, defect)

ARM
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
firefox36 --- wontfix
firefox37 --- wontfix
firefox38 --- fixed
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: jerry, Assigned: jerry)

References

Details

Attachments

(1 file, 4 obsolete files)

From https://treeherder.mozilla.org/#/jobs?repo=try&revision=95a920d7f6ec ,
There are some M tests failed with the following log:

01-20 19:25:42.444 I/Gecko ( 779): [Child 779] ###!!! ABORT: corrupted actor state: file PVsync.cpp, line 34
01-20 19:25:42.514 E/Gecko ( 779): mozalloc_abort: [Child 779] ###!!! ABORT: corrupted actor state: file PVsync.cpp, line 34

It seems we use the PVsyncChild actor after free. I will check the cleanup flow for refresh driver and the actor.
Blocks: 1123762
No longer depends on: 1123762
The try result for attachment 8554481 [details] [diff] [review].
There is no abort for child actor in test M2, M3, M4 and M6.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fb783798915e

But we still have timeout test failed.
Comment on attachment 8554481 [details] [diff] [review]
Handle RefreshDriverVsyncObserver shutdown in VsyncChild ActorDestroy(). v1

Review of attachment 8554481 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsRefreshDriver.cpp
@@ +354,5 @@
>  
> +    // When PVsyncChild receives ActorDestroy() or refresh driver timer
> +    // shutdowns, we call this shutdown() to detach current observer from
> +    // PVsyncChild. Then the PVsyncChild will no longer be used by this observer.
> +    virtual void Shutdown() MOZ_OVERRIDE

Hmm, doesn't this mean we actually call Shutdown twice? Once with ActorDestroy and a second time with the refresh driver timer shutting down? What's the current proper shutdown sequence?

@@ +360,5 @@
>        MOZ_ASSERT(NS_IsMainThread());
>        mVsyncRefreshDriverTimer = nullptr;
> +
> +      if (mVsyncDispatcher && XRE_IsParentProcess()) {
> +        // parent process

nit: delete //parent process

@@ +364,5 @@
> +        // parent process
> +        mVsyncDispatcher->SetParentRefreshTimer(nullptr);
> +        mVsyncDispatcher = nullptr;
> +      } else if (mVsyncChild) {
> +        // child process

nit: delete // child process

@@ +372,5 @@
> +
> +    void StartTimer()
> +    {
> +      if (mVsyncDispatcher && XRE_IsParentProcess()) {
> +        // parent process

same

::: layout/ipc/VsyncChild.cpp
@@ +59,5 @@
>  }
>  
>  bool
>  VsyncChild::RecvNotify(const TimeStamp& aVsyncTimestamp)
> +{  MOZ_ASSERT(NS_IsMainThread());

nit: move to line below.

@@ +64,1 @@
>    if (mObservingVsync && mObserver) {

We should also make sure that if we shutdown, we shouldn't notify the refresh driver of vsyncs.
I asked :bent. On opt builds, we don't do a full XPCom shutdown for PBackground, so we won't get the ActorDestroy notifications. so we have to potentially have two shutdown paths :/
Update the shutdown flow in comment.

During xpcom shutdown, we release the PVsyncChild actor before refresh driver, but we
might still use that actor to send message(e.g. We call VsyncChild::SendUnobserve() in RemoveRefreshDriver()).
This patch tries to cleanup RefreshDriverVsyncObserver in ActorDestroy().
Attachment #8554481 - Attachment is obsolete: true
Comment on attachment 8555206 [details] [diff] [review]
Handle RefreshDriverVsyncObserver shutdown in VsyncChild ActorDestroy(). v2

Review of attachment 8555206 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsRefreshDriver.cpp
@@ +293,4 @@
>    {
>      MOZ_ASSERT(!XRE_IsParentProcess());
>      MOZ_ASSERT(NS_IsMainThread());
> +    mVsyncObserver = new RefreshDriverVsyncObserver(this, aVsyncChild);

Move VsyncChild actor and VsyncDispatcher into mVsyncObserver, and refresh driver will not use the actor and VsyncDispatcher directly.

::: layout/ipc/VsyncChild.cpp
@@ +61,5 @@
>  bool
>  VsyncChild::RecvNotify(const TimeStamp& aVsyncTimestamp)
>  {
>    MOZ_ASSERT(NS_IsMainThread());
>    if (mObservingVsync && mObserver) {

After ActorDestroy(), we should not receive any RecvNotify() call.
Should we still need to check mIsShutdown flag?
Attachment #8555206 - Flags: review?(bugmail.mozilla)
Attachment #8555206 - Flags: review?(bent.mozilla)
(In reply to Mason Chang [:mchang] from comment #4)
> I asked :bent. On opt builds, we don't do a full XPCom shutdown for
> PBackground, so we won't get the ActorDestroy notifications. so we have to
> potentially have two shutdown paths :/

I don't really understand the problem. If we just call exit(0) in child processes with optimized builds, I think the parent actor should receive ActorDestroy() and then we do the normal clean up at parent side.
Do you mean we have some problem at child side with opt build?
Comment on attachment 8555206 [details] [diff] [review]
Handle RefreshDriverVsyncObserver shutdown in VsyncChild ActorDestroy(). v2

Review of attachment 8555206 [details] [diff] [review]:
-----------------------------------------------------------------

Looks kinda ok to me but I want to see the Shutdown renamed to Destroy as per my comment below.

::: layout/base/nsRefreshDriver.cpp
@@ +363,3 @@
>        mVsyncRefreshDriverTimer = nullptr;
> +
> +      if (mVsyncDispatcher && XRE_IsParentProcess()) {

The XRE_IsParentProcess() check here is redundant, right? If mVsyncDispatcher is non-null then we must be in the parent process. Same in StartTimer and StopTimer.

::: widget/VsyncDispatcher.h
@@ +27,5 @@
>    virtual bool NotifyVsync(TimeStamp aVsyncTimestamp) = 0;
>  
> +  // Cleanup VsyncObserver for shutdown phase. Subclass can implement this
> +  // function if needs.
> +  virtual void Shutdown() {};

Rename this to Destroy and make it pure virtual. Then make the CompositorVsyncObserver::Destroy be the implementation for this. Basically if you're going to add this function to the interface then you should ensure it actually gets called in all places that use a VsyncObserver, and that includes the compositor.
Attachment #8555206 - Flags: review?(bugmail.mozilla) → feedback+
Comment on attachment 8555206 [details] [diff] [review]
Handle RefreshDriverVsyncObserver shutdown in VsyncChild ActorDestroy(). v2

Review of attachment 8555206 [details] [diff] [review]:
-----------------------------------------------------------------

I think we still have a problem here since Shutdown can be called twice. One by ActorDestroy in the VsyncParent/Child and again in nsRefreshDriver::Shutdown. This version assumes that ActorDestroy is called before nsRefreshDriver::Shutdown, but in opt builds this isn't the case. I think we're better off assuming in the nsRefreshDriver::Shutdown that the IPC pipe is already closed. So the nsRefreshDriver just cleans up what's used there and doesn't make any IPC calls and the VsyncParent/Child just clean up in ActorDestroy or the destructor.

::: layout/ipc/VsyncChild.cpp
@@ +61,5 @@
>  bool
>  VsyncChild::RecvNotify(const TimeStamp& aVsyncTimestamp)
>  {
>    MOZ_ASSERT(NS_IsMainThread());
>    if (mObservingVsync && mObserver) {

I would just assert that mIsShutdown is false here then.
Comment on attachment 8555206 [details] [diff] [review]
Handle RefreshDriverVsyncObserver shutdown in VsyncChild ActorDestroy(). v2

Review of attachment 8555206 [details] [diff] [review]:
-----------------------------------------------------------------

There's nothing really specific to IPC here so I'll defer to Mason.
Attachment #8555206 - Flags: review?(bent.mozilla) → review?(mchang)
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #7)
> (In reply to Mason Chang [:mchang] from comment #4)
> > I asked :bent. On opt builds, we don't do a full XPCom shutdown for
> > PBackground, so we won't get the ActorDestroy notifications. so we have to
> > potentially have two shutdown paths :/
> 
> I don't really understand the problem. If we just call exit(0) in child
> processes with optimized builds, I think the parent actor should receive
> ActorDestroy() and then we do the normal clean up at parent side.
> Do you mean we have some problem at child side with opt build?

I think it means that we just call exit(0) so ActorDestroy never gets called on the child side.
Comment on attachment 8555206 [details] [diff] [review]
Handle RefreshDriverVsyncObserver shutdown in VsyncChild ActorDestroy(). v2

Review of attachment 8555206 [details] [diff] [review]:
-----------------------------------------------------------------

From https://hg.mozilla.org/try/rev/952337d8e324 - I think we only need to ensure that during the nsRefreshDriver shutdown, we assume that the VsyncChild has already been cleaned up by IPDL. The VsyncChild is a dangling pointer and can crash. We can greatly simplify this patch to just not read the vsync child and set it to nullptr. Please update this patch, thanks!
Attachment #8555206 - Flags: review?(mchang)
And "nsRefreshDriver" will call RemoveRefreshDriver() before that.
So during the shutdown, we will not call SendObserve/SendUnobserver to use
VsyncChild actor. The actor will be released by PBackgroundChild, and we can
just clear the mVsyncChild in ~VsyncRefreshDriverTimer().


This try handles the PVsync abort and use "compress" for Notify() message.
It looks good.
https://treeherder.mozilla.org/#/jobs?repo=try&revision=eae91449fd16
Attachment #8555930 - Flags: review?(bent.mozilla)
(In reply to Jerry Shih[:jerry] (UTC+8) from comment #13)
> Created attachment 8555930 [details] [diff] [review]
> I think VsyncChild actor and refresh driver are available before xpcom
> shutdown.
> 
> And "nsRefreshDriver" will call RemoveRefreshDriver() before that.
> So during the shutdown, we will not call SendObserve/SendUnobserver to use
> VsyncChild actor. The actor will be released by PBackgroundChild, and we can
> just clear the mVsyncChild in ~VsyncRefreshDriverTimer().
> 
> 
> This try handles the PVsync abort and use "compress" for Notify() message.
> It looks good.
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=eae91449fd16

This try still needs to enable the refresh driver pref. But this try here combines both and looks better - https://hg.mozilla.org/try/rev/952337d8e324. The patch still applies.
Attachment #8555206 - Attachment is obsolete: true
Comment on attachment 8555930 [details] [diff] [review]
I think VsyncChild actor and refresh driver are available before xpcom shutdown.

Review of attachment 8555930 [details] [diff] [review]:
-----------------------------------------------------------------

We should also ensure mIsObservingVsync = false explicitly in ActorDestroy.
https://dxr.mozilla.org/mozilla-central/source/layout/ipc/VsyncChild.cpp?from=VsyncChild.cpp&case=true#51
Comment on attachment 8555930 [details] [diff] [review]
I think VsyncChild actor and refresh driver are available before xpcom shutdown.

Review of attachment 8555930 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/base/nsRefreshDriver.cpp
@@ +367,5 @@
>        mVsyncDispatcher = nullptr;
>      } else {
> +      // During the shutdown process, PVsyncChild actors is destroyed before
> +      // refresh driver. So the mVsyncChild will no longer be used. Just clear
> +      // to nullptr here.

I don't think this is what you want... You're basically saying "at some time in the recent past this pointer became invalid"... Why not just reach in and null it out when the actor is destroyed?
Attachment #8555930 - Flags: review?(bent.mozilla) → review-
Hmm, I thought in opt builds, the PBackgroundImpl doesn't call ActorDestroy. So the only other applicable time would be when the VsyncChild's destructor runs. I'm not sure we're guaranteed that the destructor will run before the nsRefreshDriver::Shutdown will run. Unless I misunderstood something within the shutdown procedure.
Flags: needinfo?(bent.mozilla)
In opt builds if we exit() before VsyncChild::ActorDestroy then this isn't a problem because a) we won't have a dangling pointer since nothing got deleted, and b) nsRefreshDriver::Shutdown won't try to use it because it won't run either, right?
Flags: needinfo?(bent.mozilla)
Hm ok, then I guess in the opt case we don't have to worry about much. Alright, then the original approach may be better here and in VsyncChild::ActorDestroy, we clean everything up and we don't have to do anything else in nsRefreshDriver::Shutdown.
Attachment #8555206 - Attachment is obsolete: false
In attachment 8555206 [details] [diff] [review], we should have an additional interface and call it in
ActorDestroy(). It will change all VsyncObserver.

This patch just make VsyncChild become ref-counting. RefreshDriver
and BackgroundChild both hold the ref-count for the actor. After actor dies,
we setup a shutdown flag in actor and this actor will not send any ipc message
anymore. And we might not suffer the VsyncChild::ActorDestroy() and RefreshDriver
dtor sequence problem(maybe we call RefreshDriver dtor before ActorDestroy() in
the future).

Ben, what do you think of this patch and attachment 8555206 [details] [diff] [review]?
Attachment #8555930 - Attachment is obsolete: true
Attachment #8556357 - Flags: review?(bent.mozilla)
Comment on attachment 8556357 [details] [diff] [review]
Handle VsyncChild child shutdown in ActorDestroy(). v1

Review of attachment 8556357 [details] [diff] [review]:
-----------------------------------------------------------------

This will work fine, but it seems more complicated than simply nulling out mVsyncChild in nsRefreshDriver... I'm fine either way I guess.
Attachment #8556357 - Flags: review?(bent.mozilla) → review+
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #21)
> Comment on attachment 8556357 [details] [diff] [review]
> Handle VsyncChild child shutdown in ActorDestroy(). v1
> 
> Review of attachment 8556357 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This will work fine, but it seems more complicated than simply nulling out
> mVsyncChild in nsRefreshDriver... I'm fine either way I guess.

Talk with Ben on irc.
If we want to null the mVsyncChild in nsRefreshDriver, we need another interface in VsyncObserver.
We don't hold concrete class in VsyncChild.
update commit message.
Attachment #8556357 - Attachment is obsolete: true
Attachment #8556933 - Attachment description: Handle VsyncChild child shutdown in ActorDestroy(). v2 → Handle VsyncChild child shutdown in ActorDestroy(). v2 r=bent
https://hg.mozilla.org/mozilla-central/rev/e51f02d321f9
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla38
Attachment #8555206 - Attachment is obsolete: true
Comment on attachment 8556933 [details] [diff] [review]
Handle VsyncChild child shutdown in ActorDestroy(). v2 r=bent

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 1123762, refresh driver for silk
User impact if declined: We cannot enable silk for the refresh driver
Testing completed: Manual testing and mochitests.
Risk to taking this patch (and alternatives if risky): Low. This patch ensures that the refresh driver is shut down correctly and is required to pass mochitests with the refresh driver enabled.
String or UUID changes made by this patch: None
Attachment #8556933 - Flags: review+
Attachment #8556933 - Flags: approval-mozilla-b2g37?
Attachment #8556933 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
You need to log in before you can comment on or make changes to this bug.