designMode crash [@ mozilla::a11y::EventQueue::PushEvent]

VERIFIED FIXED in mozilla28

Status

()

defect
--
critical
VERIFIED FIXED
6 years ago
6 years ago

People

(Reporter: jruderman, Assigned: tbsaunde)

Tracking

(Blocks 2 bugs, {crash, regression, testcase})

27 Branch
mozilla28
x86_64
All
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(crash signature)

Attachments

(3 attachments)

Posted file testcase
1. Enable accessibility, e.g. by pasting the following into the Browser Console:

Components.classes["@mozilla.org/accessibilityService;1"].getService(Components.interfaces.nsIAccessibleRetrieval);

2. Load the testcase.
3. Click the button.

Result: Crsah [@ mozilla::a11y::EventQueue::PushEvent]

I can only reproduce in a debug build.
Posted file stack
Regression window(m-i)
Good:
http://hg.mozilla.org/integration/mozilla-inbound/rev/9b1ab0fcde02
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0 ID:20131017082616
Bad:
http://hg.mozilla.org/integration/mozilla-inbound/rev/dca0f18f3e86
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:27.0) Gecko/20100101 Firefox/27.0 ID:20131017084312
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=9b1ab0fcde02&tochange=dca0f18f3e86

Regressed by:Bug 923331
Blocks: 923331
Keywords: regression
OS: Mac OS X → All
Version: Trunk → 27 Branch
Assignee: nobody → wmccloskey
Would you mind taking a look at this Trevor? I can't see how my changes are causing this problem, and I don't understand the code enough to debug it myself.

Only one line of my checkin is needed to trigger the crash. If you remove the disablehistory attribute from tabbrowser.xml, it triggers the crash. I can't figure out how this might be related though.

As far as I can tell, we create a new DocAccessible object, call Init on it, and then call Shutdown on it. The Shutdown call happens via this stack:

#0  mozilla::a11y::DocAccessible::Shutdown (this=0x7fffbf593ec0)
    at /home/billm/mozilla/in1/accessible/src/generic/DocAccessible.cpp:594
#1  0x00007fffefc6126a in mozilla::a11y::OuterDocAccessible::InsertChildAt (this=0x7fffc02cbca0, aIdx=1, 
    aAccessible=0x7fffbf594090) at /home/billm/mozilla/in1/accessible/src/generic/OuterDocAccessible.cpp:165
#2  0x00007fffefbf19fa in mozilla::a11y::Accessible::AppendChild (this=0x7fffc02cbca0, aChild=0x7fffbf594090)
    at /home/billm/mozilla/in1/accessible/src/base/../generic/Accessible.h:334
#3  0x00007fffefbfb817 in mozilla::a11y::NotificationController::WillRefresh (this=0x7fffbe9dcb80, aTime=...)
    at /home/billm/mozilla/in1/accessible/src/base/NotificationController.cpp:223
#4  0x00007ffff023b117 in nsRefreshDriver::Tick (this=0x7fffc1bcd400, aNowEpoch=1382995139310631, aNowTime=...)
    at /home/billm/mozilla/in1/layout/base/nsRefreshDriver.cpp:1074
#5  0x00007ffff023cd03 in mozilla::RefreshDriverTimer::TickDriver (driver=0x7fffc1bcd400, jsnow=1382995139310631, 
    now=...) at /home/billm/mozilla/in1/layout/base/nsRefreshDriver.cpp:168

The Shutdown() call then calls RemoveEventListeners(). In RemoveEventListeners(), the |container| variable is NULL and we print this assertion:

###!!! ASSERTION: doc should support nsIDocShellTreeItem.: 'docShellTreeItem', file /home/billm/mozilla/in1/accessible/src/generic/DocAccessible.cpp, line 732

As a consequence, the obs_documentCreated observer is never removed. Eventually, something triggers the observer and it crashes with a NULL deref on mNotificationController (which was nulled out when we called Shutdown).

It seems like we need to find a way to remove the observer. However, I don't understand what it means for container to be NULL. Maybe that's the real source of the problem?
Flags: needinfo?(trev.saunders)
(In reply to Bill McCloskey (:billm) from comment #3)
> Would you mind taking a look at this Trevor? I can't see how my changes are
> causing this problem, and I don't understand the code enough to debug it
> myself.

yeah, I was already looking because I suspected this wasn't really related to whatever you did.

When I reproduce this DocAccessible::Shutdown() is being called from PresShell::Destroy.  At that point mDocumentNode->mDocumentContainer is null, so the document has no docshell.  document->GetWindow() and presShell->PresContext->GetContainer() are also null so those aren't available as ways to find the docshell.  I talked with smaug about this some  and he seemed to think the thing to do was shut the doc accessible down earlier.  Surkov and Boris you guys worked on bug 571459 where shutting doc accessibles down in PresShell::Destroy was added, is there a reason it needs to be that late?
Flags: needinfo?(trev.saunders)
Flags: needinfo?(surkov.alexander)
Flags: needinfo?(bzbarsky)
Thanks Trevor. Unassigning myself.
Assignee: wmccloskey → nobody
(In reply to Trevor Saunders (:tbsaunde) from comment #4)

> When I reproduce this DocAccessible::Shutdown() is being called from
> PresShell::Destroy.  At that point mDocumentNode->mDocumentContainer is
> null, so the document has no docshell.  document->GetWindow() and
> presShell->PresContext->GetContainer() are also null so those aren't
> available as ways to find the docshell.  I talked with smaug about this some
> and he seemed to think the thing to do was shut the doc accessible down
> earlier.  Surkov and Boris you guys worked on bug 571459 where shutting doc
> accessibles down in PresShell::Destroy was added, is there a reason it needs
> to be that late?

I think PresShell::Destroy was introduced to make sure the document accessible gets shutdown when presshell gets destroyed, so if you need to move the hook slightly in time then it should be fine.
Flags: needinfo?(surkov.alexander)
Flags: needinfo?(bzbarsky)
(In reply to alexander :surkov from comment #6)
> (In reply to Trevor Saunders (:tbsaunde) from comment #4)
> 
> > When I reproduce this DocAccessible::Shutdown() is being called from
> > PresShell::Destroy.  At that point mDocumentNode->mDocumentContainer is
> > null, so the document has no docshell.  document->GetWindow() and
> > presShell->PresContext->GetContainer() are also null so those aren't
> > available as ways to find the docshell.  I talked with smaug about this some
> > and he seemed to think the thing to do was shut the doc accessible down
> > earlier.  Surkov and Boris you guys worked on bug 571459 where shutting doc
> > accessibles down in PresShell::Destroy was added, is there a reason it needs
> > to be that late?
> 
> I think PresShell::Destroy was introduced to make sure the document
> accessible gets shutdown when presshell gets destroyed, so if you need to
> move the hook slightly in time then it should be fine.

Ok, does PresShell::Destroy() always get called from nsDocumentViewer::Destroy() through DestroyPresShell()? If so I guess moving shutting the doc accessible down to just before the point in nsDocumentViewerImpl::Destroy() where mDocument->SetContainer(nullptr) is called should work right?
Flags: needinfo?(bzbarsky)
> Ok, does PresShell::Destroy() always get called from nsDocumentViewer::Destroy() through
> DestroyPresShell()?

In general, probably not.  But in the normal cases, I suspect yes.

One optioin is to shut down in nsDocumentViewerImpl::Destroy _and_ in PresShell::Destroy, as long as shutdown is idempotent.
Flags: needinfo?(bzbarsky)
Accessible::SHutdown and over rides should be itempotent so this should be safe.
Attachment #824259 - Flags: review?(bzbarsky)
Comment on attachment 824259 [details] [diff] [review]
bug 931399 - shutdown DocAccessible's when the related docshell is destroyed

You should null-check mPresShell here; in a display:none iframe it'll be null.  Add a test for that if we have none so far.

r=me
Attachment #824259 - Flags: review?(bzbarsky) → review+
I pushed to try without the mPresShell check to see what goes boom, but added and landed https://hg.mozilla.org/integration/mozilla-inbound/rev/6a613a7288e1
(In reply to Wes Kocher (:KWierso) from comment #12)
> Looks like this broke builds on at least Windows

should be just windows (thank you sooooo much windows.h

(In reply to Boris Zbarsky [:bz] from comment #10)
> You should null-check mPresShell here; in a display:none iframe it'll be
> null.  Add a test for that if we have none so far.

try push without null check https://tbpl.mozilla.org/?tree=Try&rev=2fd859dcef70 looks like the only thing that crshes is b-c in some social test.  So should I add  a crash test that just removes a display none iframe from the DOM?
Worth it, if that crashes without the null-check.
https://hg.mozilla.org/mozilla-central/rev/0b1d63d30f0a
Assignee: nobody → trev.saunders
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla28
This patch was backed out, so the testcase in comment 0 crashes for me again.

changeset:   http://hg.mozilla.org/mozilla-central/rev/5d40a8f1b00b
user:        Trevor Saunders
date:        Mon Nov 18 12:31:56 2013 -0500
summary:     revert bug 931399 to see if it helps with bug 883059 at all
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
backing this out doesn't appear to have improved crashiness of [@ nsTArray_Impl<nsRefPtr<mozilla::a11y::AccEvent>, nsTArrayInfallibleAllocator>::AppendElements<mozilla::a11y::AccEvent*>(mozilla::a11y::AccEvent* const*, unsigned int)] (bug 883059)
https://hg.mozilla.org/mozilla-central/rev/17760eeea357
Status: REOPENED → RESOLVED
Closed: 6 years ago6 years ago
Resolution: --- → FIXED
I get an assertion using the STR in comment 0 on Nightly 2013-10-26-mozilla-central-debug:
"###!!! ASSERTION: doc should support nsIDocShellTreeItem.: 'docShellTreeItem', f
ile e:/builds/moz2_slave/m-cen-w32-d-000000000000000000/build/accessible/src/gen
eric/DocAccessible.cpp, line 726"

No assertion/crash in Nightly 2013-12-08-mozilla-central-debug, Win 7 x64.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.