Closed Bug 634200 Opened 13 years ago Closed 12 years ago

crash [@ nsIFrame::GetStyleVisibility() ]

Categories

(Core :: Disability Access APIs, defect)

defect
Not set
critical

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: surkov, Assigned: surkov)

References

Details

(Keywords: access, crash, Whiteboard: not-ready)

Attachments

(1 file, 1 obsolete file)

example: https://crash-stats.mozilla.com/report/index/65b710b0-62d7-41c3-8596-3ec062110214

stack:
0 	xul.dll 	nsIFrame::GetStyleVisibility 	layout/style/nsStyleStructList.h:103
1 	xul.dll 	nsBulletFrame::GetListItemText 	layout/generic/nsBulletFrame.cpp:1255
2 	xul.dll 	nsBlockFrame::GetBulletText 	layout/generic/nsBlockFrame.cpp:6633
3 	xul.dll 	nsHTMLListBulletAccessible::AppendTextTo 	accessible/src/html/nsHTMLTextAccessible.cpp:400
4 	xul.dll 	nsAccUtils::TextLength 	accessible/src/base/nsAccUtils.cpp:650

https://crash-stats.mozilla.com/report/list?range_value=2&range_unit=weeks&date=2011-02-14%2019%3A00%3A00&signature=nsIFrame%3A%3AGetStyleVisibility%28%29&version=Firefox%3A4.0b12pre
Boris, is it safe (in terms of reflow) to get GetBulletText when refresh driver calls into us?
_Everything_ is safe to do when the refresh driver calls into you.

I keep saying this: the refresh driver assumes that you might run script that closes the whole window under WillRefresh, or whatever other evil things you choose to do.  Is that not clear?
Perhaps used safe word is not very good here. I'm not allowed to flush layout under WillRefresh, I just not sure what else can trigger flush and therefore I keep asking you. Is there a hint how I can learn quickly this or that call into layout may trigger a flush?
Attached patch patch (obsolete) — Splinter Review
fix crash (regress when bullet text is changed)
Assignee: nobody → surkov.alexander
Status: NEW → ASSIGNED
Boris, what's a good place in layout to notify a11y when the bullet is changed?

I can see possible scenarios:
1) insertion list item into list (bullets of list items after this list item gets updated)
2) changing CSS list style type

Are there other cases when bullet can be changed?
(In reply to comment #3)
> Perhaps used safe word is not very good here. I'm not allowed to flush layout
> under WillRefresh, I just not sure what else can trigger flush and therefore I
> keep asking you. Is there a hint how I can learn quickly this or that call into
> layout may trigger a flush?

Yeah. Probably the best thing is to reduce our assertion noise, which you've already been doing. That way we flush based assertions won't get lost in the console output. Additionally it would be great to catch our potential of causing flush via static analysis somehow.

What does this do?
STYLE_STRUCT_INHERITED(Visibility, nsnull, (SSARG_PRESCONTEXT))
> I'm not allowed to flush layout under WillRefresh

Sure you are.  Why wouldn't you be?  It's not great for performance, but it's fine safety-wise...

GetBulletText does not flush.

> I just not sure what else can trigger flush

Generally, layout APIs that will do that will say so (because once you trigger a flush frames may be dead, etc).

But yes, there is no good way to tell short of reading the code of the callee.  We keep meaning to add static analysis annotations for this....

> Boris, what's a good place in layout to notify a11y when the bullet is
> changed?

nsBlockFrame::RenumberListsFor, perhaps?  And yes, you'd need to handle changes to list-style-type separately from that.

> What does this do?
> STYLE_STRUCT_INHERITED(Visibility, nsnull, (SSARG_PRESCONTEXT))

It depends on how the STYLE_STRUCT_INHERITED macro is defined; the idea is that you define it to do whatever you want to do, then include nsStyleStructList.h.

In this case, we're calling nsIFrame::GetStyleVisibility, which calls nsStyleContext::GetStyleVisibility, which is defined as follows (nsStyleContext.h):

256   #define STYLE_STRUCT(name_, checkdata_cb_, ctor_args_)  \
257     const nsStyle##name_ * GetStyle##name_() {            \
258       return DoGetStyle##name_(PR_TRUE);                  \
259     }
260   #include "nsStyleStructList.h"
261   #undef STYLE_STRUCT
....
350   #define STYLE_STRUCT_INHERITED(name_, checkdata_cb_, ctor_args_)      \
351     const nsStyle##name_ * DoGetStyle##name_(PRBool aComputeData) {     \
352       const nsStyle##name_ * cachedData =                               \
353         mCachedInheritedData.m##name_##Data;                            \
354       if (cachedData) /* Have it cached already, yay */                 \
355         return cachedData;                                              \
356       /* Have the rulenode deal */                                      \
357       return mRuleNode->GetStyle##name_(this, aComputeData);            \
358     }

All the crashes are at 0x18 on 32-bit systems.  So at address 24.  24 bytes into nsStyleContext is mPseudoTag.  24 bytes into nsInheritedStyleData is mUserInterfaceData.  24 bytes into nsRuleNode is the mResetData of its mStyleData.  So I'm really not sure what's null here, exactly....
Attached patch patch2Splinter Review
Attachment #513067 - Attachment is obsolete: true
Attachment #514419 - Flags: superreview?(bzbarsky)
Attachment #514419 - Flags: review?(bolterbugz)
Attachment #514419 - Flags: approval2.0?
Comment on attachment 514419 [details] [diff] [review]
patch2

r+ is needed before a? can be even considered
Attachment #514419 - Flags: approval2.0?
(In reply to comment #9)
> Comment on attachment 514419 [details] [diff] [review]
> patch2
> 
> r+ is needed before a? can be even considered

yes, sure, just usually reviewer and approver is the same person in a11y.
Comment on attachment 514419 [details] [diff] [review]
patch2

r=me nice.
Attachment #514419 - Flags: review?(bolterbugz) → review+
Comment on attachment 514419 [details] [diff] [review]
patch2

So why was this crashing before?  I'd like to understand that.

You should probably check that the display is list-item before doing expensive stuff like GetStyleList in style reresolution.

Shouldn't that first get be a peek, anyway?  Then you'll have to deal with the struct being null in the hunk added at end of reresolve, but still...
Attachment #514419 - Flags: superreview?(bzbarsky) → superreview-
(In reply to comment #12)
> Comment on attachment 514419 [details] [diff] [review]
> patch2
> 
> So why was this crashing before?  I'd like to understand that.

I afraid I can't answer this question. I don't have steps to reproduce and don't know much about layout module. But this should be the same reason as we had with nsIFrame::GetRenderedText crashes: a11y does a call in the middle of layout. For that we started to cache rendered text on a11y side and here I tried to do the same.

> You should probably check that the display is list-item before doing expensive
> stuff like GetStyleList in style reresolution.

Sure, I can add the check.

> Shouldn't that first get be a peek, anyway?  Then you'll have to deal with the
> struct being null in the hunk added at end of reresolve, but still...

Could you please reformulate the question? I don't understand it.
I have no idea what the issue with GetRenderedText was, but getting style data from a frame should always be safe, unless the frame tree is in the middle of teardown or something.

I really don't see why the old code would have crashed, or why the new one won't crash.  I think we should at least try to understand what's going on here...

A related question: why do all those stacks terminate at nsAccUtils::TextLength?  Who's calling that?

> Could you please reformulate the question?

You're using GetStyleData, which will force computation of the data even if no one has ever asked for it before.  If you use PeekStyleData, the "never asked before" case will return null instead of doing extra work.
(In reply to comment #14)
> I have no idea what the issue with GetRenderedText was, but getting style data
> from a frame should always be safe, unless the frame tree is in the middle of
> teardown or something.

I think this should be similar to bug 615475 with exception it's called outside (though no idea why the stack is so short, the recent crashes are reported for beta12 with the same stack). I don't have other assumption than you give.

> I really don't see why the old code would have crashed, or why the new one
> won't crash.  I think we should at least try to understand what's going on
> here...

the new code shouldn't crash because we should be guaranteed that "frame tree is NOT in the middle of teardown or something" when I update the bullet text, correct?

> A related question: why do all those stacks terminate at
> nsAccUtils::TextLength?  Who's calling that?

No idea, unfortunately.

I don't see any known dlls of screen readers, though MSAA part is running (oleacc.dll), presumably someone (but not we) started it.

> > Could you please reformulate the question?
> 
> You're using GetStyleData, which will force computation of the data even if no
> one has ever asked for it before.  If you use PeekStyleData, the "never asked
> before" case will return null instead of doing extra work.

I think I can use it for old context only.
> because we should be guaranteed that "frame tree is NOT in the middle of
> teardown or something"

But we have that guarantee in the old code too, unless the a11y code is being called on a non-main thread (and from the crash reports it doesn't seem to be).

> I think I can use it for old context only.

Well, yes.

Basically, my problem is that I don't see how this patch makes things any better.  I'm probably ok with taking it if you fix the peek thing and make the blockframe code check whether accessibility is active (doesn't the code in the patch just force a11y on?), but I think it's just magic handwaving so far.  We need to understand why these crashes are happening...
Note: the crashes really died off. The 4 most recent (by build id) are all from the same system/profile. Worrisome is that the address is consistently 0x18.

I'm not aware of a way an extension could be calling nsAccUtils::TextLength.
Ok, perhaps wontfix then since we don't have good idea what can be wrong here?
Seems ok to me. We'll see this bug referenced in crash-stats and can reopen if necessary.

Boris has made me curious as to the cause but I'm not sure what else to do here.
Well, one possible approach is to talk to Ted about what we could do to figure out who's calling into a11y code here and why we're not seeing it in the crash reports....
Can we cc Ted?
If you download a minidump and load it in Visual Studio or WinDBG, you can sometimes get a better stack if you also download the matching binaries. The Microsoft debuggers can use the binaries to make more educated guesses about return addresses (by looking for call instructions, presumably).
Here's what your stack looks like:
0034d5ec 706d1676 xul!nsIFrame::GetStyleVisibility+0x1 [e:\builds\moz2_slave\cen-w32-ntly\build\layout\style\nsstylestructlist.h @ 103]
0034d628 70bd0025 xul!nsBulletFrame::GetListItemText+0xf [e:\builds\moz2_slave\cen-w32-ntly\build\layout\generic\nsbulletframe.cpp @ 1263]
0034d6d8 70bb82ad xul!nsBlockFrame::GetBulletText+0x6c [e:\builds\moz2_slave\cen-w32-ntly\build\layout\generic\nsblockframe.cpp @ 6634]
0034d794 70b3b8b3 xul!nsHTMLListBulletAccessible::AppendTextTo+0x49 [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\html\nshtmltextaccessible.cpp @ 402]
0034d844 70b5d1b4 xul!nsAccUtils::TextLength+0x5f [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\nsaccutils.cpp @ 651]
0034d85c 70b5d8c7 xul!nsHyperTextAccessible::GetChildOffset+0x6a [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\html\nshypertextaccessible.cpp @ 2221]
0034d870 70eb816a xul!nsHyperTextAccessible::GetChildOffset+0x27 [e:\builds\moz2_slave\cen-w32-ntly\build\obj-firefox\dist\include\nshypertextaccessible.h @ 239]
0034d930 70eb822c xul!NotificationController::CreateTextChangeEventFor+0x9f [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\notificationcontroller.cpp @ 565]
0034d940 70eb825d xul!NotificationController::QueueEvent+0x3d [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\notificationcontroller.cpp @ 140]
0034d94c 70edc45a xul!nsDocAccessible::FireDelayedAccessibleEvent+0x26 [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\nsdocaccessible.cpp @ 1676]
0034d970 70edc3a5 xul!nsDocAccessible::UpdateTreeInternal+0xfb [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\nsdocaccessible.cpp @ 1892]
0034d998 70edc4f7 xul!nsDocAccessible::UpdateTreeInternal+0x46 [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\nsdocaccessible.cpp @ 1853]
0034d9b4 70eeea9d xul!nsDocAccessible::UpdateTree+0x1d [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\nsdocaccessible.cpp @ 1795]
0034d9d8 70eeeb0f xul!nsDocAccessible::ProcessContentInserted+0x82 [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\nsdocaccessible.cpp @ 1783]
0034d9f4 70eeecaa xul!NotificationController::ContentInsertion::Process+0x18 [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\notificationcontroller.cpp @ 743]
0034da28 7067e3d2 xul!NotificationController::WillRefresh+0x7c [e:\builds\moz2_slave\cen-w32-ntly\build\accessible\src\base\notificationcontroller.cpp @ 244]
0034dac8 7068c838 xul!nsRefreshDriver::Notify+0x5b2 [e:\builds\moz2_slave\cen-w32-ntly\build\layout\base\nsrefreshdriver.cpp @ 256]
0034daf0 7068c8dc xul!nsTimerImpl::Fire+0x278 [e:\builds\moz2_slave\cen-w32-ntly\build\xpcom\threads\nstimerimpl.cpp @ 429]
0034daf8 7060770b xul!nsTimerEvent::Run+0x1c [e:\builds\moz2_slave\cen-w32-ntly\build\xpcom\threads\nstimerimpl.cpp @ 519]
0034db2c 705e23ce xul!nsThread::ProcessNextEvent+0x30b [e:\builds\moz2_slave\cen-w32-ntly\build\xpcom\threads\nsthread.cpp @ 633]
0034db6c 707c47c9 xul!mozilla::ipc::MessagePump::Run+0x6e [e:\builds\moz2_slave\cen-w32-ntly\build\ipc\glue\messagepump.cpp @ 110]
0034db78 707c47b2 xul!MessageLoop::RunInternal+0x11 [e:\builds\moz2_slave\cen-w32-ntly\build\ipc\chromium\src\base\message_loop.cc @ 219]
0034dbb4 74e68b90 xul!MessageLoop::RunHandler+0x1d [e:\builds\moz2_slave\cen-w32-ntly\build\ipc\chromium\src\base\message_loop.cc @ 203]
0034dbd0 707aa87d nspr4!PR_GetThreadPrivate+0x20 [e:\builds\moz2_slave\cen-w32-ntly\build\nsprpub\pr\src\threads\prtpd.c @ 232]
0034dbdc 707c4eaa xul!nsBaseAppShell::Run+0x34 [e:\builds\moz2_slave\cen-w32-ntly\build\widget\src\xpwidgets\nsbaseappshell.cpp @ 198]
0034fb30 707c4ede xul!nsAppShell::Run+0x42 [e:\builds\moz2_slave\cen-w32-ntly\build\widget\src\windows\nsappshell.cpp @ 270]
0034fb3c 706f6c49 xul!nsAppStartup::Run+0x1e [e:\builds\moz2_slave\cen-w32-ntly\build\toolkit\components\startup\src\nsappstartup.cpp @ 221]
0034fecc 00f1134c xul!XRE_main+0xdec [e:\builds\moz2_slave\cen-w32-ntly\build\toolkit\xre\nsapprunner.cpp @ 3770]
0034ff1c 00f116f2 firefox!wmain+0x34c [e:\builds\moz2_slave\cen-w32-ntly\build\toolkit\xre\nswindowswmain.cpp @ 128]

(from the dump on https://crash-stats.mozilla.com/report/index/64107041-67ee-4e85-8b11-d601f2110225)
OK, that is NOT inside frame tree teardown.
(In reply to comment #22)
> If you download a minidump and load it in Visual Studio or WinDBG, you can
> sometimes get a better stack if you also download the matching binaries.

Ted, could you please give me instructions how to do this?
Whiteboard: not-ready
I know someone who gets the crash. His stacks:
https://crash-stats.mozilla.com/report/index/bp-48d4288d-b3be-466f-a0a4-b86f82110608
https://crash-stats.mozilla.com/report/index/bp-8dcb27e4-2be3-40d8-9a7d-d83cc2110608

He's helpful, ping me for contact info.

(In reply to comment #25)
Alexander, I've done minidump debugging before (catch me on IRC). Also see:
https://developer.mozilla.org/en/Debugging_Mozilla_on_Windows_FAQ
https://developer.mozilla.org/en/Debugging_a_minidump
Or are you asking about where to find binaries, or something more specific?
(In reply to comment #26)
> I know someone who gets the crash. His stacks:
> https://crash-stats.mozilla.com/report/index/bp-48d4288d-b3be-466f-a0a4-
> b86f82110608
> https://crash-stats.mozilla.com/report/index/bp-8dcb27e4-2be3-40d8-9a7d-
> d83cc2110608
> 
> He's helpful, ping me for contact info.

The stack trace is different than in this bug. Do we have steps to reproduce?

Boris, can you look at the issue please? 3d party application calls MSAA's IAccessible::get_accName method that makes to call nsBlockFrame::GetBulletText which crashes.
There is nothing obviously wrong in the stacks from comment 26.
Has this recently popped up? Or is it something that resolved itself?
Marco, we have 123 of these over the past 4 weeks across all versions. Some in 7.0.1 and 8.0. There is another bug 527770 that has the same signature. I don't know if it's the same issue or something different. That one has been around for a long time.
Keywords: crash
Severity: normal → critical
there's dozen of crashes in layout but no a11y crashes any more (last crash in Firefox 4), https://crash-stats.mozilla.com/report/list?signature=nsIFrame%3A%3AGetStyleVisibility%28%29

closing as worksforme, please reopen the bug if you see the crash in a11y module
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.