Crash in nsBaseContentList::cycleCollection::CanSkipReal caused by nsDocument::NodesFromRectHelper

VERIFIED FIXED in Firefox 14

Status

()

Core
DOM
--
critical
VERIFIED FIXED
5 years ago
5 years ago

People

(Reporter: nhirata, Assigned: Felipe)

Tracking

(5 keywords)

Trunk
mozilla16
ARM
Android
crash, mobile, reproducible, testcase, topcrash
Points:
---

Firefox Tracking Flags

(firefox14 verified, firefox15+ fixed, firefox16 verified, blocking-fennec1.0 betaN+, fennec15+)

Details

(Whiteboard: [native-crash], crash signature, URL)

Attachments

(5 attachments, 1 obsolete attachment)

Crashing Thread
Frame 	Module 	Signature [Expand] 	Source
0 	libxul.so 	nsBaseContentList::cycleCollection::CanSkipReal 	content/base/src/nsContentList.cpp:97
1 	libxul.so 	nsPurpleBuffer::RemoveSkippable 	nsCycleCollectionParticipant.h:154
2 	libxul.so 	nsCycleCollector::ForgetSkippable 	xpcom/base/nsCycleCollector.cpp:2224
3 	libxul.so 	nsCycleCollector_forgetSkippable 	xpcom/base/nsCycleCollector.cpp:4056
4 	libxul.so 	CCTimerFired 	dom/base/nsJSEnvironment.cpp:3382
5 	libxul.so 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:428
6 	libxul.so 	nsTimerEvent::Run 	xpcom/threads/nsTimerImpl.cpp:524
7 	libxul.so 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:657
8 	libxul.so 	NS_ProcessNextEvent_P 	obj-firefox/xpcom/build/nsThreadUtils.cpp:245
9 	libxul.so 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:110
10 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:208
11 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:201
12 	libxul.so 	nsBaseAppShell::Run 	widget/xpwidgets/nsBaseAppShell.cpp:189
13 	libxul.so 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:295
14 	libxul.so 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3564
15 	libxul.so 	GeckoStart 	toolkit/xre/nsAndroidStartup.cpp:109
16 	libmozglue.so 	__res_nsend 	other-licenses/android/res_send.c:599
17 	dalvik-heap (deleted) 	dalvik-heap @0x529b0e 	
18 	dalvik-heap (deleted) 	dalvik-heap @0x571816 	
19 	libdvm.so 	dvmPlatformInvoke 	
20 	libdvm.so 	dvmCallJNIMethod_general 	
21 	libdvm.so 	dvmResolveNativeMethod 	
22 	libdvm.so 	dvmAsmSisterStart 	
23 	libdvm.so 	dvmMterpStd 	
24 	libdvm.so 	dvmInterpret 	
25 	libdvm.so 	dvmCallMethodV 	
26 	libdvm.so 	dvmCallMethod 	
27 	libdvm.so 	dvmDetachCurrentThread 	
28 	libc.so 	libc.so@0x11946 	
29 	libc.so 	libc.so@0x11512

Some listed devices:
- Samsung GT-I9100G
- Model: 'MB526', Product: 'jordan_hkt', Manufacturer: 'motorola', Hardware: 'mapphone_umts''
- Samsung 'GT-I9100'
- Product: 'GT-P7500', Manufacturer: 'samsung'
- Motorola Triumph

URL:
- http://www.omgubuntu.co.uk/2012/02/adobe-adandons-flash-on-linux/ 

Earliest build seen for 13a1 : 	20120223031236
tracking-fennec: --- → ?
I'm seeing this crash with current trunk build on this url:
http://people.mozilla.org/~mwargers/tests/forms/formautosubmit_parentframes.htm
And then quickly randomly double-tapping all over the place.
It might not be super easy to reproduce the crash, but I've been able to get a crash this way within 1 minute or so.
https://crash-stats.mozilla.com/report/index/bp-250de2af-eeec-4739-a431-13e402120509
This is on the Samsung Galaxy Nexus.
Keywords: mobile
Version: 13 Branch → Trunk
Wow.  Reproducible with Martijn's test case.  Zoom in and tap in the text fields, pan tap, pan tap... eventually you will run into this.

This was with the Samsung Galaxy S II, 5/9/2012 Nightly
Keywords: reproducible

Updated

5 years ago
Summary: [@ nsBaseContentList::cycleCollection::CanSkipReal ] → Crash in nsBaseContentList::cycleCollection::CanSkipReal
Created attachment 625378 [details]
testcase

I can also reproduce it with this testcase on the Galaxy Nexus (and much easier).
Double tapping on the right iframe makes Fennec crash.
The iframe content just contains a script that is quickly refreshing that page.
I guess this might be Java code that is causing the crash in Gecko?
Keywords: testcase
blassey, who would be a good person to investigate CycleCollector crash? This two-month old bug has recently become a #1 or #2 topcrash for 15.0a1 and has a reproducible testcase.
blocking-fennec1.0: --- → ?

Comment 6

5 years ago
(In reply to Chris Peterson (:cpeterson) from comment #5)
> This two-month old bug has recently become a #1 or #2 topcrash for 15.0a1
> and has a reproducible testcase.
Martijn's tests made it #1 top crasher.
tracking-fennec: ? → +
blocking-fennec1.0: ? → -
Odds are something is putting garbage into an nsBaseContentList, and the CC is just the first thing that happens to try to touch it.
It looks like somebody is storing a null pointer into an nsBaseContentList.  I'm not sure if that is okay or not.  Maybe we just need a null check in CanSkipReal?

Comment 9

5 years ago
It's #4 top crasher in 14.0b3.
blocking-fennec1.0: - → ?
Keywords: topcrash
Boris, do you know if it is normal for a field of the mElements of an nsBaseContentList to be null?  I'm trying to figure out if this crash is due to a missing null check in cycle collector code, or because a null is somehow leaking into this array where it shouldn't.
Assignee: nobody → continuation
blocking-fennec1.0: ? → +
Could someone check what is putting null to mElements?
There shouldn't be null things in mElements.  There is already an assert to that effect in nsBaseContentList::InsertElementAt.  Add one in nsBaseContentList::AppendElement too, and see if you hit it?
Though note that some subclasses (e.g. nsContentList) touch the array directly.  They still shouldn't be putting null in, though!
mw22, does your test example hit any asserts when run in a debug build?
Here's the stack when we append a NULL content:
#0  0x4d7b9ec8 in nsBaseContentList::AppendElement (this=0x482cd120, aContent=0x0) at ../../dist/include/nsContentList.h:77
#1  0x4da36e16 in nsDocument::NodesFromRectHelper (this=0x4af45800, aX=474, aY=82, aTopSize=98, aRightSize=65.3333359, aBottomSize=32.6666679, aLeftSize=65.3333359, aIgnoreRootScrollFrame=true, aFlushLayout=false, aReturn=0x47ddff88) at /home/jrmuizel/mozilla/maple/content/base/src/nsDocument.cpp:2904
#2  0x4dd1e2f6 in nsDOMWindowUtils::NodesFromRect (this=0x4afe8540, aX=474, aY=82, aTopSize=98, aRightSize=65.3333359, aBottomSize=32.6666679, aLeftSize=65.3333359, aIgnoreRootScrollFrame=true, aFlushLayout=false, aReturn=0x47ddff88) at /home/jrmuizel/mozilla/maple/dom/base/nsDOMWindowUtils.cpp:1084
#3  0x4e81702e in NS_InvokeByIndex_P (that=0x4afe8540, methodIndex=29, paramCount=<optimized out>, params=<optimized out>) at /home/jrmuizel/mozilla/maple/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp:160
#4  0x4e1c2324 in CallMethodHelper::Invoke (this=0x456728d0) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedNative.cpp:3062
#5  0x4e1c0890 in CallMethodHelper::Call (this=0x456728d0) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedNative.cpp:2390
#6  0x4e1c0760 in XPCWrappedNative::CallMethod (ccx=..., mode=XPCWrappedNative::CALL_METHOD) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedNative.cpp:2356
#7  0x4e1cb7a2 in XPC_WN_CallMethod (cx=0x47451ec0, argc=8, vp=0x47800180) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1500
#8  0x4eef0fe0 in js::CallJSNative (cx=0x47451ec0, native=0x4e1cb5bd <XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*)>, args=...) at ../../../js/src/jscntxtinlines.h:395
#9  0x4eef6d76 in js::InvokeKernel (cx=0x47451ec0, args=..., construct=js::NO_CONSTRUCT) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:310
#10 0x4ef039a8 in js::Interpret (cx=0x47451ec0, entryFrame=0x47800028, interpMode=js::JSINTERP_NORMAL) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:2512
#11 0x4eef6afa in js::RunScript (cx=0x47451ec0, script=0x487bb7e0, fp=0x47800028) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:266
#12 0x4eef6e22 in js::InvokeKernel (cx=0x47451ec0, args=..., construct=js::NO_CONSTRUCT) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:326
#13 0x4ee6e950 in js::Invoke (cx=0x47451ec0, args=..., construct=js::NO_CONSTRUCT) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.h:125
#14 0x4eef6fd4 in js::Invoke (cx=0x47451ec0, thisv=..., fval=..., argc=1, argv=0x456735f0, rval=0x45673710) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:358
#15 0x4ee6508c in JS_CallFunctionValue (cx=0x47451ec0, obj=0x48779910, fval=..., argc=1, argv=0x456735f0, rval=0x45673710) at /home/jrmuizel/mozilla/maple/js/src/jsapi.cpp:5496
#16 0x4e1b8030 in nsXPCWrappedJSClass::CallMethod (this=0x47d54790, wrapper=0x4ac8ee00, methodIndex=3, info=0x4755e720, nativeParams=0x45673960) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedJSClass.cpp:1474
#17 0x4e1b0668 in nsXPCWrappedJS::CallMethod (this=0x4ac8ee00, methodIndex=3, info=0x4755e720, params=0x45673960) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedJS.cpp:579
#18 0x4e817a0e in PrepareAndDispatch (self=<optimized out>, methodIndex=<optimized out>, args=0x45673a1c) at /home/jrmuizel/mozilla/maple/xpcom/reflect/xptcall/src/md/unix/xptcstubs_arm.cpp:105
#19 0x4e817044 in SharedStub () from /home/jrmuizel/mozilla/maple/objdir-debug/dist/bin/libxul.so
#20 0x4db7991a in nsEventListenerManager::HandleEventSubType (this=0x4acb6970, aListenerStruct=0x4a990e48, aListener=0x4ab51cb0, aDOMEvent=0x48561550, aCurrentTarget=0x4a9c2e70, aPhaseFlags=2, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventListenerManager.cpp:809
#21 0x4db79af6 in nsEventListenerManager::HandleEventInternal (this=0x4acb6970, aPresContext=0x4af46400, aEvent=0x456742d8, aDOMEvent=0x45673bac, aCurrentTarget=0x4a9c2e70, aFlags=2, aEventStatus=0x45673bb0, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventListenerManager.cpp:866
#22 0x4dba0468 in nsEventListenerManager::HandleEvent (this=0x4acb6970, aPresContext=0x4af46400, aEvent=0x456742d8, aDOMEvent=0x45673bac, aCurrentTarget=0x4a9c2e70, aFlags=2, aEventStatus=0x45673bb0, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventListenerManager.h:137
#23 0x4dba09ce in nsEventTargetChainItem::HandleEvent (this=0x486f6080, aVisitor=..., aFlags=2, aMayHaveNewListenerManagers=false, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventDispatcher.cpp:185
#24 0x4dba0e98 in nsEventTargetChainItem::HandleEventTargetChain (this=0x486f6160, aVisitor=..., aFlags=6, aCallback=0x45673d18, aMayHaveNewListenerManagers=false, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventDispatcher.cpp:340
#25 0x4dba1c80 in nsEventDispatcher::Dispatch (aTarget=0x48222fb0, aPresContext=0x4af46400, aEvent=0x456742d8, aDOMEvent=0x0, aEventStatus=0x45673cb4, aCallback=0x45673d18, aTargets=0x0) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventDispatcher.cpp:643
#26 0x4d7a3f80 in PresShell::DispatchTouchEvent (this=0x4acf6100, aEvent=0x456742d8, aStatus=0x456741d0, aEventCB=0x45673d18, aTouchIsNew=false) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:6456
#27 0x4d7a3a4c in PresShell::HandleEventInternal (this=0x4acf6100, aEvent=0x456742d8, aStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:6358
#28 0x4d7a3108 in PresShell::HandlePositionedEvent (this=0x4acf6100, aTargetFrame=0x482f4278, aEvent=0x456742d8, aEventStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:6093
#29 0x4d7a2668 in PresShell::HandleEvent (this=0x4acf6100, aFrame=0x4af247f8, aEvent=0x456742d8, aDontRetargetEvents=true, aEventStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:5893
#30 0x4d7a1980 in PresShell::HandleEvent (this=0x483d1400, aFrame=0x485fc7f8, aEvent=0x456742d8, aDontRetargetEvents=false, aEventStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:5630
#31 0x4dd17044 in nsViewManager::DispatchEvent (this=0x486f42e0, aEvent=0x456742d8, aView=0x486e2340, aStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/view/src/nsViewManager.cpp:865
#32 0x4dd12456 in HandleEvent (aEvent=0x456742d8) at /home/jrmuizel/mozilla/maple/view/src/nsView.cpp:126
#33 0x4e53db06 in nsWindow::DispatchEvent (this=0x47459880, aEvent=0x456742d8) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:639
#34 0x4e53dad8 in nsWindow::DispatchEvent (this=0x47459880, aEvent=0x456742d8, aStatus=@0x45674244) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:631
#35 0x4e53f8e6 in nsWindow::DispatchMultitouchEvent (this=0x47459880, event=..., ae=0x4849a740) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:1441
#36 0x4e53f42e in nsWindow::OnMultitouchEvent (this=0x47459880, ae=0x4849a740) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:1358
#37 0x4e53e54c in nsWindow::OnGlobalAndroidEvent (ae=0x4849a740) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:842
#38 0x4e529dcc in nsAppShell::ProcessNextNativeEvent (this=0x474303e0, mayWait=false) at /home/jrmuizel/mozilla/maple/widget/android/nsAppShell.cpp:558
#39 0x4e54660e in nsBaseAppShell::DoProcessNextNativeEvent (this=0x474303e0, mayWait=false, recursionDepth=0) at /home/jrmuizel/mozilla/maple/widget/xpwidgets/nsBaseAppShell.cpp:139
#40 0x4e546958 in nsBaseAppShell::OnProcessNextEvent (this=0x474303e0, thr=0x44db11d0, mayWait=false, recursionDepth=0) at /home/jrmuizel/mozilla/maple/widget/xpwidgets/nsBaseAppShell.cpp:280
#41 0x4e7f6fbe in nsThread::ProcessNextEvent (this=0x44db11d0, mayWait=false, result=0x4567471f) at /home/jrmuizel/mozilla/maple/xpcom/threads/nsThread.cpp:586
#42 0x4e79b992 in NS_ProcessNextEvent_P (thread=0x44db11d0, mayWait=false) at /home/jrmuizel/mozilla/maple/objdir-debug/xpcom/build/nsThreadUtils.cpp:213
#43 0x4e66a14a in mozilla::ipc::MessagePump::Run (this=0x44db31f0, aDelegate=0x44dd90e0) at /home/jrmuizel/mozilla/maple/ipc/glue/MessagePump.cpp:82
#44 0x4e84b0bc in MessageLoop::RunInternal (this=0x44dd90e0) at /home/jrmuizel/mozilla/maple/ipc/chromium/src/base/message_loop.cc:208
#45 0x4e84b056 in MessageLoop::RunHandler (this=0x44dd90e0) at /home/jrmuizel/mozilla/maple/ipc/chromium/src/base/message_loop.cc:201
#46 0x4e84affe in MessageLoop::Run (this=0x44dd90e0) at /home/jrmuizel/mozilla/maple/ipc/chromium/src/base/message_loop.cc:175
#47 0x4e546692 in nsBaseAppShell::Run (this=0x474303e0) at /home/jrmuizel/mozilla/maple/widget/xpwidgets/nsBaseAppShell.cpp:163
#48 0x4e346914 in nsAppStartup::Run (this=0x475984c0) at /home/jrmuizel/mozilla/maple/toolkit/components/startup/nsAppStartup.cpp:256
#49 0x4d4ce3d4 in XREMain::XRE_mainRun (this=0x456749d4) at /home/jrmuizel/mozilla/maple/toolkit/xre/nsAppRunner.cpp:3765
#50 0x4d4ce5e2 in XREMain::XRE_main (this=0x456749d4, argc=7, argv=0x44dbf048, aAppData=0x804567ac) at /home/jrmuizel/mozilla/maple/toolkit/xre/nsAppRunner.cpp:3842
#51 0x4d4ce796 in XRE_main (argc=7, argv=0x44dbf048, aAppData=0x804567ac, aFlags=0) at /home/jrmuizel/mozilla/maple/toolkit/xre/nsAppRunner.cpp:3918
#52 0x4d4d8fde in GeckoStart (data=0x484480, appData=0x804567ac) at /home/jrmuizel/mozilla/maple/toolkit/xre/nsAndroidStartup.cpp:73
#53 0x8042a25e in Java_org_mozilla_gecko_GeckoAppShell_nativeRun (jenv=0x471098, jc=0x4054ed70, jargs=0x4059b1e8) at /home/jrmuizel/mozilla/maple/mozglue/android/APKOpen.cpp:966
#54 0x80017e38 in dvmPlatformInvoke () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
#55 0x800496e2 in dvmCallJNIMethod_general () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
#56 0x8004ee64 in dvmResolveNativeMethod () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
#57 0x8001d038 in dvmJitToInterpNoChain () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
#58 0x8001d038 in dvmJitToInterpNoChain () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
And we do so because CheckAncestryAndGetFrame returns NULL.

In this case 'this' is https://bug731293.bugzilla.mozilla.org/attachment.cgi?id=625378
and currentDoc is text/html;charset=utf-8,%3Cscript%3E%0AsetTimeout%28function%28%29%20%7Bwindow.location.reload%28%29%3B%7D%2C%20200%29%3B%0A%3C/script%3E

I don't know if CheckAncestryAndGetFrame is doing the right thing here or not.
Assignee: continuation → nobody
Component: XPCOM → DOM
QA Contact: xpcom → general
Assignee: nobody → felipc
Summary: Crash in nsBaseContentList::cycleCollection::CanSkipReal → Crash in nsBaseContentList::cycleCollection::CanSkipReal caused by nsDocument::NodesFromRectHelper
Just from looking at the code, it seems like null is a reasonable value for CheckAncestryAndGetFrame to return.  The name implies it can fail, which would result in returning null.
(In reply to Andrew McCreight [:mccr8] from comment #17)
> Just from looking at the code, it seems like null is a reasonable value for
> CheckAncestryAndGetFrame to return.  The name implies it can fail, which
> would result in returning null.

Yeah, it seems like the caller needs to figure out what an appropriate thing is to do when CheckAncestryAndGetFrame returns null.
nsDocument::NodesFromRectHelper is just broken here.  It shouldn't be putting null in that list.  The only two consumers (in Fennec chrome code) would end up throwing exceptions if the list contained null....
Blocks: 489127
In this test case though it seems like CheckAncestryAndGetFrame should be returning the FrameElement for and not NULL. i.e. shouldn't the list of Frames and associated nsIContents all be ancestors of the current document?
bz/jeff - is one of you the right assignee for this, given that you're pretty active in it?
The right assignee is someone who understans what that code is trying to do.

So probably Felipe or roc?

I can fix the crash, obviously, but it's not clear to me what behavior I'd break in the process...  Then again, it's behavior that currently always crashes, so maybe it's not a problem.
(Assignee)

Comment 23

5 years ago
Note that this code is very similar to ElementFromPoint which has existed for longer, and would also return null in that case.

Which is also odd to think that ElementFromPoint could ever return null for a valid x/y point.

Maybe this would happen when the function is being called from a sub-document and there's a higher level document over the x/y point (or, in this case, area) being used.. bz, what do you think?

The CheckAncestryAndGetFrame exists there to not return any cross-document elements. So if that theory makes sense then it's a correct fix to check for null and not add it to the list. Which I believe is the fix we'll end up doing but I wanted to understand why it happens first.
I really don't know offhand why null is being returned here.  I could dig through the code if desired....

Comment 25

5 years ago
It's #1 top crasher in 14.0b5 with 12.7% of crashes.

Comment 26

5 years ago
(In reply to Boris Zbarsky (:bz) from comment #24)
> I really don't know offhand why null is being returned here.  I could dig
> through the code if desired....

That would be very helpful (top crash for Fennec Native right now as per #25).  Thanks.
(Assignee)

Comment 27

5 years ago
Created attachment 629936 [details] [diff] [review]
Patch

Patch doing the obvious. It'd still be good to understand what changed to cause this (this API had been in use in XUL fennec for a long time), but fixing the crash soon seems good. I'm pretty sure there's no point in having null elements in the list returned so the change seems reasonable.

Martijn: I sent it to tryserver; after it is built could you try the testcase and see if it fixes the problem? https://tbpl.mozilla.org/?tree=Try&rev=9100f56a76f2, builds here: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/felipc@gmail.com-9100f56a76f2
Attachment #629936 - Flags: review?(bzbarsky)
Comment on attachment 629936 [details] [diff] [review]
Patch

Review of attachment 629936 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/base/src/nsDocument.cpp
@@ +2904,1 @@
>          elements->AppendElement(elementDoc);

nsBaseContentList::InsertElementAt asserts its aContent parameter != NULL. Should nsBaseContentList::AppendElement have an assert, too?
(In reply to Chris Peterson (:cpeterson) from comment #28)
> nsBaseContentList::InsertElementAt asserts its aContent parameter != NULL.
> Should nsBaseContentList::AppendElement have an assert, too?

Yes!
(Assignee)

Comment 30

5 years ago
Created attachment 630044 [details] [diff] [review]
Patch v2

Added the assertion to AppendElement
Attachment #629936 - Attachment is obsolete: true
Attachment #629936 - Flags: review?(bzbarsky)
Attachment #630044 - Flags: review?(bzbarsky)
(In reply to Felipe Gomes (:felipe) from comment #23)
> Note that this code is very similar to ElementFromPoint which has existed
> for longer, and would also return null in that case.
> 
> Which is also odd to think that ElementFromPoint could ever return null for
> a valid x/y point.
> 
> Maybe this would happen when the function is being called from a
> sub-document and there's a higher level document over the x/y point (or, in
> this case, area) being used.. bz, what do you think?

No, that can't happen here since we're only looking at this document's root frame and its descendants.

I suspect what's happening here is that we're rendering the "zombie document" that is going away due to a pending navigation, we find content in that document and call nsDocument::CheckAncestryAndGetFrame, which returns null for zombie documents for some reason.

I think we can do a much more robust approach that walks the frame tree. working up a patch.
Created attachment 630065 [details] [diff] [review]
Part 1: Refactor GetParentOrPlaceholderFor a bit
Attachment #630065 - Flags: review?(matspal)
Created attachment 630066 [details] [diff] [review]
Part 2: Add nsLayoutUtils::GetParentOrPlaceholderForCrossDoc
Attachment #630066 - Flags: review?(matspal)
Created attachment 630067 [details] [diff] [review]
Part 3: Walk frame tree to find correct element

Felipe, if these patches work for you, let's do it this way :-).
Attachment #630067 - Flags: review?(matspal)
Comment on attachment 630044 [details] [diff] [review]
Patch v2

r=me, in general, though if roc's patches make the nsDocument bit not needed so much the better.
Attachment #630044 - Flags: review?(bzbarsky) → review+
(Assignee)

Comment 36

5 years ago
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #34)
> Created attachment 630067 [details] [diff] [review]
> Part 3: Walk frame tree to find correct element
> 
> Felipe, if these patches work for you, let's do it this way :-).

Yeah, WFM. But I don't have an env ready to build and test Fennec so if someone who does could try the testcase with these patches that'd be nice

Comment 37

5 years ago
Wondering about the risk profile for these two approaches - Felipes patches seem simpler, should we just take that for FF14 (Fennec Native 1.0) since its essentially a null check and let roc's patches land for 15/16?
blocking-fennec1.0: + → betaN+
(In reply to JP Rosevear [:jpr] from comment #37)
> Wondering about the risk profile for these two approaches - Felipes patches
> seem simpler, should we just take that for FF14 (Fennec Native 1.0) since
> its essentially a null check and let roc's patches land for 15/16?

That sounds reasonable.

With Felipe's patch, in testcases where we currently crash we will instead fail to register the touch/click. I.e., touching/clicking an iframe currently reloading its document might fail to do anything. My patches should fix that.

Comment 39

5 years ago
Comment on attachment 630044 [details] [diff] [review]
Patch v2

[Triage Comment]
We'll take this patch for beta, for aurora/m-c we should probably take roc's patch.
Attachment #630044 - Flags: approval-mozilla-beta+
https://hg.mozilla.org/releases/mozilla-beta/rev/0a3f4d59d26b
status-firefox14: --- → fixed

Comment 41

5 years ago
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #38)
> With Felipe's patch, in testcases where we currently crash we will instead
> fail to register the touch/click. I.e., touching/clicking an iframe
> currently reloading its document might fail to do anything.

That behavior doesn't sound unreasonable even for release (and crashing is surely worse), even though improving that with your patches is probably preferable in the long term.

Updated

5 years ago
Attachment #630065 - Flags: review?(matspal) → review+

Updated

5 years ago
Attachment #630066 - Flags: review?(matspal) → review+
Comment on attachment 630067 [details] [diff] [review]
Part 3: Walk frame tree to find correct element

> content/base/src/nsDocument.cpp
> +  if (elem && !elem->IsElement()) {
> +    elem = elem->GetParent();
> +  }
> +  return CallQueryInterface(elem, aReturn);

CallQueryInterface will crash if 'elem' is null.


> +  for (PRInt32 i = 0; i < outFrames.Length(); i++) {

s/PRInt32/nsAutoTArray<nsIFrame*,8>::size_type/


> +    if (node && !node->IsElement() && !node->IsNodeOfType(nsINode::eTEXT)) {
> +      // If we have a node that isn't an element or a text node,
> +      // replace it with the first parent node.
> +      node = node->GetParent();
> +    }

The second comment line is confusing; "replace"? "the first"?
Perhaps this is better:
  // We have a node that isn't an element or a text node,
  // use its parent content instead.


> +    nsIContent* ptContent = f->GetContent();

'pt'?  Just 'content' will do I think.


> +  * Find the (non-anonymous) element in this document that contains the
> +  * content (possibly in a subdocument) associated with aFrame. Returns null
> +  * if there is no such element.

I think 'contains' is a bit misleading since normally it just returns
aFrame->GetContent().  The 'contains' suggests to me it returns
aFrame->GetContent()->GetParent().  The name of the method is clear though,
but if you can rephrase the comment without the "contains" that would be good.

r=mats
Attachment #630067 - Flags: review?(matspal) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/67933a562482
https://hg.mozilla.org/integration/mozilla-inbound/rev/b6b4a7b119a2
https://hg.mozilla.org/integration/mozilla-inbound/rev/cfc01b8371a6
https://hg.mozilla.org/mozilla-central/rev/67933a562482
https://hg.mozilla.org/mozilla-central/rev/b6b4a7b119a2
https://hg.mozilla.org/mozilla-central/rev/cfc01b8371a6
Status: NEW → RESOLVED
Last Resolved: 5 years ago
status-firefox15: --- → affected
status-firefox16: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla16

Comment 45

5 years ago
There are no crashes in 14.0b6.
status-firefox14: fixed → verified
(In reply to Mats Palmgren [:mats] from comment #42)
> Comment on attachment 630067 [details] [diff] [review]
> Part 3: Walk frame tree to find correct element
> 
> > content/base/src/nsDocument.cpp
> > +  if (elem && !elem->IsElement()) {
> > +    elem = elem->GetParent();
> > +  }
> > +  return CallQueryInterface(elem, aReturn);
> 
> CallQueryInterface will crash if 'elem' is null.

It will still crash.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Ms2ger, the crash has gone away, please open a new bug for it speculatively being able to crash
Status: REOPENED → RESOLVED
Last Resolved: 5 years ago5 years ago
Resolution: --- → FIXED

Comment 48

5 years ago
There are no crashes in builds above 16.0a1/20120608.
There are still crashes in Aurora where the patch hasn't landed.
Status: RESOLVED → VERIFIED
status-firefox16: fixed → verified

Comment 49

5 years ago
Ok, we need to decide if the work around or the fuller fix goes on to Aurora.  Given its early in the cycle, I think we take roc's patches.  Roc, can you nom?

Comment 50

5 years ago
Tracking to make sure we don't lose this.
tracking-fennec: + → 15+
tracking-firefox15: --- → +
This crash is still our #2 topcrash for Fennec 15.
roc, see comment 49.
Somehow I failed to land my changes to address review comments :-(
Review comments:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d5a6c8120c0
Attachment #630065 - Flags: approval-mozilla-aurora?
Attachment #630066 - Flags: approval-mozilla-aurora?
Attachment #630067 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/6d5a6c8120c0
Attachment #630065 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #630066 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #630067 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/21b01f9b5638
https://hg.mozilla.org/releases/mozilla-aurora/rev/3e7571c22e93
https://hg.mozilla.org/releases/mozilla-aurora/rev/83096cfcba66
https://hg.mozilla.org/releases/mozilla-aurora/rev/d21ac6d7b1e7
status-firefox15: affected → fixed
You need to log in before you can comment on or make changes to this bug.