Last Comment Bug 731293 - Crash in nsBaseContentList::cycleCollection::CanSkipReal caused by nsDocument::NodesFromRectHelper
: Crash in nsBaseContentList::cycleCollection::CanSkipReal caused by nsDocument...
Status: VERIFIED FIXED
[native-crash]
: crash, mobile, reproducible, testcase, topcrash
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: ARM Android
: -- critical (vote)
: mozilla16
Assigned To: :Felipe Gomes (needinfo me!)
:
Mentors:
http://people.mozilla.org/~mwargers/t...
Depends on:
Blocks: 489127
  Show dependency treegraph
 
Reported: 2012-02-28 10:18 PST by Naoki Hirata :nhirata (please use needinfo instead of cc)
Modified: 2012-06-23 05:59 PDT (History)
19 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
verified
+
fixed
verified
betaN+
15+


Attachments
testcase (442 bytes, text/html)
2012-05-19 01:51 PDT, Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( )
no flags Details
Patch (1.13 KB, patch)
2012-06-04 14:45 PDT, :Felipe Gomes (needinfo me!)
no flags Details | Diff | Review
Patch v2 (1.69 KB, patch)
2012-06-04 20:36 PDT, :Felipe Gomes (needinfo me!)
bzbarsky: review+
jpr: approval‑mozilla‑beta+
Details | Diff | Review
Part 1: Refactor GetParentOrPlaceholderFor a bit (10.27 KB, patch)
2012-06-04 22:01 PDT, Robert O'Callahan (:roc) (Exited; email my personal email if necessary)
mats: review+
blassey.bugs: approval‑mozilla‑aurora+
Details | Diff | Review
Part 2: Add nsLayoutUtils::GetParentOrPlaceholderForCrossDoc (2.07 KB, patch)
2012-06-04 22:02 PDT, Robert O'Callahan (:roc) (Exited; email my personal email if necessary)
mats: review+
blassey.bugs: approval‑mozilla‑aurora+
Details | Diff | Review
Part 3: Walk frame tree to find correct element (8.10 KB, patch)
2012-06-04 22:03 PDT, Robert O'Callahan (:roc) (Exited; email my personal email if necessary)
mats: review+
blassey.bugs: approval‑mozilla‑aurora+
Details | Diff | Review

Description Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-02-28 10:18:22 PST
Crashing Thread
Frame 	Module 	Signature [Expand] 	Source
0 	libxul.so 	nsBaseContentList::cycleCollection::CanSkipReal 	content/base/src/nsContentList.cpp:97
1 	libxul.so 	nsPurpleBuffer::RemoveSkippable 	nsCycleCollectionParticipant.h:154
2 	libxul.so 	nsCycleCollector::ForgetSkippable 	xpcom/base/nsCycleCollector.cpp:2224
3 	libxul.so 	nsCycleCollector_forgetSkippable 	xpcom/base/nsCycleCollector.cpp:4056
4 	libxul.so 	CCTimerFired 	dom/base/nsJSEnvironment.cpp:3382
5 	libxul.so 	nsTimerImpl::Fire 	xpcom/threads/nsTimerImpl.cpp:428
6 	libxul.so 	nsTimerEvent::Run 	xpcom/threads/nsTimerImpl.cpp:524
7 	libxul.so 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:657
8 	libxul.so 	NS_ProcessNextEvent_P 	obj-firefox/xpcom/build/nsThreadUtils.cpp:245
9 	libxul.so 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:110
10 	libxul.so 	MessageLoop::RunInternal 	ipc/chromium/src/base/message_loop.cc:208
11 	libxul.so 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:201
12 	libxul.so 	nsBaseAppShell::Run 	widget/xpwidgets/nsBaseAppShell.cpp:189
13 	libxul.so 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:295
14 	libxul.so 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3564
15 	libxul.so 	GeckoStart 	toolkit/xre/nsAndroidStartup.cpp:109
16 	libmozglue.so 	__res_nsend 	other-licenses/android/res_send.c:599
17 	dalvik-heap (deleted) 	dalvik-heap @0x529b0e 	
18 	dalvik-heap (deleted) 	dalvik-heap @0x571816 	
19 	libdvm.so 	dvmPlatformInvoke 	
20 	libdvm.so 	dvmCallJNIMethod_general 	
21 	libdvm.so 	dvmResolveNativeMethod 	
22 	libdvm.so 	dvmAsmSisterStart 	
23 	libdvm.so 	dvmMterpStd 	
24 	libdvm.so 	dvmInterpret 	
25 	libdvm.so 	dvmCallMethodV 	
26 	libdvm.so 	dvmCallMethod 	
27 	libdvm.so 	dvmDetachCurrentThread 	
28 	libc.so 	libc.so@0x11946 	
29 	libc.so 	libc.so@0x11512

Some listed devices:
- Samsung GT-I9100G
- Model: 'MB526', Product: 'jordan_hkt', Manufacturer: 'motorola', Hardware: 'mapphone_umts''
- Samsung 'GT-I9100'
- Product: 'GT-P7500', Manufacturer: 'samsung'
- Motorola Triumph

URL:
- http://www.omgubuntu.co.uk/2012/02/adobe-adandons-flash-on-linux/ 

Earliest build seen for 13a1 : 	20120223031236
Comment 1 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2012-05-09 11:32:00 PDT
I'm seeing this crash with current trunk build on this url:
http://people.mozilla.org/~mwargers/tests/forms/formautosubmit_parentframes.htm
And then quickly randomly double-tapping all over the place.
It might not be super easy to reproduce the crash, but I've been able to get a crash this way within 1 minute or so.
https://crash-stats.mozilla.com/report/index/bp-250de2af-eeec-4739-a431-13e402120509
This is on the Samsung Galaxy Nexus.
Comment 2 Naoki Hirata :nhirata (please use needinfo instead of cc) 2012-05-09 14:34:45 PDT
Wow.  Reproducible with Martijn's test case.  Zoom in and tap in the text fields, pan tap, pan tap... eventually you will run into this.

This was with the Samsung Galaxy S II, 5/9/2012 Nightly
Comment 3 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2012-05-19 01:51:57 PDT
Created attachment 625378 [details]
testcase

I can also reproduce it with this testcase on the Galaxy Nexus (and much easier).
Double tapping on the right iframe makes Fennec crash.
The iframe content just contains a script that is quickly refreshing that page.
Comment 4 Martijn Wargers [:mwargers] (gone per 2016-05-31 :-( ) 2012-05-19 01:52:51 PDT
I guess this might be Java code that is causing the crash in Gecko?
Comment 5 Chris Peterson [:cpeterson] 2012-05-21 10:20:46 PDT
blassey, who would be a good person to investigate CycleCollector crash? This two-month old bug has recently become a #1 or #2 topcrash for 15.0a1 and has a reproducible testcase.
Comment 6 Scoobidiver (away) 2012-05-21 10:32:27 PDT
(In reply to Chris Peterson (:cpeterson) from comment #5)
> This two-month old bug has recently become a #1 or #2 topcrash for 15.0a1
> and has a reproducible testcase.
Martijn's tests made it #1 top crasher.
Comment 7 Andrew McCreight [:mccr8] 2012-05-22 11:27:51 PDT
Odds are something is putting garbage into an nsBaseContentList, and the CC is just the first thing that happens to try to touch it.
Comment 8 Andrew McCreight [:mccr8] 2012-05-22 11:47:59 PDT
It looks like somebody is storing a null pointer into an nsBaseContentList.  I'm not sure if that is okay or not.  Maybe we just need a null check in CanSkipReal?
Comment 9 Scoobidiver (away) 2012-05-27 07:39:48 PDT
It's #4 top crasher in 14.0b3.
Comment 10 Andrew McCreight [:mccr8] 2012-05-27 10:35:57 PDT
Boris, do you know if it is normal for a field of the mElements of an nsBaseContentList to be null?  I'm trying to figure out if this crash is due to a missing null check in cycle collector code, or because a null is somehow leaking into this array where it shouldn't.
Comment 11 Olli Pettay [:smaug] 2012-05-28 12:25:08 PDT
Could someone check what is putting null to mElements?
Comment 12 Boris Zbarsky [:bz] 2012-05-28 20:49:21 PDT
There shouldn't be null things in mElements.  There is already an assert to that effect in nsBaseContentList::InsertElementAt.  Add one in nsBaseContentList::AppendElement too, and see if you hit it?
Comment 13 Boris Zbarsky [:bz] 2012-05-28 20:49:55 PDT
Though note that some subclasses (e.g. nsContentList) touch the array directly.  They still shouldn't be putting null in, though!
Comment 14 Andrew McCreight [:mccr8] 2012-05-29 08:18:59 PDT
mw22, does your test example hit any asserts when run in a debug build?
Comment 15 Jeff Muizelaar [:jrmuizel] 2012-05-29 12:05:13 PDT
Here's the stack when we append a NULL content:
#0  0x4d7b9ec8 in nsBaseContentList::AppendElement (this=0x482cd120, aContent=0x0) at ../../dist/include/nsContentList.h:77
#1  0x4da36e16 in nsDocument::NodesFromRectHelper (this=0x4af45800, aX=474, aY=82, aTopSize=98, aRightSize=65.3333359, aBottomSize=32.6666679, aLeftSize=65.3333359, aIgnoreRootScrollFrame=true, aFlushLayout=false, aReturn=0x47ddff88) at /home/jrmuizel/mozilla/maple/content/base/src/nsDocument.cpp:2904
#2  0x4dd1e2f6 in nsDOMWindowUtils::NodesFromRect (this=0x4afe8540, aX=474, aY=82, aTopSize=98, aRightSize=65.3333359, aBottomSize=32.6666679, aLeftSize=65.3333359, aIgnoreRootScrollFrame=true, aFlushLayout=false, aReturn=0x47ddff88) at /home/jrmuizel/mozilla/maple/dom/base/nsDOMWindowUtils.cpp:1084
#3  0x4e81702e in NS_InvokeByIndex_P (that=0x4afe8540, methodIndex=29, paramCount=<optimized out>, params=<optimized out>) at /home/jrmuizel/mozilla/maple/xpcom/reflect/xptcall/src/md/unix/xptcinvoke_arm.cpp:160
#4  0x4e1c2324 in CallMethodHelper::Invoke (this=0x456728d0) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedNative.cpp:3062
#5  0x4e1c0890 in CallMethodHelper::Call (this=0x456728d0) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedNative.cpp:2390
#6  0x4e1c0760 in XPCWrappedNative::CallMethod (ccx=..., mode=XPCWrappedNative::CALL_METHOD) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedNative.cpp:2356
#7  0x4e1cb7a2 in XPC_WN_CallMethod (cx=0x47451ec0, argc=8, vp=0x47800180) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedNativeJSOps.cpp:1500
#8  0x4eef0fe0 in js::CallJSNative (cx=0x47451ec0, native=0x4e1cb5bd <XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*)>, args=...) at ../../../js/src/jscntxtinlines.h:395
#9  0x4eef6d76 in js::InvokeKernel (cx=0x47451ec0, args=..., construct=js::NO_CONSTRUCT) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:310
#10 0x4ef039a8 in js::Interpret (cx=0x47451ec0, entryFrame=0x47800028, interpMode=js::JSINTERP_NORMAL) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:2512
#11 0x4eef6afa in js::RunScript (cx=0x47451ec0, script=0x487bb7e0, fp=0x47800028) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:266
#12 0x4eef6e22 in js::InvokeKernel (cx=0x47451ec0, args=..., construct=js::NO_CONSTRUCT) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:326
#13 0x4ee6e950 in js::Invoke (cx=0x47451ec0, args=..., construct=js::NO_CONSTRUCT) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.h:125
#14 0x4eef6fd4 in js::Invoke (cx=0x47451ec0, thisv=..., fval=..., argc=1, argv=0x456735f0, rval=0x45673710) at /home/jrmuizel/mozilla/maple/js/src/jsinterp.cpp:358
#15 0x4ee6508c in JS_CallFunctionValue (cx=0x47451ec0, obj=0x48779910, fval=..., argc=1, argv=0x456735f0, rval=0x45673710) at /home/jrmuizel/mozilla/maple/js/src/jsapi.cpp:5496
#16 0x4e1b8030 in nsXPCWrappedJSClass::CallMethod (this=0x47d54790, wrapper=0x4ac8ee00, methodIndex=3, info=0x4755e720, nativeParams=0x45673960) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedJSClass.cpp:1474
#17 0x4e1b0668 in nsXPCWrappedJS::CallMethod (this=0x4ac8ee00, methodIndex=3, info=0x4755e720, params=0x45673960) at /home/jrmuizel/mozilla/maple/js/xpconnect/src/XPCWrappedJS.cpp:579
#18 0x4e817a0e in PrepareAndDispatch (self=<optimized out>, methodIndex=<optimized out>, args=0x45673a1c) at /home/jrmuizel/mozilla/maple/xpcom/reflect/xptcall/src/md/unix/xptcstubs_arm.cpp:105
#19 0x4e817044 in SharedStub () from /home/jrmuizel/mozilla/maple/objdir-debug/dist/bin/libxul.so
#20 0x4db7991a in nsEventListenerManager::HandleEventSubType (this=0x4acb6970, aListenerStruct=0x4a990e48, aListener=0x4ab51cb0, aDOMEvent=0x48561550, aCurrentTarget=0x4a9c2e70, aPhaseFlags=2, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventListenerManager.cpp:809
#21 0x4db79af6 in nsEventListenerManager::HandleEventInternal (this=0x4acb6970, aPresContext=0x4af46400, aEvent=0x456742d8, aDOMEvent=0x45673bac, aCurrentTarget=0x4a9c2e70, aFlags=2, aEventStatus=0x45673bb0, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventListenerManager.cpp:866
#22 0x4dba0468 in nsEventListenerManager::HandleEvent (this=0x4acb6970, aPresContext=0x4af46400, aEvent=0x456742d8, aDOMEvent=0x45673bac, aCurrentTarget=0x4a9c2e70, aFlags=2, aEventStatus=0x45673bb0, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventListenerManager.h:137
#23 0x4dba09ce in nsEventTargetChainItem::HandleEvent (this=0x486f6080, aVisitor=..., aFlags=2, aMayHaveNewListenerManagers=false, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventDispatcher.cpp:185
#24 0x4dba0e98 in nsEventTargetChainItem::HandleEventTargetChain (this=0x486f6160, aVisitor=..., aFlags=6, aCallback=0x45673d18, aMayHaveNewListenerManagers=false, aPusher=0x45673b98) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventDispatcher.cpp:340
#25 0x4dba1c80 in nsEventDispatcher::Dispatch (aTarget=0x48222fb0, aPresContext=0x4af46400, aEvent=0x456742d8, aDOMEvent=0x0, aEventStatus=0x45673cb4, aCallback=0x45673d18, aTargets=0x0) at /home/jrmuizel/mozilla/maple/content/events/src/nsEventDispatcher.cpp:643
#26 0x4d7a3f80 in PresShell::DispatchTouchEvent (this=0x4acf6100, aEvent=0x456742d8, aStatus=0x456741d0, aEventCB=0x45673d18, aTouchIsNew=false) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:6456
#27 0x4d7a3a4c in PresShell::HandleEventInternal (this=0x4acf6100, aEvent=0x456742d8, aStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:6358
#28 0x4d7a3108 in PresShell::HandlePositionedEvent (this=0x4acf6100, aTargetFrame=0x482f4278, aEvent=0x456742d8, aEventStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:6093
#29 0x4d7a2668 in PresShell::HandleEvent (this=0x4acf6100, aFrame=0x4af247f8, aEvent=0x456742d8, aDontRetargetEvents=true, aEventStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:5893
#30 0x4d7a1980 in PresShell::HandleEvent (this=0x483d1400, aFrame=0x485fc7f8, aEvent=0x456742d8, aDontRetargetEvents=false, aEventStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/layout/base/nsPresShell.cpp:5630
#31 0x4dd17044 in nsViewManager::DispatchEvent (this=0x486f42e0, aEvent=0x456742d8, aView=0x486e2340, aStatus=0x456741d0) at /home/jrmuizel/mozilla/maple/view/src/nsViewManager.cpp:865
#32 0x4dd12456 in HandleEvent (aEvent=0x456742d8) at /home/jrmuizel/mozilla/maple/view/src/nsView.cpp:126
#33 0x4e53db06 in nsWindow::DispatchEvent (this=0x47459880, aEvent=0x456742d8) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:639
#34 0x4e53dad8 in nsWindow::DispatchEvent (this=0x47459880, aEvent=0x456742d8, aStatus=@0x45674244) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:631
#35 0x4e53f8e6 in nsWindow::DispatchMultitouchEvent (this=0x47459880, event=..., ae=0x4849a740) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:1441
#36 0x4e53f42e in nsWindow::OnMultitouchEvent (this=0x47459880, ae=0x4849a740) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:1358
#37 0x4e53e54c in nsWindow::OnGlobalAndroidEvent (ae=0x4849a740) at /home/jrmuizel/mozilla/maple/widget/android/nsWindow.cpp:842
#38 0x4e529dcc in nsAppShell::ProcessNextNativeEvent (this=0x474303e0, mayWait=false) at /home/jrmuizel/mozilla/maple/widget/android/nsAppShell.cpp:558
#39 0x4e54660e in nsBaseAppShell::DoProcessNextNativeEvent (this=0x474303e0, mayWait=false, recursionDepth=0) at /home/jrmuizel/mozilla/maple/widget/xpwidgets/nsBaseAppShell.cpp:139
#40 0x4e546958 in nsBaseAppShell::OnProcessNextEvent (this=0x474303e0, thr=0x44db11d0, mayWait=false, recursionDepth=0) at /home/jrmuizel/mozilla/maple/widget/xpwidgets/nsBaseAppShell.cpp:280
#41 0x4e7f6fbe in nsThread::ProcessNextEvent (this=0x44db11d0, mayWait=false, result=0x4567471f) at /home/jrmuizel/mozilla/maple/xpcom/threads/nsThread.cpp:586
#42 0x4e79b992 in NS_ProcessNextEvent_P (thread=0x44db11d0, mayWait=false) at /home/jrmuizel/mozilla/maple/objdir-debug/xpcom/build/nsThreadUtils.cpp:213
#43 0x4e66a14a in mozilla::ipc::MessagePump::Run (this=0x44db31f0, aDelegate=0x44dd90e0) at /home/jrmuizel/mozilla/maple/ipc/glue/MessagePump.cpp:82
#44 0x4e84b0bc in MessageLoop::RunInternal (this=0x44dd90e0) at /home/jrmuizel/mozilla/maple/ipc/chromium/src/base/message_loop.cc:208
#45 0x4e84b056 in MessageLoop::RunHandler (this=0x44dd90e0) at /home/jrmuizel/mozilla/maple/ipc/chromium/src/base/message_loop.cc:201
#46 0x4e84affe in MessageLoop::Run (this=0x44dd90e0) at /home/jrmuizel/mozilla/maple/ipc/chromium/src/base/message_loop.cc:175
#47 0x4e546692 in nsBaseAppShell::Run (this=0x474303e0) at /home/jrmuizel/mozilla/maple/widget/xpwidgets/nsBaseAppShell.cpp:163
#48 0x4e346914 in nsAppStartup::Run (this=0x475984c0) at /home/jrmuizel/mozilla/maple/toolkit/components/startup/nsAppStartup.cpp:256
#49 0x4d4ce3d4 in XREMain::XRE_mainRun (this=0x456749d4) at /home/jrmuizel/mozilla/maple/toolkit/xre/nsAppRunner.cpp:3765
#50 0x4d4ce5e2 in XREMain::XRE_main (this=0x456749d4, argc=7, argv=0x44dbf048, aAppData=0x804567ac) at /home/jrmuizel/mozilla/maple/toolkit/xre/nsAppRunner.cpp:3842
#51 0x4d4ce796 in XRE_main (argc=7, argv=0x44dbf048, aAppData=0x804567ac, aFlags=0) at /home/jrmuizel/mozilla/maple/toolkit/xre/nsAppRunner.cpp:3918
#52 0x4d4d8fde in GeckoStart (data=0x484480, appData=0x804567ac) at /home/jrmuizel/mozilla/maple/toolkit/xre/nsAndroidStartup.cpp:73
#53 0x8042a25e in Java_org_mozilla_gecko_GeckoAppShell_nativeRun (jenv=0x471098, jc=0x4054ed70, jargs=0x4059b1e8) at /home/jrmuizel/mozilla/maple/mozglue/android/APKOpen.cpp:966
#54 0x80017e38 in dvmPlatformInvoke () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
#55 0x800496e2 in dvmCallJNIMethod_general () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
#56 0x8004ee64 in dvmResolveNativeMethod () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
#57 0x8001d038 in dvmJitToInterpNoChain () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
#58 0x8001d038 in dvmJitToInterpNoChain () from /home/jrmuizel/src/gdb-root/moz-gdb/lib/353549C2A70D00EC/system/lib/libdvm.so
Comment 16 Jeff Muizelaar [:jrmuizel] 2012-05-29 12:11:14 PDT
And we do so because CheckAncestryAndGetFrame returns NULL.

In this case 'this' is https://bug731293.bugzilla.mozilla.org/attachment.cgi?id=625378
and currentDoc is text/html;charset=utf-8,%3Cscript%3E%0AsetTimeout%28function%28%29%20%7Bwindow.location.reload%28%29%3B%7D%2C%20200%29%3B%0A%3C/script%3E

I don't know if CheckAncestryAndGetFrame is doing the right thing here or not.
Comment 17 Andrew McCreight [:mccr8] 2012-05-29 12:37:04 PDT
Just from looking at the code, it seems like null is a reasonable value for CheckAncestryAndGetFrame to return.  The name implies it can fail, which would result in returning null.
Comment 18 Jeff Muizelaar [:jrmuizel] 2012-05-29 12:43:24 PDT
(In reply to Andrew McCreight [:mccr8] from comment #17)
> Just from looking at the code, it seems like null is a reasonable value for
> CheckAncestryAndGetFrame to return.  The name implies it can fail, which
> would result in returning null.

Yeah, it seems like the caller needs to figure out what an appropriate thing is to do when CheckAncestryAndGetFrame returns null.
Comment 19 Boris Zbarsky [:bz] 2012-05-29 15:12:17 PDT
nsDocument::NodesFromRectHelper is just broken here.  It shouldn't be putting null in that list.  The only two consumers (in Fennec chrome code) would end up throwing exceptions if the list contained null....
Comment 20 Jeff Muizelaar [:jrmuizel] 2012-05-29 18:09:09 PDT
In this test case though it seems like CheckAncestryAndGetFrame should be returning the FrameElement for and not NULL. i.e. shouldn't the list of Frames and associated nsIContents all be ancestors of the current document?
Comment 21 Johnathan Nightingale [:johnath] 2012-06-01 12:22:46 PDT
bz/jeff - is one of you the right assignee for this, given that you're pretty active in it?
Comment 22 Boris Zbarsky [:bz] 2012-06-01 13:33:08 PDT
The right assignee is someone who understans what that code is trying to do.

So probably Felipe or roc?

I can fix the crash, obviously, but it's not clear to me what behavior I'd break in the process...  Then again, it's behavior that currently always crashes, so maybe it's not a problem.
Comment 23 :Felipe Gomes (needinfo me!) 2012-06-01 14:56:40 PDT
Note that this code is very similar to ElementFromPoint which has existed for longer, and would also return null in that case.

Which is also odd to think that ElementFromPoint could ever return null for a valid x/y point.

Maybe this would happen when the function is being called from a sub-document and there's a higher level document over the x/y point (or, in this case, area) being used.. bz, what do you think?

The CheckAncestryAndGetFrame exists there to not return any cross-document elements. So if that theory makes sense then it's a correct fix to check for null and not add it to the list. Which I believe is the fix we'll end up doing but I wanted to understand why it happens first.
Comment 24 Boris Zbarsky [:bz] 2012-06-01 22:09:59 PDT
I really don't know offhand why null is being returned here.  I could dig through the code if desired....
Comment 25 Scoobidiver (away) 2012-06-04 01:48:28 PDT
It's #1 top crasher in 14.0b5 with 12.7% of crashes.
Comment 26 JP Rosevear [:jpr] 2012-06-04 05:26:52 PDT
(In reply to Boris Zbarsky (:bz) from comment #24)
> I really don't know offhand why null is being returned here.  I could dig
> through the code if desired....

That would be very helpful (top crash for Fennec Native right now as per #25).  Thanks.
Comment 27 :Felipe Gomes (needinfo me!) 2012-06-04 14:45:10 PDT
Created attachment 629936 [details] [diff] [review]
Patch

Patch doing the obvious. It'd still be good to understand what changed to cause this (this API had been in use in XUL fennec for a long time), but fixing the crash soon seems good. I'm pretty sure there's no point in having null elements in the list returned so the change seems reasonable.

Martijn: I sent it to tryserver; after it is built could you try the testcase and see if it fixes the problem? https://tbpl.mozilla.org/?tree=Try&rev=9100f56a76f2, builds here: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/felipc@gmail.com-9100f56a76f2
Comment 28 Chris Peterson [:cpeterson] 2012-06-04 14:59:56 PDT
Comment on attachment 629936 [details] [diff] [review]
Patch

Review of attachment 629936 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/base/src/nsDocument.cpp
@@ +2904,1 @@
>          elements->AppendElement(elementDoc);

nsBaseContentList::InsertElementAt asserts its aContent parameter != NULL. Should nsBaseContentList::AppendElement have an assert, too?
Comment 29 Johnny Stenback (:jst, jst@mozilla.com) 2012-06-04 17:21:52 PDT
(In reply to Chris Peterson (:cpeterson) from comment #28)
> nsBaseContentList::InsertElementAt asserts its aContent parameter != NULL.
> Should nsBaseContentList::AppendElement have an assert, too?

Yes!
Comment 30 :Felipe Gomes (needinfo me!) 2012-06-04 20:36:31 PDT
Created attachment 630044 [details] [diff] [review]
Patch v2

Added the assertion to AppendElement
Comment 31 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-06-04 21:31:02 PDT
(In reply to Felipe Gomes (:felipe) from comment #23)
> Note that this code is very similar to ElementFromPoint which has existed
> for longer, and would also return null in that case.
> 
> Which is also odd to think that ElementFromPoint could ever return null for
> a valid x/y point.
> 
> Maybe this would happen when the function is being called from a
> sub-document and there's a higher level document over the x/y point (or, in
> this case, area) being used.. bz, what do you think?

No, that can't happen here since we're only looking at this document's root frame and its descendants.

I suspect what's happening here is that we're rendering the "zombie document" that is going away due to a pending navigation, we find content in that document and call nsDocument::CheckAncestryAndGetFrame, which returns null for zombie documents for some reason.

I think we can do a much more robust approach that walks the frame tree. working up a patch.
Comment 32 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-06-04 22:01:04 PDT
Created attachment 630065 [details] [diff] [review]
Part 1: Refactor GetParentOrPlaceholderFor a bit
Comment 33 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-06-04 22:02:39 PDT
Created attachment 630066 [details] [diff] [review]
Part 2: Add nsLayoutUtils::GetParentOrPlaceholderForCrossDoc
Comment 34 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-06-04 22:03:35 PDT
Created attachment 630067 [details] [diff] [review]
Part 3: Walk frame tree to find correct element

Felipe, if these patches work for you, let's do it this way :-).
Comment 35 Boris Zbarsky [:bz] 2012-06-04 22:20:31 PDT
Comment on attachment 630044 [details] [diff] [review]
Patch v2

r=me, in general, though if roc's patches make the nsDocument bit not needed so much the better.
Comment 36 :Felipe Gomes (needinfo me!) 2012-06-04 22:45:28 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #34)
> Created attachment 630067 [details] [diff] [review]
> Part 3: Walk frame tree to find correct element
> 
> Felipe, if these patches work for you, let's do it this way :-).

Yeah, WFM. But I don't have an env ready to build and test Fennec so if someone who does could try the testcase with these patches that'd be nice
Comment 37 JP Rosevear [:jpr] 2012-06-05 06:51:50 PDT
Wondering about the risk profile for these two approaches - Felipes patches seem simpler, should we just take that for FF14 (Fennec Native 1.0) since its essentially a null check and let roc's patches land for 15/16?
Comment 38 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-06-05 13:52:20 PDT
(In reply to JP Rosevear [:jpr] from comment #37)
> Wondering about the risk profile for these two approaches - Felipes patches
> seem simpler, should we just take that for FF14 (Fennec Native 1.0) since
> its essentially a null check and let roc's patches land for 15/16?

That sounds reasonable.

With Felipe's patch, in testcases where we currently crash we will instead fail to register the touch/click. I.e., touching/clicking an iframe currently reloading its document might fail to do anything. My patches should fix that.
Comment 39 JP Rosevear [:jpr] 2012-06-05 14:17:56 PDT
Comment on attachment 630044 [details] [diff] [review]
Patch v2

[Triage Comment]
We'll take this patch for beta, for aurora/m-c we should probably take roc's patch.
Comment 40 :Gavin Sharp [email: gavin@gavinsharp.com] 2012-06-05 14:50:22 PDT
https://hg.mozilla.org/releases/mozilla-beta/rev/0a3f4d59d26b
Comment 41 Robert Kaiser (not working on stability any more) 2012-06-06 12:00:34 PDT
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #38)
> With Felipe's patch, in testcases where we currently crash we will instead
> fail to register the touch/click. I.e., touching/clicking an iframe
> currently reloading its document might fail to do anything.

That behavior doesn't sound unreasonable even for release (and crashing is surely worse), even though improving that with your patches is probably preferable in the long term.
Comment 42 Mats Palmgren (:mats) 2012-06-07 12:54:49 PDT
Comment on attachment 630067 [details] [diff] [review]
Part 3: Walk frame tree to find correct element

> content/base/src/nsDocument.cpp
> +  if (elem && !elem->IsElement()) {
> +    elem = elem->GetParent();
> +  }
> +  return CallQueryInterface(elem, aReturn);

CallQueryInterface will crash if 'elem' is null.


> +  for (PRInt32 i = 0; i < outFrames.Length(); i++) {

s/PRInt32/nsAutoTArray<nsIFrame*,8>::size_type/


> +    if (node && !node->IsElement() && !node->IsNodeOfType(nsINode::eTEXT)) {
> +      // If we have a node that isn't an element or a text node,
> +      // replace it with the first parent node.
> +      node = node->GetParent();
> +    }

The second comment line is confusing; "replace"? "the first"?
Perhaps this is better:
  // We have a node that isn't an element or a text node,
  // use its parent content instead.


> +    nsIContent* ptContent = f->GetContent();

'pt'?  Just 'content' will do I think.


> +  * Find the (non-anonymous) element in this document that contains the
> +  * content (possibly in a subdocument) associated with aFrame. Returns null
> +  * if there is no such element.

I think 'contains' is a bit misleading since normally it just returns
aFrame->GetContent().  The 'contains' suggests to me it returns
aFrame->GetContent()->GetParent().  The name of the method is clear though,
but if you can rephrase the comment without the "contains" that would be good.

r=mats
Comment 45 Scoobidiver (away) 2012-06-08 23:32:28 PDT
There are no crashes in 14.0b6.
Comment 46 :Ms2ger 2012-06-09 11:18:38 PDT
(In reply to Mats Palmgren [:mats] from comment #42)
> Comment on attachment 630067 [details] [diff] [review]
> Part 3: Walk frame tree to find correct element
> 
> > content/base/src/nsDocument.cpp
> > +  if (elem && !elem->IsElement()) {
> > +    elem = elem->GetParent();
> > +  }
> > +  return CallQueryInterface(elem, aReturn);
> 
> CallQueryInterface will crash if 'elem' is null.

It will still crash.
Comment 47 Brad Lassey [:blassey] (use needinfo?) 2012-06-11 16:47:49 PDT
Ms2ger, the crash has gone away, please open a new bug for it speculatively being able to crash
Comment 48 Scoobidiver (away) 2012-06-12 01:28:54 PDT
There are no crashes in builds above 16.0a1/20120608.
There are still crashes in Aurora where the patch hasn't landed.
Comment 49 JP Rosevear [:jpr] 2012-06-12 06:16:26 PDT
Ok, we need to decide if the work around or the fuller fix goes on to Aurora.  Given its early in the cycle, I think we take roc's patches.  Roc, can you nom?
Comment 50 JP Rosevear [:jpr] 2012-06-12 06:17:23 PDT
Tracking to make sure we don't lose this.
Comment 51 Chris Peterson [:cpeterson] 2012-06-19 13:23:52 PDT
This crash is still our #2 topcrash for Fennec 15.
Comment 52 Mats Palmgren (:mats) 2012-06-19 13:31:08 PDT
roc, see comment 49.
Comment 53 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-06-20 17:58:56 PDT
Somehow I failed to land my changes to address review comments :-(
Comment 54 Robert O'Callahan (:roc) (Exited; email my personal email if necessary) 2012-06-20 19:50:53 PDT
Review comments:
https://hg.mozilla.org/integration/mozilla-inbound/rev/6d5a6c8120c0
Comment 55 Ed Morley [:emorley] 2012-06-21 04:03:14 PDT
https://hg.mozilla.org/mozilla-central/rev/6d5a6c8120c0

Note You need to log in before you can comment on or make changes to this bug.