Closed Bug 571103 Opened 14 years ago Closed 14 years ago

Crash [@ nsGenericElement::SaveSubtreeState() ]

Categories

(Core :: DOM: Core & HTML, defect)

defect
Not set
critical

Tracking

()

RESOLVED INCOMPLETE
Tracking Status
blocking2.0 --- -

People

(Reporter: MatsPalmgren_bugz, Assigned: sicking)

References

Details

(Keywords: crash, Whiteboard: [sg:critical?][need testcase or STR])

Crash Data

Crash [@ nsGenericElement::SaveSubtreeState() ]
bp-82532c8b-5a15-465d-9243-5210c2100608

nsGenericElement::SaveSubtreeState	 content/base/src/nsGenericElement.cpp:3789
nsGenericElement::SaveSubtreeState	content/base/src/nsGenericElement.cpp:3789
nsGenericElement::SaveSubtreeState	content/base/src/nsGenericElement.cpp:3789
nsGenericElement::SaveSubtreeState	content/base/src/nsGenericElement.cpp:3789
nsGenericElement::SaveSubtreeState	content/base/src/nsGenericElement.cpp:3789
nsGenericElement::SaveSubtreeState	content/base/src/nsGenericElement.cpp:3789
nsGenericElement::SaveSubtreeState	content/base/src/nsGenericElement.cpp:3789
nsDocument::RemovedFromDocShell	content/base/src/nsDocument.cpp:6782
DocumentViewerImpl::Close	layout/base/nsDocumentViewer.cpp:1438
nsDocShell::RestoreFromHistory	docshell/base/nsDocShell.cpp:6730
nsDocShell::RestorePresentationEvent::Run	docshell/base/nsDocShell.cpp:6405
nsThread::ProcessNextEvent	xpcom/threads/nsThread.cpp:547
mozilla::ipc::MessagePump::Run	ipc/glue/MessagePump.cpp:118
MessageLoop::RunInternal	ipc/chromium/src/base/message_loop.cc:216
MessageLoop::RunHandler	ipc/chromium/src/base/message_loop.cc:199
xul.dll@0x30ab13	
MessageLoop::Run
nsBaseAppShell::Run	widget/src/xpwidgets/nsBaseAppShell.cpp:175 


There were 238 crashes in the past week:
http://crash-stats.mozilla.com/query/query?version=ALL%3AALL&range_value=1&range_unit=weeks&date=06%2F09%2F2010+14%3A53%3A09&query_search=signature&query_type=exact&query=nsGenericElement%3A%3ASaveSubtreeState%28%29&build_id=&process_type=browser&hang_type=any&do_query=1
213 on Windows, 25 on MacOSX.  All branches are affected.
Whiteboard: [sg:critical?] → [sg:critical?][critsmash:investigating]
One commonality I see in some of the recent crashes on the 1.9.3 branch is that they all have http://addons.mozilla.org/en-US/firefox/addon/6973 installed. Some of the URLs indicate people may be downloading various types of files when they experience this crash. Am investigating further.
Will take a look at some more crash data and report back before the next meeting.
One common thread I also see in the URLs is that some of them are Arabic sites, such as http://s3.travian.com.eg/nachrichten.php.
Bug 566441 is the other crash bug I mentioned during the meeting that has a similar beginning stack.
I will be looking for other sites where we can try to possibly reproduce this, as well as looking at any possible data from B1.
Will look at B1 data to see where it is in the crash stats as well if there are any further clues to repro.
still in b1, and it looks like the crash rate has increased

checking --- nsGenericElement::SaveSubtreeState 20100712-crashdata.csv
found in: 4.0b1 3.6.6 3.7a5 3.6.7 3.6.3 3.7a6pre 3.5.9 3.0.3
release   total-crashes  nsGenericElement::SaveSubtreeState crashes  pct.
all       285620         178                                         0.000623206
4.0b1     18756          154                                         0.00821071
3.6.6     162229         13                                          8.01336e-05
3.7a5     721            3                                           0.00416089
3.6.7     14043          3                                           0.00021363
3.6.3     19723          2                                           0.000101404
3.7a6pre  379            1                                           0.00263852
3.5.9     5752           1                                           0.000173853
3.0.3     537            1                                           0.0018622
the 4.0b1 version of the crash is 100% correlated to IDM

100% (161/161) vs.  51% (9199/17949) mozilla_cc@internetdownloadmanager.com (IDM CC, https://addons.mozilla.org/addon/6973)
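
As a rough check of the correlation figures above: each percentage is the share of crash reports in a group that list the add-on, computed once for reports with this signature and once for what is presumably the whole sample. A minimal sketch in Python; the function name `addon_share` is made up for illustration, and only the four counts come from the comment.

```python
def addon_share(with_addon, total):
    """Fraction of crash reports in a group that list the add-on."""
    if total <= 0:
        raise ValueError("total must be positive")
    return with_addon / total


# Counts quoted in the comment: reports with this signature, then the
# comparison group (assumed here to be the whole sample).
signature_share = addon_share(161, 161)
baseline_share = addon_share(9199, 17949)

if __name__ == "__main__":
    print(f"{signature_share:.0%} vs. {baseline_share:.0%}")
```

A share of 161/161 says every sampled report with this signature had the add-on, while roughly half of the comparison group did. That is a correlation, not a root cause.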
Depends on: 578443
decided to close this and track any remaining work in the dependency bug to do the blocklisting.
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Reopening: blocklisting may have fixed the 4.0b1 spike but we still have this crash on all branches, Mac and Win OSs, and the stacks show it's exploitable (trying to jump to a random address in several cases).

Checking a handful of 3.6.8 crashes I see none with the IDM add-on. It might still be there in some stacks I didn't check, might have exacerbated the problem in 4.0beta, but it is not the underlying cause.

Any hints on what kind of document or document changes result in recursive calls to nsGenericElement::SaveSubtreeState? If we can get an interesting starting point, maybe we can get a fuzzer to find it.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Whiteboard: [sg:critical?][critsmash:investigating] → [sg:critical?][need testcase or STR]
I will see if I can get URLs here for my crash testing, and whether I can get a testcase that makes Firefox crash

Chris: could you give me the urls?
interesting that this mostly still happens to 4.0 users


checking --- nsGenericElement::SaveSubtreeState 20100913-crashdata.csv
found in: 4.0b6pre 3.6.9 4.0b5 4.0b1 3.6.8 3.7a4webm 3.7a5 3.6.6 3.6.4 3.5.11
release    total-crashes  nsGenericElement::SaveSubtreeState crashes  pct.
all        358954         50                                          0.000139294
4.0b6pre   5444           26                                          0.0047759
3.6.9      167129         7                                           4.18838e-05
4.0b5      42210          5                                           0.000118455
4.0b1      1431           3                                           0.00209644
3.6.8      49757          3                                           6.0293e-05
3.7a4webm  17             2                                           0.117647
3.7a5      240            1                                           0.00416667
3.6.6      7874           1                                           0.000127
3.6.4      3136           1                                           0.000318878
3.5.11     5145           1                                           0.000194363
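
The pct. column in these tables appears to be the signature's crash count divided by the release's total crash count (a fraction, not a percentage). A small Python sketch that recomputes a few rows under that assumption; `crash_rate` and the `rows` dictionary are illustrative names, and the counts are copied from the table above.

```python
def crash_rate(signature_crashes, total_crashes):
    """Share of a release's crashes that carry the given signature."""
    if total_crashes <= 0:
        raise ValueError("total_crashes must be positive")
    return signature_crashes / total_crashes


# release -> (total crashes, crashes with this signature)
rows = {
    "all": (358954, 50),
    "3.6.9": (167129, 7),
    "4.0b5": (42210, 5),
}

if __name__ == "__main__":
    for release, (total, hits) in rows.items():
        # ".6g" mimics the six-significant-digit style used in the table.
        print(release, total, hits, format(crash_rate(hits, total), ".6g"))
```

Because the totals differ by orders of magnitude between releases, the rate is a better basis for comparison than the raw count, although rows with very small totals are noisy.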

os breakdown
nsGenericElement::SaveSubtreeState Total 50
Win5.1  0.42
Win6.0  0.10
Win6.1  0.38
Mac10.4 0.06

URLs for testing might not be that helpful; they are pretty random and might just reflect common browsing.

 1 https://www.tsp.gov/tsp/profileSettings.do?subaction=view&_name=profile
 1 https://ssl.rapidshare.com/premzone.html
 1 https://sports.betfair.com/blank.html
   http://www.thestar.com/news/gta/article/857967--tele-robot-connects-workers-separated-by-distance

and a variety of searches on

 http://www.google.pl 
 http://www.google.it/ 
 http://www.google.hu/ 
 http://www.google.fi/
 http://www.google.cz/
 http://www.google.com/
 http://www.google.com.vn/
 http://www.google.co.uk/
 http://www.google.co.jp/
 http://www.google.co.id/
 http://www.google.ca/
Component: General → DOM
QA Contact: general → general
Bug 597535 has almost an identical stack to this - is it a dupe?
This might have gone away in today's build.
... or maybe it's bug 597535 that has; I'm not sure what the distinction is.
3 people crashed using the 2010100100 build, so it is not completely gone. Will take a look at the URLs, but I think I have seen http://globoesporte.globo.com/futebol/times/botafogo/ in the mix more than once.
Can we find out if this has gone away?
Assignee: nobody → jonas
blocking2.0: --- → final+
haven't seen any crashes on trunk since comment 16

date     tl crashes at, count build, count build, ...
         nsGenericElement::SaveSubtreeState
20100920   
20100921 2 4.0b7pre2010092003 2 , 
20100922 2  1 4.0b7pre2010092103, 
	     1 3.6.102010091518, 
20100923 1 4.0b7pre2010092303 1 , 
20100924   
20100925 1 4.0b7pre2010092103 1 , 
20100926 2  1 4.0b7pre2010092603, 
	     1 4.0b7pre2010092503, 
20100927   
20100928 1 4.0b7pre2010092803 1 , 
20100929   
20100930   
20101001   
20101002 2 4.0b62010091407 2 , 
20101003   
20101004   
20101005 1 4.0b62010091407 1 , 
20101006   
20101007   
20101008   
20101009   
20101010   
20101011 2  1 4.0b62010091407, 

should we mark this works for me, or fixed, or wait for b7 to help confirm?
Using a manual search, I still see two instances of this crash on 3.6.11 - https://crash-stats.mozilla.com/report/index/3bd54cf6-d662-4086-9aef-1ea7f2101011 is one. There are also a handful on the trunk such as https://crash-stats.mozilla.com/report/index/2eef8dc5-9295-4147-99ba-a49e02101002 that occurred using the 20101001082844 build, 4.0b7pre.
not sure exactly,  but here is what might be going on.

the web interface says client crash time is     
2010-10-11 07:42:29.674630 

If that report was requested to be processed by a user on Oct 12 or 13, then it would have missed the chance to get into the .csv run for Oct 10. The .csv files can miss reports in this kind of situation: there might be a backlog when the .csv run starts each morning, or a user might request processing of a report that would otherwise have been held in deferred storage.

I rechecked those UUIDS and they aren't in any of the .csv files.
Jonas, can you give an update on where you are with this bug?
Jonas, what's most likely to help here?
* You reading some of the functions on the stack and looking for sketchiness
* You looking through minidumps
* Me adding something (iframe-session-history testing?) to my DOM fuzzer

Or should we give up on this, saying it's low-volume enough that it could be prior memory corruption or so?
At the very least I'd like to look at some minidumps before giving up on this one.

I don't think session history stuff is the underlying cause here, although it's certainly a possibility; it rather looks like the trigger.
Using this for debugging: c7a2bc4c-c8ce-45b5-9ee5-8fc792101112

We're crashing on the line with the arrow.

EBX is the current this-pointer and is 0x083EE700

ECX is supposed to hold the value of the child we're about to call SaveSubtreeState on and contains 0x08A76180, which seems like a valid pointer. It appears to be the second child of the current element.

However EDX is supposed to contain the vtable pointer of that child, but is null. So presumably ECX is a dangling pointer.

6878EA47  mov         ecx,dword ptr [ebx+2Ch] 
6878EA4A  test        ecx,ecx 
6878EA4C  je          nsGenericElement::SaveSubtreeState+5Ah (6878EA8Ah) 
6878EA4E  mov         eax,dword ptr [ecx] 
6878EA50  and         eax,3FFh 
6878EA55  lea         eax,[esi+eax*2] 
6878EA58  mov         ecx,dword ptr [ecx+eax*4+0Ch] 
6878EA5C  mov         edx,dword ptr [ecx] 
->6878EA5E  mov         eax,dword ptr [edx+148h] 
6878EA64  cmp         eax,offset nsGenericElement::SaveSubtreeState (6878EA30h) 
6878EA69  jne         nsGenericElement::SaveSubtreeState+4Bh (6878EA7Bh) 
6878EA6B  call        nsGenericElement::SaveSubtreeState (6878EA30h) 


I can't think of a way that this would happen. One possibility is that someone is doing improper refcounting and has released the child once too many. Another is memory corruption which has overwritten this entry in the child-array.

In either case the error has happened prior to the crashing stack, so I don't know what type of instrumentation we could add that would be helpful.
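
The address arithmetic in the disassembly above can be sketched as follows. This is one reading of the instructions, not Gecko source: it assumes ESI holds the index of the child being visited, that the low 10 bits of the first word of the array header count attributes, and that each attribute occupies two pointer-sized slots ahead of the children. All function and constant names are invented for illustration.

```python
ATTR_COUNT_MASK = 0x3FF   # from `and eax,3FFh`
SLOT_SIZE = 4             # 32-bit pointers, from the `eax*4` scale
BUFFER_OFFSET = 0x0C      # from the `+0Ch` displacement
VTABLE_SLOT_OFFSET = 0x148  # from `mov eax,dword ptr [edx+148h]`


def child_slot_offset(header_word, child_index):
    """Byte offset of a child pointer inside the array buffer.

    Mirrors `lea eax,[esi+eax*2]` followed by `[ecx+eax*4+0Ch]`,
    assuming ESI is the child index (an assumption, see lead-in).
    """
    attr_count = header_word & ATTR_COUNT_MASK
    slot = child_index + attr_count * 2
    return BUFFER_OFFSET + slot * SLOT_SIZE


def faulting_address(vtable_ptr):
    """Address read when fetching the virtual function pointer."""
    return (vtable_ptr + VTABLE_SLOT_OFFSET) & 0xFFFFFFFF


if __name__ == "__main__":
    # Second child (index 1) of an element with no attributes.
    print(hex(child_slot_offset(0, 1)))
    # EDX was null in the dump, so the read lands just above address 0.
    print(hex(faulting_address(0)))
```

With a null vtable pointer the read targets a small address near zero, which is consistent with the child object having been freed or overwritten before this stack ran.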

Based on that, I'd say let's unblock on this one; the volume is very low for this signature. It's probably an sg:crit, but without steps to reproduce it's hard to get further here.
blocking2.0: final+ → -
INCO per comment 25.
Status: REOPENED → RESOLVED
Closed: 14 years ago → 14 years ago
Resolution: --- → INCOMPLETE
Crash Signature: [@ nsGenericElement::SaveSubtreeState() ]
Group: core-security
Component: DOM → DOM: Core & HTML