Last Comment Bug 619043 - crash [@ nsHtml5TreeOpExecutor::RunFlushLoop()] mainly with NoScript
: crash [@ nsHtml5TreeOpExecutor::RunFlushLoop()] mainly with NoScript
Status: RESOLVED FIXED
: crash
Product: Core
Classification: Components
Component: HTML: Parser (show other bugs)
: Trunk
: All All
: -- critical with 2 votes (vote)
: mozilla10
Assigned To: Henri Sivonen (:hsivonen)
:
Mentors:
: 619046 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-12-14 05:17 PST by Wayne Mery (:wsmwk, NI for questions)
Modified: 2011-10-18 09:33 PDT (History)
11 users (show)
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Defer termination if nsIParser::Terminate() is called when it's not safe (2.97 KB, patch)
2011-09-23 08:05 PDT, Henri Sivonen (:hsivonen)
bzbarsky: review+
Details | Diff | Review
Less bogus fix (3.60 KB, patch)
2011-09-26 00:56 PDT, Henri Sivonen (:hsivonen)
bzbarsky: review+
christian: approval‑mozilla‑aurora-
christian: approval‑mozilla‑beta-
Details | Diff | Review

Description Wayne Mery (:wsmwk, NI for questions) 2010-12-14 05:17:12 PST
crash [@ nsHtml5TreeOpExecutor::RunFlushLoop()]

4.0 betas
bp-54e3249d-ca51-4083-8505-ee24e2101125
EXCEPTION_ACCESS_VIOLATION_READ
0x4C
0	xul.dll	nsHtml5TreeOpExecutor::RunFlushLoop	parser/html/nsHtml5TreeOpExecutor.cpp:458
1	xul.dll	nsHtml5ExecutorFlusher::Run	parser/html/nsHtml5TreeOpExecutor.cpp:90
2	xul.dll	nsThread::ProcessNextEvent	xpcom/threads/nsThread.cpp:626
3	xul.dll	mozilla::ipc::MessagePump::Run	ipc/glue/MessagePump.cpp:110
4	xul.dll	xul.dll@0xb0bcbb	
5	xul.dll	MessageLoop::RunHandler	ipc/chromium/src/base/message_loop.cc:202
6	mozcrt19.dll	_VEC_memzero	
7	xul.dll	xul.dll@0x35996d	
8	firefox.exe	firefox.exe@0x1bb7	
9	ntdll.dll	WinSqmSetIfMaxDWORD	
10	ntdll.dll	_RtlUserThreadStart
Comment 1 Scoobidiver (away) 2010-12-14 05:28:03 PST
*** Bug 619046 has been marked as a duplicate of this bug. ***
Comment 2 Robert Kaiser (not working on stability any more) 2011-03-31 07:00:44 PDT
This _literally_ exploded in yesterday's stats on 4.0*, with some very high correlations:

Modules:
90% (253/281) vs.   3% (1751/52073) avgssff4.dll
90% (252/281) vs.   3% (1765/52073) avglngx.dll
90% (253/281) vs.   4% (1832/52073) avgcfgx.dll
90% (253/281) vs.   4% (1878/52073) avgxpl.dll
90% (253/281) vs.   4% (1900/52073) avglogx.dll

Add-Ons:
85% (238/281) vs.   4% (1937/52073) {1E73965B-8B48-48be-9C8D-68B920ABC1C4}

This crash was in the range of 15-25 crashes / 1M ADU from 2011-03-20 to 2011-03-29 and on 2011-03-30 jumped to ~150 crashes / 1M ADU (-29 and -30 numbers roughly "normalized" for throttling of crashes).

This is the #24 topcrash in 4.0* numbers for 2011-03-30 alone.
Comment 3 Henri Sivonen (:hsivonen) 2011-03-31 07:16:57 PDT
Those correlations sure make it look like an AVG product is involved.
Comment 4 Henri Sivonen (:hsivonen) 2011-03-31 07:19:20 PDT
Kairo, are you sure you pasted the correlations on the right bug? I don't see this signature in the top crashes?
Comment 5 Robert Kaiser (not working on stability any more) 2011-03-31 07:44:03 PDT
(In reply to comment #4)
> Kairo, are you sure you pasted the correlations on the right bug? I don't see
> this signature in the top crashes?

It's #122 in https://crash-stats.mozilla.com/topcrasher/byversion/Firefox/4.0 right now - note that this page is a 7-day report on 4.0 alone, while my number in comment #2 is a single-day number for 4.0*, as calculated by my experimental report at http://test.kairo.at/socorro/2011-03-30.4.0.explosiveness.html
Comment 6 Marcia Knous [:marcia - use ni] 2011-04-01 13:30:31 PDT
Checking today this stack has 1584 Windows crashes for the last 7 days.
Comment 7 [Baboo] 2011-09-23 03:38:24 PDT
Can someone have a look into whether there is a correlation with the NoScript extension? From what I have seen many people having this crash had this extension (id: {73a6fe31-595d-460b-a920-fcc0f8843232})
Maybe it's some DOM bug esp. triggered by NoScript.
Comment 8 Scoobidiver (away) 2011-09-23 05:01:11 PDT
Here are crash reports:
https://crash-stats.mozilla.com/report/list?signature=nsHtml5TreeOpExecutor%3A%3ARunFlushLoop%28%29

They show indeed a correlation with NoScript and no longer with AVG in 6.0.2:
71% (42/59) vs.   1% (1333/114741) {73a6fe31-595d-460b-a920-fcc0f8843232} (NoScript, https://addons.mozilla.org/addon/722)
47% (28/59) vs.   9% (10758/114741) {d10d0bf8-f5b5-c8b4-a8b2-2b9879e08c5d} (Adblock Plus, https://addons.mozilla.org/addon/1865)

Stack traces are now different from the ones in comment 0 and seem to be related to IPC:
0 	xul.dll 	nsHtml5TreeOpExecutor::RunFlushLoop 	parser/html/nsHtml5TreeOpExecutor.cpp:458
1 	xul.dll 	nsHtml5ExecutorReflusher::Run 	parser/html/nsHtml5StreamParser.cpp:153
2 	xul.dll 	nsThread::ProcessNextEvent 	xpcom/threads/nsThread.cpp:618
3 	nspr4.dll 	_MD_CURRENT_THREAD 	nsprpub/pr/src/md/windows/w95thred.c:308
4 	xul.dll 	mozilla::ipc::MessagePump::Run 	ipc/glue/MessagePump.cpp:134
5 	xul.dll 	MessageLoop::RunHandler 	ipc/chromium/src/base/message_loop.cc:202
6 	xul.dll 	MessageLoop::Run 	ipc/chromium/src/base/message_loop.cc:176
7 	xul.dll 	nsBaseAppShell::Run 	widget/src/xpwidgets/nsBaseAppShell.cpp:189
8 	xul.dll 	xul.dll@0xb7153f 	
9 	xul.dll 	nsAppStartup::Run 	toolkit/components/startup/nsAppStartup.cpp:222
10 	xul.dll 	XRE_main 	toolkit/xre/nsAppRunner.cpp:3686
11 	firefox.exe 	wmain 	toolkit/xre/nsWindowsWMain.cpp:106
12 	firefox.exe 	__tmainCRTStartup 	obj-firefox/memory/jemalloc/crtsrc/crtexe.c:591
13 	kernel32.dll 	BaseProcessStart
Comment 9 Henri Sivonen (:hsivonen) 2011-09-23 05:14:00 PDT
Giorgio, does noscript stop a document load from an observer that fires synchronously with DOM building?
Comment 10 Giorgio Maone [:mao] 2011-09-23 06:19:05 PDT
(In reply to Henri Sivonen (:hsivonen) from comment #9)
> Giorgio, does noscript stop a document load from an observer that fires
> synchronously with DOM building?

Yes it may (the observer is, yet again, for http-on-modify-request).
And this time there's no way to do otherwise without breaking NoScript (actually, my work around for bug 677050 depends on this ability). So, if this is the culprit of this crash, please fix Gecko back, or give me a way to block *and* possibly redirect requests asynchronously at that stage.
Comment 11 chris hofmann 2011-09-23 07:28:10 PDT
should this be tracked or fx7 or 8.  for the last several weeks its been low volume only in final releases. starting around sept 13 it started increasing in daily volume on beta and aurora channels.
Comment 12 chris hofmann 2011-09-23 07:28:55 PDT
         nsHtml5TreeOpExecutor::RunFlushLoop
date     total    breakdown by build
         crashes  count build, count build, ...


20110910 3  	1 7.02011082417, 
        		1 6.0.22011090213, 	1 4.0.12011041322, 
20110911 1 6.0.22011090213 	1 , 
20110912 5  	3 6.0.22011090213, 
        		1 6.0.22011090606, 	1 6.0.22011090517, 
20110913 11  	5 6.0.22011090213, 
        		2 7.02011090813, 	2 6.0.12011083020, 
        		2 6.02011081116, 
20110914 24  	11 6.0.22011090213, 
        		3 7.02011090813, 	3 7.02011082417, 
        		2 7.02011090909, 	1 8.0a22011090804, 
        		1 7.02011091214, 	1 6.0.22011090517, 
        		1 5.0.12011070718, 	1 4.0.12011041322, 
20110915 8  	5 6.0.22011090213, 
        		2 6.0.22011090517, 	1 7.02011090813,
Comment 13 Henri Sivonen (:hsivonen) 2011-09-23 08:05:19 PDT
Created attachment 562041 [details] [diff] [review]
Defer termination if nsIParser::Terminate() is called when it's not safe

I expect this patch to fix the crash. I didn't try to reproduce the crash or to write a test case, because I expect that to be a similar exercise in futility as with bug 677050.
Comment 14 Boris Zbarsky [:bz] 2011-09-23 09:46:09 PDT
Comment on attachment 562041 [details] [diff] [review]
Defer termination if nsIParser::Terminate() is called when it's not safe

r=me, but should you have an aTerminated test in the DidBuildModel conditional?
Comment 15 Marcia Knous [:marcia - use ni] 2011-09-23 17:12:56 PDT
Adding Mac and Linux specific signature, since it is almost the exact same signature. Mac and Linux crashes happen across all versions and I see No Script installed in few of the crashes.
Comment 16 Henri Sivonen (:hsivonen) 2011-09-24 05:04:29 PDT
Comment on attachment 562041 [details] [diff] [review]
Defer termination if nsIParser::Terminate() is called when it's not safe

Sorry about wasting your time with an over-eager review request. (I was hoping to proceed in time for Firefox 7.) This patch can't be landed, because there are other situations where Terminate() is supposed to work while the flush loop is running.
Comment 17 Henri Sivonen (:hsivonen) 2011-09-26 00:56:19 PDT
Created attachment 562368 [details] [diff] [review]
Less bogus fix
Comment 18 Henri Sivonen (:hsivonen) 2011-09-26 05:39:49 PDT
Comment on attachment 562368 [details] [diff] [review]
Less bogus fix

Checking mParser for null is the pre-existing termination check and the methods to which I added early returns already had early returns. Also, the tryserver was green with this except for one leak that looked random.
Comment 19 Boris Zbarsky [:bz] 2011-10-12 14:08:45 PDT
Comment on attachment 562368 [details] [diff] [review]
Less bogus fix

r=me but could you please file a followup bug on reverting this change once bug 694101 is fixed and load start becomes async?
Comment 20 Henri Sivonen (:hsivonen) 2011-10-14 03:51:55 PDT
(In reply to Boris Zbarsky (:bz) from comment #19)
> Comment on attachment 562368 [details] [diff] [review] [diff] [details] [review]
> Less bogus fix
> 
> r=me 

Thank you.

> but could you please file a followup bug on reverting this change once
> bug 694101 is fixed and load start becomes async?

Filed bug 694528.
Comment 21 Henri Sivonen (:hsivonen) 2011-10-14 03:52:21 PDT
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa65bb9a0909
Comment 22 Henri Sivonen (:hsivonen) 2011-10-14 03:54:17 PDT
Comment on attachment 562368 [details] [diff] [review]
Less bogus fix

Nominating for aurora and beta, since this would be a relatively simple way to reduce crashiness for NoScript users.
Comment 23 :Ehsan Akhgari (busy, don't ask for review please) 2011-10-14 08:10:20 PDT
https://hg.mozilla.org/mozilla-central/rev/fa65bb9a0909
Comment 24 christian 2011-10-18 09:33:01 PDT
Comment on attachment 562368 [details] [diff] [review]
Less bogus fix

We're going to let this one come up through the normal process:

1. There were only ~500 crashes total in the last week
2. This has existed since Firefox 4
3. Changes to the parser feel like they have more risk in general

Note You need to log in before you can comment on or make changes to this bug.