Closed Bug 616421 Opened 15 years ago Closed 6 years ago

crash [@ nsHtml5TreeOperation::Perform(nsHtml5TreeOpExecutor*, nsIContent**) ]

Categories

(Core :: DOM: HTML Parser, defect)

critical

Tracking

RESOLVED INCOMPLETE
Tracking Status
firefox47 --- wontfix
firefox48 --- fixed
firefox49 --- wontfix
firefox-esr45 --- wontfix
firefox50 --- wontfix
firefox51 --- wontfix
firefox52 --- wontfix

People

(Reporter: scoobidiver, Assigned: n.nethercote)

Details

(Keywords: crash, Whiteboard: [tbird crash])

Attachments

(2 files)

It is a residual crash signature that exists in the trunk builds. It is the #263 top crasher in 4.0b7 for the last week.

Signature nsHtml5TreeOperation::Perform(nsHtml5TreeOpExecutor*, nsIContent**)
UUID b3ce6b53-6ea7-43c4-a5b5-ecc882101202
Time 2010-12-02 20:17:28.445494
Uptime 1634
Last Crash 458556 seconds (5.3 days) before submission
Install Age 17733 seconds (4.9 hours) since version was first installed.
Product Firefox
Version 4.0b8pre
Build ID 20101202030316
Branch 2.0
OS Windows NT
OS Version 5.1.2600 Service Pack 3
CPU x86
CPU Info GenuineIntel family 6 model 23 stepping 10
Crash Reason EXCEPTION_ACCESS_VIOLATION_READ
Crash Address 0x1afcad8
App Notes AdapterVendorID: 8086, AdapterDeviceID: 29c2
MSAFD Tcpip [TCP/IP] : 2 : 1 :
MSAFD Tcpip [UDP/IP] : 2 : 2 : %SystemRoot%\system32\mswsock.dll
MSAFD Tcpip [RAW/IP] : 2 : 3 : %SystemRoot%\system32\mswsock.dll
RSVP UDP Service Provider : 6 : 2 : %SystemRoot%\system32\rsvpsp.dll
RSVP TCP Service Provider : 6 : 1 : %SystemRoot%\system32\rsvpsp.dll
MSAFD NetBIOS [\Device\NetBT_Tcpip_{BE5C30CD-B017-4054-AC27-3638F43693E7}] SEQPACKET 0 : 2 : 5 : %SystemRoot%\system32\mswsock.dll
MSAFD NetBIOS [\Device\NetBT_Tcpip_{BE5C30CD-B017-4054-AC27-3638F43693E7}] DATAGRAM 0 : 2 : 2 : %SystemRoot%\system32\mswsock.dll
MSAFD NetBIOS [\Device\NetBT_Tcpip_{8647200B-DB04-4800-AFEA-7A5D232122D5}] SEQPACKET 1 : 2 : 5 : %SystemRoot%\system32\mswsock.dll
MSAFD NetBIOS [\Device\NetBT_Tcpip_{8647200B-DB04-4800-AFEA-7A5D232122D5}] DATAGRAM 1 : 2 : 2 : %SystemRoot%\system32\mswsock.dll
MSAFD NetBIOS [\Device\NetBT_Tcpip_{9230AC61-0AAD-4A13-92F7-02591246DD5D}] SEQPACKET 2 : 2 : 5 : %SystemRoot%\syste

Frame Module Signature Source
0 xul.dll nsHtml5TreeOperation::Perform parser/html/nsHtml5TreeOperation.cpp:493
1 xul.dll nsHtml5TreeOpExecutor::RunFlushLoop parser/html/nsHtml5TreeOpExecutor.cpp:509
2 xul.dll nsHtml5ExecutorReflusher::Run parser/html/nsHtml5TreeOpExecutor.cpp:90
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:626
4 nspr4.dll _MD_CURRENT_THREAD nsprpub/pr/src/threads/combined/prulock.c:404
5 nspr4.dll _MD_CURRENT_THREAD nsprpub/pr/src/threads/combined/prulock.c:404
6 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:110
7 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:202
8 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:176
9 xul.dll nsBaseAppShell::Run widget/src/xpwidgets/nsBaseAppShell.cpp:192
10 xul.dll xul.dll@0xb0cd0b
11 xul.dll nsAppStartup::Run toolkit/components/startup/src/nsAppStartup.cpp:191
12 xul.dll XRE_main toolkit/xre/nsAppRunner.cpp:3691
13 firefox.exe wmain toolkit/xre/nsWindowsWMain.cpp:128
14 firefox.exe __tmainCRTStartup obj-firefox/memory/jemalloc/crtsrc/crtexe.c:591
15 kernel32.dll BaseProcessStart

More reports at:
http://crash-stats.mozilla.com/report/list?product=Firefox&query_search=signature&query_type=exact&query=&range_value=4&range_unit=weeks&hang_type=any&process_type=any&plugin_field=&plugin_query_type=&plugin_query=&do_query=1&admin=&signature=nsHtml5TreeOperation%3A%3APerform%28nsHtml5TreeOpExecutor*%2C%20nsIContent**%29
I am guessing this is a matter of the nsIContent handle nsTArray failing to allocate a handle. Making this depend on bug 610823. If the problem persists after bug 610823 has been fixed, let's investigate more.
Depends on: 610823
Whiteboard: [waiting for bug 610823]
Crash Signature: [@ nsHtml5TreeOperation::Perform(nsHtml5TreeOpExecutor*, nsIContent**) ]
(In reply to comment #1)
> I am guessing this is a matter of the nsIContent handle nsTArray failing to
> allocate a handle. Making this depend on bug 610823. If the problem persists
> after bug 610823 has been fixed, let's investigate more.

So, bug 610823 has been fixed, but my brother has seen this crash on his computer:
https://crash-stats.mozilla.com/report/index/73984456-422f-4aab-ba2e-571f32110715

0 xul.dll nsHtml5TreeOperation::Perform parser/html/nsHtml5TreeOperation.cpp:279
1 xul.dll nsHtml5TreeOpExecutor::RunFlushLoop parser/html/nsHtml5TreeOpExecutor.cpp:489
2 xul.dll nsHtml5ExecutorFlusher::Run parser/html/nsHtml5TreeOpExecutor.cpp:90
3 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:618
4 xul.dll nsThread::GetObserver xpcom/threads/nsThread.cpp:692
5 xul.dll nsThread::Shutdown xpcom/threads/nsThread.cpp:481
6 xul.dll NS_InvokeByIndex_P xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:102
7 xul.dll nsProxyObjectCallInfo::Run xpcom/proxy/src/nsProxyEvent.cpp:182
8 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:618
9 xul.dll nsThread::GetObserver xpcom/threads/nsThread.cpp:692
10 xul.dll nsThread::Shutdown xpcom/threads/nsThread.cpp:481
11 xul.dll NS_InvokeByIndex_P xpcom/reflect/xptcall/src/md/win32/xptcinvoke.cpp:102
12 xul.dll nsProxyObjectCallInfo::Run xpcom/proxy/src/nsProxyEvent.cpp:182
13 xul.dll nsThread::ProcessNextEvent xpcom/threads/nsThread.cpp:618
14 xul.dll mozilla::ipc::MessagePump::Run ipc/glue/MessagePump.cpp:110
15 xul.dll xul.dll@0xb60d87
16 xul.dll MessageLoop::RunHandler ipc/chromium/src/base/message_loop.cc:202
17 xul.dll xul.dll@0x36a1ff
18 xul.dll MessageLoop::Run ipc/chromium/src/base/message_loop.cc:176
19 xul.dll nsJARURI::QueryInterface modules/libjar/nsJARURI.cpp:75

Does this mean that the problem still exists?
Can the line number in the stack trace be trusted? The right hg version is http://hg.mozilla.org/releases/mozilla-beta/file/tip/parser/html/nsHtml5TreeOperation.cpp#l279 right? It doesn't make sense for that line to crash. If the method Append() is getting inlined, shouldn't the line numbers point to lines inside Append()? Does it actually make sense for nsJARURI::QueryInterface to be spinning the main thread event loop?
Whiteboard: [waiting for bug 610823]
I got crashes with this signature using a recent Thunderbird Nightly. The first one happened right after fetching e-mail, the others right on start-up, even in Safe Mode.

bp-8d54c28f-bda3-4b69-8088-ed3472110829
bp-33ccf474-c0be-457d-b0c7-5ca082110829
bp-881854f3-a750-48ad-9824-66a7b2110829
bp-9aa2e773-5a93-4fc3-a438-086392110829

Unlike the crashes already mentioned, this is with Crash Reason EXCEPTION_ILLEGAL_INSTRUCTION and the Stack looks rather short (corrupted? invalid?).
(In reply to XtC4UaLL [:xtc4uall] from comment #4)
> and the Stack looks rather short (corrupted? invalid?).

The stacks look completely bogus to me. If you look at what is supposedly calling what, the source has no such calls.
Are there any workarounds for this issue? We have had two questions about it this week. [https://support.mozilla.org/en-US/questions/991262]
Whiteboard: [tbird crash]
I get this crash what feels like every few weeks, often when opening links on Twitter. Unfortunately, I can't reproduce it reliably at all. c378e70d-84ae-4195-b765-d95302140804
(In reply to Doug Turner (:dougt) from comment #7)
> https://crash-stats.mozilla.com/report/index/6d5a90c9-849b-40bc-90e9-2a78c2140508

(In reply to Andrew McCreight [:mccr8] from comment #8)
> I get this crash what feels like every few weeks, often when opening links
> on Twitter. Unfortunately, I can't reproduce it reliably at all.
>
> c378e70d-84ae-4195-b765-d95302140804

Hmm. Unlike the earlier crashes, these two have stacks that look real. The tree op enum has different values for opt and debug builds, which in retrospect is probably not smart. I guess the next step is removing that distinction and seeing if the crashing line changes when the enum value 0 signals an uninitialized enum. (If the crashes still happen in the "append" op, then the enum is OK but a pointer is bad in its own right.)
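To illustrate the idea of reserving enum value 0 for an uninitialized op, here is a minimal standalone sketch. It is not the Gecko code: `TreeOpCode`, `TreeOp`, `Describe`, and `MakeZeroedOp` are hypothetical names, and only the three op codes shown are assumed to exist. The point is that zero-filled memory, a bad pointer in a valid op, and a nonsense op code each end up on a different branch, so each would produce a different crashing line.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical op codes. The values are the same in every build type, and 0
// is reserved so that zero-filled memory never looks like a real operation.
enum class TreeOpCode : std::uint8_t {
  eTreeOpUninitialized = 0,
  eTreeOpAppend = 1,
  eTreeOpDetach = 2
};

struct TreeOp {
  TreeOpCode mOpCode = TreeOpCode::eTreeOpUninitialized;  // never indeterminate
  const void* mNode = nullptr;
};

// Stand-in for Perform(): returns a label where real code would act or crash.
// Each failure mode has its own branch, so each would get its own crash line.
inline const char* Describe(const TreeOp& aOp) {
  switch (aOp.mOpCode) {
    case TreeOpCode::eTreeOpUninitialized:
      return "uninitialized op";
    case TreeOpCode::eTreeOpAppend:
      return aOp.mNode ? "append" : "append with bad pointer";
    case TreeOpCode::eTreeOpDetach:
      return aOp.mNode ? "detach" : "detach with bad pointer";
  }
  return "bogus op code";  // value outside the enumerators
}

// Builds an op from zero-filled storage, as a cleared or stale buffer would.
inline TreeOp MakeZeroedOp() {
  TreeOp op;
  std::memset(static_cast<void*>(&op), 0, sizeof(op));
  assert(op.mOpCode == TreeOpCode::eTreeOpUninitialized);
  return op;
}
```

With this layout, `Describe(MakeZeroedOp())` takes the "uninitialized op" branch rather than being mistaken for an append.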
Assignee: nobody → hsivonen
Landing this would help narrow down the cause of this crash (once more crashes are seen), but I guess first I should find out if explicitly zeroing all those tree ops matters for performance.
I just crashed again with this stack. This was after I tried to submit a form.
Seen on Mac, so not Windows specific.
OS: Windows XP → All
Hardware: x86 → All
Comment on attachment 8475060 [details] [diff] [review]
Distinguish bogus enum crashes

smaug, what do you think about landing a patch like this in order to make crashes due to uninitialized tree ops and crashes due to initialized tree ops with bad pointers have different stacks? (In order to know which case should be investigated.)
Attachment #8475060 - Flags: review?(bugs)
Comment on attachment 8475060 [details] [diff] [review] Distinguish bogus enum crashes Oh, we should definitely initialize mOpCode to something.
Attachment #8475060 - Flags: review?(bugs) → review+
(In reply to Olli Pettay [:smaug] from comment #15)
> Comment on attachment 8475060 [details] [diff] [review]
> Distinguish bogus enum crashes
>
> Oh, we should definitely initialize mOpCode to something.

Thanks.
https://hg.mozilla.org/integration/mozilla-inbound/rev/e8c2cd5bc9e3
Keywords: leave-open
Got a crash after the debugging patch from comment 18 landed: https://crash-stats.mozilla.com/report/index/09f70a8b-17c8-4d48-a3f6-59b452141006 Looks like I am crashing at the "Bogus tree op" line that was modified.
smaug, do you have ideas how we could end up with zeroed memory as a supposedly valid item in the nsTArray used as the op queue? I checked that:
1) Whenever the tree builder appends an element, it always initializes the op code.
2) The other queues only receive ops by nsTArray move or swap operations.
3) Removals are by nsTArray operations. (There's only one case that's not Clear() and if that case was bogus, I'd expect many more problems.)

The locking code also seems to be there as designed. I wonder if I should review all the non-locking cases to see that non-locking code paths are never used when locking is supposed to be used. I haven't audited that angle yet.
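The invariants listed above can be sketched in standalone C++ as follows. This is an illustration under assumptions, not the parser's actual code: `std::vector` and `std::mutex` stand in for nsTArray and the real lock, and `OpStage`, `MoveAll`, `MoveOpsFrom`, `MoveOpsTo`, and `CountUninitialized` are hypothetical names. Ops can only be constructed with an op code, queues only exchange ops by swap or move while holding the lock, and an audit helper counts any op that still carries the zero sentinel.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <utility>
#include <vector>

enum class TreeOpCode : std::uint8_t {
  eTreeOpUninitialized = 0,
  eTreeOpAppend = 1
};

struct TreeOp {
  // The only constructor requires an op code, mirroring invariant 1).
  explicit TreeOp(TreeOpCode aCode) : mOpCode(aCode) {}
  TreeOpCode mOpCode;
};

using OpQueue = std::vector<TreeOp>;

// Moves every op from aSrc to the end of aDst and leaves aSrc empty,
// mirroring invariant 2): ops travel only by swap or move.
inline void MoveAll(OpQueue& aDst, OpQueue& aSrc) {
  assert(&aDst != &aSrc);
  if (aDst.empty()) {
    aDst.swap(aSrc);  // whole-buffer hand-off, no per-element copies
    return;
  }
  for (TreeOp& op : aSrc) {
    aDst.push_back(std::move(op));
  }
  aSrc.clear();
}

// Hypothetical stage shared by a producer thread and a consumer thread.
class OpStage {
 public:
  void MoveOpsFrom(OpQueue& aProducerQueue) {
    std::lock_guard<std::mutex> lock(mMutex);
    MoveAll(mOps, aProducerQueue);
  }

  void MoveOpsTo(OpQueue& aConsumerQueue) {
    std::lock_guard<std::mutex> lock(mMutex);
    MoveAll(aConsumerQueue, mOps);
  }

 private:
  std::mutex mMutex;
  OpQueue mOps;
};

// Audit helper: counts ops whose code is still the zero sentinel.
inline std::size_t CountUninitialized(const OpQueue& aQueue) {
  std::size_t count = 0;
  for (const TreeOp& op : aQueue) {
    if (op.mOpCode == TreeOpCode::eTreeOpUninitialized) {
      ++count;
    }
  }
  return count;
}
```

If every path into the queue looks like this, a zeroed op in the consumer's queue would have to come from a path that bypasses the lock or writes to the buffer directly, which is the angle the comment proposes to audit.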
Flags: needinfo?(bugs)
Flags: needinfo?(bugs)
Still crashing with this stack, I guess Olli has no ideas either?
Yeah, I was actually looking at crash-stats for this last week hoping that hsivonen's recent-ish changes would have fixed this, but no.
Crash Signature: [@ nsHtml5TreeOperation::Perform(nsHtml5TreeOpExecutor*, nsIContent**) ] → [@ nsHtml5TreeOperation::Perform(nsHtml5TreeOpExecutor*, nsIContent**) ] [@ nsHtml5TreeOperation::Perform ]
In the past 7 days, 1302 crashes with this signature have occurred. This makes it the #51 topcrash on 46.0.1. Of those 1302, 491 are due to hitting MOZ_CRASH("Bogus tree op"). We should add some diagnostics to better understand that case -- is the tree op a valid but unexpected one, or is it truly a bogus op?
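One way to frame the question of "valid but unexpected" versus "truly bogus" is to classify the raw op code value before acting on it. The sketch below is an assumption-laden illustration rather than the patch attached to this bug: `Diagnose`, `OpDiagnosis`, `AssertHandled`, `eTreeOpOther`, and `eTreeOpCount` are hypothetical names, and the op list is deliberately tiny.

```cpp
#include <cassert>
#include <cstdint>

enum class TreeOpCode : std::uint8_t {
  eTreeOpUninitialized = 0,
  eTreeOpAppend = 1,
  eTreeOpDetach = 2,
  eTreeOpOther = 3,  // hypothetical: a valid op that this code does not handle
  eTreeOpCount       // one past the last valid value
};

enum class OpDiagnosis { Handled, Uninitialized, ValidButUnexpected, OutOfRange };

// Classifies a raw op code so that each failure mode can be reported
// (or crashed on) separately.
inline OpDiagnosis Diagnose(std::uint8_t aRawOpCode) {
  if (aRawOpCode >= static_cast<std::uint8_t>(TreeOpCode::eTreeOpCount)) {
    return OpDiagnosis::OutOfRange;  // not an enumerator at all
  }
  switch (static_cast<TreeOpCode>(aRawOpCode)) {
    case TreeOpCode::eTreeOpUninitialized:
      return OpDiagnosis::Uninitialized;
    case TreeOpCode::eTreeOpAppend:
    case TreeOpCode::eTreeOpDetach:
      return OpDiagnosis::Handled;
    default:
      return OpDiagnosis::ValidButUnexpected;
  }
}

// Debug-only guard for call sites that must only ever see handled ops.
inline void AssertHandled(std::uint8_t aRawOpCode) {
  assert(Diagnose(aRawOpCode) == OpDiagnosis::Handled);
}
```

Giving each non-`Handled` result its own crash message would let crash-stats separate the three cases instead of lumping them under one "Bogus tree op" line.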
This may help understand the "Bogus tree op" crashes.
Attachment #8758581 - Flags: review?(hsivonen)
Assignee: hsivonen → n.nethercote
Status: NEW → ASSIGNED
Attachment #8758581 - Flags: review?(hsivonen) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/2313feadbdaf1e2a6a92ef5f6c0d0e6bd34121b2 Bug 616421 - Better distinguish invalid mOpCode values in nsHtml5TreeOperation::Perform. r=hsivonen.
Comment on attachment 8758581 [details] [diff] [review]
Better distinguish invalid mOpCode values in nsHtml5TreeOperation::Perform

Approval Request Comment
[Feature/regressing bug #]: HTML parsing.
[User impact if declined]: This diagnostic patch may give insight into the cause of some MOZ_CRASH aborts.
[Describe test coverage new/current, TreeHerder]: HTML parsing is heavily exercised in many tests.
[Risks and why]: Negligible. Patch is very simple.
[String/UUID change made/needed]: none.
Attachment #8758581 - Flags: approval-mozilla-aurora?
Comment on attachment 8758581 [details] [diff] [review]
Better distinguish invalid mOpCode values in nsHtml5TreeOperation::Perform

diagnostic patch, taking it
Attachment #8758581 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Attachment #8475060 - Flags: checkin+
Attachment #8758581 - Flags: checkin+
AFAICT we are now getting roughly equal numbers of MOZ_CRASH(Bogus tree op) and MOZ_CRASH(eTreeOpUninitialized) crashes.
Crash volume for signature 'nsHtml5TreeOperation::Perform':
 - nightly (version 50): 26 crashes from 2016-06-06.
 - aurora (version 49): 73 crashes from 2016-06-07.
 - beta (version 48): 680 crashes from 2016-06-06.
 - release (version 47): 7670 crashes from 2016-05-31.
 - esr (version 45): 537 crashes from 2016-04-07.

Crash volume on the last weeks:
             W. N-1  W. N-2  W. N-3  W. N-4  W. N-5  W. N-6  W. N-7
 - nightly        7       6       4       1       2       4       2
 - aurora         8      13      10      12       5       9      14
 - beta          95      87     101      94      83     102      96
 - release      983    1105    1090    1096    1055    1082     949
 - esr           50      50      51      39      53      63      62

Affected platforms: Windows, Mac OS X, Linux
Crash volume for signature 'nsHtml5TreeOperation::Perform':
 - nightly (version 51): 8 crashes from 2016-08-01.
 - aurora (version 50): 32 crashes from 2016-08-01.
 - beta (version 49): 190 crashes from 2016-08-02.
 - release (version 48): 795 crashes from 2016-07-25.
 - esr (version 45): 656 crashes from 2016-05-02.

Crash volume on the last weeks (Week N is from 08-22 to 08-28):
             W. N-1  W. N-2  W. N-3
 - nightly        0       1       2
 - aurora        11       9       8
 - beta          62      63      26
 - release      248     212     108
 - esr           57      45      49

Affected platforms: Windows, Mac OS X, Linux

Crash rank on the last 7 days:
             Browser  Content  Plugin
 - nightly   #127
 - aurora    #67
 - beta      #332     #133
 - release   #68      #68
 - esr       #140
Crash volume for signature 'nsHtml5TreeOperation::Perform':
 - nightly (version 52): 2 crashes from 2016-09-19.
 - aurora (version 51): 5 crashes from 2016-09-19.
 - beta (version 50): 39 crashes from 2016-09-20.
 - release (version 49): 534 crashes from 2016-09-05.
 - esr (version 45): 810 crashes from 2016-06-01.

Crash volume on the last weeks (Week N is from 10-03 to 10-09):
             W. N-1  W. N-2
 - nightly        1       1
 - aurora         5       0
 - beta          36       3
 - release      439      95
 - esr           77      66

Affected platforms: Windows, Mac OS X, Linux

Crash rank on the last 7 days:
             Browser  Content  Plugin
 - nightly   #986
 - aurora    #204
 - beta      #833     #162
 - release   #249     #46
 - esr       #148
Too late for firefox 52, mass-wontfix.
See Also: → 1478581

The leave-open keyword is there and there is no activity for 6 months.
:njn, maybe it's time to close this bug?

Flags: needinfo?(n.nethercote)

Sure.

Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Flags: needinfo?(n.nethercote)
Resolution: --- → INCOMPLETE