Closed Bug 1139250 Opened 9 years ago Closed 9 years ago

Intermittent underline_timestamp_future-ref.html | application crashed [@ mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,unsigned int)] or [@ NS_ABORT_OOM(unsigned int)] or [@ mozalloc_abort(char const * const)]

Categories

(Core :: XPCOM, defect)

33 Branch
x86
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla39
Tracking Status
firefox37 --- unaffected
firefox38 --- unaffected
firefox39 --- fixed
firefox-esr31 --- unaffected

People

(Reporter: RyanVM, Assigned: Margaret)

References

Details

(Keywords: crash, intermittent-failure, Whiteboard: [MemShrink])

11:43:56 INFO - TEST-START | /webvtt/rendering/cues-with-video/processing-model/selectors/cue_function/underline_object/underline_timestamp_future.html == /webvtt/rendering/cues-with-video/processing-model/selectors/cue_function/underline_object/underline_timestamp_future-ref.html
11:43:56 INFO - PROCESS | 1660 | MARIONETTE LOG: INFO: classList:
11:43:57 INFO - mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/fx-team-win32/1425404688/firefox-39.0a1.en-US.win32.crashreporter-symbols.zip
11:44:05 INFO - mozcrash Saved minidump as C:\slave\test\build\blobber_upload_dir\7a1fa991-1e4f-41a5-b05e-0e650f97db4f.dmp
11:44:05 INFO - mozcrash Saved app info as C:\slave\test\build\blobber_upload_dir\7a1fa991-1e4f-41a5-b05e-0e650f97db4f.extra
11:44:05 INFO - PROCESS-CRASH | /webvtt/rendering/cues-with-video/processing-model/selectors/cue_function/underline_object/underline_timestamp_future.html == /webvtt/rendering/cues-with-video/processing-model/selectors/cue_function/underline_object/underline_timestamp_future-ref.html | application crashed [@ mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,unsigned int)]
11:44:05 INFO - Crash dump filename: c:\users\cltbld\appdata\local\temp\tmpdrg0l9.mozrunner\minidumps\7a1fa991-1e4f-41a5-b05e-0e650f97db4f.dmp
11:44:05 INFO - Operating system: Windows NT
11:44:05 INFO - 6.1.7601 Service Pack 1
11:44:05 INFO - CPU: x86
11:44:05 INFO - GenuineIntel family 6 model 30 stepping 5
11:44:05 INFO - 8 CPUs
11:44:05 INFO -
11:44:05 INFO - Crash reason: EXCEPTION_BREAKPOINT
11:44:05 INFO - Crash address: 0x65900a03
11:44:05 INFO -
11:44:05 INFO - Thread 299 (crashed)
11:44:05 INFO - 0 xul.dll!mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,unsigned int) [CycleCollectedJSRuntime.cpp:5cd9065e6cad : 484 + 0x0]
11:44:05 INFO - eip = 0x65900a03 esp = 0x7fdff4b0 ebp = 0x7fdff4bc ebx = 0x00000000
11:44:05 INFO - esi = 0x7fdff4f8 edi = 0x677c5ffc eax = 0x00000000 ecx = 0x00000100
11:44:05 INFO - edx = 0x01c000c0 efl = 0x00000246
11:44:05 INFO - Found by: given as instruction pointer in context
11:44:05 INFO - 1 xul.dll!`anonymous namespace'::WorkerJSRuntime::WorkerJSRuntime(JSRuntime *,mozilla::dom::workers::WorkerPrivate *) [RuntimeService.cpp:5cd9065e6cad : 852 + 0x17]
11:44:05 INFO - eip = 0x664ab3dd esp = 0x7fdff4c4 ebp = 0x7fdff4d4
11:44:05 INFO - Found by: call frame info
11:44:05 INFO - 2 xul.dll!`anonymous namespace'::WorkerThreadPrimaryRunnable::Run() [RuntimeService.cpp:5cd9065e6cad : 2657 + 0x10]
11:44:05 INFO - eip = 0x664b2609 esp = 0x7fdff4dc ebp = 0x7fdff5bc
11:44:05 INFO - Found by: call frame info
11:44:05 INFO - 3 xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:5cd9065e6cad : 855 + 0x5]
11:44:05 INFO - eip = 0x659357d0 esp = 0x7fdff5c4 ebp = 0x7fdff6a0
11:44:05 INFO - Found by: call frame info
11:44:05 INFO - 4 xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:5cd9065e6cad : 265 + 0xc]
11:44:05 INFO - eip = 0x6594a638 esp = 0x7fdff6a8 ebp = 0x7fdff6b4
11:44:05 INFO - Found by: call frame info
11:44:05 INFO - 5 xul.dll!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate *) [MessagePump.cpp:5cd9065e6cad : 368 + 0x9]
11:44:05 INFO - eip = 0x65ac36ee esp = 0x7fdff6bc ebp = 0x7fdff6e0
11:44:05 INFO - Found by: call frame info
11:44:05 INFO - 6 xul.dll!MessageLoop::RunHandler() [message_loop.cc:5cd9065e6cad : 226 + 0x7]
11:44:05 INFO - eip = 0x65ab64ef esp = 0x7fdff6e8 ebp = 0x7fdff718
11:44:05 INFO - Found by: call frame info
11:44:05 INFO - 7 xul.dll!MessageLoop::Run() [message_loop.cc:5cd9065e6cad : 200 + 0x6]
11:44:05 INFO - eip = 0x65ab6308 esp = 0x7fdff720 ebp = 0x7fdff738
11:44:05 INFO - Found by: call frame info
11:44:05 INFO - 8 xul.dll!nsThread::ThreadFunc(void *) [nsThread.cpp:5cd9065e6cad : 356 + 0x6]
11:44:05 INFO - eip = 0x65937e00 esp = 0x7fdff740 ebp = 0x7fdff750
11:44:05 INFO - Found by: call frame info
19:01:39 INFO - TEST-START | /webvtt/rendering/cues-with-video/processing-model/selectors/cue_function/underline_object/underline_white-space_pre-line_wrapped.html == /webvtt/rendering/cues-with-video/processing-model/selectors/cue_function/underline_object/underline_white-space_pre-line_wrapped-ref.html
19:01:39 INFO - PROCESS | 2960 | MARIONETTE LOG: INFO: classList:
19:01:39 INFO - PROCESS | 2960 | JavaScript error: chrome://marionette/content/marionette-listener.js, line 2219: NS_ERROR_FAILURE:
19:01:41 INFO - PROCESS | 2960 | JavaScript error: resource://gre/modules/AddonWatcher.jsm, line 55: NS_ERROR_OUT_OF_MEMORY: Component returned failure code: 0x8007000e (NS_ERROR_OUT_OF_MEMORY) [nsICompartmentInfo.getCompartments]
19:01:45 INFO - mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-win32/1425431753/firefox-39.0a1.en-US.win32.crashreporter-symbols.zip
19:01:55 INFO - mozcrash Saved minidump as C:\slave\test\build\blobber_upload_dir\f62df64e-b965-42a6-bf68-41cdff908b17.dmp
19:01:55 INFO - mozcrash Saved app info as C:\slave\test\build\blobber_upload_dir\f62df64e-b965-42a6-bf68-41cdff908b17.extra
19:01:55 INFO - PROCESS-CRASH | /webvtt/rendering/cues-with-video/processing-model/selectors/cue_function/underline_object/underline_white-space_pre-line_wrapped.html == /webvtt/rendering/cues-with-video/processing-model/selectors/cue_function/underline_object/underline_white-space_pre-line_wrapped-ref.html | application crashed [@ NS_ABORT_OOM(unsigned int)]
19:01:55 INFO - Crash dump filename: c:\users\cltbld\appdata\local\temp\tmphm9yhf.mozrunner\minidumps\f62df64e-b965-42a6-bf68-41cdff908b17.dmp
19:01:55 INFO - Operating system: Windows NT
19:01:55 INFO - 6.1.7601 Service Pack 1
19:01:55 INFO - CPU: x86
19:01:55 INFO - GenuineIntel family 6 model 30 stepping 5
19:01:55 INFO - 8 CPUs
19:01:55 INFO -
19:01:55 INFO - Crash reason: EXCEPTION_BREAKPOINT
19:01:55 INFO - Crash address: 0x6633e7ef
19:01:55 INFO -
19:01:55 INFO - Thread 304 (crashed)
19:01:55 INFO - 0 xul.dll!NS_ABORT_OOM(unsigned int) [nsDebugImpl.cpp:a07003e503d6 : 622 + 0x0]
19:01:55 INFO - eip = 0x6633e7ef esp = 0x7e6ff608 ebp = 0x7e6ff608 ebx = 0x00000000
19:01:55 INFO - esi = 0x72266060 edi = 0x00008000 eax = 0x00040000 ecx = 0x00040000
19:01:55 INFO - edx = 0x00000000 efl = 0x00000202
19:01:55 INFO - Found by: given as instruction pointer in context
19:01:55 INFO - 1 xul.dll!PL_DHashTableAdd(PLDHashTable *,void const *) [pldhash.cpp:a07003e503d6 : 738 + 0x108]
19:01:55 INFO - eip = 0x6638b228 esp = 0x7e6ff610 ebp = 0x7e6ff634
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 2 xul.dll!CCGraphBuilder::AddNode(void *,nsCycleCollectionParticipant *) [nsCycleCollector.cpp:a07003e503d6 : 2182 + 0xa]
19:01:55 INFO - eip = 0x663422f4 esp = 0x7e6ff63c ebp = 0x7e6ff644
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 3 xul.dll!CCGraphBuilder::NoteRoot(void *,nsCycleCollectionParticipant *) [nsCycleCollector.cpp:a07003e503d6 : 2106 + 0x8]
19:01:55 INFO - eip = 0x66347f9b esp = 0x7e6ff64c ebp = 0x7e6ff658
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 4 xul.dll!CCGraphBuilder::NoteNativeRoot(void *,nsCycleCollectionParticipant *) [nsCycleCollector.cpp:a07003e503d6 : 2295 + 0x11]
19:01:55 INFO - eip = 0x66347e9e esp = 0x7e6ff660 ebp = 0x7e6ff66c
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 5 xul.dll!NoteJSHolder [CycleCollectedJSRuntime.cpp:a07003e503d6 : 335 + 0xd]
19:01:55 INFO - eip = 0x66347cc9 esp = 0x7e6ff674 ebp = 0x7e6ff690
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 6 xul.dll!nsBaseHashtable<nsPtrHashKey<nsINode const >,bool,bool>::s_EnumStub(PLDHashTable *,PLDHashEntryHdr *,unsigned int,void *) [nsBaseHashtable.h:a07003e503d6 : 411 + 0x12]
19:01:55 INFO - eip = 0x6743a603 esp = 0x7e6ff698 ebp = 0x7e6ff6a4
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 7 xul.dll!PL_DHashTableEnumerate(PLDHashTable *,PLDHashOperator (*)(PLDHashTable *,PLDHashEntryHdr *,unsigned int,void *),void *) [pldhash.cpp:a07003e503d6 : 860 + 0x4c]
19:01:55 INFO - eip = 0x6638b44a esp = 0x7e6ff6ac ebp = 0x7e6ff6dc
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 8 xul.dll!nsBaseHashtable<nsHashKeyDisallowMemmove<nsPtrHashKey<nsXBLPrototypeHandler> >,JS::Heap<JSObject *>,JSObject *>::Enumerate(PLDHashOperator (*)(nsXBLPrototypeHandler *,JS::Heap<JSObject *> &,void *),void *) [nsBaseHashtable.h:a07003e503d6 : 208 + 0xe]
19:01:55 INFO - eip = 0x668661dd esp = 0x7e6ff6e4 ebp = 0x7e6ff6f8
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 9 xul.dll!mozilla::CycleCollectedJSRuntime::TraverseNativeRoots(nsCycleCollectionNoteRootCallback &) [CycleCollectedJSRuntime.cpp:a07003e503d6 : 720 + 0x4]
19:01:55 INFO - eip = 0x6634a93b esp = 0x7e6ff700 ebp = 0x7e6ff718
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 10 xul.dll!mozilla::CycleCollectedJSRuntime::TraverseRoots(nsCycleCollectionNoteRootCallback &) [CycleCollectedJSRuntime.cpp:a07003e503d6 : 960 + 0xa]
19:01:55 INFO - eip = 0x6634a96d esp = 0x7e6ff720 ebp = 0x7e6ff764
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 11 xul.dll!nsCycleCollector::BeginCollection(ccType,nsICycleCollectorListener *) [nsCycleCollector.cpp:a07003e503d6 : 3774 + 0xe]
19:01:55 INFO - eip = 0x663427f7 esp = 0x7e6ff76c ebp = 0x7e6ff79c
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 12 xul.dll!nsCycleCollector::Collect(ccType,js::SliceBudget &,nsICycleCollectorListener *,bool) [nsCycleCollector.cpp:a07003e503d6 : 3588 + 0xc]
19:01:55 INFO - eip = 0x663434cc esp = 0x7e6ff7a4 ebp = 0x7e6ff7bc
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 13 xul.dll!nsCycleCollector_collect(nsICycleCollectorListener *) [nsCycleCollector.cpp:a07003e503d6 : 4183 + 0x11]
19:01:55 INFO - eip = 0x6634b569 esp = 0x7e6ff7c4 ebp = 0x7e6ff7ec
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 14 xul.dll!`anonymous namespace'::WorkerJSRuntime::CustomGCCallback(JSGCStatus) [RuntimeService.cpp:a07003e503d6 : 909 + 0x6]
19:01:55 INFO - eip = 0x66f09db6 esp = 0x7e6ff7f4 ebp = 0x7e6ff7f8
19:01:55 INFO - Found by: call frame info
19:01:55 INFO - 15 xul.dll!mozilla::CycleCollectedJSRuntime::OnGC(JSGCStatus) [CycleCollectedJSRuntime.cpp:a07003e503d6 : 1284 + 0x9]
19:01:55 INFO - eip = 0x66348299 esp = 0x7e6ff800 ebp = 0x7e6ff808
19:01:55 INFO - Found by: call frame info
Summary: Intermittent underline_timestamp_future-ref.html | application crashed [@ mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,unsigned int)] or [@ NS_ABORT_OOM(unsigned int)] → Intermittent underline_timestamp_future-ref.html | application crashed [@ mozilla::CycleCollectedJSRuntime::CycleCollectedJSRuntime(JSRuntime *,unsigned int,unsigned int)] or [@ NS_ABORT_OOM(unsigned int)] or [@ mozalloc_abort(char const * const)]
This is really frequent. Any ideas, Andrew? OOM I guess?
Flags: needinfo?(continuation)
Yeah, the cycle collector doesn't have enough memory to grow its hashtable.  You can see in comment 1 that the system is low on memory, because some other operation failed due to OOM.  One somewhat unusual thing here is that the crash is on a worker thread.
Flags: needinfo?(continuation)
I've got this narrowed down to the below fx-team merge:
https://hg.mozilla.org/mozilla-central/pushloghtml?changeset=c5b90c003be8

Retriggers ongoing.
Retriggers on fx-team are calling this a "regression" from bug 1124011.
Blocks: 1124011
Blocks: 1140045
(In reply to Andrew McCreight [:mccr8] from comment #17)
> Yeah, the cycle collector doesn't have enough memory to grow its hashtable. 
> You can see in comment 1 that the system is low on memory, because some
> other operation failed due to OOM.  One somewhat unusual thing here is that
> the crash is on a worker thread.

This makes a lot of sense, because we use a worked thread to parse a DOM and run our Readability script for reader view:
http://mxr.mozilla.org/mozilla-central/source/toolkit/components/reader/ReaderMode.jsm#211

It sounds like this may have exposed some underlying platform bug, hopefully we can find a way to fix this. I just disabled this feature in bug 1140045, so we should stop seeing this crash, but we want to release this feature in Firefox 38, so we do need a fix.
Flags: needinfo?(continuation)
Component: Video/Audio → XPCOM
Whiteboard: [MemShrink]
Do you ever shut down your worker?  The first two stacks are crashing on thread 300 or so.  Gecko usually only has around 50 threads running.
Flags: needinfo?(margaret.leibovic)
Thanks for tracking down the regression, Ryan!  I'll see if I can figure out anything weird about the worker CCs (aside from what khuey mentioned).
From the crash reports it's quite obvious that we're leaking worker threads.  The question is whether that's happening in Reader mode or internally.
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #34)
> Do you ever shut down your worker?  The first two stacks are crashing on
> thread 300 or so.  Gecko usually only has around 50 threads running.

Looks like we don't... wow, that's kinda crazy, since we've been shipping this on Android for years.

bnicholson was involved in the initial implementation of this feature. Do you remember ever discussing shutting down the worker?

If this is the case, I can write a patch to fix this, and maybe (hopefully) that would resolve this issue.
Flags: needinfo?(margaret.leibovic) → needinfo?(bnicholson)
Flags: needinfo?(continuation)
Interesting, nope. I think we just assumed that the readers would close themselves when they were finished, which doesn't make sense in retrospect since they have onmessage listeners. Good find!
Flags: needinfo?(bnicholson)
Depends on: 1140172
I'd like to make a try push with that patch and a patch to re-enable the reader mode button, and see if this goes away.

I'm having trouble determining which test jobs I should be targetting for re-triggers. Ryan, can you tell me where this was happening most frequently?
Flags: needinfo?(ryanvm)
try: -b o -p win32 -u web-platform-tests-reftests[Windows 7] -t none
Flags: needinfo?(ryanvm)
Looks like the disabling push from bug 1140045 indeed eliminated this.
Assignee: nobody → margaret.leibovic
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #41)
> Looks like the disabling push from bug 1140045 indeed eliminated this.

Well that's not exactly a long-term fix, since we'll need to re-enable this.

Here's a try push with my patch from bug 1140172:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=61a2201f456a

This is looking pretty good to me. Ryan, would this give you confidence about us re-enabling this feature if we land bug 1140172 first?
Flags: needinfo?(ryanvm)
(In reply to :Margaret Leibovic from comment #42)
> Well that's not exactly a long-term fix, since we'll need to re-enable this.

Of course :)

> Here's a try push with my patch from bug 1140172:
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=61a2201f456a
> 
> This is looking pretty good to me. Ryan, would this give you confidence
> about us re-enabling this feature if we land bug 1140172 first?

Looks great. We were in the neighborhood of a 15% or so failure rate before IIRC, so it seems very likely we'd have hit problems on that push by now if there were going to be.

Does that fix the ASAN issues you hit awhile back too?
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #43)

> Does that fix the ASAN issues you hit awhile back too?

njn fixed that with bug 1138770.
You need to log in before you can comment on or make changes to this bug.