Closed Bug 1344863 Opened 7 years ago Closed 7 years ago

Intermittent test_navigation.py TestBackForward.test_frameset | application crashed [@ js::gc::AssertValidToSkipBarrier]

Categories

(Core :: JavaScript: GC, defect)

Type: defect
Priority: Not set
Severity: critical

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox54 --- affected

People

(Reporter: intermittent-bug-filer, Unassigned)

Details

(Keywords: crash, intermittent-failure)

Filed by: hskupin [at] mozilla.com

https://treeherder.mozilla.org/logviewer.html#?job_id=81940189&repo=try

https://archive.mozilla.org/pub/firefox/try-builds/hskupin@mozilla.com-ed6dc58e4920c614199fc7540e768ac93e9a9e84/try-win32-debug/try_win7_vm-debug_test-marionette-e10s-bm140-tests1-windows-build1538.txt.gz

This seems to be a permanent crash triggered by a new Marionette test that I created for bug 1330348 and that has not landed yet. The crash report is part of the following try build:

https://treeherder.mozilla.org/#/jobs?repo=try&revision=ed6dc58e4920

It only happens on the Windows 7 VM.
Here are the top six frames of the stack:

09:41:03     INFO -  Crash reason:  EXCEPTION_BREAKPOINT
09:41:03     INFO -  Crash address: 0x5bc13750
09:41:03     INFO -  Process uptime: 242 seconds
09:41:03     INFO -  Thread 0 (crashed)
09:41:03     INFO -   0  xul.dll!js::gc::AssertValidToSkipBarrier [Heap.h:1bc2ad020aee : 1345 + 0x0]
09:41:03     INFO -      eip = 0x5bc13750   esp = 0x002edd58   ebp = 0x002edd6c   ebx = 0x0d2cd800
09:41:03     INFO -      esi = 0x0efac760   edi = 0x0efac740   eax = 0x00925000   ecx = 0x00000000
09:41:03     INFO -      edx = 0x00000005   efl = 0x00000246
09:41:03     INFO -      Found by: given as instruction pointer in context
09:41:03     INFO -   1  xul.dll!js::PropertyTree::insertChild(JSContext *,js::Shape *,js::Shape *) [jspropertytree.cpp:1bc2ad020aee : 63 + 0x8]
09:41:03     INFO -      eip = 0x5bc5b624   esp = 0x002edd74   ebp = 0x002edda0
09:41:03     INFO -      Found by: call frame info
09:41:03     INFO -   2  xul.dll!js::PropertyTree::getChild(JSContext *,js::Shape *,JS::Handle<js::StackShape>) [jspropertytree.cpp:1bc2ad020aee : 189 + 0xd]
09:41:03     INFO -      eip = 0x5bc57f74   esp = 0x002edda8   ebp = 0x002edde0
09:41:03     INFO -      Found by: call frame info
09:41:03     INFO -   3  xul.dll!js::NativeObject::getChildProperty(JSContext *,JS::Handle<js::NativeObject *>,JS::Handle<js::Shape *>,JS::MutableHandle<js::StackShape>) [Shape.cpp:1bc2ad020aee : 451 + 0x36]
09:41:03     INFO -      eip = 0x5bc1ec95   esp = 0x002edde8   ebp = 0x002ede00
09:41:03     INFO -      Found by: call frame info
09:41:03     INFO -   4  xul.dll!js::NativeObject::addPropertyInternal(JSContext *,JS::Handle<js::NativeObject *>,JS::Handle<jsid>,bool (*)(JSContext *,JS::Handle<JSObject *>,JS::Handle<jsid>,JS::MutableHandle<JS::Value>),bool (*)(JSContext *,JS::Handle<JSObject *>,JS::Handle<jsid>,JS::MutableHandle<JS::Value>,JS::ObjectOpResult &),unsigned int,unsigned int,unsigned int,js::ShapeTable::Entry *,bool,js::AutoKeepShapeTables const &) [Shape.cpp:1bc2ad020aee : 633 + 0x19]
09:41:03     INFO -      eip = 0x5bc191ca   esp = 0x002ede08   ebp = 0x002edec8
09:41:03     INFO -      Found by: call frame info
09:41:03     INFO -   5  xul.dll!js::JSONParser<char16_t>::trace(js::JSONParser<char16_t> *,JSTracer *) [JSONParser.h:1bc2ad020aee : 235 + 0x10]
09:41:03     INFO -      eip = 0x5baa85b0   esp = 0x002ede44   ebp = 0x002edec8
09:41:03     INFO -      Found by: stack scanning
Severity: normal → critical
Keywords: crash
I'm not sure how this is possible barring heap corruption or miscompilation.

AssertValidToSkipBarrier() asserts that its argument is not in the nursery and is not a JSObject (by checking its alloc kind). In this case it's being called with a Shape, so I don't see how either of those assertions can fail.
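To make the two checks concrete, here is a toy model in Python. It is not SpiderMonkey code: `Cell`, `AllocKind`, and `in_nursery` are hypothetical names invented for illustration, and the model only mirrors the two conditions described above.

```python
from dataclasses import dataclass
from enum import Enum, auto


class AllocKind(Enum):
    # Hypothetical, heavily simplified set of allocation kinds.
    OBJECT = auto()
    SHAPE = auto()
    STRING = auto()


@dataclass
class Cell:
    # Hypothetical stand-in for a GC thing.
    kind: AllocKind
    in_nursery: bool = False


def assert_valid_to_skip_barrier(cell: Cell) -> None:
    """Toy version of the two checks: not in the nursery, not an object."""
    if cell.in_nursery:
        raise AssertionError("cell is in the nursery")
    if cell.kind is AllocKind.OBJECT:
        raise AssertionError("cell is a JSObject")


# A tenured shape satisfies both conditions, which is why a failure
# here for a Shape is surprising.
assert_valid_to_skip_barrier(Cell(AllocKind.SHAPE))
```

Under this model the only ways to fail are a nursery-allocated cell or an object alloc kind, neither of which should apply to a tenured Shape.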
Maybe Ted can help determine which of these is the real crash. There are three PROCESS-CRASH lines involved, and the first one is a crash in google_breakpad::ExceptionHandler::WriteMinidump(). As far as I remember, that is not the real underlying crash, so this one is the next in the list.
Flags: needinfo?(ted)
Okay, so if you look at the first stack, you can see that frames 2 and 3 are in `CrashReporterHost::GenerateMinidumpAndPair` and `ContentParent::KillHard`, respectively. That means this is a dump from the chrome process hitting a timeout and killing the content process. This dump is usually not that interesting: the top of the stack is just the process writing a minidump of itself, although sometimes there are interesting things on other threads.

The second and third stacks are in content processes that had minidumps written before they were killed by the parent. The `Crash reason:  EXCEPTION_BREAKPOINT` is the tipoff here. They didn't actually crash in that code; that's just what they were running when the chrome process got tired of waiting for them. The first of those two looks like it's running GC during XBL construction or something like that, and the second actually looks like it's trying to clean up some a11y resources as a result of an IPC error.
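The triage rule described above can be sketched as a small Python helper. This is a rough heuristic under stated assumptions, not an existing Mozilla tool: the function name `classify_stack`, the returned labels, and the regular expressions are all made up here, and the regexes only assume the log line layout shown earlier in this bug.

```python
import re

# Matches stack frame lines such as:
#   "... INFO -   0  xul.dll!js::gc::AssertValidToSkipBarrier [Heap.h:... ]"
FRAME_RE = re.compile(r"-\s+(\d+)\s+(\S+?)!(.+?)\s+\[")
REASON_RE = re.compile(r"Crash reason:\s+(\S+)")


def classify_stack(lines, top_n=6):
    """Classify one crash stack from a test log (heuristic only)."""
    reason = None
    functions = []
    for line in lines:
        reason_match = REASON_RE.search(line)
        if reason_match:
            reason = reason_match.group(1)
        frame_match = FRAME_RE.search(line)
        if frame_match:
            functions.append(frame_match.group(3))

    top = functions[:top_n]
    # The parent process writing a dump while killing a hung child.
    if any("ContentParent::KillHard" in fn for fn in top):
        return "parent-kill-dump"
    # A child that was dumped while still running, per the tipoff above.
    if reason == "EXCEPTION_BREAKPOINT":
        return "possible-hang-dump"
    return "possible-real-crash"


sample = [
    "09:41:03     INFO -  Crash reason:  EXCEPTION_BREAKPOINT",
    "09:41:03     INFO -   0  xul.dll!js::gc::AssertValidToSkipBarrier [Heap.h:1bc2ad020aee : 1345 + 0x0]",
    "09:41:03     INFO -      Found by: given as instruction pointer in context",
]
print(classify_stack(sample))  # prints "possible-hang-dump"
```

The labels are deliberately hedged ("possible"), because the crash reason alone does not prove that a process was hung; the other threads and the remaining PROCESS-CRASH entries still need a manual look.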
Flags: needinfo?(ted)
Ted, so I would assume it's the same underlying issue as the one reported in bug 1346209?
Flags: needinfo?(ted)
It's entirely possible, but hard to say exactly.
Flags: needinfo?(ted)
The crash has not happened again since we re-enabled the test.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME