Closed Bug 824526 Opened 12 years ago Closed 12 years ago

b2g process crash during MW0

Categories

(Core :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

()

RESOLVED DUPLICATE of bug 822398
blocking-basecamp +

People

(Reporter: cjones, Assigned: dougt)

STR:
(1) Follow the steps at https://wiki.mozilla.org/B2G/Memory_acceptance_criteria#MW0:_Every_app_is_successfully_launched_into_the_foreground

I get a crash after step 22, segfaulting on null. No additional information.
Assignee: nobody → doug.turner
blocking-basecamp: ? → +
I am using a debug build and can reproduce something similar to this problem after step 9. I see lots of OOM warnings, then:

I/Gecko ( 481): [Parent 481] ###!!! ASSERTION: op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0: 'op == PL_DHASH_LOOKUP || RECURSION_LEVEL(table) == 0', file /Users/dougt/builds/B2G/objdir-gecko/xpcom/build/pldhash.cpp, line 574
F/MOZ_Assert( 481): Assertion failure: chars[length] == 0, at /Users/dougt/builds/B2G/gecko/js/src/vm/String-inl.h:284
F/libc ( 481): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)

A few ms later:

I/Gecko ( 575): [Child 575] WARNING: NS_ENSURE_TRUE(IsChromeProcess()) failed: file /Users/dougt/builds/B2G/gecko/content/base/src/nsFrameMessageManager.cpp, line 687
D/memalloc( 481): /dev/pmem: Allocated buffer base:0x4b400000 size:4096 offset:2990080 fd:129
D/memalloc( 534): /dev/pmem: Mapped buffer base:0x45356000 size:2994176 offset:2990080 fd:35
D/memalloc( 481): /dev/pmem: Allocated buffer base:0x4b400000 size:122880 offset:2994176 fd:132
I/Gecko ( 534): [Child 534] ###!!! ABORT: unexpected type tag: '(mType) == (aType)', file ../../ipc/ipdl/_ipdlheaders/mozilla/layers/LayersSurfaces.h, line 83
I/Gecko ( 647): [Child 647] WARNING: shutting down early because of crash!: file /Users/dougt/builds/B2G/gecko/dom/ipc/ContentChild.cpp, line 830
I/Gecko ( 647): [Child 647] WARNING: content process _exit()ing: file /Users/dougt/builds/B2G/gecko/dom/ipc/ContentChild.cpp, line 879
F/libc ( 534): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)

Snapshot of ps a about 100ms before the crash:

APPLICATION  USER      PID   PPID  VSIZE   RSS    WCHAN     PC          NAME
b2g          root      105   1     172192  69104  ffffffff  b0003430 t  /system/b2g/b2g
Homescreen   app_351   351   105   93872   24548  ffffffff  4009d6ec S  /system/b2g/plugin-container
Messages     app_1482  1482  105   73680   18856  ffffffff  40082af4 R  /system/b2g/plugin-container
Browser      app_3891  3891  105   67404   15592  ffffffff  4011f330 S  /system/b2g/plugin-container
Feedback     app_4325  4325  105   69584   16916  ffffffff  400b6330 S  /system/b2g/plugin-container
Gallery      app_6673  6673  105   72664   22220  ffffffff  400f6330 S  /system/b2g/plugin-container

/proc/meminfo memfree stays around 1.5 mb during the last few MW0 steps and never dips below 1mb. Could we just not be killing off plugin-container fast enough?
> Could we just not be killing off plugin-container fast enough?

We're not killing either process here -- they're both segfaulting, instead of being SIGKILL'ed. I thought the kernel should kill a process before causing malloc to fail, but maybe that is incorrect.
Nope - the kernel has no control over malloc, that's a user-mode thing. And whether to kill a process or not would be a user-mode policy type decision, which the kernel goes to great lengths to avoid. So in this case, the Assertion failure is causing the segfault (NS_ASSERTION eventually winds up at MOZ_CRASH which does a write to memory location zero which causes the segfault).
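The "write to memory location zero" mechanism mentioned in this comment can be illustrated with a minimal sketch. This is not Gecko's MOZ_CRASH implementation; it only assumes a POSIX-like system and uses Python's ctypes in a child process to write one byte to address 0, then inspects how the child died. The helper name crash_by_null_write is hypothetical.

```python
import signal
import subprocess
import sys

# Child program: write one byte to address 0, roughly what a
# crash-on-assert macro does on purpose (assumption: POSIX-like OS).
CHILD_CODE = "import ctypes; ctypes.memset(0, 0, 1)"


def crash_by_null_write():
    """Run the null write in a child process and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    status = crash_by_null_write()
    # On POSIX, a negative return code means "terminated by that signal".
    if status == -signal.SIGSEGV:
        print("child was terminated by SIGSEGV after the null write")
    else:
        print("child exited with status", status)
```

A process that dies this way reports a fault address of zero, which would be consistent with the "Fatal signal 11 (SIGSEGV) at 0x00000000" lines in the logcat above being produced by a deliberate crash rather than by an accidental null dereference.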
> Nope - the kernel has no control over malloc, that's a user-mode thing.

Sorry, what I mean is, malloc returns null iff mmap fails. But mmap should only fail if we run out of virtual address space, right? If we're running low on physical memory but have sufficient virtual memory, mmap should succeed. Then when we touch one of those pages, the kernel should notice we're low on memory and kill something.

NS_ASSERTION is not fatal in release builds, but I guess it might be fatal in dougt's debug build...
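The map-first, pay-on-touch behaviour described in this comment can be sketched as follows. This is only an illustration under assumptions: a Unix-like system with lazy allocation of anonymous private mappings, and ru_maxrss as a rough proxy for resident memory (kilobytes on Linux, other units elsewhere). Whether mmap can fail earlier depends on the kernel's overcommit settings, which this sketch does not inspect. The names SIZE, peak_rss and map_then_touch are hypothetical.

```python
import mmap
import resource

SIZE = 32 * 1024 * 1024  # 32 MiB, an arbitrary size chosen for illustration


def peak_rss():
    """Peak resident set size of this process (kilobytes on Linux)."""
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss


def map_then_touch(size=SIZE):
    """Return peak-RSS growth after mapping, and after touching every page."""
    start = peak_rss()
    # Reserve address space only; no physical pages should be needed yet.
    region = mmap.mmap(
        -1,
        size,
        flags=mmap.MAP_PRIVATE | mmap.MAP_ANONYMOUS,
        prot=mmap.PROT_READ | mmap.PROT_WRITE,
    )
    after_map = peak_rss()
    # Write one byte per page so the kernel has to back each page.
    for offset in range(0, size, mmap.PAGESIZE):
        region[offset] = 1
    after_touch = peak_rss()
    region.close()
    return after_map - start, after_touch - start


if __name__ == "__main__":
    grew_on_map, grew_on_touch = map_then_touch()
    print("peak RSS growth after mmap: ", grew_on_map)
    print("peak RSS growth after touch:", grew_on_touch)
```

On a system that allocates lazily, the first number would be expected to stay near zero while the second grows by roughly the mapped size, which is the point at which memory pressure would actually be felt.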
If the NS_ASSERTIONs are causing us to crash (and if they are indeed not fatal in release builds, which I'm pretty sure is true), then dougt's logcat is not necessarily useful in understanding cjones's bug.
The assertion in comment 1 would in fact be exactly what that patch fixes.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE