Closed Bug 997191 Opened 8 years ago Closed 7 years ago

Weirdness handling crashes in child processes (Crash in CrashReporter::OnChildProcessDumpRequested)

Categories

(Toolkit :: Crash Reporting, defect)

ARM
Android
defect
Not set
normal

Tracking

()

RESOLVED WORKSFORME

People

(Reporter: emorley, Assigned: jchen)

References

Details

(Keywords: crash, intermittent-failure)

Attachments

(1 file)

<edmorley> is there something up with minidump generation (also a few similar like it recently)? https://tbpl.mozilla.org/php/getParsedLog.php?id=37908831&tree=B2g-Inbound
 <ted> edmorley|sheriffduty: minidump generation hasn't changed at all lately
 <ted> edmorley|sheriffduty: that looks like a child process crashing in a weird way
 <ted> edmorley|sheriffduty: hard to say without knowing what's going on there
 <ted> edmorley|sheriffduty: get a bug on file
01:17:22     INFO -  328 INFO TEST-START | /tests/dom/media/tests/mochitest/test_dataChannel_basicAudio.html
01:17:22     INFO -  INFO | automation.py | Application ran for: 0:04:03.175699
01:17:22     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpCN6NiZpidlog
01:17:22     INFO -  /data/anr/traces.txt not found
01:17:23     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/b2g-inbound-android-debug/1397631244/fennec-31.0a1.en-US.android-arm.crashreporter-symbols.zip
01:17:25  WARNING -  PROCESS-CRASH | /tests/dom/media/tests/mochitest/test_dataChannel_basicAudio.html | application crashed [@ 0x44bed75c]
01:17:25     INFO -  Crash dump filename: /tmp/tmpTRAZJH/006c00fc-3d36-1548-2c7e5ef7-5b460f3e.dmp
01:17:25     INFO -  Operating system: Android
01:17:25     INFO -                    0.0.0 Linux 3.2.0+ #2 SMP PREEMPT Thu Nov 29 08:06:57 EST 2012 armv7l pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys
01:17:25     INFO -  CPU: arm
01:17:25     INFO -       2 CPUs
01:17:25     INFO -  Crash reason:  SIGSEGV
01:17:25     INFO -  Crash address: 0x0
01:17:25     INFO -  Thread 0 (crashed)
01:17:25     INFO -   0  0x44bed75c
01:17:25     INFO -       r4 = 0xbeffe840    r5 = 0x00000064    r6 = 0xbeffe8b0    r7 = 0xffffffff
01:17:25     INFO -       r8 = 0x00000000    r9 = 0x00000000   r10 = 0x00000000    fp = 0xbefff32c
01:17:25     INFO -       sp = 0xbeffe748    lr = 0x44be9c03    pc = 0x44bed75c
01:17:25     INFO -      Found by: given as instruction pointer in context
01:17:25     INFO -  Loaded modules:
01:17:25     INFO -  0x00008000 - 0x0000afff  libplugin-container.so  ???  (main)
01:17:25     INFO -  0x40000000 - 0x40003fff  liblog.so  ???
01:17:25     INFO -  0x40004000 - 0x40064fff  libc.so  ???
01:17:25     INFO -  0x40070000 - 0x40077fff  __properties__ (deleted)  ???
01:17:25     INFO -  0x40078000 - 0x40079fff  libstdc++.so  ???
01:17:25     INFO -  0x4007a000 - 0x4008ffff  libm.so  ???
01:17:25     INFO -  0x40091000 - 0x400e4fff  libmozglue.so  ???
01:17:25     INFO -  0x400e8000 - 0x400fdfff  libz.so  ???
01:17:25     INFO -  0x40202000 - 0x40204fff  libmozalloc.so (deleted)  ???
01:17:25     INFO -  0x40207000 - 0x4020afff  libmozalloc.so (deleted)  ???
01:17:25     INFO -  0x4020b000 - 0x4020ffff  libGLESv2.so  ???
01:17:25     INFO -  0x40210000 - 0x40221fff  libcutils.so  ???
01:17:25     INFO -  0x40231000 - 0x40258fff  libEGL.so  ???
01:17:25     INFO -  0x4025b000 - 0x4028ffff  libutils.so  ???
01:17:25     INFO -  0x40290000 - 0x402b1fff  libGLESv2_dbg.so  ???
01:17:25     INFO -  0x402b2000 - 0x402fafff  libstlport.so  ???
01:17:25     INFO -  0x40400000 - 0x422b9fff  org.mozilla.fennec-1.apk  ???
01:17:25     INFO -  0x422ba000 - 0x424abfff  libnss3.so (deleted)  ???
01:17:25     INFO -  0x424ad000 - 0x424effff  libnss3.so (deleted)  ???
01:17:25     INFO -  0x426aa000 - 0x44763fff  libxul.so (deleted)  ???
01:17:25     INFO -  0x44765000 - 0x447eafff  libxul.so (deleted)  ???
01:17:25     INFO -  mozcrash INFO | Saved minidump as /builds/panda-0851/test/build/blobber_upload_dir/006c00fc-3d36-1548-2c7e5ef7-5b460f3e.dmp
01:17:25     INFO -  mozcrash INFO | Saved app info as /builds/panda-0851/test/build/blobber_upload_dir/006c00fc-3d36-1548-2c7e5ef7-5b460f3e.extra
01:17:29  WARNING -  PROCESS-CRASH | /tests/dom/media/tests/mochitest/test_dataChannel_basicAudio.html | application crashed [@ CrashReporter::OnChildProcessDumpRequested]
01:17:29     INFO -  Crash dump filename: /tmp/tmpTRAZJH/54f26e47-5eec-930a-4a214f20-5176cb70.dmp
01:17:29     INFO -  Operating system: Android
01:17:29     INFO -                    0.0.0 Linux 3.2.0+ #2 SMP PREEMPT Thu Nov 29 08:06:57 EST 2012 armv7l pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys
01:17:29     INFO -  CPU: arm
01:17:29     INFO -       2 CPUs
01:17:29     INFO -  Crash reason:  SIGSEGV
01:17:29     INFO -  Crash address: 0x0
01:17:29     INFO -  Thread 50 (crashed)
01:17:29     INFO -   0  libxul.so!CrashReporter::OnChildProcessDumpRequested [nsExceptionHandler.cpp:092bbc509ddc : 2454 + 0x14]
01:17:29     INFO -       r4 = 0x6d17ac80    r5 = 0x6c9a21f0    r6 = 0x6f5ffcb8    r7 = 0x6f5ffcbc
01:17:29     INFO -       r8 = 0x00000001    r9 = 0x637eb6a7   r10 = 0x63a0d6b5    fp = 0x637eb795
01:17:29     INFO -       sp = 0x6f5ffcb0    lr = 0x62d273cf    pc = 0x62d291b4
01:17:29     INFO -      Found by: given as instruction pointer in context
01:17:29     INFO -   1  libxul.so!google_breakpad::CrashGenerationServer::ClientEvent(short) [crash_generation_server.cc:092bbc509ddc : 401 + 0x3]
01:17:29     INFO -       r4 = 0x6f5ffd48    r5 = 0x0000007c    r6 = 0x6c964c00    r7 = 0x00000001
01:17:29     INFO -       r8 = 0x6f5ffdb4    r9 = 0x637eb6a7   r10 = 0x63a0d6b5    fp = 0x637eb795
01:17:29     INFO -       sp = 0x6f5ffcf0    pc = 0x62d2c6e3
01:17:29     INFO -      Found by: call frame info
01:17:29     INFO -   2  libxul.so!google_breakpad::CrashGenerationServer::Run() [crash_generation_server.cc:092bbc509ddc : 288 + 0x7]
01:17:29     INFO -       r4 = 0x6c964c00    r5 = 0x62d2c7cf    r6 = 0x6c964c00    r7 = 0x00000078
01:17:29     INFO -       r8 = 0x62d2c7cf    r9 = 0x6c964c00   r10 = 0x00100000    fp = 0x00000001
01:17:29     INFO -       sp = 0x6f5ffed0    pc = 0x62d2c7b3
01:17:29     INFO -      Found by: call frame info
01:17:29     INFO -   3  libxul.so!google_breakpad::CrashGenerationServer::ThreadMain(void*) [crash_generation_server.cc:092bbc509ddc : 460 + 0x3]
01:17:29     INFO -       r4 = 0x6f5fff00    r5 = 0x62d2c7cf    r6 = 0x6c964c00    r7 = 0x00000078
01:17:29     INFO -       r8 = 0x62d2c7cf    r9 = 0x6c964c00   r10 = 0x00100000    fp = 0x00000001
01:17:29     INFO -       sp = 0x6f5ffee8    pc = 0x62d2c7d5
01:17:29     INFO -      Found by: call frame info
01:17:29     INFO -   4  libc.so + 0x12a02
01:17:29     INFO -       r4 = 0x6f5fff00    r5 = 0x62d2c7cf    r6 = 0x6c964c00    r7 = 0x00000078
01:17:29     INFO -       r8 = 0x62d2c7cf    r9 = 0x6c964c00   r10 = 0x00100000    fp = 0x00000001
01:17:29     INFO -       sp = 0x6f5ffef0    pc = 0x400bda04
01:17:29     INFO -      Found by: call frame info
01:17:29     INFO -   5  libc.so + 0x550be
01:17:29     INFO -       sp = 0x6f5ffefc    pc = 0x401000c0
01:17:29     INFO -      Found by: stack scanning

...

01:17:30  WARNING -  PROCESS-CRASH | /tests/dom/media/tests/mochitest/test_dataChannel_basicAudio.html | application crashed [@ 0x44bed75c]
01:17:30     INFO -  Crash dump filename: /tmp/tmpTRAZJH/2d4686c8-d487-0374-41238a08-1c4b72ea.dmp
01:17:30     INFO -  Operating system: Android
01:17:30     INFO -                    0.0.0 Linux 3.2.0+ #2 SMP PREEMPT Thu Nov 29 08:06:57 EST 2012 armv7l pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys
01:17:30     INFO -  CPU: arm
01:17:30     INFO -       2 CPUs
01:17:30     INFO -  Crash reason:  SIGSEGV
01:17:30     INFO -  Crash address: 0x0
01:17:30     INFO -  Thread 0 (crashed)
01:17:30     INFO -   0  0x44bed75c
01:17:30     INFO -       r4 = 0xbeffe840    r5 = 0x00000064    r6 = 0xbeffe8b0    r7 = 0xffffffff
01:17:30     INFO -       r8 = 0x00000000    r9 = 0x00000000   r10 = 0x00000000    fp = 0xbefff32c
01:17:30     INFO -       sp = 0xbeffe748    lr = 0x44be9c03    pc = 0x44bed75c
01:17:30     INFO -      Found by: given as instruction pointer in context
01:17:30     INFO -  Loaded modules:
01:17:30     INFO -  0x00008000 - 0x0000afff  libplugin-container.so  ???  (main)
01:17:30     INFO -  0x40000000 - 0x40003fff  liblog.so  ???
01:17:30     INFO -  0x40004000 - 0x40064fff  libc.so  ???
01:17:30     INFO -  0x40070000 - 0x40077fff  __properties__ (deleted)  ???
01:17:30     INFO -  0x40078000 - 0x40079fff  libstdc++.so  ???
01:17:30     INFO -  0x4007a000 - 0x4008ffff  libm.so  ???
01:17:30     INFO -  0x40091000 - 0x400e4fff  libmozglue.so  ???
01:17:30     INFO -  0x400e8000 - 0x400fdfff  libz.so  ???
01:17:30     INFO -  0x40202000 - 0x40204fff  libmozalloc.so (deleted)  ???
01:17:30     INFO -  0x40207000 - 0x4020afff  libmozalloc.so (deleted)  ???
01:17:30     INFO -  0x4020b000 - 0x4020ffff  libGLESv2.so  ???
01:17:30     INFO -  0x40210000 - 0x40221fff  libcutils.so  ???
01:17:30     INFO -  0x40231000 - 0x40258fff  libEGL.so  ???
01:17:30     INFO -  0x4025b000 - 0x4028ffff  libutils.so  ???
01:17:30     INFO -  0x40290000 - 0x402b1fff  libGLESv2_dbg.so  ???
01:17:30     INFO -  0x402b2000 - 0x402fafff  libstlport.so  ???
01:17:30     INFO -  0x40400000 - 0x422b9fff  org.mozilla.fennec-1.apk  ???
01:17:30     INFO -  0x422ba000 - 0x424abfff  libnss3.so (deleted)  ???
01:17:30     INFO -  0x424ad000 - 0x424effff  libnss3.so (deleted)  ???
01:17:30     INFO -  0x426aa000 - 0x44763fff  libxul.so (deleted)  ???
01:17:30     INFO -  0x44765000 - 0x447eafff  libxul.so (deleted)  ???

From logcat:
01:17:32     INFO -  04-16 01:17:00.171 I/Gecko   ( 2338): [Child 2338] WARNING: Leaking the RDF Service.: file /builds/slave/b2g-in-and-d-00000000000000000/build/rdf/build/nsRDFModule.cpp, line 165
01:17:32     INFO -  04-16 01:17:00.250 I/Gecko   ( 2338): [Child 2338] WARNING: '!compMgr', file /builds/slave/b2g-in-and-d-00000000000000000/build/xpcom/glue/nsComponentManagerUtils.cpp, line 59
01:17:32     INFO -  04-16 01:17:00.265 F/MOZ_Assert( 2338): Assertion failure: work_queue_.empty(), at /builds/slave/b2g-in-and-d-00000000000000000/build/ipc/chromium/src/base/message_loop.cc:393
01:17:32     INFO -  04-16 01:17:00.437 F/MOZ_Assert( 2244): Assertion failure: !pd->minidump, at /builds/slave/b2g-in-and-d-00000000000000000/build/toolkit/crashreporter/nsExceptionHandler.cpp:2454
01:17:32     INFO -  04-16 01:17:01.281 F/libc    ( 2244): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1)
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289): *** *** *** *** *** *** *** *** *** *** *** *** *** *** *** ***
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289): Build fingerprint: 'pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys'
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289): pid: 2244, tid: 2337  >>> org.mozilla.fennec <<<
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 00000000
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  r0 00000097  r1 637e9548  r2 0000007b  r3 00000000
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  r4 6d17ac80  r5 6c9a21f0  r6 6f5ffcb8  r7 6f5ffcbc
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  r8 00000001  r9 637eb6a7  10 63a0d6b5  fp 637eb795
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  ip 40055130  sp 6f5ffcb0  lr 62d273cf  pc 62d291b4  cpsr 20000130
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d0  6e61486e6f697432  d1  7070632e72656c34
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d2  3030303030303035  d3  75622f3030303034
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d4  6c6f6f742f646c69  d5  736172632f74696b
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d6  6574726f70657268  d7  65637845736e2f72
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d8  3ff0000000000000  d9  0000000000000000
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d10 0000000000000000  d11 0000000000000000
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d12 0000000000000000  d13 0000000000000000
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d14 0000000000000000  d15 0000000000000000
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d16 0000000000000000  d17 656e65672e736c69
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d18 fe01817ebde0817e  d19 fe01817efe01817e
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d20 fe01817ebde0817e  d21 fe01817efe01817e
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d22 fe01817ef708817e  d23 fe01817efe01817e
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d24 0000005700050057  d25 ff82ff82ff82f882
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d26 0000007d0007007d  d27 0000007d0000007d
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d28 ffc4c4c4ffd9d9d9  d29 ffffffffffd9d9d9
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  d30 ffffffffffd9d9d9  d31 ffffffffffd9d9d9
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):  scr 28000013
01:17:32     INFO -  04-16 01:17:01.687 I/DEBUG   ( 1289):
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289):          #00  pc 000191b4  /dev/ashmem/libxul.so (deleted)
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289):          #01  lr 62d273cf  /dev/ashmem/libxul.so (deleted)
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289):
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289): code around pc:
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289): 62d29194 435868e0 d2fef4c6 b16368ab f640481e  .hXC.....hc..H@.
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289): 62d291a4 491e1296 44794478 f904f7fe 227b2300  ...IxDyD.....#{"
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289): 62d291b4 f4a9601a 9c02c61e 0808f105 6823b11c  .`............#h
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289): 62d291c4 685b4620 46404798 f4b24621 4a14d21b   F[h.G@F!F.....J
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289): 62d291d4 447aa808 33016813 601360eb d286f4af  ..zD.h.3.`.`....
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289):
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289): code around lr:
01:17:32     INFO -  04-16 01:17:01.789 I/DEBUG   ( 1289): 62d273ac 2300604b 60cb608b 00004770 4603b507  K`.#.`.`pG.....F
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289): 62d273bc 0006e88d 49032007 44794a03 f4ab447a  ..... .I.JyDzD..
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289): 62d273cc bd0ec4c0 007eb081 007eb08a f44fb513  ......~...~...O.
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289): 62d273dc 680372e0 699b2101 f0104798 46044300  .r.h.!.i.G...C.F
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289): 62d273ec 461cbf08 4b09d00f d101429c e00a2400  ...F...K.B...$..
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289): stack:
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc70  00000010
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc74  f6a5ab92
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc78  6f5ffcc0
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc7c  5d78a9bb  /dev/ashmem/dalvik-jit-code-cache (deleted)
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc80  6d17ac80
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc84  6c9a21f0
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc88  6f5ffcb8
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc8c  6f5ffcbc
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc90  00000001
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc94  62d273cf  /dev/ashmem/libxul.so (deleted)
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc98  63512456  /dev/ashmem/libxul.so (deleted)
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffc9c  637e9929  /dev/ashmem/libxul.so (deleted)
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffca0  637e9548  /dev/ashmem/libxul.so (deleted)
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffca4  00000996
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffca8  df0027ad
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcac  00000000
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289): #00 6f5ffcb0  6f5ffcc0
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcb4  6f5ffcc8
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcb8  68ca01f0
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcbc  68ca0280
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcc0  00000922
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcc4  6f5ffcd4
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcc8  5be045b0
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffccc  6f5ffc01
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcd0  6e934550
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcd4  00000001
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcd8  6f5ffd48
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcdc  0000007c
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffce0  6c964c00
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffce4  00000001
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffce8  6f5ffdb4
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcec  62d2c6e3  /dev/ashmem/libxul.so (deleted)
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcf0  00000922
01:17:32     INFO -  04-16 01:17:01.796 I/DEBUG   ( 1289):     6f5ffcf4  6f5ffdb4
In case they get lost in the last message, these lines seem relevant:
01:17:32     INFO -  04-16 01:17:00.265 F/MOZ_Assert( 2338): Assertion failure: work_queue_.empty(), at /builds/slave/b2g-in-and-d-00000000000000000/build/ipc/chromium/src/base/message_loop.cc:393
01:17:32     INFO -  04-16 01:17:00.437 F/MOZ_Assert( 2244): Assertion failure: !pd->minidump, at /builds/slave/b2g-in-and-d-00000000000000000/build/toolkit/crashreporter/nsExceptionHandler.cpp:2454
blassey: can you look at this or find someone to look at it? I have no idea what's going on here, but it looks like a crash in a child process. I didn't think we used those at all on Android, so maybe this is just a test doing something that it shouldn't on Android?
Flags: needinfo?(blassey.bugs)
It looks like there is just a normal crash (or that first assertion failure), and a minidump appears to be created, but then we crash somewhere after that in the crash handling code.
Assignee: nobody → nchen
Flags: needinfo?(blassey.bugs)
Also:
https://tbpl.mozilla.org/php/getParsedLog.php?id=38019769&tree=Mozilla-Inbound

Android 4.0 Panda mozilla-inbound debug test mochitest-4 on 2014-04-17 09:09:35 PDT for push 993c3b5ff459
slave: panda-0160

09:32:46     INFO -  1652 INFO TEST-START | /tests/dom/datastore/tests/test_app_install.html
09:32:46     INFO -  INFO | automation.py | Application ran for: 0:11:44.063618
09:32:46     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpaxLTKppidlog
09:32:47     INFO -  /data/anr/traces.txt not found
09:32:48     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android-debug/1397746626/fennec-31.0a1.en-US.android-arm.crashreporter-symbols.zip
09:32:50  WARNING -  PROCESS-CRASH | /tests/dom/datastore/tests/test_app_install.html | application crashed [@ 0x44bbdbb4]
09:32:50     INFO -  Crash dump filename: /tmp/tmpASd1hN/700264fa-193f-72f7-1033e56e-1a65d8de.dmp
09:32:50     INFO -  Operating system: Android
09:32:50     INFO -                    0.0.0 Linux 3.2.0+ #2 SMP PREEMPT Thu Nov 29 08:06:57 EST 2012 armv7l pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys
09:32:50     INFO -  CPU: arm
09:32:50     INFO -       2 CPUs
09:32:50     INFO -  Crash reason:  SIGSEGV
09:32:50     INFO -  Crash address: 0x0
09:32:50     INFO -  Thread 0 (crashed)
09:32:50     INFO -   0  0x44bbdbb4
09:32:50     INFO -       r4 = 0xbeffe840    r5 = 0x00000064    r6 = 0xbeffe8b0    r7 = 0xffffffff
09:32:50     INFO -       r8 = 0x00000000    r9 = 0x00000000   r10 = 0x00000000    fp = 0xbefff32c
09:32:50     INFO -       sp = 0xbeffe748    lr = 0x44bba077    pc = 0x44bbdbb4
09:32:50     INFO -      Found by: given as instruction pointer in context
09:32:50     INFO -  Loaded modules:
09:32:50     INFO -  0x00008000 - 0x0000afff  libplugin-container.so  ???  (main)
09:32:50     INFO -  0x40000000 - 0x40003fff  liblog.so  ???
09:32:50     INFO -  0x40004000 - 0x40064fff  libc.so  ???
09:32:50     INFO -  0x40070000 - 0x40077fff  __properties__ (deleted)  ???
09:32:50     INFO -  0x40078000 - 0x40079fff  libstdc++.so  ???
09:32:50     INFO -  0x4007a000 - 0x4008ffff  libm.so  ???
09:32:50     INFO -  0x40091000 - 0x400e4fff  libmozglue.so  ???
09:32:50     INFO -  0x400e8000 - 0x400fdfff  libz.so  ???
09:32:50     INFO -  0x40202000 - 0x40204fff  libmozalloc.so (deleted)  ???
09:32:50     INFO -  0x40207000 - 0x4020afff  libmozalloc.so (deleted)  ???
09:32:50     INFO -  0x4020b000 - 0x4020ffff  libGLESv2.so  ???
09:32:50     INFO -  0x40210000 - 0x40221fff  libcutils.so  ???
09:32:50     INFO -  0x40231000 - 0x40258fff  libEGL.so  ???
09:32:50     INFO -  0x4025b000 - 0x4028ffff  libutils.so  ???
09:32:50     INFO -  0x40290000 - 0x402b1fff  libGLESv2_dbg.so  ???
09:32:50     INFO -  0x402b2000 - 0x402fafff  libstlport.so  ???
09:32:50     INFO -  0x40400000 - 0x42287fff  org.mozilla.fennec-1.apk  ???
09:32:50     INFO -  0x42288000 - 0x42479fff  libnss3.so (deleted)  ???
09:32:50     INFO -  0x4247b000 - 0x424bdfff  libnss3.so (deleted)  ???
09:32:50     INFO -  0x42678000 - 0x44732fff  libxul.so (deleted)  ???
09:32:50     INFO -  0x44734000 - 0x447b9fff  libxul.so (deleted)  ???
09:32:50     INFO -  mozcrash INFO | Saved minidump as /builds/panda-0160/test/build/blobber_upload_dir/700264fa-193f-72f7-1033e56e-1a65d8de.dmp
09:32:50     INFO -  mozcrash INFO | Saved app info as /builds/panda-0160/test/build/blobber_upload_dir/700264fa-193f-72f7-1033e56e-1a65d8de.extra
09:32:53  WARNING -  PROCESS-CRASH | /tests/dom/datastore/tests/test_app_install.html | application crashed [@ CrashReporter::OnChildProcessDumpRequested]
09:32:53     INFO -  Crash dump filename: /tmp/tmpASd1hN/5ea7406f-776c-a9bf-65ab46c1-16ba9078.dmp
09:32:53     INFO -  Operating system: Android
09:32:53     INFO -                    0.0.0 Linux 3.2.0+ #2 SMP PREEMPT Thu Nov 29 08:06:57 EST 2012 armv7l pandaboard/pandaboard/pandaboard:4.0.4/IMM76I/5:eng/test-keys
09:32:53     INFO -  CPU: arm
09:32:53     INFO -       2 CPUs
09:32:53     INFO -  Crash reason:  SIGSEGV
09:32:53     INFO -  Crash address: 0x0
09:32:53     INFO -  Thread 62 (crashed)
09:32:53     INFO -   0  libxul.so!CrashReporter::OnChildProcessDumpRequested [nsExceptionHandler.cpp:993c3b5ff459 : 2454 + 0x14]
09:32:53     INFO -       r4 = 0x6ece7640    r5 = 0x68b5cad0    r6 = 0x706ffcb8    r7 = 0x706ffcbc
09:32:53     INFO -       r8 = 0x00000001    r9 = 0x636bc4ec   r10 = 0x638de51d    fp = 0x636bc5da
09:32:53     INFO -       sp = 0x706ffcb0    lr = 0x62bfa257    pc = 0x62bfc03c
09:32:53     INFO -      Found by: given as instruction pointer in context
09:32:53     INFO -   1  libxul.so!google_breakpad::CrashGenerationServer::ClientEvent(short) [crash_generation_server.cc:993c3b5ff459 : 401 + 0x3]
09:32:53     INFO -       r4 = 0x706ffd48    r5 = 0x00000079    r6 = 0x6e944ec0    r7 = 0x00000001
09:32:53     INFO -       r8 = 0x706ffdb4    r9 = 0x636bc4ec   r10 = 0x638de51d    fp = 0x636bc5da
09:32:53     INFO -       sp = 0x706ffcf0    pc = 0x62bff56b
09:32:53     INFO -      Found by: call frame info
09:32:53     INFO -   2  libxul.so!google_breakpad::CrashGenerationServer::Run() [crash_generation_server.cc:993c3b5ff459 : 288 + 0x7]
09:32:53     INFO -       r4 = 0x6e944ec0    r5 = 0x62bff657    r6 = 0x6e944ec0    r7 = 0x00000078
09:32:53     INFO -       r8 = 0x62bff657    r9 = 0x6e944ec0   r10 = 0x00100000    fp = 0x00000001
09:32:53     INFO -       sp = 0x706ffed0    pc = 0x62bff63b
09:32:53     INFO -      Found by: call frame info
09:32:53     INFO -   3  libxul.so!google_breakpad::CrashGenerationServer::ThreadMain(void*) [crash_generation_server.cc:993c3b5ff459 : 460 + 0x3]
09:32:53     INFO -       r4 = 0x706fff00    r5 = 0x62bff657    r6 = 0x6e944ec0    r7 = 0x00000078
09:32:53     INFO -       r8 = 0x62bff657    r9 = 0x6e944ec0   r10 = 0x00100000    fp = 0x00000001
09:32:53     INFO -       sp = 0x706ffee8    pc = 0x62bff65d
09:32:53     INFO -      Found by: call frame info
09:32:53     INFO -   4  libc.so + 0x12a02
09:32:53     INFO -       r4 = 0x706fff00    r5 = 0x62bff657    r6 = 0x6e944ec0    r7 = 0x00000078
09:32:53     INFO -       r8 = 0x62bff657    r9 = 0x6e944ec0   r10 = 0x00100000    fp = 0x00000001
09:32:53     INFO -       sp = 0x706ffef0    pc = 0x400dda04
09:32:53     INFO -      Found by: call frame info
09:32:53     INFO -   5  libc.so + 0x550be
09:32:53     INFO -       sp = 0x706ffefc    pc = 0x401200c0
09:32:53     INFO -      Found by: stack scanning
(In reply to Ed Morley (Away 18th-21st April - public holiday) [:edmorley UTC+0] from comment #2)
> In case they get lost in the last message, these lines seem relevant:
> 01:17:32     INFO -  04-16 01:17:00.265 F/MOZ_Assert( 2338): Assertion
> failure: work_queue_.empty(), at
> /builds/slave/b2g-in-and-d-00000000000000000/build/ipc/chromium/src/base/
> message_loop.cc:393
> 01:17:32     INFO -  04-16 01:17:00.437 F/MOZ_Assert( 2244): Assertion
> failure: !pd->minidump, at
> /builds/slave/b2g-in-and-d-00000000000000000/build/toolkit/crashreporter/
> nsExceptionHandler.cpp:2454

Can you take a look Ehsan? You added the first assert in bug 976363
Flags: needinfo?(ehsan)
(In reply to Jim Chen [:jchen :nchen] from comment #11)
> (In reply to Ed Morley (Away 18th-21st April - public holiday) [:edmorley
> UTC+0] from comment #2)
> > In case they get lost in the last message, these lines seem relevant:
> > 01:17:32     INFO -  04-16 01:17:00.265 F/MOZ_Assert( 2338): Assertion
> > failure: work_queue_.empty(), at
> > /builds/slave/b2g-in-and-d-00000000000000000/build/ipc/chromium/src/base/
> > message_loop.cc:393
> > 01:17:32     INFO -  04-16 01:17:00.437 F/MOZ_Assert( 2244): Assertion
> > failure: !pd->minidump, at
> > /builds/slave/b2g-in-and-d-00000000000000000/build/toolkit/crashreporter/
> > nsExceptionHandler.cpp:2454
> 
> Can you take a look Ehsan? You added the first assert in bug 976363

That assertion means that someone added a task to the message loop which was not properly handled before the message loop was destroyed (presumably by the thread shutting down.)  I have no idea what that task is and am not familiar with the code involved.
Flags: needinfo?(ehsan)
Back to you, Jim! This is happening with high enough frequency that arguably it's causing Android 4.0 debug mochitests to fail to meet visibility standards. This bug needs active ownership.
Flags: needinfo?(nchen)
Doing try runs now to see what's going on.
Status: NEW → ASSIGNED
Flags: needinfo?(nchen)
Same assertion as bug 986113, so that bug might fix this one.
Depends on: 986113
Turns out we're not doing much IPC cleanup when the child process shuts down. I guess it's fine since the process is terminating anyways, but this makes the assertion not happy. We can either disable the assertion for child process shutdown or do a minimal cleanup. This patch does the latter by closing some IPC channels. It fixes the assertion on my Nexus 4.

Try run here, https://tbpl.mozilla.org/?tree=Try&rev=084ebe5014d5
Why do we have a child process in the first place?
Comment on attachment 8416661 [details] [diff] [review]
Clean up IPC channels during child process shutdown (v1)

Try looks good (https://tbpl.mozilla.org/?tree=Try&rev=084ebe5014d5), so asking for review.

(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #36)
> Why do we have a child process in the first place?

The crash is happening when test_ipc_messagemanager_blob.html cleans up.
Attachment #8416661 - Flags: review?(benjamin)
Comment on attachment 8416661 [details] [diff] [review]
Clean up IPC channels during child process shutdown (v1)

This appears fine to me but I'm not a DOM guy nor 100% sure about this.
Attachment #8416661 - Flags: review?(benjamin) → review?(bent.mozilla)
Ben, this is currently our top orange on trunk and Aurora. Can you please prioritize this review soon? :)
Flags: needinfo?(bent.mozilla)
Comment on attachment 8416661 [details] [diff] [review]
Clean up IPC channels during child process shutdown (v1)

Review of attachment 8416661 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not sure that this will work correctly, we need to Nicolas about it. There are two things I'm worried about:

1. Nicolas is changing the shutdown sequence for all this stuff so I don't want to mess him up.
2. It looks like the destruction for everything currently happens pretty late, far past ContentChild::ActorDestroy:
  a. CompositorChild has an ActorDestroy method that cleans up for it, so calling Close on it might be fine, though I'm not sure the order is guaranteed wrt ContentChild::ActorDestroy.
  b. ImageBridgeChild looks like it gets shut down in LayoutModuleDtor/gfxPlatform::Shutdown. That's long after ContentChild::ActorDestroy, so it sounds worrisome to call Close() earlier...
  c. SharedBufferManagerChild has a Shutdown function that I don't think *ever* gets called at the moment (at least I can't find it with DXR). So we need something more substantial than simply calling Close() I think.
Attachment #8416661 - Flags: review?(bent.mozilla) → review?(nical.bugzilla)
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #41)
> Comment on attachment 8416661 [details] [diff] [review]
> Clean up IPC channels during child process shutdown (v1)
>
> 2. It looks like the destruction for everything currently happens pretty
> late, far past ContentChild::ActorDestroy:
>   a. CompositorChild has an ActorDestroy method that cleans up for it, so
> calling Close on it might be fine, though I'm not sure the order is
> guaranteed wrt ContentChild::ActorDestroy.

CompositorChild was the cause I found for the assertion. It was never cleaned up in a child process. I didn't know if the other two also caused crashes and thought would be nice to clean them up as well.

>   b. ImageBridgeChild looks like it gets shut down in
> LayoutModuleDtor/gfxPlatform::Shutdown. That's long after
> ContentChild::ActorDestroy, so it sounds worrisome to call Close() earlier...

Digging into the clean up path, I think ImageBridgeChild is actually cleaned up correctly so we shouldn't need to close it here. Maybe CompositorChild can be cleaned up in gfxPlatform::Shutdown as well?

>   c. SharedBufferManagerChild has a Shutdown function that I don't think
> *ever* gets called at the moment (at least I can't find it with DXR). So we
> need something more substantial than simply calling Close() I think.

Not sure where this should be cleaned up.
Flags: needinfo?(bent.mozilla) → needinfo?(nical.bugzilla)
(In reply to Jim Chen [:jchen :nchen] from comment #42)
> CompositorChild was the cause I found for the assertion. It was never
> cleaned up in a child process.

Hm, that's interesting... What was actually failing? CompositorChild::ActorDestroy (and all the subsequent shutdown steps) are actually being called, right?
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #43)
> (In reply to Jim Chen [:jchen :nchen] from comment #42)
> > CompositorChild was the cause I found for the assertion. It was never
> > cleaned up in a child process.
> 
> Hm, that's interesting... What was actually failing?
> CompositorChild::ActorDestroy (and all the subsequent shutdown steps) are
> actually being called, right?

We added an assertion that there's no pending messages in the message loop during shutdown in bug 976363 [1], but the assertion was triggered in Android debug builds. The pending message during shutdown turned out to be an IPC call intended for CompositorChild. I think IPC does clean up CompositorChild eventually, but after the message loop gets destroyed. We want to clean it up explicitly before the message loop gets destroyed.

[1] http://mxr.mozilla.org/mozilla-central/source/ipc/chromium/src/base/message_loop.cc?rev=8bd82defae82#405
Wait, that doesn't make sense... There should only be on be two message loops involved here, the main thread and the IPC thread. CompositorChild messages should be generated on the IPC thread and then dispatched to run on the main thread. Which message loop is asserting?
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #45)
> Wait, that doesn't make sense... There should only be on be two message
> loops involved here, the main thread and the IPC thread. CompositorChild
> messages should be generated on the IPC thread and then dispatched to run on
> the main thread. Which message loop is asserting?

The main message loop,

> #2  0x4550f3d4 in MessageLoop::DeletePendingTasks (this=this@entry=0xbecd6630)
>     at /home/nchen/gecko-dev/ipc/chromium/src/base/message_loop.cc:406
> #3  0x45510afa in MessageLoop::~MessageLoop (this=0xbecd6630, __in_chrg=<optimized out>)
>     at /home/nchen/gecko-dev/ipc/chromium/src/base/message_loop.cc:174
> #4  0x4647ba8e in XRE_InitChildProcess (aArgc=<optimized out>, aArgv=<optimized out>, 
>     aProcess=<optimized out>) at /home/nchen/gecko-dev/toolkit/xre/nsEmbedFunctions.cpp:522
> #5  0x4011aae6 in ChildProcessInit (argc=7, argv=0xbecd7134)
>     at /home/nchen/gecko-dev/mozglue/android/APKOpen.cpp:415

The task in MessageLoop::work_queue_ at time of assertion is MessageChannel::mDequeueOneTask, which processes the MessageChannel's pending queue. The message at the top of the MessageChannel's pending queue is a IPC::Message that has type PCompositor::Msg_DidComposite__ID
Hrm... In that case I don't understand how Close() is magically emptying that message loop.
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #47)
> Hrm... In that case I don't understand how Close() is magically emptying
> that message loop.

MessageChannel::Clear() [1] called during Close() cancels the message loop task. So maybe that removes the task from MessageLoop::work_queue_? It also clears pending messages, and once the channel is closed we won't have new messages arriving either.

[1] mxr.mozilla.org/mozilla-central/source/ipc/glue/MessageChannel.cpp?rev=8bd82defae82#263
(In reply to Jim Chen [:jchen :nchen] from comment #48)
> MessageChannel::Clear() [1] called during Close() cancels the message loop
> task.

Canceling the mDequeueOneTask task should, if anything, make hitting the assertion more likely.

> So maybe that removes the task from MessageLoop::work_queue_? It also clears pending messages

No, that's a separate object entirely, this code doesn't clear the message loop. These other members are messages that are not in the MessageLoop::work_queue_.

> once the channel is closed we won't have new messages arriving either.

Yes. I think that your patch probably just changes the timing a bit, so I bet we'll need something else to actually make this correct.
Comment on attachment 8416661 [details] [diff] [review]
Clean up IPC channels during child process shutdown (v1)

Review of attachment 8416661 [details] [diff] [review]:
-----------------------------------------------------------------

I think that this will conflict with ImageBridge and friends shutting down later (their shutdown sequences involves sending some messages to clean things up properly).
Today ImageBridge and CompositorParent are shut down in gfxPlatform::Shutdown which is a bit too late so I am about to land a patch that moves this to NS_ShutdownXPCOM (Bug 924622).
Am I right to believe that NS_ShutdownXPCOM will eventually be called somewhere before XRE_ShutdownChildProcess returns? If so, I think it would be best to close the channels in NS_ShutdownXPCOM too. If not then I think we have another problem.
Flags: needinfo?(nical.bugzilla)
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #49)
> (In reply to Jim Chen [:jchen :nchen] from comment #48)
> > MessageChannel::Clear() [1] called during Close() cancels the message loop
> > task.
> 
> Canceling the mDequeueOneTask task should, if anything, make hitting the
> assertion more likely.
> 
> > So maybe that removes the task from MessageLoop::work_queue_? It also clears pending messages
> 
> No, that's a separate object entirely, this code doesn't clear the message
> loop. These other members are messages that are not in the
> MessageLoop::work_queue_.

Actually I think the sequence goes something like this, without the patch applied,

1) ContentChild::ActorDestroy() => XRE_ShutdownChildProcess() => MessageLoop::Quit()
2) MessageLoop::Quit() marks the quit_received flag
3) Message loop processes pending messages
4) Message loop becomes idle, notices quit_received flag, and exits loop
5) CompositorChild channel received message and posts to message loop
6) Message loop is destroyed with pending message

With the patch applied,

1) CompositorChild::Close()
2) ContentChild::ActorDestroy() => XRE_ShutdownChildProcess() => MessageLoop::Quit()
3) MessageLoop::Quit() marks the quit_received flag
4) Message loop processes pending messages (mDequeueOneTask from CompositorChild removed)
5) Message loop becomes idle, notices quit_received flag, and exits loop
6) No more message from CompositorChild because the channel is closed
7) Message loop is destroyed with no pending messages


(In reply to Nicolas Silva [:nical] from comment #51)
> Comment on attachment 8416661 [details] [diff] [review]
> Clean up IPC channels during child process shutdown (v1)
> 
> Review of attachment 8416661 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think that this will conflict with ImageBridge and friends shutting down
> later (their shutdown sequences involves sending some messages to clean
> things up properly).
> Today ImageBridge and CompositorParent are shut down in
> gfxPlatform::Shutdown which is a bit too late so I am about to land a patch
> that moves this to NS_ShutdownXPCOM (Bug 924622).

Does CompositorChild need shutting down? This crash is happening because CompositorChild is still getting messages very late in the shutdown process. This is in the content process so I don't know if CompositorParent is involved.

> Am I right to believe that NS_ShutdownXPCOM will eventually be called
> somewhere before XRE_ShutdownChildProcess returns? If so, I think it would
> be best to close the channels in NS_ShutdownXPCOM too. If not then I think
> we have another problem.

NS_ShutdownXPCOM gets called before the message loop is destroyed (through ContentProcess::~ContentProcess()), but after XRE_ShutdownChildProcess, so it might be a bit late.
I don't think this is an issue anymore?
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
Attachment #8416661 - Flags: review?(nical.bugzilla)
You need to log in before you can comment on or make changes to this bug.