Closed Bug 1373980 Opened 7 years ago Closed 7 years ago

Intermittent application crashed [@ SharedLibraryInfo::GetInfoForSelf()]

Categories

(Core :: Gecko Profiler, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla56
Tracking Status
firefox-esr52 --- unaffected
firefox54 --- unaffected
firefox55 --- unaffected
firefox56 --- fixed

People

(Reporter: aryx, Assigned: nika)

References

Details

(Keywords: crash, intermittent-failure)

Crash Data

Attachments

(1 file, 1 obsolete file)

https://treeherder.mozilla.org/logviewer.html#?job_id=107874793&repo=mozilla-inbound


22:18:18     INFO - Buffered messages finished
22:18:18    ERROR - TEST-UNEXPECTED-FAIL | layout/style/test/test_webkit_flex_display.html | application terminated with exit code 1
22:18:18     INFO - runtests.py | Application ran for: 0:03:59.746813
22:18:18     INFO - zombiecheck | Reading PID log: /var/folders/cf/hbtzdksx47l1kfyjdcp1pd5m00000w/T/tmpD7OCUrpidlog
22:18:18     INFO - mozcrash Downloading symbols from: https://queue.taskcluster.net/v1/task/SzAJPrTFSs2sXHmKW47-4g/artifacts/public/build/firefox-56.0a1.en-US.mac.crashreporter-symbols.zip
22:18:25     INFO - mozcrash Copy/paste: /Users/cltbld/tasks/task_1497669461/build/macosx64-minidump_stackwalk /var/folders/cf/hbtzdksx47l1kfyjdcp1pd5m00000w/T/tmp2khNoj.mozrunner/minidumps/79394EFF-58AB-40A1-BE61-5970E51D9E76.dmp /var/folders/cf/hbtzdksx47l1kfyjdcp1pd5m00000w/T/tmpvtZT56
22:18:40     INFO - mozcrash Saved minidump as /Users/cltbld/tasks/task_1497669461/build/blobber_upload_dir/79394EFF-58AB-40A1-BE61-5970E51D9E76.dmp
22:18:40     INFO - mozcrash Saved app info as /Users/cltbld/tasks/task_1497669461/build/blobber_upload_dir/79394EFF-58AB-40A1-BE61-5970E51D9E76.extra
22:18:40     INFO - PROCESS-CRASH | layout/style/test/test_webkit_flex_display.html | application crashed [@ SharedLibraryInfo::GetInfoForSelf()]
22:18:40     INFO - Crash dump filename: /var/folders/cf/hbtzdksx47l1kfyjdcp1pd5m00000w/T/tmp2khNoj.mozrunner/minidumps/79394EFF-58AB-40A1-BE61-5970E51D9E76.dmp
22:18:40     INFO - Operating system: Mac OS X
22:18:40     INFO -                   10.10.5 14F27
22:18:40     INFO - CPU: amd64
22:18:40     INFO -      family 6 model 69 stepping 1
22:18:40     INFO -      4 CPUs
22:18:40     INFO - 
22:18:40     INFO - GPU: UNKNOWN
22:18:40     INFO - 
22:18:40     INFO - Crash reason:  EXC_BAD_ACCESS / KERN_INVALID_ADDRESS
22:18:40     INFO - Crash address: 0x19c8
22:18:40     INFO - Process uptime: 240 seconds
22:18:40     INFO - 
22:18:40     INFO - Thread 43 (crashed)
22:18:40     INFO -  0  XUL!SharedLibraryInfo::GetInfoForSelf() [shared-libraries-macos.cc:aabaa4a353ef : 117 + 0x4]
22:18:40     INFO -     rax = 0x0000000000000000   rdx = 0x0000000000000119
22:18:40     INFO -     rcx = 0x0000000000000339   rbx = 0x00007fff6eab2288
22:18:40     INFO -     rsi = 0x0000000000000113   rdi = 0x0000000000000000
22:18:40     INFO -     rbp = 0x000000011e080ae0   rsp = 0x000000011e080700
22:18:40     INFO -      r8 = 0x0000000000000030    r9 = 0x0000000105a00240
22:18:40     INFO -     r10 = 0x0000000000000003   r11 = 0x0000000119301740
22:18:40     INFO -     r12 = 0x000000011e0809f0   r13 = 0x000000011e080a18
22:18:40     INFO -     r14 = 0x000000011e0809e0   r15 = 0x00007fff94011995
22:18:40     INFO -     rip = 0x00000001099ba7f8
22:18:40     INFO -     Found by: given as instruction pointer in context
22:18:40     INFO -  1  XUL!mozilla::Telemetry::GetStackAndModules(std::__1::vector<unsigned long, std::__1::allocator<unsigned long> > const&) [Telemetry.cpp:aabaa4a353ef : 2521 + 0x5]
22:18:40     INFO -     rbx = 0x00000000000000b0   rbp = 0x000000011e080b90
22:18:40     INFO -     rsp = 0x000000011e080af0   r12 = 0x000000000000000c
22:18:40     INFO -     r13 = 0x000000000000000b   r14 = 0x000000011a9df058
22:18:40     INFO -     r15 = 0x0000000119366400   rip = 0x0000000109bf5af0
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO -  2  XUL!mozilla::ProcessHangRunnable::Run() [BackgroundHangMonitor.cpp:aabaa4a353ef : 465 + 0x5]
22:18:40     INFO -     rbx = 0x0000000105b4dad8   rbp = 0x000000011e080c20
22:18:40     INFO -     rsp = 0x000000011e080ba0   r12 = 0x0000000000000000
22:18:40     INFO -     r13 = 0x0000000000000000   r14 = 0x0000000105b4dab0
22:18:40     INFO -     r15 = 0x00000001192eb780   rip = 0x0000000106109b24
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO -  3  XUL!nsThreadPool::Run() [nsThreadPool.cpp:aabaa4a353ef : 225 + 0x6]
22:18:40     INFO -     rbx = 0x0000000105b4dad8   rbp = 0x000000011e080ca0
22:18:40     INFO -     rsp = 0x000000011e080c30   r12 = 0x0000000000000000
22:18:40     INFO -     r13 = 0x0000000000000000   r14 = 0x0000000105b4dab0
22:18:40     INFO -     r15 = 0x000000011e080c70   rip = 0x00000001061151af
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO -  4  XUL!non-virtual thunk to nsThreadPool::Run() [nsThreadPool.cpp:aabaa4a353ef : 154 + 0x9]
22:18:40     INFO -     rbx = 0x000000012c0f06b0   rbp = 0x000000011e080cb0
22:18:40     INFO -     rsp = 0x000000011e080cb0   r12 = 0x000000012c0f0700
22:18:40     INFO -     r13 = 0x000000011e080d30   r14 = 0x000000012c0f0690
22:18:40     INFO -     r15 = 0x0000000000000000   rip = 0x00000001061152cd
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO -  5  XUL!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:aabaa4a353ef : 1428 + 0x6]
22:18:40     INFO -     rbx = 0x000000012c0f06b0   rbp = 0x000000011e080db0
22:18:40     INFO -     rsp = 0x000000011e080cc0   r12 = 0x000000012c0f0700
22:18:40     INFO -     r13 = 0x000000011e080d30   r14 = 0x000000012c0f0690
22:18:40     INFO -     r15 = 0x0000000000000000   rip = 0x000000010611253b
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO -  6  XUL!NS_ProcessNextEvent(nsIThread*, bool) [nsThreadUtils.cpp:aabaa4a353ef : 475 + 0xd]
22:18:40     INFO -     rbx = 0x0000000000000001   rbp = 0x000000011e080dd0
22:18:40     INFO -     rsp = 0x000000011e080dc0   r12 = 0x0000000000000000
22:18:40     INFO -     r13 = 0x000000011abe2c60   r14 = 0x000000011abe2c40
22:18:40     INFO -     r15 = 0x000000012c0f0690   rip = 0x0000000106116cb7
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO -  7  XUL!mozilla::ipc::MessagePumpForNonMainThreads::Run(base::MessagePump::Delegate*) [MessagePump.cpp:aabaa4a353ef : 368 + 0xd]
22:18:40     INFO -     rbx = 0x000000011a28b1a0   rbp = 0x000000011e080e30
22:18:40     INFO -     rsp = 0x000000011e080de0   r12 = 0x0000000000000000
22:18:40     INFO -     r13 = 0x000000011abe2c60   r14 = 0x000000011abe2c40
22:18:40     INFO -     r15 = 0x000000012c0f0690   rip = 0x00000001066a1a2a
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO -  8  XUL!MessageLoop::Run() [message_loop.cc:aabaa4a353ef : 318 + 0x8]
22:18:40     INFO -     rbx = 0x000000012c0f0730   rbp = 0x000000011e080e60
22:18:40     INFO -     rsp = 0x000000011e080e40   r12 = 0x000000000ab3c201
22:18:40     INFO -     r13 = 0x000000012c0f0690   r14 = 0x000000012c0f06b0
22:18:40     INFO -     r15 = 0x000000011a28b1a0   rip = 0x000000010662bf49
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO -  9  XUL!nsThread::ThreadFunc(void*) [nsThread.cpp:aabaa4a353ef : 503 + 0x8]
22:18:40     INFO -     rbx = 0x000000012c0f0730   rbp = 0x000000011e080ec0
22:18:40     INFO -     rsp = 0x000000011e080e70   r12 = 0x000000000ab3c201
22:18:40     INFO -     r13 = 0x000000012c0f0690   r14 = 0x000000012c0f06b0
22:18:40     INFO -     r15 = 0x000000011a28b1a0   rip = 0x000000010610f8f8
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO - 10  libnss3.dylib!_pt_root [ptthread.c:2733d0c87013 : 216 + 0x3]
22:18:40     INFO -     rbx = 0x000000011946acf0   rbp = 0x000000011e080ef0
22:18:40     INFO -     rsp = 0x000000011e080ed0   r12 = 0x0000000000013207
22:18:40     INFO -     r13 = 0x00000000000008ff   r14 = 0x000000011e081000
22:18:40     INFO -     r15 = 0x0000000000000000   rip = 0x0000000105d2ccd3
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO - 11  libsystem_pthread.dylib!_pthread_body + 0x83
22:18:40     INFO -     rbx = 0x000000011e081000   rbp = 0x000000011e080f10
22:18:40     INFO -     rsp = 0x000000011e080f00   r12 = 0x0000000000013207
22:18:40     INFO -     r13 = 0x00000000000008ff   r14 = 0x000000011946acf0
22:18:40     INFO -     r15 = 0x0000000105d2cc00   rip = 0x00007fff9149e05a
22:18:40     INFO -     Found by: call frame info
22:18:40     INFO - 12  libsystem_pthread.dylib!_pthread_start + 0xb0
22:18:40     INFO -     rbp = 0x000000011e080f50   rsp = 0x000000011e080f20
22:18:40     INFO -     rip = 0x00007fff9149dfd7
22:18:40     INFO -     Found by: previous frame's frame pointer
22:18:40     INFO - 13  libsystem_pthread.dylib!thread_start + 0xd
22:18:40     INFO -     rbp = 0x000000011e080f78   rsp = 0x000000011e080f60
22:18:40     INFO -     rip = 0x00007fff9149b3ed
22:18:40     INFO -     Found by: previous frame's frame pointer
22:18:40     INFO - 14  libnss3.dylib + 0x12cc00
22:18:40     INFO -     rsp = 0x000000011e081030   rip = 0x0000000105d2cc00
22:18:40     INFO -     Found by: stack scanning
>    const dyld_image_info *info = &aii->infoArray[i];
> 
>    // If the magic number doesn't match then go no further
>    // since we're not pointing to where we think we are.
>    if (info->imageLoadAddress->magic != MACHO_MAGIC_NUMBER) {

It's crashing on this line. I wonder what info->imageLoadAddress is.
I looked into this a little bit. In the copy of the dyld_images.h file which I found online, there is this comment:

 /* 
 *	Beginning in Mac OS X 10.4, this is how gdb discovers which mach-o images are loaded in a process.
 *
 *	gdb looks for the symbol "_dyld_all_image_infos" in dyld.  It contains the fields below.  
 *
 *	For a snashot of what images are currently loaded, the infoArray fields contain a pointer
 *	to an array of all images. If infoArray is NULL, it means it is being modified, come back later.
 *
 *	To be notified of changes, gdb sets a break point on the address pointed to by the notificationn
 *	field.  The function it points to is called by dyld with an array of information about what images 
 *	have been added (dyld_image_adding) or are about to be removed (dyld_image_removing). 
 *
 * The notification is called after infoArray is updated.  This means that if gdb attaches to a process
 * and infoArray is NULL, gdb can set a break point on notification and let the proccess continue to
 * run until the break point.  Then gdb can inspect the full infoArray.
 */

It mentions that if infoArray is NULL, it means it is being modified. We don't null-check infoArray in this code path, and that line is the first line where we dereference a pointer based on it, making me think that that is the reason why the crash occurs.

I'm guessing that if we determine that infoArray is null, we should probably just exit gracefully, I'm not sure how we'd recover from that.
(In reply to Michael Layzell [:mystor] from comment #3)
> It mentions that if infoArray is NULL, it means it is being modified.

That seems likely. In some of these logs I can see nsNSSComponent::ShutdownNSS on the stack of a different thread, which may just have unloaded the nss library. In others I can see dlopen being called by FFVPXRuntimeLinker::Init() on another thread.

> I'm guessing that if we determine that infoArray is null, we should probably
> just exit gracefully, I'm not sure how we'd recover from that.

Yeah, we could just return an empty list of libraries. Or we could have a retry loop with a few PR_Sleep(0)'s until some maximum count.
No longer blocks: 1341713
No longer blocks: 1341314
No longer blocks: 1341193
I think this should do it :)

MozReview-Commit-ID: 9hZ9VL9MWl9
Attachment #8880502 - Flags: review?(mstange)
Comment on attachment 8880502 [details] [diff] [review]
Null-check infoArray correctly in GetInfoForSelf on macOS

Review of attachment 8880502 [details] [diff] [review]:
-----------------------------------------------------------------

::: tools/profiler/core/shared-libraries-macos.cc
@@ +99,5 @@
>  
> +  const dyld_image_info* infoArray = nullptr;
> +  size_t infoCount = 0;
> +
> +  // Attempt to get the library info for the currently loaded libraries 100

I think there should be another sentence at the start of this comment. Something like "Getting the info array can fail if it's currently modified, so we might have to try more than once."
Attachment #8880502 - Flags: review?(mstange) → review+
Attachment #8880502 - Attachment is obsolete: true
Pushed by michael@thelayzells.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/5f5a6facf488
Null-check infoArray correctly in GetInfoForSelf on macOS, r=mstange
https://hg.mozilla.org/mozilla-central/rev/5f5a6facf488
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla56
Assignee: nobody → michael
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: