sporadic issue with mochitest-ipcplugins: "missing output line for total leaks"

RESOLVED FIXED in mozilla1.9.3a1

Status

Core
IPC
RESOLVED FIXED
8 years ago
5 years ago

People

(Reporter: dholbert, Assigned: cjones)

Tracking

({intermittent-failure})

Trunk
mozilla1.9.3a1
x86
Linux
intermittent-failure
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Attachments

(2 attachments)

(Reporter)

Description

8 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263319076.1263326866.16722.gz
Linux mozilla-central debug test everythingelse on 2010/01/12 09:57:56
s: moz2-linux-slave15
{
TEST-UNEXPECTED-FAIL | plugin process 31456 | automationutils.processLeakLog() | missing output line for total leaks!
}

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263312578.1263318564.21326.gz
Linux mozilla-central debug test everythingelse on 2010/01/12 08:09:38
s: moz2-linux-slave26

http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263309338.1263317368.7376.gz
Linux mozilla-central debug test everythingelse on 2010/01/12 07:15:38
s: moz2-linux-slave41
(Reporter)

Comment 1

8 years ago
Not sure what component this should go in.  IPC? Plugins? Testing?  I filed this in RelEng for now, since the failure suggests that it could be an issue with the testing framework.  Feel free to relocate if another component makes more sense.

Comment 2

8 years ago
I think that this is a crash or something in the plugin process and that the harness is working correctly, but we're trying to get crash reporting hooked up today to help figure that out.
Component: Release Engineering → IPC
Product: mozilla.org → Core
QA Contact: release → ipc
Version: other → unspecified
FWIW, if you think something is a test harness issue, you'd file it in Testing:Whatever. RelEng should just be for "I think the build machine is broken" or "buildbot itself is broken".

Comment 4

8 years ago
It doesn't appear that the plugin process is crashing (or at least, not in a way which generates a minidump). Alternate theories:
* The plugin process is shutting down after the main process, and racing with the automation script which reads its log
* The plugin process is randomly exiting early (but not crashing in a way which would produce a minidump)
Assignee: nobody → benjamin
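
To make the first theory concrete, here is a minimal sketch of a harness-side check that fails in the way reported above when a log is read before the process has finished writing it. The "TOTAL <bytes>" line format and the check_leak_log name are assumptions made up for illustration; this is not the real automationutils.processLeakLog() or the real leak log format.

```python
import re

# Hypothetical, simplified log format: a single "TOTAL <bytes_leaked>" line
# written at the very end of shutdown. The real leak log is richer than this.
TOTAL_LINE = re.compile(r"^\s*TOTAL\s+(?P<leaked>-?\d+)\s*$")


def check_leak_log(text):
    """Return (ok, message) for the contents of one process's leak log."""
    for line in text.splitlines():
        match = TOTAL_LINE.match(line)
        if match:
            leaked = int(match.group("leaked"))
            if leaked > 0:
                return False, "leaked %d bytes" % leaked
            return True, "no leaks"
    # No summary line at all: the log was truncated or never finished.
    return False, "missing output line for total leaks!"
```

A log that stops before the summary line cannot be told apart from a process that never wrote one, which is why both theories produce the same failure message.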

Comment 5

8 years ago
Linux mozilla-central debug test everythingelse on 2010/01/12 13:17:12
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263331032.1263338310.14006.gz

I'm having trouble reproducing locally, but that may be because I'm massively multi-core.

Comment 6

8 years ago
Debugged: this is easy to reproduce if you insert a sleep(120) here:
http://hg.mozilla.org/mozilla-central/annotate/2f969cc4f104/toolkit/xre/nsEmbedFunctions.cpp#l330

It appears that this thread is dying either before or during XRE_LogTerm. cjones, I suspect that the parent is killing it off before the leak log is fully written.
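
A rough stand-alone model of that reproduction, assuming nothing about the real code: a stand-in child writes part of a log, stalls (playing the role of the inserted sleep), and only then writes its final line, while the parent kills it after a short wait. CHILD_SOURCE, reproduce_truncated_log, and the "TOTAL 0" line are all hypothetical names and formats chosen for the sketch.

```python
import os
import subprocess
import sys
import tempfile

# Stand-in child: writes part of a log, stalls, then writes the final line.
CHILD_SOURCE = (
    "import sys, time\n"
    "with open(sys.argv[1], 'w') as log:\n"
    "    log.write('partial stats\\n'); log.flush()\n"
    "    time.sleep(float(sys.argv[2]))\n"
    "    log.write('TOTAL 0\\n')\n"
)


def reproduce_truncated_log(stall=5.0, patience=0.3):
    """Kill a slow child after `patience` seconds and return what it logged."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE, path, str(stall)]
        )
        try:
            child.wait(timeout=patience)
        except subprocess.TimeoutExpired:
            child.kill()  # the parent gives up and kills the child
            child.wait()  # reap it so no zombie is left behind
        with open(path) as log:
            return log.read()
    finally:
        os.remove(path)
```

With the stall much longer than the parent's patience, the returned text never contains the final line, which matches the symptom in the tinderbox logs.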

Comment 7

8 years ago
The parent process appears to be killing it in:

#0  0x000000336ae331c0 in kill () from /lib64/libc.so.6
#1  0x00007f9fde37e345 in KillProcess (this=0x7f9fc8406b80) at ../../../src/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:126
#2  0x00007f9fde37e45f in ~ChildReaper (this=0x515a) at ../../../src/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc:91
#3  0x00007f9fde359802 in MessageLoop::DeletePendingTasks (this=0x7f9fd6a07ee0) at ../../../src/ipc/chromium/src/base/message_loop.cc:408
#4  0x00007f9fde359e22 in ~MessageLoop (this=0x7f9fd6a07ee0) at ../../../src/ipc/chromium/src/base/message_loop.cc:143
#5  0x00007f9fde3653fb in base::Thread::ThreadMain (this=0x7f9fdbdbfa80) at ../../../src/ipc/chromium/src/base/thread.cc:175
#6  0x00007f9fde376506 in ThreadFunc (closure=0x515a) at ../../../src/ipc/chromium/src/base/platform_thread_posix.cc:26
#7  0x000000336ba073da in start_thread () from /lib64/libpthread.so.0
#8  0x000000336aee627d in clone () from /lib64/libc.so.6

Main thread is in:
#0  0x000000336ba07cb5 in pthread_join () from /lib64/libpthread.so.0
#1  0x00007f9fde3654e9 in base::Thread::Stop (this=0x7f9fdbdbfa80) at ../../../src/ipc/chromium/src/base/thread.cc:114
#2  0x00007f9fde313499 in ~BrowserProcessSubThread (this=0x7f9fd6a089e0) at ../../../src/ipc/glue/GeckoThread.cpp:117
#3  0x00007f9fde38b911 in mozilla::ShutdownXPCOM (servMgr=0x7fffe6d1b320) at ../../../src/xpcom/build/nsXPComInit.cpp:913
#4  0x00007f9fdda18e50 in ~ScopedXPCOMStartup (this=0x7fffe6d1b9c0) at ../../../src/toolkit/xre/nsAppRunner.cpp:1042
#5  0x00007f9fdda1b3ad in XRE_main (argc=<value optimized out>, argv=<value optimized out>, aAppData=<value optimized out>)
    at ../../../src/toolkit/xre/nsAppRunner.cpp:3520
#6  0x0000000000401b4a in main (argc=5, argv=0x7fffe6d1bc78) at ../../../src/browser/app/nsBrowserApp.cpp:158

PluginModuleParent::~PluginModuleParent has already been called, here:
#0  ~PluginModuleParent (this=0x7f7464c11400) at ../../../src/dom/plugins/PluginModuleParent.cpp:81
#1  0x00007f747887da55 in ~nsNPAPIPlugin (this=0x7f7464c097a0) at ../../../../../src/modules/plugin/base/src/nsNPAPIPlugin.cpp:290
#2  0x00007f747887ce20 in nsNPAPIPlugin::Release (this=0x7f7464c097a0) at ../../../../../src/modules/plugin/base/src/nsNPAPIPlugin.cpp:221
#3  0x00007f7478890e1e in nsPluginTag::TryUnloadPlugin (this=0x7f7464c27200) at ../../../../dist/include/nsCOMPtr.h:640
#4  0x00007f7478887d79 in nsPluginHost::Destroy (this=0x7f7464ff3e80) at ../../../../../src/modules/plugin/base/src/nsPluginHost.cpp:2218
#5  0x00007f747888d15b in nsPluginHost::Observe (this=0x7f7464ff3e80, aSubject=<value optimized out>, aTopic=0x7f7478b93ca0 "xpcom-shutdown", 
    someData=<value optimized out>) at ../../../../../src/modules/plugin/base/src/nsPluginHost.cpp:4673
Assignee: benjamin → jones.chris.g

Updated

8 years ago
Blocks: 531142
Created attachment 421499 [details] [diff] [review]
Add an extra EnsureProcessTerminated() parameter to control how lenient to be wrt child shutdown
Attachment #421499 - Flags: review?(bent.mozilla)
Created attachment 421501 [details] [diff] [review]
Use lenient reaping for NS_BUILD_REFCNT_LOGGING builds

This is at dbaron's suggestion.  Any other conditions under which we want lenient reaping?
Attachment #421501 - Flags: review?(benjamin)
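
For readers without the patches open, here is a loose sketch of the idea in Python rather than the C++ of the actual change. It assumes the extra parameter is a boolean (called force here) and that lenient reaping means waiting for the child to exit on its own; REFCNT_LOGGING stands in for the NS_BUILD_REFCNT_LOGGING build condition. None of the names or timings below come from the patches themselves.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical stand-in for the NS_BUILD_REFCNT_LOGGING build condition.
REFCNT_LOGGING = True

# Stand-in child: writes part of a log, stalls, then writes the final line.
CHILD_SOURCE = (
    "import sys, time\n"
    "with open(sys.argv[1], 'w') as log:\n"
    "    log.write('partial stats\\n'); log.flush()\n"
    "    time.sleep(float(sys.argv[2]))\n"
    "    log.write('TOTAL 0\\n')\n"
)


def ensure_process_terminated(proc, force=True, grace=0.2):
    """Reap `proc`; kill it after `grace` seconds only when `force` is set."""
    if not force:
        return proc.wait()  # lenient: let the child finish its shutdown
    try:
        return proc.wait(timeout=grace)
    except subprocess.TimeoutExpired:
        proc.kill()
        return proc.wait()


def run_and_reap(force, stall):
    """Start the stand-in child, reap it, and return what it logged."""
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        child = subprocess.Popen(
            [sys.executable, "-c", CHILD_SOURCE, path, str(stall)]
        )
        ensure_process_terminated(child, force=force)
        with open(path) as log:
            return log.read()
    finally:
        os.remove(path)
```

A caller would then pick the mode from the build type, for example run_and_reap(force=not REFCNT_LOGGING, stall=0.3). The cost of the lenient path in this sketch is that a hung child blocks the parent indefinitely, which is one reason to limit it to builds that actually need the complete log.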

Comment 10

8 years ago
Comment on attachment 421501 [details] [diff] [review]
Use lenient reaping for NS_BUILD_REFCNT_LOGGING builds

the ifdefs are kinda ugly, but ok
Attachment #421501 - Flags: review?(benjamin) → review+
Comment on attachment 421499 [details] [diff] [review]
Add an extra EnsureProcessTerminated() parameter to control how lenient to be wrt child shutdown

I'd sub the 'grim' param with 'force' since it's not immediately clear what 'grim' means. But r=me in any case.
Attachment #421499 - Flags: review?(bent.mozilla) → review+
(In reply to comment #10)
> (From update of attachment 421501 [details] [diff] [review])
> the ifdefs are kinda ugly, but ok

Agreed.  I don't really know why this isn't also a problem on Windows, but I'm no win32 API guru.

Comment 14

8 years ago
Please leave open until it hits m-c.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Status: REOPENED → ASSIGNED
Version: unspecified → Trunk
This looks related? It's about "plugin process":
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263476610.1263482358.30039.gz
Yeah, that's this.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1263576045.1263578203.23322.gz
Linux mozilla-central debug test mochitest-other on 2010/01/15 09:20:45
s: moz2-linux-slave19

Comment 18

8 years ago
http://hg.mozilla.org/mozilla-central/rev/be2e27cd572e
http://hg.mozilla.org/mozilla-central/rev/ad6b1de0470c
Status: ASSIGNED → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED

Updated

8 years ago
Flags: in-testsuite+
Target Milestone: --- → mozilla1.9.3a1

Comment 20

8 years ago
Phil, could you file a new bug? We found and fixed the most common cause, but it's possible there are others lurking!
(In reply to comment #19)
> Is
> http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1264019073.1264020660.32626.gz#err6
> this, post-push, or something else doing the same thing?

This error is on Windows.  The fix in this bug was Linux-only.

(In reply to comment #12)
> (In reply to comment #10)
> > (From update of attachment 421501 [details] [diff] [review] [details])
> > the ifdefs are kinda ugly, but ok
> 
> Agreed.  I don't really know why this isn't also a problem on Windows, but I'm
> no win32 API guru.

Can a Windows guy take a look at this?
Blocks: 540967
To the bug 540967 cave, Robin!
Keywords: intermittent-failure
Whiteboard: [orange]