Closed Bug 523894 Opened 11 years ago Closed 10 years ago

"Serious fd usage error" when running mochitest-ipcplugins on linux

Categories

(Core :: IPC, defect)

All
Linux
defect
Not set

RESOLVED FIXED

People

(Reporter: jgriffin, Unassigned)

Attachments

(1 file)

When running the mochitest plugin tests with OOP plugins enabled on Linux, the tests cannot complete; errors such as the following appear immediately:

WARNING: Serious fd usage error 14: 'glib warning', file /home/cltbld/electrolysis/src/toolkit/xre/nsSigHandlers.cpp, line 199

** (Gecko:895): WARNING **: Serious fd usage error 14
WARNING: Serious fd usage error 12: 'glib warning', file /home/cltbld/electrolysis/src/toolkit/xre/nsSigHandlers.cpp, line 199

** (Gecko:895): WARNING **: Serious fd usage error 12

The same tests pass normally with OOP plugins disabled.  

Test command-line:

python runtests.py --setpref=dom.ipc.plugins.enabled=true --test-path=modules/plugin/test --autorun
Blocks: 523208
cjones, help
jgriffin, do you get these errors when running the tests locally, on your dev machine, or are these errors that show up on the Tinderboxen?

In the meantime, I'm going to try to reproduce this locally.

What I've found so far: it appears that the error codes are

  G_FILE_ERROR_NOSPC,  // == 12

  G_FILE_ERROR_MFILE,  // == 14

which the glib docs describe as

G_FILE_ERROR_NOSPC
	No space left on device; write operation on a file failed because the disk is full.

G_FILE_ERROR_MFILE
	The current process has too many files open and can't open any more. Duplicate descriptors do count toward this limit.

I need more information to diagnose why these errors might be occurring.
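
A minimal sketch for tallying these warnings in a test log, assuming the log lines look like the ones quoted in the report. The `KNOWN_CODES` table holds only the two values identified above (it is not the full glib enum), and `summarize_fd_errors` and `describe` are hypothetical helper names, not part of the test harness.

```python
import re
from collections import Counter

# Only the two codes identified in this thread; not the full glib enum.
KNOWN_CODES = {
    12: "G_FILE_ERROR_NOSPC",
    14: "G_FILE_ERROR_MFILE",
}

# Matches the warning text quoted in the report.
FD_ERROR_RE = re.compile(r"Serious fd usage error (\d+)")


def summarize_fd_errors(log_text):
    """Return {code: occurrences} for every fd usage warning in log_text."""
    counts = Counter(int(m.group(1)) for m in FD_ERROR_RE.finditer(log_text))
    return dict(counts)


def describe(code):
    """Map a numeric code to its name, or 'unknown' if not in KNOWN_CODES."""
    return KNOWN_CODES.get(code, "unknown")


if __name__ == "__main__":
    sample = (
        "WARNING: Serious fd usage error 14: 'glib warning'\n"
        "** (Gecko:895): WARNING **: Serious fd usage error 14\n"
    )
    for code, n in summarize_fd_errors(sample).items():
        print(code, describe(code), n)
```

Counting by code makes it easy to see whether one error dominates a long log.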
cjones, these occur on my local development machine, a Windows 7 VM.  My disk is definitely not full; I have 21 GB of free space.
Scratch that, this problem occurs on a CentOS 5 VM.
Hmm ... I can't repro on my Ubuntu 9.04 machine.  Does your CentOS 5 VM have enough disk space?
It seems like it should be enough:

Filesystem           1K-blocks      Used Available Use% Mounted on
/dev/sda1             18671036  13972420   3734736  79% /
tmpfs                   521724         0    521724   0% /dev/shm
/dev/sda2               507432    143448    337880  30% /var
.host:/              312235312 249193988  63041324  80% /mnt/hgfs
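
Both suspected conditions (a full disk and too many open files) can also be checked from inside a process rather than from `df` alone. The following is a sketch assuming a Linux host with `/proc` mounted; `resource_snapshot` is a hypothetical name, and it reports on the Python process that runs it, not on the Gecko process.

```python
import os
import resource
import shutil


def resource_snapshot(path="/"):
    """Collect free disk space for `path` and fd limits for this process."""
    usage = shutil.disk_usage(path)
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)

    # On Linux, /proc/self/fd lists this process's open descriptors.
    # The count includes the descriptor opened to read the directory itself.
    fd_dir = "/proc/self/fd"
    open_fds = len(os.listdir(fd_dir)) if os.path.isdir(fd_dir) else None

    return {
        "free_bytes": usage.free,
        "fd_soft_limit": soft,
        "fd_hard_limit": hard,
        "open_fds": open_fds,
    }


if __name__ == "__main__":
    for key, value in resource_snapshot("/").items():
        print(f"{key}: {value}")
```

For a process that is already running, listing `/proc/<pid>/fd` gives the same descriptor count for that process.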
Can you re-run these tests and attach a full log of the run here?
Attached file log
Log attached.  Apparently the problem occurs because it cannot launch a child process.  I'll make a new build and try again.
Huh, that's bad.  It's trying to exec "/".  ISTR this being fixed a while back; how old is your build?
Actually, ISTR smaug running into this on FC ... 9?  Last I recall, the bug was narrowed down to some STL string initialization problem.

smaug, what ended up happening with this?
I just pulled the latest code from the repo and made a fresh build, and this still occurs.
OK, sounds like the libstdc++ issue then.  Hopefully smaug can comment.
I had two different problems.
(1) E10s doesn't compile at all with older gcc on 64-bit Linux,
and
(2) with newer gcc, ac_add_options --enable-debug="-ggdb -feliminate-unused-debug-symbols" must not be used, or you get the "/" problem.
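
Since the second problem depends on one specific mozconfig line, a quick scan can tell whether a given mozconfig uses it. This is only a sketch: `find_risky_debug_options` is a hypothetical helper, and the only flag it looks for is the one named above.

```python
# The flag reported above as triggering the "/" problem with newer gcc.
RISKY_FLAG = "-feliminate-unused-debug-symbols"


def find_risky_debug_options(mozconfig_text):
    """Return active ac_add_options lines that pass RISKY_FLAG to --enable-debug."""
    hits = []
    for line in mozconfig_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("#"):
            continue  # commented-out options have no effect
        if (
            stripped.startswith("ac_add_options")
            and "--enable-debug" in stripped
            and RISKY_FLAG in stripped
        ):
            hits.append(stripped)
    return hits


if __name__ == "__main__":
    example = 'ac_add_options --enable-debug="-ggdb -feliminate-unused-debug-symbols"\n'
    for hit in find_risky_debug_options(example):
        print("check this line:", hit)
```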
Thanks!

jgriffin, looks like the temporary solution is to upgrade your linux distro.  See https://wiki.mozilla.org/Content_Processes/Build .  But if e10s is going to be merged into m-c, we ought to fix this.  I'll spin off another bug.  Please close this one if the distro upgrade fixes your problem.
(In reply to comment #14)
> Thanks!
> 
> jgriffin, looks like the temporary solution is to upgrade your linux distro. 
> See https://wiki.mozilla.org/Content_Processes/Build .  But if e10s is going to
> be merged into m-c, we ought to fix this.  I'll spin off another bug.  Please
> close this one if the distro upgrade fixes your problem.

Hey cjones & jgriffin, the tinderbox uses CentOS 5 as the build & test reference platform [1].  Can we make doubly certain that this isn't happening on the tinderbox builds? If we need to upgrade the standard build & test reference platform, that's a giant schedule risk and we'll need to let RelEng know ASAP.

[1] https://wiki.mozilla.org/ReferencePlatforms
Just finished debugging this.  We were walking a fine line with std::wstring in our -fshort-wchar builds.  In DEBUG, we were OK, and in OPT, due to inlining, we weren't.
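
For context, -fshort-wchar makes gcc treat wchar_t as a 2-byte type, whereas on Linux it is normally 4 bytes, so code built with the flag can disagree with code that assumes the default width about how a wide string is laid out. The sketch below only illustrates that kind of disagreement by writing a string with one unit width and reading it back with another. It is not a reconstruction of the actual bug or of the fix, and `reinterpret_wide` is a hypothetical helper.

```python
# Little-endian encodings whose code units match the two wchar_t widths.
CODECS = {2: "utf-16-le", 4: "utf-32-le"}


def reinterpret_wide(text, written_width, read_width):
    """Encode `text` with `written_width`-byte units, then read it back as
    NUL-terminated `read_width`-byte units. Returns the units as integers."""
    raw = text.encode(CODECS[written_width])
    units = []
    for i in range(0, len(raw) - read_width + 1, read_width):
        unit = int.from_bytes(raw[i:i + read_width], "little")
        if unit == 0:
            break  # a reader treats a zero unit as the string terminator
        units.append(unit)
    return units


if __name__ == "__main__":
    path = "/usr/bin"
    print(reinterpret_wide(path, 2, 2))  # widths agree: every character survives
    print(reinterpret_wide(path, 4, 2))  # widths disagree: reader stops early
```

With matching widths the whole string is recovered. When a string written with 4-byte units is read with 2-byte units, the reader sees a zero unit right after the first character and stops there, so "/usr/bin" comes back as "/" alone. Whether that is what produced the exec of "/" here is not established by this thread.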

If I'd just asked for mozconfigs in the first place, this could have been fixed weeks ago.  Live and learn.
Pushed http://hg.mozilla.org/projects/electrolysis/rev/96c251dc41d8.

This should have blocked m-c merge, but I don't see any point in marking it retroactively.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED