If you think a bug might affect users in the 57 release, please set the correct tracking and status flags for Release Management.

Mach sometimes hits an IOError when building rapidly

RESOLVED FIXED in mozilla24

Status

()

Core
Build Config
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: johns, Assigned: gps)

Tracking

unspecified
mozilla24
x86_64
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [mach])

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
While compiling with -j64 in a directory that would have a 100% ccache hitrate, causing really fast file spew:

>  1:35.54 nsMacGreekToUnicode.cpp
>  1:35.54 nsMacGujaratiToUnicode.cpp
>  1:35.55 nsMacGurmukhiToUnicode.cpp
>  1:35.55 nsMacHebrewToUnicode.cpp
>  1:35.55 nsMacIcelandicToUnicode.cpp
> TIER: base nspr nss js platform app SUBTIER: export libs tools DIRECTORIES: 03/81 (intl)Exception in thread Thread-1:
> Traceback (most recent call last):
>   File "/usr/lib/python2.7/threading.py", line 808, in __bootstrap_inner
>     self.run()
>   File "/usr/lib/python2.7/threading.py", line 761, in run
>     self.__target(*self.__args, **self.__kwargs)
>   File "/home/johns/moz/moz-git/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 698, in _processOutput
>     self.processOutputLine(line.rstrip())
>   File "/home/johns/moz/moz-git/testing/mozbase/mozprocess/mozprocess/processhandler.py", line 663, in processOutputLine
>     handler(line)
>   File "/home/johns/moz/moz-git/python/mach/mach/mixin/process.py", line 86, in handleLine
>     line_handler(line)
>   File "/home/johns/moz/moz-git/python/mozbuild/mozbuild/mach_commands.py", line 244, in on_line
>     self.log(logging.INFO, 'build_output', {'line': line}, '{line}')
>   File "/home/johns/moz/moz-git/python/mach/mach/mixin/logging.py", line 54, in log
>     extra={'action': action, 'params': params})
>   File "/usr/lib/python2.7/logging/__init__.py", line 1213, in log
>     self._log(level, msg, args, **kwargs)
>   File "/usr/lib/python2.7/logging/__init__.py", line 1268, in _log
>     self.handle(record)
>   File "/usr/lib/python2.7/logging/__init__.py", line 1278, in handle
>     self.callHandlers(record)
>   File "/usr/lib/python2.7/logging/__init__.py", line 1318, in callHandlers
>     hdlr.handle(record)
>   File "/usr/lib/python2.7/logging/__init__.py", line 749, in handle
>     self.emit(record)
>   File "/home/johns/moz/moz-git/python/mozbuild/mozbuild/mach_commands.py", line 73, in emit
>     self.fh.write('\n')
> IOError: [Errno 5] Input/output error

mach then sat hung until interrupted. Resuming the build worked as expected.
Achievement unlocked.

I have no clue what's going on here. It smells like a multi-threading issue, but the GIL should protect us. Hmm...

Comment 2

4 years ago
This does look vaguely though frighteningly familiar.  What does `python --version` say?
(Reporter)

Comment 3

4 years ago
(In reply to Jeff Hammel [:jhammel] from comment #2)
> This does look vaguely though frighteningly familiar.  What does `python
> --version` say?

Python 2.7.5

Comment 4

4 years ago
Hmmm.....well, that should be about as good as possible :(  Searching near and far has so far not yielded me results.
Created attachment 764241 [details] [diff] [review]
Acquire lock before writing to terminal

I'm guessing this is a multi-threading issue that occurs when trying to
.write() to a fd when a write is in progress. Python's logging
infrastructure has the ability to lock, so let's use it.
Attachment #764241 - Flags: review?(ted)
Assignee: nobody → gps
Attachment #764241 - Flags: review?(ted) → review+
https://hg.mozilla.org/integration/mozilla-inbound/rev/aa1227e3d6ba
Target Milestone: --- → mozilla24
Also, -j64 is insane. I have data showing that -j greater than number of cores doesn't yield a speedup. If -j is *much* greater than number of cores, you're just creating excessive context switching and resource contention and likely making the build slower as a result. I welcome data proving me wrong. You can use my patches in bug 883209 to record and visualize resource usage when building.
Moving component because this was an issue with the |mach build| command and not the mach core.
Status: NEW → ASSIGNED
Component: mach → Build Config
Whiteboard: [mach]
(Reporter)

Comment 9

4 years ago
(In reply to Gregory Szorc [:gps] from comment #7)
> Also, -j64 is insane. I have data showing that -j greater than number of
> cores doesn't yield a speedup. If -j is *much* greater than number of cores,
> you're just creating excessive context switching and resource contention and
> likely making the build slower as a result. I welcome data proving me wrong.
> You can use my patches in bug 883209 to record and visualize resource usage
> when building.

I have -j64 because I'm using distcc with 32 cores. It actually does hurt a bit when doing non-distributable things like idl processing (it'd be neat to have separate job limits for compiler vs other jobs) but is needed to saturate the distcc network
https://hg.mozilla.org/mozilla-central/rev/aa1227e3d6ba
Status: ASSIGNED → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
(Reporter)

Updated

4 years ago
Blocks: 886498
You need to log in before you can comment on or make changes to this bug.