Closed
Bug 1093664
Opened 10 years ago
Closed 9 years ago
Intermittent Windows 8 Build fail with LINK : fatal error LNK1102: out of memory
Categories
(Firefox Build System :: General, defect)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: cbook, Unassigned)
References
()
Details
(Keywords: intermittent-failure)
WINNT 6.1 x86-64 mozilla-central pgo-build
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=582941&repo=mozilla-central
Finished generating code
LINK : fatal error LNK1102: out of memory
Not sure if there is anything we can do here.
Comment 2•10 years ago
Well that's exciting. I'm guessing this is a bug in the 64-bit compiler?
Blocks: 1084162
Updated•10 years ago
Flags: needinfo?(dmajor)
Uh-oh. I was hoping it would be a one-off, but a second one makes me worry. I don't know of any good solution to this other than bringing back MSVC_ENABLE_PGO (possibly used more liberally) for Win64. :-( Anyone have any better ideas?
Flags: needinfo?(dmajor)
Comment 5•10 years ago
In bug 1093355, sfink is limiting the rooting analysis to -j4 to avoid using too much memory. Could we try that here, or is this a single process gobbling up all the memory on the builders?
It's a single instance of link.exe -- on my machine I've seen it take 5 or 6 GB.
So I guess one question is whether the machines are *actually* running out of memory or whether this is code for "some internal array reached some hardcoded limit".
Hmm it's interesting that in both cases we failed at the PGINSTRUMENT phase. I woulda thought that would be the easier of the two links!
I wonder if this could be useful. I'll try some local builds to see if it brings down peak memory usage.
http://msdn.microsoft.com/en-us/library/dn655038.aspx
> The /CGTHREADS option specifies the maximum number of threads cl.exe uses in parallel for the optimization and code-generation phases of compilation when link-time code generation (/LTCG) is specified. By default, cl.exe uses four threads, as if /CGTHREADS:4 were specified. If more processor cores are available, a larger number value can improve build times.
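For anyone wanting to repeat that local experiment, here is a minimal sketch of how the flag might be assembled. `ltcg_link_flags` is a hypothetical helper, not part of the Mozilla build system, and whether fewer code-generation threads actually lowers link.exe's peak memory is still the open question here. The 1 to 8 range check follows the MSDN page for /CGTHREADS; a PGO build would pass its own /LTCG variant rather than the plain `/LTCG` shown.

```python
def ltcg_link_flags(cg_threads=4):
    """Return linker flags for an LTCG link with a capped code-generation thread count.

    Hypothetical helper for illustration only.
    """
    # The documented range for /CGTHREADS is 1 to 8.
    if not 1 <= cg_threads <= 8:
        raise ValueError("cg_threads must be between 1 and 8")
    return ["/LTCG", "/CGTHREADS:%d" % cg_threads]


# Example: restrict code generation to a single thread for a local experiment.
print(" ".join(ltcg_link_flags(1)))
```

Comparing peak memory of the link at `/CGTHREADS:1` against the default of 4 would show whether the thread count matters at all.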
Comment 10•10 years ago
How much physical memory do the win64 builders have? Do they have swap disabled? Something to consider: nowadays, other things can happen while xul.dll is linking. If the linker likes to suck up all the memory it can and other things on the side suck up some memory too, overall, that could be a problem.
Comment 11•10 years ago
(In reply to David Major [:dmajor] (UTC+13) from comment #8)
> Hmm it's interesting that in both cases we failed at the PGINSTRUMENT phase.
> I woulda thought that would be the easier of the two links!

IME it always was, although I haven't looked in a long time. It was the PGUPDATE link that ate memory/cpu.

(In reply to David Major [:dmajor] (UTC+13) from comment #7)
> So I guess one question is whether the machines are *actually* running out
> of memory or whether this is code for "some internal array reached some
> hardcoded limit".

Yeah, I'm pretty sure we hit bugs like that in the x86 PGO linker before, where it simply overflowed some internal limit and died. I guess the only way to figure this out would be to try to reproduce and send MS a linkrepro?
Comment 13•10 years ago
Mark, how much physical memory and pagefile are on b-2008-ix-0012 and -0013? (And is it the same for all Windows builders?)
Flags: needinfo?(mcornmesser)
Comment 14•10 years ago
The physical memory is 4GB. The paging file is 4048MB. This should be the same across all the 2008 machines.
Flags: needinfo?(mcornmesser)
Comment 15•10 years ago
Seems totally possible that we could exhaust 8GB total physical+swap.
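To put rough numbers on that, here is a back-of-the-envelope sketch. It assumes "4GB" means 4096MB and treats RAM plus pagefile as the total the system can commit, which is only an approximation. The 5 and 6 GB linker peaks are the figures reported earlier in this bug from a developer machine, not measurements from the builders, and `commit_headroom_mb` is a name made up for this example.

```python
MB_PER_GB = 1024


def commit_headroom_mb(ram_mb, pagefile_mb, linker_peak_mb):
    """Memory left for everything else once the linker hits its peak.

    Treats RAM + pagefile as the total the system can commit, which is
    an approximation.
    """
    return ram_mb + pagefile_mb - linker_peak_mb


ram_mb = 4 * MB_PER_GB   # 4GB physical, assumed to mean 4096MB
pagefile_mb = 4048       # paging file size reported for the 2008 machines

for peak_gb in (5, 6):   # link.exe peaks seen on a developer machine
    left = commit_headroom_mb(ram_mb, pagefile_mb, peak_gb * MB_PER_GB)
    print("linker peak %dGB -> %dMB left for the OS and other jobs" % (peak_gb, left))
```

Under these assumptions a 6GB peak leaves about 2000MB for the OS and anything else running alongside the link, so exhausting the total is not far-fetched.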
Comment 16•10 years ago
Wow! That's surprisingly low for year 2014.
Comment 30•10 years ago
So....we're pretty much hosed here is what everybody is saying?
Comment 31•10 years ago
Though I suppose there's no reason to panic since we aren't permafailing yet.
Comment 32•10 years ago
Well, it sounds like there's no 'no cost' solution here at least. Would it be possible to give the existing machines more RAM, or are they already maxed out?
Comment 38•9 years ago
As a stopgap we should try doubling the pagefile to 8192MB. But long-term, if we stay at 4GB RAM, we're gonna have a bad time.
Comment 43•9 years ago
Out of curiosity, what would it take for us to upgrade the physical RAM in all of our Windows build slaves? Why do I have this recollection that we actually did something similar once long ago?
Flags: needinfo?(laura)
Comment 44•9 years ago
Of course, this also might get rendered moot if we can switch to building on AWS with MSVC2013 Community Edition. Ted, is there a bug for that?
Flags: needinfo?(ted)
Comment 51•9 years ago
Jordan, is b-2008-ix-0051 on your list?
Flags: needinfo?(laura) → needinfo?(jlund)
Comment 52•9 years ago
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #51)
> Jordan, is b-2008-ix-0051 on your list?

0051 is in the try pool. I only looked at the build pool in https://bugzilla.mozilla.org/show_bug.cgi?id=1122975#c6

The try pool audit will be completed in https://bugzil.la/1125870 and, depending on how many show up with only 4GB of RAM, we will be requesting a RAM bump for those too.

fwiw I just checked 0051 by hand and can confirm it only has 4GB.
Flags: needinfo?(jlund)
Comment 54•9 years ago
Inactive; closing (see bug 1180138).
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Updated•6 years ago
Product: Core → Firefox Build System