Closed
Bug 509960
Opened 15 years ago
Closed 15 years ago
Occasional link failures on win32 (LINK : fatal error LNK1104: cannot open file '<something>.lib') (usually jsctypes-test.lib)
Categories
(Release Engineering :: General, defect, P3)
Tracking
(Not tracked)
RESOLVED
WORKSFORME
People
(Reporter: jrmuizel, Unassigned)
References
Details
(Keywords: intermittent-failure, Whiteboard: [buildduty])
We seem to be getting occasional link failures on the Windows builds that look like:
LINK : fatal error LNK1104: cannot open file 'AccessibleMarshal.lib'
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1250066647.1250070149.28096.gz
or
LINK : fatal error LNK1104: cannot open file 'npwinless.lib'
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1250067338.1250069723.23446.gz
Comment 1•15 years ago
LINK : fatal error LNK1104: cannot open file 'IA2Marshal.lib'
make[7]: *** [IA2Marshal.dll] Error 80
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1252076099.1252077304.8565.gz&fulltext=1
Comment 2•15 years ago
This may be a race condition where the library doesn't yet exist when the link command is called. Are the dependencies set correctly? Can you test this with the try server and a mozconfig that has -j1 instead of -j4? This changeset looks like it may correct the first link failure:
http://hg.mozilla.org/mozilla-central/rev/cccf8041fc0b
Comment 3•15 years ago
I don't think this belongs in releng; where should we move this bug?
Comment 4•15 years ago
(In reply to comment #2)
> This may be a race condition where the library doesn't yet exist when the link
> command is called. Are the dependencies set correctly? Can you test this with
> try server and a mozconfig that has -j1 instead of -j4. This changeset looks
> like it may correct the first link failure
>
> http://hg.mozilla.org/mozilla-central/rev/cccf8041fc0b
That changeset is completely unrelated, FWIW. (It's just a packaging manifest change.)
If you look at the linker commandline and output in the build log:
link -NOLOGO -DLL -OUT:IA2Marshal.dll -PDB:IA2Marshal.pdb -SUBSYSTEM:WINDOWS dlldata.obj Accessible2_p.obj AccessibleAction_p.obj AccessibleApplication_p.obj AccessibleComponent_p.obj AccessibleEditableText_p.obj AccessibleHyperlink_p.obj AccessibleHypertext_p.obj AccessibleImage_p.obj AccessibleRelation_p.obj AccessibleTable_p.obj AccessibleText_p.obj AccessibleValue_p.obj Accessible2_i.obj AccessibleAction_i.obj AccessibleApplication_i.obj AccessibleComponent_i.obj AccessibleEditableText_i.obj AccessibleHyperlink_i.obj AccessibleHypertext_i.obj AccessibleImage_i.obj AccessibleRelation_i.obj AccessibleTable_i.obj AccessibleText_i.obj AccessibleValue_i.obj ./module.res -NXCOMPAT -SAFESEH -DYNAMICBASE -MANIFEST:NO -LIBPATH:"e:/builds/moz2_slave/mozilla-central-win32/build/obj-firefox/memory/jemalloc/crtsrc/build/intel" -NODEFAULTLIB:msvcrt -NODEFAULTLIB:msvcrtd -NODEFAULTLIB:msvcprt -NODEFAULTLIB:msvcprtd -DEFAULTLIB:mozcrt19 -DEFAULTLIB:mozcpp19 -DEBUG -OPT:REF -OPT:nowin98 -LTCG:PGINSTRUMENT -DEF:e:/builds/moz2_slave/mozilla-central-win32/build/accessible/public/ia2/IA2Marshal.def kernel32.lib rpcns4.lib rpcrt4.lib ole32.lib oleaut32.lib
Creating library IA2Marshal.lib and object IA2Marshal.exp
LINK : fatal error LNK1104: cannot open file 'IA2Marshal.lib'
You'll note that the linker is supposed to be creating the .lib file (it's the import library for the DLL it's creating). As to why it can't open the file, who knows? Clearly it can write to that directory. If we were out of disk space I think we'd error differently. Either there's something weirder happening here, or this is a compiler bug.
Comment 5•15 years ago
We will have to deal with this later on when we are freed up.
Component: Release Engineering → Release Engineering: Future
Comment 6•15 years ago
Ted, thanks for the last comment.
Comment 7•15 years ago
Comment 8•15 years ago
Is this possibly the same as bug 419445?
Comment 9•15 years ago
It looks like it, though I believe it'll increase the scope from just VS2008 to VS2008+VS2005.
Comment 10•15 years ago
Comment 11•15 years ago
This syndrome -- intermittent failures with a nonspecific error message -- makes me think that the problem is at the operating system level; for instance we might be hitting the global limit on the number of open files. I wish the linker printed the GetLastError() message instead of just "cannot open file".
Question for RelEng, does this consistently happen on one particular machine?
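To illustrate the kind of diagnostic wished for here, a minimal Python sketch follows. It is not what the linker does. The helper name `describe_open_failure` is hypothetical. It only shows how a tool can report the operating system's own error code next to a generic "cannot open file" message.

```python
import errno
import os


def describe_open_failure(path):
    """Try to open `path` for writing.

    Returns None on success, or a message that includes the OS-level
    error code instead of a bare "cannot open file".
    """
    try:
        fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    except OSError as exc:
        # Symbolic errno name, e.g. ENOENT, EACCES, EMFILE.
        code = errno.errorcode.get(exc.errno, "unknown")
        # On Windows, OSError also carries the raw GetLastError() value
        # as `winerror`; on other platforms the attribute is absent.
        winerror = getattr(exc, "winerror", None)
        return (
            f"cannot open file {path!r}: {code} "
            f"({exc.strerror}), winerror={winerror}"
        )
    os.close(fd)
    return None
```

With output like this, a sharing violation, a missing directory, and an exhausted handle table would each produce a different message, which is exactly the distinction LNK1104 hides.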
Comment 12•15 years ago
[ N.B. I don't know if Windows *has* a global limit on the number of open files. ]
Comment 13•15 years ago
It's certainly possible that this is a compiler or OS bug, or some other weird edge case that we're running into. It's just really hard to figure anything out without being able to consistently reproduce.
I agree that VC++ has terrible error messages, FWIW. :-/
Comment 14•15 years ago
Comment 15•15 years ago
No, that's not the same thing. Look at the error message! That appears to be fallout from bug 518107.
Comment 16•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.6/1254235707.1254237101.22872.gz&fulltext=1
WINNT 5.2 mozilla-1.9.2 build on 2009/09/29 07:48:27
LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'
make[6]: *** [jsctypes-test.dll] Error 80
make[6]: *** Deleting file `jsctypes-test.dll'
make[5]: *** [libs] Error 2
make[4]: *** [libs_tier_gecko] Error 2
make[3]: *** [tier_gecko] Error 2
make[2]: *** [default] Error 2
make[1]: Leaving directory `/e/builds/moz2_slave/mozilla-1.9.2-win32/build'
make[1]: *** [build] Error 2
make: *** [profiledbuild] Error 2
Comment 17•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1256641666.1256650763.4307.gz
WINNT 5.2 mozilla-central build on 2009/10/27 04:07:46
LINK : fatal error LNK1104: cannot open file 'npwinless.lib'
OS: Mac OS X → Windows NT
Comment 18•15 years ago
There's a possibility that we're running out of disk here (bug 522719). We had 8 and 9 GB free before the build, which would seem like enough, but I'm not sure how much temp space we use.
Otherwise, this might be a dupe of bug 419445.
Comment 19•15 years ago
So we could be running out of space on C here. moz2-win32-slave31 just died with a similar error, and had 681 MB free on C.
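One way to test the disk-space theory would be to log free space on each volume just before the link step. The sketch below is only an illustration: the helper names and the threshold in the usage note are assumptions, not part of the build system.

```python
import shutil


def free_megabytes(path):
    """Return free space, in whole megabytes, on the volume holding `path`."""
    return shutil.disk_usage(path).free // (1024 * 1024)


def has_enough_space(path, required_mb):
    """True if the volume holding `path` has at least `required_mb` MB free."""
    return free_megabytes(path) >= required_mb
```

For example, a build step could print `free_megabytes("C:\\")` and `free_megabytes("E:\\")` and warn when `has_enough_space("C:\\", 1024)` is false. The 1024 MB figure is arbitrary and would need tuning against real temp-space usage.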
Comment 22•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.6/1257444130.1257448311.29529.gz
WINNT 5.2 mozilla-1.9.2 build on 2009/11/05 10:02:10
LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'
Also, let's not have three open bugs for this. I merged the other two into this one as it seems to have more analysis.
Updated•15 years ago
Whiteboard: [orange]
Updated•15 years ago
OS: Windows NT → Windows Server 2003
Comment 23•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1258316917.1258317506.17115.gz
WINNT 5.2 mozilla-central build on 2009/11/15 12:28:37
"s: moz2-win32-slave06"
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1258644534.1258645310.19196.gz
WINNT 5.2 mozilla-central build on 2009/11/19 07:28:54
"s: moz2-win32-slave41"
LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'
make[6]: *** [jsctypes-test.dll] Error 80
Comment 25•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.6/1258669091.1258670404.18892.gz
This seems to be happening a bunch in 'jsctypes-test.lib', which really doesn't have much special about it - just an ordinary little test library, built as part of unit tests. This makes me think that there's a race going on, and maybe our dependencies are a little messed up.
Comment 26•15 years ago
I don't know what else would be racing with it. Only the linker produces the import library, nothing else does, and I don't see any other suspicious commands when I look at these logs.
Comment 27•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1259178814.1259180025.31314.gz
WINNT 5.2 mozilla-central build on 2009/11/25 11:53:34
"s: moz2-win32-slave09"
LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1260653501.1260654862.24272.gz
WINNT 5.2 mozilla-central build on 2009/12/12 13:31:41
LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'
Summary: Occasional link failures on win32 (LINK : fatal error LNK1104: cannot open file '<something>.lib') → Occasional link failures on win32 (LINK : fatal error LNK1104: cannot open file '<something>.lib') (usually jsctypes-test.lib)
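The "usually jsctypes-test.lib" observation can be checked mechanically by tallying the file named in each LNK1104 line. The following is a minimal sketch. The sample list is a hand-picked subset of lines quoted in this bug rather than a complete record, and the function name is hypothetical.

```python
import re
from collections import Counter

# Matches the file name inside an LNK1104 message.
LNK1104 = re.compile(r"LNK1104: cannot open file '([^']+)'")


def tally_missing_files(lines):
    """Count how often each file appears in LNK1104 errors."""
    counts = Counter()
    for line in lines:
        match = LNK1104.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


# Subset of log lines quoted in this bug (not exhaustive).
SAMPLE = [
    "LINK : fatal error LNK1104: cannot open file 'AccessibleMarshal.lib'",
    "LINK : fatal error LNK1104: cannot open file 'npwinless.lib'",
    "LINK : fatal error LNK1104: cannot open file 'IA2Marshal.lib'",
    "make[7]: *** [IA2Marshal.dll] Error 80",
    "LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'",
    "LINK : fatal error LNK1104: cannot open file 'npwinless.lib'",
    "LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'",
    "LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'",
]

counts = tally_missing_files(SAMPLE)
```

Running the same function over full build logs would show whether the skew toward one library holds up or merely reflects which failures were reported here.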
Comment 29•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1260786098.1260792840.11121.gz
s: win32-slave43
LINK : fatal error LNK1104: cannot open file 'crashinjectdll.lib'
Comment 30•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Places/1260817454.1260823020.6663.gz&fulltext=1
s: win32-slave12
builder: WINNT 5.2 places leak test build
LINK : fatal error LNK1104: cannot open file 'e:\builds\moz2_slave\places-win32-debug\build\obj-firefox\dist\lib\sqlite3.lib'
make[6]: Leaving directory `/e/builds/moz2_slave/places-win32-debug/build/security/nss/lib/softoken'
make[6]: *** [e:/builds/moz2_slave/places-win32-debug/build/obj-firefox/nss/softokn/softokn3.dll] Error 80
Updated•15 years ago
Whiteboard: [orange] → [orange][buildduty]
Updated•15 years ago
Component: Release Engineering: Future → Release Engineering
Comment 31•15 years ago
I am trying to see whether any slaves recur across these failures, but the first seven logs were unavailable.
Slaves from the four logs whose comments did not mention a slave name:
* win32-slave10
* WIN32-SLAVE36
* moz2-win32-slave37
* WIN32-SLAVE40
Restating slave names starting from comment 19:
* moz2-win32-slave06
* moz2-win32-slave09
* win32-slave12
* moz2-win32-slave31
* moz2-win32-slave41
* win32-slave43
I would like to discourage the theory of a machine-specific problem.
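The check above can be expressed as a small tally. This is a sketch under one stated assumption: names that differ only in letter case or in a leading "moz2-" prefix are treated as the same machine, which this bug does not confirm.

```python
from collections import Counter

# Slave names exactly as listed in this comment.
SLAVES = [
    "win32-slave10", "WIN32-SLAVE36", "moz2-win32-slave37", "WIN32-SLAVE40",
    "moz2-win32-slave06", "moz2-win32-slave09", "win32-slave12",
    "moz2-win32-slave31", "moz2-win32-slave41", "win32-slave43",
]


def normalize(name):
    """Lowercase and drop a leading 'moz2-' (assumed to be the same host)."""
    name = name.lower()
    prefix = "moz2-"
    return name[len(prefix):] if name.startswith(prefix) else name


counts = Counter(normalize(name) for name in SLAVES)
# Machines that appear in more than one failure.
repeat_offenders = {name: n for name, n in counts.items() if n > 1}
```

For the ten names listed, no machine repeats, so `repeat_offenders` is empty. Names from later comments could be appended to `SLAVES` to keep the check current.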
Comment 32•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey2.0/1261086893.1261087520.19828.gz
WINNT 5.2 comm-1.9.1 build on 2009/12/17 13:54:53
s: cb-seamonkey-win32-02
{
LINK : fatal error LNK1104: cannot open file 'xpcom.lib'
}
Comment 33•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1261267156.1261267864.18833.gz
WINNT 5.2 comm-central-trunk build on 2009/12/19 15:59:16
s: cb-seamonkey-win32-01
LINK : fatal error LNK1104: cannot open file 'xpcom.lib'
Comment 34•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=SeaMonkey/1261440351.1261441593.17997.gz
WINNT 5.2 comm-central-trunk build on 2009/12/21 16:05:51
s: cb-seamonkey-win32-01
LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'
Comment 35•15 years ago
I posted a question in the MSDN forums about this: http://social.msdn.microsoft.com/Forums/en-US/vcgeneral/thread/1c0fb80c-6227-400d-8470-c51f42bdf7c1
Not really expecting to get an answer there but it can't hurt.
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1261522328.1261523506.31957.gz
WINNT 5.2 mozilla-central build on 2009/12/22 14:52:08
s: win32-slave06
LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'
Comment 37•15 years ago
Looking back through the logs here, I found one instance where the build was a clobber: comment 29, http://tinderbox.mozilla.org/showlog.cgi?tree=Firefox&errorparser=unix&logfile=1260786098.1260792840.11121.gz&buildtime=1260786098&buildname=WINNT%205.2%20mozilla-central%20build&fulltext=1
So we can eliminate the possibility that the .lib file is being held open and can't be deleted or modified, given that it's not there to begin with.
Comment 38•15 years ago
(In reply to comment #37)
> Looking back through the logs here, I found one instance where the build was a
> clobber: comment 29,
> http://tinderbox.mozilla.org/showlog.cgi?tree=Firefox&errorparser=unix&logfile=1260786098.1260792840.11121.gz&buildtime=1260786098&buildname=WINNT%205.2%20mozilla-central%20build&fulltext=1
>
> So we can eliminate the possibility that the .lib file is being held open and
> can't be deleted or modified, given that it's not there to begin with.
That depends on whether this was a PGO build and, if so, whether this was the first or second pass through.
Comment 39•15 years ago
Ah. You're right, on the clobber it happened on the second pass. This hints at the pre-existence of the .lib file being relevant, but it's not strong evidence.
On Zack's MSDN forum post, someone suggested adding a rule to remove the .lib file (and friends?) before invoking 'link'. I'm not sure whether this would work if the problem is that the file is being held open, or whether it would interfere with PGO, though I'm guessing that data is stored in the .pdb file. How does this idea sound?
Comment 40•15 years ago
I think removing the import library before linking ought to be safe, but it's not something I've ever tested. I would assume that the linker would just regenerate it, given that it says it's going to in the output.
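A rough sketch of the "remove before linking" idea, written as a Python wrapper rather than a makefile rule. Everything here is an assumption for illustration: the function name is hypothetical, and taking "and friends" to mean the .exp file is a guess based on the linker output quoted earlier in this bug.

```python
import os
import subprocess


def link_with_clean_outputs(link_cmd, dll_path):
    """Delete stale .lib/.exp files next to `dll_path`, then run `link_cmd`.

    `link_cmd` is an argument list such as ["link", "-DLL", ...].
    Returns the exit code of the link command.
    """
    base, _ = os.path.splitext(dll_path)
    for ext in (".lib", ".exp"):
        stale = base + ext
        try:
            os.remove(stale)
        except FileNotFoundError:
            pass  # Nothing left over from a previous pass.
        # Any other OSError (for example a file held open by another
        # process on Windows) is deliberately allowed to propagate.
    return subprocess.call(link_cmd)
```

One side effect of this design is diagnostic: if another process really is holding the import library open, the delete would be expected to fail with a more specific error than LNK1104. Whether this interacts badly with PGO is untested here, as it is in the comment above.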
Comment 41•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox3.6/1262161104.1262161817.17143.gz&fulltext=1
LINK : fatal error LNK1104: cannot open file 'jsctypes-test.lib'
s: win32-slave11
Comment 42•15 years ago
http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1262209619.1262220995.6646.gz
WINNT 5.2 mozilla-central build on 2009/12/30 13:46:59
s: win32-slave26
LINK : fatal error LNK1104: cannot open file 'npwinless.lib'
Comment 43•15 years ago
This isn't being worked on at this time, off to the Future.
Component: Release Engineering → Release Engineering: Future
Comment 44•15 years ago
Mass move of bugs from Release Engineering:Future -> Release Engineering. See
http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
Priority: -- → P3
Comment 45•15 years ago
Comment 46•15 years ago
We could still hit this if a VM gets a build, but the combination of prioritizing ix machines for builds and running with -j1 has made this much less common.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WORKSFORME
Updated•12 years ago
Keywords: intermittent-failure
Updated•12 years ago
Whiteboard: [orange][buildduty] → [buildduty]
Updated•11 years ago
Product: mozilla.org → Release Engineering