Closed Bug 337887 Opened 18 years ago Closed 18 years ago

mingw build failure in sslsock.c

Categories

(NSS :: Build, defect, P1)

3.11
x86
Windows XP

Tracking

(Not tracked)

RESOLVED FIXED
3.11.2

People

(Reporter: d_king, Unassigned)

References

Details

Attachments

(4 files, 1 obsolete file)

With WinXP SP2 cygwin MinGW I have started getting this build error :- /cygdrive/e/mozilla/source/mozilla/build/cygwin-wrapper gcc -mtune=athlon-xp -ma rch=athlon-xp -mmmx -msse -m3dnow -mno-cygwin -o e:/mozilla/source/obj/mozilla/n ss/ssl/sslsock.o -c -g -mno-cygwin -mms-bitfields -DXP_PC -DDEBUG -D_DEBUG -UNDE BUG -DDEBUG_David -DWIN32 -D_WINDOWS -D_X86_ -DWIN95 -DIN_LIBSSL -Ie:/mozilla/so urce/obj/mozilla/dist/include/nspr -Ie:/mozilla/source/obj/mozilla/dist/include -Ie:/mozilla/source/obj/mozilla/dist/public/nss -Ie:/mozilla/source/obj/mozilla /dist/private/nss -Ie:/mozilla/source/obj/mozilla/dist/include sslsock.c sslsock.c:1876: error: initializer element is not constant sslsock.c:1876: error: (near initialization for `ssl_methods.acceptread') sslsock.c:1884: error: initializer element is not constant sslsock.c:1884: error: (near initialization for `ssl_methods.sendfile') make[6]: *** [e:/mozilla/source/obj/mozilla/nss/ssl/sslsock.o] Error 1 make[6]: Leaving directory `/cygdrive/e/mozilla/source/mozilla/security/nss/lib/ ssl'
NSS version is what is used in current Mozilla/SeaMonkey trunk.
Assignee: dveditz → nobody
Component: Security → Libraries
Product: Mozilla Application Suite → NSS
QA Contact: seamonkey → libraries
Version: Trunk → unspecified
The address of an external function is certainly a constant. What version of gcc are you using?
gcc 3.4.5
Attached patch Proposed patchSplinter Review
David, please try this patch. Nelson, the two external functions in question (PR_EmudateAcceptRead and PR_EmulateSendFile) are defined in another shared library (libnspr4.so/nspr4.dll). Perhaps the compiler doesn't consider the functions' addresses constant because of the relocation of shared libraries. This patch is my alternate patch described in bug 318217 comment 1 and proposed in bug 318217 comment 2, although the constness of external function symbols wasn't the reason I proposed this alternate patch then.
Attachment #222052 - Flags: superreview?(rrelyea)
Attachment #222052 - Flags: review?(nelson)
*** Bug 338015 has been marked as a duplicate of this bug. ***
My vote is to mark this bug invalid. This seems to be a compiler bug. The code is correct. We're able to compile it with gcc on numerous platforms, including linux. The fact that a certain version of gcc is unable to properly build it should not cause us to change correct code. I think the submittor should get a working gcc.
The way that calls to other shared libs are handled on all Windows platforms is that a table of JUMP instructions is created at the beginning (or end) of each shared lib. Those jump instructions jump to functions in other shared libs. The run time loader changes the addresses in those jump instructs when the shared lib is loaded. Inside the share lib, calls to those other shared libs' functions emit code to call the jump instruction in the local shared lib. The address of the jump instruction in the local shared lib is always a constant. So, the compiler that claims that the address of an external function is not a constant is just broken. It needs to be replaced.
Comment on attachment 222052 [details] [diff] [review] Proposed patch I oppose forcing the NSS developers to reduce the subset of valid C that they may use when coding NSS, every time that some broken c compiler is found in the wild. I think we must stand our ground and force the eradication of such compilers.
Attachment #222052 - Flags: review?(nelson) → review-
Comment on attachment 222052 [details] [diff] [review] Proposed patch My build completes with this patch.
(In reply to comment #9) > (From update of attachment 222052 [details] [diff] [review] [edit]) > My build completes with this patch. Chris, Which version of gcc are you using? I'm on v3.4.5, and can downgrade to v3.4.2, but I'd rather know which version(s) are supported by NSS before I spend all my time finding out the hard way.
Nelson, I don't have time to do more than proposing a workaround. David, Chris, Bengt-Erik: perhaps you can demonstrate to Nelson that you have made a reasonable effort to contact the MinGW maintainers and have this MinGW bug fixed.
Comment on attachment 222052 [details] [diff] [review] Proposed patch This patch fixes my build
I found that this problem is documented in the GCC manual. For example, in http://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Function-Attributes.html#index-g_t_0040code_007b_005f_005fdeclspec_0028dllimport_0029_007d-1639 dllimport ... One drawback to using this attribute is that a pointer to a function or variable marked as dllimport cannot be used as a constant address. The attribute can be disabled for functions by setting the -mnop-fun-dllimport flag.
This blog entry explains what __declspec(dllimport) does: http://blogs.msdn.com/larryosterman/archive/2004/12/14/301225.aspx So what Nelson described in comment 7 is the "thunk". When we declare a function with __declspec(dllimport), the thunk isn't generated. I guess MSVC must have generated the thunk anyway when the code tries to use a function pointer declared with __declspec(dllimport) as a constant address. Perhaps GCC should do the same. We can try compiling the sslsock.c file or the whole mozilla/security/nss/lib/ssl directory with the -mnop-fun-dllimport flag.
Now that I've got to the bottom of this, I can propose a second solution. We manually declare those two functions without the dllimport attribute, so that their addresses can be used to initialize the const ssl_methods structure. David, Chris, please test this patch. Chris, Nelson, please review this patch. Nelson, if you don't like this patch either, please let me know which of the following workarounds is acceptable: - Compile sslsock.c with the -mnop-fun-dllimport flag. - If __MINGW32__ is defined, declare ssl_methods without const. I prefer the original proposed patch because it doesn't require any ifdef in the makefile or C code.
Attachment #222225 - Flags: superreview?(nelson)
Attachment #222225 - Flags: review?(cls)
Wan-Teh, you beat me to the punch. :-) I'm fine with just adding -mnop-fun-dllimport unconditionally to the NSS build to avoid limiting the NSS developers in other modules. Adding this flag causes my build to complete (though I can't test it until I can reboot into XP later).
Attachment #222229 - Flags: review?(wtchang)
Comment on attachment 222229 [details] [diff] [review] Solution 3: add -mnop-fun-dllimport to WIN32.mk r=wtc. Let's add a comment to explain why we added the -mnop-fun-dllimport flag. Otherwise it'll be difficult to ever remove this flag in the future. Are you sure you want to compile the whole NSS with this flag? This flag degrades the performance of calls to functions defined in other DLLs slightly.
Attachment #222229 - Flags: review?(wtchang) → review+
Comment on attachment 222225 [details] [diff] [review] Solution 2: manually declare those two functions without the dllimport attribute My build completes with this patch as well.
Attachment #222225 - Flags: review?(cls) → review+
I'm fine either way as long as the builds work. The mingw NSS build isn't running "optimally" due to bug 202293 anyway. I figured that adding the flag globally would be less intrusive than ifdefs and Nelson made it sound like this sort of construct could be used in other places so the global flag avoids needing to rehash this.
Summary: Build failure in sslsock.c → mingw build failure in sslsock.c
It seems to me that the whole POINT of the dllimport declaration is to cause the generation of the thunk. Without dllimport, will the thunk be generated? Will it link properly, resolving the address to the real function, without it?
I'd like some MINGW user to test solutions 2 and 3 for us, and let us know if they work, and what (if any) warnings or errors they produce. I would _expect_ that the result of removing the dllimport declaration and of setting -mnop-fun-dllimport would be to suppress the generation of the necessary thunks. But if not, and if Solution 3 works and doesn't incur some performance penalty, I'd prefer Solution 3.
(In reply to comment #17) > Are you sure you want to compile the whole NSS with this > flag? This flag degrades the performance of calls to > functions defined in other DLLs slightly. What is the nature of this performance degradation? Why is it worse than a thunk? Or is it exactly the cost of a thunk? (in which case it's fine)
Priority: -- → P1
Target Milestone: --- → 3.11.2
Version: unspecified → 3.11
Comment on attachment 222225 [details] [diff] [review] Solution 2: manually declare those two functions without the dllimport attribute My build compiles fine with this patch.
(In reply to comment #21) > I'd like some MINGW user to test solutions 2 and 3 for us, and let us know > if they work, and what (if any) warnings or errors they produce. > > I would _expect_ that the result of removing the dllimport declaration and > of setting -mnop-fun-dllimport would be to suppress the generation of the > necessary thunks. But if not, and if Solution 3 works and doesn't incur > some performance penalty, I'd prefer Solution 3. > As a MINGW user I have tested solution 3 ( setting -mnop-fun-dllimport ) and can confirm it is working. There were no warnings or errors.
Nelson, your comment 20 and comment 21 showed that your definition of a thunk is different from the definition of a thunk in the blog entry I cited in comment 14. I will use the definition from the blog entry. A thunk is a stub function, e.g., PR_EmulateAcceptRead, whose body consists of the instruction jmp [_imp__EmulateAcceptRead] where _imp__PR_EmulateAcceptRead is a global variable. If PR_EmulateAcceptRead is declared without dllimport, a call to PR_EmulateAcceptRead is implemented as this: call PR_EmulateAcceptRead # call the thunk (stub function) If PR_EmulateAcceptRead is declared with dllimport, a call to PR_EmulateAcceptRead is implemented as this: call [_imp__EmulateAcceptRead] saving the jmp instruction. When we declare PR_EmulateAcceptRead, GCC apparently uses _imp__EmulateAcceptRead as the pointer to PR_EmulateAcceptRead and considers it a non-const function address. So the performance degradation of the -mnop-fun-dllimport flag is that all the calls to NSPR functions from NSS will go through the thunks (stub functions) and incur the overhead of the jmp instructions. This is a negligible performance degradation for NSS. I've checked in this patch on the NSS trunk (3.12) and NSS_3_11_BRANCH (3.11.2). I will check in this patch on the Mozilla trunk and MOZILLA_1_8_BRANCH later this week.
Attachment #222229 - Attachment is obsolete: true
Attachment #222375 - Flags: review?(cls)
Attachment #222375 - Flags: review?(cls) → review+
Comment on attachment 222375 [details] [diff] [review] Solution 3: add -mnop-fun-dllimport to WIN32.mk (as checked in) In reply to comment 24, Actually, my definition of "thunk" is exactly the same as the one you cited. While thunks do add an extra jump instruction to most inter-DLL calls, they also potentially reduce (greatly) the number of places in the code that need to be "fixed up" by the run time loader when DLLs are loaded. All the calls to a particular function share a common thunk, and only that one thunk needs to be "fixed up". The compiler/linker puts all the thunks for a DLL (thunks are associated with the DLL/EXE that makes the calls) together on a page, (er, at least the MSVC compiler does that). Consequently, only that "thunk" page needs to be fixed up by the run time loader, and all the rest of the text segment in the DLL/EXE can be an exact copy of the text segment in the DLL/EXE in the file. This in turn allows multiple processes to share many more identical text pages than they would if each text page had fixups that made it unique. And it means that those text pages needn't be swapped out, since they can just be reloaded directly from the EXE/DLL file. So, I think thunks are generally beneficial to the system as a whole. As for a need to be writable to avoid thunks, I'm not convinced. The run time loader makes the thunk page ReadOnly after the fixups are done. But no matter, thunks are still a better solution, IMO. I'm actually surprised that Unix systems haven't adopted them. Maybe a patent? I think the -mnop-fun-dllimport flag basically makes gcc emulate MSVC with respect to generating thunks, and I think that's good, too. So, I completely agree with checking in "Solution 3", and with making it apply to all of NSS.
Attachment #222375 - Flags: superreview+
Nelson, Sorry, one sentence in my comment 25 has two mistakes. It should read: When we declare PR_EmulateAcceptRead with dllimport, GCC apparently uses [_imp__EmulateAcceptRead] as the pointer to PR_EmulateAcceptRead and considers it a non-const function address. Note the square brackets around _imp__EmulateAcceptRead. The square brackets mean the value at that address. So we only need to fix up the value of the global variable _imp__EmulateAcceptRead. Note that the instruction when dllimport is used is call [_imp__EmulateAcceptRead] not call _imp__EmulateAcceptRead
Comment on attachment 222225 [details] [diff] [review] Solution 2: manually declare those two functions without the dllimport attribute Marking r-, since another patch has already been committed.
Attachment #222225 - Flags: superreview?(nelson) → superreview-
Whiteboard: awaiting checkin on 1.8 branch
I just checked in this patch on the Mozilla trunk (1.9 alpha). Checking in client.mk; /cvsroot/mozilla/client.mk,v <-- client.mk new revision: 1.281; previous revision: 1.280 done
I just checked in the fix on the MOZILLA_1_8_BRANCH (1.8 alpha 3).
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Whiteboard: awaiting checkin on 1.8 branch
Component: Libraries → Build
Attachment #222052 - Flags: superreview?(rrelyea)
Blocks: 1151447
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: