Closed Bug 337887 Opened 14 years ago Closed 13 years ago

mingw build failure in sslsock.c

Categories

(NSS :: Build, defect, P1, major)

3.11
x86
Windows XP

Tracking

(Not tracked)

RESOLVED FIXED
3.11.2

People

(Reporter: d_king, Unassigned)

References

Details

Attachments

(4 files, 1 obsolete file)

With WinXP SP2 cygwin MinGW I have started getting this build error :-

/cygdrive/e/mozilla/source/mozilla/build/cygwin-wrapper gcc -mtune=athlon-xp -ma
rch=athlon-xp -mmmx -msse -m3dnow -mno-cygwin -o e:/mozilla/source/obj/mozilla/n
ss/ssl/sslsock.o -c -g -mno-cygwin -mms-bitfields -DXP_PC -DDEBUG -D_DEBUG -UNDE
BUG -DDEBUG_David -DWIN32 -D_WINDOWS -D_X86_ -DWIN95 -DIN_LIBSSL -Ie:/mozilla/so
urce/obj/mozilla/dist/include/nspr -Ie:/mozilla/source/obj/mozilla/dist/include
 -Ie:/mozilla/source/obj/mozilla/dist/public/nss -Ie:/mozilla/source/obj/mozilla
/dist/private/nss -Ie:/mozilla/source/obj/mozilla/dist/include   sslsock.c
sslsock.c:1876: error: initializer element is not constant
sslsock.c:1876: error: (near initialization for `ssl_methods.acceptread')
sslsock.c:1884: error: initializer element is not constant
sslsock.c:1884: error: (near initialization for `ssl_methods.sendfile')
make[6]: *** [e:/mozilla/source/obj/mozilla/nss/ssl/sslsock.o] Error 1
make[6]: Leaving directory `/cygdrive/e/mozilla/source/mozilla/security/nss/lib/
ssl'
NSS version is what is used in current Mozilla/SeaMonkey trunk.
Assignee: dveditz → nobody
Component: Security → Libraries
Product: Mozilla Application Suite → NSS
QA Contact: seamonkey → libraries
Version: Trunk → unspecified
The address of an external function is certainly a constant.
What version of gcc are you using?
gcc 3.4.5
Attached patch Proposed patchSplinter Review
David, please try this patch.

Nelson, the two external functions in question
(PR_EmudateAcceptRead and PR_EmulateSendFile) are
defined in another shared library (libnspr4.so/nspr4.dll).
Perhaps the compiler doesn't consider the functions'
addresses constant because of the relocation of shared
libraries.  This patch is my alternate patch described
in bug 318217 comment 1 and proposed in bug 318217 comment
2, although the constness of external function symbols
wasn't the reason I proposed this alternate patch then.
Attachment #222052 - Flags: superreview?(rrelyea)
Attachment #222052 - Flags: review?(nelson)
*** Bug 338015 has been marked as a duplicate of this bug. ***
My vote is to mark this bug invalid.
This seems to be a compiler bug.
The code is correct.
We're able to compile it with gcc on numerous platforms, including linux.
The fact that a certain version of gcc is unable to properly build it 
should not cause us to change correct code.
I think the submittor should get a working gcc.
The way that calls to other shared libs are handled on all Windows platforms
is that a table of JUMP instructions is created at the beginning (or end) 
of each shared lib.  Those jump instructions jump to functions in other
shared libs.  The run time loader changes the addresses in those jump 
instructs when the shared lib is loaded.  Inside the share lib, calls to
those other shared libs' functions emit code to call the jump instruction
in the local shared lib.  The address of the jump instruction in the local
shared lib is always a constant.  

So, the compiler that claims that the address of an external function is 
not a constant is just broken.  It needs to be replaced.
Comment on attachment 222052 [details] [diff] [review]
Proposed patch

I oppose forcing the NSS developers to reduce the subset of valid C
that they may use when coding NSS, every time that some broken c
compiler is found in the wild.  
I think we must stand our ground and force the eradication of such
compilers.
Attachment #222052 - Flags: review?(nelson) → review-
Comment on attachment 222052 [details] [diff] [review]
Proposed patch

My build completes with this patch.
(In reply to comment #9)
> (From update of attachment 222052 [details] [diff] [review] [edit])
> My build completes with this patch.

Chris,

Which version of gcc are you using? I'm on v3.4.5, and can downgrade to v3.4.2, but I'd rather know which version(s) are supported by NSS before I spend all my time finding out the hard way. 
Nelson, I don't have time to do more than proposing
a workaround.

David, Chris, Bengt-Erik: perhaps you can demonstrate
to Nelson that you have made a reasonable effort to
contact the MinGW maintainers and have this MinGW bug
fixed.

Comment on attachment 222052 [details] [diff] [review]
Proposed patch

This patch fixes my build
I found that this problem is documented in the GCC manual.
For example, in
http://gcc.gnu.org/onlinedocs/gcc-3.3.6/gcc/Function-Attributes.html#index-g_t_0040code_007b_005f_005fdeclspec_0028dllimport_0029_007d-1639

dllimport

    ...

    One drawback to using this attribute is that a pointer
    to a function or variable marked as dllimport cannot be
    used as a constant address. The attribute can be disabled
    for functions by setting the -mnop-fun-dllimport flag.
This blog entry explains what __declspec(dllimport) does:
http://blogs.msdn.com/larryosterman/archive/2004/12/14/301225.aspx

So what Nelson described in comment 7 is the "thunk".  When
we declare a function with __declspec(dllimport), the thunk
isn't generated.  I guess MSVC must have generated the thunk
anyway when the code tries to use a function pointer declared
with __declspec(dllimport) as a constant address.  Perhaps GCC
should do the same.

We can try compiling the sslsock.c file or the whole
mozilla/security/nss/lib/ssl directory with the -mnop-fun-dllimport
flag.
Now that I've got to the bottom of this, I can propose
a second solution.  We manually declare those two functions
without the dllimport attribute, so that their addresses
can be used to initialize the const ssl_methods structure.

David, Chris, please test this patch.

Chris, Nelson, please review this patch.

Nelson, if you don't like this patch either, please let me
know which of the following workarounds is acceptable:
- Compile sslsock.c with the -mnop-fun-dllimport flag.
- If __MINGW32__ is defined, declare ssl_methods without
  const.

I prefer the original proposed patch because it doesn't
require any ifdef in the makefile or C code.
Attachment #222225 - Flags: superreview?(nelson)
Attachment #222225 - Flags: review?(cls)
Wan-Teh, you beat me to the punch. :-)

I'm fine with just adding -mnop-fun-dllimport unconditionally to the NSS build to avoid limiting the NSS developers in other modules.  Adding this flag causes my build to complete (though I can't test it until I can reboot into XP later).
Attachment #222229 - Flags: review?(wtchang)
Comment on attachment 222229 [details] [diff] [review]
Solution 3: add -mnop-fun-dllimport to WIN32.mk

r=wtc.

Let's add a comment to explain why we added the
-mnop-fun-dllimport flag.  Otherwise it'll be difficult
to ever remove this flag in the future.

Are you sure you want to compile the whole NSS with this
flag?  This flag degrades the performance of calls to
functions defined in other DLLs slightly.
Attachment #222229 - Flags: review?(wtchang) → review+
Comment on attachment 222225 [details] [diff] [review]
Solution 2: manually declare those two functions without the dllimport attribute

My build completes with this patch as well.
Attachment #222225 - Flags: review?(cls) → review+
I'm fine either way as long as the builds work.  The mingw NSS build isn't running "optimally" due to bug 202293 anyway.  I figured that adding the flag globally would be less intrusive than ifdefs and Nelson made it sound like this sort of construct could be used in other places so the global flag avoids needing to rehash this.
Summary: Build failure in sslsock.c → mingw build failure in sslsock.c
It seems to me that the whole POINT of the dllimport declaration is to cause
the generation of the thunk.   Without dllimport, will the thunk be generated?
Will it link properly, resolving the address to the real function, without it?
I'd like some MINGW user to test solutions 2 and 3 for us, and let us know
if they work, and what (if any) warnings or errors they produce.

I would _expect_ that the result of removing the dllimport declaration and
of setting -mnop-fun-dllimport would be to suppress the generation of the
necessary thunks.  But if not, and if Solution 3 works and doesn't incur
some performance penalty, I'd prefer Solution 3.
(In reply to comment #17)

> Are you sure you want to compile the whole NSS with this
> flag?  This flag degrades the performance of calls to
> functions defined in other DLLs slightly.

What is the nature of this performance degradation?  
Why is it worse than a thunk?
Or is it exactly the cost of a thunk?  (in which case it's fine)
Priority: -- → P1
Target Milestone: --- → 3.11.2
Version: unspecified → 3.11
Comment on attachment 222225 [details] [diff] [review]
Solution 2: manually declare those two functions without the dllimport attribute

My build compiles fine with this patch.
(In reply to comment #21)
> I'd like some MINGW user to test solutions 2 and 3 for us, and let us know
> if they work, and what (if any) warnings or errors they produce.
> 
> I would _expect_ that the result of removing the dllimport declaration and
> of setting -mnop-fun-dllimport would be to suppress the generation of the
> necessary thunks.  But if not, and if Solution 3 works and doesn't incur
> some performance penalty, I'd prefer Solution 3.
> 

As a MINGW user I have tested solution 3 ( setting -mnop-fun-dllimport ) and can confirm it is working. There were no warnings or errors.

Nelson, your comment 20 and comment 21 showed that your
definition of a thunk is different from the definition
of a thunk in the blog entry I cited in comment 14.  I
will use the definition from the blog entry.

A thunk is a stub function, e.g., PR_EmulateAcceptRead,
whose body consists of the instruction

    jmp [_imp__EmulateAcceptRead]

where _imp__PR_EmulateAcceptRead is a global variable.

If PR_EmulateAcceptRead is declared without dllimport,
a call to PR_EmulateAcceptRead is implemented as this:

    call PR_EmulateAcceptRead  # call the thunk (stub function)

If PR_EmulateAcceptRead is declared with dllimport,
a call to PR_EmulateAcceptRead is implemented as this:

    call [_imp__EmulateAcceptRead]

saving the jmp instruction.

When we declare PR_EmulateAcceptRead, GCC apparently uses
_imp__EmulateAcceptRead as the pointer to PR_EmulateAcceptRead
and considers it a non-const function address.

So the performance degradation of the -mnop-fun-dllimport flag
is that all the calls to NSPR functions from NSS will go through
the thunks (stub functions) and incur the overhead of the jmp
instructions.  This is a negligible performance degradation for
NSS.

I've checked in this patch on the NSS trunk (3.12) and
NSS_3_11_BRANCH (3.11.2).  I will check in this patch on the
Mozilla trunk and MOZILLA_1_8_BRANCH later this week.
Attachment #222229 - Attachment is obsolete: true
Attachment #222375 - Flags: review?(cls)
Attachment #222375 - Flags: review?(cls) → review+
Comment on attachment 222375 [details] [diff] [review]
Solution 3: add -mnop-fun-dllimport to WIN32.mk (as checked in)

In reply to comment 24,

Actually, my definition of "thunk" is exactly the same as the one you cited.  

While thunks do add an extra jump instruction to most inter-DLL calls,
they also potentially reduce (greatly) the number of places in the code
that need to be "fixed up" by the run time loader when DLLs are loaded.
All the calls to a particular function share a common thunk, and only
that one thunk needs to be "fixed up".

The compiler/linker puts all the thunks for a DLL (thunks are associated
with the DLL/EXE that makes the calls) together on a page, (er, at least
the MSVC compiler does that).  Consequently, only that "thunk" page needs 
to be fixed up by the run time loader, and all the rest of the text 
segment in the DLL/EXE can be an exact copy of the text segment in the 
DLL/EXE in the file.  This in turn allows multiple processes to share 
many more identical text pages than they would if each text page had 
fixups that made it unique.  And it means that those text pages needn't 
be swapped out, since they can just be reloaded directly from the EXE/DLL 
file.  

So, I think thunks are generally beneficial to the system as a whole.
As for a need to be writable to avoid thunks, I'm not convinced. The run 
time loader makes the thunk page ReadOnly after the fixups are done.
But no matter, thunks are still a better solution, IMO.  I'm actually
surprised that Unix systems haven't adopted them.  Maybe a patent?

I think the -mnop-fun-dllimport flag basically makes gcc emulate MSVC 
with respect to generating thunks, and I think that's good, too.  
So, I completely agree with checking in "Solution 3", 
and with making it apply to all of NSS.
Attachment #222375 - Flags: superreview+
Nelson,

Sorry, one sentence in my comment 25 has two mistakes.
It should read:

  When we declare PR_EmulateAcceptRead with dllimport,
  GCC apparently uses [_imp__EmulateAcceptRead] as the
  pointer to PR_EmulateAcceptRead and considers it a
  non-const function address.

Note the square brackets around _imp__EmulateAcceptRead.
The square brackets mean the value at that address.  So
we only need to fix up the value of the global variable
_imp__EmulateAcceptRead.  Note that the instruction when
dllimport is used is

  call [_imp__EmulateAcceptRead]

not

  call _imp__EmulateAcceptRead
  
Comment on attachment 222225 [details] [diff] [review]
Solution 2: manually declare those two functions without the dllimport attribute

Marking r-, since another patch has already been committed.
Attachment #222225 - Flags: superreview?(nelson) → superreview-
Whiteboard: awaiting checkin on 1.8 branch
I just checked in this patch on the Mozilla trunk (1.9 alpha).

Checking in client.mk;
/cvsroot/mozilla/client.mk,v  <--  client.mk
new revision: 1.281; previous revision: 1.280
done
I just checked in the fix on the MOZILLA_1_8_BRANCH (1.8 alpha 3).
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Whiteboard: awaiting checkin on 1.8 branch
Component: Libraries → Build
Attachment #222052 - Flags: superreview?(rrelyea)
Blocks: 1151447
You need to log in before you can comment on or make changes to this bug.