Closed
Bug 407866
Opened 17 years ago
Closed 17 years ago
Contributed improvement to security/nss/lib/freebl/mpi/mp_comba.c
Categories
(NSS :: Libraries, defect)
Tracking
(Not tracked)
RESOLVED
FIXED
3.12
People
(Reporter: wolfiR, Assigned: wtc)
Details
Attachments
(1 file)
6.03 KB,
patch
|
Details | Diff | Splinter Review |
Please see
https://bugzilla.novell.com/show_bug.cgi?id=346256
(it's x86-64 only)
shlibsign fails in the build process:
Generating DSA Key Pair....Generating PQG Params: An I/O error occurred during
That's apparently a gcc bug in upcoming version 4.3. So probably nothing to be resolved here but maybe of interest for others.
Current workaround is to undefine NSS_USE_COMBA
Comment 1•17 years ago
|
||
Thanks for letting us know that this defect in a recent version of gcc makes
it fail to correctly compile NSS code that is correct and has successfully
built and worked for years. The solution is to build NSS with an unbroken
working version of gcc.
Status: NEW → RESOLVED
Closed: 17 years ago
Resolution: --- → WONTFIX
Assignee | ||
Comment 2•17 years ago
|
||
Wolfgang, this could be a compiler optimization bug in gcc 4.3.
Can you report this to the gcc team or someone familiar with gcc?
You should also find out what exactly went wrong. shlibsign
doesn't crash. So we need to find out at which step the generation
of PQG params fails. Perhaps it fails to load libfreebl3.so?
Reporter | ||
Comment 3•17 years ago
|
||
Yes, it's something in optimization since building with -O0 works fine.
More details are in the referenced bug (which is open for everybody to read) and hopefully tracked by Novell's gcc team.
Comment 4•17 years ago
|
||
Wan-Teh,
Fedora, Debian, Ubuntu provides gcc 4.3 snapshots. Is there any chance you could test with them?
Comment 5•17 years ago
|
||
There is no bug in NSS to be fixed. The bug report is resolved. The evidence in the bug is clear: there is an optimizer problem. The gcc maintainers can fix gcc, and then we'll be willing to use the fixed compiler.
A great deal of time was spent making the comba code right, by 3 or 4 people two years ago, and this attitude that seems to assume that we have no idea
what our code does is simply insulting and unwelcome.
We're not going to bend over every time some maker of an open source build tool has some new whim. Cygwin tried that, and now they're no longer a supported build platform for Mozilla.
One of the characteristics of inline assembler is that it tells the compiler what instructions to produce, pretty directly. If the compiler can no longer follow those instructions, and now produces code that does not do what it was told, that's a compiler bug. The instructions it was told to produce were, and still are, correct, and do not need to be fixed.
I'm quite happy to continue to use older properly-working versions of gcc to build NSS until they can fix their bug. As long as the tinderbox builds continue to build and pass without errors, there is no bug in NSS's comba code that needs to be fixed.
If for some reason we should desire to build with a version of gcc that can no longer correctly build with inline assembler, then I would produce a .s file containing the correct code built by the older version of gcc, and NSS would start using that .s file instead of the .c sources with that broken gcc compiler. We already do that for another vendor's compilers. I'm quite willing to do it for gcc.
If someone wants to CONTRIBUTE a patch that solves this problem with the new compiler, but produces no less efficient code and doesn't reduce performance ANY on servers in load tests, that would be considered.
Reporter | ||
Comment 6•17 years ago
|
||
Maybe I should make clear that I reported that issue here just to make sure that people hitting that issue can find a reference in bmo and can find out what's going on. I don't expect NSS people to care about that.
Comment 7•17 years ago
|
||
This certainly is a gcc bug, but I think it's worth taking into consideration that the more compilers evolve, the more direct insertion of external inline assembly instruction in the middle of the code they generate perturbs them.
AFAIK the way to lower the risks is usually to save and restore as much of the register state as possible before entering/when leaving the inline assembly instruction blocks. And from a very fast look, it doesn't look like http://lxr.mozilla.org/mozilla/source/security/nss/lib/freebl/mpi/mp_comba.c
does much of that.
Comment 8•17 years ago
|
||
The bug seems fixed.
Using the latest development snapshot of the Fedora Rawhide OS, which comes with gcc 4.3.0, I was able to build NSS trunk fine. I also ran the test suite, which completed without failures.
Comment 9•17 years ago
|
||
The previous comment was based on a debug build.
Wan-Teh reminded me the original bug was reported with an optimized build.
And indeed. I still see a failure with an optimized build :-(
/home/kaie/moz/nss/head/mozilla/security/nss/cmd/shlibsign/Linux2.6_x86_64_glibc_PTH_64_OPT.OBJ/shlibsign -v -i /home/kaie/moz/nss/head/mozilla/security/nss/cmd/shlibsign/../../../../dist/Linux2.6_x86_64_glibc_PTH_64_OPT.OBJ/lib/libsoftokn3.so
Generating DSA Key Pair....Generating PQG Params: An I/O error occurred during security authorization.
make[2]: *** [../../../../dist/Linux2.6_x86_64_glibc_PTH_64_OPT.OBJ/lib/libsoftokn3.chk] Error 1
make[1]: *** [libs] Error 2
make: *** [libs] Error 2
Comment 10•17 years ago
|
||
(I personally think this bug should be kept open until a real solution is known, even if no fix inside NSS is planned.)
I have been asked to rebuild NSS for the next Linux Fedora OS release, where all components are being built with gcc 4.3.0.
Unless a better solution is known, I will use the workaround proposed in comment 0.
Comment 11•17 years ago
|
||
I think it would be less than a day's work (maybe about an hour's work) to:
- build NSS optimized on Linux with some non-broken gcc
- disassemble mp_comba.o generating mp_comba_broken_gcc.s
- change the Makefile(s) to build mp_comba_broken_gcc.s instead of mp_comba.c
on Linux systems which have that broken gcc
- commit mp_comba_broken_gcc.s and the modified Makefile.
I'm not including the wait-for-review time in that estimate.
Comment 12•17 years ago
|
||
There some more analysis of this bug in https://bugzilla.redhat.com/show_bug.cgi?id=432146 and a bug against GCC 4.3 (thanks to Jakub Jelinek) with a reduced testcase in http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35160.
Jakub also suggested a change (in the RH bug) to mp_comba.c that avoids this bug by simplifying the asm constraints and removing unnecessary variable initialization. Would the NSS team be willing to consider including this change?
Comment 13•17 years ago
|
||
In reply to Matthew's comment 12, a question and some comments.
a) The gcc.gnu.org bug seems to suggest that a patch has already been made.
(Or am I misunderstanding that?) If that bug is fixed, then is there any
need or motivation to make any further changes to NSS sources?
b) Changing this in NSS 3.11.x has implications on the FIPS validation of
this code. We're not planning on any further FIPS validations of the NSS
3.11.c code (as I understand things), but we are considering doing another
FIPS evaluation for NSS 3.12. So, this begs the question: if we were to
accept a patch to NSS for NSS 3.12.x only (the trunk) and not for NSS 3.11.x
would that be helpful to anyone?
c) I've seen the patch at https://bugzilla.redhat.com/attachment.cgi?id=294959
and my reaction is that I'd like Tom St. Denis to review it and vouch for it.
If he's willing to take it into libTomFast (or whatever it was called),
then I'd consider taking it in NSS 3.12, IFF it still needs to be fixed
(that is, if GCC hasn't been fixed by the patch for their bug cited above).
Comment 14•17 years ago
|
||
(In reply to comment #13)
> In reply to Matthew's comment 12, a question and some comments.
>
> a) The gcc.gnu.org bug seems to suggest that a patch has already been made.
> (Or am I misunderstanding that?) If that bug is fixed, then is there any
> need or motivation to make any further changes to NSS sources?
That is correct. As the reporter version of GCC 4.3 was not final, it is safe to say that there will not be any release versions of GCC 4.3 that will suffer from that bug. However, jakub seems to imply that the current code does unnecessary work, and could be improved regardless as described with the patch.
>
> b) Changing this in NSS 3.11.x has implications on the FIPS validation of
> this code. We're not planning on any further FIPS validations of the NSS
> 3.11.c code (as I understand things), but we are considering doing another
> FIPS evaluation for NSS 3.12. So, this begs the question: if we were to
> accept a patch to NSS for NSS 3.12.x only (the trunk) and not for NSS 3.11.x
> would that be helpful to anyone?
If it would improve the code, yes.
> c) I've seen the patch at https://bugzilla.redhat.com/attachment.cgi?id=294959
> and my reaction is that I'd like Tom St. Denis to review it and vouch for it.
> If he's willing to take it into libTomFast (or whatever it was called),
> then I'd consider taking it in NSS 3.12, IFF it still needs to be fixed
> (that is, if GCC hasn't been fixed by the patch for their bug cited above).
>
GCC has been fixed but based on Jakub's assertions that the current code is sub-optimal, I think it would be good to consider his change suggestions. Re-opening for consideration.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Assignee | ||
Comment 15•17 years ago
|
||
I've reviewed and tested the patch in the bugzilla.redhat.com bug report.
I had to study the GCC inline assembly syntax and the ISO C implementation
of those SQRADDxx macros in TomsFastMath. I can vouch for its correctness
except for the early clobbers (=&r), which I still don't understand.
Nelson, if you want, we can omit the change to SQRADDDB (which has the
early clobber changes), because the change to SQRADDSC alone is enough to
work around the GCC 4.3pre bug, and I fully understand that change. (That
change also makes it unnecessary to initialize sc0, sc1, sc2.)
Basically, the patch removes unnecessary constraints on the inputs/outputs
of the inline assembly. So as caillon said, the patch should help GCC
generate better code.
Linux distributions such as Fedora and openSUSE have the ability to apply
their own patches to the pristine NSS source code, so Kai doesn't need the
patch on the NSS_3_11_BRANCH. But it would be nice to get this patch into
the NSS trunk. Taking this change would require you to regenerate
mp_comba_amd64_sun.s for Solaris.
It is our pleasure to give the patch back to TomsFastMath upstream. That
should not block us from taking this patch (at least the subset I fully
understand).
Comment 16•17 years ago
|
||
Wan-Teh, given that mp_comba_amd64_sun.s is correct as is, a change to
the .c source does not necessitate a change to mp_comba_amd64_sun.s
I do not want to take any change whose correctness depends on some very
new version of gcc. I am unwilling to take a change that means that the
.c file will cease to correctly compile with any version of gcc with which
it now compiles correctly. As I understand it, the propopsed change to
SQRADDSC meets my criteria, but I am not certain that the SQADDDB
change does.
Basically, my reaction to this whole bug, at this point, is this:
Given that there is no bug in the current source that prevents it from
being compiled and working correctly on Linux with a released gcc, I think
there is no urgency to do anything with this. This bug is now nothing more
than a very minor performance enhancement.
We have other priorities for NSS 3.12, priorities that are causing a lot of
other desirable work to remain undone, such as a lot of SSL work. The change
requested in this bug is much lower priority (in my view) than the SSL work
that is being delayed by higher priority work. So, if I had more resources
to put on NSS at this time, I would put them on SSL, not on this.
If you have the time and interest to take this bug and make the changes, and
to carry the responsiblity for them thereafter, I won't stand in your way. :)
Severity: major → minor
Assignee | ||
Comment 17•17 years ago
|
||
Understood and agreed. This is why I did all the work in the Red Hat bug report, not here
(I didn't want to distract you from the SSL work), and my work on the Red Hat bug report
was all done during non-business hours as a pastime.
Assignee: nobody → wtc
Status: REOPENED → NEW
Comment 18•17 years ago
|
||
Nelson, you claim the bug is not present in any released version of gcc.
You claim the bug is only present in a not-yet-released version of gcc 4.3.0pre.
Well, if we all had refused to work on this bug, this bug would probably have remained undetected, and gcc 4.3.0 would probably have been shipped with this bug.
If we all had simply ignored this issue and waited for the final release of gcc 4.3.0, then very likely we would have ended up in a situation where a released version of gcc would show this bug.
I'm therefore really glad that Wan-Teh and Jakub have offered their work to investigate and analyze this problem.
Comment 19•17 years ago
|
||
> I'm therefore really glad that Wan-Teh and Jakub have offered their work to
> investigate and analyze this problem.
I am too! I'm glad that their work got the problem fixed at the source.
Thanks to all who helped solve the gcc problem.
And now that it's fixed, I don't want to spend time to work around it in NSS.
Comment 20•17 years ago
|
||
(In reply to comment #13)
>
> b) Changing this in NSS 3.11.x has implications on the FIPS validation of
> this code. We're not planning on any further FIPS validations of the NSS
> 3.11.c code (as I understand things), but we are considering doing another
> FIPS evaluation for NSS 3.12. So, this begs the question: if we were to
> accept a patch to NSS for NSS 3.12.x only (the trunk) and not for NSS 3.11.x
> would that be helpful to anyone?
From my point of view, changes to 3.11 should be avoided.
From my point of view, if possible optimizations have been identified, it's reasonable to apply them to 3.12 before the next FIPS evaluation starts.
Assignee | ||
Comment 21•17 years ago
|
||
I submitted this patch to the email list for LibTom Projects in
http://groups.google.com/group/libtom/browse_thread/thread/46b7827d4374892c?hl=en
Assignee | ||
Comment 22•17 years ago
|
||
Comment on attachment 303938 [details] [diff] [review]
Proposed patch by Jakub Jelinek of Red Hat
I forgot to mention that this patch was contributed by
Jakub Jelinek of Red Hat. I have reviewed the patch and
fully understand it, including the "earlyclobbers".
I only added a note to mp_comba_amd64_sun.s because it
will get out of sync with mp_comba.c after the patch is
applied.
Attachment #303938 -
Attachment description: Proposed patch → Proposed patch by Jakub Jelinek of Red Hat
Comment 23•17 years ago
|
||
Here are Jakub's arguments, he recommends that we apply this change to NSS, because (quote):
1) you can remove the unnecessary workarounds where you need to initialize
sc* = 0 in order to avoid compiler warnings. All the clearing of the vars
really has to be done in the generated code
2) by accurrately describing what the inline asm does you give the compiler more
freedom to generate better code, especially if register preassure is high.
Comment 24•17 years ago
|
||
I don't find those reasons persuasive. The still don't make this more important
than (say) any of the pending SSL work.
Comment 25•17 years ago
|
||
(In reply to comment #24)
> The still don't make this more important than (say) any of the pending SSL work.
I don't understand this argument.
If I understand correctly, the work is already done.
Jakub produced the patch, Wan-Teh reviewed it, I ran the test suites.
Wan-Teh adds a comment to the SUN assembler file, in order to follow your wish to keep the SUN code unchanged - in order to avoid any work on SUN's side.
Are there arguments to reject the contribution?
I'm changing the bug summary, because this is no longer about miscompilation (gcc with fix compiles NSS fine), it's now about a contributed improvement.
I'm changing the Version attribute of this bug, to make it clear the improvement is proposed for trunk (only).
Summary: gcc 4.3pre miscompiles security/nss/lib/freebl/mpi/mp_comba.c → Contributed improvement to security/nss/lib/freebl/mpi/mp_comba.c
Version: 3.11.8 → trunk
Assignee | ||
Comment 26•17 years ago
|
||
Nelson,
In the last paragraph of comment 16, you said it's fine for me to take
this bug and make the change.
During the President Day long weekend, I worked on this bug as a pastime.
I attached the patch to this bug to wrap up the work. I did *not* ask
anyone to review the patch.
Why can't I work on something that fascinates me in my spare time? I
didn't waste any NSS resources.
Not to mention I also reviewed Julien's SSL patch (bug 403240 comment 33)
in the long weekend.
Comment 27•17 years ago
|
||
Wan-Teh, I didn't realize you had taken the bug. I was stating that *I*
will not work on this bug until more important work is done. You are
(of course, as always) both free and WELCOME to work on this if you wish!
Sorry for my confusion.
Comment 28•17 years ago
|
||
Wan-Teh,
If you modify the comba code, please also regenerate the mp_comba_amd64_sun.s to make sure that it is in sync with the C master source. When we made our performance improvements for amd64, we made sure that everything built on both Solaris and Linux, even though we could have chosen to do Solaris only. We expect the code to be maintained on both platforms.
The "unnamed compiler" is gcc 3.4.3 on Solaris x64, which is in /usr/sfw/bin/gcc, bundled with s10, which you can DL free . Some testing with rsaperf should also be done to make sure that performance doesn't actually regress due to the patch before checkin. Note that the code was never tested on Intel x64 chips. It would be interesting to see how it performs there. We don't have any machines to test that on at Sun yet.
Comment 29•17 years ago
|
||
Julien, do you make it a requirement to update the Sun code at the same time?
If I understand correctly:
- Nelson has proposed to avoid changes to the Sun code
- Wan-Teh might have a hard time to test it on Sun
Because of Nelson's preference to avoid changes to the Sun code, I'd conclude, checking in any changes for Sun code might be postponed to a later time?
I understand the current state of this bug as follows:
- Jakub Jelinek contributed the patch
- Wan-Teh has reviewed the patch
Is it OK if Wan-Teh marks the attachment as reviewed?
I'd think so.
Then he should be OK to check it in.
Comment 30•17 years ago
|
||
I'm putting on my "NSS project lead for Sun" hat for this comment.
Sun does not require ANY change to mp_comba_amd64_sun.s for NSS 3.12.
No such change is prerequisite to the integration of Wan-Teh's patch
to the .c file. I will write more in private email.
Assignee | ||
Comment 31•17 years ago
|
||
I checked in Jakub Jelinek's patch for mp_comba.c on the NSS trunk
for NSS 3.12.
Checking in mp_comba.c;
/cvsroot/mozilla/security/nss/lib/freebl/mpi/mp_comba.c,v <-- mp_comba.c
new revision: 1.3; previous revision: 1.2
done
I didn't update the comment in mp_comba_amd64_sun.s as I proposed in
attachment 303938 [details] [diff] [review]. Let me know if you want me to do that.
Severity: minor → trivial
Status: NEW → RESOLVED
Closed: 17 years ago → 17 years ago
Resolution: --- → FIXED
Target Milestone: --- → 3.12
Comment 32•16 years ago
|
||
Today, the bug at http://gcc.gnu.org/bugzilla/show_bug.cgi?id=35160 was
marked as Resolved/Fixed in gcc 4.4.0
You need to log in
before you can comment on or make changes to this bug.
Description
•