Closed Bug 353896 Opened 14 years ago Closed 13 years ago
Building tip with NSS
_ECC _MORE _THAN _SUITE _B causes crashes in all .sh
1.04 KB, patch
|Details | Diff | Splinter Review|
1.10 KB, patch
|Details | Diff | Splinter Review|
729 bytes, patch
|Details | Diff | Splinter Review|
Currently, if you pull the NSS tip with NSS_ENABLE_ECC=1 and NSS_ECC_MORE_THAN_SUITE_B=1, then run all.sh, you will get 20 core files in selfserv as follows : [jp96085@monstre]/export/home/nss/tip/mozilla/tests_results/security/monstre.1 337 % find . | grep core ./ext_client/core.selfserv.monstre.23860 ./ext_client/core.selfserv.monstre.25200 ./ext_client/core.selfserv.monstre.25296 ./ext_client/core.selfserv.monstre.25366 ./ext_client/core.selfserv.monstre.25463 ./ext_client/core.selfserv.monstre.28745 ./ext_client/core.selfserv.monstre.120 ./ext_client/core.selfserv.monstre.234 ./ext_client/core.selfserv.monstre.312 ./ext_client/core.selfserv.monstre.412 ./client/core.selfserv.monstre.21442 ./client/core.selfserv.monstre.22836 ./client/core.selfserv.monstre.22933 ./client/core.selfserv.monstre.23003 ./client/core.selfserv.monstre.23099 ./client/core.selfserv.monstre.26308 ./client/core.selfserv.monstre.27717 ./client/core.selfserv.monstre.27815 ./client/core.selfserv.monstre.27885 ./client/core.selfserv.monstre.27981 Most of the stacks are of this type : (dbx) where current thread: t@2  strlen(0x8068a2a, 0xfe869220, 0xfec2b090, 0x0), at 0xfeb84b8c  _fprintf(0xfec2b090, 0x8068a08, 0x8080ae0, 0x0, 0x0), at 0xfebdccfd => errWarn(funcString = 0x8080ae0 "HDX PR_Read"), line 248 in "selfserv.c"  handle_connection(tcp_sock = 0x817b9f0, model_sock = 0x80836c8, requestCert = 0), line 968 in "selfserv.c"  jobLoop(a = (nil), b = (nil), c = 0), line 518 in "selfserv.c"  thread_wrapper(arg = 0x816d2c0), line 486 in "selfserv.c"  _pt_root(arg = 0x8178688), line 220 in "ptthread.c"  _thr_setup(0xfec42400), at 0xfebff9be  _lwp_start(), at 0xfebffca0 (dbx) This happens because PR_Read returned -1 without setting an NSPR error . I realize the above build environment isn't supported, but our software still should be more robust. Two things need to be done : 1) selfserv should not crash . An NSPR error should be set if the wrong curve is used or keygen fails 2) we should attempt to detect the build error at build time. If NSS_ECC_MORE_THAN_SUITE_B is set but the source doesn't support that setting, there should be an #ifdef NSS_ECC_MORE_THAN_SUITE_B #error Sorry, but you lose ! #endif I have wasted an incredible amount of time on various machines with build mismatches due to out-of-date checkout scripts and ECC build macros. Similarly, if the source requires NSS_ECC_MORE_THAN_SUITE_B, it should have #ifndef NSS_ECC_MORE_THAN_SUITE_B #error Sorry, but you lose too ! #endif
The crash problem is actually bug 342795, so I'm adding it as a dependency . I will attach patches for the build problem.
Depends on: 342795
Comment on attachment 239742 [details] [diff] [review] Patch for tip and NSS_3_11_BRANCH This is fine. It should save developers a lot of time.
Attachment #239742 - Flags: superreview?(nelson) → review+
Comment on attachment 239741 [details] [diff] [review] Patch for NSS_3_11_1_BRANCH My first reaction to this patch is that I would prefer that this version of this file be ifdeffed so that it can build either basic or extended ECC. But I won't withhold review for that.
Attachment #239741 - Flags: superreview?(nelson) → review+
Attachment #239741 - Flags: review?(alexei.volkov.bugs) → review+
Attachment #239742 - Flags: review?(alexei.volkov.bugs) → review+
Nelson, re: comment 5, I would also much prefer that we have a single file, but I think we both know there are some objections to that outside of Sun . Thanks for the reviews. I checked in attachment 239741 [details] [diff] [review] to NSS_3_11_1_BRANCH : Checking in ecl-curve.h; /cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl-curve.h,v <-- ecl-curve.h new revision: 18.104.22.168.2.1; previous revision: 22.214.171.124 done And attachment 239742 [details] [diff] [review] to NSS_3_11_BRANCH : Checking in ecl-curve.h; /cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl-curve.h,v <-- ecl-curve.h new revision: 126.96.36.199; previous revision: 188.8.131.52 done And to the tip : Checking in ecl-curve.h; /cvsroot/mozilla/security/nss/lib/freebl/ecl/ecl-curve.h,v <-- ecl-curve.h new revision: 1.6; previous revision: 1.5 done I'm keeping this bug open until its dependencies about the crash are resolved.
Not sure why Julien set the target milestone to 3.11.4 instead of 3.12. This patch will fix the crash on Solaris by having SECU_Strerror(0) return the string "No error." instead of NULL. It doesn't do what Julien wanted though: 1) selfserv should not crash . An NSPR error should be set if the wrong curve is used or keygen fails I am open to other error strings for the error code 0. Note that strerror(0) is - "Success" on Linux - "Error 0" on HP-UX and Solaris - "Unknown error: 0" on Mac OS X - "No error" on Windows
Wan-Teh, A change was made for 3.11.4 as you can see in comment 6 to attempt to avoid building the wrong level of ECC. The part about the NSPR error being set needs to be fixed in dependent bugs, bug 342795 and 353899 . I would almost prefer not to checkin the fix for SECU_Strerror as it will hide this type of error in the future and make them even more difficult to diagnose.
Nelson, I suggest that we mark this bug fixed. I have created a patch for the dependent bug 342795. I will let you decide whether SECU_Strerror should return an error message for error code 0 (attachment 245623 [details] [diff] [review]). Either way is fine by me, but let's not let this bug drag on for too long.
Assignee: bugzilla → wtchang
Attachment #245623 - Flags: superreview?(alexei.volkov) → superreview?(alexei.volkov.bugs)
The ecl-curve.h patch in this bug was checked in in NSS 3.11.4. The dependent bug (bug 342795) was fixed in NSS 3.11.5. Whether you want to change SECU_Strerror(0) or not is fine by me. Feel free to cancel my review request for that patch.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Target Milestone: 3.11.4 → 3.11.5
Comment on attachment 245623 [details] [diff] [review] Make SECU_Strerror(0) return "No error." instead of NULL I agree with Julien. I think we are better off breaking then hiding errors in the software.
Attachment #245623 - Flags: superreview?(alexei.volkov.bugs)
Regarding SECU_Strerror(0), this function is documented to return NULL for unknown error codes. Some programs don't check and crash. Changing error 0 to return a non-NULL string fixes one error code out of billions. Do we want to change the function to never return a NULL string, but instead return an "unknown error" string for all unknown errors? Should this function assert that the error code is known?
Comment on attachment 245623 [details] [diff] [review] Make SECU_Strerror(0) return "No error." instead of NULL We should open a new bug for this issue.
It's fine by me to consider 0 an unknown error code rather than the error code for "no error".
You need to log in before you can comment on or make changes to this bug.