Closed Bug 119834 Opened 23 years ago Closed 23 years ago

CopyHostent causes a bus error on IRIX gcc/debug build

Categories

(NSPR :: NSPR, defect, P1)

4.1.3
SGI
IRIX
defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: nickb, Assigned: wtc)

Attachments

(3 files)

Using gcc-2.95.2 and mozilla 0.9.7 (branch) with debugging enabled I get a bus error on startup. Here is the stack trace:
  (gdb) bt
  #0 MakeIPv4MappedAddr (v4=0x107ae47c "òÄ\002Oòx\236¨\211\020«mð~µ°À\204ЯÌð\212Ñ\002DÈêD\206\213%N\2047«Ø\004/L\2240y¢j[\013\021\026\ntyBCã\221Y\026", v6=0x107ae46c "") at /projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:254
  #1 0x4139a10 in CopyHostent (from=0x101afe20, buf=0x101afe3c, bufsize=0x101afe40, conversion=_PRIPAddrIPv4Mapped, to=0x107ae400) at /projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:362
  #2 0x413ae24 in PR_GetIPNodeByName (name=0x1077efe8 "dingo.adacel.com.au", af=64032, flags=0, buf=0x107ae47c "òÄ\002Oòx\236¨\211\020«mð~µ°À\204ЯÌð\212Ñ\002DÈêD\206\213%N\2047«Ø\004/L\2240y¢j[\013\021\026\ntyBCã\221Y\026", bufsize=920, hp=0x107ae400) at /projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:680
  #3 0x45ef3e4 in nsDNSService::Run (this=0x100aeea8) at /projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/netwerk/dns/src/nsDnsService.cpp:833
  #4 0x5fe4ba8c in nsThread::Main (arg=0x0) at /projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/xpcom/threads/nsThread.cpp:120
  #5 0x414e474 in _pt_root (arg=0x0) at /projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/pthreads/ptthread.c:214
  #6 0xc21b8ac in _SGIPT_pt_start () at pt.c:793
After adding some debug statements, I came to the conclusion that the problem lies with the macro _PR_IN6_IS_ADDR_V4MAPPED(a), where a is a PRIPv6Addr *.
Accessing the elements of the union before passing it to the macro works fine (either as a char * or through the struct), as can be seen from this debug output:
  v6->pr_s6_addr32[0] = 0
  v6->pr_s6_addr32[1] = 0
  v6->pr_s6_addr32[2] = 65535
  v6->pr_s6_addr32[3] = -1062716409
  v6->pr_s6_addr16[0] = 0
  v6->pr_s6_addr16[1] = 0
  v6->pr_s6_addr16[2] = 0
  v6->pr_s6_addr16[3] = 0
  v6->pr_s6_addr16[4] = 0
  v6->pr_s6_addr16[5] = 65535
  v6->pr_s6_addr16[6] = 49320
  v6->pr_s6_addr16[7] = 15367
  v6->pr_s6_addr[0] = 0
  v6->pr_s6_addr[1] = 0
  v6->pr_s6_addr[2] = 0
  v6->pr_s6_addr[3] = 0
  v6->pr_s6_addr[4] = 0
  v6->pr_s6_addr[5] = 0
  v6->pr_s6_addr[6] = 0
  v6->pr_s6_addr[7] = 0
  v6->pr_s6_addr[8] = 0
  v6->pr_s6_addr[9] = 0
  v6->pr_s6_addr[10] = 255
  v6->pr_s6_addr[11] = 255
  v6->pr_s6_addr[12] = 192
  v6->pr_s6_addr[13] = 168
  v6->pr_s6_addr[14] = 60
  v6->pr_s6_addr[15] = 7
  v6[0] = 0
  v6[1] = 0
  v6[2] = 0
  v6[3] = 0
  v6[4] = 0
  v6[5] = 0
  v6[6] = 0
  v6[7] = 0
  v6[8] = 0
  v6[9] = 0
  v6[10] = 255
  v6[11] = 255
  v6[12] = 192
  v6[13] = 168
  v6[14] = 60
  v6[15] = 7
  PR_ASSERT(_PR_IN6_IS_ADDR_V4MAPPED(((PRIPv6Addr *) v6)));
  nsWidget::~nsWidget() of toplevel: 16 widgets still exist.
  moz_run_program[36]: 164808649 Bus error(coredump)
nsprpub/pr/src/misc/prnetdb.c - This is only really needed when using gcc on MIPS, but I cannot find a define under nsprpub to detect that combination. The change doesn't hurt either MIPSpro or gcc.
I should also mention that the stack trace is a little misleading, I suspect due to optimisations: the line number IS correct, but the function where it actually dies is MakeIPv4MappedAddr(), not CopyHostent(), although MakeIPv4MappedAddr() IS called from CopyHostent().
Can you print the value of v6 in the debugger? I bet that it is not a multiple of 4. I believe it is a bug for the MakeIPv4MappedAddr function to cast a char* to a PRIPv6Addr*. Something along the lines of your fix should be used on all platforms.
  (gdb) print *v6
  $5 = 0 '\000'
  (gdb) print *((PRIPv6Addr *) v6)
  $6 = {_S6_un = {_S6_u8 = "\000\000\000\000\000\000\000\000\000\000ÿÿÀ¨<\a", _S6_u16 = {0, 0, 0, 0, 0, 65535, 49320, 15367}, _S6_u32 = {0, 0, 65535, 3232250887}, _S6_u64 = {0, 281473913994247}}}
The weird thing is that I created a simple test case, which fairly closely mimics the behaviour (I can post it if you like), but could not re-create the crash.
Summary: CopyHostent causes a bus error on IRIX → CopyHostent causes a bus error on IRIX gcc/debug build
Sorry, I wasn't clear. I meant v6, not *v6. I want the address. In any case, I know what's wrong. The easiest fix is to remove the PR_ASSERT statements from MakeIPv4MappedAddr() and MakeIPv4CompatAddr() in mozilla/nsprpub/pr/src/misc/prnetdb.c. These two PR_ASSERT statements incorrectly cast char* to PRIPv6Addr*. (PRIPv6Addr must be 4-byte or 8-byte aligned.) These two PR_ASSERT statements are there to verify the correctness of the functions and are not that useful. (These two functions are simple.) Of course, something like your fix would also work. We can also fix this by making the casts legal. This would require allocating the char* with an appropriate alignment. I am not sure that this is necessary.
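A minimal sketch of one way to avoid the illegal cast while still using word-sized comparisons: copy the raw bytes into a properly aligned local first. The union `Ipv6Addr` and the function `is_v4mapped_via_copy` are hypothetical names for illustration, shaped after the gdb output earlier in this report; they are not NSPR's definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for PRIPv6Addr, modelled on the gdb output above.
 * The 64-bit member is presumably what makes the real type 8-byte aligned. */
typedef union {
    uint8_t  u8[16];
    uint16_t u16[8];
    uint32_t u32[4];
    uint64_t u64[2];
} Ipv6Addr;

/* Returns 1 if the 16 bytes at `raw` form an IPv4-mapped address.
 * The bytes are copied into an aligned local, so `raw` itself may have
 * any alignment; no char* is ever cast to a pointer to the union. */
static int is_v4mapped_via_copy(const char *raw)
{
    Ipv6Addr a;

    assert(raw != NULL);
    memcpy(&a, raw, sizeof a);
    return a.u32[0] == 0 && a.u32[1] == 0 &&
           a.u8[8] == 0 && a.u8[9] == 0 &&
           a.u8[10] == 0xff && a.u8[11] == 0xff;
}
```

The copy costs 16 bytes of stack and one memcpy, which is usually negligible next to a DNS lookup.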
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Summary: CopyHostent causes a bus error on IRIX gcc/debug build → CopyHostent causes a bus error on IRIX
Oops, sorry.
  (gdb) print v6
  $7 = 0x107ae46c ""
Summary: CopyHostent causes a bus error on IRIX → CopyHostent causes a bus error on IRIX gcc/debug build
Hmm... 0x107ae46c is a multiple of 4. This does not support my theory that this is an alignment problem. One last try -- could you print (PRIPv6Addr *) v6? I believe that on IRIX, PRIPv6Addr is 8-byte aligned. So it's possible that the cast of a 4-byte aligned pointer to an 8-byte aligned pointer is still problematic.
  (gdb) print (PRIPv6Addr *) v6
  $8 = (struct PRIPv6Addr *) 0x107ae46c
Which is NOT divisible by 8! :) However, when I added debug statements (the output is shown above), I could access all the elements via the struct (e.g. I had PRIPv6Addr *ptr = (PRIPv6Addr *)v6; and printed all the elements). I guess it might be better to leave the asserts... a bit of extra checking doesn't (well, shouldn't! ;-) ) hurt, so maybe we remove the #ifdef IRIX section and change them all to memcmp() calls? Or would you rather keep it IRIX-specific?
I would agree that it looks like an alignment problem though, as when I changed the macro to only check the first and the eighth bytes it worked; adding any others caused problems.
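The alignment arithmetic in the last few comments can be checked with a small helper. This is only an illustrative sketch; `is_aligned` is a hypothetical name, not an NSPR function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if `addr` is a multiple of `alignment`.
 * `alignment` must be nonzero. */
static int is_aligned(uintptr_t addr, size_t alignment)
{
    assert(alignment != 0);
    return addr % alignment == 0;
}
```

For the v6 value in this report, is_aligned(0x107ae46c, 4) holds but is_aligned(0x107ae46c, 8) does not: the address leaves a remainder of 4 when divided by 8.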
Could you show me how you changed the macro? Any solution we come up with should not have #ifdef IRIX. This problem may affect other platforms. We can leave the assertions but rewrite them to not do any type casts, but don't you think that the MakeIPv4MappedAddr() and MakeIPv4CompatAddr() functions are so simple that their correctness is self-evident? :-)
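As a rough sketch of what "rewrite them to not do any type casts" could look like, the check below compares bytes with memcmp() and never forms a PRIPv6Addr pointer. The names `is_v4mapped_bytes` and `make_v4mapped` are hypothetical, and plain assert() stands in for PR_ASSERT; this is not the code that was checked in.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the 16 bytes at `addr` begin with the
 * IPv4-mapped prefix (ten zero bytes followed by 0xff 0xff).
 * Only byte accesses are used, so `addr` needs no particular alignment. */
static int is_v4mapped_bytes(const char *addr)
{
    static const unsigned char prefix[12] = {
        0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0xff, 0xff
    };
    return memcmp(addr, prefix, sizeof prefix) == 0;
}

/* Hypothetical cast-free analogue of MakeIPv4MappedAddr():
 * builds ::ffff:a.b.c.d in `v6` from the 4 bytes at `v4`. */
static void make_v4mapped(const char *v4, char *v6)
{
    memset(v6, 0, 10);
    memset(v6 + 10, 0xff, 2);
    memcpy(v6 + 12, v4, 4);
    assert(is_v4mapped_bytes(v6));  /* stands in for PR_ASSERT */
}
```

Because the check reads single bytes, it behaves the same whether the buffer handed to it is 1-, 4-, or 8-byte aligned.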
I tried quite a few variations... but here are a few.
The original (which dumps):
  #define _PR_IN6_IS_ADDR_V4MAPPED(a) \
      (((a)->pr_s6_addr32[0] == 0) && \
      ((a)->pr_s6_addr32[1] == 0) && \
      ((a)->pr_s6_addr[8] == 0) && \
      ((a)->pr_s6_addr[9] == 0) && \
      ((a)->pr_s6_addr[10] == 0xff) && \
      ((a)->pr_s6_addr[11] == 0xff))
This one worked:
  #define _PR_IN6_IS_ADDR_V4MAPPED(a) \
      (((a)->pr_s6_addr32[0] == 0) && \
      ((a)->pr_s6_addr[8] == 0))
This one didn't:
  #define _PR_IN6_IS_ADDR_V4MAPPED(a) (((a)->pr_s6_addr32[1] == 0))
Do you need some more? I agree that MakeIPv4MappedAddr() and MakeIPv4CompatAddr() are simple in the extreme... and could do without the assertions. However, these macros are also used in some other functions: PR_GetHostByAddr(), PR_IsNetAddrType(), and maybe others. So, won't they have the same problem? Or is it only likely to occur when cast from char *?
This one doesn't work either:
  #define _PR_IN6_IS_ADDR_V4MAPPED(a) \
      (((a)->pr_s6_addr32[0] == 0) && \
      ((a)->pr_s6_addr[9] == 0))
Is this the right direction? Can you please review this patch? Thanks! :)
I decided to just delete the two assertions that contained the illegal type casts. I checked in the fix into the tip of NSPR.
Priority: -- → P1
Target Milestone: --- → 4.2
Okay, that will work too! :) Shall we close this then?
I am leaving this open as a reminder that I also need to check it into the mozilla client branch of NSPR. We have just missed the mozilla0.9.8 milestone. The fix will be in the mozilla0.9.9 milestone.
Fix checked into NSPRPUB_PRE_4_2_CLIENT_BRANCH.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
See Also: → 1425543