Closed
Bug 119834
Opened 23 years ago
Closed 23 years ago
CopyHostent causes a bus error on IRIX gcc/debug build
Categories
(NSPR :: NSPR, defect, P1)
Tracking
(Not tracked)
RESOLVED
FIXED
4.2
People
(Reporter: nickb, Assigned: wtc)
References
Details
Attachments
(3 files)
1011 bytes,
patch
|
Details | Diff | Splinter Review | |
1.14 KB,
patch
|
Details | Diff | Splinter Review | |
3.19 KB,
patch
|
Details | Diff | Splinter Review |
Using gcc-2.95.2 and mozilla 0.9.7 (branch) with debugging enabled I get a bus
error on startup.
Here is the stack trace:
(gdb) bt
#0 MakeIPv4MappedAddr (v4=0x107ae47c
"òÄ\002Oòx\236¨\211\020«mð~µ°À\204ЯÌð\212Ñ\002DÈêD\206\213%N\2047«Ø\004/L\2240y¢j[\013\021\026\ntyBCã\221Y\026",
v6=0x107ae46c "") at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:254
#1 0x4139a10 in CopyHostent (from=0x101afe20, buf=0x101afe3c,
bufsize=0x101afe40, conversion=_PRIPAddrIPv4Mapped, to=0x107ae400) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:362
#2 0x413ae24 in PR_GetIPNodeByName (name=0x1077efe8 "dingo.adacel.com.au",
af=64032, flags=0, buf=0x107ae47c
"òÄ\002Oòx\236¨\211\020«mð~µ°À\204ЯÌð\212Ñ\002DÈêD\206\213%N\2047«Ø\004/L\2240y¢j[\013\021\026\ntyBCã\221Y\026",
bufsize=920, hp=0x107ae400) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:680
#3 0x45ef3e4 in nsDNSService::Run (this=0x100aeea8) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/netwerk/dns/src/nsDnsService.cpp:833
#4 0x5fe4ba8c in nsThread::Main (arg=0x0) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/xpcom/threads/nsThread.cpp:120
#5 0x414e474 in _pt_root (arg=0x0) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/pthreads/ptthread.c:214
#6 0xc21b8ac in _SGIPT_pt_start () at pt.c:793
After adding some debug statements, I came to the conclusion that the problem
lies with the macro _PR_IN6_IS_ADDR_V4MAPPED(a), where a is a PRIPv6Addr *.
Accessing the elements of union before passing to the Macro works fine (either
as a char * or as the struct), as can be seen from this debug output:
v6->pr_s6_addr32[0] = 0
v6->pr_s6_addr32[1] = 0
v6->pr_s6_addr32[2] = 65535
v6->pr_s6_addr32[3] = -1062716409
v6->pr_s6_addr16[0] = 0
v6->pr_s6_addr16[1] = 0
v6->pr_s6_addr16[2] = 0
v6->pr_s6_addr16[3] = 0
v6->pr_s6_addr16[4] = 0
v6->pr_s6_addr16[5] = 65535
v6->pr_s6_addr16[6] = 49320
v6->pr_s6_addr16[7] = 15367
v6->pr_s6_addr[0] = 0
v6->pr_s6_addr[1] = 0
v6->pr_s6_addr[2] = 0
v6->pr_s6_addr[3] = 0
v6->pr_s6_addr[4] = 0
v6->pr_s6_addr[5] = 0
v6->pr_s6_addr[6] = 0
v6->pr_s6_addr[7] = 0
v6->pr_s6_addr[8] = 0
v6->pr_s6_addr[9] = 0
v6->pr_s6_addr[10] = 255
v6->pr_s6_addr[11] = 255
v6->pr_s6_addr[12] = 192
v6->pr_s6_addr[13] = 168
v6->pr_s6_addr[14] = 60
v6->pr_s6_addr[15] = 7
v6[0] = 0
v6[1] = 0
v6[2] = 0
v6[3] = 0
v6[4] = 0
v6[5] = 0
v6[6] = 0
v6[7] = 0
v6[8] = 0
v6[9] = 0
v6[10] = 255
v6[11] = 255
v6[12] = 192
v6[13] = 168
v6[14] = 60
v6[15] = 7
PR_ASSERT(_PR_IN6_IS_ADDR_V4MAPPED(((PRIPv6Addr *) v6)));
nsWidget::~nsWidget() of toplevel: 16 widgets still exist.
moz_run_program[36]: 164808649 Bus error(coredump)
Reporter | ||
Comment 1•23 years ago
|
||
nsprpub/pr/src/misc/prnetdb.c - This is only really needed when using
gcc on mips, however, I cannot find a define under nsprpub to set this,
however it doesn't hurt for both MipsPro and gcc.
Reporter | ||
Comment 2•23 years ago
|
||
I should also mention that the stack trace is a little misleading, I suspect due
to optimisations, in that the line number IS correct, however the actual function
where it dies is MakeIPv4MappedAddr(), not CopyHostent(), altough
MakeIPv4MappedAddr() IS called from CopyHostent().
Assignee | ||
Comment 3•23 years ago
|
||
Can you print the value of v6 in the debugger?
I bet that it is not a multiple of 4.
I believe it is a bug for the MakeIPv4MappedAddr
function to cast a char* to a PRIPv6Addr*. Something
along the line of your fix should be used on all
platforms.
Reporter | ||
Comment 4•23 years ago
|
||
gdb) print *v6
$5 = 0 '\000'
(gdb) print *((PRIPv6Addr *) v6)
$6 = {_S6_un = {_S6_u8 = "\000\000\000\000\000\000\000\000\000\000ÿÿÀ¨<\a",
_S6_u16 = {0, 0, 0, 0, 0, 65535, 49320, 15367},
_S6_u32 = {0, 0, 65535, 3232250887}, _S6_u64 = {0, 281473913994247}}}
The wierd thing is that I created a simple test case, which fairly closly mimics
the behaviour (I can post if you like), but could not re-create the crash.
Summary: CopyHostent causes a bus error on IRIX → CopyHostent causes a bus error on IRIX gcc/debug build
Assignee | ||
Comment 5•23 years ago
|
||
Sorry, I wasn't clear. I meant v6, not *v6. I want
the address.
In any case, I know what's wrong.
The easiest fix is to remove the PR_ASSERT statements
from MakeIPv4MappedAddr() and MakeIPv4CompatAddr()
in mozilla/nsprpub/pr/src/misc/prnetdb.c. These two
PR_ASSERT statements incorrectly cast char* to
PRIPv6Addr*. (PRIPv6Addr must be 4-byte or 8-byte
aligned.) These two PR_ASSERT statements are there
to verify the correctness of the functions and are
not that useful. (These two functions are simple.)
Of course, something like your fix would also work.
We can also fix this by making the casts legal. This
would require allocating the char* with an appropriate
alignment. I am not sure that this is necessary.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Summary: CopyHostent causes a bus error on IRIX gcc/debug build → CopyHostent causes a bus error on IRIX
Reporter | ||
Comment 6•23 years ago
|
||
Oops, sorry.
(gdb) print v6
$7 = 0x107ae46c ""
Summary: CopyHostent causes a bus error on IRIX → CopyHostent causes a bus error on IRIX gcc/debug build
Assignee | ||
Comment 7•23 years ago
|
||
Hmm... 0x107ae46c is a multiple of 4. This does not support
my theory that this is an alignment problem.
One last try -- could you print (PRIPv6Addr *) v6? I believe
that on IRIX, PRIPv6Addr is 8-byte aligned. So it's possible
that the cast of a 4-byte aligned pointer to an 8-byte aligned
pointer is still problematic.
Reporter | ||
Comment 8•23 years ago
|
||
(gdb) print (PRIPv6Addr *) v6
$8 = (struct PRIPv6Addr *) 0x107ae46c
Which is NOT divisible by 8! :)
However, when I added debug statements, (the output is shown above), I could
access all the elements via the struct (eg I had
PRIPv6Addr *ptr = (PRIPv6Addr *)v6;
and printed all the elements)
I guess it might be better to leave the asserts... a bit of extra checking
doesn't (well, shouldn't! ;-) )hurt, so maybe if we remove the #ifdef IRIX
section and change them all to memcmp()'s? or would you rather keep it IRIX
specific?
Reporter | ||
Comment 9•23 years ago
|
||
I would agree that it looks like an alignment problem tho, as when I changed
the macro to only check the first and the eigth bytes it worked, adding any
others caused problems.
Assignee | ||
Comment 10•23 years ago
|
||
Could you show me how you changed the macro?
Any solution we come up with should not have
#ifdef IRIX. This problem may affect other
platforms.
We can leave the assertions but rewrite them
to not do any type casts, but don't you think
that the MakeIPv4MappedAddr() and MakeIPv4CompatAddr()
functions are so simple that their correctness
is self-evident? :-)
Reporter | ||
Comment 11•23 years ago
|
||
I tried quite a few variations... but here are a few.
The original (which dumps):
#define _PR_IN6_IS_ADDR_V4MAPPED(a) \
(((a)->pr_s6_addr32[0] == 0) && \
((a)->pr_s6_addr32[1] == 0) && \
((a)->pr_s6_addr[8] == 0) && \
((a)->pr_s6_addr[9] == 0) && \
((a)->pr_s6_addr[10] == 0xff) && \
((a)->pr_s6_addr[11] == 0xff))
This one worked:
#define _PR_IN6_IS_ADDR_V4MAPPED(a) \
(((a)->pr_s6_addr32[0] == 0) && \
((a)->pr_s6_addr[8] == 0)
This one didn't:
#define _PR_IN6_IS_ADDR_V4MAPPED(a)
(((a)->pr_s6_addr32[1] == 0)))
Do you need some more?
I agree that MakeIPv4MappedAddr() and MakeIPv4CompatAddr() are simple in the
extreme... and could do without the assertions, however these macros are also
used in some other functions:
PR_GetHostByAddr()
PR_IsNetAddrType()
and maybe others. So, wont they have the same problem? or is it only likely to
occur when cast from char *?
Reporter | ||
Comment 12•23 years ago
|
||
This one doesn't work either:
#define _PR_IN6_IS_ADDR_V4MAPPED(a) \
(((a)->pr_s6_addr32[0] == 0) && \
((a)->pr_s6_addr[9] == 0)
Reporter | ||
Comment 13•23 years ago
|
||
This makes attachment 64775 [details] [diff] [review] obsolete.
Reporter | ||
Comment 14•23 years ago
|
||
obsolete's attachment 64775 [details] [diff] [review] and attachment 64778 [details] [diff] [review]
Reporter | ||
Comment 15•23 years ago
|
||
Is this the right direction? Can you please review this patch.
Thanks! :)
Assignee | ||
Comment 16•23 years ago
|
||
I decided to just delete the two assertions that contained
the illegal type casts. I checked in the fix into the tip
of NSPR.
Priority: -- → P1
Target Milestone: --- → 4.2
Reporter | ||
Comment 17•23 years ago
|
||
Okay, that will work to! :)
Shall we close this then?
Assignee | ||
Comment 18•23 years ago
|
||
I am leaving this open as a reminder that I also need
to check it into the mozilla client branch of NSPR.
We have just missed the mozilla0.9.8 milestone. The
fix will be in the mozilla0.9.9 milestone.
Assignee | ||
Comment 19•23 years ago
|
||
Fix checked into NSPRPUB_PRE_4_2_CLIENT_BRANCH.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•