bugzilla.mozilla.org has resumed normal operation. Attachments prior to 2014 will be unavailable for a few days. This is tracked in Bug 1475801.
Please report any other irregularities here.

CopyHostent causes a bus error on IRIX gcc/debug build

RESOLVED FIXED in 4.2

Status

NSPR
NSPR
P1
normal
RESOLVED FIXED
17 years ago
7 months ago

People

(Reporter: Nick Blievers, Assigned: Wan-Teh Chang)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(3 attachments)

(Reporter)

Description

17 years ago
Using gcc-2.95.2 and mozilla 0.9.7 (branch) with debugging enabled I get a bus
error on startup.
Here is the stack trace:
(gdb) bt
#0  MakeIPv4MappedAddr (v4=0x107ae47c
"òÄ\002Oòx\236¨\211\020«mð~µ°À\204ЯÌð\212Ñ\002DÈêD\206\213%N\2047«Ø\004/L\2240y¢j[\013\021\026\ntyBCã\221Y\026",
v6=0x107ae46c "") at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:254
#1  0x4139a10 in CopyHostent (from=0x101afe20, buf=0x101afe3c,
bufsize=0x101afe40, conversion=_PRIPAddrIPv4Mapped, to=0x107ae400) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:362
#2  0x413ae24 in PR_GetIPNodeByName (name=0x1077efe8 "dingo.adacel.com.au",
af=64032, flags=0, buf=0x107ae47c
"òÄ\002Oòx\236¨\211\020«mð~µ°À\204ЯÌð\212Ñ\002DÈêD\206\213%N\2047«Ø\004/L\2240y¢j[\013\021\026\ntyBCã\221Y\026",
bufsize=920, hp=0x107ae400) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/misc/prnetdb.c:680
#3  0x45ef3e4 in nsDNSService::Run (this=0x100aeea8) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/netwerk/dns/src/nsDnsService.cpp:833
#4  0x5fe4ba8c in nsThread::Main (arg=0x0) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/xpcom/threads/nsThread.cpp:120
#5  0x414e474 in _pt_root (arg=0x0) at
/projects/sise/mozilla/devel/workpits/moz/0.9.7_branch_gcc/mozilla/nsprpub/pr/src/pthreads/ptthread.c:214
#6  0xc21b8ac in _SGIPT_pt_start () at pt.c:793

After adding some debug statements, I came to the conclusion that the problem
lies with the macro _PR_IN6_IS_ADDR_V4MAPPED(a), where a is a  PRIPv6Addr *.
Accessing the elements of union before passing to the Macro works fine (either
as a char * or as the struct), as can be seen from this debug output:
v6->pr_s6_addr32[0] = 0
v6->pr_s6_addr32[1] = 0
v6->pr_s6_addr32[2] = 65535
v6->pr_s6_addr32[3] = -1062716409
v6->pr_s6_addr16[0] = 0
v6->pr_s6_addr16[1] = 0
v6->pr_s6_addr16[2] = 0
v6->pr_s6_addr16[3] = 0
v6->pr_s6_addr16[4] = 0
v6->pr_s6_addr16[5] = 65535
v6->pr_s6_addr16[6] = 49320
v6->pr_s6_addr16[7] = 15367
v6->pr_s6_addr[0] = 0
v6->pr_s6_addr[1] = 0
v6->pr_s6_addr[2] = 0
v6->pr_s6_addr[3] = 0
v6->pr_s6_addr[4] = 0
v6->pr_s6_addr[5] = 0
v6->pr_s6_addr[6] = 0
v6->pr_s6_addr[7] = 0
v6->pr_s6_addr[8] = 0
v6->pr_s6_addr[9] = 0
v6->pr_s6_addr[10] = 255
v6->pr_s6_addr[11] = 255
v6->pr_s6_addr[12] = 192
v6->pr_s6_addr[13] = 168
v6->pr_s6_addr[14] = 60
v6->pr_s6_addr[15] = 7
v6[0] = 0
v6[1] = 0
v6[2] = 0
v6[3] = 0
v6[4] = 0
v6[5] = 0
v6[6] = 0
v6[7] = 0
v6[8] = 0
v6[9] = 0
v6[10] = 255
v6[11] = 255
v6[12] = 192
v6[13] = 168
v6[14] = 60
v6[15] = 7
PR_ASSERT(_PR_IN6_IS_ADDR_V4MAPPED(((PRIPv6Addr *) v6)));
nsWidget::~nsWidget() of toplevel: 16 widgets still exist.
moz_run_program[36]: 164808649 Bus error(coredump)
(Reporter)

Comment 1

17 years ago
Created attachment 64775 [details] [diff] [review]
changes individual comparisions to a memcmp

nsprpub/pr/src/misc/prnetdb.c - This is only really needed when using
gcc on mips, however, I cannot find a define under nsprpub to set this,
however it doesn't hurt for both MipsPro and gcc.
(Reporter)

Comment 2

17 years ago
I should also mention that the stack trace is a little misleading, I suspect due
to optimisations, in that the line number IS correct, however the actual function
where it dies is MakeIPv4MappedAddr(), not CopyHostent(), altough 
MakeIPv4MappedAddr() IS called from CopyHostent().
(Assignee)

Comment 3

17 years ago
Can you print the value of v6 in the debugger?
I bet that it is not a multiple of 4.

I believe it is a bug for the MakeIPv4MappedAddr
function to cast a char* to a PRIPv6Addr*.  Something
along the line of your fix should be used on all
platforms.
(Reporter)

Comment 4

17 years ago
gdb) print *v6
$5 = 0 '\000'
(gdb) print *((PRIPv6Addr *) v6)
$6 = {_S6_un = {_S6_u8 = "\000\000\000\000\000\000\000\000\000\000ÿÿÀ¨<\a",
_S6_u16 = {0, 0, 0, 0, 0, 65535, 49320, 15367}, 
    _S6_u32 = {0, 0, 65535, 3232250887}, _S6_u64 = {0, 281473913994247}}}


The wierd thing is that I created a simple test case, which fairly closly mimics
the behaviour (I can post if you like), but could not re-create the crash.
Summary: CopyHostent causes a bus error on IRIX → CopyHostent causes a bus error on IRIX gcc/debug build
(Assignee)

Comment 5

17 years ago
Sorry, I wasn't clear.  I meant v6, not *v6.  I want
the address.

In any case, I know what's wrong.

The easiest fix is to remove the PR_ASSERT statements
from MakeIPv4MappedAddr() and MakeIPv4CompatAddr()
in mozilla/nsprpub/pr/src/misc/prnetdb.c.  These two
PR_ASSERT statements incorrectly cast char* to
PRIPv6Addr*.  (PRIPv6Addr must be 4-byte or 8-byte
aligned.)  These two PR_ASSERT statements are there
to verify the correctness of the functions and are
not that useful.  (These two functions are simple.)

Of course, something like your fix would also work.

We can also fix this by making the casts legal.  This
would require allocating the char* with an appropriate
alignment.  I am not sure that this is necessary.
Status: UNCONFIRMED → ASSIGNED
Ever confirmed: true
Summary: CopyHostent causes a bus error on IRIX gcc/debug build → CopyHostent causes a bus error on IRIX
(Reporter)

Comment 6

17 years ago
Oops, sorry.
(gdb) print v6
$7 = 0x107ae46c ""
Summary: CopyHostent causes a bus error on IRIX → CopyHostent causes a bus error on IRIX gcc/debug build
(Assignee)

Comment 7

17 years ago
Hmm... 0x107ae46c is a multiple of 4.  This does not support
my theory that this is an alignment problem.

One last try -- could you print (PRIPv6Addr *) v6?  I believe
that on IRIX, PRIPv6Addr is 8-byte aligned.  So it's possible
that the cast of a 4-byte aligned pointer to an 8-byte aligned
pointer is still problematic.
(Reporter)

Comment 8

17 years ago
(gdb) print (PRIPv6Addr *) v6
$8 = (struct PRIPv6Addr *) 0x107ae46c

Which is NOT divisible by 8! :)

However, when I added debug statements, (the output is shown above), I could 
access all the elements via the struct (eg I had 
PRIPv6Addr *ptr = (PRIPv6Addr *)v6;
and printed all the elements)


I guess it might be better to leave the asserts... a bit of extra checking 
doesn't (well, shouldn't! ;-) )hurt, so maybe if we remove the #ifdef IRIX 
section and change them all to memcmp()'s? or would you rather keep it IRIX
specific?
(Reporter)

Comment 9

17 years ago
I would agree that it looks like an alignment problem tho, as when I changed
the macro to only check the first and the eigth bytes it worked, adding any
others caused problems.
(Assignee)

Comment 10

17 years ago
Could you show me how you changed the macro?

Any solution we come up with should not have
#ifdef IRIX.  This problem may affect other
platforms.

We can leave the assertions but rewrite them
to not do any type casts, but don't you think
that the MakeIPv4MappedAddr() and MakeIPv4CompatAddr()
functions are so simple that their correctness
is self-evident? :-)
(Reporter)

Comment 11

17 years ago
I tried quite a few variations... but here are a few.
The original (which dumps):
#define _PR_IN6_IS_ADDR_V4MAPPED(a)                     \
                (((a)->pr_s6_addr32[0] == 0)    &&      \
		((a)->pr_s6_addr32[1] == 0)     &&      \
		((a)->pr_s6_addr[8] == 0)       &&      \
		((a)->pr_s6_addr[9] == 0)       &&      \
		((a)->pr_s6_addr[10] == 0xff)   &&      \
		((a)->pr_s6_addr[11] == 0xff))

This one worked:
#define _PR_IN6_IS_ADDR_V4MAPPED(a)                     \
                (((a)->pr_s6_addr32[0] == 0)    &&      \
		((a)->pr_s6_addr[8] == 0)

This one didn't:
#define _PR_IN6_IS_ADDR_V4MAPPED(a)
	(((a)->pr_s6_addr32[1] == 0)))

Do you need some more?

I agree that MakeIPv4MappedAddr() and MakeIPv4CompatAddr() are simple in the 
extreme... and could do without the assertions, however these macros are also 
used in some other functions:
PR_GetHostByAddr()
PR_IsNetAddrType()
and maybe others. So, wont they have the same problem? or is it only likely to
occur when cast from char *? 
(Reporter)

Comment 12

17 years ago
This one doesn't work either:
#define _PR_IN6_IS_ADDR_V4MAPPED(a)                     \
                (((a)->pr_s6_addr32[0] == 0)    && \
		((a)->pr_s6_addr[9] == 0)

(Reporter)

Comment 13

17 years ago
Created attachment 64778 [details] [diff] [review]
Changes both macro's to use memcmp, and removes #ifdef IRIX

This makes attachment 64775 [details] [diff] [review] obsolete.
(Reporter)

Comment 14

17 years ago
Created attachment 64863 [details] [diff] [review]
removes casts aswell as changing macros to use memcmp

obsolete's attachment 64775 [details] [diff] [review] and attachment 64778 [details] [diff] [review]
(Reporter)

Comment 15

17 years ago
Is this the right direction? Can you please review this patch.

Thanks! :)
(Assignee)

Comment 16

17 years ago
I decided to just delete the two assertions that contained
the illegal type casts.  I checked in the fix into the tip
of NSPR.
Priority: -- → P1
Target Milestone: --- → 4.2
(Reporter)

Comment 17

17 years ago
Okay, that will work to! :)

Shall we close this then?
(Assignee)

Comment 18

17 years ago
I am leaving this open as a reminder that I also need
to check it into the mozilla client branch of NSPR.

We have just missed the mozilla0.9.8 milestone.  The
fix will be in the mozilla0.9.9 milestone.
(Assignee)

Comment 19

17 years ago
Fix checked into NSPRPUB_PRE_4_2_CLIENT_BRANCH.
Status: ASSIGNED → RESOLVED
Last Resolved: 17 years ago
Resolution: --- → FIXED

Updated

7 months ago
See Also: → bug 1425543
You need to log in before you can comment on or make changes to this bug.