Closed Bug 710438 Opened 13 years ago Closed 12 years ago

Invalid read of size 8 with testcase at JSFlatString* JSRope::flattenInternal in Valgrind

Tracking

()

Status:

RESOLVED WORKSFORME

People

(Reporter: gkw, Unassigned)

Details

(Keywords: testcase, valgrind, Whiteboard: [valgrind bug])

Attachments

(5 files)

testcase 13 years ago Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now) 264 bytes, text/plain		Details
stack 13 years ago Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now) 3.62 KB, text/plain		Details
valgrind log of modified opt shell 13 years ago Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now) 3.51 KB, text/plain		Details
valgrind log of 32-bit opt shell, with --trace-redir=yes 13 years ago Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now) 205.52 KB, text/plain		Details
valgrind log of 64-bit opt shell, with --trace-redir=yes 13 years ago Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now) 214.24 KB, text/plain		Details

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Description

•

13 years ago

Attached file testcase — Details

The attached testcase shows a Valgrind error in js opt shell on m-c changeset 63bff373cb94 without any CLI arguments with an invalid read of size 8.

The following command was used:

valgrind --smc-check=all-non-file --dsymutil=yes ./js testcase.js

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 1

•

13 years ago

Attached file stack — Details

Valgrind stack

Luke Wagner [:luke]

Comment 2

•

13 years ago

Hmm, I can't repro on 10.6 or linux64 for the given cset.

But the valgrind spew is interesting!  It clearly points to the PodCopy for the right child of the rope |"(fun" + code|.  Now, 'code' is a compiler-produced atom, which agrees with where-was-this-allocated stack.  The interesting thing is that the atom length is 196 so, since valgrind says "Address 0x102274148 is 392 bytes inside a block of size 394 alloc'd" that means we are doing an 8 byte load starting at the terminating null char.  That means either (1) the length of the atom is wrong (by exactly 8 bytes...) or (2) memcpy() is running past the end.  (2) may very well be the case if memcpy() thinks it knows something about how malloc() works (which valgrind has replaced).

Even if I can't repro locally, there is a simple way to test if Gary would be willing to try something out: could you add a printf to JSRope::flattenInternal (js/src/vm/String.cpp:191)

         size_t len = right.length();
+        printf("right.d.u1.chars: %p, len: %d\n", (void *)right.d.u1.chars, (int)len);
         PodCopy(pos, right.d.u1.chars, len);

and print the new valgrind spew (that should include this printf)?

Nicholas Nethercote [inactive]

Comment 3

•

13 years ago

I can't repro on linux64 either.

> (2) memcpy() is running past the end.  (2) may very well be the
> case if memcpy() thinks it knows something about how malloc() works (which
> valgrind has replaced).

Many implementations of memcpy() read past the end of buffers in "safe" ways, so Valgrind replaces memcpy() and various variants with its own version as well :)  
And judging from valgrind/coregrind/m_redir.c if Valgrind can't replace memcpy() it'll abort.  So that suggests (1) is the cause.

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 4

•

13 years ago

> Even if I can't repro locally, there is a simple way to test if Gary would
> be willing to try something out: could you add a printf to
> JSRope::flattenInternal (js/src/vm/String.cpp:191)
> 
>          size_t len = right.length();
> +        printf("right.d.u1.chars: %p, len: %d\n", (void *)right.d.u1.chars,
> (int)len);
>          PodCopy(pos, right.d.u1.chars, len);
> 
> and print the new valgrind spew (that should include this printf)?

I patched m-c changeset 472b4a4ebea7 with this patch and got:

right.d.u1.chars: 0x1f99880, len: 196
==19697== Invalid read of size 8

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 5

•

13 years ago

Attached file valgrind log of modified opt shell — Details

Luke Wagner [:luke]

Comment 6

•

13 years ago

Unless I'm making a silly arithmetic error, comment 5 seems to indicate that memset is doing an 8 byte load past the end of the declared range (0x1f99880 + 2*196 = 0x1f99a08).  Nick, perhaps you have some insights?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 7

•

13 years ago

Attached file valgrind log of 32-bit opt shell, with --trace-redir=yes — Details

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 8

•

13 years ago

Attached file valgrind log of 64-bit opt shell, with --trace-redir=yes — Details

Luke Wagner [:luke]

Comment 9

•

13 years ago

Julian, Nick said you would be the one to ask about this valgrind issue:

What I *think* we have here is a case where a call to glibc memcpy is not getting redirected to valgrind's memcpy.  The reason I think this is that (as comment 5 and 6 explain) the bad load is happening outside the range we observe being passed to memcpy.  The log with --trace-redir=yes is in comment 7 and comment 8.  A naive grepping does not show any patching of the memcpy callsite in question (0x1847a3, nor any address containing 0x1847), although I don't really know if that is what the log contains... it seems to, though.

Perhaps you could advise us further?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 10

•

13 years ago

(In reply to Luke Wagner [:luke] from comment #9)
> Perhaps you could advise us further?

Julian, re-ping?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 11

•

13 years ago

> Julian, re-ping?

fwiw, I can't reproduce on Valgrind SVN tip on a separate machine running Lion. The testcase in comment 0 was run on Snow Leopard.

Julian Seward [:jseward]

Comment 12

•

13 years ago

(In reply to Luke Wagner [:luke] from comment #9)
> What I *think* we have here is a case where a call to glibc memcpy is not
> getting redirected to valgrind's memcpy.

Luke, I think this is a plausible explanation, and looking at the logs in
comments 7 and 8 seems to support that.  One question is why we have not
been flooded with complaints about this before.

Gary, can you tell me about the CPU in the machine you see the problem on?
Is it quite old?  And what OS version is it running?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 13

•

13 years ago

> Gary, can you tell me about the CPU in the machine you see the problem on?
> Is it quite old?  And what OS version is it running?

The CPU is a Intel Core 2 Duo P7550, is a mid-2009 Mac Mini running Mac OS X 10.6.8.

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 14

•

13 years ago

> The CPU is a Intel Core 2 Duo P7550, is a mid-2009 Mac Mini running Mac OS X
> 10.6.8.

I can still reproduce this with m-c changeset b0e65467c4c8 and Valgrind SVN r12325.

Julian Seward [:jseward]

Comment 15

•

13 years ago

See bug 715750 comment 2.

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 16

•

13 years ago

Duping to bug 715750, there is a potential patch for Valgrind in that bug.

Status: NEW → RESOLVED

Closed: 13 years ago

Resolution: --- → DUPLICATE

Julian Seward [:jseward]

Comment 17

•

12 years ago

On further investigation, this is different from bug 715750.  Reopening.

Status: RESOLVED → REOPENED

Resolution: DUPLICATE → ---

Julian Seward [:jseward]

Comment 18

•

12 years ago

I _believe_ this is now fixed in r12343.  I have smaller test case
which shows the same problem, and it runs cleanly on r12423.  Gary,
any chance you can re-test with the js shell?

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 19

•

12 years ago

(In reply to Julian Seward from comment #18)
> I _believe_ this is now fixed in r12343.  I have smaller test case
> which shows the same problem, and it runs cleanly on r12423.  Gary,
> any chance you can re-test with the js shell?

WFM now. Resolved FIXED by a known Valgrind patch. Thanks Julian!

Status: REOPENED → RESOLVED

Closed: 13 years ago → 12 years ago

Resolution: --- → FIXED

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Comment 20

•

12 years ago

WFM is a better resolution.

Resolution: FIXED → WORKSFORME

Whiteboard: js-triage-needed → valgrind bug

Gary Kwong [:gkw] [:nth10sd] (NOT official MoCo now)

Reporter

Updated

•

12 years ago

Whiteboard: valgrind bug → [valgrind bug]

You need to log in before you can comment on or make changes to this bug.