Last Comment Bug 809321 - Source and destination overlap in mempcy in nsMimeRebuffer.cpp
: Source and destination overlap in mempcy in nsMimeRebuffer.cpp
Status: RESOLVED FIXED
:
Product: MailNews Core
Classification: Components
Component: MIME (show other bugs)
: unspecified
: x86 All
: -- normal (vote)
: Thunderbird 19.0
Assigned To: ISHIKAWA, Chiaki
:
Mentors:
Depends on:
Blocks: 803816
  Show dependency treegraph
 
Reported: 2012-11-06 20:00 PST by ISHIKAWA, Chiaki
Modified: 2012-11-11 06:45 PST (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---


Attachments
Patch to replace memcpy usage with memmove (1.00 KB, patch)
2012-11-08 05:36 PST, ISHIKAWA, Chiaki
neil: review+
Details | Diff | Review
Rev 02 of proposed patch (622 bytes, patch)
2012-11-08 18:23 PST, ISHIKAWA, Chiaki
no flags Details | Diff | Review
Rev 03 of proposed patch (608 bytes, patch)
2012-11-09 07:00 PST, ISHIKAWA, Chiaki
neil: review+
Details | Diff | Review

Description ISHIKAWA, Chiaki 2012-11-06 20:00:12 PST
Found by valgrind run. See bug 803816 about mozmill run of TB.

Source version:
comm-central thunderbird.
$ hg identify
1016cef82fd8+ tip
ishikawa@debian-vm:~/TB-NEW/TB-3HG/new-src$ cd mozilla
ishikawa@debian-vm:~/TB-NEW/TB-3HG/new-src/mozilla$ hg identify
a517f7ea5bef+ tip

==3602== Source and destination overlap in memcpy(0x1c740088, 0x1c748088, 983289)
==3602==    at 0x4028E49: memcpy (mc_replace_strmem.c:878)
==3602==    by 0x5D855DA: MimeRebuffer::ReduceBuffer(unsigned int) (nsMimeRebuffer.cpp:68)
==3602==    by 0x5D8420F: nsMimeBaseEmitter::Write(nsACString_internal const&, unsigned int*) (nsMimeBaseEmitter.cpp:456)
==3602==    by 0x5D85E40: nsMimeHtmlDisplayEmitter::WriteBody(nsACString_internal const&, unsigned int*) (nsMimeHtmlEmitter.cpp:515)
==3602==    by 0x5D6D876: mime_output_fn(char const*, int, void*) (mimemoz2.cpp:936)
==3602==    by 0x5D5DFD4: MimeOptions_write(MimeDisplayOptions*, nsCString&, char const*, int, bool) (mimei.cpp:1746)
==3602==    by 0x5D5E559: MimeObject_write(MimeObject*, char const*, int, bool) (mimei.cpp:1781)
==3602==

Observation:

The line in question nsMimeRebuffer.cpp:68

>  memcpy(mBuf, mBuf+numBytes, (mSize - numBytes));

We should use memmove() instead as linux man states:
>       The  memcpy()  function  copies  n bytes from memory area src to memory
>       area dest.  The memory areas must not overlap.  Use memmove(3)  if  the
>       memory areas do overlap.

So the above should read:

   memmove(mBuf, mBuf + numBytes, (mSize - numBytes));

We can't rely on memcpy behave in the manner the original author of
this code assumed.

For example, see the problems discussed in  "Glibc change exposing bugs"
http://lwn.net/Articles/414467/

On some CPUs, now memcpy() sometimes does backward (er, downward if
you will) copying.  So this bug may be dormant for many users of
certain types of CPUs, but may cause problems on users of some CPU
types.

(Searching through bugzilla with memcpy(), I hit upon the following
entry:
Bug 692046 - ByteArray's move_or_copy calls memcpy on overlapping
buffers 
It may be an issue also.)


We should probably replace memcpy with memmove in one sweep :-)
(Seriously, I think the libraries are now so optimized using special
instructions that such changes do make sense from the security point
of view.)
Comment 1 :aceman 2012-11-07 03:36:30 PST
I find it interesting that ReduceBuffer never changes the real size of the buffer (PR_Realloc) but decreeses only mSize. mSize is then used in IncreaseBuffer as a base to find the new target size of the buffer via PR_Realloc. But mBuf may already be large enough. Does reallocate see what is the real size of mBuf and does nothing if size+mSize is smaller? There may be some C semantic thing going on behind the scenes that I do not see.
Comment 2 ISHIKAWA, Chiaki 2012-11-07 05:11:43 PST
(In reply to :aceman from comment #1)
> Does reallocate see what
> is the real size of mBuf and does nothing if size+mSize is smaller? There
> may be some C semantic thing going on behind the scenes that I do not see.
Interesting observation.

If PR_Realloc() is eventually passed to the bare realloc(), realloc() does remember the original alloced size. 

Whether it does nothing when the newly realloced size is smaller than the originally allocated size is, I think, is not clear from what follows: quoted from opengroup's posix spec.POSIX realloc 
http://pubs.opengroup.org/onlinepubs/009695399/functions/realloc.html

--- begin quote ---
The realloc() function shall change the size of the memory object pointed to by ptr to the size specified by size. The contents of the object shall remain unchanged up to the lesser of the new and old sizes. If the new size of the memory object would require movement of the object, the space for the previous instantiation of the object is freed. If the new size is larger, the contents of the newly allocated portion of the object are unspecified. If size is 0 and ptr is not a null pointer, the object pointed to is freed. If the space cannot be allocated, the object shall remain unchanged.
--- end quote ----

According to the spec,
 - There is nothing to stop an implementation to move the content up to the
   new smaller (size+mSize) by allocating the required size and copying the required data and freeing the original (larger allocation)
     (The detailed case of shrinking realloc is not 
   explicitly mentioned to allow much freedom on the side of 
   implementations, I think.)

 - The size up to the new smaller size, in the raised case "(size + mSize)" 
   would remain unchanged. etc. (even in the case of buffer being copied to a new area. In this case, the rest of the preallocated area can become undefined from the viewpoint of valgrind, especially if we need to assume that realloc() can move the area around to avoid fragmentation, etc.)

As for replacing memcpy with memmove(), I have no idea
how many of the following lines are risky and need to be changed.

ishikawa@debian-vm:~/TB-NEW/TB-3HG/new-src$ find . \( -name "*.c[p]*" -o -name "*.h" \) -print | xargs egrep memcpy  | wc
   1598    8531  161023

That is probably the point Linus raised in 
https://bugzilla.redhat.com/show_bug.cgi?id=638477#c46

The linux kernel categorically uses memmove for memcpy to avoid 
suprises (and I think some changing conditions that invalidate formerly valid assumptions about the buffer size, positions, etc., which the author of the original buggy sound application mentioned in the thread.)

I am re-compiling TB by replacing memcpy with memmove i the above line, and
will run make mozmill under valgrind and report the result. Nothing should break.
Comment 3 ISHIKAWA, Chiaki 2012-11-08 05:36:12 PST
Created attachment 679639 [details] [diff] [review]
Patch to replace memcpy usage with memmove

I have replaced the two memcpy() usages in the file, and
run mozbill under valgrind. The overlap warning does not show up any more.

Any idea about who should I ask for the review?
Comment 4 :aceman 2012-11-08 05:40:12 PST
I'd try jcranmer.
Comment 5 Joshua Cranmer [:jcranmer] 2012-11-08 09:52:51 PST
nsMimeRebuffer is basically used as a sliding window buffer for output into the pipe (to avoid the pipe's buffer filling up...). memcpy in IncreaseBuffer is therefore safe, since addBuf won't ever alias mBuf. It's only the ReduceBuffer that's unsafe.

[Ugh... isn't there a way to force pipes to have unlimited buffering capacity and eliminate this altogether?]
Comment 6 neil@parkwaycc.co.uk 2012-11-08 14:05:35 PST
Comment on attachment 679639 [details] [diff] [review]
Patch to replace memcpy usage with memmove

>@@ -35,17 +35,17 @@ MimeRebuffer::IncreaseBuffer(const char 
>-  memcpy(mBuf+mSize, addBuf, size);
>+  memmove(mBuf+mSize, addBuf, size); /* memcpy may not cut it */
It looks as if best style is to use memcpy when you believe you are copying dissimilar buffers and memmove when you know that the memory regions may overlap. So for this case I would prefer if you keep the memcpy. r=me with that fixed.

>@@ -60,17 +60,17 @@ MimeRebuffer::ReduceBuffer(uint32_t numB
>-  memcpy(mBuf, mBuf+numBytes, (mSize - numBytes));
>+  memmove(mBuf, mBuf+numBytes, (mSize - numBytes)); /* memcpy does not cut it here */
[By not using memmove "unnecessarily" this then shows that this is a necessary use of memmove, which, plus the version control history, should suffice to make the comment redundant, but I won't ask you to remove it if you don't want to.]
Comment 7 ISHIKAWA, Chiaki 2012-11-08 18:23:05 PST
Created attachment 679938 [details] [diff] [review]
Rev 02 of proposed patch

Tried to incoporate the feedback.
The first use of memcpy() is restored.

The memmove() line is changed to have a slightly changed comment.
I would rather keep a comment here to aler the reader(s) and future maintainers.
[Does everyone always check the revision history to see how a particular line came about and was written the way it is? I doubt it. Better put a critical info as comment nearby IMHO.] 

In that sense, I am tempted to put "Non overlapping, thus memcpy is OK"
near the first use of memcpy: but that may be too much to the regular developer/maintainer of TB and FF.
To think, we can't say for sure for the other uses of memcpy(), I am more inclined to think Linus's wholesale approach of using memmove under the hood makes sense, but I digress.

(In any case, as valgrind uncovers such unsafe uses, we should fix them. Problem is I now realize TB lacks enough test cases. It is very sparse IMHO. We may have a lot of mundane bugfixes in the future.)

TIA
Comment 8 Joshua Cranmer [:jcranmer] 2012-11-08 19:00:16 PST
(In reply to ISHIKAWA, chiaki from comment #7)
> The memmove() line is changed to have a slightly changed comment.
> I would rather keep a comment here to aler the reader(s) and future
> maintainers.
> [Does everyone always check the revision history to see how a particular
> line came about and was written the way it is? I doubt it. Better put a
> critical info as comment nearby IMHO.]

The distinction between memmove and memcpy is big enough, as Neil mentions, that people would likely clue in to the former being used as being significant, and most people are decent enough to not change it back unless they were prepared to come up with good reasons.

> In that sense, I am tempted to put "Non overlapping, thus memcpy is OK"
> near the first use of memcpy: but that may be too much to the regular
> developer/maintainer of TB and FF.

memcpy is more often safe than not. I'd rather default to a memcpy mindset and use memmove only when necessary. Besides, we have more important lurking vulnerabilities than using memcpy instead of memmove.
Comment 9 ISHIKAWA, Chiaki 2012-11-08 19:40:25 PST
What about a simple '/* overlapping */' comment then?

Yes, I agree that we have more things to worry about.

That is why I agree with Linus to change memcpy to memmove in one fell sweep, but again that is a philosophical point about which I suppose we agree to disagree.
Comment 10 ISHIKAWA, Chiaki 2012-11-09 07:00:12 PST
Created attachment 680070 [details] [diff] [review]
Rev 03 of proposed patch

The patch with the shorter comment.
Comment 11 Ryan VanderMeulen [:RyanVM] 2012-11-09 12:53:38 PST
https://hg.mozilla.org/comm-central/rev/d2dc1975301f

Thanks for the patch, Ishikawa-san! One request - please make sure your hg is configured to generate all the necessary metadata needed for checkin, including a commit message. It makes life easier for those pushing on your behalf and makes sure that your name shows up in the history consistently. Thanks!
https://developer.mozilla.org/en-US/docs/Creating_a_patch_that_can_be_checked_in
Comment 12 ISHIKAWA, Chiaki 2012-11-11 06:45:27 PST
Thank you for checking in the patch.
Come to think of it, I may have created the patch on newly installed 64 bits linux and did not configured hg in the new installation. Thank you again.

Note You need to log in before you can comment on or make changes to this bug.