Closed Bug 984358 Opened 9 years ago Closed 8 years ago

crash in nsPNGEncoder::ConvertHostARGBRow(unsigned char const*, unsigned char*, unsigned int, bool)

Categories

(Core :: Graphics, defect)

30 Branch
All
Windows NT
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla31
Tracking Status
firefox29 --- unaffected
firefox30 + fixed
firefox31 + fixed
firefox-esr24 --- unaffected
b2g-v1.4 --- fixed
b2g-v2.0 --- fixed

People

(Reporter: tracy, Assigned: jwatt)

References

Details

(5 keywords, Whiteboard: [fixed by bug 983745][qa-])

Crash Data

Attachments

(1 file, 2 obsolete files)

This bug was filed from the Socorro interface and is 
report bp-2df67b5e-ff59-43b1-8399-cc4c52140316.
=============================================================


This has climbed into Nightly top 5 crashes by volume. It seems to have regressed in this range: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=fc9947c00b51&tochange=41d962d23e81

Though just 3 crashes reported build from 20140316 (which is back to pre-regression volume), this may already have been corrected. Have to wait to see what volume is like today.  http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=82c90c17fc95&tochange=e3b76b155ca4
Keywords: topcrash-win
Hmm, in both ranges I do not see anything about PNG code (by a simple "find in page" for "PNG"). :(
nsPNGEncoder.cpp contains this:

if (alpha == 0) {
    pixelOut[0] = pixelOut[1] = pixelOut[2] = pixelOut[3] = 0

It's possible that pixelOut3[3] is outside of the allocated memory,
if (!aUseTransparency).

The fix is simple.  I'm testing a patch now.
Assignee: nobody → glennrp+bmo
Typo in my comment #2: pixelOut3[3] should be pixelOut[3]
Flags: needinfo?(ryanvm)
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #1)
> Hmm, in both ranges I do not see anything about PNG code (by a simple "find
> in page" for "PNG"). :(

I don't see anything either that seems to be related to the image encoder.
The bug that the v00 patch fixes has been there for quite a while; it's already
present in hg revision 1, March 2007.

Maybe it just happened that the crash rate increased with some external
event like pwn2own.
(In reply to Glenn Randers-Pehrson from comment #5)
> Maybe it just happened that the crash rate increased with some external
> event like pwn2own.

I wouldn't assume something like that to hit specifically Nightly. That said, it's entirely possible some other code change just made us exercise the code path that leads to this crash. Thanks for jumping on it so fast!
Status: NEW → ASSIGNED
See Also: → 614144
https://tbpl.mozilla.org/?tree=Try&rev=431af5b74136

Possible to write a test for this (not sure if the one from bug 614144) still reproduces)?
Flags: needinfo?(ryanvm)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #7)
> https://tbpl.mozilla.org/?tree=Try&rev=431af5b74136
> 
> Possible to write a test for this (not sure if the one from bug 614144)
> still reproduces)?

I don't know how to reproduce the bug.  There is a locked bug that I
can't see that may have a crash test.  I don't want to mention that bug
number here, but I think it could be unlocked since apparently a fix
was applied several months ago (a bandaid that doesn't work well, I guess,
because the crashes still happen, although you can't trigger them by
the same method now).  I'll send you the bug number in a PM.
I don't think the pixelOut[3] patch is going to fix the crash. The addresses in crash reports are all of the form xxxxx003, and since protection is per-page, if pixelOut[3] is bad memory, then so are pixelOut[2] etc.

The actual crash occurs at this line during the "dereference" of uint32_t& pixelIn:
    uint8_t alpha = (pixelIn & 0xff000000) >> 24;

The compiler sees the bit trick and actually grabs the upper byte directly, resulting in the ...003 offsets. But really all four bytes of pixelIn have exceeded a page boundary.
Group: core-security
Keywords: sec-high
@lizzard, I don't think this bug has anything to do with bug #982065 which is about a problem with color management.  They both happen to involve PNG but that's all.
Comment on attachment 8392329 [details] [diff] [review]
v00 Fix potential memory overwrite in nsPNGEncoder.cpp

Tryserver results all green except for one Windows XP reftest that failed, probably for an unrelated reason.  r?
Attachment #8392329 - Flags: review?(jmuizelaar)
Comment on attachment 8392329 [details] [diff] [review]
v00 Fix potential memory overwrite in nsPNGEncoder.cpp

Review of attachment 8392329 [details] [diff] [review]:
-----------------------------------------------------------------

Please see comment 9, I don't think this patch is going to work.
(In reply to David Major [:dmajor] (Away March 22 to April 2) from comment #12)
> Please see comment 9, I don't think this patch is going to work.

Yes I agree.  The destination array is always width*4 so whether transparency is
on or off, setting pixelOut[3] = 0 is safe, even though it'll be immediately
overwritten by pixelOut[0] for the next pixel.  So we're left with the possibility
of the source array somehow being smaller than width*stride, but I can't find
where that might be happening.
Attachment #8392329 - Flags: review?(jmuizelaar)
This crash signature is showing up in the top 10 crashes for Firefox 31, with 38 out of 1714 crashes.
I looked at a few of the crash reports and they all show the PNG encoder being called by the ICO encoder.  So it may be that the buffer allocated by nsICOEncoder() at line 120 is too small.  I can't figure out by reading the code what is the size obtained by mContainedEncoder->GetImageBufferUsed(&PNGImageBufferSize);
This patch raises an ASSERT if the PNG buffer allocated in the ICO encoder is too small, or just fixes it in non-debug runs.  I haven't yet found a way to exercise this bug, so the patch is based only on code analysis.
Attachment #8392329 - Attachment is obsolete: true
(In reply to Glenn Randers-Pehrson from comment #15)
> I can't figure out by reading the code what is the size obtained by
> mContainedEncoder->GetImageBufferUsed(&PNGImageBufferSize);

nsPNGEncoder's mImageBufferUsed member (which is what GetImageBufferUsed returns) is adjusted in nsPNGEncoder::WriteCallback [1], so will depend on how much of the buffer is currently used.

[1] http://mxr.mozilla.org/mozilla-central/source/image/encoders/png/nsPNGEncoder.cpp#694
Comment on attachment 8394774 [details] [diff] [review]
v01 Fix potential memory overwrite in nsICOEncoder.cpp

Marking the v01 patch obsolete.  It's incorrect because it doesn't account for the data buffer being allowed to grow after InitFromData has occurred.
Attachment #8394774 - Attachment is obsolete: true
As far as I can tell, the buffer that may be too small is established in AsycEncodeAndWriteIcon::run()
At least that seems to be the case in the crash report linked above.
Assignee: glennrp+bmo → nobody
I can't do much more with this since the problem seems to be upstream from the PNG encoder and I don't do windows.
Returned assignment to "nobody".
Status: ASSIGNED → NEW
The overrun stems from AsyncFaviconDataReady::OnComplete on High-DPI machines. Previously this was "accidentally safe" but the problem was brought to light by: http://hg.mozilla.org/mozilla-central/rev/c25dfac7ab7b#l1.105

Starting from line 1.105 of the diff, imagine that imageSurface is a 16x16 favicon, and say the GetSystemMetrics calls return 24x24 for a 150% DPI machine. We create a buffer for a 16x16 image and tell the new AsyncEncodeAndWriteIcon to write 24x24 worth of data to it.

Previously, due to the awkward mix-and-match of imageSurface->Stride() and size.width, we would have bailed out at this sanity check: http://hg.mozilla.org/mozilla-central/file/7bacc9e903b0/image/encoders/ico/nsICOEncoder.cpp#l58
Flags: needinfo?(jwatt)
Actually that should say "read 24x24 worth of data" instead of "write 24x24 worth of data".
Attached patch patchSplinter Review
Thanks for pointing out the bug, dmajor. Seems there's been quite a few commits-on-commits of brokenness there, but I should have spotted that.
Flags: needinfo?(jwatt)
Attachment #8402367 - Flags: review?(matt.woodrow)
Assignee: nobody → jwatt
Keywords: regression
Attachment #8402367 - Flags: review?(matt.woodrow) → review+
Comment on attachment 8402367 [details] [diff] [review]
patch

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
Unknown.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
I don't think so.

Which older supported branches are affected by this flaw?
This only affects Nightly and Aurora.

If not all supported branches, which bug introduced the flaw?
Bug 981430.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Yes.

How likely is this patch to cause regressions; how much testing does it need?
Very unlikely. This is triggered when the code that writes a favicon to disk hits an error condition that would previously have caused us to abort writing to disk as explained in comment 21. We will now scale the favicon to the correct size and write it to disk as normal.
Attachment #8402367 - Flags: sec-approval?
Comment on attachment 8402367 [details] [diff] [review]
patch

sec-approval+ for trunk. Please make and nominate an Aurora patch as well so we can avoid shipping the original problem.
Attachment #8402367 - Flags: sec-approval? → sec-approval+
Comment on attachment 8402367 [details] [diff] [review]
patch

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 981430
User impact if declined: security issue
Testing completed (on m-c, etc.): not landed yet, since I'd prefer to land trunk and aurora together
Risk to taking this patch (and alternatives if risky): low risk
String or IDL/UUID changes made by this patch: none
Attachment #8402367 - Flags: approval-mozilla-aurora?
Al, do you mean get separate sec-approval for aurora? The trunk patch applies to aurora without modification, so no need for a separate patch.
Flags: needinfo?(abillings)
Comment on attachment 8402367 [details] [diff] [review]
patch

Sec-approval just applies to trunk. We then do normal approval for branch patches. I've approved the Aurora patch but we should see it land and things go green on trunk before checkin.
Attachment #8402367 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Flags: needinfo?(abillings)
Bug 983745 has fixed the issue (on Trunk) in a different way (by not bothering to scale the icon, as opposed to scaling it correctly as the patch here does). We should probably just land that patch on Aurora. I've also filed bug 993302 to discuss whether the scaling should be restored or not.
I've requested aurora approval for the patch over in that bug. Can someone make sure it gets it?
Flags: needinfo?(abillings)
Done.
Flags: needinfo?(abillings)
The patch for bug 983745 has now landed on aurora.
Fixed by bug 983745.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Whiteboard: [fixed by bug 983745]
Target Milestone: --- → mozilla31
Marking [qa-] for verification, unless we get a test case.
Whiteboard: [fixed by bug 983745] → [fixed by bug 983745][qa-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.