Closed Bug 845125 Opened 11 years ago Closed 11 years ago

Mac: Crash when printing csptesting.herokuapp.com to PDF w/ heap-use-after-free

Categories

(Core :: Graphics, defect)

x86
macOS
defect
Not set
critical

Tracking

()

RESOLVED FIXED
mozilla23
Tracking Status
firefox20 --- unaffected
firefox21 + fixed
firefox22 + fixed
firefox23 + fixed
firefox-esr17 --- unaffected
b2g18 --- unaffected

People

(Reporter: mwobensmith, Assigned: milan)

References

Details

(6 keywords, Whiteboard: [asan][adv-main21-])

Attachments

(3 files, 8 obsolete files)

1.80 KB, patch
Details | Diff | Splinter Review
27.42 KB, patch
jrmuizel
: review+
abillings
: sec-approval+
Details | Diff | Splinter Review
953 bytes, patch
jrmuizel
: review+
Details | Diff | Splinter Review
1. Go to this URL:

http://csptesting.herokuapp.com/

2. Without pressing that button, select Print > PDF > Save as PDF and save.

Result:
Crash, see ASan log below.

Does not appear to crash in daily dbg build, only ASan.

Marking s-s only because it's a crash.



==4374== ERROR: AddressSanitizer: heap-use-after-free on address 0x0001531da040 at pc 0x10000a520 bp 0x7fff5fbfbf50 sp 0x7fff5fbfbf28
WRITE of size 1 at 0x0001531da040 thread T0
    #0 0x10000a51f in 0x20000a51f
    #1 0x7fff8402b99e in CGAccessSessionGetBytes (in CoreGraphics) + 134
    #2 0x17908cdef in unpackImageRow (in libPDFRIP.A.dylib) + 71
    #3 0x17908cd28 in PDFImageEmitData (in libPDFRIP.A.dylib) + 1513
    #4 0x17908daec in PDFImageEmitDefinition (in libPDFRIP.A.dylib) + 428
    #5 0x7fff85ce6e51 in __CFSetApplyFunction_block_invoke_0 (in CoreFoundation) + 17
    #6 0x7fff85bfbd7f in CFBasicHashApply (in CoreFoundation) + 127
    #7 0x7fff85bfbced in CFSetApplyFunction (in CoreFoundation) + 157
    #8 0x17908e7e4 in PDFImageSetEmitDefinitions (in libPDFRIP.A.dylib) + 58
    #9 0x179087ce9 in emit_page_resources(PDFDocument*) (in libPDFRIP.A.dylib) + 103
    #10 0x179087c60 in PDFDocumentEndPage (in libPDFRIP.A.dylib) + 72
    #11 0x1790867f6 in pdf_EndPage (in libPDFRIP.A.dylib) + 16
    #12 0x7fff8b4e3b8a in pdfSpoolingEndPage(void*, void*) (in PrintCore) + 544
    #13 0x7fff8b4dae9e in PJCEndPage(OpaquePMPrintSession*) (in PrintCore) + 42
    #14 0x7fff8b4b8ce0 in PMSessionEndPageNoDialog (in PrintCore) + 79
    #15 0x1062b9e6a in nsDeviceContextSpecX::EndPage() (in XUL) + 42
    #16 0x104270816 in nsDeviceContext::EndPage() (in XUL) + 182
    #17 0x1045b4175 in nsSimplePageSequenceFrame::DoPageEnd() (in XUL) + 181
    #18 0x10571c381 in nsPrintEngine::PrintPage(nsPrintObject*, bool&) (in XUL) + 3201
    #19 0x1057233ca in nsPagePrintTimer::Run() (in XUL) + 186
    #20 0x106e109db in nsThread::ProcessNextEvent(bool, bool*) (in XUL) + 2139
    #21 0x106d51a3e in NS_ProcessPendingEvents_P(nsIThread*, unsigned int) (in XUL) + 254
    #22 0x1062dc1c3 in nsBaseAppShell::NativeEventCallback() (in XUL) + 451
    #23 0x10625901a in nsAppShell::ProcessGeckoEvents(void*) (in XUL) + 490
    #24 0x7fff85bfc100 in __CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__ (in CoreFoundation) + 16
    #25 0x7fff85bfbaec in __CFRunLoopDoSources0 (in CoreFoundation) + 444
    #26 0x7fff85c1edc4 in __CFRunLoopRun (in CoreFoundation) + 788
    #27 0x7fff85c1e6b1 in CFRunLoopRunSpecific (in CoreFoundation) + 289
    #28 0x7fff8295a0a3 in RunCurrentEventLoopInMode (in HIToolbox) + 208
    #29 0x7fff82959d83 in ReceiveNextEventCommon (in HIToolbox) + 165
    #30 0x7fff82959cd2 in BlockUntilNextEventMatchingListInMode (in HIToolbox) + 61
    #31 0x7fff88131612 in _DPSNextEvent (in AppKit) + 684
    #32 0x7fff88130ed1 in -[NSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] (in AppKit) + 127
    #33 0x106257865 in -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:] (in XUL) + 245
    #34 0x7fff88128282 in -[NSApplication run] (in AppKit) + 516
    #35 0x106259bf9 in nsAppShell::Run() (in XUL) + 185
    #36 0x105e0c587 in nsAppStartup::Run() (in XUL) + 311
    #37 0x103be309f in XREMain::XRE_mainRun() (in XUL) + 4287
    #38 0x103be4057 in XREMain::XRE_main(int, char**, nsXREAppData const*) (in XUL) + 599
    #39 0x103be4512 in XRE_main (in XUL) + 146
    #40 0x1000028a8 in 0x2000028a8
    #41 0x100001abd in 0x200001abd
    #42 0x100001043 in 0x200001043
    #43 0x0 in 0x0000000100000000 (in firefox-bin)
0x0001531da040 is located 0 bytes inside of 11007-byte region [0x0001531da040,0x0001531dcb3f)
freed by thread T0 here:
    #0 0x10000ef88 in 0x20000ef88
    #1 0x10000e602 in 0x20000e602
    #2 0x106f1a54e in gfxAlphaBoxBlur::~gfxAlphaBoxBlur() (in XUL) + 78
    #3 0x104301c27 in nsCSSRendering::PaintBoxShadowOuter(nsPresContext*, nsRenderingContext&, nsIFrame*, nsRect const&, nsRect const&) (in XUL) + 3911
    #4 0x10435a467 in nsDisplayBoxShadowOuter::Paint(nsDisplayListBuilder*, nsRenderingContext*) (in XUL) + 535
    #5 0x10429904f in mozilla::FrameLayerBuilder::DrawThebesLayer(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*) (in XUL) + 3263
    #6 0x106fd17e3 in mozilla::layers::BasicThebesLayer::PaintThebes(gfxContext*, mozilla::layers::Layer*, void (*)(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*), void*, mozilla::layers::ReadbackProcessor*) (in XUL) + 3635
    #7 0x106fb59f6 in mozilla::layers::BasicLayerManager::PaintSelfOrChildren(mozilla::layers::PaintContext&, gfxContext*) (in XUL) + 1142
    #8 0x106fb520e in mozilla::layers::BasicLayerManager::PaintLayer(gfxContext*, mozilla::layers::Layer*, void (*)(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*), void*, mozilla::layers::ReadbackProcessor*) (in XUL) + 3806
    #9 0x106fb5881 in mozilla::layers::BasicLayerManager::PaintSelfOrChildren(mozilla::layers::PaintContext&, gfxContext*) (in XUL) + 769
    #10 0x106fb520e in mozilla::layers::BasicLayerManager::PaintLayer(gfxContext*, mozilla::layers::Layer*, void (*)(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*), void*, mozilla::layers::ReadbackProcessor*) (in XUL) + 3806
    #11 0x106fb5881 in mozilla::layers::BasicLayerManager::PaintSelfOrChildren(mozilla::layers::PaintContext&, gfxContext*) (in XUL) + 769
    #12 0x106fb520e in mozilla::layers::BasicLayerManager::PaintLayer(gfxContext*, mozilla::layers::Layer*, void (*)(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*), void*, mozilla::layers::ReadbackProcessor*) (in XUL) + 3806
    #13 0x106fb33e6 in mozilla::layers::BasicLayerManager::EndTransactionInternal(void (*)(mozilla::layers::ThebesLayer*, gfxContext*, nsIntRegion const&, nsIntRegion const&, void*), void*, mozilla::layers::LayerManager::EndTransactionFlags) (in XUL) + 2454
    #14 0x104350d66 in nsDisplayList::PaintForFrame(nsDisplayListBuilder*, nsRenderingContext*, nsIFrame*, unsigned int) const (in XUL) + 2582
previously allocated by thread T0 here:
    #0 0x10000ed7c in 0x20000ed7c
    #1 0x7fff8bc26152 in malloc_zone_malloc (in libsystem_c.dylib) + 70
    #2 0x7fff8bc26ba6 in malloc (in libsystem_c.dylib) + 40
    #3 0x7fff8a04a346 in operator new(unsigned long) (in libstdc++..dylib) + 33
    #4 0x7fff8a04a3da in operator new[](unsigned long) (in libstdc++..dylib) + 8
    #5 0x1077597c7 in mozilla::gfx::AlphaBoxBlur::AlphaBoxBlur(mozilla::gfx::Rect const&, mozilla::gfx::IntSize const&, mozilla::gfx::IntSize const&, mozilla::gfx::Rect const*, mozilla::gfx::Rect const*) (in XUL) + 3223
    #6 0x106f1a9d7 in gfxAlphaBoxBlur::Init(gfxRect const&, nsIntSize const&, nsIntSize const&, gfxRect const*, gfxRect const*) (in XUL) + 1111
Shadow byte and word:
  0x10002a63b408: fd
  0x10002a63b408: fd fd fd fd fd fd fd fd
long double restrictunsigned __int128::* shadow bytes:
  0x10002a63b3e8: fa fa fa fa fa fa fa fa
  0x10002a63b3f0: fa fa fa fa fa fa fa fa
  0x10002a63b3f8: fa fa fa fa fa fa fa fa
  0x10002a63b400: fa fa fa fa fa fa fa fa
=>0x10002a63b408: fd fd fd fd fd fd fd fd
  0x10002a63b410: fd fd fd fd fd fd fd fd
  0x10002a63b418: fd fd fd fd fd fd fd fd
  0x10002a63b420: fd fd fd fd fd fd fd fd
  0x10002a63b428: fd fd fd fd fd fd fd fd
Stats: 710M malloced (502M for red zones) by 1042005 calls
Stats: 117M realloced by 40876 calls
Stats: 659M freed by 805793 calls
Stats: 612M really freed by 715607 calls
Stats: 363M (92982 full pages) mmaped in 633 calls
  mmaps   by size class: 7:274365; 8:106444; 9:20460; 10:9709; 11:9435; 12:3072; 13:2496; 14:1024; 15:896; 16:704; 17:456; 18:38; 19:38; 20:19; 21:4; 22:4; 24:1;
  mallocs by size class: 7:634097; 8:234135; 9:57662; 10:47143; 11:36566; 12:11018; 13:8667; 14:4174; 15:3747; 16:2622; 17:1913; 18:138; 19:72; 20:31; 21:8; 22:6; 24:6;
  frees   by size class: 7:483126; 8:171383; 9:44391; 10:42882; 11:34588; 12:9544; 13:7997; 14:3776; 15:3574; 16:2423; 17:1880; 18:119; 19:65; 20:28; 21:7; 22:6; 24:6;
  rfrees  by size class: 7:421965; 8:151304; 9:40346; 10:40605; 11:33520; 12:8914; 13:7737; 14:3592; 15:3408; 16:2157; 17:1834; 18:115; 19:65; 20:28; 21:6; 22:6; 24:5;
Stats: malloc large: 8644 small slow: 14721
==4374== ABORTING
I believe heap use-after-free is generally sec-critical.
Keywords: sec-critical
Summary: Mac: Crash on Print > Save as PDF → Mac: Crash when printing csptesting.herokuapp.com to PDF w/ heap-use-after-free
Severity: normal → critical
Keywords: crash, reproducible
Whiteboard: [asan]
Attached patch wipSplinter Review
The problem seems to be that there's something else with a reference to
this surface, so even though we own the image data, deleting it makes
the remaining cairo surface reference have a dangling data pointer.

The value of "mImageSurface->CairoSurface()->ref_count" inside the
if-statement is 2.

I'm NOT suggesting this patch is a fix or even a wallpaper - it just
demonstrates the nature of the problem.  (It does fix the crash though.)

Note also that we have some patches on top of cairo in this area:
http://mxr.mozilla.org/mozilla-central/source/gfx/cairo/quartz-create-for-data.patch
http://mxr.mozilla.org/mozilla-central/source/gfx/cairo/quartz-get-image.patch
Component: Printing: Output → Graphics
Fwiw, the crash does not occur in an ASan build on Linux.
Keywords: pp
This is crashing down in Mac OS code (Matt is using 10.8), looks like an Apple bug.

Does this happen on 10.7 also?

Since Apple's not likely to fix this any time soon we should see if we can wallpaper it.

Does this happen when you print to paper also, or only when you print to a PDF? (PDF routines are on the stack)
Keywords: sec-vector
Whiteboard: [asan] → [asan] Apple bug?
Flags: needinfo?(mwobensmith)
I'm on OSX 10.7.5.  I would guess it's a bug in Cairo or in how we use Cairo.
I don't think it's an Apple bug.  Clearly we shouldn't deallocate the surface
data when there's still other references to the surface.
To Milan for reassignment.
Assignee: nobody → milan
Keywords: sec-vector
Whiteboard: [asan] Apple bug? → [asan]
Remember that this only crashes ASan builds. I made my own for 17+20. This is Mac OS 10.8.2.

Using builds from 2013-03-11:

17.0.4esr: no crash
20.0: no crash
21.0a2: crash
22.0a: crash
Flags: needinfo?(mwobensmith)
Jeff any hunches WRT Cairo? (see comment 5).
Mat's patch does stop the crash, and the resulting PDF looks correct.
[though see comment 2 -- Mats doesn't necessarily intend it as an actual fix]
Right, it's an ownership issue:
* gfxAlphaBoxBlur new's and owns AlphaBoxBlur (in mBlur)
* mBlur in turn new's and owns the buffer
* gfxAlphaBoxBlur also new's and owns gfxImageSurface, created with mBlur's owned buffer
* on creation, gfxImageSurface creates cairo surface and lets it use the buffer in question
* deleting gfxAlphaBoxBlur deletes gfxImageSurface and AlphaBoxBlur
* gfxImageSurface does not delete the cairo surface unless the reference count is down to zero
* AlphaBoxBlur deletes the buffer it allocated and owns
* thus the trouble - cairo surface outlives gfxImageSurface and AlphaBoxBlur, which in turn deletes the buffer while the cairo surface may still want access to it in the case where there was another reference holder to the cairo surface
Should we consider Mats patch for 21 while the deeper (and pretty awesome) investigation ensues?
This is a bit of a larger change, but perhaps appropriate.
gfxAlphaBoxBlur is the one that takes the data owned by AlphaBoxBlur and lets gfxImageSurface use it, but doesn't make sure the data is still around as long as it is needed.  The approach in this patch is to take the ownership of the data away from AlphaBoxBlur right away, and give it to gfxImageSurface instead, and do it on creation, rather than on deletion.
* Add AlphaBoxBlur::GetDataWithOwnership() method which returns the data and relinquishes the ownership of it.  We still point to it, but we don't own it anymore.  Mats' approach is absolutely valid, with an explicit "own / don't own", but this one felt a bit more explicit and perhaps more difficult to abuse.  Would like to know what you think.
* Add an extra parameter to the gfxImageSurface constructor (with the data) which explicitly specifies who owns the data passed in the constructor.  The default is false (the caller still owns it), but we now have a way of passing the data ownership at construction time.
* Have gfxAlphaBoxBlur put it all together.

I have not reviewed the rest of the users of gfxImageSurface to see if we're getting into this kind of trouble in other places.
Attachment #727849 - Flags: review?(matspal)
Attachment #727849 - Flags: review?(jmuizelaar)
Comment on attachment 727849 [details] [diff] [review]
Change the data ownership between gfxImageSurface and AlphaBoxBlur when used by gfxAlphaBoxBlur

I think this is the right approach.

> gfx/2d/Blur.cpp
>+AlphaBoxBlur::GetDataWithOwnership()
>+{
>+  if (mFreeData) {
>+      mFreeData = false;

Indentation in this file is 2 spaces.

Please add mode-lines to the files that miss them while you're here.
https://developer.mozilla.org/en-US/docs/Developer_Guide/Coding_Style#Mode_Line


>+	/**
>+   * As GetData() but only return the pointer if this class owns the data.
>+   * At the same time, reset the ownership - the caller is now responsible

indentation


>gfx/thebes/gfxBlur.cpp
> gfxAlphaBoxBlur::~gfxAlphaBoxBlur()
> {
>   // Drop references to mContext and mImageSurface before we delete mBlur,
>   // because the image surface points to data in mBlur.
>   mContext = nullptr;
>+  delete mBlur;
>   mImageSurface = nullptr;
>-  delete mBlur;
> }

Is there a reason for this change?  I suspect that if there are cases
where mBlur owns the data we want to keep the current order (but see
below). (and note the comment is wrong with your change)


>     mBlur = new AlphaBoxBlur(rect, spreadRadius, blurRadius, dirtyRect, skipRect);
> 
>-    unsigned char* data = mBlur->GetData();
>+    // We will give the ownership of the mBlur data to mImageSurface
>+    unsigned char* data = mBlur->GetDataWithOwnership();

It's OK, but I wonder why the AlphaBoxBlur constructor should do the
allocation since we immediately take ownership.  Are there any situations
where AlphaBoxBlur actually owns the data after this change?  If not,
then we might as well remove mFreeData and associated code if add a new
AlphaBoxBlur method that allocates the data and use that here.


>gfx/thebes/gfxImageSurface.h
>     /**
>      * Construct an image surface around an existing buffer of image data.
>      * @param aData A buffer containing the image data
>      * @param aSize The size of the buffer
>      * @param aStride The stride of the buffer
>      * @param format Format of the data
>+     * @param aOwnsData If true, assume aData ownership passes to this class.
>      *
>      * @see gfxImageFormat
>      */
>     gfxImageSurface(unsigned char *aData, const gfxIntSize& aSize,
>-                    long aStride, gfxImageFormat aFormat);
>+                    long aStride, gfxImageFormat aFormat,
>+                    bool aOwnsData=false);

I don't like bool params because it makes the call sites hard to read
and error prone.  Please change it to "uint32_t aFlags" and add a enum
or const value for the flag.  (spaces around the '=' please)


>     void InitWithData(unsigned char *aData, const gfxIntSize& aSize,
>-                      long aStride, gfxImageFormat aFormat);
>+                      long aStride, gfxImageFormat aFormat, bool aOwnsData);

same
Attachment #727849 - Flags: review?(matspal) → review-
(In reply to Mats Palmgren [:mats] from comment #14)
> Comment on attachment 727849 [details] [diff] [review]
> ...
> 
> >gfx/thebes/gfxBlur.cpp
> > gfxAlphaBoxBlur::~gfxAlphaBoxBlur()
> > {
> >   // Drop references to mContext and mImageSurface before we delete mBlur,
> >   // because the image surface points to data in mBlur.
> >   mContext = nullptr;
> >+  delete mBlur;
> >   mImageSurface = nullptr;
> >-  delete mBlur;
> > }
> 
> Is there a reason for this change?  I suspect that if there are cases
> where mBlur owns the data we want to keep the current order (but see
> below). (and note the comment is wrong with your change)

Will change the comment for sure.  In the gfxAlphaBoxBlur scenario, mBlur never owns the data, and it doesn't really matter which one goes away first, but it felt more comfortable to not have mBlur around for an extra line of code pointing to that (potentially) deleted data, even though it wouldn't do anything with it.

> 
> >     mBlur = new AlphaBoxBlur(rect, spreadRadius, blurRadius, dirtyRect, skipRect);
> > 
> >-    unsigned char* data = mBlur->GetData();
> >+    // We will give the ownership of the mBlur data to mImageSurface
> >+    unsigned char* data = mBlur->GetDataWithOwnership();
> 
> It's OK, but I wonder why the AlphaBoxBlur constructor should do the
> allocation since we immediately take ownership.  Are there any situations
> where AlphaBoxBlur actually owns the data after this change?  If not,
> then we might as well remove mFreeData and associated code if add a new
> AlphaBoxBlur method that allocates the data and use that here.

You're right.  The cleanest thing here would be for us to allocate the data and pass it to AlphaBoxBlur in a another constructor.  That way it's clear where the ownership of the data lies and we don't even need the "give me the data ownership" method.  There is a logic in figuring out the size of the data that's currently buried in the existing constructor, but that may be nicely wrapped in a static method.

The other comments are clear, thanks.
(In reply to Mats Palmgren [:mats] from comment #14)
> Comment on attachment 727849 [details] [diff] [review]
> 
> >gfx/thebes/gfxImageSurface.h
> ...
> >     gfxImageSurface(unsigned char *aData, const gfxIntSize& aSize,
> >-                    long aStride, gfxImageFormat aFormat);
> >+                    long aStride, gfxImageFormat aFormat,
> >+                    bool aOwnsData=false);
> 
> I don't like bool params because it makes the call sites hard to read
> and error prone.  Please change it to "uint32_t aFlags" and add a enum
> or const value for the flag.  (spaces around the '=' please)

This class internally holds a bool for "own data" - would you change that to an enum as well?  Or just as the method parameter?
No, just the method parameter.
AlphaBoxBlur uses uint8_t array, gfxImageSurface uses unsigned char array.  I will not attempt to resolve that inconsistency.
I think there are still inconsistencies on how the mData is handled in gfxImageSurface, but that's probably a separate bug. For example, MakeInvalid(), and tracking the memory usage when the data is owned.
Attachment #727849 - Attachment is obsolete: true
Attachment #727849 - Flags: review?(jmuizelaar)
Attachment #729230 - Flags: review?(matspal)
AlphaBoxBlur does not hold the data, gfxAlphaBoxBlur allocates it, and passes it to gfxImageSurface to manage. 
There is still a bit a annoyance of of a shared assumption for the extra 3 bytes required by blur that nobody tracks.  Perhaps gfxImageSurface should track the "capacity" in addition to the "size".
Attachment #729235 - Flags: review?(matspal)
Comment on attachment 729230 [details] [diff] [review]
gfxImageSurface constructor lets us specify if it should own the data buffer specified.

Looks good to me, r=mats.  I'm not a gfx peer though so you need an additional r+ from someone who is.
Attachment #729230 - Flags: review?(matspal)
Attachment #729230 - Flags: review?(jmuizelaar)
Attachment #729230 - Flags: review+
Comment on attachment 729235 [details] [diff] [review]
Part 2/2: AlphaBoxBlur does not hold to the data at all, but expects the callers to manage it and specify it on each Blur() call.

>gfx/2d/Blur.cpp
>+   mMinDataSize(-1)

You forgot to declare mMinDataSize in the header?


>+    mMinDataSize(aRect.width*aRect.height)

I think you need to use CheckedInt<int32_t> for the result of that
multiplication and if it's !isValid() assign -1.


>+int32_t
>+AlphaBoxBlur::GetMinDataSize() const
>+{
>+  return mMinDataSize;
>+}

I think you should put this method in the header, unless there's
a reason not to.


> void
>+AlphaBoxBlur::Blur(uint8_t* aData)
> [...]
>+  return true;
> }

Stray statement? (this method is void)


>gfx/2d/Blur.h
>+   * Return the minimum buffer size that should be given to Blur() method.  If
>+   * negative, the class is not properly setup for blurring.  Note that this
>+   * includes the extra three bytes on top of the stride*width, where something
>+   * like gfxImageSurface::GetDataSize() would report without it, even if it 
>+   * happens to have the extra bytes.
>    */
>+  int32_t GetMinDataSize() const;

Perhaps naming it GetSurfaceAllocationSize or something to make it
distinct from "DataSize" would be clearer?


>+  /**
>+   * Perform the blur in-place on the surface backed by specified 8-bit
>+   * alpha surface data. The size must be at least that returned by
>+   * GetMinDataSize() or bad things will happen.
>+   */
>+  void Blur(uint8_t* aData);

Also mention that aData is expected to be zero-filled by the caller?


>gfx/thebes/gfxBlur.cpp
>     mBlur = new AlphaBoxBlur(rect, spreadRadius, blurRadius, dirtyRect, skipRect);
>+    int32_t blurDataSize = mBlur->GetMinDataSize();
> 
>-    unsigned char* data = mBlur->GetData();
>+    // Allocate and null out the buffer for the data.
>+    uint8_t* data = nullptr;
>+    if (blurDataSize > 0) {
>+        data = new uint8_t[blurDataSize];
>+    }
>     if (!data)
>-      return nullptr;
>+        return nullptr;
>+    memset(data, 0, blurDataSize);

Yeah, that works, but would be simpler as:

    int32_t blurDataSize = mBlur->GetMinDataSize();
    if (blurDataSize < 0)
        return nullptr;
    data = new uint8_t[blurDataSize];
    memset(data, 0, blurDataSize);

since the 'new' allocation is infallible.


>+    // Need to pass the data buffer to mBlur
>+    mBlur->Blur(mImageSurface->Data());

No need for that comment IMO, the code is obvious.


r=mats with those nits fixed, but please also get r+ from a gfx peer.
Attachment #729235 - Flags: review?(matspal)
Attachment #729235 - Flags: review?(jmuizelaar)
Attachment #729235 - Flags: review+
(In reply to Mats Palmgren [:mats] from comment #22)
> Comment on attachment 729235 [details] [diff] [review]
> ...
> 
> >+int32_t
> >+AlphaBoxBlur::GetMinDataSize() const
> >+{
> >+  return mMinDataSize;
> >+}
> 
> I think you should put this method in the header, unless there's
> a reason not to.

I put in in C++ as that seems to be the prevalent approach in this class for all the other accessors.


> ...
> >+  /**
> >+   * Perform the blur in-place on the surface backed by specified 8-bit
> >+   * alpha surface data. The size must be at least that returned by
> >+   * GetMinDataSize() or bad things will happen.
> >+   */
> >+  void Blur(uint8_t* aData);
> 
> Also mention that aData is expected to be zero-filled by the caller?

I know this was being done in the constructor, but was Blur only "called once"?  At the first glance I thought we were zeroing it out mostly for the extra bits that the Cairo surface wouldn't fill, but I only glanced at it once :-)

I'll fix the rest of the stuff and resubmit to a gfx peer.
Attachment #729235 - Flags: review?(jmuizelaar)
Comment on attachment 729621 [details] [diff] [review]
Part 2/2: AlphaBoxBlur does not hold to the data at all, but expects the callers to manage it and specify it on each Blur() call.

Already had r=mpalmgren
Attachment #729621 - Flags: review?(matspal) → review+
Fixed an issue in Windows build, otherwise r=matspal.
Attachment #729230 - Attachment is obsolete: true
Attachment #729230 - Flags: review?(jmuizelaar)
Attachment #729885 - Flags: review?(jmuizelaar)
This would obsolete the previous patch, but I won't remove it so that we can compare the two.  Jeff, is this a preferable way?  This way we allocate the memory inside of the gfxImageSurface, it has the ownership, no passing of the allocated memory ownership between classes.  If so, I'll prepare the second part that matches.
Attachment #734011 - Flags: review?(jmuizelaar)
Flags: needinfo?(jmuizelaar)
Comment on attachment 734011 [details] [diff] [review]
Add another constructor instead, allowing for the extra memory allocation

It's really feedback, not review at this point.
Attachment #734011 - Flags: review?(jmuizelaar) → feedback?(jmuizelaar)
Go with the approach where gfxImageSurface allocates the data it owns, as hinted by Jeff. AlphaBoxBlur operates on the data, doesn't hold on to it.
Attachment #729621 - Attachment is obsolete: true
Attachment #729885 - Attachment is obsolete: true
Attachment #734011 - Attachment is obsolete: true
Attachment #729621 - Flags: review?(jmuizelaar)
Attachment #729885 - Flags: review?(jmuizelaar)
Attachment #734011 - Flags: feedback?(jmuizelaar)
Attachment #738565 - Flags: review?(jmuizelaar)
I see this is marked as a regression but does not block anything.Do we know the feature or bug# that regressed this ? Adding :mwobensmith as QA to help with that.

:Milan : Given we have gone to build with beta 4 for Fx21, what are the next best steps here  ? If you are already aware of the regressing bug , please let us know if a backout is possible as Fx 20 is unaffected ?
QA Contact: mwobensmith
Unfortunately, because it's an ASan-only crash - for me - finding a regression window is extremely hard, as I don't have the range of ASan builds that I'd have with opt/debug builds.

How important is an exact regression window? Or can the questions you need answered be satisfied by doing something else?
(In reply to Matt Wobensmith from comment #33)
> I don't have the range of ASan
> builds that I'd have with opt/debug builds.

I think we publish nightlies built with ASAN available somewhere (decoder makes them, IIRC). I don't recall where they live, though.
Sorry, never mind -- currently our nightly mac ASAN builds are here:
   http://people.mozilla.org/~choller/firefox/asan/
and unfortunately we only have a few weeks' worth of them. (Nothing from around when/before this bug was filed.)

So: disregard comment 34.
I don't think the exact regression window is essential.
We just need to get this reviewed and landed.
Jeff'll get the review done today.
(In reply to Milan Sreckovic [:milan] from comment #37)
> Jeff'll get the review done today.

Awesome! Would be great to get this landed on m-c (and then uplifted Monday of next week) to prevent a sec-critical regression in FF21.
Comment on attachment 739507 [details] [diff] [review]
Add gfxImageSurface constructor that allocates more space than needed, change AlphaBoxBlur to not hold on to the data and let the callers manage it.

Review of attachment 739507 [details] [diff] [review]:
-----------------------------------------------------------------

It's not beautiful, but it's better to be secure. I do like that Blur doesn't have a member for the data anymore.
Attachment #739507 - Flags: review?(jmuizelaar) → review+
Flags: needinfo?(jmuizelaar)
Comment on attachment 739507 [details] [diff] [review]
Add gfxImageSurface constructor that allocates more space than needed, change AlphaBoxBlur to not hold on to the data and let the callers manage it.

[Security approval request comment]
How easily could an exploit be constructed based on the patch?
I don't think easily.  If anything, the patch may point to buffer overflow issue, where it's really use after free.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
No.

Which older supported branches are affected by this flaw?

If not all supported branches, which bug introduced the flaw?

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Should be very similar if not exactly the same.  They haven't been prepared pending decision to use.

How likely is this patch to cause regressions; how much testing does it need?
Should be safe.
Attachment #739507 - Flags: sec-approval?
I added checkin-needed, but note that it should wait for sec-approval first.
Just in case, remove checkin-needed until sec-approval is granted.  If granted, whoever does that, could you set checkin-needed to expedite the process?
Keywords: checkin-needed
Comment on attachment 739507 [details] [diff] [review]
Add gfxImageSurface constructor that allocates more space than needed, change AlphaBoxBlur to not hold on to the data and let the callers manage it.

Sec-approval+ (which applies to m-c). Please nominate the patch for Aurora and Beta after/while you checkin so we can not ship this security problem publicly.
Adding back checkin-needed, & needinfo on milan for aurora/beta approval nominations.
Flags: needinfo?(milan)
Keywords: checkin-needed
Attachment #739507 - Flags: sec-approval? → sec-approval+
Comment on attachment 739507 [details] [diff] [review]
Add gfxImageSurface constructor that allocates more space than needed, change AlphaBoxBlur to not hold on to the data and let the callers manage it.

Should this be uplifted to aurora and beta?  The patches apply with some trivial modifications.
Attachment #739507 - Flags: approval-mozilla-beta?
Attachment #739507 - Flags: approval-mozilla-aurora?
Flags: needinfo?(milan)
(In reply to Milan Sreckovic [:milan] from comment #46)
> Comment on attachment 739507 [details] [diff] [review]
> Add gfxImageSurface constructor that allocates more space than needed,
> change AlphaBoxBlur to not hold on to the data and let the callers manage it.
> 
> Should this be uplifted to aurora and beta?  The patches apply with some
> trivial modifications.

I am approving this on branches given comment# 38 &#43 and keeping in mind the patch is safe.If there is a concern on landing this in our second last beta we should discuss that and see how we can mitigate it.
Attachment #739507 - Flags: approval-mozilla-beta?
Attachment #739507 - Flags: approval-mozilla-beta+
Attachment #739507 - Flags: approval-mozilla-aurora?
Attachment #739507 - Flags: approval-mozilla-aurora+
https://hg.mozilla.org/mozilla-central/rev/0c37b537e562
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Just out of interest, does this have any influence on bug 829954?
Check-in needed on the aurora and beta uplift.  The patches for the trunk should apply cleanly, but let me know if you get into trouble.
Keywords: checkin-needed
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #49)
> Just out of interest, does this have any influence on bug 829954?

This bug is "early release of allocated memory" where bug 829954 is "can't allocate memory", so probably not.
(In reply to Milan Sreckovic [:milan] from comment #50)
> Check-in needed on the aurora and beta uplift.  The patches for the trunk
> should apply cleanly, but let me know if you get into trouble.

Hold off on uplift, there is an overflow protection that wasn't working properly here.
Attached patch Correct way to use checked int (obsolete) — Splinter Review
The original code didn't have the overflow protection, the new code looked like it did, but it really didn't, because of how the code was written.  This fixes that.
Attachment #743790 - Flags: review?(jmuizelaar)
(In reply to Milan Sreckovic [:milan] from comment #51)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #49)
> > Just out of interest, does this have any influence on bug 829954?
> 
> This bug is "early release of allocated memory" where bug 829954 is "can't
> allocate memory", so probably not.

OK, just wondered because this patch contains "allocate" and "AlphaBoxBlur" in the description and that other bug talks about using fallible instead of infallible allocation in that area. If it's not related, we're all fine I guess. :)
(In reply to Milan Sreckovic [:milan] from comment #53)
> Created attachment 743790 [details] [diff] [review]
> Correct way to use checked int
> 
> The original code didn't have the overflow protection, the new code looked
> like it did, but it really didn't, because of how the code was written. 
> This fixes that.

I've landed this as:
https://hg.mozilla.org/integration/mozilla-inbound/rev/fa4d8053055d
Blocks: 867348
I saw attachment 743790 [details] [diff] [review]'s push in IRC.  This error looks like a super-easy case for static analysis to detect and flag; I filed bug 867348 on adding such an analysis, hidden for now as this bug's hidden.
Here's the patch that Jeff landed.
Attachment #743790 - Attachment is obsolete: true
Attachment #743790 - Flags: review?(jmuizelaar)
Attachment #743816 - Flags: review?(jmuizelaar)
Attachment #743816 - Flags: review?(jmuizelaar) → review+
Please check this into aurora and beta, making sure both patches are taken (the second one goes on top of the first one).
Keywords: checkin-needed
Note also that there's a typo in the patch, so it might be best to just transplant from mozilla-central.
Oh, ignore me: comment 57 handles this.
Given :RyanVm is afk, any volunteers who can help land this asap on branches(mozilla-beta as priority) as we need to go-to-build ?  Please.. O:)
Whiteboard: [asan] → [asan][adv-main21-]
Group: core-security
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: