Last Comment Bug 792305 - (CVE-2012-5829) Heap-buffer-overflow in nsWindow::OnExposeEvent
(CVE-2012-5829)
: Heap-buffer-overflow in nsWindow::OnExposeEvent
Status: RESOLVED FIXED
[asan][adv-main18+][adv-esr17+]
: crash, csectype-bounds, regression, sec-critical, testcase
Product: Core
Classification: Components
Component: Widget: Gtk (show other bugs)
: Trunk
: x86_64 Linux
: -- critical (vote)
: mozilla18
Assigned To: Karl Tomlinson (:karlt)
:
:
Mentors:
Depends on:
Blocks: dlbi
  Show dependency treegraph
 
Reported: 2012-09-18 21:20 PDT by Abhishek Arya
Modified: 2016-12-01 13:38 PST (History)
12 users (show)
dveditz: sec‑bounty+
ryanvm: in‑testsuite?
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
unaffected
unaffected
unaffected
+
fixed
+
fixed
17+
fixed


Attachments
Testcase to show child window is out of bounds (Karl's point) (858 bytes, text/html)
2012-09-20 00:24 PDT, Abhishek Arya
no flags Details
Crasher testcase (9.09 KB, text/html)
2012-09-20 09:37 PDT, Abhishek Arya
no flags Details
delay shape mask update from resize to expose (6.38 KB, patch)
2012-09-20 21:46 PDT, Karl Tomlinson (:karlt)
roc: review+
Details | Diff | Splinter Review
don't update mask bits outside the mask (1.40 KB, patch)
2012-09-20 21:47 PDT, Karl Tomlinson (:karlt)
roc: review+
akeybl: approval‑mozilla‑esr10+
abillings: sec‑approval+
Details | Diff | Splinter Review
remove more unnecessary shape code (2.40 KB, patch)
2012-09-23 15:22 PDT, Karl Tomlinson (:karlt)
roc: review+
Details | Diff | Splinter Review
Screenshot of black enclosing rectangle issue. (6.35 KB, image/png)
2012-09-23 22:05 PDT, Abhishek Arya
no flags Details

Description Abhishek Arya 2012-09-18 21:20:58 PDT
Reproduces on trunk. Testcase reproduces extremely flakily if i first run firefox with a clean profile (and testcase path on cmd line), then close the firefox completely, and then rerun it again with testcase. Trying to get a minimized testcase. The second-time need might be coming from this line in the testcase

InstallTrigger.install({
    "Unsigned XPI": "http://example.com/browser/toolkit/mozapps/extensions/test/xpinstall/unsigned.xpi"
  });

=================================================================
==12538== ERROR: AddressSanitizer heap-buffer-overflow on address 0x7f0886f066d8 at pc 0x7f08b8658875 bp 0x7fff30d69ab0 sp 0x7fff30d69aa8
READ of size 1 at 0x7f0886f066d8 thread T0
    #0 0x7f08b8658874 in UpdateMaskBits(char*, int, int, nsIntRect const&, unsigned char*, int) widget/gtk2/nsWindow.cpp:4261
    #1 0x7f08b8657a51 in nsWindow::OnExposeEvent(_GdkEventExpose*) widget/gtk2/nsWindow.cpp:2222
    #2 0x7f08b866816e in expose_event_cb(_GtkWidget*, _GdkEventExpose*) widget/gtk2/nsWindow.cpp:5042
0x7f0886f066d8 is located 0 bytes to the right of 5720-byte region [0x7f0886f05080,0x7f0886f066d8)
allocated by thread T0 here:
    #0 0x4343e0 in __interceptor_malloc 
    #1 0x7f08bc70c3a8 in moz_xmalloc memory/mozalloc/mozalloc.cpp:57
    #2 0x7f08b8657a51 in nsWindow::OnExposeEvent(_GdkEventExpose*) widget/gtk2/nsWindow.cpp:2222
    #3 0x7f08b866816e in expose_event_cb(_GtkWidget*, _GdkEventExpose*) widget/gtk2/nsWindow.cpp:5042
Shadow byte and word:
  0x1fe110de0cdb: fb
  0x1fe110de0cd8: 00 00 00 fb fb fb fb fb
More shadow bytes:
  0x1fe110de0cb8: 00 00 00 00 00 00 00 00
  0x1fe110de0cc0: 00 00 00 00 00 00 00 00
  0x1fe110de0cc8: 00 00 00 00 00 00 00 00
  0x1fe110de0cd0: 00 00 00 00 00 00 00 00
=>0x1fe110de0cd8: 00 00 00 fb fb fb fb fb
  0x1fe110de0ce0: fa fa fa fa fa fa fa fa
  0x1fe110de0ce8: fa fa fa fa fa fa fa fa
  0x1fe110de0cf0: fa fa fa fa fa fa fa fa
  0x1fe110de0cf8: fa fa fa fa fa fa fa fa
Stats: 293M malloced (320M for red zones) by 521376 calls
Stats: 41M realloced by 23856 calls
Stats: 248M freed by 255478 calls
Stats: 115M really freed by 96425 calls
Stats: 532M (136284 full pages) mmaped in 133 calls
  mmaps   by size class: 8:376809; 9:49146; 10:20475; 11:14329; 12:4096; 13:2560; 14:1280; 15:256; 16:576; 17:1248; 18:288; 19:40; 20:20;
  mallocs by size class: 8:419076; 9:53870; 10:19196; 11:16732; 12:4893; 13:2792; 14:1719; 15:362; 16:881; 17:1501; 18:294; 19:41; 20:19;
  frees   by size class: 8:178304; 9:39375; 10:14131; 11:13204; 12:3650; 13:2549; 14:1466; 15:306; 16:738; 17:1481; 18:219; 19:39; 20:16;
  rfrees  by size class: 8:64681; 9:13811; 10:3824; 11:9587; 12:1194; 13:844; 14:914; 15:151; 16:412; 17:990; 18:11; 19:5; 20:1;
Stats: malloc large: 1855 small slow: 2608
==12538== ABORTING
Comment 1 Daniel Veditz [:dveditz] 2012-09-19 10:31:23 PDT
If we're mis-calculating a tranlucency area I don't know why it would be flaky, but it does looks like if ASan caught it reading that byte it will turn around and write an updated value to it in the next statement. If it's just an off-by-one might be pretty hard to exploit. Would a different testcase result in a different offset?

It's in a loop at that point though, so maybe ASan stopped it at the first byte but it would have kept going. We'll have to debug and see.
Comment 2 Karl Tomlinson (:karlt) 2012-09-19 15:35:40 PDT
We could get into trouble here if mBounds is changed during PaintWindow, but I thought that couldn't happen now with viewmanager.refresh-driver-painting.enabled.

There could also be problems if the child window mBounds were greater than the toplevel window mBounds.  I don't know how that would happen.
Comment 3 Abhishek Arya 2012-09-19 19:20:01 PDT
Due to InstallTrigger.install on unsigned xpi, I do see a small child window overhanging from the navigation bar that says "The add-on could not be downloaded because of a connection failure with xyz.com" (non-existent url). I think that is the root cause [its width is shortened when the download fails]. Reduction is going pretty poorly atm since it depends on the timing of that window.
Comment 4 Mats Palmgren (:mats) 2012-09-19 20:09:29 PDT
Can you attach the unminimized test?  With some luck it's reproducible for
someone.
Comment 5 Abhishek Arya 2012-09-20 00:23:50 PDT
(In reply to Mats Palmgren [:mats] from comment #4)
> Can you attach the unminimized test?  With some luck it's reproducible for
> someone.

Sorry, the unminimized test is a fuzzer testcase, so won't be able to attach it here. I am trying to get a somewhat minimized testcase, and will attach as soon as i get one.

Regarding Karl point - "There could also be problems if the child window mBounds were greater than the toplevel window mBounds.  I don't know how that would happen.", this part is easily reproducible by a reduced testcase (enclosing, run from cmd-line). you can just resize the top level window manually (shorten width) and the child window about the error will remain outside bounds.
Comment 6 Abhishek Arya 2012-09-20 00:24:28 PDT
Created attachment 662841 [details]
Testcase to show child window is out of bounds (Karl's point)
Comment 7 Abhishek Arya 2012-09-20 09:37:56 PDT
Created attachment 663051 [details]
Crasher testcase

Minimized testcase (almost :) that crashes under ASAN.
Comment 8 Karl Tomlinson (:karlt) 2012-09-20 20:16:20 PDT
It looks like widget bounds can change during PaintWindow if ForcedRepaint().
I assume we could make widget handle that, or perhaps we should also consider doing forced repaints from WillPaintWindow instead of PaintWindow.

Matt is considering changing painting from the refresh driver to WillPaintWindow for performance reasons, but that would also address this.
Comment 9 Karl Tomlinson (:karlt) 2012-09-20 21:46:47 PDT
Created attachment 663277 [details] [diff] [review]
delay shape mask update from resize to expose

Some initial clean-up.

There is no need to ResizeTransparencyBitmap in NativeResize because we'll get
to OnSizeAllocate from the size-allocate signal anyway.

There is not much point calling ResizeTransparencyBitmap in OnSizeAllocate
because the only difference it might make to appearance would be to make the
unpainted part of the widget visible.  Better to wait until that part is ready
to paint.  In that theme, this changes initialization of the mask on resize
to transparent.
Comment 10 Karl Tomlinson (:karlt) 2012-09-20 21:47:57 PDT
Created attachment 663278 [details] [diff] [review]
don't update mask bits outside the mask
Comment 11 Karl Tomlinson (:karlt) 2012-09-23 15:22:24 PDT
Created attachment 663854 [details] [diff] [review]
remove more unnecessary shape code

I forgot to remove the ApplyTransparencyBitmap as well as the ResizeTransparencyBitmap from OnSizeAllocate in attachment 663277 [details] [diff] [review].
(Applying the old mask again won't have any effect.)
And there's some unreached code from null-checking new().
Comment 12 Karl Tomlinson (:karlt) 2012-09-23 21:06:09 PDT
Folded attachment 663854 [details] [diff] [review] into attachment 663277 [details] [diff] [review].
https://hg.mozilla.org/integration/mozilla-inbound/rev/f92968ac30fa
https://hg.mozilla.org/integration/mozilla-inbound/rev/5a9ee59a8c3b

I couldn't actually reproduce here, so please check that this resolves the issue.
Comment 13 Karl Tomlinson (:karlt) 2012-09-23 21:08:02 PDT
https://tbpl.mozilla.org/?tree=Try&rev=c84ffc9700ae
Comment 14 Abhishek Arya 2012-09-23 22:04:33 PDT
(In reply to Karl Tomlinson (:karlt) from comment #12)
> Folded attachment 663854 [details] [diff] [review] into attachment 663277 [details] [diff] [review]
> [details] [diff] [review].
> https://hg.mozilla.org/integration/mozilla-inbound/rev/f92968ac30fa
> https://hg.mozilla.org/integration/mozilla-inbound/rev/5a9ee59a8c3b
> 
> I couldn't actually reproduce here, so please check that this resolves the
> issue.

Good news: security crash is fixed. Bad news: there is an ugly black enclosing rectangle visible on the edges. Let me add the screen shot. Just to reiterate on reproducer steps. Start with clean new profile and testcase on cmd-line, close firefox completely and then start firefox with that same profile and again testcase on cmd-line.
Comment 15 Abhishek Arya 2012-09-23 22:05:18 PDT
Created attachment 663916 [details]
Screenshot of black enclosing rectangle issue.
Comment 16 Karl Tomlinson (:karlt) 2012-09-23 22:17:49 PDT
Thanks.  I don't see the black part here.

Can you tell me which window manager is running, please?
Does disabling or enabling compositing change the behavior?
Does the black rectangle disappear immediately on closing the window.
Comment 17 Abhishek Arya 2012-09-23 22:44:31 PDT
(In reply to Karl Tomlinson (:karlt) from comment #16)
> Thanks.  I don't see the black part here.

Did you see the screenshot in c#15. There is a small black rectangle to the right of 'X' sign.

> 
> Can you tell me which window manager is running, please?

Metacity

> Does disabling or enabling compositing change the behavior?

No. same behavior, with and without compositing enabled.

> Does the black rectangle disappear immediately on closing the window.

Yes if I close this child window (the window that says "this add-on downloaded from example.com...."), then the blck rectangle disappears.
Comment 19 Karl Tomlinson (:karlt) 2012-09-24 21:05:27 PDT
(In reply to Ryan VanderMeulen from comment #18)
> Should this have a test?

A test could be useful, but creating a test that will reproduce may be tricky.
Comment 20 Karl Tomlinson (:karlt) 2012-09-24 21:10:53 PDT
(In reply to Abhishek Arya from comment #17)
> Did you see the screenshot in c#15. There is a small black rectangle to the
> right of 'X' sign.

Yes.  I can see, thanks.

> Metacity

I can't reproduce the same behavior here with metacity, but I do see other glitches, similar to bug 635897, but more significant more often.

Rather than add more Shape code to work around this, bug 408284 is almost there, so I'll see about getting that completed.

> > Does disabling or enabling compositing change the behavior?
> 
> No. same behavior, with and without compositing enabled.

That's harder to explain.  I see no glitches with compositing disabled.  (The window manager is not involved with these override-redirect windows when compositing is disabled.)
If there is an X server interaction issue, then that needs a different solution.
Comment 21 Daniel Veditz [:dveditz] 2012-09-28 11:43:32 PDT
Karlt: is this old code such that we need this fix on other branches (esr10, beta)?
Comment 22 Karl Tomlinson (:karlt) 2012-09-30 20:35:46 PDT
The nsWindow.cpp code was designed to expect widget resizes during WillPaintWindow but not during PaintWindow.

Bug 539356 changed things to so that ProcessPendingUpdates() can sometimes call ProcessPendingUpdates() during nsViewManager::Refresh() during PaintWindow.
That allows resizes to happen during PaintWindow() and is what I expect enabled this bug to be triggered.

That change landed for the 17 branch, but is disabled on that branch through viewmanager.refresh-driver-painting.enabled.

Before that change, going back as far as 15 at least, Refresh() does not call ProcessPendingUpdates().  The nsAutoScriptBlocker in Refresh() should prevent any nested event loops that may allow resizes.

So we should be fine on 15 to 17.

On esr10, things are different again:

In nsViewManager::DispatchEvent(), NS_PAINT (the equivalent of PaintWindow) will ProcessPendingUpdates() if mHasPendingUpdates is set.  I think that is fine because ProcessPendingUpdates() will do nothing because the resizes have already happened during NS_WILL_PAINT.

However, NS_PAINT can cause CallWillPaintOnObservers to be called.  I'm not familiar with the didResize logic there, but that can cause nsPresShell::FlushPendingNotifications() to be called, during which anything can happen.  I doubt that that function iterates until there are no more events pending, and so calling CallWillPaintOnObservers on NS_WILL_PAINT does not ensure that nothing happens on NS_PAINT AFAIK.

We can backport attachment 663278 [details] [diff] [review] to esr10.
There is already an assertion there that the patch should have no effect on results, unless this bug is possible there.
Comment 23 Karl Tomlinson (:karlt) 2012-10-04 16:24:07 PDT
bajaj: why do you say esr10 is unaffected?
Comment 24 Alex Keybl [:akeybl] 2012-10-11 16:24:32 PDT
(In reply to Karl Tomlinson (:karlt) from comment #22)
> We can backport attachment 663278 [details] [diff] [review] to esr10.
> There is already an assertion there that the patch should have no effect on
> results, unless this bug is possible there.

Let's move forward with this for the next ESR. Please backport a patch as soon as possible.
Comment 25 Karl Tomlinson (:karlt) 2012-10-11 18:46:17 PDT
Picking sec-critical because unrelated memory is overwritten outside the buffer.
Perhaps, though, there are mitigating circumstances, including timing requirements.
Comment 26 Karl Tomlinson (:karlt) 2012-10-11 18:57:43 PDT
Comment on attachment 663278 [details] [diff] [review]
don't update mask bits outside the mask

[Security approval request comment]
How easily can the security issue be deduced from the patch?
The presence of potential overflow bug can be deduced from the patch, but the conditions to generate the overflow are much more complicated.

Do comments in the patch, the check-in comment, or tests included in the patch paint a bulls-eye on the security problem?
Same answer as above.  The do not indicate how content might affect the size of a chrome popup.  There are also timing requirements, and on esr10 these include having to influence the chrome popup during an event dispatched because of reflow.

Which older supported branches are affected by this flaw?
esr10

If not all supported branches, which bug introduced the flaw?
Bug 539356 for 18.  Bug may have existed long before esr10.

Do you have backports for the affected branches? If not, how different, hard to create, and risky will they be?
Same key patch applies as intended with fuzz.  The cleanup patches, which include risk, are not required.

How likely is this patch to cause regressions; how much testing does it need?
Very unlikely; see quote in comment 24.

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
Fixes potential heap buffer overflow security risk.
User impact if declined: risk
Fix Landed on Version: 18
Risk to taking this patch (and alternatives if risky): very low risk; see quote in comment 24.
String or UUID changes made by this patch:  none
Comment 27 Karl Tomlinson (:karlt) 2012-10-11 19:01:38 PDT
(In reply to Karl Tomlinson (:karlt) from comment #26)
> having to influence the chrome popup during an event dispatched
> because of reflow.

and that needs to be during a paint event on the chrome popup.  I'm not sure content will get to run then, so this may not be exploitable on esr10.
But the patch is safe.
Comment 28 Karl Tomlinson (:karlt) 2012-10-18 18:55:31 PDT
https://hg.mozilla.org/releases/mozilla-esr10/rev/53363548ad9b
Comment 29 Al Billings [:abillings] 2012-11-07 16:27:31 PST
I'm confused. How do we have a bug that affects ESR10 but does not affect 17, which came after ESR10's fork?
Comment 30 Karl Tomlinson (:karlt) 2012-11-07 16:41:11 PST
See comment 22.  The painting code has changed several times.  The bug on ESR10 is a theoretical bug.  The testcase here may not reproduce on ESR10, but there is the potential for other triggers.
Comment 32 Daniel Veditz [:dveditz] 2012-11-19 15:14:19 PST
This bug qualifies for the security bug bounty.

Note You need to log in before you can comment on or make changes to this bug.