Last Comment Bug 659349 - WebGL allows access to uninitialised graphics memory
: WebGL allows access to uninitialised graphics memory
Status: VERIFIED FIXED
[sg:high][qa!]
: verified-aurora, verified-beta
Product: Core
Classification: Components
Component: Canvas: WebGL (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: mozilla7
Assigned To: Nobody; OK to take it and work on it
:
: Milan Sreckovic [:milan]
Mentors:
https://cvs.khronos.org/svn/repos/reg...
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-05-24 09:36 PDT by Paul Stone
Modified: 2015-05-07 15:18 PDT (History)
15 users (show)
rforbes: sec‑bounty+
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
fixed
fixed
fixed
unaffected
unaffected


Attachments
svg filter (310 bytes, image/svg+xml)
2011-05-24 09:36 PDT, Paul Stone
no flags Details
iframe texture (817 bytes, text/plain)
2011-05-24 09:37 PDT, Paul Stone
no flags Details
iframe texture (817 bytes, text/plain)
2011-05-24 09:38 PDT, Paul Stone
no flags Details
iframe texture (817 bytes, text/plain)
2011-05-24 09:39 PDT, Paul Stone
no flags Details
webgl reader (3.16 KB, text/plain)
2011-05-24 09:41 PDT, Paul Stone
no flags Details
frameset (156 bytes, text/html)
2011-05-24 09:44 PDT, Paul Stone
no flags Details
rewrite GLContext::ClearSafely (4.06 KB, patch)
2011-05-25 10:10 PDT, Benoit Jacob [:bjacob] (mostly away)
jmuizelaar: review+
asa: approval‑mozilla‑aurora+
bugzilla: approval‑mozilla‑beta+
Details | Diff | Splinter Review
Microsoft TechNet - WebGL considered harmful - at 9:21 AM Pacific time today (139.17 KB, application/x-bzip)
2011-06-16 13:17 PDT, Benoit Jacob [:bjacob] (mostly away)
no flags Details

Description Paul Stone 2011-05-24 09:36:04 PDT
User-Agent:       Mozilla/5.0 (Windows NT 5.1; rv:2.0.1) Gecko/20100101 Firefox/4.0.1
Build Identifier: 

The linked conformance test checks that a WebGL canvas isn't given unintialised memory when it is resized. We found that it fails on some configurations on Windows XP and MacOS X. By using a larger canvas is it possible to view graphics data from other web pages, and on MacOS various parts of the desktop as well. 

I'm attaching a PoC that loads an iframe onto a texture (by applying a CSS transform/transition to it), then searches for it in graphics memory. An SVG color matrix filter is allied to it to set the red channel to a specific value which we then search for. Although the retrieved image is somewhat scrambled, the PoC could be further developed by encoding position information for each pixel into the color channels and using that information to reconstruct the whole image.

We've tested this with success on Intel and ATI cards on WinXP and Nvidia and Intel cards on some recent Macs.

Reproducible: Always

Steps to Reproduce:
1. Get WebGL context
2. Resize canvas
3. Copy WebGL canvas onto 2d canvas
4. read pixels
Comment 1 Paul Stone 2011-05-24 09:36:49 PDT
Created attachment 534801 [details]
svg filter
Comment 2 Paul Stone 2011-05-24 09:37:59 PDT
Created attachment 534802 [details]
iframe texture
Comment 3 Paul Stone 2011-05-24 09:38:47 PDT
Created attachment 534804 [details]
iframe texture
Comment 4 Paul Stone 2011-05-24 09:39:56 PDT
Created attachment 534805 [details]
iframe texture
Comment 5 Paul Stone 2011-05-24 09:41:33 PDT
Created attachment 534806 [details]
webgl reader
Comment 6 Paul Stone 2011-05-24 09:44:17 PDT
Created attachment 534807 [details]
frameset

This is the one you should run. It seems to work best on MacOS X. For some reason the HTML is uploading as text/plain, I guess this is a new Bugzilla security feature?
Comment 7 Benoit Jacob [:bjacob] (mostly away) 2011-05-24 12:49:50 PDT
The testcase works damn well here. Just save the 'frameset' attachment and run it locally.  Mac OS 10.6.4, Firefox Nightly from today, NVIDIA GeForce 320M OpenGL Engine -- 2.1 NVIDIA-1.6.18.
Comment 8 Benoit Jacob [:bjacob] (mostly away) 2011-05-24 14:12:48 PDT
Removing the colorMask() call makes it not work anymore.

I think I know what's happening. GLContext::ClearSafely is being naive:

void
GLContext::ClearSafely()
{
    GLfloat clearColor[4];
    GLfloat clearDepth;
    GLint clearStencil;

    fGetFloatv(LOCAL_GL_COLOR_CLEAR_VALUE, clearColor);
    fGetFloatv(LOCAL_GL_DEPTH_CLEAR_VALUE, &clearDepth);
    fGetIntegerv(LOCAL_GL_STENCIL_CLEAR_VALUE, &clearStencil);

    fClearColor(0.0f, 0.0f, 0.0f, 0.0f);
    fClearStencil(0);
    fClearDepth(1.0f);

    fClear(LOCAL_GL_COLOR_BUFFER_BIT | LOCAL_GL_DEPTH_BUFFER_BIT | LOCAL_GL_STENCIL_BUFFER_BIT);

    fClearColor(clearColor[0], clearColor[1], clearColor[2], clearColor[3]);
    fClearStencil(clearStencil);
    fClearDepth(clearDepth);
}

As you can see, it's making many assumptions, in particular it assumes no masks, so it can be trumped by the testcase's glColorMask() call.

We already check for this kind of issues at 2 different places in our WebGL implementation, see WebGLContext::ForceClearFramebufferWithDefaultValues(). We call that when creating new FBOs and with the default option preserveDrawingBuffer=false we also call that after the backbuffer has been presented to the compositor. However we don't currently call that after the canvas has been resized!

Patch coming.
Comment 9 Benoit Jacob [:bjacob] (mostly away) 2011-05-24 14:39:37 PDT
The big discussion we're having now with Jeff and Joe is: should we do:
  * option 1: fix this by just letting WebGLContext::SetDimensions call orceClearFramebufferWithDefaultValues()
  * option 2: actually fix GLContext::ClearSafely()

Option 1 has the benefit that it's a very simple patch and doesn't add complexity to GLContext.

Option 2 has the benefit that it doesn't waste time clearing the buffer _twice_, and makes GLContext::ClearSafely() actually safe.

The question is, should the method in GLContext trust GL state? In general they can, but in the case of a GL context backing a WebGL context, state is scriptable and thus can't be trusted. Since it's a webgl-specific problem, maybe it's not a good idea to complexify GLContext in general just for it.
Comment 10 Benoit Jacob [:bjacob] (mostly away) 2011-05-24 14:54:47 PDT
Looking at other GLContext methods, the code we have there is already very careful about not trusting existing GL state and is mostly just calling glGetIntegerv to query state rather than keeping a copy of state, which answers the design question raised in comment 9. I'll just go for option 2 and just call glGetIntegerv.
Comment 11 Benoit Jacob [:bjacob] (mostly away) 2011-05-25 10:10:51 PDT
Created attachment 535112 [details] [diff] [review]
rewrite GLContext::ClearSafely

There's a comment in the patch with some explanations.
Comment 12 Benoit Jacob [:bjacob] (mostly away) 2011-05-25 10:12:10 PDT
Try push:
http://tbpl.mozilla.org/?tree=Try&rev=9274be381579

When the Mac builds are done I'll check if it fixes the testcase.
Comment 13 Benoit Jacob [:bjacob] (mostly away) 2011-05-25 12:19:15 PDT
(In reply to comment #12)
> Try push:
> http://tbpl.mozilla.org/?tree=Try&rev=9274be381579
> 
> When the Mac builds are done I'll check if it fixes the testcase.

It's fixed indeed.
Comment 14 Jeff Muizelaar [:jrmuizel] 2011-05-26 08:01:21 PDT
Comment on attachment 535112 [details] [diff] [review]
rewrite GLContext::ClearSafely

It would be nice if you could document why particular state needs to be set. (e.g. GL_DITHER)
Comment 15 Benoit Jacob [:bjacob] (mostly away) 2011-05-26 11:17:49 PDT
Jeff: still giving a bit more time to the WebKit guys, but at this point I think that disabling GL_DITHER is useless. I think the biggest reason was as an optimization, but in the case of a Clear() that shouldn't make a difference, right? Since it's the same small dither pattern repeated all over the framebuffer.
Comment 16 Benoit Jacob [:bjacob] (mostly away) 2011-05-26 11:19:07 PDT
If I remove the GL_DITHER, that will be in a follow-up though, I don't want to introduce any risk here.
Comment 17 Benoit Jacob [:bjacob] (mostly away) 2011-05-26 11:23:41 PDT
(In reply to comment #12)
> Try push:
> http://tbpl.mozilla.org/?tree=Try&rev=9274be381579

Hey, this makes us pass canvas-test.html on Mac! And looking at the log, it also passes on WinXP where we had added it to the ignore-list. So this test failure was actually very meaningful.
Comment 18 Benoit Jacob [:bjacob] (mostly away) 2011-05-26 11:30:06 PDT
Pushed to mozilla-central:
http://hg.mozilla.org/mozilla-central/rev/a45a190af4f7

Please give me aurora and beta approval for this.
Comment 19 Asa Dotzler [:asa] 2011-05-26 13:34:08 PDT
Dan, can you help the release drivers with the call on whether this meets the criteria you all have been using for making tracking+ decisions for Firefox 5 and/or Firefox 6?
Comment 20 Asa Dotzler [:asa] 2011-05-26 13:35:13 PDT
Benoit, Jeff, can you two help the release team by explaining the risks associated with this patch and what's been done to help assure there's no negative fallout?
Comment 21 Benoit Jacob [:bjacob] (mostly away) 2011-05-27 07:23:39 PDT
Sure. This bug allows to read back arbitrary uninitialized video memory at least on Mac and on Windows XP, when WebGL is enabled.

When I ran the testcase on a Mac, I saw the contents of other windows and of the desktop being painted on the canvas.

It can't read currently-in-use video memory, but it can read memory that had been used by any other application and had been released since (even if the application is still running).

It's known to happen also on Windows XP. We used to have a WebGL test failing because of that on Windows XP.

So it's really a quite severe flaw.

Moreover the code that this patch adds is not entirely new since it's a simplified version of the code that we already have in WebGL to clear framebuffer objects.

I should also note that although the bug currently is only known to affect WebGL, it's not a bug in the WebGL implementation, see the patch, it's touching only GLContext::ClearSafely().
Comment 22 Asa Dotzler [:asa] 2011-06-01 12:13:20 PDT
Comment on attachment 535112 [details] [diff] [review]
rewrite GLContext::ClearSafely

Please land in Aurora ASAP and we'll evaluate after further testing for 5 Beta.
Comment 23 Benoit Jacob [:bjacob] (mostly away) 2011-06-01 12:27:31 PDT
Landed on Aurora:
http://hg.mozilla.org/releases/mozilla-aurora/rev/0b02d41a5436

For Beta, I pushed it to Try:
http://tbpl.mozilla.org/?tree=Try&rev=623856993a20
Comment 24 Benoit Jacob [:bjacob] (mostly away) 2011-06-01 13:04:17 PDT
Hey!!! Is tryserver busted?

My tryserver builds fail with this:

===
=== If you get failures below, please file a bug describing the error
=== and your environment (compiler and linker versions), and use
=== --disable-elf-hack until this is fixed.
===
/builds/slave/try-lnx-dbg/build/obj-firefox/build/unix/elfhack/elfhack -b test.so
/builds/slave/try-lnx-dbg/build/obj-firefox/build/unix/elfhack/elfhack: /tools/gcc-4.3.3/installed/lib/libstdc++.so.6: version `GLIBCXX_3.4.14' not found (required by /builds/slave/try-lnx-dbg/build/obj-firefox/build/unix/elfhack/elfhack)
NEXT ERROR make[7]: *** [test.so] Error 1

I'm not the only one to get these errors, see e.g.
http://tbpl.mozilla.org/?tree=Try&rev=bed0a8106c45
http://tbpl.mozilla.org/?tree=Try&rev=df89a4ecc7ff

CC'ing glandium about the elf-hack message.
Comment 25 Mike Hommey [:glandium] 2011-06-01 15:30:43 PDT
(In reply to comment #24)
> Hey!!! Is tryserver busted?
> 
> My tryserver builds fail with this:
> 
> ===
> === If you get failures below, please file a bug describing the error
> === and your environment (compiler and linker versions), and use
> === --disable-elf-hack until this is fixed.
> ===
> /builds/slave/try-lnx-dbg/build/obj-firefox/build/unix/elfhack/elfhack -b
> test.so
> /builds/slave/try-lnx-dbg/build/obj-firefox/build/unix/elfhack/elfhack:
> /tools/gcc-4.3.3/installed/lib/libstdc++.so.6: version `GLIBCXX_3.4.14' not
> found (required by
> /builds/slave/try-lnx-dbg/build/obj-firefox/build/unix/elfhack/elfhack)
> NEXT ERROR make[7]: *** [test.so] Error 1

https://wiki.mozilla.org/index.php?title=ReleaseEngineering/TryServer#Using_older_GCC

(You should have gotten a warning when pushing, but maybe the hook didn't work properly?)
 
> I'm not the only one to get these errors, see e.g.
> http://tbpl.mozilla.org/?tree=Try&rev=bed0a8106c45
> http://tbpl.mozilla.org/?tree=Try&rev=df89a4ecc7ff

Not the same errors at all.
Comment 26 Benoit Jacob [:bjacob] (mostly away) 2011-06-02 11:52:19 PDT
(In reply to comment #25)
> https://wiki.mozilla.org/index.php?title=ReleaseEngineering/
> TryServer#Using_older_GCC
> 
> (You should have gotten a warning when pushing, but maybe the hook didn't
> work properly?)

Thanks for the tip, and you're right, this gets printed when I push, I just hadn't seen it. New try push:
http://tbpl.mozilla.org/?tree=Try&rev=248aead2bb55

Also I've made a local build and it works fine.
Comment 27 Johnathan Nightingale [:johnath] 2011-06-02 14:23:32 PDT
Comment on attachment 535112 [details] [diff] [review]
rewrite GLContext::ClearSafely

Can this please land today on beta?
Comment 28 Benoit Jacob [:bjacob] (mostly away) 2011-06-02 15:14:30 PDT
Landed on beta:
http://hg.mozilla.org/releases/mozilla-beta/rev/bca54750eec4
Comment 29 Benoit Jacob [:bjacob] (mostly away) 2011-06-16 11:47:34 PDT
CC'ing Ken as Khronos is releasing a statement about that in response to today's blog post from Context.
Comment 30 Benoit Jacob [:bjacob] (mostly away) 2011-06-16 11:55:01 PDT
@ Paul Stone:
You knew that this was fixed in Firefox 5, which is going to be released in 5 days from now. Why did you blog about it today?
http://www.contextis.com/resources/blog/webgl2/
Comment 31 Paul Stone 2011-06-16 12:17:54 PDT
Timing wasn't down to me, I'm afraid. I just did the PoC.
Comment 32 Benoit Jacob [:bjacob] (mostly away) 2011-06-16 12:51:05 PDT
Speaking about timing, just a few hours after Context's blog post, Microsoft released this:

http://blogs.technet.com/b/srd/archive/2011/06/16/webgl-considered-harmful.aspx
Comment 33 Benoit Jacob [:bjacob] (mostly away) 2011-06-16 13:17:09 PDT
Created attachment 539878 [details]
Microsoft TechNet - WebGL considered harmful - at 9:21 AM Pacific time today

Microsoft's post says 9:21 AM. Looking at other posts, clearly they use Pacific time.

When exactly was Context's post published?
Comment 34 Daniel Veditz [:dveditz] 2011-06-16 14:14:48 PDT
The MS blog is no surprise, they expressed those same three concerns to us when Vlad first started talking about WebGL a few years ago.
Comment 35 Benoit Jacob [:bjacob] (mostly away) 2011-06-16 15:16:34 PDT
http://www.google.com/#sclient=psy&hl=en&source=hp&q=http:%2F%2Fwww.contextis.com%2Fresources%2Fblog%2Fwebgl2&aq=f&aqi=&aql=&oq=&pbx=1&bav=on.2,or.r_gc.r_pw.&fp=9dd7129c70b71d22&biw=1362&bih=625&as_qdr=y15

currently says '8 hours ago' and just a minute ago it said '7 hours ago'. That means it went online exactly 8 hours ago, which means 7:10 AM Pacific time (it's currently 15:10 Pacific time).

http://www.contextis.co.uk/resources/blog/blog.xml confirms it was published June 16 but doesn't give a time of the day.

So, assuming that (my interpretation of) Google's measurement is correct, there is only 2 hours and 10 minutes between the publication of Context's post, and the publication of the Microsoft TechNet article quoting it.

Please, someone give me an explanation that makes me drop my tinfoil hat.
Comment 36 Benoit Jacob [:bjacob] (mostly away) 2011-06-16 15:17:45 PDT
(In reply to comment #34)
> The MS blog is no surprise, they expressed those same three concerns to us
> when Vlad first started talking about WebGL a few years ago.

But they quote this new article by context, that was published just 2 hours before.
Comment 37 Kenneth Russell 2011-06-16 15:18:41 PDT
Fnord!
Comment 38 Benoit Jacob [:bjacob] (mostly away) 2011-06-16 16:47:28 PDT
http://www.google.com/#sclient=psy&hl=en&source=hp&q=http:%2F%2Fwww.contextis.com%2Fresources%2Fblog%2Fwebgl2&aq=f&aqi=&aql=&oq=&pbx=1&fp=1&biw=1362&bih=625&as_qdr=y15&bav=on.2,or.r_gc.r_pw.&cad=b

now says '23 hours ago' which is not plausible as that would mean it was published at midnight, UK time...

I suppose the timestamp got cleared to 00:00:00, as in http://www.contextis.co.uk/resources/blog/blog.xml where it's Thur, 16 Jun 2011 00:00:00 +0000

What, if anything, can/should we say about this affair? Just ignore it? Or just mention, in a way as neutral as possible, that at most a few hours elapsed between Context's blog and Microsoft's article quoting it?
Comment 39 Robert O'Callahan (:roc) (email my personal email if necessary) 2011-06-16 21:28:35 PDT
We shouldn't say anything about it. We can't prove anything and it wouldn't matter if we could.
Comment 40 Benoit Jacob [:bjacob] (mostly away) 2011-06-16 21:42:58 PDT
Yeah, OK, I had come to the same conclusion.
Comment 42 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-09-22 13:52:52 PDT
qa?: what is the minimized testcase for QA to verify this fix? I understand comment 7 but am unclear as to how to use it.
Comment 43 Benoit Jacob [:bjacob] (mostly away) 2011-09-22 14:25:48 PDT
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #42)
> qa?: what is the minimized testcase for QA to verify this fix? I understand
> comment 7 but am unclear as to how to use it.

Just run the testcase: if the bug still existed, you would see images from other tabs / other apps / your desktop appearing in the canvas.
Comment 44 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2011-09-22 17:45:02 PDT
qa+ for verification on Firefox 7 using the test on comment 43
Comment 45 Vlad [QA] 2011-09-23 06:00:34 PDT
Setting resolution to Verified Fixed on:
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:9.0a1) Gecko/20110922 Firefox/9.0a1
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:8.0a2) Gecko/20110921 Firefox/8.0a2
Mozilla/5.0 (Macintosh; Intel Mac OS X 10.6; rv:7.0) Gecko/20100101 Firefox/7.0

I run the test case and no tabs/apps/desktop appeared on the canvas.
Comment 46 Raymond Forbes[:rforbes] 2013-07-19 18:40:23 PDT
rforbes-bugspam-for-setting-that-bounty-flag-20130719

Note You need to log in before you can comment on or make changes to this bug.