Closed Bug 890432 Opened 6 years ago Closed 6 years ago

WebGL conformance-test-suite crash with long-expressions-should-not-crash [@glpCGNode]

Categories

(Core :: Canvas: WebGL, defect, critical)

All
macOS
defect
Not set
critical

Tracking

()

VERIFIED FIXED
mozilla27
Tracking Status
firefox25 - wontfix
firefox26 - verified
firefox27 --- verified
firefox-esr17 --- wontfix
firefox-esr24 --- wontfix
b2g18 --- wontfix

People

(Reporter: posidron, Assigned: jgilbert)

References

(Blocks 1 open bug)

Details

(Keywords: crash, csectype-dos, testcase, Whiteboard: [adv-main26+])

Crash Data

Attachments

(2 files, 1 obsolete file)

Testcase: https://www.khronos.org/registry/webgl/sdk/tests/conformance/glsl/bugs/long-expressions-should-not-crash.html

Stack:

#0  0x00000001256ee609 in glpCGNode ()
#1  0x00000001256f03ad in glpCGPPStreamOpNode ()
#2  0x00000001256eff8f in glpCGNode ()
#3  0x00000001256f03ad in glpCGPPStreamOpNode ()
#4  0x00000001256eff8f in glpCGNode ()
[...]
#4403 0x00000001256f03ad in glpCGPPStreamOpNode ()
#4404 0x00000001256eff8f in glpCGNode ()
#4405 0x00000001256f03ad in glpCGPPStreamOpNode ()
#4406 0x00000001256eff8f in glpCGNode ()
#4407 0x00000001256f03ad in glpCGPPStreamOpNode ()
[...]

Tested with http://hg.mozilla.org/integration/mozilla-inbound/rev/a9d6d86a8090
Firefox 22 is also affected.

  Chipset Model: NVIDIA GeForce GT 650M
  Type: GPU
  Bus: PCIe
  PCIe Lane Width: x8
  VRAM (Total):	1024 MB
  Vendor: NVIDIA (0x10de)
  Device ID: 0x0fd5
  Revision ID: 0x00a2
  ROM Revision: 3688
  gMux Version: 3.2.19 [3.2.8]


ProductName:	Mac OS X
ProductVersion:	10.8.4
BuildVersion:	12E55
Is there anything under all those repeating pairs on the stack?
Flags: needinfo?(cdiehl)
I haven't looked deeper into it and I do not have any symbols for the graphic drivers.
Flags: needinfo?(cdiehl)
Is there any more information along with the call tree you can get out of gdb? registers, a bit of dissassembly, the reason for the crash?
Flags: needinfo?(cdiehl)
Attached file callstack.txt
At frame #5467 we finally get some real stack to see. I have attached the stack for those frames.
Flags: needinfo?(cdiehl)
Jeff what do you make of the stack in comment 4?
Flags: needinfo?(jgilbert)
(In reply to David Bolter [:davidb] from comment #5)
> Jeff what do you make of the stack in comment 4?

It looks like a driver bug we're hitting. I imagine the newer versions of ANGLE work around such issues.
Flags: needinfo?(jgilbert)
Jeff: do we have a bug open to update ANGLE that this could depend on? Or do we morph this bug into the "update ANGLE (this time)" bug?

What is our regular process for tracking when we need to update ANGLE?
Flags: needinfo?(jgilbert)
The most recent ANGLE update is bug 883478.
Flags: needinfo?(jgilbert)
The stack looks like it went into a recursive loop until it ran out of resources. Those generally aren't exploitable unless we see other scary signs.
Depends on: 883478
Keywords: csec-dos
Oops, I was testing Win7.
Looks like we hit this even with that ANGLE update.
Attached patch limit-mac-nv-complex (obsolete) — Splinter Review
This should do it. I'll spin up a Try build and test it locally.
Assignee: nobody → jgilbert
Attachment #806935 - Flags: review?(bjacob)
Oops, fixed whitespace. The EOF change here is adding a newline at EOF, where there wasn't one before.
Attachment #806935 - Attachment is obsolete: true
Attachment #806935 - Flags: review?(bjacob)
Attachment #806939 - Flags: review?(bjacob)
Comment on attachment 806939 [details] [diff] [review]
patch: Limit expressions complexity to 1000 on Mac+NV

Review of attachment 806939 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/canvas/src/WebGLContextGL.cpp
@@ +3021,5 @@
> +        if (gl->WorkAroundDriverBugs()) {
> +#ifdef XP_MACOSX
> +            if (gl->Vendor() == gl::GLContext::VendorNVIDIA) {
> +                // Work around bug 890432
> +                resources.MaxExpressionComplexity = 1000;

Please make this a std::min instead.

@@ +3107,5 @@
>          int compileOptions = SH_ATTRIBUTES_UNIFORMS |
>                               SH_ENFORCE_PACKING_RESTRICTIONS;
>  
> +        if (resources.MaxExpressionComplexity > 0) {
> +            compileOptions |= SH_LIMIT_EXPRESSION_COMPLEXITY;

Is this needed outside of the above case?

If not, please reorganize this code so that it's only done at the same place.

@@ +3119,5 @@
>  
>          if (useShaderSourceTranslation) {
>              compileOptions |= SH_OBJECT_CODE
>                              | SH_MAP_LONG_VARIABLE_NAMES;
> +

Whitespace
Attachment #806939 - Flags: review?(bjacob) → review+
(In reply to Benoit Jacob [:bjacob] from comment #15)
> Comment on attachment 806939 [details] [diff] [review]
> patch: Limit expressions complexity to 1000 on Mac+NV
> 
> Review of attachment 806939 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: content/canvas/src/WebGLContextGL.cpp
> @@ +3021,5 @@
> > +        if (gl->WorkAroundDriverBugs()) {
> > +#ifdef XP_MACOSX
> > +            if (gl->Vendor() == gl::GLContext::VendorNVIDIA) {
> > +                // Work around bug 890432
> > +                resources.MaxExpressionComplexity = 1000;
> 
> Please make this a std::min instead.
std::min doesn't make sense here. The default for this is 0, but it's also disabled by default.
> 
> @@ +3107,5 @@
> >          int compileOptions = SH_ATTRIBUTES_UNIFORMS |
> >                               SH_ENFORCE_PACKING_RESTRICTIONS;
> >  
> > +        if (resources.MaxExpressionComplexity > 0) {
> > +            compileOptions |= SH_LIMIT_EXPRESSION_COMPLEXITY;
> 
> Is this needed outside of the above case?
> 
> If not, please reorganize this code so that it's only done at the same place.
I think it's much better here. We always want to activate SH_LIMIT_EXPR_COMPL whenever we have a value set for MaxExprCompl. We could set it in the same place, but this would require me to hoist this code about 80 lines further up, leaving us with a mysterious 'compileOptions' var that isn't used for a while yet.

I think it's better to do it this way, such that we activate it if we use it. When other cases come up where we might want to limit complexity, we don't have to do anything here: We just need to set MaxExprCompl to something non-zero. (non-default)
> 
> @@ +3119,5 @@
> >  
> >          if (useShaderSourceTranslation) {
> >              compileOptions |= SH_OBJECT_CODE
> >                              | SH_MAP_LONG_VARIABLE_NAMES;
> > +
> 
> Whitespace
I think I'm going to put my editor back in kill-all-whitespace mode, and deal with the side-effects.
https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=1430526cf6e5

I've also verified that this is fixed on the machine I could repro this on.
Hardware: x86_64 → All
Comment on attachment 806939 [details] [diff] [review]
patch: Limit expressions complexity to 1000 on Mac+NV

[Approval Request Comment]
Bug caused by (feature/regressing bug #): n/a
User impact if declined: Leaves users of Macs with NVidia GPUs vulnerable to DOS.
Testing completed (on m-c, etc.): On inbound.
Risk to taking this patch (and alternatives if risky): very low
String or IDL/UUID changes made by this patch: none

This is probably not worth uplifting to beta. (It might not even be possible, if our older version of ANGLE doesn't support this)
Attachment #806939 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/1430526cf6e5
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27
Doesn't need to track as long as this is a DOS. Will leave to approval triage on whether or not to uplift.
Comment on attachment 806939 [details] [diff] [review]
patch: Limit expressions complexity to 1000 on Mac+NV

Low risk, can take on Aurora.
Attachment #806939 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
https://hg.mozilla.org/releases/mozilla-aurora/rev/2f3a72f707c9

(In reply to Jeff Gilbert [:jgilbert] from comment #17)

BTW, TBPL links will go stale after 30 days, so please post links to hg.mozilla.org when pushing to inbound :)
(In reply to Ryan VanderMeulen [:RyanVM UTC-4] from comment #22)
> https://hg.mozilla.org/releases/mozilla-aurora/rev/2f3a72f707c9
> 
> (In reply to Jeff Gilbert [:jgilbert] from comment #17)
> 
> BTW, TBPL links will go stale after 30 days, so please post links to
> hg.mozilla.org when pushing to inbound :)

Alright, and thanks!
Keywords: verifyme
Whiteboard: [good first verify]
oops, I shouldn't have marked this secure bug with [good first verify]. New testers won't have privs to see it.  

The test case doesn't crash latest Nightly nor Aurora builds
Status: RESOLVED → VERIFIED
Keywords: verifyme
Whiteboard: [good first verify]
Whiteboard: [adv-main26+]
Group: core-security
You need to log in before you can comment on or make changes to this bug.