Closed Bug 1601703 Opened 4 years ago Closed 4 years ago

Crash in [@ gl::Program::getInfoLogLength]

Categories

(Core :: Graphics: CanvasWebGL, defect, P2)

72 Branch
All
Windows
defect

Tracking

()

RESOLVED FIXED
mozilla73
Tracking Status
firefox-esr68 --- unaffected
firefox71 --- unaffected
firefox72 + fixed
firefox73 --- fixed

People

(Reporter: philipp, Assigned: jgilbert)

References

(Regression)

Details

(Keywords: crash, regression)

Crash Data

Attachments

(2 files)

This bug is for crash report bp-9093fd88-3e3f-4d0c-8057-318490191205.

Top 10 frames of crashing thread:

0 libglesv2.dll gl::Program::getInfoLogLength gfx/angle/checkout/src/libANGLE/Program.cpp:1824
1 libglesv2.dll gl::QueryProgramiv gfx/angle/checkout/src/libANGLE/queryutils.cpp:1153
2 libglesv2.dll gl::Context::getProgramiv gfx/angle/checkout/src/libANGLE/Context.cpp:5814
3 xul.dll mozilla::gl::GLContext::fGetProgramiv gfx/gl/GLContext.h:1266
4 xul.dll void mozilla::WebGLProgram::LinkAndUpdate dom/canvas/WebGLProgram.cpp:1515
5 xul.dll mozilla::WebGLProgram::LinkProgram dom/canvas/WebGLProgram.cpp:1238
6 xul.dll mozilla::WebGLContext::LinkProgram dom/canvas/WebGLContextGL.cpp:1122
7 xul.dll static bool mozilla::dom::WebGLRenderingContext_Binding::linkProgram dom/bindings/WebGLRenderingContextBinding.cpp:19220
8 xul.dll mozilla::dom::binding_detail::GenericMethod<mozilla::dom::binding_detail::NormalThisPolicy, mozilla::dom::binding_detail::ThrowExceptions> dom/bindings/BindingUtils.cpp:3153
9 xul.dll js::InternalCallOrConstruct js/src/vm/Interpreter.cpp:548

this crash signature is starting to show up since firefox 72. so far it's fairly low volume and most reports show fairly high memory percentage use and the signature is skewed towards 32bit builds.

Sounds like OOM handling in the update we just took.

I'll take a look.

Assignee: nobody → jgilbert
Component: Graphics → Canvas: WebGL
Priority: -- → P2

Marking 71 as unaffected as all the crashes are in 72.

See Also: → 1603138

Jeff any luck here? We're getting close to the end of beta72, what with the holidays...

Flags: needinfo?(jgilbert)

This actually looks like bad behavior in ANGLE in this case during context loss.

Flags: needinfo?(jgilbert)

My read is:

We violate this assert:
ASSERT(program != nullptr || pname == GL_COMPLETION_STATUS_KHR);
https://searchfox.org/mozilla-central/rev/c79fc6e32a5b561205a71fcfef9112d34b4037ae/gfx/angle/checkout/src/libANGLE/queryutils.cpp#1129

Because the context is lost:

void Context::getProgramiv(GLuint program, GLenum pname, GLint *params)
{
    Program *programObject = nullptr;
    if (!mContextLost)
    {
        // Don't resolve link if checking the link completion status.
        programObject = (pname == GL_COMPLETION_STATUS_KHR ? getProgramNoResolveLink(program)
                                                           : getProgramResolveLink(program));
        ASSERT(programObject);
    }
    QueryProgramiv(this, programObject, pname, params);
}

https://searchfox.org/mozilla-central/rev/c79fc6e32a5b561205a71fcfef9112d34b4037ae/gfx/angle/checkout/src/libANGLE/Context.cpp#5807

Normally, validation would prevent this call:

        if (context->skipValidation() || ValidateGetProgramiv(context, program, pname, params))
        {
            context->getProgramiv(program, pname, params);
        }

https://searchfox.org/mozilla-central/rev/c79fc6e32a5b561205a71fcfef9112d34b4037ae/gfx/angle/checkout/src/libGLESv2/entry_points_gles_2_0_autogen.cpp#1184

But since we ask for NO_VALIDATION, we skipValidation(), skipping the is-lost check, and causing our null-deref.
https://searchfox.org/mozilla-central/rev/c79fc6e32a5b561205a71fcfef9112d34b4037ae/gfx/gl/GLContextProviderEGL.cpp#566

Regressed by: angle-72
Has Regression Range: --- → yes
Pushed by jgilbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/264003682fbf
Cherry-pick lost no-error context fix for GetProgramiv. r=lsalzman
Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla73

Do you think this is OK for uplift? The RC build will be on Monday. I could check back Sunday night or early Monday morning.

Flags: needinfo?(jgilbert)

Approval Request Comment
[Feature/Bug causing the regression]: angle-72 update
[User impact if declined]: Occasional crashes
[Is this code covered by automated tests?]: Upstream yes, but not in our CI.
[Has the fix been verified in Nightly?]: no
[Needs manual test from QE? If yes, steps to reproduce]: no
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: low risk
[Why is the change risky/not risky?]: This code looks safe, and we've tested the fix upstream.
[String changes made/needed]: none

Flags: needinfo?(jgilbert)
Attachment #9117922 - Flags: review+
Attachment #9117922 - Flags: approval-mozilla-beta?
Comment on attachment 9117922 [details] [diff] [review]
0001-Bug-1601703-beta72-Cherry-pick-lost-no-error-context.patch

May be risky but the crash volume in beta is high enough that I think it's worth a try for the RC.
Attachment #9117922 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: