Closed Bug 1560736 Opened 5 years ago Closed 5 years ago

Perma [tier 2] toolkit/components/url-classifier/tests/mochitest/test_socialtracking.html | test_socialtracking_annotate.html | application crashed [@ mozilla::gl::GLContext::BeforeGLCall(char const*) const]

Categories

(Core :: Graphics: Layers, defect, P1)

Unspecified
Android
defect

Tracking

()

RESOLVED FIXED
mozilla69
Tracking Status
firefox-esr60 --- unaffected
firefox-esr68 --- fixed
firefox68 --- wontfix
firefox69 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: jgilbert)

References

(Regression)

Details

(Keywords: crash, intermittent-failure, regression, Whiteboard: [stockwell disabled])

Crash Data

Attachments

(4 files)

Filed by: aciure [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=252972862&repo=autoland
Full log: https://queue.taskcluster.net/v1/task/fBJvSZ5-RHKCC9vcUiwXaA/runs/0/artifacts/public/logs/live_backing.log


[task 2019-06-22T21:00:17.197Z] 21:00:17 INFO - 3785 INFO TEST-START | toolkit/components/url-classifier/tests/mochitest/test_socialtracking.html
[task 2019-06-22T21:00:27.417Z] 21:00:27 INFO - GECKO | nents/url-classifier/tests/mochitest/test_socialtracking.html","js_source":"TestRunner.js"}
[task 2019-06-22T21:00:37.635Z] 21:00:37 INFO - GECKO | ⰲ겿
[task 2019-06-22T21:00:47.857Z] 21:00:47 INFO - GECKO | 겿
[task 2019-06-22T21:00:58.077Z] 21:00:58 INFO - GECKO |
[task 2019-06-22T21:01:18.625Z] 21:01:18 INFO - wait for org.mozilla.geckoview.test complete; top activity=com.android.launcher3
[task 2019-06-22T21:01:18.732Z] 21:01:18 INFO - remoteautomation.py | Application ran for: 0:01:35.396641
[task 2019-06-22T21:01:19.188Z] 21:01:19 INFO - mozcrash Copy/paste: /builds/worker/workspace/build/linux64-minidump_stackwalk /tmp/tmpARrurN/24897735-6174-d6e6-2834-4f6d68ea2538.dmp /builds/worker/workspace/build/symbols
[task 2019-06-22T21:01:25.186Z] 21:01:25 INFO - mozcrash Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/24897735-6174-d6e6-2834-4f6d68ea2538.dmp
[task 2019-06-22T21:01:25.186Z] 21:01:25 INFO - mozcrash Saved app info as /builds/worker/workspace/build/blobber_upload_dir/24897735-6174-d6e6-2834-4f6d68ea2538.extra
[task 2019-06-22T21:01:25.194Z] 21:01:25 WARNING - PROCESS-CRASH | toolkit/components/url-classifier/tests/mochitest/test_socialtracking.html | application crashed [@ mozilla::gl::GLContext::BeforeGLCall(char const*) const]
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - Crash dump filename: /tmp/tmpARrurN/24897735-6174-d6e6-2834-4f6d68ea2538.dmp
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - Operating system: Android
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - 0.0.0 Linux 3.10.0+ #1 PREEMPT Thu Jan 5 00:46:30 UTC 2017 x86_64
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - CPU: amd64
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - family 6 model 2 stepping 3
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - 1 CPU
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - GPU: UNKNOWN
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - Crash reason: SIGSEGV /SEGV_MAPERR
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - Crash address: 0x0
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - Process uptime: not available
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - Thread 35 (crashed)
[task 2019-06-22T21:01:25.194Z] 21:01:25 INFO - 0 libxul.so!mozilla::gl::GLContext::BeforeGLCall(char const*) const [GLContext.h:11dc1c09bb580dfec90d93bfdf2fae386f670d0e : 663 + 0x29]
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - rax = 0x0000786c14fff743 rdx = 0x0000000000000004
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - rcx = 0x0000786c183f2a88 rbx = 0x0000786c06465000
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - rsi = 0x0000786c06afe440 rdi = 0x000000000000001b
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - rbp = 0x0000786c06afeb10 rsp = 0x0000786c06afeaf0
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - r8 = 0x0000000000000000 r9 = 0x0000786c34bb5090
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - r10 = 0x0000000000000022 r11 = 0x0000000000000246
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - r12 = 0x0000000000000303 r13 = 0x0000000000000001
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - r14 = 0x0000786c150170db r15 = 0x0000000000000001
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - rip = 0x0000786c10b4f798
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - Found by: given as instruction pointer in context
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - 1 libxul.so!mozilla::gl::GLContext::fBlendFuncSeparate(unsigned int, unsigned int, unsigned int, unsigned int) [GLContext.h:11dc1c09bb580dfec90d93bfdf2fae386f670d0e : 845 + 0x24]
[task 2019-06-22T21:01:25.195Z] 21:01:25 INFO - rbx = 0x0000786c06465000 rbp = 0x0000786c06afeb70
[task 2019-06-22T21:01:25.196Z] 21:01:25 INFO - rsp = 0x0000786c06afeb20 r12 = 0x0000000000000303
[task 2019-06-22T21:01:25.196Z] 21:01:25 INFO - r13 = 0x0000000000000001 r14 = 0x0000786c06afeb28
[task 2019-06-22T21:01:25.196Z] 21:01:25 INFO - r15 = 0x0000000000000001 rip = 0x0000786c10ba02cc
[task 2019-06-22T21:01:25.199Z] 21:01:25 INFO - Found by: call frame info
[task 2019-06-22T21:01:25.199Z] 21:01:25 INFO - 2 libxul.so!mozilla::layers::CompositorOGL::BeginFrame(mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const*, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRegionTyped<mozilla::gfx::UnknownUnits> const&, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits>, mozilla::gfx::IntRectTyped<mozilla::gfx::UnknownUnits>) [CompositorOGL.cpp:11dc1c09bb580dfec90d93bfdf2fae386f670d0e : 792 + 0x20]
[task 2019-06-22T21:01:25.199Z] 21:01:25 INFO - rbx = 0x0000786c06b45820 rbp = 0x0000786c06afebf0
[task 2019-06-22T21:01:25.199Z] 21:01:25 INFO - rsp = 0x0000786c06afeb80 r12 = 0x0000000000000320
[task 2019-06-22T21:01:25.199Z] 21:01:25 INFO - r13 = 0x0000786c06b45760 r14 = 0x0000000000000470
[task 2019-06-22T21:01:25.199Z] 21:01:25 INFO - r15 = 0x0000786c06afec80 rip = 0x0000786c10ba1a99
[task 2019-06-22T21:01:25.199Z] 21:01:25 INFO - Found by: call frame info

Flags: needinfo?(amarchesini)
Regressed by: 1560040
Summary: Intermittent toolkit/components/url-classifier/tests/mochitest/test_socialtracking.html | application crashed [@ mozilla::gl::GLContext::BeforeGLCall(char const*) const] → Perma [tier 2] toolkit/components/url-classifier/tests/mochitest/test_socialtracking.html | test_socialtracking_annotate.html | application crashed [@ mozilla::gl::GLContext::BeforeGLCall(char const*) const]

It seems to be an unrelated crash. Dimi, do you have any clue why this is happening?

Flags: needinfo?(amarchesini) → needinfo?(dlee)

We have a similar one in test_cryptomining_annotate.html (Bug 1554025).
But I think this is not related to the testcase itself; my guess is that something goes wrong while "shutting down" the testcases.

Before test_socialtracking_annotate.html was added, test_cryptomining_annotate.html was the last url-classifier mochitest. This crash[1] shows
that the testcase completes and then the crash happens. Here is another similar one[2] for test_socialtracking_annotate.html.

I'll check with the graphics team to see if they can help.

[1] https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=252816367&repo=mozilla-inbound&lineNumber=7233
[2] https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=253000766&repo=autoland&lineNumber=7442

Hi Jeff,
I am not sure whom I should ask for help on this; please let me know if I should needinfo other people.

The Safe Browsing testcase seems to trigger an unrelated crash, and I am not sure if this is related to Bug 1553046.
Would you mind suggesting how we should investigate this issue? Thanks!

Flags: needinfo?(dlee) → needinfo?(jgilbert)

Crasher:
https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/gl/GLContext.h#663

MOZ_GL_ASSERT(this, IsCurrentImpl());

Caller:
https://searchfox.org/mozilla-central/rev/06bd14ced96f25ff1dbd5352cb985fc0fa12a64e/gfx/layers/opengl/CompositorOGL.cpp#792

It looks like it should still be current:

  // If the widget size changed, we have to force a MakeCurrent
  // to make sure that GL sees the updated widget size.
  if (mWidgetSize.width != width || mWidgetSize.height != height) {
    MakeCurrent(ForceMakeCurrent);

    mWidgetSize.width = width;
    mWidgetSize.height = height;
  } else {
    MakeCurrent();
  }

  mPixelsPerFrame = width * height;
  mPixelsFilled = 0;

#ifdef MOZ_WIDGET_ANDROID
  java::GeckoSurfaceTexture::DestroyUnused((int64_t)mGLContext.get());
#endif

  // Default blend function implements "OVER"
  mGLContext->fBlendFuncSeparate(LOCAL_GL_ONE, LOCAL_GL_ONE_MINUS_SRC_ALPHA,
                                 LOCAL_GL_ONE, LOCAL_GL_ONE_MINUS_SRC_ALPHA);

snorp:
This is Android, so we have a call to GeckoSurfaceTexture::DestroyUnused, so maybe that's stealing/changing MakeCurrent?

jrmuizel:
Anything related changed in CompositorOGL recently?

Moving to Core/Graphics: Layers.

Component: Safe Browsing → Graphics: Layers
Flags: needinfo?(snorp)
Flags: needinfo?(jmuizelaar)
Flags: needinfo?(jgilbert)
OS: Unspecified → Android
Priority: -- → P1
Product: Toolkit → Core

It's possible that the DestroyUnused() call could be mucking with things, but I don't see anything obvious. https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/gecko/gfx/GeckoSurfaceTexture.java#204

Flags: needinfo?(snorp)

I can think of no recent changes in CompositorOGL. Maybe jnicol knows something?

Flags: needinfo?(jmuizelaar) → needinfo?(jnicol)

BeforeGLCall crashes have been seen for other tests on this platform, like bug 1558285, bug 1553046, bug 1559680.

Well we're blessed with this (for now) perma-orange, so we should take the opportunity to dig into it.

It does seem to be happening on Shutdown. Maybe race-y shutdown handling on Android with CompositorOGL?

Jeff, this has now been disabled. Are you by any chance working on a fix?

Flags: needinfo?(jgilbert)
Keywords: leave-open
Whiteboard: [stockwell needswork:owner] → [stockwell disabled]
Pushed by apavel@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/b44fb6ba1a4f
disabled test_socialtracking_annotate.html on android debug r=jmaher

:apavel - The line added in comment 20 looks good, but I think you forgot to delete the incorrect line!

Flags: needinfo?(apavel)

Ah, I just edited that line; I didn't delete it. Will fix it now. Thank you.

Flags: needinfo?(apavel)

I can try some things.
FWIW, the orange is on test-android-em-7.0-x86_64/debug-mochitest-e10s-4.

Assignee: nobody → jgilbert
Flags: needinfo?(jgilbert)
Pushed by csabou@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/02010cfcae6e
Disable test_socialtracking.html for frequent failures. a=testonly

I'm not aware of any recent changes that might have caused this, I'm afraid.

Flags: needinfo?(jnicol)

Seems like GeckoSurfaceTexture::DestroyUnused can change the current context.

Let's check:
https://treeherder.mozilla.org/#/jobs?repo=try&author=jgilbert%40mozilla.com&selectedJob=254481935

07-02 23:27:39.470 F/MOZ_Assert(14423): Assertion failure: mGLContext->IsCurrent() (After DestroyUnused), at /builds/worker/workspace/build/src/gfx/layers/opengl/CompositorOGL.cpp:793

Welp!
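
For readers following along, here is a minimal, self-contained sketch of the failure mode that the assertion above confirms. It is a toy model, not Gecko code: ToyContext, gCurrent, and DestroyUnusedToy are hypothetical names invented for illustration, and the assumption that the cleanup makes some other context current without restoring the previous one is only what the assertion message suggests.

```cpp
#include <cassert>

namespace stolen_context_demo {

struct ToyContext;

// Models the per-thread "current context" slot that a GL driver keeps.
thread_local const ToyContext* gCurrent = nullptr;

// Toy stand-in for a GL context; NOT the real mozilla::gl::GLContext.
struct ToyContext {
  bool IsCurrent() const { return gCurrent == this; }
  void MakeCurrent() const { gCurrent = this; }
};

// Hypothetical cleanup helper. Assumption: it needs a different context
// current to release its resources and does not restore the old one.
inline void DestroyUnusedToy(const ToyContext& other) {
  other.MakeCurrent();
}

// Returns whether the compositor context is still current after cleanup.
inline bool CompositorStillCurrentAfterCleanup() {
  ToyContext compositor;
  ToyContext surfaceTexture;

  compositor.MakeCurrent();
  assert(compositor.IsCurrent());  // holds right after MakeCurrent

  DestroyUnusedToy(surfaceTexture);  // silently switches the current context

  const bool stillCurrent = compositor.IsCurrent();
  gCurrent = nullptr;  // do not leave a dangling pointer behind
  return stillCurrent;
}

}  // namespace stolen_context_demo
```

In this model the function returns false, which corresponds to the IsCurrent() check failing immediately after the cleanup call.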

Let's try re-enabling those tests:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d8e4f5d51bc3cca4b94da4008f3de68f4e6fa171

Looks good to me! Nothing perma-orange, at least?

Pushed by jgilbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7cfd4060d3f5
MakeCurrent after GeckoSurfaceTexture::DestroyUnused. r=snorp
https://hg.mozilla.org/integration/autoland/rev/120801b7d0e1
Re-enable tests.
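
As a rough illustration of the ordering named in the first commit message above (MakeCurrent again after the cleanup call), here is a toy sketch. It does not reproduce the actual patch: ToyContext, gCurrent, CleanupThatSwitchesContext, and BeginFrameToy are hypothetical names, and the context-switching cleanup is an assumed stand-in for GeckoSurfaceTexture::DestroyUnused.

```cpp
#include <cassert>

namespace remake_current_demo {

struct ToyContext;

// Models the per-thread "current context" slot.
thread_local const ToyContext* gCurrent = nullptr;

// Toy stand-in for a GL context; NOT the real mozilla::gl::GLContext.
struct ToyContext {
  bool IsCurrent() const { return gCurrent == this; }
  void MakeCurrent() const { gCurrent = this; }
};

// Hypothetical cleanup that leaves a different context current.
inline void CleanupThatSwitchesContext(const ToyContext& other) {
  other.MakeCurrent();
}

// Sketch of a frame setup that re-establishes the compositor context
// after the cleanup, before issuing any further GL calls.
inline bool BeginFrameToy(const ToyContext& compositor,
                          const ToyContext& other) {
  compositor.MakeCurrent();
  CleanupThatSwitchesContext(other);
  compositor.MakeCurrent();  // restore before the next GL call

  // A real compositor would now issue GL calls such as the blend setup;
  // here we only report whether the context is current again.
  return compositor.IsCurrent();
}

// Convenience wrapper so the sketch can be exercised in one call.
inline bool RunBeginFrameToy() {
  ToyContext compositor;
  ToyContext other;
  const bool ok = BeginFrameToy(compositor, other);
  gCurrent = nullptr;  // do not leave a dangling pointer behind
  return ok;
}

}  // namespace remake_current_demo
```

With the second MakeCurrent in place, the check that failed in the try push holds in this model.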
See Also: → 1560330
Status: NEW → RESOLVED
Closed: 5 years ago
Keywords: leave-open
Resolution: --- → FIXED
Regressions: 1560761
Regressions: 1563443

(In reply to Intermittent Failures Robot from comment #42)

80 failures in 3501 pushes (0.023 failures/push) were associated with this bug in the last 7 days.

This is the #10 most frequent failure this week.

** This failure happened more than 75 times this week! Resolving this bug is a very high priority. **

** Try to resolve this bug as soon as possible. If unresolved for 1 week, the affected test(s) may be disabled. **

Repository breakdown:

  • autoland: 44
  • mozilla-beta: 3
  • mozilla-inbound: 13
  • mozilla-central: 14
  • try: 4
  • mozilla-esr68: 2

Platform breakdown:

  • android-em-7-0-x86_64: 77
  • android-em-7-0-x86_64-beta: 2
  • macosx1014-64: 1

For more details, see:
https://treeherder.mozilla.org/intermittent-failures.html#/bugdetails?bug=1560736&startday=2019-07-01&endday=2019-07-07&tree=all

What a nice down-and-to-the-right graph! :D It's good to see it fixed!

(In reply to Pulsebot from comment #39)

Pushed by jgilbert@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/7cfd4060d3f5
MakeCurrent after GeckoSurfaceTexture::DestroyUnused. r=snorp

Is this something we could safely uplift to ESR68 also to fix up the failures there?

Flags: needinfo?(jgilbert)
Target Milestone: --- → mozilla69

Comment on attachment 9075530 [details]
Bug 1560736 - MakeCurrent after GeckoSurfaceTexture::DestroyUnused.

ESR Uplift Approval Request

  • If this is not a sec:{high,crit} bug, please state case for ESR consideration: Fixes an intermittent orange.
  • User impact if declined: No reports of user impact with or without.
  • Fix Landed on Version: 69
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): Low risk: Calling MakeCurrent again is at worst a slight slowdown, but this is cold code, so that's no worry.
  • String or UUID changes made by this patch: none
Flags: needinfo?(jgilbert)
Attachment #9075530 - Flags: approval-mozilla-esr68?
Attachment #9075531 - Flags: approval-mozilla-esr68?
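
To illustrate the "at worst a slight slowdown" reasoning in the risk assessment, here is a toy sketch of a MakeCurrent that skips the driver call when the context is already current unless forced. Everything here is an assumption for illustration: ToyContext, gCurrent, gDriverSwitches, and the early-out behaviour are modelled on the MakeCurrent() / MakeCurrent(ForceMakeCurrent) distinction in the snippet quoted earlier in this bug, not copied from Gecko.

```cpp
#include <cassert>

namespace cheap_makecurrent_demo {

struct ToyContext;

// Models the per-thread "current context" slot.
thread_local const ToyContext* gCurrent = nullptr;

// Counts simulated expensive driver-level context switches.
int gDriverSwitches = 0;

// Toy stand-in for a GL context; NOT the real mozilla::gl::GLContext.
struct ToyContext {
  bool IsCurrent() const { return gCurrent == this; }

  // Assumed behaviour: skip the driver call if already current,
  // unless the caller forces it.
  void MakeCurrent(bool force = false) const {
    if (!force && IsCurrent()) {
      return;
    }
    ++gDriverSwitches;
    gCurrent = this;
  }
};

// Calls MakeCurrent three times on one context and reports how many
// simulated driver switches happened.
inline int SwitchesForRepeatedCalls(bool force) {
  gDriverSwitches = 0;
  gCurrent = nullptr;

  ToyContext ctx;
  ctx.MakeCurrent(force);
  ctx.MakeCurrent(force);
  ctx.MakeCurrent(force);

  gCurrent = nullptr;  // do not leave a dangling pointer behind
  return gDriverSwitches;
}

}  // namespace cheap_makecurrent_demo
```

In this model a redundant unforced call costs only a pointer comparison, while the extra call after the cleanup performs one real switch only when the context was actually changed.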

Comment on attachment 9075531 [details]
Bug 1560736 - Re-enable tests.

This isn't needed on esr68.

Attachment #9075531 - Flags: approval-mozilla-esr68?

Comment on attachment 9075530 [details]
Bug 1560736 - MakeCurrent after GeckoSurfaceTexture::DestroyUnused.

Fixes up some intermittent oranges. Approved for 68.1esr.

Attachment #9075530 - Flags: approval-mozilla-esr68? → approval-mozilla-esr68+
Has Regression Range: --- → yes