Closed Bug 1606549 Opened 4 years ago Closed 23 days ago

Intermittent Android crashtest <test-name> | application crashed [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext]

Categories

(Core :: Graphics: WebRender, defect, P3)

RESOLVED WORKSFORME

People

(Reporter: intermittent-bug-filer, Assigned: jnicol)

References

(Blocks 1 open bug)

Details

(Keywords: crash, intermittent-failure, Whiteboard: [retriggered][stockwell unknown])

Attachments

(1 file)

Filed by: aciure [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer.html#?job_id=283140202&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/SzdXI02KRNaQaG_Mqrgpew/runs/0/artifacts/public/logs/live_backing.log
Reftest URL: https://hg.mozilla.org/mozilla-central/raw-file/tip/layout/tools/reftest/reftest-analyzer.xhtml#logurl=https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/SzdXI02KRNaQaG_Mqrgpew/runs/0/artifacts/public/logs/live_backing.log&only_show_unexpected=1


[task 2020-01-01T13:26:57.941Z] 13:26:57 INFO - REFTEST TEST-START | http://10.0.2.2:8854/tests/dom/media/test/crashtests/1270303.html
[task 2020-01-01T13:26:57.941Z] 13:26:57 INFO - REFTEST TEST-LOAD | http://10.0.2.2:8854/tests/dom/media/test/crashtests/1270303.html | 623 / 3748 (16%)
[task 2020-01-01T13:27:18.673Z] 13:27:18 INFO - wait for org.mozilla.geckoview.test complete; top activity=com.android.launcher3
[task 2020-01-01T13:27:18.775Z] 13:27:18 INFO - remoteautomation.py | Application ran for: 0:02:10.018884
[task 2020-01-01T13:27:19.535Z] 13:27:19 INFO - REFTEST INFO | Copy/paste: /builds/worker/workspace/build/linux64-minidump_stackwalk /tmp/tmpEjVO4b/40c1ab12-8c29-9265-6955-1a70797e5105.dmp /builds/worker/workspace/build/symbols
[task 2020-01-01T13:27:24.178Z] 13:27:24 INFO - REFTEST INFO | Saved minidump as /builds/worker/workspace/build/blobber_upload_dir/40c1ab12-8c29-9265-6955-1a70797e5105.dmp
[task 2020-01-01T13:27:24.179Z] 13:27:24 INFO - REFTEST INFO | Saved app info as /builds/worker/workspace/build/blobber_upload_dir/40c1ab12-8c29-9265-6955-1a70797e5105.extra
[task 2020-01-01T13:27:24.185Z] 13:27:24 WARNING - REFTEST PROCESS-CRASH | http://10.0.2.2:8854/tests/dom/media/test/crashtests/1270303.html | application crashed [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()]
[task 2020-01-01T13:27:24.185Z] 13:27:24 INFO - Crash dump filename: /tmp/tmpEjVO4b/40c1ab12-8c29-9265-6955-1a70797e5105.dmp
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - Operating system: Android
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - 0.0.0 Linux 3.10.0+ #260 SMP PREEMPT Fri May 19 12:48:14 PDT 2017 x86_64
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - CPU: amd64
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - family 6 model 6 stepping 3
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - 4 CPUs
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - GPU: UNKNOWN
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - Crash reason: SIGSEGV /SEGV_MAPERR
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - Crash address: 0x0
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - Process uptime: not available
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - Thread 40 (crashed)
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - 0 libxul.so!mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext() [RenderAndroidSurfaceTextureHostOGL.cpp:a748a5149bda383173368cd3a8df84c8423f9f7b : 131 + 0x29]
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rax = 0x00007c64ebd733f4 rdx = 0x0000000000000001
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rcx = 0x00007c64efa3ca80 rbx = 0x0000000000000000
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rsi = 0x00007c64dbbb5ea0 rdi = 0x00007c64dbbb5bf0
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rbp = 0x00007c64dbbb5f30 rsp = 0x00007c64dbbb5ee0
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - r8 = 0x0000000000000b2c r9 = 0x00007c64dbbb6450
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - r10 = 0x00007c64e75d5a3b r11 = 0x0000000000000000
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - r12 = 0x00007c64d5d8fc00 r13 = 0x00007c64dbbb6080
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - r14 = 0x00007c64d5d8fc38 r15 = 0x00007c64dbbb5eec
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rip = 0x00007c64e75d5a4d
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - Found by: given as instruction pointer in context
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - 1 libxul.so!mozilla::wr::RenderAndroidSurfaceTextureHostOGL::PrepareForUse() [RenderAndroidSurfaceTextureHostOGL.cpp:a748a5149bda383173368cd3a8df84c8423f9f7b : 150 + 0x8]
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rbp = 0x00007c64dbbb5f70 rsp = 0x00007c64dbbb5f40
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rip = 0x00007c64e75d5c12
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - Found by: previous frame's frame pointer
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - 2 libxul.so!mozilla::wr::RenderThread::HandlePrepareForUse() [RenderThread.cpp:a748a5149bda383173368cd3a8df84c8423f9f7b : 728 + 0xe]
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rbp = 0x00007c64dbbb5fc0 rsp = 0x00007c64dbbb5f80
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - rip = 0x00007c64e75d9273
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - Found by: previous frame's frame pointer
[task 2020-01-01T13:27:24.187Z] 13:27:24 INFO - 3 libxul.so!mozilla::detail::RunnableMethodImpl<mozilla::wr::RenderThread*, void (mozilla::wr::RenderThread::*)(), true, (mozilla::RunnableKind)0>::Run() [nsThreadUtils.h:a748a5149bda383173368cd3a8df84c8423f9f7b : 1217 + 0x17]
[task 2020-01-01T13:27:24.188Z] 13:27:24 INFO - rbp = 0x00007c64dbbb5fd0 rsp = 0x00007c64dbbb5fd0
[task 2020-01-01T13:27:24.188Z] 13:27:24 INFO - rip = 0x00007c64e75e8019
[task 2020-01-01T13:27:24.188Z] 13:27:24 INFO - Found by: previous frame's frame pointer
[task 2020-01-01T13:27:24.188Z] 13:27:24 INFO - 4 libxul.so!MessageLoop::RunTask(already_AddRefed<nsIRunnable>) [message_loop.cc:a748a5149bda383173368cd3a8df84c8423f9f7b : 442 + 0x11]
[task 2020-01-01T13:27:24.188Z] 13:27:24 INFO - rbp = 0x00007c64dbbb6030 rsp = 0x00007c64dbbb5fe0
[task 2020-01-01T13:27:24.188Z] 13:27:24 INFO - rip = 0x00007c64e6cb2cde
[task 2020-01-01T13:27:24.188Z] 13:27:24 INFO - Found by: previous frame's frame pointer

The priority flag is not set for this bug.
:bryce, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(bvandyk)

Looks like a crash in WebRender code. Possibly an assertion failure given the stack and that all reports are in debug builds. Not clear to me if the webm in the test does something to trigger this.

Moving to Web Render.

Component: Audio/Video: Playback → Graphics: WebRender
Flags: needinfo?(bvandyk)

The priority flag is not set for this bug.
:jbonisteel, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jbonisteel)
Blocks: wr-android
Flags: needinfo?(jbonisteel) → needinfo?(ktaeleman)
Priority: -- → P3

Hey Sotaro, any idea what's going on here?

Flags: needinfo?(ktaeleman) → needinfo?(sotaro.ikeda.g)

It seems that the GeckoSurfaceTexture::AttachToGLContext() call in RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext() failed, apparently because the SurfaceTexture state was not valid. I am going to take a look.

https://searchfox.org/mozilla-central/rev/2e355fa82aaa87e8424a9927c8136be184eeb6c7/gfx/webrender_bindings/RenderAndroidSurfaceTextureHostOGL.cpp#123

Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)
Crash Signature: [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()] → [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::Lock(unsigned char, mozilla::gl::GLContext*, mozilla::wr::ImageRendering)]

Worth mentioning that all these crashes have this assertion in the log:
Assertion failure: 0, at /builds/worker/workspace/build/src/gfx/webrender_bindings/RenderAndroidSurfaceTextureHostOGL.cpp:131

[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.484 E/ResourceManagerService( 1357): Rejected removeResource call with invalid pid.
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.485 I/GeckoMediaManager( 3063): Media service has been unbound. Stopping.
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.487 I/Gecko   ( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.487 I/Gecko   ( 2802): {"action":"log","time":1582286312486,"thread":null,"pid":null,"source":"reftest","level":"DEBUG","message":"[CONTENT] MakeProgress: Completed"}
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.488 I/Gecko   ( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.488 I/Gecko   ( 2802): {"action":"log","time":1582286312488,"thread":null,"pid":null,"source":"reftest","level":"DEBUG","message":"[CONTENT] RecordResult fired"}
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.489 I/Gecko   ( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.489 I/Gecko   ( 2802): {"action":"log","time":1582286312489,"thread":null,"pid":null,"source":"reftest","level":"DEBUG","message":"RecordResult fired"}
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 I/Gecko   ( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 I/Gecko   ( 2802): {"action":"test_status","time":1582286312489,"thread":null,"pid":null,"source":"reftest","test":"dom/media/test/crashtests/1267263.html","subtest":"(LOAD ONLY)","status":"PASS"}
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 E/GLConsumer( 2802): [SurfaceTexture-0-2802-6] attachToContext: abandoned GLConsumer
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 W/System.err( 2802): java.lang.RuntimeException: Error during attachToGLContext (see logcat for details)
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 I/Gecko   ( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 I/Gecko   ( 2802): {"action":"test_end","time":1582286312490,"thread":null,"pid":null,"source":"reftest","test":"dom/media/test/crashtests/1267263.html","status":"OK"}
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 W/System.err( 2802): 	at android.graphics.SurfaceTexture.attachToGLContext(SurfaceTexture.java:286)
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 W/System.err( 2802): 	at org.mozilla.gecko.gfx.GeckoSurfaceTexture.attachToGLContext(GeckoSurfaceTexture.java:90)
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 F/MOZ_Assert( 2802): Assertion failure: 0, at /builds/worker/workspace/build/src/gfx/webrender_bindings/RenderAndroidSurfaceTextureHostOGL.cpp:131
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 I/Gecko   ( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.490 I/Gecko   ( 2802): {"action":"log","time":1582286312490,"thread":null,"pid":null,"source":"reftest","level":"DEBUG","message":"Loading a blank page"}
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.495 W/google-breakpad( 2802): ExceptionHandler::GenerateDump cloned child
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.495 W/google-breakpad( 2802): 3315
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.495 W/google-breakpad( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.495 W/google-breakpad( 2802): ExceptionHandler::SendContinueSignalToChild sent continue signal to child
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:32.495 W/google-breakpad( 3315): ExceptionHandler::WaitForContinueSignal waiting for continue signal...
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:33.198 I/Choreographer( 2802): Skipped 41 frames!  The application may be doing too much work on its main thread.
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:33.206 I/Gecko   ( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:33.206 I/Gecko   ( 2802): {"action":"assertion_count","time":1582286313206,"thread":null,"pid":null,"source":"reftest","test":"dom/media/test/crashtests/1267263.html","min_expected":0,"max_expected":0,"count":0}
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:33.207 I/Gecko   ( 2802):
[task 2020-02-21T11:59:07.074Z] 11:59:07     INFO -  02-21 11:58:33.207 I/Gecko   ( 2802): {"action":"test_start","time":1582286313207,"thread":null,"pid":null,"source":"reftest","test":"dom/media/test/crashtests/1270303.html"}

Recent failure: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=289998025&repo=autoland&lineNumber=3669

Summary: Intermittent dom/media/test/crashtests/1270303.html | application crashed [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()] → Intermittent Android crashtest <test-name> | application crashed [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()]
Depends on: 1621836

This bug might be affected by Bug 1626142.

I reproduced this with some logging: https://treeherder.mozilla.org/#/jobs?repo=try&revision=db140128b4a36f634a4aade80ffd80ec6c87451e&selectedTaskRun=bGm0D5mtTqKtr6O0f092uA-0

Here's the relevant output:

Creating AndroidSurfaceTextureData handle = 82
AndroidSurfaceTextureData::Serialize() handle=82
RemoteVideoDecoder::~RemoteVideoDecoder() handle=82
SurfaceAllocator.disposeSurface() handle=82
SurfaceAllocatorService.releaseSurface() handle=82
GeckoSurfaceTexture.decrementUse() handle=82 upstream=0 count=0
GST handle=82 count reached zero. releasing immediately
GeckoSurfaceTexture.release() handle=82
CreateTextureHostOGL() handle=82
SurfaceTextureHost::SurfaceTextureHost() 0x78d1584129d0 handle=82
GeckoSurfaceTexture.incrementUse() handle=82 upstream=0 count=1
RenderAndroidSurfaceTextureHostOGL::RenderAndroidSurfaceTextureHostOGL() 0x78d157c648c0 handle=82
GeckoSurfaceTexture.incrementUse() handle=82 upstream=0 count=2
SurfaceTextureHost::CreateRenderTexture() 0x78d157c648c0 id=30064771271 handle=82
RenderThread::PrepareForUse() 0x78d157c648c0
GeckoSurface.release() handle=82 82
GeckoSurfaceTexture.decrementUse() handle=82 upstream=82 count=0
GST handle=82 count reached zero. releasing immediately
GeckoSurfaceTexture.release() handle=82
RenderThread::HandlePrepareForUse()
RenderAndroidSurfaceTextureHostOGL::PrepareForUse() 0x78d157c648c0 handle=82
RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext() 0x78d157c648c0 handle=82
GeckoSurfaceTexture.attachToGLContext() handle=82
[SurfaceTexture-0-4874-185] attachToContext: abandoned GLConsumer
java.lang.RuntimeException: Error during attachToGLContext (see logcat for details)
	at android.graphics.SurfaceTexture.attachToGLContext(SurfaceTexture.java:286)
	at org.mozilla.gecko.gfx.GeckoSurfaceTexture.attachToGLContext(GeckoSurfaceTexture.java:92)
EnsureAttached 0x78d157c648c0 attach failed
MOZ_Assert: Assertion failure: 0, at /builds/worker/checkouts/gecko/gfx/webrender_bindings/RenderAndroidSurfaceTextureHostOGL.cpp:153

What I think is happening is:

  • We send a video frame from the decoder (content process) to compositor by using AndroidSurfaceTextureData::Serialize(). We don't appear to add any reference to the Surface here, which I think is the problem.
  • Then the RemoteVideoDecoder gets destroyed. This decrements the final reference to the Surface.
  • The Surface gets released, which decrements the reference count of the corresponding SurfaceTexture to zero.
  • Meanwhile in the Compositor thread, the video frame gets deserialized and a SurfaceTextureHost is constructed. It calls GeckoSurfaceTexture.lookup() and finds the SurfaceTexture. This increments the reference count back up to 1, but the SurfaceTexture is in the process of being released in GeckoSurfaceTexture.decrementUse().
  • Back in GeckoSurfaceTexture.decrementUse(), the SurfaceTexture is finally released and removed from sSurfaceTextures, but it is too late.
  • Then later in RenderAndroidSurfaceTextureHost::PrepareForUse(), we call EnsureAttachedToGLContext(), which calls attachToGLContext(),
    which fails because the SurfaceTexture has been released (or "abandoned" as it calls it).

Sometimes, the SurfaceTexture is in fact released and removed from sSurfaceTextures before the SurfaceTextureHost is deserialized. In this case, GeckoSurfaceTexture.lookup() fails, so the SurfaceTextureHost is constructed with nullptr as mSurfTex. This doesn't cause any crashes but presumably means frames are dropped when rendering.
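
A minimal Java sketch of the interleaving described above. It uses a hypothetical, simplified stand-in for GeckoSurfaceTexture: the names RacySurfaceTexture, RaceDemo, releaseUnderlying(), removeFromRegistry() and the REGISTRY map are made up for illustration, and the real class wraps android.graphics.SurfaceTexture rather than a boolean flag. The steps are replayed by hand on a single thread so that the failure is deterministic rather than timing-dependent.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for GeckoSurfaceTexture; not the real class.
final class RacySurfaceTexture {
    // Stand-in for sSurfaceTextures, keyed by handle.
    static final Map<Integer, RacySurfaceTexture> REGISTRY = new HashMap<>();

    final int handle;
    int useCount;
    boolean released; // models the "abandoned" state of the underlying texture

    RacySurfaceTexture(int handle) {
        this.handle = handle;
        REGISTRY.put(handle, this);
    }

    // First half of the release path: the underlying texture is torn down.
    void releaseUnderlying() {
        released = true;
    }

    // Second half of the release path: the registry entry is removed afterwards.
    void removeFromRegistry() {
        REGISTRY.remove(handle);
    }

    // Unchecked lookup: returns whatever the registry holds, released or not.
    static RacySurfaceTexture lookup(int handle) {
        return REGISTRY.get(handle);
    }

    // Models attachToGLContext(), which fails on a released texture.
    void attachToGLContext() {
        if (released) {
            throw new IllegalStateException("attachToContext: abandoned GLConsumer");
        }
    }
}

final class RaceDemo {
    // Replays the interleaving by hand; returns true if the attach step failed.
    static boolean replayInterleaving() {
        RacySurfaceTexture tex = new RacySurfaceTexture(82);
        tex.useCount = 0;        // decoder side: the last reference has been dropped
        tex.releaseUnderlying(); // decoder side: release starts

        // Compositor side: lookup lands between the two halves of the release.
        RacySurfaceTexture found = RacySurfaceTexture.lookup(82);
        found.useCount++;        // count goes back up to 1, but the texture is dead

        tex.removeFromRegistry(); // decoder side: release finishes

        try {
            found.attachToGLContext(); // render thread
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("attach failed: " + replayInterleaving());
    }
}
```

In this model the compositor ends up holding a texture with a non-zero use count that is nevertheless already released, which matches the "abandoned GLConsumer" error in the log.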

(In reply to Jamie Nicol [:jnicol] from comment #27)

What I think is happening is:

  • We send a video frame from the decoder (content process) to compositor by using AndroidSurfaceTextureData::Serialize(). We don't appear to add any reference to the Surface here, which I think is the problem.
  • Then the RemoteVideoDecoder gets destroyed. This decrements the final reference to the Surface.
  • The Surface gets released, which decrements the reference count of the corresponding SurfaceTexture to zero.

Thanks for the investigation. It seems that this problem could be addressed by using TextureFlags::WAIT_HOST_USAGE_END.

(In reply to Sotaro Ikeda [:sotaro] from comment #29)

Thanks for the investigation. It seems that this problem could be addressed by using TextureFlags::WAIT_HOST_USAGE_END

Created Bug 1636334 for it.

Depends on: 1636334

It seems that calling HandlePrepareForUse() before UpdateAndRender() is not enough. The emulator is very slow, so there might be a case where a new RenderThread::PrepareForUse() and a new WebRender transaction from WebRenderBridgeParent::MaybeGenerateFrame() arrive while UpdateAndRender() is running.

Depends on: 1636352

(In reply to Sotaro Ikeda [:sotaro] from comment #31)

It seems that calling HandlePrepareForUse() before UpdateAndRender() is not enough. The emulator is very slow, so there might be a case where a new RenderThread::PrepareForUse() and a new WebRender transaction from WebRenderBridgeParent::MaybeGenerateFrame() arrive while UpdateAndRender() is running.

Bug 1636352 was created for it.

Since the fix for Bug 1636352 landed, the failures have not happened.

Status: NEW → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---

The failure happened at mSurfTex->AttachToGLContext(). From this, it looks like the SurfaceTexture might already be deallocated when AttachToGLContext() is called.

I understand what the problem is and can reproduce it locally, so I am happy to take this bug, sotaro.

Assignee: sotaro.ikeda.g → jnicol

Thanks.

Crash Signature: [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::Lock(unsigned char, mozilla::gl::GLContext*, mozilla::wr::ImageRendering)] → [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::Lock(unsigned char, mozilla::gl::GLContext*, mozilla::wr::ImageRendering)] [@ mozilla::wr::RenderAndroidSurfaceTextureHos…
Crash Signature: , mozilla::wr::ImageRendering)] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::PrepareForUse()] → , mozilla::wr::ImageRendering)] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::PrepareForUse()]
Whiteboard: [stockwell needswork:owner] → [stockwell needswork:owner][retriggers]

It's very confusing that this bug contains multiple crash signatures that are almost certainly different issues. I think I know how to fix the EnsureAttachedToGLContext crash. I will land a potential fix asap. But the Lock crash is likely unrelated. Is the recent spike predominantly in either one of the signatures? Is there a way to separate them?

Currently when a GeckoSurfaceTexture's refcount reaches zero, it calls
super.release() prior to removing itself from sSurfaceTextures. If a call to
GeckoSurfaceTexture.lookup() takes place in between these events, then it will
successfully find the surface texture and return it. However, that surface
texture will be in a released state, which will cause problems.

To fix this, change lookup() to check the refcount of any surface texture that
it finds (within a synchronized block, of course), and return null if the
refcount is zero. I think this is preferable to locking sSurfaceTextures for
every decrementUse() call.

Change mRefCount from an AtomicInteger to a plain int, as it is only accessed within
synchronized blocks or methods. The use of AtomicInteger gave a false sense of
security, when in fact there is more work that must be done atomically than just
decrementing the count.

Additionally, remove some code which attempts to remove the surface texture from
sSurfaceTextures just after calling release(), as we have just done this in our
overridden release() method.
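
A minimal Java sketch of the lookup() change described above, assuming a much simplified model of GeckoSurfaceTexture. The names CheckedSurfaceTexture, LookupFixDemo, register(), releaseUnderlying() and REGISTRY are hypothetical, as is the initial use count of 1; this is not the landed patch. The overridable releaseUnderlying() hook exists only so that the demo can call lookup() inside the window between the release and the removal from the registry.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model of the patched behavior; not the real class.
class CheckedSurfaceTexture {
    // Stand-in for sSurfaceTextures, keyed by handle.
    private static final Map<Integer, CheckedSurfaceTexture> REGISTRY = new HashMap<>();

    private final int handle;
    // Plain int: only read or written while holding this object's monitor.
    // Starting at 1 is an assumption of this sketch.
    private int useCount = 1;

    CheckedSurfaceTexture(int handle) {
        this.handle = handle;
    }

    static void register(CheckedSurfaceTexture tex) {
        synchronized (REGISTRY) {
            REGISTRY.put(tex.handle, tex);
        }
    }

    // Returns null for a texture whose count is zero, even if it is still registered.
    static CheckedSurfaceTexture lookup(int handle) {
        CheckedSurfaceTexture tex;
        synchronized (REGISTRY) {
            tex = REGISTRY.get(handle);
        }
        if (tex == null) {
            return null;
        }
        synchronized (tex) {
            return tex.useCount == 0 ? null : tex;
        }
    }

    synchronized void decrementUse() {
        if (useCount == 0) {
            throw new IllegalStateException("use count is already zero");
        }
        useCount--;
        if (useCount == 0) {
            releaseUnderlying(); // stands in for super.release()
            synchronized (REGISTRY) {
                REGISTRY.remove(handle);
            }
        }
    }

    // Hook for the demo; the real code would release the underlying texture here.
    void releaseUnderlying() {
    }
}

final class LookupFixDemo {
    // Calls lookup() between the release and the registry removal.
    // Returns true if the hook ran and lookup() refused to return the texture.
    static boolean lookupRejectsTextureBeingReleased() {
        final boolean[] hookRan = new boolean[1];
        final CheckedSurfaceTexture[] seen = new CheckedSurfaceTexture[1];
        CheckedSurfaceTexture tex = new CheckedSurfaceTexture(82) {
            @Override
            void releaseUnderlying() {
                hookRan[0] = true;
                seen[0] = CheckedSurfaceTexture.lookup(82);
            }
        };
        CheckedSurfaceTexture.register(tex);
        tex.decrementUse(); // 1 -> 0, triggers the release path
        return hookRan[0] && seen[0] == null;
    }

    public static void main(String[] args) {
        System.out.println("lookup rejected dying texture: "
                + lookupRejectsTextureBeingReleased());
    }
}
```

In this sketch lookup() drops the registry lock before taking the per-texture lock, so it never holds both at once, while decrementUse() takes them in the order texture then registry. That keeps the two paths from deadlocking without holding the registry lock for the whole of every decrementUse() call.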

Whiteboard: [stockwell needswork:owner][retriggers] → [stockwell needswork:owner][retriggered]

The patch in comment 58 should hopefully fix the RenderAndroidSurfaceTextureHost::EnsureAttachedToGLContext() crash.

As far as I can tell, however, the spike is due to the crash in RenderAndroidSurfaceTextureHost::Lock(), judging by a quick look through the failures the bot has linked over the past week. This bug is confusing enough already, so I'd really like us to separate that out into a different bug. Andreea, is it possible to do that? I obviously know how to file a new bug, but I have no idea how to make treeherder match the intermittent failures to the new bug instead.

Sotaro, the crash in Lock() occurs because mPrepareStatus == STATUS_MIGHT_BE_USED_BY_WR. This is because NofityForUse() has not been called before Lock(). Do you know why that might be happening?

Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(apavel)

(In reply to Jamie Nicol [:jnicol] from comment #60)

The patch in comment 58 should hopefully fix the RenderAndroidSurfaceTextureHost::EnsureAttachedToGLContext() crash.

As far as I can tell, however, the spike is due to the crash in RenderAndroidSurfaceTextureHost::Lock(), judging by a quick look through the failures the bot has linked over the past week. This bug is confusing enough already, so I'd really like us to separate that out into a different bug. Andreea, is it possible to do that? I obviously know how to file a new bug, but I have no idea how to make treeherder match the intermittent failures to the new bug instead.

Sotaro, the crash in Lock() occurs because mPrepareStatus == STATUS_MIGHT_BE_USED_BY_WR. This is because NofityForUse() has not been called before Lock(). Do you know why that might be happening?

I apologize for misclassifying this in the first place. I filed bug 1658005.
TH usually shows the suggestions if they match the failure line exactly or match part of it. However, we'll keep in mind to make the distinction between the two crashes.

Thank you.

Flags: needinfo?(apavel)

Thanks Andreea, much appreciated!

(In reply to Jamie Nicol [:jnicol] from comment #62)

Thanks Andreea, much appreciated!

No problem, any time.

Cancelling Sotaro's needinfo to transfer it to bug 1658005.

Flags: needinfo?(sotaro.ikeda.g)
Crash Signature: , mozilla::wr::ImageRendering)] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::PrepareForUse()] → , mozilla::wr::ImageRendering)] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::PrepareForUse()] [@ mozilla::wr::RenderAndroidSurfaceTextureHost::EnsureAttachedToGLContext()]
Severity: normal → S3
Crash Signature: [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::Lock(unsigned char, mozilla::gl::GLContext*, mozilla::wr::ImageRendering)] [@ mozilla::wr::RenderAndroidSurfaceTextureHo… → [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::Lock] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::PrepareForUse] [@ mozilla::wr::RenderAndroidSurfaceTextureHost::…
Summary: Intermittent Android crashtest <test-name> | application crashed [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext()] → Intermittent Android crashtest <test-name> | application crashed [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext]
Status: REOPENED → RESOLVED
Crash Signature: [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::Lock] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::PrepareForUse] [@ → [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::EnsureAttachedToGLContext] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::Lock] [@ mozilla::wr::RenderAndroidSurfaceTextureHostOGL::PrepareForUse] [@
Closed: 4 years ago → 23 days ago
Resolution: --- → WORKSFORME