Closed Bug 982106 Opened 6 years ago Closed 5 years ago

Intermittent | java-exception | java.lang.NullPointerException at org.mozilla.gecko.gfx.GeckoLayerClient.syncViewportInfo(GeckoLayerClient.java:691)

Categories

(Firefox for Android :: General, defect)

ARM
Android
defect
Not set

Tracking

RESOLVED FIXED
Firefox 38
Tracking Status
firefox36 --- wontfix
firefox37 --- wontfix
firefox38 --- fixed
firefox-esr31 --- wontfix

People

(Reporter: cbook, Assigned: gbrown)

Details

(Keywords: intermittent-failure)

Attachments

(1 file)

Android 4.0 Panda mozilla-inbound opt test mochitest-1 on 2014-03-11 06:28:24 PDT for push ec47c16ac24d

slave: panda-0542

https://tbpl.mozilla.org/php/getParsedLog.php?id=35939528&tree=Mozilla-Inbound



PROCESS-CRASH | java-exception | java.lang.NullPointerException at org.mozilla.gecko.gfx.GeckoLayerClient.syncViewportInfo(GeckoLayerClient.java:691)
See Also: → 975566
(In reply to TBPL Robot from comment #23)

13:44:45     INFO -  REFTEST TEST-START | http://10.0.2.2:8860/tests/layout/base/crashtests/645572-1.html
13:44:45     INFO -  REFTEST TEST-LOAD | http://10.0.2.2:8860/tests/layout/base/crashtests/645572-1.html | 1326 / 2776 (47%)
13:44:45     INFO -  REFTEST TEST-PASS | http://10.0.2.2:8860/tests/layout/base/crashtests/645572-1.html | (LOAD ONLY)
13:44:45     INFO -  REFTEST INFO | Loading a blank page
13:44:45     INFO -  REFTEST TEST-END | http://10.0.2.2:8860/tests/layout/base/crashtests/645572-1.html
13:44:45     INFO -  REFTEST TEST-START | http://10.0.2.2:8860/tests/layout/base/crashtests/640272.html
13:44:45     INFO -  REFTEST TEST-LOAD | http://10.0.2.2:8860/tests/layout/base/crashtests/640272.html | 1327 / 2776 (47%)
13:44:45     INFO -  REFTEST INFO | drawWindow flags = DRAWWINDOW_DRAW_CARET | DRAWWINDOW_DRAW_VIEW | DRAWWINDOW_USE_WIDGET_LAYERS; window size = 0,0; test browser size = 800,1000
13:44:45     INFO -  
13:44:45     INFO -  INFO | automation.py | Application ran for: 0:03:42.738058
13:44:45     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpz1KX3opidlog
13:44:45     INFO -  Contents of /data/anr/traces.txt:
13:44:45     INFO -  
13:44:45     INFO -  
13:44:45  WARNING -  PROCESS-CRASH | java-exception | java.lang.NullPointerException at org.mozilla.gecko.gfx.GeckoLayerClient.syncViewportInfo(GeckoLayerClient.java:683)
13:44:45     INFO -  WARNING | leakcheck | refcount logging is off, so leaks can't be detected!
13:44:45     INFO -  
13:44:45     INFO -  REFTEST INFO | runreftest.py | Running tests: end.
13:44:45     INFO -  
13:44:45     INFO -  02-11 21:44:07.786 W/GeckoConsole( 1960): [JavaScript Error: "Invalid markup: Incorrect number of children for <mfrac/> tag." {file: "chrome://reftest/content/reftest-content.js" line: 345}]
13:44:45     INFO -  02-11 21:44:07.786 W/GeckoConsole( 1960): [JavaScript Error: "Invalid markup: Incorrect number of children for <mfrac/> tag." {file: "chrome://reftest/content/reftest-content.js" line: 345}]
13:44:45     INFO -  02-11 21:44:09.926 I/SUTAgentAndroid( 1737): 10.0.2.2 : activity
13:44:45     INFO -  02-11 21:44:10.367 W/GeckoConsole( 1960): [JavaScript Warning: "The character encoding of a framed document was not declared. The document may appear different if viewed without the document framing it." {file: "data:text/html,%3Cq%20id%3D%27element2%27%3E%3Cq%20id%3D%27element3%27%3E%3Cq%20id%3D%27element4%27%3E%3Cdd%20style%20id%3D%27element6%27%3E" line: 0}]
13:44:45     INFO -  02-11 21:44:10.477 E/GeckoCrashHandler( 1960): >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 98 ("Thread-98")
13:44:45     INFO -  02-11 21:44:10.477 E/GeckoCrashHandler( 1960): java.lang.NullPointerException
13:44:45     INFO -  02-11 21:44:10.477 E/GeckoCrashHandler( 1960): 	at org.mozilla.gecko.gfx.GeckoLayerClient.syncViewportInfo(GeckoLayerClient.java:683)
13:44:45     INFO -  02-11 21:44:10.477 E/GeckoCrashHandler( 1960): 	at dalvik.system.NativeStart.run(Native Method)
13:44:45     INFO -  02-11 21:44:10.477 E/GeckoCrashHandler( 1960): Main thread (1) stack:
13:44:45     INFO -  02-11 21:44:10.477 E/GeckoCrashHandler( 1960):     android.os.MessageQueue.nativePollOnce(Native Method)
13:44:45     INFO -  02-11 21:44:10.477 E/GeckoCrashHandler( 1960):     android.os.MessageQueue.next(MessageQueue.java:125)
13:44:45     INFO -  02-11 21:44:10.487 E/GeckoCrashHandler( 1960):     android.os.Looper.loop(Looper.java:124)
13:44:45     INFO -  02-11 21:44:10.487 E/GeckoCrashHandler( 1960):     android.app.ActivityThread.main(ActivityThread.java:5039)
13:44:45     INFO -  02-11 21:44:10.487 E/GeckoCrashHandler( 1960):     java.lang.reflect.Method.invokeNative(Native Method)
13:44:45     INFO -  02-11 21:44:10.487 E/GeckoCrashHandler( 1960):     java.lang.reflect.Method.invoke(Method.java:511)
13:44:45     INFO -  02-11 21:44:10.487 E/GeckoCrashHandler( 1960):     com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:793)
13:44:45     INFO -  02-11 21:44:10.487 E/GeckoCrashHandler( 1960):     com.android.internal.os.ZygoteInit.main(ZygoteInit.java:560)
13:44:45     INFO -  02-11 21:44:10.487 E/GeckoCrashHandler( 1960):     dalvik.system.NativeStart.main(Native Method)
13:44:45     INFO -  02-11 21:44:10.546 F/libc    ( 1960): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 2003 (Compositor)
This is fairly easy to reproduce using Android x86 crashtests (comment 23, 25). The cedar signature is

java.lang.NullPointerException at org.mozilla.gecko.gfx.GeckoLayerClient.syncViewportInfo(GeckoLayerClient.java:683)

which is http://hg.mozilla.org/projects/cedar/annotate/31b1b9fdef6e/mobile/android/base/gfx/GeckoLayerClient.java#l683:

        mRootLayer.setPositionAndResolution(
            Math.round(x + mCurrentViewTransform.offsetX),
            Math.round(y + mCurrentViewTransform.offsetY),
            Math.round(x + width + mCurrentViewTransform.offsetX),
            Math.round(y + height + mCurrentViewTransform.offsetY),
            resolution);
mCurrentViewTransform is safe.

mRootLayer is initialized in notifyGeckoReady(); I don't see any guarantee that notifyGeckoReady completes before syncViewportInfo is called.
If I guard against mRootLayer being null, x86 crashtest intermittently crashes with:

https://treeherder.mozilla.org/#/jobs?repo=cedar&revision=a8f3584e7727
http://ftp.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/cedar-android-x86/1423874266/cedar_ubuntu64_hw_test-androidx86-set-3-bm105-tests1-linux-build8.txt.gz

20:16:15     INFO -  02-14 04:15:44.597 E/GeckoCrashHandler( 1928): java.lang.NullPointerException
20:16:15     INFO -  02-14 04:15:44.597 E/GeckoCrashHandler( 1928): 	at org.mozilla.gecko.gfx.GeckoLayerClient.createFrame(GeckoLayerClient.java:732)
20:16:15     INFO -  02-14 04:15:44.597 E/GeckoCrashHandler( 1928): 	at dalvik.system.NativeStart.run(Native Method)
20:16:15     INFO -  02-14 04:15:44.597 E/GeckoCrashHandler( 1928): 	at dalvik.system.NativeStart.run(Native Method)

If I also guard against mLayerRenderer being null, then x86 crashtests appear to run reliably:

https://treeherder.mozilla.org/#/jobs?repo=cedar&revision=81a1a3b6862d
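For illustration only, the two guards described above might look roughly like the sketch below. This is not the landed patch: RootLayerStub and GuardedLayerClientSketch are hypothetical stand-ins for the real Gecko classes, the method signatures are simplified, the boolean/null return values exist only to make the guards observable, and marking the fields volatile is an assumption of this sketch, not something taken from GeckoLayerClient.java.

```java
// Hypothetical stand-in for the real root layer class; not Gecko code.
class RootLayerStub {
    int left, top, right, bottom;
    float resolution;

    void setPositionAndResolution(int left, int top, int right, int bottom, float resolution) {
        this.left = left;
        this.top = top;
        this.right = right;
        this.bottom = bottom;
        this.resolution = resolution;
    }
}

// Hypothetical, simplified model of the layer client with both null guards.
class GuardedLayerClientSketch {
    // Assigned in notifyGeckoReady(), read from the compositor thread.
    // volatile is an assumption of this sketch so the write is visible across threads.
    private volatile RootLayerStub mRootLayer;
    private volatile Object mLayerRenderer;

    void notifyGeckoReady() {
        mRootLayer = new RootLayerStub();
        mLayerRenderer = new Object();
    }

    // Returns false and does nothing if called before notifyGeckoReady().
    boolean syncViewportInfo(float x, float y, float width, float height,
                             float resolution, float offsetX, float offsetY) {
        RootLayerStub rootLayer = mRootLayer; // read the field once
        if (rootLayer == null) {
            return false; // compositor ran before Java saw "Gecko ready"
        }
        rootLayer.setPositionAndResolution(
            Math.round(x + offsetX),
            Math.round(y + offsetY),
            Math.round(x + width + offsetX),
            Math.round(y + height + offsetY),
            resolution);
        return true;
    }

    // Returns null instead of throwing if the renderer does not exist yet.
    Object createFrame() {
        Object renderer = mLayerRenderer; // read the field once
        if (renderer == null) {
            return null;
        }
        return renderer; // the real method would build a frame from the renderer
    }
}
```

Reading each field into a local before the null check avoids checking one value and dereferencing another if the field changes in between. The guards only skip the early calls; they do not explain why the compositor is calling in before notifyGeckoReady() has run.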
Blocks: 1132210
I don't claim any deep understanding of what is happening here; this is just a brute-force avoid-the-NPE patch. Is there a better way to resolve this?
Assignee: nobody → gbrown
Attachment #8566216 - Flags: review?(bugmail.mozilla)
Comment on attachment 8566216 [details] [diff] [review]
guard against NPEs in GeckoLayerClient

Review of attachment 8566216 [details] [diff] [review]:
-----------------------------------------------------------------

I'm ok with landing this to fix the crashing, but I think it papers over what might be a more serious root cause. We should leave bug 975566 open and repurpose it to investigate the root cause. Something is causing the compositor to start running and compositing before Java gets word that Gecko is even up and running.
Attachment #8566216 - Flags: review?(bugmail.mozilla) → review+
https://hg.mozilla.org/mozilla-central/rev/d880fdc6fb00
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 38