Common Firebase ANR in all mobile project main activities - input dispatcher time out
Categories
(GeckoView :: IME, defect, P2)
Tracking
(firefox-esr91 wontfix, firefox99 wontfix, firefox100 wontfix, firefox101 wontfix, firefox102 wontfix)
People
(Reporter: aaronmt, Unassigned)
References
Details
(Whiteboard: [geckoview:m101] [geckoview:m102] [geckoview:m103] [geckoview:m104] )
Details in https://github.com/mozilla-mobile/fenix/issues/24047
We have a branch open that is preparing to upgrade the AndroidX Test and Support libraries and device on Firebase to target API 30 across mobile projects. It is on this branch where we are frequently (daily at this point through trial runs) running into these ANRs (input dispatcher timeout). We would like to land this work to upgrade our tests to run on API30, but we're hesitant with the introduced instability. These ANRs occur across all mobile projects.
Input dispatcher State at time of last ANR:
ANR:
Time: 2022-03-15 09:48:37
Reason: e35e897 org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity (server) is not responding. Waited 5003ms for FocusEvent(hasFocus=true)
Window: ActivityRecord{f0b7ada u0 org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity t9} - Window{e35e897 u0 org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity}
DispatchEnabled: true
DispatchFrozen: false
InputFilterEnabled: false
FocusedDisplayId: 0
FocusedApplications:
displayId=0, name='ActivityRecord{f0b7ada u0 org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity t9}', dispatchingTimeout=5000ms
FocusedWindows:
displayId=0, name='Window{e35e897 u0 org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity}'
TouchStates: <no displays touched>
Display: 0
Windows:
0: name='Window{3c8501f u0 NavigationBar0}', displayId=0, portalToDisplayId=-1, paused=false, hasFocus=false, hasWallpaper=false, visible=true, canReceiveKeys=false, flags=0x21840068, type=0x000007e3, frame=[0,1788][1080,1920], globalScale=1.000000, windowScale=(1.000000,1.000000), touchableRegion=[0,1788][1080,1920], inputFeatures=0x00000000, ownerPid=1609, ownerUid=10133, dispatchingTimeout=5000ms
1: name='Window{c2039a5 u0 StatusBar}', displayId=0, portalToDisplayId=-1, paused=false, hasFocus=false, hasWallpaper=false, visible=true, canReceiveKeys=false, flags=0x81800008, type=0x000007d0, frame=[0,0][1080,66], globalScale=1.000000, windowScale=(1.000000,1.000000), touchableRegion=[0,0][1080,66], inputFeatures=0x00000000, ownerPid=1609, ownerUid=10133, dispatchingTimeout=5000ms
2: name='Window{e35e897 u0 org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity}', displayId=0, portalToDisplayId=-1, paused=false, hasFocus=true, hasWallpaper=false, visible=true, canReceiveKeys=true, flags=0x81010120, type=0x00000001, frame=[0,0][1080,1920], globalScale=1.000000, windowScale=(1.000000,1.000000), touchableRegion=[0,0][1080,1920], inputFeatures=0x00000000, ownerPid=12403, ownerUid=10153, dispatchingTimeout=5000ms
3: name='SurfaceView - org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity#0', displayId=0, portalToDisplayId=-1, paused=false, hasFocus=false, hasWallpaper=false, visible=true, canReceiveKeys=false, flags=0x00000020, type=0x00000000, frame=[0,66][1080,1788], globalScale=1.000000, windowScale=(1.000000,1.000000), touchableRegion=<empty>, inputFeatures=0x00000002, ownerPid=12403, ownerUid=10153, dispatchingTimeout=0ms
4: name='Window{5a9eb6b u0 com.android.systemui.ImageWallpaper}', displayId=0, portalToDisplayId=-1, paused=false, hasFocus=false, hasWallpaper=false, visible=false, canReceiveKeys=false, flags=0x00014318, type=0x000007dd, frame=[-54,-96][2509,2016], globalScale=1.000000, windowScale=(0.484979,0.484848), touchableRegion=[-54,-96][4750,3864], inputFeatures=0x00000000, ownerPid=1609, ownerUid=10133, dispatchingTimeout=5000ms
Global monitors in display 0:
0: 'PointerEventDispatcher0 (server)',
RecentQueue: length=10
KeyEvent(deviceId=-1, source=0x00000101, displayId=-1, action=UP, flags=0x00000000, keyCode=4, scanCode=0, metaState=0x00000000, repeatCount=0), policyFlags=0x6b000000, age=11463ms
FocusEvent(hasFocus=false), age=11347ms
FocusEvent(hasFocus=true), age=11347ms
MotionEvent(deviceId=-1, source=0x00001002, displayId=0, action=DOWN, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, classification=NONE, edgeFlags=0x00000000, xPrecision=1.0, yPrecision=1.0, xCursorPosition=nan, yCursorPosition=nan, pointers=[0: (915.0, 1711.0)]), policyFlags=0x6b000000, age=10462ms
MotionEvent(deviceId=-1, source=0x00001002, displayId=0, action=UP, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, classification=NONE, edgeFlags=0x00000000, xPrecision=1.0, yPrecision=1.0, xCursorPosition=nan, yCursorPosition=nan, pointers=[0: (915.0, 1711.0)]), policyFlags=0x6b000000, age=10462ms
FocusEvent(hasFocus=false), age=10168ms
FocusEvent(hasFocus=true), age=9000ms
MotionEvent(deviceId=-1, source=0x00001002, displayId=0, action=DOWN, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, classification=NONE, edgeFlags=0x00000000, xPrecision=1.0, yPrecision=1.0, xCursorPosition=nan, yCursorPosition=nan, pointers=[0: (485.0, 991.0)]), policyFlags=0x6b000000, age=6064ms
MotionEvent(deviceId=-1, source=0x00001002, displayId=0, action=UP, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, classification=NONE, edgeFlags=0x00000000, xPrecision=1.0, yPrecision=1.0, xCursorPosition=nan, yCursorPosition=nan, pointers=[0: (485.0, 991.0)]), policyFlags=0x6b000000, age=5847ms
FocusEvent(hasFocus=true), age=5033ms
PendingEvent: <none>
InboundQueue: <empty>
ReplacedKeys: <empty>
Connections:
346: channelName='e35e897 org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity (server)', windowName='e35e897 org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity (server)', status=NORMAL, monitor=false, responsive=false
OutboundQueue: <empty>
WaitQueue: length=1
FocusEvent(hasFocus=true), targetFlags=0x00000100, resolvedAction=0, age=5033ms, wait=5003ms
317: channelName='dd0c025 com.google.android.apps.mtaas.backdrop/com.google.android.apps.mtaas.backdrop.BackdropActivity (server)', windowName='dd0c025 com.google.android.apps.mtaas.backdrop/com.google.android.apps.mtaas.backdrop.BackdropActivity (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
262: channelName='484a672 AssistPreviewPanel (server)', windowName='484a672 AssistPreviewPanel (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
331: channelName='fbd63e0 com.android.launcher3/com.android.searchlauncher.SearchLauncher (server)', windowName='fbd63e0 com.android.launcher3/com.android.searchlauncher.SearchLauncher (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
302: channelName='recents_animation_input_consumer (server)', windowName='recents_animation_input_consumer (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
335: channelName='3e98489 InputMethod (server)', windowName='3e98489 InputMethod (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
311: channelName='f2aeb25 pip-dismiss-overlay (server)', windowName='f2aeb25 pip-dismiss-overlay (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
267: channelName='c2039a5 StatusBar (server)', windowName='c2039a5 StatusBar (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
310: channelName='5a9eb6b com.android.systemui.ImageWallpaper (server)', windowName='5a9eb6b com.android.systemui.ImageWallpaper (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
343: channelName='c56a870 NotificationShade (server)', windowName='c56a870 NotificationShade (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
237: channelName='24b5f2c com.android.launcher3/com.android.searchlauncher.SearchLauncher (server)', windowName='24b5f2c com.android.launcher3/com.android.searchlauncher.SearchLauncher (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
281: channelName='3c8501f NavigationBar0 (server)', windowName='3c8501f NavigationBar0 (server)', status=NORMAL, monitor=false, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
140: channelName='PointerEventDispatcher0 (server)', windowName='PointerEventDispatcher0 (server)', status=NORMAL, monitor=true, responsive=true
OutboundQueue: <empty>
WaitQueue: <empty>
AppSwitch: not pending
Configuration:
KeyRepeatDelay: 50ms
KeyRepeatTimeout: 400ms
https://github.com/mozilla-mobile/fenix/issues/24047#issuecomment-1069667056
From Mike:
The ANR seems to be caused by the main thread waiting for a art::ConditionVariable::WaitHoldingLocks(art::Thread*)+148) during on a vsync:
"main" prio=5 tid=1 Native
#00 pc 0000000000086b4c /apex/com.android.runtime/lib64/bionic/libc.so (syscall+28)
#00 pc 00000000001af92c /apex/com.android.art/lib64/libart.so (art::ConditionVariable::WaitHoldingLocks(art::Thread*)+148)
#00 pc 00000000003a0408 /apex/com.android.art/lib64/libart.so (art::JNI<false>::CallObjectMethodV(_JNIEnv*, _jobject*, _jmethodID*, std::__va_list)+504)
#00 pc 0000000000004040 /apex/com.android.art/lib64/libnativehelper.so (_JNIEnv::CallObjectMethod(_jobject*, _jmethodID*, ...)+124)
#00 pc 00000000000ed10c /system/lib64/libandroid_runtime.so (android::NativeDisplayEventReceiver::dispatchVsync(long, unsigned long, unsigned int)+48)
#00 pc 000000000008fae8 /system/lib64/libgui.so (android::DisplayEventDispatcher::handleEvent(int, int, void*)+168)
#00 pc 0000000000019da8 /system/lib64/libutils.so (android::Looper::pollInner(int)+916)
#00 pc 00000000000199ac /system/lib64/libutils.so (android::Looper::pollOnce(int, int*, int*, void**)+112)
#00 pc 0000000000118288 /system/lib64/libandroid_runtime.so (android::android_os_MessageQueue_nativePollOnce(_JNIEnv*, _jobject*, long, int)+44)
at android.os.MessageQueue.nativePollOnce (Native method)
at android.os.MessageQueue.next (MessageQueue.java:335)
at android.os.Looper.loop (Looper.java:193)
at android.app.ActivityThread.main (ActivityThread.java:8059)
at java.lang.reflect.Method.invoke (Native method)
at com.android.internal.os.RuntimeInit$MethodAndArgsCaller.run (RuntimeInit.java:656)
at com.android.internal.os.ZygoteInit.main (ZygoteInit.java:967)
Comment 1•3 years ago
|
||
Adding this bug to GV's m101 sprint because Aaron said this crash is common in Google Play's crash stats.
Tentatively assigning to Agi because he said he would investigate the crash this week.
Updated•3 years ago
|
Reporter | ||
Comment 2•3 years ago
|
||
I believe this is the ANR that matches on the Play Store
Updated•3 years ago
|
Comment 3•3 years ago
|
||
Agi says this is a GV bug but the Graphics team is also looking at it. He recommends we check with them.
Reporter | ||
Comment 4•3 years ago
|
||
noted in android eng-call happy to run any signed build with any sort of additional GV logging to LOGCAT on Firebase Test Lab against the tests
Comment 5•3 years ago
|
||
Reporter | ||
Comment 6•3 years ago
•
|
||
Christian you had mentioned builds with logging that we can run on Firebase Test Lab as an option to help facilitate this bug, are those available?
Comment 7•3 years ago
|
||
(In reply to Aaron Train [:aaronmt] from comment #6)
Christian you had mentioned builds with logging that we can run on Firebase Test Lab as an option to help facilitate this bug, are those available?
Aaron, I'm looking into it this week, so I'll need a bit more time to get you a build.
Comment 8•3 years ago
|
||
I've built a debug GV locally, packaged it within a Fenix build that was signed with my debug keys, and placed a few variants here: https://drive.google.com/drive/folders/10ZzQ7dK1gewv8hAdllaJFTs2hEKE3-gN?usp=sharing
Could you let me know if you have trouble installing them in case the debug key signing process doesn't work on Firebase?
Comment 9•3 years ago
|
||
So here are some updates on what Christian and I discovered last week:
- We can see these ANRs happening across all our browsers (fenix/focus/reference browser) which means that there is no application-specific code that could be the cause.
- When we looked at a failing firebase video, you can see the page loading, some seconds go by and app closes (the video is shortened so it looks quicker than it is).
- There is a
bugreport.txt
that has some trace files that happened during the ANR, but those do not show anything happening on the main thread that looks suspicious; no references to our code running at the point of the ANR. - The dump of the graphics stats for our app shows:
Total frames rendered: 25 Janky frames: 24 (96.00%)
So the app is laggy. - Could this be the app or is the virtual emulator too slow for our debug app?
- When you look at the logcat, we can see GeckoView logs show the test page loading, completing, and idling; less reason to suspect GV at this point.
- The logs are also very generic so the Google Play reports that indicate the same crash might be the cause of something else.
- There is a suspicious line in the Firebase report:
Test instrumentation process crashed. Check org.mozilla.focus.activity.ErrorPagesTest#badURLCheckTest.txt for details
- We can't get these logs off Firebase for this particular text file but there is a request for the Firebase team to do this eventually.
- Some google searches indicate that people have had these instrumentation process crashes happen because of mismatched dependencies we found an alpha version of a test dep that we should update to a stable as well.
Some next steps to try:
- There is some reason to suspect that if we can upgrade to a matching version of test frameworks needed, these ANRs may go away.
- Try a faster emulator or maybe just a different one entirely.
- Profile the app while closing tabs to see if we're causing some kind of memory/CPU spike which is slowing the emulator.
- Try and deduce the location in code where this is happening and print the stack traces out on a loop to see what the main thread is doing.
- If all else fails, we can provide a debug GV within a debug app to aaronmt and have him reproduce the bug on Firebase again to see if that bugreport gives us more information.
Clearing NI for now until I investigate what would be the right dependencies to match up before retrying.
Comment 10•3 years ago
|
||
Jonathan will continue investigation in 103. If this is a bug in the test or test framework, we don't need to uplift to Beta 102.
Comment 11•3 years ago
|
||
Jon says this is a Fenix issue, not a GV issue. He'll reopen the Fenix bug and close this one.
Comment 12•3 years ago
|
||
Moving some input bugs to the new GeckoView::IME component.
Comment 13•3 years ago
|
||
105
Comment 14•3 years ago
|
||
Not actionable atm. Aaron is reaching out to Firebase.
Reporter | ||
Comment 15•2 years ago
•
|
||
Ruling out GMSCore (FLOSS Play Services) updates on FTL as a suspect. FTL is currently rolling out GMSCore updates across devices (you can tell from the first frame in videos as the Play Services version), and I have an instance of the same (or similar signature) ANR
08-13 20:11:44.907: I/WindowManager(483): Input event dispatching timed out sending to org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity. Reason: 240a4be org.mozilla.fenix.debug/org.mozilla.fenix.HomeActivity (server) is not responding. Waited 5003ms for MotionEvent(deviceId=-1, source=0x00001002, displayId=0, action=MOVE, actionButton=0x00000000, flags=0x00000000, metaState=0x00000000, buttonState=0x00000000, classification=NONE, edgeFlags=0x00000000, xPrecision=1.0, yPrecision=1.0, xCursorPosition=nan, yCursorPosition=nan, pointers=[0: (915.0, 1711.0)]), policyFlags=0x6b000000
@ matrix summary details
With the next two lines as
08-13 20:11:45.022: I/ActivityManager(483): Force stopping org.mozilla.fenix.debug appid=10152 user=0: finished inst 08-13 20:11:45.022: I/ActivityManager(483): Killing 15886:org.mozilla.fenix.debug/u0a152 (adj 0): stop org.mozilla.fenix.debug due to finished inst
Would also like to rule out any warnings of Non-SDK API Usage Violations that FTL identifies (see matrix link violations here) (although no mention of Landroid/view/MotionEvent
)
We have FocusEvent
and MotionEvent
instances at this point
Of note are the slow UI thread and (~100%) slow draw commands reported in the performance tab for this matrix
Comment 16•2 years ago
|
||
This issue is logged on Google's bug tracker here: https://issuetracker.google.com/issues/230950647
Comment 17•2 years ago
|
||
Un-assigning since there is nothing actionable for us to do here and an upstream bug has been filed.
Comment 18•2 years ago
|
||
This comment looks interesting https://issuetracker.google.com/issues/230950647#comment137
My understanding of Google's position on this issue, based on https://issuetracker.google.com/issues/230950647#comment51 and https://issuetracker.google.com/issues/230950647#comment53 is that there isn't an Android bug that's causing these types of ANRs, it's just that they've made a change which means that most ANRs are now being reporting with a callstack that tells you no useful information.
If so, it's not really the case that Google can go and fix these issues for you*, but that they've just created a situation where it's nigh on impossible to diagnose most ANRs. I think the best we can do as developers is look at the few actionable ANRs and hope that by fixing those it might also address some of the ones that come through with this useless callstack.
Description
•