Closed Bug 1371175 Opened 7 years ago Closed 7 years ago

Intermittent Android reftest No tests run or test summary not found after... nothing visibly wrong

Categories

(Firefox for Android Graveyard :: Testing, defect, P3)

Tracking

(Not tracked)

RESOLVED WORKSFORME

People

(Reporter: intermittent-bug-filer, Assigned: gbrown)

Details

(Keywords: intermittent-failure, Whiteboard: [stockwell unknown])

Interesting.

The main test log shows tests running...until they stop:

[task 2017-06-08T04:55:42.621442Z] 04:55:42     INFO -  REFTEST TEST-LOAD | http://10.0.2.2:8854/tests/layout/reftests/font-inflation/threshold-select-listbox-contents-at-1.html | 249 / 309 (80%)
[task 2017-06-08T04:55:42.621508Z] 04:55:42     INFO -  REFTEST INFO | RESTORE PREFERENCE pref(font.size.inflation.lineThreshold,400)
[task 2017-06-08T04:55:42.621572Z] 04:55:42     INFO -  REFTEST INFO | RESTORE PREFERENCE pref(font.size.inflation.forceEnabled,false)
[task 2017-06-08T04:55:42.621628Z] 04:55:42     INFO -  REFTEST INFO | RESTORE PREFERENCE pref(font.size.inflation.emPerLine,0)
[task 2017-06-08T04:55:42.621709Z] 04:55:42     INFO -  REFTEST TEST-LOAD | http://10.0.2.2:8854/tests/layout/reftests/font-inflation/threshold-select-listbox-contents-at-1-ref.html | 249 / 309 (80%)
[task 2017-06-08T04:56:05.897992Z] 04:56:05     INFO -  INFO | automation.py | Application ran for: 0:23:29.856570

So, after threshold-select-listbox-contents-at-1-ref.html, we know there were more tests to run (249 of 309, "80%"), but then the harness noticed that Fennec was no longer running. There was no crash report, and no sign of an uncaught Java exception in the logcat. There was no test summary because the browser died before the tests completed, hence the error "No tests run or test summary not found".
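As a rough illustration of that reading of the log, the last test loaded and the progress counter can be pulled out with a few lines of Python. This is only a sketch: the regular expression is inferred from the excerpt above, and the names `TEST_LOAD` and `last_test_load` are made up for this example, not part of the reftest harness.

```python
import re

# Assumed format, inferred from the excerpt above:
#   REFTEST TEST-LOAD | <url> | <done> / <total> (<pct>%)
TEST_LOAD = re.compile(r"REFTEST TEST-LOAD \| (\S+) \| (\d+) / (\d+) \(\d+%\)")


def last_test_load(lines):
    """Return (url, done, total) for the last TEST-LOAD line, or None."""
    last = None
    for line in lines:
        match = TEST_LOAD.search(line)
        if match:
            last = (match.group(1), int(match.group(2)), int(match.group(3)))
    return last


def stopped_early(lines):
    """True if the log ends with tests still outstanding (done < total)."""
    last = last_test_load(lines)
    return last is not None and last[1] < last[2]
```

If `stopped_early()` is true and there is no test summary, the browser most likely went away mid-run, which is the situation described above.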

Why did the browser die?

Logcat has evidence of low memory:

06-07 21:54:47.822 D/GeckoMemoryMonitor(  768): onLowMemory() notification received
06-07 21:54:47.832 D/GeckoMemoryMonitor(  768): increasing memory pressure to 4
...
06-07 21:55:02.762 D/GeckoMemoryMonitor(  768): onTrimMemory() notification received with level 15
06-07 21:55:02.762 D/GeckoMemoryMonitor(  768): increasing memory pressure to 4
...
06-07 21:55:16.912 I/ActivityManager(  279): Process com.android.inputmethod.latin (pid 379) has died.
06-07 21:55:16.932 W/ActivityManager(  279): Scheduling restart of crashed service com.android.inputmethod.latin/.LatinIME in 5000ms
...
06-07 21:55:23.302 I/ActivityManager(  279): Start proc android.process.acore for content provider com.android.providers.userdictionary/.UserDictionaryProvider: pid=1270 uid=10010 gids={50010, 3003, 1015, 1028}
...
06-07 21:55:26.192 I/ActivityManager(  279): Process android.process.acore (pid 1270) has died.
06-07 21:55:26.302 I/ActivityManager(  279): Process com.android.inputmethod.latin (pid 1253) has died.
06-07 21:55:26.302 W/ActivityManager(  279): Scheduling restart of crashed service com.android.inputmethod.latin/.LatinIME in 20000ms
...
06-07 21:55:36.811 I/ActivityManager(  279): Process org.mozilla.fennec_aurora (pid 768) has died.
06-07 21:55:36.811 W/ActivityManager(  279): Scheduling restart of crashed service org.mozilla.fennec_aurora/org.mozilla.gecko.media.MediaControlService in 19492ms
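
For anyone triaging similar logs, here is a minimal sketch of how these two kinds of logcat events (memory-pressure notifications and process deaths) could be extracted automatically. It assumes logcat lines shaped like the excerpt above; the patterns and the `summarize_logcat` helper are hypothetical and not part of any existing Mozilla tool.

```python
import re

# Patterns inferred from the logcat excerpt above (assumptions, not a spec).
LOW_MEMORY = re.compile(r"GeckoMemoryMonitor\(\s*\d+\): (onLowMemory|onTrimMemory)\(\)")
DIED = re.compile(r"ActivityManager\(\s*\d+\): Process (\S+) \(pid (\d+)\) has died")


def summarize_logcat(lines):
    """Return a list of (timestamp, kind, detail) tuples in log order.

    kind is "memory" for low-memory notifications and "died" for
    process deaths reported by ActivityManager.
    """
    events = []
    for line in lines:
        # The first two whitespace-separated fields are the date and time.
        timestamp = " ".join(line.split()[:2])
        memory = LOW_MEMORY.search(line)
        if memory:
            events.append((timestamp, "memory", memory.group(1)))
            continue
        died = DIED.search(line)
        if died:
            events.append((timestamp, "died", died.group(1)))
    return events
```

Memory notifications followed by a run of "died" events that ends with the browser process itself, with no crash report, is the pattern that suggests the system killed the process for lack of memory.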
Priority: -- → P3
:snorp -- Anything to add? Want to follow up at all?
Flags: needinfo?(snorp)
It does look like OOM to me. We don't have time to look deeper right now, but I want to try to get valgrind going, as we have some reports of high memory usage in the wild and I'm not sure if it's a leak or not.
Flags: needinfo?(snorp)
:gbrown, can you look into this a bit more this week, as the failure rate is getting higher?
Flags: needinfo?(gbrown)
Whiteboard: [stockwell needswork]
With additional history, some trends emerge:
 - almost always on Android Debug
 - almost always in reftests
 - OOM occurs during or shortly after certain tests, like bidi/numeral/persian-1.html and svg/filters/css-svg-filter-chains/clip-input-css-filter.html, but it is unclear what these tests have in common

Interestingly, there have been no failures today or yesterday. Something like bug 1358898 may have fixed this.


Keeping ni for monitoring, but I'm hopeful this is resolved.
0 failures for 3 days in a row.
Assignee: nobody → gbrown
Status: NEW → RESOLVED
Closed: 7 years ago
Flags: needinfo?(gbrown)
Resolution: --- → WORKSFORME
Whiteboard: [stockwell needswork] → [stockwell unknown]
This is still happening on Beta...
Flags: needinfo?(gbrown)
Product: Firefox for Android → Firefox for Android Graveyard