Android robocop crashes no longer pass the correct test name to mozcrash

RESOLVED FIXED in Firefox 33

Status

Testing
General
--
major
RESOLVED FIXED
4 years ago
4 years ago

People

(Reporter: emorley, Assigned: akachkach)

Tracking

(Blocks: 1 bug)

Trunk
mozilla34
ARM
Android
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox33 fixed, firefox34 fixed)

Details

Attachments

(1 attachment)

(Reporter)

Description

4 years ago
eg:
https://tbpl.mozilla.org/php/getParsedLog.php?id=44525801&tree=B2g-Inbound
09:24:21  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ js::jit::ICSetProp_Native::Compiler::getStub(js::jit::ICStubSpace*)]

Should have been:
09:24:21  WARNING -  PROCESS-CRASH | testImportFromAndroid | application crashed [@ js::jit::ICSetProp_Native::Compiler::getStub(js::jit::ICStubSpace*)]

And:
https://tbpl.mozilla.org/php/getParsedLog.php?id=44524261&tree=Mozilla-Inbound
08:57:23  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ CustomElf::GetSymbol(char const*, unsigned long) const]

Should have been:
08:57:23  WARNING -  PROCESS-CRASH | testDistribution | application crashed [@ CustomElf::GetSymbol(char const*, unsigned long) const]

The string printed there is test_name passed to mozcrash:
http://mxr.mozilla.org/mozilla-central/source/testing/mozbase/mozcrash/mozcrash/mozcrash.py#25

Via lastTestSeen:
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#1447
1447       crashed = mozcrash.check_for_crashes(minidump_path,
1448                                            symbolsPath,
1449                                            test_name=self.lastTestSeen)

I'm guessing this might be fallout from bug 886570?
(Reporter)

Comment 1

4 years ago
I'm presuming this change doesn't play nicely with robocop:

-    def record_last_test(self, line):
-      """record last test on harness"""
-      if "TEST-START" in line and "|" in line:
-        self.harness.lastTestSeen = line.split("|")[1].strip()
-      return line
+    def record_last_test(self, message):
+      """record last test on harness"""
+      if message['action'] == 'test_start':
+        self.harness.lastTestSeen = message['test']
+      return message

Does robocop not send a test_start event?

(Note mozcrash has a fallback, so it's not that:
message['test'] = remoteautomation.py
...but instead:
message['action'] != 'test_start'

and thus we're hitting the fallback:
http://mxr.mozilla.org/mozilla-central/source/testing/mozbase/mozcrash/mozcrash/mozcrash.py#61
(Assignee)

Comment 2

4 years ago
Oh well. It's way worse than that actually: Robocop isn't doing structured logs for now, as I thought passing unstructured logs as log messages would be enough while we figure this out. Here 

I started working on a patch to convert the Java code to structured logging (bug 1034236 and bug 1034267), so I can either try to finish the Java refactor (it should be almost done... the problem is that I can't get fennec/robocop to run locally, so it's hard to test) or add conditions on unstructured logs (if message['action'] == 'log' and 'TEST-START' in message['message']). Not sure which is preferable.
(Assignee)

Updated

4 years ago
Depends on: 1034267
(Reporter)

Comment 3

4 years ago
A temporary workaround until bug 1034236 / bug 1034267 are finished might be best, if that's ok? :-)
(Assignee)

Comment 4

4 years ago
Created attachment 8462735 [details] [diff] [review]
0001-Bug-1043485-Save-last-Android-robocop-test-name-when.patch

I believe this would fix the problem (and a couple other handlers) but we won't have 100% compatibility until we land the structured robocop logs. Hopefully that won't take too long (see bug 1034267), just have to make sure the output looks the same as before.
Attachment #8462735 - Flags: review?(emorley)
(Reporter)

Updated

4 years ago
Attachment #8462735 - Flags: review?(emorley) → review+
(Reporter)

Comment 5

4 years ago
Thank you :-)
(Assignee)

Updated

4 years ago
Assignee: nobody → akachkach
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/00b25727072e
Status: NEW → RESOLVED
Last Resolved: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
(Reporter)

Comment 8

4 years ago
Unfortunately I'm still seeing a missing name for "Shutdown", eg:
https://tbpl.mozilla.org/php/getParsedLog.php?id=44811243&tree=Mozilla-Inbound

{
09:48:16     INFO -  1372 INFO TEST-PASS | testNativeCrypto | Expected byte array length matches key length - 20 should equal 20
09:48:16     INFO -  1373 INFO TEST-PASS | testNativeCrypto | Expected byte array matches key byte array -
09:48:16     INFO -  1374 INFO TEST-PASS | testNativeCrypto | MessageDigest hash is the same as NativeCrypto SHA-1 hash -
09:48:16     INFO -  1375 INFO TEST-PASS | testNativeCrypto | MessageDigest hash is the same as NativeCrypto SHA-1 hash -
09:48:16     INFO -  1376 INFO TEST-PASS | testNativeCrypto | MessageDigest hash is the same as NativeCrypto SHA-1 hash -
09:48:16     INFO -  1377 INFO TEST-END | testNativeCrypto | finished in 40471ms
09:48:16     INFO -  1378 INFO TEST-START | Shutdown
09:48:16     INFO -  1379 INFO Passed: 46
09:48:16     INFO -  1380 INFO Failed: 0
09:48:16     INFO -  1381 INFO Todo: 0
09:48:16     INFO -  1382 INFO SimpleTest FINISHED
09:48:16     INFO -  INFO | automation.py | Application ran for: 0:01:16.247024
09:48:16     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmp3hGV4_pidlog
09:48:16     INFO -  Contents of /data/anr/traces.txt:
09:48:16     INFO -  
09:48:16     INFO -  
09:48:16     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android/1406647240/fennec-34.0a1.en-US.android-arm.crashreporter-symbols.zip
09:48:16  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ libui.so + 0x1befe]
}

The above crash is intermittent failure bug 986738 - previous logs for that failure mode looked like:

https://tbpl.mozilla.org/php/getParsedLog.php?id=44089292&tree=Mozilla-Inbound

{
01:46:01     INFO -  15 INFO TEST-PASS | testAddonManager | Given message occurred for registered event: {"parentId":0,"delayLoad":false,"title":"http:\/\/mochi.test:8888\/tests\/robocop\/robocop_blank_01.html","selected":true,"isPrivate":false,"stub":false,"external":false,"desktopMode":false,"tabIndex":-1,"tabID":2,"type":"Tab:Added","uri":"http:\/\/mochi.test:8888\/tests\/robocop\/robocop_blank_01.html"} - Tab:Added should equal Tab:Added
01:46:01     INFO -  16 INFO TEST-PASS | testAddonManager | Given message occurred for registered event: {"errorType":"","bgColor":"transparent","type":"DOMContentLoaded","tabID":2} - DOMContentLoaded should equal DOMContentLoaded
01:46:01     INFO -  EventExpecter: no longer listening for Tab:Added
01:46:01     INFO -  EventExpecter: no longer listening for DOMContentLoaded
01:46:01     INFO -  17 INFO TEST-PASS | testAddonManager | The correct number of tabs are opened - 2 should equal 2
01:46:01     INFO -  18 INFO TEST-PASS | testAddonManager | Page title is correct - Browser Blank Page 01 should equal Browser Blank Page 01
01:46:01     INFO -  waitForText timeout on ^Add-ons$
01:46:01     INFO -  19 INFO TEST-PASS | testAddonManager | The correct number of tabs are opened - 2 should equal 2
01:46:01     INFO -  20 INFO TEST-END | testAddonManager | finished in 152371ms
01:46:01     INFO -  21 INFO TEST-START | Shutdown
01:46:01     INFO -  22 INFO Passed: 18
01:46:01     INFO -  23 INFO Failed: 0
01:46:01     INFO -  24 INFO Todo: 0
01:46:01     INFO -  25 INFO SimpleTest FINISHED
01:46:01     INFO -  
01:46:01     INFO -  INFO | automation.py | Application ran for: 0:03:10.208340
01:46:01     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpqFoFyfpidlog
01:46:01     INFO -  Contents of /data/anr/traces.txt:
01:46:01     INFO -  
01:46:01     INFO -  
01:46:01     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/mozilla-inbound-android/1405666711/fennec-33.0a1.en-US.android-arm.crashreporter-symbols.zip
01:46:01  WARNING -  PROCESS-CRASH | Shutdown | application crashed [@ libui.so + 0x1befe]
}

In fact looking at the comments in that bug, you can see the test name change around 2014-07-18, which is when bug 886570 landed.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Reporter)

Comment 9

4 years ago
In fact it seems like this isn't working for !Shutdown too, eg:

(this push was after the fix here merged around)

https://tbpl.mozilla.org/php/getParsedLog.php?id=44769830&tree=Fx-Team

21:24:47     INFO -  11 INFO TEST-START | testLoad
21:24:47     INFO -  INFO | automation.py | Application ran for: 0:00:15.764747
21:24:47     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmptLyQVepidlog
21:24:47     INFO -  Contents of /data/anr/traces.txt:
21:24:49     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/mobile/tinderbox-builds/fx-team-android/1406604281/fennec-34.0a1.en-US.android-arm.crashreporter-symbols.zip
21:24:53  WARNING -  PROCESS-CRASH | remoteautomation.py | application crashed [@ CustomElf::GetSymbol(char const*, unsigned long) const]
(Assignee)

Comment 10

4 years ago
Can we wait for bug 1034267 to land? Using actual structured logs in Robocop might solve this.

Actually, I'm starting to wonder if this isn't caused by the changes to remoteautomation.py

http://dxr.mozilla.org/mozilla-central/source/build/mobile/remoteautomation.py#251
(Assignee)

Comment 11

4 years ago
After further inspecting remoteautomation.py, it seems like this bug is caused because we still pass a structured logger to RemoteAutomation, which in turn processes the logs like structured logs (looking for a "test_start" action where it will only get logs).

I think bug 1034267 will fix this. What's an easy way to create a PROCESS-CRASH in robocop tests?
(Assignee)

Comment 12

4 years ago
Pushed the bug 1034267 patch with a MOZ_CRASH to see if it's fixed: https://hg.mozilla.org/try/rev/3a3d80f713df
(Assignee)

Comment 13

4 years ago
With the patch applied, here are the logs we get on a crash:

16:30:12     INFO -  10 INFO SimpleTest START
16:30:12     INFO -  11 INFO TEST-START | testFlingCorrectness
16:30:12     INFO -  12 INFO TEST-PASS | testFlingCorrectness | Given message occurred for registered event: {"type":"Gecko:Ready"} - Gecko:Ready should equal Gecko:Ready
16:30:12     INFO -  13 INFO EventExpecter: no longer listening for Gecko:Ready
16:30:12     INFO -  INFO | automation.py | Application ran for: 0:00:45.686828
16:30:12     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpUX5pqapidlog
16:30:12     INFO -  Contents of /data/anr/traces.txt:
16:30:12     INFO -  
16:30:12     INFO -  
16:30:12     INFO -  mozcrash INFO | Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/try-builds/ahmed.kachkach@gmail.com-3a3d80f713df/try-android/fennec-34.0a1.en-US.android-arm.crashreporter-symbols.zip
16:30:12  WARNING -  PROCESS-CRASH | testFlingCorrectness | application crashed [@ nsWindow::OnGlobalAndroidEvent(mozilla::AndroidGeckoEvent*)]
16:30:12     INFO -  Crash dump filename: /tmp/tmpB1OHgg/0b330c82-e0b9-20be-4b047e8a-10985a44.dmp

So I guess we should just wait for bug 1034267 to land.
Flags: needinfo?(emorley)
(Reporter)

Comment 14

4 years ago
Sounds good to me, thank you :-)
Flags: needinfo?(emorley)
(Reporter)

Updated

4 years ago
Status: REOPENED → RESOLVED
Last Resolved: 4 years ago4 years ago
Resolution: --- → FIXED
https://hg.mozilla.org/releases/mozilla-aurora/rev/40bd25be8263
status-firefox33: --- → fixed
status-firefox34: --- → fixed
You need to log in before you can comment on or make changes to this bug.