Closed
Bug 1059990
Opened 11 years ago
Closed 8 years ago
Improve hang reporting for Android tests
Categories
(Firefox for Android Graveyard :: Testing, defect)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: gbrown, Assigned: gbrown)
Details
Attachments
(1 file)
3.94 KB, patch
In theory, when the browser hangs during an Android test, the test harness will detect that there has been no output for N seconds, report an error, and shut down the browser first with kill -3 (to produce an ANR report), then with kill -6 (to produce a crash report) and finally with kill -9 (to ensure the process is killed). See http://hg.mozilla.org/mozilla-central/annotate/47c9418fbc28/build/mobile/remoteautomation.py#l366.
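The escalation described above can be sketched in Python as follows. This is an illustrative sketch only, not the code in remoteautomation.py: the function name `kill_escalating`, the `send_signal` and `is_alive` callbacks, the pause between signals, and the choice to stop early once the process is gone are all assumptions made for the example.

```python
import time

# Signals in the order described above. The purposes are taken from the bug
# description: 3 (SIGQUIT) for an ANR report, 6 (SIGABRT) for a crash report,
# 9 (SIGKILL) to ensure the process is killed.
ESCALATION = [(3, "ANR report"), (6, "crash report"), (9, "force kill")]


def kill_escalating(send_signal, is_alive, wait_seconds=0.0, sleep=time.sleep):
    """Send each signal in turn while the process is alive.

    send_signal and is_alive are hypothetical callbacks supplied by the
    caller. Returns the list of signal numbers that were actually sent.
    """
    sent = []
    for signum, _purpose in ESCALATION:
        if not is_alive():
            # Assumption: there is no point escalating once the process is gone.
            break
        send_signal(signum)
        sent.append(signum)
        # Give the process time to write its report before the next signal.
        sleep(wait_seconds)
    return sent
```

In a real harness, `send_signal` would wrap whatever mechanism delivers `kill -N` to the browser process on the device, and `wait_seconds` would need to be long enough for the ANR or crash report to be written.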
In bug 1054292, there are now hundreds of Android 4.0 test hang reports, but none of them seems very useful for diagnosis. The kill -3/-6/-9 procedure appears to be happening, but:
- ANR reports are usually (always?) not generated;
- .dmp files are usually created but are often corrupt, so no crash report is generated;
- when a crash report is generated, the stack for the known problem (bug 1059797) is not reported.
Can we improve the existing mechanism?
Can we leverage Fennec's telemetry hang reporting?
Comment 1•11 years ago (Assignee)
The existing system seems to work better on Android 2.3. Consider:
https://tbpl.mozilla.org/php/getParsedLog.php?id=46969350&tree=Mozilla-Inbound
which produced both an ANR report and a crash report.
Comment 2•11 years ago (Assignee)
Maybe this is not as bad as I thought.
Consider https://tbpl.mozilla.org/?tree=Try&rev=bc4330803c2d, a try push that produced a different hang. These failures seem to produce ANR reports and crash reports more often than in bug 1054292.
Also, investigation of bug 1059797 now points to the compositor thread as the real problem; you can see the problematic stack in many of the crash reports in bug 1054292.
Comment 3•11 years ago
One of the drawbacks of Gecko hang monitoring is that it needs the Gecko thread running in order to output data through telemetry. Maybe we can add an asynchronous mechanism to it so it can output data on its own. The ANR reporter works on its own. I don't know if its output will be useful, but you can find it at /data/data/PACKAGE/files/mozilla/PROFILE/saved_telemetry_pings/.
Also, I think it'll be great if we can somehow attach GDB to a hanging process, dump all the thread stacks, and quit. For native hangs, GDB traces will be a lot better than what we get through ANR logs.
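The GDB idea in comment 3 (attach, dump all thread stacks, quit) might look roughly like the sketch below. It only builds a command line using standard GDB options (`-batch`, `-p`, `-ex`) and the `thread apply all bt` command. The helper name `gdb_dump_command` and the `gdb_path` parameter are hypothetical, and attaching to a process on an Android device would need a remote debugging setup that is not shown here.

```python
def gdb_dump_command(pid, gdb_path="gdb"):
    """Build an argv that attaches to pid, prints every thread's backtrace, and exits.

    gdb_path is a hypothetical parameter for pointing at a specific GDB binary.
    """
    return [
        gdb_path,
        "-batch",                      # run the -ex commands, then exit
        "-p", str(pid),                # attach to the already-running process
        "-ex", "thread apply all bt",  # print a backtrace for every thread
    ]
```

The resulting list could be passed to `subprocess.run(..., capture_output=True, text=True)` so that the harness can copy the backtraces into the test log.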
Comment 4•11 years ago (Assignee)
(In reply to Jim Chen [:jchen :nchen] from comment #3)
> /data/data/PACKAGE/files/mozilla/PROFILE/saved_telemetry_pings/.
Oops -- seems like it is actually saved-telemetry-pings! (- vs _)
Comment 5•11 years ago
(In reply to Geoff Brown [:gbrown] (PTO Sept 15 - Oct 7) from comment #4)
> (In reply to Jim Chen [:jchen :nchen] from comment #3)
> > /data/data/PACKAGE/files/mozilla/PROFILE/saved_telemetry_pings/.
>
> Oops -- seems like it is actually saved-telemetry-pings! (- vs _)
Ah you're right! Sorry! See [1] for the format of the JSON files. I'm still not sure if we do generate ANR reports during test hangs. I hope we do!
[1] http://mxr.mozilla.org/mozilla-central/source/mobile/android/base/ANRReporter.java#259
Comment 6•11 years ago (Assignee)
Here's my work-in-progress patch. You can see it running at https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=4b74c73bc4bc.
I have reproduced the "CreateShader" hang a couple of times but never found an ANR report. It needs to be tested more.
mochitest-1 and some robocop runs often create a file in saved-telemetry-pings but these have "reason":"saved-session". For example, http://mozilla-releng-blobs.s3.amazonaws.com/blobs/try/sha512/26b78c624ee5cc4c28472ec5579a41a7442d759d3a0403a07386f1849dfc8b57792a085efc31b9b9cbfafbb617d6d3c3c263d41a2578e65544b188f33793d445
I suppose we should pull any file found in saved-telemetry-pings, then open the file, check the reason, and discard anything that is not "android-anr-report".
I won't get back to this for several weeks -- feel free to take this bug.
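The filtering step suggested in comment 6 could be sketched as below. This is not the attached patch; it is a minimal illustration that assumes the files have already been pulled from the device into a local directory, and that each ping is a JSON object with a top-level "reason" key (as the "reason":"saved-session" example suggests). The function name `find_anr_pings` is hypothetical.

```python
import json
from pathlib import Path


def find_anr_pings(ping_dir, wanted_reason="android-anr-report"):
    """Return paths of ping files in ping_dir whose "reason" matches wanted_reason.

    Files that cannot be read or parsed as JSON are skipped rather than
    treated as errors, since a hung browser may leave partial files behind.
    """
    matches = []
    for path in sorted(Path(ping_dir).iterdir()):
        if not path.is_file():
            continue
        try:
            ping = json.loads(path.read_text(encoding="utf-8"))
        except (OSError, ValueError):
            # Unreadable or malformed file: ignore it.
            continue
        # Assumption: "reason" is a top-level key of the ping object.
        if isinstance(ping, dict) and ping.get("reason") == wanted_reason:
            matches.append(path)
    return matches
```

Anything left out of the returned list, such as "saved-session" pings, can be discarded before the harness reports its findings.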
Updated•8 years ago (Assignee)
Assignee: nobody → gbrown
Status: NEW → RESOLVED
Closed: 8 years ago
Component: General → Testing
Product: Testing → Firefox for Android
Resolution: --- → WONTFIX
Updated•5 years ago
Product: Firefox for Android → Firefox for Android Graveyard