Closed Bug 910320 Opened 12 years ago Closed 11 years ago

Check for java exceptions on Talos runs too

Categories

(Testing :: Talos, defect)

ARM
Android
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 4 obsolete files)

Bug 823452 added support for checking for java exceptions in the logcat (and if found, means we don't bother looking for the minidump). ie: http://hg.mozilla.org/mozilla-central/annotate/44e615a66f3f/build/mobile/remoteautomation.py#l92 { 89 def checkForJavaException(self, logcat): 90 found_exception = False 91 for i, line in enumerate(logcat): 92 if "REPORTING UNCAUGHT EXCEPTION" in line or "FATAL EXCEPTION" in line: 93 # Strip away the date, time, logcat tag and pid from the next two lines and 94 # concatenate the remainder to form a concise summary of the exception. 95 # 96 # For example: 97 # 98 # 01-30 20:15:41.937 E/GeckoAppShell( 1703): >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 9 ("GeckoBackgroundThread") 99 # 01-30 20:15:41.937 E/GeckoAppShell( 1703): java.lang.NullPointerException 100 # 01-30 20:15:41.937 E/GeckoAppShell( 1703): at org.mozilla.gecko.GeckoApp$21.run(GeckoApp.java:1833) 101 # 01-30 20:15:41.937 E/GeckoAppShell( 1703): at android.os.Handler.handleCallback(Handler.java:587) 102 # 01-30 20:15:41.937 E/GeckoAppShell( 1703): at android.os.Handler.dispatchMessage(Handler.java:92) 103 # 01-30 20:15:41.937 E/GeckoAppShell( 1703): at android.os.Looper.loop(Looper.java:123) 104 # 01-30 20:15:41.937 E/GeckoAppShell( 1703): at org.mozilla.gecko.util.GeckoBackgroundThread.run(GeckoBackgroundThread.java:31) 105 # 106 # -> java.lang.NullPointerException at org.mozilla.gecko.GeckoApp$21.run(GeckoApp.java:1833) 107 found_exception = True 108 logre = re.compile(r".*\): \t?(.*)") 109 m = logre.search(logcat[i+1]) 110 if m and m.group(1): 111 top_frame = m.group(1) 112 m = logre.search(logcat[i+2]) 113 if m and m.group(1): 114 top_frame = top_frame + m.group(1) 115 print "PROCESS-CRASH | java-exception | %s" % top_frame 116 break 117 return found_exception } However, this isn't implemented for Talos, meaning that we get things like: https://tbpl.mozilla.org/php/getParsedLog.php?id=26815583&full=1&branch=fig { PROCESS-CRASH | tcheck2 | application crashed [@ mozalloc_abort(char const*)] Crash dump filename: /tmp/tmpPtM48s/469f2341-9b77-5a0d-31e4eb41-270f459d.dmp Operating system: Android 0.0.0 Linux 2.6.32.9-00002-gd8084dc-dirty #1 SMP PREEMPT Wed Feb 2 11:32:06 PST 2011 armv7l nvidia/harmony/harmony/harmony:2.2/FRF91/20110202.102810:eng/test-keys CPU: arm 2 CPUs Crash reason: SIGSEGV Crash address: 0x0 Thread 0 (crashed) 0 libmozalloc.so!mozalloc_abort(char const*) [mozalloc_abort.cpp:7e83ba2dfb51 : 30 + 0x4] r4 = 0x0000aa50 r5 = 0x00000000 r6 = 0x483f35b0 r7 = 0x430cbf30 r8 = 0xbe9697b0 r9 = 0x430cbf28 r10 = 0x430cbf14 fp = 0x00000000 sp = 0xbe969728 lr = 0x4498a0bb pc = 0x4498a0be Found by: given as instruction pointer in context 1 libxul.so!Java_org_mozilla_gecko_GeckoAppShell_reportJavaCrash [AndroidJNI.cpp:7e83ba2dfb51 : 102 + 0x3] r4 = 0x0000aa50 r5 = 0x00000000 r6 = 0x483f35b0 r7 = 0x430cbf30 r8 = 0xbe9697b0 r9 = 0x430cbf28 r10 = 0x430cbf14 fp = 0x00000000 sp = 0xbe969730 pc = 0x55700ef7 Found by: call frame info 2 libmozglue.so!Java_org_mozilla_gecko_GeckoAppShell_reportJavaCrash [jni-stubs.inc:7e83ba2dfb51 : 108 + 0x1] r4 = 0x0000aa50 r5 = 0x00000000 r6 = 0x483f35b0 r7 = 0x430cbf30 r8 = 0xbe9697b0 r9 = 0x430cbf28 r10 = 0x430cbf14 fp = 0x00000000 sp = 0xbe9697a8 pc = 0x80c1b4a3 Found by: call frame info } However in the logcat, as expected: { 08-21 05:25:10.594 E/GeckoAppShell( 2110): >>> REPORTING UNCAUGHT EXCEPTION FROM THREAD 1 ("main") 08-21 05:25:10.594 E/GeckoAppShell( 2110): java.lang.IllegalStateException: Fragment already added: PinBookmarkDialog{48425208 #3 pin_bookmark} 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.support.v4.app.FragmentManagerImpl.addFragment(FragmentManager.java:1159) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.support.v4.app.BackStackRecord.run(BackStackRecord.java:616) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.support.v4.app.FragmentManagerImpl.execPendingActions(FragmentManager.java:1444) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.support.v4.app.FragmentManagerImpl$1.run(FragmentManager.java:429) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.os.Handler.handleCallback(Handler.java:587) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.os.Handler.dispatchMessage(Handler.java:92) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.os.Looper.loop(Looper.java:123) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at android.app.ActivityThread.main(ActivityThread.java:4627) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at java.lang.reflect.Method.invokeNative(Native Method) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at java.lang.reflect.Method.invoke(Method.java:521) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at com.android.internal.os.ZygoteInit$MethodAndArgsCaller.run(ZygoteInit.java:868) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at com.android.internal.os.ZygoteInit.main(ZygoteInit.java:626) 08-21 05:25:10.594 E/GeckoAppShell( 2110): at dalvik.system.NativeStart.main(Native Method) }
Attached patch WIP (obsolete) — Splinter Review
Assignee: nobody → emorley
Status: NEW → ASSIGNED
Comment on attachment 796782 [details] [diff] [review] WIP Joel, this is untested yet (because talos and so I can't face doing so today), but do you at least think this is the approach that you'd prefer?
Attachment #796782 - Flags: feedback?(jmaher)
Blocks: 907720
Don't we have this data in the .extra file next to the .dmp? Couldn't we handle this in mozcrash instead of duplicating this logic between automation.py and Talos? :-(
Comment on attachment 796782 [details] [diff] [review] WIP Review of attachment 796782 [details] [diff] [review]: ----------------------------------------------------------------- this approach is fine. It would be great if mozcrash had a mozcrash.checkForJavaCrash() method :)
ted, this data is found in logcat information, not in .dmp or .extra files.
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #3) > Don't we have this data in the .extra file next to the .dmp? Couldn't we > handle this in mozcrash instead of duplicating this logic between > automation.py and Talos? :-( That's the approach I trialled in bug 809065, but the logcat method of bug 823452 got implemented first, and at the time we had no reason to favour one method over the other (though now that seems less of the case).
(In reply to Joel Maher (:jmaher) from comment #5) > ted, this data is found in logcat information, not in .dmp or .extra files. We put the Java stack in the .extra file so Socorro can process it, so I assume it's available there in automation as well.
there are many cases where we have java only crashes which are at the system level and have nothing to do with our program. If it is a java crash in our program we get .dmp/.extra files, if this is a libc crash or libdvm crash at the system level, we only see it in logcat.
Okay, I don't know about all of those. Ed's comment 0 clearly shows that we have a .dmp (since we got a stack from it), so we'd presumably have the Java stack in the matching .extra. Maybe we could simply move this method into mozbase somewhere instead of duplicating it?
Attachment #796782 - Flags: feedback?(jmaher)
Blocks: 920490
Depends on: 923500
Let's just pop the common functionality in mozcrash and use that both from remoteautomation.py and talos. Have broken out to bug 923500, since we'll need to do the "land in mozbase, deploy elsewhere" dance, before we can proceed here.
Comment on attachment 796782 [details] [diff] [review] WIP Have broken out the "clean up remoteautomation.py" part to bug 923513, due to this bug requiring talos push to prod etc.
Attachment #796782 - Attachment is obsolete: true
Use the new mozcrash.check_for_java_exception() added by bug 910320 (this will obviously have to land after the updated mozcrash is rolled out).
Attachment #813655 - Flags: review?(jmaher)
ttest.py:15: 'glob' imported but unused ttest.py:19: 're' imported but unused ttest.py:21: 'shutil' imported but unused ttest.py:129: local variable 'cleanup_result' is assigned to but never used ttest.py:199: local variable 'e' is assigned to but never used
Attachment #813656 - Flags: review?(jmaher)
Comment on attachment 813655 [details] [diff] [review] Part 1: Use mozcrash.check_for_java_exception() Review of attachment 813655 [details] [diff] [review]: ----------------------------------------------------------------- close, but this should be rearranged a bit. ::: talos/ttest.py @@ +136,5 @@ > + with open('logcat.log') as f: > + logcat = f.read() > + found = mozcrash.check_for_java_exception(logcat) > + > + if not found: why not use an else here? @@ +168,5 @@ > > + found = mozcrash.check_for_crashes(minidumpdir, > + browser_config['symbols_path'], > + stackwalk_binary=stackwalkbin, > + test_name=test_name) I would like to put all the remote checking in the same block. So, check for remote first (logcat/java, then minidump), then windows, osx, linux.
Attachment #813655 - Flags: review?(jmaher) → review-
Comment on attachment 813656 [details] [diff] [review] Part 2: Fix some existing Pyflakes warnings Review of attachment 813656 [details] [diff] [review]: ----------------------------------------------------------------- thanks!
Attachment #813656 - Flags: review?(jmaher) → review+
(In reply to Joel Maher (:jmaher) from comment #14) > Comment on attachment 813655 [details] [diff] [review] > Part 1: Use mozcrash.check_for_java_exception() > > ::: talos/ttest.py > @@ +136,5 @@ > > + with open('logcat.log') as f: > > + logcat = f.read() > > + found = mozcrash.check_for_java_exception(logcat) > > + > > + if not found: > > why not use an else here? Because that breaks the "logcat exists but there wasn't a java exception" case. (ie: if the logcat exists, we'd give up after finding no java exception, rather than continuing to look for the minidump as usual). > @@ +168,5 @@ > > > > + found = mozcrash.check_for_crashes(minidumpdir, > > + browser_config['symbols_path'], > > + stackwalk_binary=stackwalkbin, > > + test_name=test_name) > > I would like to put all the remote checking in the same block. So, check > for remote first (logcat/java, then minidump), then windows, osx, linux. Agreed, have rearranged and will re-upload.
Attachment #814389 - Flags: review?(jmaher)
Attachment #813655 - Attachment is obsolete: true
Just noticed the logcat was being passed as a string, rather than the required list of lines. Added a .split('\r') to match: https://github.com/mozilla/mozbase/blob/master/mozdevice/mozdevice/devicemanager.py#L137 ...which is what is used by automation.py.in prior to calling self.checkForJavaException(logcat): http://hg.mozilla.org/mozilla-central/file/51b36c5fd45f/build/mobile/remoteautomation.py#l146
Attachment #814392 - Flags: review?(jmaher)
Attachment #814389 - Attachment is obsolete: true
Attachment #814389 - Flags: review?(jmaher)
Comment on attachment 814392 [details] [diff] [review] Part 1: Use mozcrash.check_for_java_exception() v3 Review of attachment 814392 [details] [diff] [review]: ----------------------------------------------------------------- thanks!
Attachment #814392 - Flags: review?(jmaher) → review+
+ bump the version of mozcrash required to the newly released 0.9, so check_for_java_exception() is available.
Attachment #814392 - Attachment is obsolete: true
Update create_talos_zip.py to use the new mozcrash version too.
Comment on attachment 817038 [details] [diff] [review] Part 3: create_talos_zip.py followup Review of attachment 817038 [details] [diff] [review]: ----------------------------------------------------------------- this looks good
Panda talos runs will need bug 927055 too
Depends on: 927055
Depends on: 928370
will this be closed when talos is deployed?
(In reply to Joel Maher (:jmaher) from comment #26) > will this be closed when talos is deployed? Yup :-)
Depends on: 931032
Status: ASSIGNED → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: