Closed Bug 850670 Opened 12 years ago Closed 12 years ago

Add TBPL bug suggestion support for leaks

Tracking

(Not tracked)

Status:

RESOLVED FIXED

People

(Reporter: emorley, Assigned: emorley)

References

(Blocks 1 open bug)

Details

(Keywords: sheriffing-P1)

Attachments

(3 files)

Part 1: Use more intuitive variable names + add comments 12 years ago Ed Morley [:emorley] 2.36 KB, patch	philor : review+	Details \| Diff \| Splinter Review
Part 2: Make regexes more readable 12 years ago Ed Morley [:emorley] 1.69 KB, patch	philor : review+	Details \| Diff \| Splinter Review
Part 3: Generate bug suggestions for leaks 12 years ago Ed Morley [:emorley] 2.27 KB, patch	philor : review+	Details \| Diff \| Splinter Review

Ed Morley [:emorley]

Assignee

Description

•

12 years ago

For leaks, the TBPL annotated summary currently displays: { 07:32:44 WARNING - TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 168 bytes during test execution 07:32:44 WARNING - TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of Mutex with size 12 bytes 07:32:44 WARNING - TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of ReentrantMonitor with size 16 bytes 07:32:44 WARNING - TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 2 instances of nsRunnable with size 12 bytes each (24 bytes total) 07:32:44 WARNING - TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of nsTArray_base with size 4 bytes 07:32:44 WARNING - TEST-UNEXPECTED-FAIL | automationutils.processLeakLog() | leaked 1 instance of nsThread with size 112 bytes } TBPL's AnnotatedSummaryGenerator.phpm intentionally excludes 'automationutils.processLeakLog()' from bug suggestion BzAPI calls, since it will return too many results. We should: (a) Output just one summary line for the entire leak (since each line on it's own is often insufficient to identify the real intermittent bug). (b) Cope varying numbers/size of each object being leaked (c) Either create a fake "test name" based on the unique combination of first N objects leaked, or else specifically cater for leaks in AnnotatedSummaryGenerator.php before trying to find the test name.

Ed Morley [:emorley]

Assignee

Comment 1

•

12 years ago

Proposal: Make mozilla-central's automationutils.py: 1) Use "leakcheck" in place of automationutils.processLeakLog() 2) Only output the one TEST-UNEXPECTED-FAIL line per leak (which summarises the first N objects, like we do already): > TEST-UNEXPECTED-FAIL | leakcheck | leaked 204 bytes during test execution (Mutex, ReentrantMonitor, nsRunnable, nsTArray_base, nsThread, ...) 3) Output all the other lines as TEST-INFO (like we already do after the first 5 leaked objects). We may even be able to get rid of these, if the leak stats table is deemed easy enough to grok without the summary. Make TBPL's AnnotatedSummaryGenerator.php: 1) Special case testname == "leakcheck" here: https://hg.mozilla.org/webtools/tbpl/file/109d1726d24f/php/inc/AnnotatedSummaryGenerator.php#l78 2) Make the BzAPI search of form: summary="(Mutex, ReentrantMonitor, nsRunnable, nsTArray_base, nsThread, ...)" (Then update the various bugs to use the new summary format). This would give us working bug suggestions that won't break if the number of objects/size of those objects varies platform to platform.

Ed Morley [:emorley]

Assignee

Updated

•

12 years ago

Depends on: 850681

Phil Ringnalda (:philor)

Comment 2

•

12 years ago

The problem isn't just that "automationutils.processLeakLog()" returns too many things, it's also that there are more classes of leak than just "identified by the first five classes of object leaked." There are also * Things which before mozharness logging broke it were characterized by "analyze the leak" saying that a particular test leaked a domwindow. The list of objects will be extremely generic, because it leaked the world, and if we have tbpl suggesting a bug with "DR_State" in the list of objects, then everyone who leaks the world on Try will star their new leak as it. * Things which are characterized by the first or last thing in the list of leaked URLs looking significant to a human. If your DR_State-including leak ends its list with http://127.0.0.1:8888/safebrowsing-dummy/newkey, that's your leak, if your DR_State-including leak ends its list with some URL from one of the last tests to run, or a chrome image, good luck with finding what actually does characterize your leak. * An ugly dumping ground of plugin- and tab-process leaks that don't print their list of leaked URLs, so they have generic leaked-the-world leaks like bug 841134 which might be safebrowsing-dummy/newkey, or might not.

Phil Ringnalda (:philor)

Comment 3

•

12 years ago

Having never actually looked at the full list of objects in anything but trivially small lists, I wonder whether we'd get better identification from "Leaked n bytes where the hash of the list of all objects leaked is 79054025255fb1a26e4bc422aef54eb4."

Ed Morley [:emorley]

Assignee

Comment 4

•

12 years ago

I agree this won't cover all of the classes of leaks, but I think we should try and still aim for a 95% solution (by frequency of manual starring) as a v1 :-) The implementation in comment 1 will avoid the "everyone starring the leaks the world variants as the wrong bug", since it won't do partial matching on one object (eg DR_State), only all of the N first objects shown. If we find there are false positives, we can iterate the algorithm further and/or just rename the summary of those "dumping ground" bugs, to ensure they aren't suggested to people who would mis-use them. Regarding the leaked URLs output, afaik it's not actually in the leak logs (it comes from http://mxr.mozilla.org/mozilla-central/source/netwerk/base/src/nsStandardURL.cpp#301) so would have to be handled separately anyway. (In reply to Phil Ringnalda (:philor) from comment #2) > * Things which before mozharness logging broke it were characterized by > "analyze the leak" saying that a particular test leaked a domwindow. This is on my list to fix - I've just been focusing on this bug so far, due to the number of occurrences of bug 833769 and bug 694254 which I'm sure you're finding as much of a pain to star as I :-) (In reply to Phil Ringnalda (:philor) from comment #3) > Having never actually looked at the full list of objects in anything but > trivially small lists, I wonder whether we'd get better identification from > "Leaked n bytes where the hash of the list of all objects leaked is > 79054025255fb1a26e4bc422aef54eb4." I like the principal of this - the only problems I see is that it doesn't give enough wiggle room for leaks that are essential the same, but vary across platforms, and also it means the resultant bug summaries are even more indecipherable to humans than normal.

Ed Morley [:emorley]

Assignee

Comment 5

•

12 years ago

Attached patch Part 1: Use more intuitive variable names + add comments — Details — Splinter Review

s/tokens/lineParts/ s/testPath/testNameOrFilePath/ s/parts/filePathParts/ + adds a few more comments.

Attachment #725786 - Flags: review?(ryanvm)

Ed Morley [:emorley]

Assignee

Comment 6

•

12 years ago

Attached patch Part 2: Make regexes more readable — Details — Splinter Review

Changes the regex delimiter from '/' to '#' so that we don't have to escape the forward slashes (http://php.net/manual/en/regexp.reference.delimiters.php), thereby making the regexes more readable.

Attachment #725787 - Flags: review?(ryanvm)

Ed Morley [:emorley]

Assignee

Comment 7

•

12 years ago

Attached patch Part 3: Generate bug suggestions for leaks — Details — Splinter Review

Currently, TBPL: * Checks to see if a failure line is a leak. * If so, sets hasLeak to true (which causes us to get the "Leak Analysis" link at the end of the annotated summary) - and then gives up, returning out of processLine() before searching for bugs. With this patch, we'll instead: * Check for test name 'leakcheck'. * If found, set hasLeak to true, then try to extract the list of leaked objects inside the parentheses & use that as the search term. If no such list is found (eg the "negative leaks found!" leakcheck message), we'll fall back to searching for the whole line, since we know searching for just 'leakcheck' isn't going to end well due to number of results returned. * If the test name wasn't 'leakcheck' we continue as before (ie: split on path slashes and take whatever is at the end as the test name). In another bug I want to overall the way we do fall-back (since we duplicate |$searchTerm = $trimmedLine;| in a few places already, plus could really do with adding it in yet more edge cases) - but I don't want to change too much as once, given the lack of tests for TBPL.

Attachment #725789 - Flags: review?(ryanvm)

Ed Morley [:emorley]

Assignee

Comment 8

•

12 years ago

Initial testing appears to work well - see bug 833769 comment 659 (used a log from my try run of bug 850681) - though we'll want to pay close attention to this on tbpl-dev.

Ed Morley [:emorley]

Assignee

Comment 9

•

12 years ago

s/overall/overhaul/

Phil Ringnalda (:philor)

Comment 10

•

12 years ago

Comment on attachment 725786 [details] [diff] [review] Part 1: Use more intuitive variable names + add comments Yup.

Attachment #725786 - Flags: review?(ryanvm) → review+

Phil Ringnalda (:philor)

Comment 11

•

12 years ago

Comment on attachment 725787 [details] [diff] [review] Part 2: Make regexes more readable Some people just don't have a proper appreciation for a pile of twigs :(

Attachment #725787 - Flags: review?(ryanvm) → review+

Phil Ringnalda (:philor)

Comment 12

•

12 years ago

Comment on attachment 725789 [details] [diff] [review] Part 3: Generate bug suggestions for leaks I'm not really excited by the effect on releases/* if we don't backport, but it's not like things could actually be any worse.

Attachment #725789 - Flags: review?(ryanvm) → review+

Ed Morley [:emorley]

Assignee

Comment 13

•

12 years ago

Ta :-) https://hg.mozilla.org/webtools/tbpl/rev/d71ad26ccd23 https://hg.mozilla.org/webtools/tbpl/rev/35d5822d1f23 https://hg.mozilla.org/webtools/tbpl/rev/a205650d5980 I'll backport soon (plus the testing I've just done locally seems to show that there aren't any major issues for release branches, other than the loss of the leak analysis button, which is in need of fixing anyway).

Ed Morley [:emorley]

Assignee

Updated

•

12 years ago

Depends on: 852307

Ed Morley [:emorley]

Assignee

Comment 14

•

12 years ago

In production :-)

Status: ASSIGNED → RESOLVED

Closed: 12 years ago

Resolution: --- → FIXED

Nobody; OK to take it and work on it

Updated

•

11 years ago

Product: Webtools → Tree Management

Nobody; OK to take it and work on it

Updated

•

10 years ago

Product: Tree Management → Tree Management Graveyard

You need to log in before you can comment on or make changes to this bug.