Autophone - improve error messages to help Treeherder suggestions narrow the possibilities

RESOLVED WONTFIX

Status

Testing
Autophone
RESOLVED WONTFIX
6 months ago
24 days ago

People

(Reporter: bc, Assigned: bc)

Tracking

Trunk
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Assignee)

Description

6 months ago
https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=b4b9eaba235ec71a73bd996f436bcdecfc79bf0c&filter-searchStr=autophone&selectedJob=129212564 is an example where Autophone's error message causes Treeherder to make many unrelated suggestions, with an unrelated crash as the top suggestion.

The suggested match is:
1388294 - Intermittent autophone-s1s2 | application crashed [@ libc.so + 0x28a74] [ 0.44 ] 

but the actual bug is hidden under More as:
 [WONTFIX] 1121547 - Autophone - Intermittent PROCESS-CRASH | autophone-s1s2 | application crashed [@ libc.so + 0x21ba8] | PROCESS-CRASH | autophone-s1s2 | application crashed [@ libsc-a3xx.so] [ 0.35 ] 

The WONTFIX status may be why this particular suggestion is relegated to the More section, but we also have another problem with Autophone: the number of suggestions routinely exceeds the limit of 20.

KWierso mentioned that Treeherder matches on the "filename" autophone-s1s2 and suggests any bug with a matching "filename". It only searches for the full error message string if it doesn't find a match on the "filename".
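
A rough sketch of the lookup as KWierso describes it, to show why every autophone-s1s2 bug ends up as a suggestion. This is not Treeherder's code: the function name `suggest_bugs`, the plain substring test, and the assumption that the "filename" is the second pipe-delimited field are all hypothetical.

```python
def suggest_bugs(error_line, bug_summaries):
    """Return bug summaries matching the "filename", else the full message.

    Hypothetical model of the behaviour described above, not Treeherder's
    actual implementation.
    """
    fields = [field.strip() for field in error_line.split("|")]
    # Assumption: the "filename" is the second pipe-delimited field,
    # e.g. "autophone-s1s2" in "PROCESS-CRASH | autophone-s1s2 | ...".
    filename = fields[1] if len(fields) > 1 else ""
    if filename:
        by_filename = [s for s in bug_summaries if filename in s]
        if by_filename:
            # Any bug that mentions the "filename" is suggested,
            # regardless of which crash signature it describes.
            return by_filename
    # Fallback: search for the message without the leading status field.
    message = " | ".join(fields[1:]) if len(fields) > 1 else error_line
    return [s for s in bug_summaries if message in s]
```

Because every Autophone failure in this job shares the "filename" autophone-s1s2, the first branch always wins under this model and the crash signature never gets a chance to narrow the list.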

Is there a way to structure these error messages so that Treeherder will use suggestions that match the string:

"autophone-s1s2 | application crashed [@ ....]" ?
Flags: needinfo?(cdawson)

Comment 1

6 months ago
From what I've seen, the 20-bug limit only applies to the old Failure Summary panel. The newer Autoclassify mode does not appear to have the 20-bug limit. So going forward, that shouldn't be a serious problem once stragglers stop using the old way. It would be good to combine any opened bugs that actually do cover the same failure, though, to avoid duplicating work.

The larger problem is that the autoclassify panel sorts the results by the confidence in the match between the failure message and the bug summary. None of the material at the beginning of the failure message (datestamp, timestamp, device, etc.) becomes part of the bug summary: the bug filer tool starts with the "PROCESS-CRASH" part, and even that gets stripped out as unnecessary to save a few characters. You end up with just "autophone-s1s2 | application crashed [@ libc.so + 0xFOO]", which matches only a small part of the failure message, so the sorted results are less accurate.
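
To illustrate the effect, here is a small sketch that scores a failure line against a bug summary. It uses Python's `difflib.SequenceMatcher` purely as a stand-in, since how Treeherder actually computes its confidence is not described in this bug. The exact layout of `NOISY_LINE` is also an assumption, pieced together from the fields mentioned in this bug.

```python
from difflib import SequenceMatcher

BUG_SUMMARY = "autophone-s1s2 | application crashed [@ libc.so + 0x49da0]"

# Assumed shape of a current Autophone error line: log context first,
# then the test name, then the PROCESS-CRASH part.
NOISY_LINE = (
    "2017-09-07 13:39:47,114 pixel-04 INFO S1S2TestJob autoland "
    "20170907193116 opt api-25 android-api-16 remote-nytimes "
    "PROCESS-CRASH | autophone-s1s2 | application crashed [@ libc.so + 0x49da0]"
)

# The same failure with the log context moved off the error line.
CLEAN_LINE = "PROCESS-CRASH | autophone-s1s2 | application crashed [@ libc.so + 0x49da0]"


def similarity(failure_line, bug_summary):
    """Return a 0..1 similarity ratio (stand-in for Treeherder's score)."""
    # autojunk=False disables difflib's heuristic that ignores very
    # common characters in long sequences.
    matcher = SequenceMatcher(None, failure_line, bug_summary, autojunk=False)
    return matcher.ratio()


if __name__ == "__main__":
    print("noisy:", round(similarity(NOISY_LINE, BUG_SUMMARY), 2))
    print("clean:", round(similarity(CLEAN_LINE, BUG_SUMMARY), 2))
```

With this stand-in metric the clean line scores higher than the noisy one, because the unmatched prefix counts toward the total length but contributes nothing to the match.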

Could Autophone be changed to put the timestamps and other context on a prior line, so that they stay in the log but are not surfaced in Treeherder on the error line? Doing that would improve the accuracy of the matching between the failure and the bug.

And if the "remote-nytimes", "remote-blank", etc. parts of the failures are useful, moving them after the "PROCESS-CRASH" part would mean they get included in the bug summary.



So a preferred (I think) format for when autophone has a failure would be something like this:
2017-09-07 13:39:47,114 pixel-04 INFO S1S2TestJob autoland 20170907193116 opt api-25 android-api-16
PROCESS-CRASH | autophone-s1s2 | remote-nytimes | application crashed [@ libc.so + 0x49da0]

In this case, only the second line would be exposed as a failure to Treeherder, and the bug summary filed from this would be:
autophone-s1s2 | remote-nytimes | application crashed [@ libc.so + 0x49da0]
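
A minimal sketch of how that two-line output and the resulting summary could be produced. The helper names `format_failure` and `bug_summary` are hypothetical and are not taken from Autophone or the bug filer. The prefix stripping only mimics the behaviour described above.

```python
CRASH_PREFIX = "PROCESS-CRASH | "


def format_failure(context, suite, test, signature):
    """Return (context_line, error_line) for one crash.

    The context line carries timestamp, device, build, etc. and stays in
    the log. Only the error line is meant to be surfaced as the failure.
    """
    error_line = (
        f"{CRASH_PREFIX}{suite} | {test} | application crashed [@ {signature}]"
    )
    return context, error_line


def bug_summary(error_line):
    """Drop the leading PROCESS-CRASH field, as the bug filer is said to do."""
    if error_line.startswith(CRASH_PREFIX):
        return error_line[len(CRASH_PREFIX):]
    return error_line


if __name__ == "__main__":
    context_line, error_line = format_failure(
        "2017-09-07 13:39:47,114 pixel-04 INFO S1S2TestJob autoland "
        "20170907193116 opt api-25 android-api-16",
        "autophone-s1s2",
        "remote-nytimes",
        "libc.so + 0x49da0",
    )
    print(context_line)
    print(error_line)
    print(bug_summary(error_line))
```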



With it formatted like this, crashes in different tests that have the same root cause could be combined so the bug summary ends up being something like:
autophone-s1s2 | remote-nytimes,remote-blank | application crashed [@ libc.so + 0x49da0]

This would still be shown as a suggestion in Treeherder, but it would take up one bug instead of two while being matched more accurately than with the current format.
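
Combining summaries this way could be sketched as follows. `combine_summaries` is a hypothetical helper, and it assumes every summary has exactly the three-field "suite | test | crash" shape proposed above.

```python
def combine_summaries(summaries):
    """Merge summaries that share a suite and crash signature.

    Assumes each summary looks like "suite | test | crash"; a summary
    with fewer than three fields raises ValueError.
    """
    grouped = {}  # (suite, crash) -> list of test names, in first-seen order
    for summary in summaries:
        suite, test, crash = (part.strip() for part in summary.split(" | ", 2))
        tests = grouped.setdefault((suite, crash), [])
        if test not in tests:
            tests.append(test)
    return [
        f"{suite} | {','.join(tests)} | {crash}"
        for (suite, crash), tests in grouped.items()
    ]


if __name__ == "__main__":
    for line in combine_summaries([
        "autophone-s1s2 | remote-nytimes | application crashed [@ libc.so + 0x49da0]",
        "autophone-s1s2 | remote-blank | application crashed [@ libc.so + 0x49da0]",
    ]):
        print(line)
```

Crashes with a different signature stay in separate summaries, so only failures with the same root-cause signature are folded together.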

Comment 2

Wes sounds like he's on to something. I must admit, I don't know the inner workings of the choices the autoclassifier makes very well. That's been James Graham's domain more than mine. n-i'ing James to see if he has any other suggestions that may help.
Flags: needinfo?(cdawson) → needinfo?(james)

Comment 3

Yes, what :KWierso says is right. The expectation is that there should be a good match between what's printed in the error summary and the bug suggestion. It seems like Autophone isn't meeting that expectation, so if it can change to do so, that's the best solution here.
Flags: needinfo?(james)
(Assignee)

Comment 4

24 days ago
Autophone is going away. Resolving these to wontfix.
Status: ASSIGNED → RESOLVED
Last Resolved: 24 days ago
Resolution: --- → WONTFIX