A log_eval_func should be able to see "WARNING: Unable to ping tegra after 5 attempts" and retrigger for us, so we don't have to see bug 781419 10 or 15 times a day.
Probably stuck behind a yak-shaving dependency chain. The reasonable place to stick the regex is in the broadly-named tegra_errors, but that includes a rather broad "Automation error: Error" -> FAILURE that might not go well with verify.py. Looking for things that might trip over it shows that bug 790613 is touching messages to stick Automation Error in front of them, including this one, but it's stuck behind bug 781341.
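For illustration, the idea of a log evaluator that maps a known log message to a retry can be sketched roughly as follows. This is a minimal, self-contained sketch and not the buildbotcustom code: the status constants are local stand-ins for buildbot's result codes, and `example_tegra_errors` and `example_log_eval` are hypothetical names. It also assumes that a numerically higher status is the worse one.

```python
import re

# Local stand-ins for buildbot result codes (assumed values, illustration only).
SUCCESS, WARNINGS, FAILURE, RETRY = 0, 1, 2, 5

# Hypothetical error table: (compiled pattern, status to report on a match).
# \d+ is used so the match does not depend on the exact attempt count.
example_tegra_errors = (
    (re.compile(r"WARNING: Unable to ping tegra after \d+ attempts"), RETRY),
)


def example_log_eval(rc, log_text, errors=example_tegra_errors):
    """Return the worst status implied by the return code and any matching pattern."""
    # Start from the plain exit-code result.
    worst = FAILURE if rc != 0 else SUCCESS
    for pattern, status in errors:
        # Only scan the log if a match could make the result worse
        # (assumption: a higher number means a worse status).
        if status > worst and pattern.search(log_text):
            worst = status
    return worst
```

With this sketch, a log containing the ping warning evaluates to RETRY whatever the step's exit code was, while a log with no match falls back to the exit-code result. A very broad pattern in the same table would override the exit code for every step that uses it, which is the worry about reusing tegra_errors with verify.py.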
Assignee: nobody → philringnalda
Status: NEW → ASSIGNED
Depends on: 790613
Priority: -- → P3
Created attachment 661543 [details] [diff] [review] quick and dirty
This would be prettier with the full message, and with full knowledge of which messages some future sut_tools will produce, so we would know whether it's safe to just add on to tegra_errors instead. But I noticed that this is actually number 5 on http://brasstacks.mozilla.com/orangefactor/?display=OrangeFactor so I'd rather retry now than retry more prettily at some random unknowable future time.
Comment on attachment 661543 [details] [diff] [review] quick and dirty
NOTE: This will only cover the unittest jobs; talos jobs would need this elsewhere. If we do the whole "Automation Error: Unable..." we can stick this in the generic tegra_errors, I think. I'll let ben address the choices in the real patch, with my statements here as a rough guide. Also, I'm inclined to deploy bug 790613 now if it would help this (it takes moments to deploy). It's not really blocked behind 781341 at all, since 790613 already landed and is relatively easy to deploy (just manual for now).
Attachment #661543 - Flags: feedback+
Yeah, I glossed over talos because it makes me stabby. It does addCleanupSteps(), ... run talos ... addCleanupSteps(), which means it runs verify.py twice, which makes us fail much more often in talos than we do in other suites, and makes me reluctant to set RETRY after a successful run. Not that I haven't done that before, mind you, but still, I'm reluctant to do it again.
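To put a rough number on why running verify.py twice hurts, here is a small sketch. The per-run failure rate is a made-up figure, not measured data, and the calculation assumes the two runs fail independently, which may not hold for real devices. `failure_chance` is a hypothetical helper name.

```python
def failure_chance(p_single, runs):
    """Chance that at least one of `runs` independent verify runs fails."""
    return 1 - (1 - p_single) ** runs


p = 0.02  # hypothetical per-run failure rate, for illustration only
once = failure_chance(p, 1)   # a suite that verifies once
twice = failure_chance(p, 2)  # talos: verify before and after the run
```

Under those assumptions, a job that verifies twice fails on verification almost twice as often as one that verifies once.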
Created attachment 661834 [details] [diff] [review] less dirty
This time reusing tegra_errors, by turning the existing entry from what looks like some generic thing catching automation errors into what it really is: a specific message (and the only existing message that starts its description of the automation error with the word "error") that we have to turn to FAILURE because it happens during the test step, where it otherwise winds up orange instead of red.
Attachment #661834 - Flags: review?(bhearsum) → review+
Comment on attachment 661834 [details] [diff] [review] less dirty
http://hg.mozilla.org/build/buildbotcustom/rev/d62f03c272db
Attachment #661834 - Flags: checked-in+
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → General
Product: Release Engineering → Release Engineering