Closed Bug 1076990 Opened 10 years ago Closed 10 years ago

update talos.json on tip to capture mainthreadio and other talos cleanup

Categories

(Testing :: Talos, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(firefox34 fixed, firefox35 fixed)

RESOLVED FIXED
mozilla35
Tracking Status
firefox34 --- fixed
firefox35 --- fixed

People

(Reporter: jmaher, Assigned: jmaher)

Attachments

(2 files, 3 obsolete files)

We have done a lot of Talos cleanup recently, including adding a feature for main-thread I/O detection.
Attached patch update_taloszip.patch (1.0) (obsolete) — Splinter Review
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #8499001 - Flags: review?(wlachance)
Comment on attachment 8499001 [details] [diff] [review]
update_taloszip.patch (1.0)

Review of attachment 8499001 [details] [diff] [review]:
-----------------------------------------------------------------

lgtm
Attachment #8499001 - Flags: review?(wlachance) → review+
No longer depends on: 1068989
thanks Ed!  I should have landed this faster.

Aaron, here are the failures we see:
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\yahoo.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.08601400000000001, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\twitter.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.079486, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\ebay.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.08025399999999999, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\amazondotcom.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.09023800000000001, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\bing.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.078718, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\google.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.079486, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\wikipedia.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.079486, 'RunCount': 2}

should we just add all of these to the whitelist?
Flags: needinfo?(aklotz)
Hmmm, those look like the failure you pinged me about yesterday during the perf testing meeting.

I'm inclined to answer, "yes," but give me a bit of time to confirm this.
Flags: needinfo?(aklotz)
Yeah, looks like it's a fallback path in the search service if async init hasn't completed in time, hence the intermittent failures. We'll need to whitelist everything in searchplugins.
Flags: needinfo?(jmaher)
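For readers following along, whitelisting "everything in searchplugins" amounts to a directory-prefix check rather than a per-file list. The sketch below is only an illustration of that idea; the names `WHITELISTED_DIRS` and `is_whitelisted` are hypothetical and are not taken from the Talos source.

```python
# Hypothetical sketch of a directory-prefix whitelist; not the Talos code.
# Paths use the tokenized Windows form seen in the log lines, e.g. {xre}\...
WHITELISTED_DIRS = [r"{xre}\browser\searchplugins"]


def is_whitelisted(path, whitelisted_dirs=WHITELISTED_DIRS):
    """Return True if `path` lives under one of the whitelisted directories."""
    normalized = path.lower().replace("/", "\\")
    for directory in whitelisted_dirs:
        # Append a separator so "searchplugins" does not match "searchpluginsfoo".
        prefix = directory.lower().replace("/", "\\").rstrip("\\") + "\\"
        if normalized.startswith(prefix):
            return True
    return False
```

With a check like this, any new search plugin file is covered without a further whitelist update, at the cost of no longer flagging unexpected reads anywhere in that directory.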
Attachment #8499741 - Flags: review?(aklotz)
Flags: needinfo?(jmaher)
Attachment #8499741 - Flags: review?(aklotz) → review+
landed the whitelist changes in talos, will update next week:
https://hg.mozilla.org/build/talos/rev/87f6d98a6a4a
updated and now we can run the same rev on android + desktop!

try run with lots of green:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=edec39112a25
Attachment #8499001 - Attachment is obsolete: true
Attachment #8500425 - Flags: review?(dminor)
Attachment #8500425 - Flags: review?(dminor) → review+
I had to back this out for semi-frequent Android crashes. Ben, looks like these might be PBackground related?
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f91f44a4e3e

https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2797530&repo=mozilla-inbound
Flags: needinfo?(bent.mozilla)
Did all the crashes look like this? My first read is memory corruption.
Flags: needinfo?(bent.mozilla)
All 3 crashes I've seen had the same signature, yes.
I will ignore android for now.  A lot of changes are taking place in talos, especially with e10s, so there will be work being done to ensure compatibility with android.
Attachment #8500425 - Attachment is obsolete: true
Attachment #8501023 - Flags: review?(dminor)
Attachment #8501023 - Flags: review?(dminor) → review+
I'm really sorry but I've had to revert this again for intermittent failures only a couple of pushes after this landed:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2821353&repo=mozilla-inbound

And more importantly, the failure output isn't in the standard TBPL-parsable format:
  Actual: 
TEST-UNEXPECTED-FAIL : mainthreadio: File '{appdata}\local\temp\tmpujwgcp' was accessed and we were not expecting it: {'Count': 6, 'Duration': 0.093696, 'RunCount': 6}
  Expected:
TEST-UNEXPECTED-FAIL | mainthreadio | File '{appdata}\local\temp\tmpujwgcp' was accessed and we were not expecting it: {'Count': 6, 'Duration': 0.093696, 'RunCount': 6}

(With the current format, TBPL/treeherder fall back to matching the full line, which is unlikely to match filed bugs, since the duration varies by a couple of thousandths of a second each time.)
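A minimal sketch of emitting the pipe-delimited form shown under "Expected" above. The helper names `format_failure` and `unexpected_file_message` are hypothetical; only the output layout is taken from this comment.

```python
# Hypothetical helpers; only the "STATUS | test | message" layout comes from
# the expected line quoted in this bug.
def unexpected_file_message(path, counters):
    """Build the message part for a file that was not on the whitelist."""
    return "File '%s' was accessed and we were not expecting it: %s" % (path, counters)


def format_failure(test_name, message):
    """Join status, test name and message with ' | ' separators."""
    return "TEST-UNEXPECTED-FAIL | %s | %s" % (test_name, message)


if __name__ == "__main__":
    counters = {"Count": 6, "Duration": 0.093696, "RunCount": 6}
    message = unexpected_file_message(r"{appdata}\local\temp\tmpujwgcp", counters)
    print(format_failure("mainthreadio", message))
```

Keeping the status, test name and message in separate fields lets a log parser work on the fields individually instead of falling back to the whole line.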

And also, whilst the commit message said desktop only, there was a crash of the same form as in comment 13:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2820555&repo=mozilla-inbound


remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9a8039fb5055
aklotz: and we are back again to the drawing board:
File '{appdata}\local\temp\tmpujwgcp' was accessed and we were not expecting it: {'Count': 6, 'Duration': 0.093696, 'RunCount': 6}

do you know what that file is?  it seems like we created a temp file in our profile directory (or it is the profile directory).  I can figure out a way to ignore this, but it only shows up sometimes.  I wanted to get your thoughts on this before hacking on it.
Flags: needinfo?(aklotz)
Ugh! Let me take a few minutes to do some digging....
Flags: needinfo?(aklotz)
I don't know what that file is. The obvious searches aren't turning up anything. :-(
it appears to be the root directory of the profile/ folder, I thought I had accounted for that, let me figure out a solution.
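One way to account for a randomly named profile directory is to replace it with a fixed token before the whitelist check, so that `tmpujwgcp` and any other temp name compare equal. This is a sketch under that assumption only; `normalize_profile_path` and the `{profile}` token are hypothetical and are not the fix that landed.

```python
# Hypothetical sketch: map the per-run profile directory to a stable token.
def normalize_profile_path(path, profile_dir, token="{profile}"):
    """Replace a leading `profile_dir` in `path` with `token`.

    Comparison is case-insensitive, as the log paths are Windows paths.
    Paths outside the profile are returned lowercased but otherwise unchanged.
    """
    norm_path = path.lower().rstrip("\\")
    norm_profile = profile_dir.lower().rstrip("\\")
    if norm_path == norm_profile:
        # The access was to the profile root itself.
        return token
    if norm_path.startswith(norm_profile + "\\"):
        # Keep the part of the path below the profile root.
        return token + norm_path[len(norm_profile):]
    return norm_path
```

The whitelist can then list `{profile}` once instead of trying to predict the temp directory name for each run.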
this includes fixes for mainthreadio and for osx.  I am not updating android as we will tackle that next week.
Attachment #8501023 - Attachment is obsolete: true
Attachment #8503340 - Flags: review?(avihpit)
Attachment #8503340 - Flags: review?(avihpit) → review+
Depends on: 1081354
https://hg.mozilla.org/mozilla-central/rev/1fb3f954952b
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Ryan, we need to uplift this to aurora so our osx results come in solid.  I have a try run:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=ac3a95ef987a

we hit the random issue on xperf as well; I have a fix in the works for that.  Do you have a problem with me landing this on Aurora this weekend prior to the uplift?
Flags: needinfo?(ryanvm)
Nope, go for it.
Flags: needinfo?(ryanvm)
hmm, I thought the hg checkin script did it automatically, apologies for not doing that.