Closed Bug 1076990 Opened 10 years ago Closed 10 years ago

update talos.json on tip to capture mainthreadio and other talos cleanup

Categories

(Testing :: Talos, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(firefox34 fixed, firefox35 fixed)

RESOLVED FIXED
mozilla35

People

(Reporter: jmaher, Assigned: jmaher)

Attachments

(2 files, 3 obsolete files)

we have had a lot of talos cleanup recently, including a new feature for mainthread IO detection.
Attached patch update_taloszip.patch (1.0) (obsolete) — Splinter Review
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #8499001 - Flags: review?(wlachance)
Comment on attachment 8499001
update_taloszip.patch (1.0)

Review of attachment 8499001:
-----------------------------------------------------------------

lgtm
Attachment #8499001 - Flags: review?(wlachance) → review+
No longer depends on: 1068989
thanks Ed!  I should have landed this faster.

Aaron, here are the failures we see:
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\yahoo.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.08601400000000001, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\twitter.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.079486, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\ebay.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.08025399999999999, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\amazondotcom.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.09023800000000001, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\bing.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.078718, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\google.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.079486, 'RunCount': 2}
13:05:59     INFO -  TEST-UNEXPECTED-FAIL : mainthreadio: File '{xre}\browser\searchplugins\wikipedia.xml' was accessed and we were not expecting it: {'Count': 2, 'Duration': 0.079486, 'RunCount': 2}

should we just add all of these to the whitelist?
Flags: needinfo?(aklotz)
Hmmm, those look like the failure you pinged me about yesterday during the perf testing meeting.

I'm inclined to answer, "yes," but give me a bit of time to confirm this.
Flags: needinfo?(aklotz)
Yeah, looks like it's a fallback path in the search service if async init hasn't completed in time, hence the intermittent failures. We'll need to whitelist everything in searchplugins.
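For illustration only, whitelisting everything under searchplugins could be done with a directory-prefix check rather than one entry per file. This is a rough sketch, not the actual talos change: the names WHITELIST_PREFIXES and is_whitelisted are hypothetical, and it assumes paths have already been rewritten to the '{xre}\...' form seen in the log above.

```python
# Hypothetical sketch: whitelist a whole directory by prefix instead of
# listing yahoo.xml, twitter.xml, ebay.xml, ... one by one.
# Assumes paths use the '{xre}\...' placeholder form from the log output.
WHITELIST_PREFIXES = [
    "{xre}\\browser\\searchplugins\\",
]


def is_whitelisted(path, prefixes=WHITELIST_PREFIXES):
    """Return True if path falls under any whitelisted directory prefix."""
    # Windows paths are case-insensitive, so compare in lower case.
    normalized = path.lower()
    return any(normalized.startswith(prefix.lower()) for prefix in prefixes)
```

With this approach a new search plugin shipped later would not trigger a fresh TEST-UNEXPECTED-FAIL, at the cost of no longer noticing unexpected reads inside that directory.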
Flags: needinfo?(jmaher)
Attachment #8499741 - Flags: review?(aklotz)
Flags: needinfo?(jmaher)
Attachment #8499741 - Flags: review?(aklotz) → review+
landed the whitelist changes in talos, will update next week:
https://hg.mozilla.org/build/talos/rev/87f6d98a6a4a
updated and now we can run the same rev on android + desktop!

try run with lots of green:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=edec39112a25
Attachment #8499001 - Attachment is obsolete: true
Attachment #8500425 - Flags: review?(dminor)
Attachment #8500425 - Flags: review?(dminor) → review+
I had to back this out for semi-frequent Android crashes. Ben, looks like these might be PBackground related?
https://hg.mozilla.org/integration/mozilla-inbound/rev/8f91f44a4e3e

https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2797530&repo=mozilla-inbound
Flags: needinfo?(bent.mozilla)
Did all the crashes look like this? My first read is memory corruption.
Flags: needinfo?(bent.mozilla)
All 3 crashes I've seen had the same signature, yes.
I will ignore android for now.  A lot of changes are taking place in talos, especially with e10s, so there will be work being done to ensure compatibility with android.
Attachment #8500425 - Attachment is obsolete: true
Attachment #8501023 - Flags: review?(dminor)
Attachment #8501023 - Flags: review?(dminor) → review+
I'm really sorry but I've had to revert this again for intermittent failures only a couple of pushes after this landed:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2821353&repo=mozilla-inbound

And more importantly, the failure output isn't in the standard TBPL-parsable format:
  Actual: 
TEST-UNEXPECTED-FAIL : mainthreadio: File '{appdata}\local\temp\tmpujwgcp' was accessed and we were not expecting it: {'Count': 6, 'Duration': 0.093696, 'RunCount': 6}
  Expected:
TEST-UNEXPECTED-FAIL | mainthreadio | File '{appdata}\local\temp\tmpujwgcp' was accessed and we were not expecting it: {'Count': 6, 'Duration': 0.093696, 'RunCount': 6}

(With the current format, TBPL/treeherder fall back to matching on the full line, which is unlikely to match against filed bugs, since the duration will vary by a few thousandths of a second each time.)
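As a rough sketch of the difference between the two formats above, the failure line can be built with ' | ' separators so the test name is a separate field. The helper names format_failure and parse_failure are made up for this example and are not taken from the talos or TBPL code.

```python
# Hypothetical helpers illustrating the pipe-delimited failure format.
PREFIX = "TEST-UNEXPECTED-FAIL"


def format_failure(test_name, message):
    """Build a failure line in the 'STATUS | test | message' form."""
    return "%s | %s | %s" % (PREFIX, test_name, message)


def parse_failure(line):
    """Split a failure line into (test_name, message).

    Returns None when the line is not in the pipe-delimited form,
    which is the case for the 'Actual' output quoted above.
    """
    parts = line.split(" | ", 2)
    if len(parts) != 3 or parts[0] != PREFIX:
        return None
    return parts[1], parts[2]
```

Running parse_failure on the "Actual" line returns None, which corresponds to the case where a parser would have nothing but the whole line to match on.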

And also, whilst the commit message said desktop only, there was a crash of the same form as in comment 13:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=2820555&repo=mozilla-inbound


remote:   https://hg.mozilla.org/integration/mozilla-inbound/rev/9a8039fb5055
aklotz: and we are back again to the drawing board:
File '{appdata}\local\temp\tmpujwgcp' was accessed and we were not expecting it: {'Count': 6, 'Duration': 0.093696, 'RunCount': 6}

do you know what that file is?  it seems like we created a temp file in our profile directory (or it is the profile directory).  I can figure out a way to ignore this, but it only shows up sometimes.  I wanted to get your thoughts on this before hacking on it.
Flags: needinfo?(aklotz)
Ugh! Let me take a few minutes to do some digging....
Flags: needinfo?(aklotz)
I don't know what that file is. The obvious searches aren't turning up anything. :-(
it appears to be the root directory of the profile/ folder. I thought I had accounted for that; let me figure out a solution.
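One possible way to account for this, sketched under assumptions: since the profile directory gets a random temp name on each run (tmpujwgcp here), the path could be rewritten to a stable placeholder before the whitelist comparison. The '{profile}' token and the normalize_path helper are hypothetical and may not match what the actual fix does.

```python
# Hypothetical sketch: replace the randomly named profile directory with a
# stable '{profile}' placeholder so whitelist entries match on every run.
def normalize_path(path, profile_dir):
    """Lower-case path and substitute the profile root with '{profile}'."""
    lowered = path.lower()
    profile = profile_dir.lower().rstrip("\\")
    if lowered == profile:
        # The access was to the profile root directory itself.
        return "{profile}"
    if lowered.startswith(profile + "\\"):
        # The access was to something inside the profile directory.
        return "{profile}" + lowered[len(profile):]
    return lowered
```

The first branch covers the case in this failure, where the accessed "file" is the profile root rather than something inside it.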
this includes fixes for mainthreadio and for osx.  I am not updating android as we will tackle that next week.
Attachment #8501023 - Attachment is obsolete: true
Attachment #8503340 - Flags: review?(avihpit)
Attachment #8503340 - Flags: review?(avihpit) → review+
Depends on: 1081354
https://hg.mozilla.org/mozilla-central/rev/1fb3f954952b
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla35
Ryan, we need to uplift this to aurora so our osx results come in solid.  I have a try run:
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=ac3a95ef987a

we hit the random issue on xperf as well; I have a fix in the works for that.  Do you have a problem with me landing this on Aurora this weekend prior to the uplift?
Flags: needinfo?(ryanvm)
Nope, go for it.
Flags: needinfo?(ryanvm)
hmm, I thought the hg checkin script did it automatically, apologies for not doing that.