Make talos force a cycle collection between each pageload

RESOLVED FIXED

Status

Release Engineering
General
RESOLVED FIXED
10 years ago
4 years ago

People

(Reporter: sicking, Assigned: sicking)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(1 attachment, 4 obsolete attachments)

Created attachment 288758 [details] [diff] [review]
Patch to fix

Here's a patch that forces a cycle collection between each pageload. This should give us more consistent numbers, and make sure that cycle-collection happens more 'normally' despite the fact that we're using uncommon usage behavior (i.e. hammering the browser really hard).

It outputs just the total time spent in cyclecollection, i.e. not per load or per site. I figured being as detailed as we are for the actual pageloads would be overengineering a bit, at least for now.

Not sure if there needs to be modifications done to the scripts that read the talos output? The new data is reported as an additional "_x_x_mozilla_cycle_collect" field.

The other question is, are we running scripts such that it's possible to request UniversalXPConnect? For now that is required in order to be able to call the cyclecollect API.
Attachment #288758 - Flags: superreview?
Attachment #288758 - Flags: review?
Comment on attachment 288758 [details] [diff] [review]
Patch to fix

Not sure who should review, alice, you touched this file last it looks like?
Attachment #288758 - Flags: superreview?(anodelman)
Attachment #288758 - Flags: superreview?
Attachment #288758 - Flags: review?(anodelman)
Attachment #288758 - Flags: review?
Oops, there's a missing ';' at the end of the UniversalXPConnect request. I've fixed that locally.
Unfortunately, you've made the changes on the framecycler (ie, tp2), all of the new machines are now using the pageloader (tp3).  This is found in the tree at mozilla/layout/tools/pageloader.

It is probably good to get both tp2 & tp3 to use the new cycle collection scheme, but I think that tp3 is the higher priority.

Also, I'd like to see the cycle collection still be optional.  Since it won't be available in old builds we won't be able to run this option when doing baselining runs going back months (and years).

We have been running the tests with universalxpconnect privileges to deal with problems quitting the application on macs - so that shouldn't be a problem.
How would you opt in/out of the cycle collection? Should I make it an argument to framecycler.html?
Created attachment 289104 [details] [diff] [review]
Patch v2

Here is the second attempt. It adds the new pagecycler and adds configuration parameters to both testsuites.

Questions:
Should I force CC by default or not? For now I made the default to force CC, I don't care much either way. It's easy to change in the patch.

Do we want to not report the force-cc time at all if we're not forcing CC? Not sure what is easiest to do on the receiving end of the data.
Attachment #288758 - Attachment is obsolete: true
Attachment #289104 - Flags: superreview?(anodelman)
Attachment #289104 - Flags: review?(anodelman)
Attachment #288758 - Flags: superreview?(anodelman)
Attachment #288758 - Flags: review?(anodelman)
Oh, one thing that we may or may not want to add. In Tp3, when using "js" output format, I couldn't figure out a good way to include the cycle collection time data. Not sure if it's important, but ideas would be welcome.
Comment on attachment 289104 [details] [diff] [review]
Patch v2

For the pageloader, in report.js looks to me like half of the cc report is in 'text' and the other half is in 'tinderbox' format - this will break how talos reads the data from the pageloader.  I'd want to cc report to be a separate part entirely, and would only be generated if we were forcing cc for the given test.
Attachment #289104 - Flags: review?(anodelman) → review-
Created attachment 289413 [details] [diff] [review]
Patch v3

This puts the reported cycle collection time outside of __start_tp_report/__end_tp_report as well as only outputs any CC data if forceCC is turned on.
Attachment #289104 - Attachment is obsolete: true
Attachment #289413 - Flags: superreview?(anodelman)
Attachment #289413 - Flags: review?(anodelman)
Attachment #289104 - Flags: superreview?(anodelman)
Comment on attachment 289413 [details] [diff] [review]
Patch v3

My tests of the pageloader were successful - could turn on/off cycle collection and it doesn't break the talos code.
Attachment #289413 - Flags: review?(anodelman) → review+
Attachment #289413 - Flags: superreview?(anodelman) → superreview+
We are waiting on this upon realizing that this will auto-turn this on by default, even on branch test machines that will be unable to force cycle collection.  Waiting on a patch that would simply ignore attempts to cycle collect on browsers where it wasn't supported.
Depends on: 399107
Created attachment 289560 [details] [diff] [review]
Patch v4

I'm checking this in. The only difference is that this'll check if .garbageCollect is available, and call Components.utils.forceGC() if not.
Attachment #289413 - Attachment is obsolete: true
Created attachment 289561 [details] [diff] [review]
Patch v4

Opps, this is the one.
Attachment #289560 - Attachment is obsolete: true
So I have already landed the "Patch v4" patch as it is basically the same as "Patch v3", just with an additional check to see if the force-CC function is there. If it is not there it will use another API that forces GC, which is the closes equivalent there is for branch.
Since this was checked in I haven't had any results reported from any of my branch machines.  Either a patch needs to be checked in to fix this or we need to back this out.
Can I see any output anywhere? I can guess at a few things failing, but the best thing would be if there was a log to look at. Otherwise I'll back out tomorrow morning :(
I'm not seeing any output either to the error console or dumped.  What I'm seeing on the affected machines is the test stalling out on the first page - it's not advancing to the next one.
So this baby is fixed! Looks like the Tp numbers are much less noisy which was what i was hoping for. The weird part is that the 1.8 number lookse more noisy, I have no clue why that would be the case since the patch doesn't change the 1.8 behavior at all.

Not sure what we want to do about the framecycler.html parts I'll attach what i've got, but it's not very well tested, and I don't know how important we think it is.
Blocks: 455755
No longer blocks: 455755
This one was fixed a long time ago
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Mass move of Core:Testing bugs to mozilla.org:Release Engineering:Talos. Filter on RelEngTalosMassMove to ignore.
Component: Testing → Release Engineering: Talos
Product: Core → mozilla.org
QA Contact: testing → release
Version: Trunk → other

Updated

9 years ago
Component: Release Engineering: Talos → Release Engineering
Product: mozilla.org → Release Engineering
You need to log in before you can comment on or make changes to this bug.