Closed Bug 936521 Opened 11 years ago Closed 11 years ago

talos tresize regression on November 6th for windows 8

Categories

(Testing :: Talos, defect)

x86_64
Windows 8
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: jmaher, Unassigned)

Details

(Keywords: perf, regression, Whiteboard: [talos_regression])

I found an alert in dev.tree-management noting that tresize has regressed about 10%:
https://groups.google.com/forum/#!topic/mozilla.dev.tree-management/oh59N-ddgUw

I followed up on this and looked at some graphs:
https://datazilla.mozilla.org/?start=1383321416&stop=1383926216&product=Firefox&repository=Mozilla-Inbound-Non-PGO&os=win&os_version=6.2.9200&test=tresize&graph_search=dfeee13c85fb,8222e9ae0a21,82ff69542540&tr_id=3481137&graph=tresize&project=talos

While the graph above points to :ejpbruel (bug 927116) as the culprit, the raw numbers don't support that. It looks more likely that :mattwoodrow (bug 934860) is the cause.

Tresize is defined as a test here:
http://hg.mozilla.org/build/talos/file/0987e4cbd219/talos/startup_test/tresize-test.html
Matt, can you take a look at your patches on bug 934860 and see if they would cause this regression?
Flags: needinfo?(matt.woodrow)
I just double-checked all the other platforms; this is a Windows 8 specific issue.
I think this likely is my change, yes.

Unfortunately I don't have a Windows 8 machine to test this on. Is it possible to get profiles generated from tresize runs on Windows 8?

I suspect it should be fairly easy to figure out from there.
Flags: needinfo?(matt.woodrow)
Profiles as in firefox profiles or SPS profiles?

We can get info from a test run, or we can get you a loaner machine (a fairly straightforward process these days).
Matt, I've just successfully gotten comparison profiles of tresize. I'll try to get one for this.
Awesome, thanks Jeff.
The biggest difference that I can see is that we spend 3-4x as long in DrawTargetD2D::Flush when painting the ThebesLayer for the content area of the page. Time spent painting chrome is about the same.

Playing with the same page locally gives me about 3 or 4 invalidation rects for the content area, so it would appear the cost is relative to the number of rects we paint, for this example at least.

The content area is very simple, we only have an nsDisplayBackgroundColor, nsDisplayBorder (painting a solid color), and nsDisplayText.

Bas: Any idea why this might be? And in particular, why does it only affect Windows 8?
Flags: needinfo?(bas)
Are we flushing more often? i.e. Once per rect instead of once per the whole region. Also why do we have multiple invalidation rects for tresize? Shouldn't we be painting everything at once?
Flags: needinfo?(bas)
No, we're only flushing after we return from DrawThebesLayer, so this should be the same.

The content area ThebesLayer is mainly covered with a solid color background, and DLBI (display-list-based invalidation) only invalidates the newly exposed areas of it.
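To illustrate why growing a window can yield several small invalidation rects rather than one full repaint, here is a minimal sketch of the "newly exposed area" idea. It is a simplified geometric model, not Gecko's DLBI code: the `Rect` type, the `newly_exposed` helper and the window sizes are all hypothetical, and real invalidation would also cover items such as the border and text that change with the size.

```python
from typing import List, NamedTuple


class Rect(NamedTuple):
    x: int
    y: int
    width: int
    height: int


def newly_exposed(old_w: int, old_h: int, new_w: int, new_h: int) -> List[Rect]:
    """Return non-overlapping rects covering the area that is inside the
    new window size but was outside the old one (hypothetical model)."""
    rects = []
    if new_w > old_w:
        # Strip along the right edge, spanning the full new height.
        rects.append(Rect(old_w, 0, new_w - old_w, new_h))
    if new_h > old_h:
        # Strip along the bottom edge, limited to the old width so it
        # does not overlap the right-hand strip.
        rects.append(Rect(0, old_h, min(old_w, new_w), new_h - old_h))
    return rects


# Hypothetical resize step: 300x300 -> 310x310.
for r in newly_exposed(300, 300, 310, 310):
    print(r)
```

In this model only the two thin strips are repainted, so the per-rect overhead (for example a flush) can dominate the actual fill cost.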
Something has fixed tresize: it has been getting better and better over the last couple of weeks. Shall we mark this as fixed?
(In reply to Joel Maher (:jmaher) from comment #12)
> Something has fixed tresize: it has been getting better and better over the
> last couple of weeks. Shall we mark this as fixed?

There are two regressions in the graph linked from the dev.tree-management post, http://graphs.mozilla.org/graph.html#tests=[[254,63,31]]&sel=none&displayrange=30&datatype=running:

- Nov 1st: from 14.9ms to 15.1ms + increased noise.
- Nov 6th: from 15.1ms to 16.5ms (~10%).
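
As a quick check of the two step sizes listed above, a small sketch that computes the relative change from the quoted values (the `percent_change` helper is only illustrative, and the inputs are the rounded numbers read off the graph):

```python
def percent_change(before_ms: float, after_ms: float) -> float:
    """Relative change from before_ms to after_ms, in percent."""
    return (after_ms - before_ms) / before_ms * 100.0


steps = [("Nov 1", 14.9, 15.1), ("Nov 6", 15.1, 16.5)]
for label, before, after in steps:
    change = percent_change(before, after)
    print(f"{label}: {before} ms -> {after} ms ({change:+.1f}%)")
# Nov 1: 14.9 ms -> 15.1 ms (+1.3%)
# Nov 6: 15.1 ms -> 16.5 ms (+9.3%)
```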

I think this specific bug is for the Nov 6th regression (looking at the dates on the list message changesets), which is _apparently_ fixed on Nov 17 (by just looking at the graph it looks like the Nov 6 regression was reverted on Nov 17).

Then there was another improvement on Nov 18 to 12.9ms (apparently still with the increased noise) and yet another on Nov 21 to 12.3ms (hard to tell whether the noise level is still high because there are not enough data points yet).

Assuming that we're only discussing the Nov 6 regression here, to truly close this bug, I think someone would have to look at the changesets from the original regression message: https://groups.google.com/forum/#!topic/mozilla.dev.tree-management/oh59N-ddgUw , understand which changeset caused it, and then confirm that this change got negated/reverted/fixed (WRT tresize) on Nov 17, at one of the changesets here: https://groups.google.com/forum/#!searchin/mozilla.dev.tree-management/tresize$20and$20winnt$20and$206.2|sort:date/mozilla.dev.tree-management/Jd1Ety5H6h4/J6yEym-lcioJ
Flags: needinfo?(matt.woodrow)
Yes, that patch got disabled, so it's expected that the regression would have gone away.
Flags: needinfo?(matt.woodrow)
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED