Closed Bug 1102479 Opened 5 years ago Closed 5 years ago

convert talos tresize to e10s

Categories

(Testing :: Talos, defect)

defect
Not set

Tracking

(e10s+)

RESOLVED FIXED
Tracking Status
e10s + ---

People

(Reporter: jmaher, Assigned: jmaher)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 5 obsolete files)

we need to rewrite talos tresize test to work smoother with e10s.  In fact we could make sure this test works well with/without e10s by having the window resizing take place in the chrome space.
let me push to try, this works great testing locally.
Assignee: nobody → jmaher
Status: NEW → ASSIGNED
Attachment #8526302 - Flags: review?(wlachance)
Attachment #8526302 - Flags: review?(avihpit)
Comment on attachment 8526302 [details] [diff] [review]
convert tresize to be e10s compatible (1.0)

Review of attachment 8526302 [details] [diff] [review]:
-----------------------------------------------------------------

::: talos/startup_test/tresize-test.html
@@ +10,5 @@
> +// Limited to one argument (data), which is enough for TART.
> +// doneCallback will be called once done and, if applicable, with the result as argument.
> +// Execution might finish quickly (e.g. when setting prefs) or
> +// take a while (e.g. when triggering the test run)
> +function chromeExec(commandName) {

IMO either leave the generic code from TART and use it without further modification (except for the message prefixes), or modify it exactly for your needs.

As it stands, you copied the code from TART and modified it but left several parts of it unused and others you hardcoded for your needs.

E.g. The generic code has data argument, so if you removed it as a formal argument, there's no need to send a data property with the event.

OTOH, you removed the callback argument and instead called logResults hardcoded from the framescript.

Instead, you could have left the original chromeExec code and the framescript code as is (they were designed to be generic, sans the message prefixes), and invoke it as:
function runTest() {
  chromeExec("runTest", null, logResults);
}

@@ +37,5 @@
>  
> +  window.setTimeout(function() {
> +    goQuitApplication();
> +    window.close();
> +  }, 0);

The original code quit only if closeOnFinished==true. I'm guessing that setting it to false would have made it easier to run tresize outside of talos without the browser closing after each run.

It's probably still useful to be able to run tresize outside of talos, so maybe add closeOnFinished at this file and use it the same as the original code? (and try it outside of talos to see how it looks and works).

::: talos/startup_test/tresize/addon/content/Profiler.js
@@ +4,5 @@
> +
> +// - NOTE: This file is duplicated verbatim at:
> +//         - talos/scripts/Profiler.js
> +//         - talos/pageloader/chrome/Profiler.js
> +//         - talos/page_load_test/tart/addon/content/Profiler.js

If you've made another copy of Profiler.js, you should probably add the instance at this comment and duplicate the comment to all the instances.

OTOH, do you really need another copy of it? TART needs to start/stop the profiler because not all the test duration is actual testing (it has warmup animations, wait periods, etc, which we don't want to profile), but tresize should be profiled from start to finish, which AFAIK the pageloader addon can already do - isn't it what the talos/pageloader/chrome/Profiler.js instance does?

So I think you should remove Profiler.js and the calls to start/stop profiling from the tresize addon, and instead make sure that the pageloader addon can profile tresize from start to finish.

::: talos/startup_test/tresize/addon/content/framescript.js
@@ +6,5 @@
> +
> +    addMessageListener(TRESIZE_PREFIX + "chrome-exec-reply", function done(reply) {
> +      if (reply.data.id == uniqueMessageId) {
> +        removeMessageListener(TRESIZE_PREFIX + "chrome-exec-reply", done);
> +        content.wrappedJSObject.logResults(reply.data.result);

That's the hardcoded call to logResults. You could leave the generic original code from TART and instead just pass logResults as the callback argument when calling chromeExe.

::: talos/startup_test/tresize/addon/content/tresize.js
@@ +12,5 @@
> +var count = 0;
> +var max = 300;
> +
> +var dumpDataSet = false;
> +var closeOnFinished = true;

closeOnFinished is unused here. You probably want it at tresize-test.html

@@ +19,5 @@
> +function painted() {
> +  window.gBrowser.tabContainer.removeEventListener("MozAfterPaint", painted, true);
> +  var paintTime = window.performance.now();
> +
> +  Profiler.pause("resize " + count);

No need for Profiler.pause and Profiler.resume since nothing really happens and no time passes between the calls to pause and resume. tresize should be profiled straight from start to finish.

@@ +21,5 @@
> +  var paintTime = window.performance.now();
> +
> +  Profiler.pause("resize " + count);
> +  dataSet[count].end = paintTime;
> +  setTimeout(resizeCompleted, 20);

I don't understand this 20ms timeout. I see that the original test also has it, but since I never analyzed the original tresize, I'm not sure why it's here.

I think tresize should and could be much simpler than it is now. Right now it resizes the window several times, measure each of those resizes duration using mozAfterPaint, and then averages them as its final result.

It should probably be modified to just resize as fast as possible (probably iterate using requestAnimationFrame and with ASAP mode) without any artificial timeouts, and then just divide the overall duration by the number of resize steps to arrive at its final result.

The main concern I have with the current/original approach is that mozAfterPaint is much more complex with e10s, and it's hard for me to evaluate if waiting for it as the sole indication that the resize is done is what we need.

OTOH, iterating as fast as possible with requestAnimationFrame just removes mozAfterPaint from the equation, and just measures.. well.. how fast we can iterate resizing, which is what we wanted to measure in the first place IMO.

@@ +30,5 @@
> +    dims += increment;
> +    Profiler.resume("resize " + count);
> +    var startTime = window.performance.now();
> +    dataSet[count] = {};
> +    dataSet[count].start = startTime;

dataSet[count] = {start: window.performance.now()}
?

@@ +79,5 @@
> +  try {
> +    window.moveTo(10,10);
> +    window.resizeTo(dims,dims);
> +    resizeTest();
> +  } catch(ex) { dump(ex + '\n'); }

So what happens if there's an exception? you just dump it and let talos terminate the test after some minutes of timeout?
Attachment #8526302 - Flags: review?(avihpit) → review-
Avi, thanks for the review, this is very useful.  I have worked on this and simplified the system greatly.  I have also added the ability for it to run locally.

A few questions remain:
1) running locally works great if I put tresize-test.html into the addon, otherwise I need to set 'security.turn_off_all_security_so_that_viruses_can_take_over_this_computer' in user.js.  The problem here is I cannot launch firefox from the commandline and access this, if I use:
firefox -profile /tmp/talosprofiledir --chrome chrome://tresize/content/tresize-test.html

the browser loads, but I get a chrome window, not a firefox window.

:avih -  We need to figure out a way to do this, maybe include tresize.html in the addon as a file we don't normally use, but just calls the runTest() method inside the chrome context?

2) MozAfterPaint, setTimeout(..., 20), and ASAP mode.  What are we testing here.  I can easily remove the setTimeout parameter of 20 and it works, likewise I can remove the mozAfterPaint logic.  Is that what we want to do?  The test doesn't have to stay the same, so I would rather make sure this is useful.  Especially now that the browser has changed to e10s, etc.

:jimm - what should we be measuring here?  could you provide some context to the reasons for mozafterpaint and the 20ms sleep in settimeout?  Also your perspective from e10s would be beneficial.
Flags: needinfo?(jmathies)
Flags: needinfo?(avihpit)
Comment on attachment 8526302 [details] [diff] [review]
convert tresize to be e10s compatible (1.0)

Review of attachment 8526302 [details] [diff] [review]:
-----------------------------------------------------------------

::: talos/startup_test/tresize/addon/content/tresize.js
@@ +21,5 @@
> +  var paintTime = window.performance.now();
> +
> +  Profiler.pause("resize " + count);
> +  dataSet[count].end = paintTime;
> +  setTimeout(resizeCompleted, 20);

The original delay I think was just to let the browser sit for a sec. I doubt it was critical to test operation. 

> The main concern I have with the current/original approach is that mozAfterPaint is much more complex with e10s, and it's hard for me to evaluate if waiting for it as the sole indication that the resize is done is what we need.

As long as the changes you request here still measure the time from window resize -> fully composed chrome window with bits on the screen. If you need to know about MozAfterPaint/MozRemoteAfterPaint and e10s, ask handyman in #e10s (dparks), he's our gfx guy and understands these events better than I do.
Flags: needinfo?(jmathies)
Flags: needinfo?(avihpit) → needinfo?(matt.woodrow)
(In reply to Joel Maher (:jmaher) from comment #3)
> :avih -  We need to figure out a way to do this, maybe include tresize.html
> in the addon as a file we don't normally use, but just calls the runTest()
> method inside the chrome context?

As discussed on IRC, let's make the html file minimal such that it only posts the event which starts the test, and then everything happens within the addon. Then we can duplicate this html file such that we could also launch it from CLI.

Or, as you suggested, launch the file as $talos/pageloadtest/tresize/addon/content/...

> 2) MozAfterPaint, setTimeout(..., 20), and ASAP mode.  What are we testing
> here.  I can easily remove the setTimeout parameter of 20 and it works,
> likewise I can remove the mozAfterPaint logic.  Is that what we want to do? 
> The test doesn't have to stay the same, so I would rather make sure this is
> useful.  Especially now that the browser has changed to e10s, etc.
> 
> :jimm - what should we be measuring here?  could you provide some context to
> the reasons for mozafterpaint and the 20ms sleep in settimeout?  Also your
> perspective from e10s would be beneficial.

The original tresize test without e10s measured 2 things in one:
1. How quickly the browser can be resized.
2. How quickly the content can be rendered after resize.

1 and 2 were linked. I.e. the browser frame could not be repainted at its new size before the content was reflowed and repainted.

But with e10s it could be possible to iterate the frame resizing quickly (e.g. while dragging the window corner with the mouse), without the content fully reflowed and repainted on each frame resize. E.g. when enlarging the window, it might fill the new areas in white without actually reflowing the content.

So it might not be possible to measure the same thing as the old tresize did, simply because e10s changes the way it works.

So maybe we could try to measure both.

But I'm not 100% sure how mozAfterPaint works with e10s.

Matt, any ideas here?
From IRC:

<avih> so the white is at the parent. i thought it's at the child
<avih> in that case, we could measure both the parent resize speed, and the child's MozAfterRemotePaint. the former would measure responsiveness to manual resizing, and the latter will measure how quickkly gecko can reflow it.
<avih> mattwoodrow: sounds about right? ^
<mattwoodrow> avih: yep, sounds good

Joel, so you could wait for both MozAfterPaint and MozAfterRemotePaint, measure both, and start the next iteration only after you got both events.

So instead of measuring one value per iteration, we'll measure two. When all is done, we can average all the collected values, or, if you feel motivated, submit the result as 2 "pages" - one for each average of each event type.
Flags: needinfo?(matt.woodrow)
(In reply to Avi Halachmi (:avih) from comment #5)
> The original tresize test without e10s measured 2 things in one:
> 1. How quickly the browser can be resized.
> 2. How quickly the content can be rendered after resize.

The original reason for tresize was to measure #1, I don't see much value in #2. With e10s, this has become even more important since we currently totally suck at getting content painted after a chrome window resize. :)

> Joel, so you could wait for both MozAfterPaint and MozAfterRemotePaint,
> measure both, and start the next iteration only after you got both events.
> 
> So instead of measuring one value per iteration, we'll measure two. When all
> is done, we can average all the collected values, or, if you feel motivated,
> submit the result as 2 "pages" - one for each average of each event type.

sgtm!
(In reply to Jim Mathies [:jimm] from comment #7)
> (In reply to Avi Halachmi (:avih) from comment #5)
> > The original tresize test without e10s measured 2 things in one:
> > 1. How quickly the browser can be resized.
> > 2. How quickly the content can be rendered after resize.
> 
> The original reason for tresize was to measure #1, I don't see much value in
> #2. With e10s, this has become even more important since we currently
> totally suck at getting content painted after a chrome window resize. :)

Ugh, reverse that, change it! The original reason for tresize was to measure #2, I don't see much value in #1.
(In reply to Jim Mathies [:jimm] from comment #8)
> ...
> > The original reason for tresize was to measure #1, I don't see much value in
> > #2. With e10s, this has become even more important since we currently
> > totally suck at getting content painted after a chrome window resize. :)
> 
> Ugh, reverse that, change it! The original reason for tresize was to measure
> #2, I don't see much value in #1.

Actually...

Since the current tresize has empty content, and since in e10s the white background is painted at the parent process anyway, maybe we could do just MozAfterPaint at the parent process (i.e. what your last patch does - same as the old tresize did), which will measure the reflow speed of the browser chrome and the rendering of the empty white content - just what it measures with the old tresize.

So, no need to address the MAP/rAF/etc issue - leave this part as is.

But do remove the 20ms timeout. There's no need for it.
(In reply to Jim Mathies [:jimm] from comment #8)
> Ugh, reverse that, change it! The original reason for tresize was to measure
> #2, I don't see much value in #1.

The value in #1 IMO is responsiveness. There's value when the browser frame responds instantly to mouse drag of the window border even if the content reflows asynchronously. It'd certainly feel much better than the border resizing following the content reflow speed (I'm sure you know how much not fun it is when it happens with complex content).

Of course, quick resizing together with quick content reflow is the best combo ;)
(In reply to Avi Halachmi (:avih) from comment #10)
> (In reply to Jim Mathies [:jimm] from comment #8)
> > Ugh, reverse that, change it! The original reason for tresize was to measure
> > #2, I don't see much value in #1.
> 
> The value in #1 IMO is responsiveness. There's value when the browser frame
> responds instantly to mouse drag of the window border even if the content
> reflows asynchronously. It'd certainly feel much better than the border
> resizing following the content reflow speed (I'm sure you know how much not
> fun it is when it happens with complex content).

I guess the reason I don't see this as such an issue is that we have never had problems with this, at least on windows from my experience. Although I guess you're right, a test for this to pick up regressions would be nice.

> Of course, quick resizing together with quick content reflow is the best
> combo ;)

Repainting content is really important to the e10s team since we have issues there. You can take e10s for a spin on a complex MDN page for a good example as to why -

https://developer.mozilla.org/en-US/docs/Mozilla/QA/Automated_testing

Note that scrollbars and other content chrome painted with content lag when sizing out. We would love to have this delay in a talos metric so we can work on improving it, and prevent regressions.
(In reply to Jim Mathies [:jimm] from comment #11)
> I guess the reason I don't see this as such an issue is that we have never
> had problems with this, at least on windows from my experience. Although I
> guess you're right, a test for this to pick up regressions would be nice.

While maybe not a huge issue, complex content on low-end-ish systems (be those windows or few years old Macbook Air) does result in slow resize, which is a sub optimal experience.

> Repainting content is really important to the e10s team since we have issues
> there. ...

I absolutely don't disagree. Content reflow is speed is very important. That being said, the original tresize didn't measure it, since it didn't really have any content to speak of.

We could enhance tresize to add some content and then do the resize test, possibly as I suggested in comment 6 - to measure both frame resize speed and content reflow speed. But this might be worth a new bug. This bug is about making sure that tresize works reasonably well in e10s.
(In reply to Jim Mathies [:jimm] from comment #11)
> Repainting content is really important to the e10s team since we have issues
> there. You can take e10s for a spin on a complex MDN page for a good example
> as to why -
> 
> https://developer.mozilla.org/en-US/docs/Mozilla/QA/Automated_testing
> 
> Note that scrollbars and other content chrome painted with content lag when
> sizing out. We would love to have this delay in a talos metric so we can
> work on improving it, and prevent regressions.

Yeah, doesn't look nice.

If this issue is considered with more weight than browser frame resizing responsiveness, maybe they can make resizing synchronous, i.e. wait for the content to reflow and render before rendering the parent, which would bring resizing behavior to be like non-e10s.

It's a judgment call IMO (not sure which side I'd take), though will probably require some work to follow it.
Thanks for the great chatter in this bug about what we should measure.

I have updated the patch to do:
* simplified chrome message logic- no hacked up template
* support for running standalone as an installed xpi
* tested in e10s, non e10s, and standalone
* removed the setTimeout(..., 20), made it a 0
* kept the mozAfterPaint, didn't add mozAfterRemotePaint

In general, this patch coverts what we have with some cleanup.  It was not 100% clear if we should be doing anything with MAP or MARP, so I left that alone.  As the numbers will be changing as a result of this, I would prefer that we change the test as well;  We can always adjust the test in a followup bug.
Attachment #8526302 - Attachment is obsolete: true
Attachment #8527168 - Flags: review?(wlachance)
Attachment #8527168 - Flags: review?(avihpit)
(In reply to Joel Maher (:jmaher) from comment #14)
> ...
> I have updated the patch to do:
> * simplified chrome message logic- no hacked up template
> * support for running standalone as an installed xpi
> * tested in e10s, non e10s, and standalone
> * removed the setTimeout(..., 20), made it a 0
> * kept the mozAfterPaint, didn't add mozAfterRemotePaint
> 
> In general, this patch coverts what we have with some cleanup.  It was not
> 100% clear if we should be doing anything with MAP or MARP, so I left that
> alone.

Sounds great. Thanks.

> As the numbers will be changing as a result of this, I would prefer
> that we change the test as well;  We can always adjust the test in a
> followup bug.

Did you check locally how much the numbers changed, if at all? I'd actually imagine they might not change too much.

Also, I think it might be useful to leave the test name as is, such that we can compare the numbers before/after moving to e10s on the same graph, but I don't feel very strongly about it. Your call.

I'd appreciate however if you could split the patch to two to (wth.. 222??!?) make it easier to review:
1. The test change itself.
2. The xpigen thingy.

Otherwise, the xpigen thingy adds huge noise to the patch...
Comment on attachment 8527168 [details] [diff] [review]
convert tresize to be e10s compatible (2.0)

Review of attachment 8527168 [details] [diff] [review]:
-----------------------------------------------------------------

I don't feel totally qualified to review this, so I'm just going to f+ it and leave the final decision with Avi. The approach as I understand it makes sense to me.

::: talos/startup_test/tresize/addon/content/tresize-test.html
@@ +15,5 @@
> +}
> +
> +function logResults(data) {
> +  try {
> +    for (var line in data) {

Could be expressed somewhat more elegantly as:

data.forEach(function(line) {
  dumpLog(data[line]);
});

::: talos/startup_test/tresize/addon/content/tresize.js
@@ +7,5 @@
> + */ 
> +
> +var dataSet = new Array();
> +var dims = 425;
> +var increment = 2;

Could we use something more descriptive than "dims" and "increment"? I'd suggest: windowSize and resizeIncrement

@@ +8,5 @@
> +
> +var dataSet = new Array();
> +var dims = 425;
> +var increment = 2;
> +var count = 0;

I guess I wonder a bit why you're using a count variable as opposed to just appending to the array and tracking its length, but I guess it doesn't matter that much.

@@ +35,5 @@
> +
> +function testCompleted() {
> +  try {
> +    var total = 0;
> +    diffs = [];

var diffs = []

@@ +42,5 @@
> +      total += diff;
> +      diffs.push(diff);
> +    }
> +    var average = (total/count);
> +    retVal = [];

var retVal = []
Attachment #8527168 - Flags: review?(wlachance) → feedback+
updated patch with feedback from :wlach, this is split out into just the tresize modifications.
Attachment #8527168 - Attachment is obsolete: true
Attachment #8527168 - Flags: review?(avihpit)
Attachment #8527297 - Flags: review?(avihpit)
additional patch to make tresize generate a .xpi and with some instructions to install and run it.
Attachment #8527298 - Flags: review?(avihpit)
Comment on attachment 8527298 [details] [diff] [review]
make tresize work as a standalone test (1.0)

Review of attachment 8527298 [details] [diff] [review]:
-----------------------------------------------------------------

xpigen.js and jszip.min.js are missing from this patch.

::: talos/page_load_test/tart/generate-xpi.html
@@ +7,5 @@
>  <head>
>    <meta charset="UTF-8"/>
>    <title>TART/CART addon xpi generator</title>
> +  <script src="../../scripts/jszip.min.js"></script>
> +  <script src="../../scripts/xpigen.js"></script>

This will not work when opening the file directly (as file:// ...), because it'll generate a security error when generate-xpi.html tries to load scripts from outside its containing folder.

Maybe move the two files as generate-tart-xpi.html and generate-tresize-xpi.html to the talos folder (where test.py is) which should work since 'scripts', 'tart' and 'tresize' folders are within it.

(And later we'll also need update the link at https://wiki.mozilla.org/Buildbot/Talos/Tests#TART.2FCART )

::: talos/startup_test/tresize/generate-xpi.html
@@ +7,5 @@
> +<head>
> +  <meta charset="UTF-8"/>
> +  <title>Tresize addon xpi generator</title>
> +  <script src="../../scripts/jszip.min.js"></script>
> +  <script src="../../scripts/xpigen.js"></script>

Same. Will not work when loaded as file:// ... .
Attachment #8527298 - Flags: review?(avihpit) → review-
Comment on attachment 8527297 [details] [diff] [review]
convert tresize to be e10s compatible (2.1)

Review of attachment 8527297 [details] [diff] [review]:
-----------------------------------------------------------------

::: talos/startup_test/tresize/addon/content/tresize-test.html
@@ +4,5 @@
> +<html>
> +<head>
> +<script language="Javascript" type="text/javascript" src="../../../../scripts/MozillaFileLogger.js"></script>
> +<script language="Javascript" type="text/javascript" src="../../../../scripts/Profiler.js"></script>
> +<script language="JavaScript" type="text/javascript" src="../../../../page_load_test/quit.js"></script>

When loading this file within the addon, its URI would be chrome://tresize/content/tresize-test.html . How can it reference these js utilities which are clearly not accessible from this chrome URI and are not part of the addon?

Also, you removed the profiler start/resume from tresize, but left it at the includes.

Also, so now tresize can't get profiled? In one of my earlier comments I mentioned that the pageloader addon already activates the profiler accordingly, but I failed to recall that tresize is not a pageload test, therefore it would need to activate the profiler itself if we want to allow profiling.

Which brings me to my next point:
Maybe we could just make tresize a pageload test? the pageloader already has logs and profiler etc and knows to quit the browser when done, so tresize will not need to take care of those, and all tresize will need to do with the result is call tpRecordTime, instead of formatting the output on its own. I think it could save many lines of code.

One thing it might need to do though if changing it to a pageload test, is restore the windows position and dimensions to how they were before the test started, since the pageloader addon doesn't expect that the test changes those.

Also, from looking at this patch, it's not clear to me how would a user run this test as stand alone, and what would be the requirements for it. A brief explanation would be appreciated.

::: talos/startup_test/tresize/addon/content/tresize.js
@@ +3,5 @@
> + * test opens a single window positioned at 10,10 and sized to 425,425,
> + * then resizes the window outward |max| times measuring the amount of
> + * time it takes to repaint each resize. Dumps the resulting dataset
> + * and average to stdout or logfile.
> + */ 

nit - there's space after '*/'.

@@ +26,5 @@
> +function resizeTest() {
> +  try {
> +    windowSize += resizeIncrement;
> +    dataSet[count] = {'start': window.performance.now()};
> +    window.gBrowser.tabContainer.addEventListener("MozAfterPaint", painted, true);

Adding the event listener shouldn't be measured, but it is with this code (even if it's supposedly quick). We can move this one line up.

@@ +28,5 @@
> +    windowSize += resizeIncrement;
> +    dataSet[count] = {'start': window.performance.now()};
> +    window.gBrowser.tabContainer.addEventListener("MozAfterPaint", painted, true);
> +    window.resizeTo(windowSize,windowSize);
> +    painted();

You already added the listener to invoke 'painted' for MozAfterPaint. Why do you invoke it directly? This means that for each resize event 'painted' will be invoked twice. Is that your intention? If yes, why?

::: talos/startup_test/tresize/addon/content/tresize.overlay.xul
@@ +3,5 @@
> +
> +<script type="application/x-javascript" src="tresize.js" />
> +<script type="application/x-javascript">
> +(function(){
> +})();

This does nothing. The original had all of this script within a function scope to avoid pollution of the global window. You should move the "})();" part to just above the </script> closing tag.
Attachment #8527297 - Flags: review?(avihpit) → review-
updated patch to:
* include profiler.js (tested locally from a talos run)
* fix nit's mentioned in review

this is still a startup_test; right now with the way pageloader tests are defined we need 2 data points to be returned so that we can throw away the highest value.  We are planning to switch that next month which if this was a month later it wouldn't be a problem.

Either way, I think this is a fairly straightforward approach.
Attachment #8527297 - Attachment is obsolete: true
Attachment #8528606 - Flags: review?(avihpit)
took your advice here and moved generate-xpi.html -> talos/ directory.  Now if you click here:
http://hg.mozilla.org/users/jmaher_mozilla.com/tpain/raw-file/9a41f427ba59/talos/generate-tresize-xpi.html

you will be able to generate a .xpi, install it and get a tresize number!  Likewise for generate-tart-xpi.html :)
Attachment #8527298 - Attachment is obsolete: true
Attachment #8528607 - Flags: review?(avihpit)
Comment on attachment 8528606 [details] [diff] [review]
convert tresize to be e10s compatible (2.2)

Review of attachment 8528606 [details] [diff] [review]:
-----------------------------------------------------------------

It's not far off. Missing:
- New GUID for the addon.
- Remove Profiler.js from the addon and use it like quit.js
- Small cleanups if required (exceptions, unreachable code).
- Short description of how it should be used as "stand alone".
- The numbers should be similar to the old tresize, so verify this.

r- because I want to look at the final patch before landing it.

::: talos/startup_test/tresize/addon/content/tresize-test.html
@@ +3,5 @@
> +   - You can obtain one at http://mozilla.org/MPL/2.0/.  -->
> +<html>
> +<head>
> +<script language="Javascript" type="text/javascript" src="../../../../scripts/MozillaFileLogger.js"></script>
> +<script language="JavaScript" type="text/javascript" src="../../../../page_load_test/quit.js"></script>

If the requirement for "stand alone" is to clone talos first, and the page to use in stand alone is $talos/startup_test/tresize/... rather than a page within the addon as a chrome:// URI, and it already uses MozillaFileLogger.js and quit.js from outside the addon, then for consistency (and smaller patch), you should use Profiler.js the same way. I.e. as another include here, and not as part of the addon.

Unless I'm missing something and it's needed as part of the addon for some cases?

@@ +20,5 @@
> +    });
> +
> +    window.setTimeout(function() {
> +      goQuitApplication();
> +      window.close();

Is window.close() required or even reached after you call goQuitApplication() ?

@@ +22,5 @@
> +    window.setTimeout(function() {
> +      goQuitApplication();
> +      window.close();
> +    }, 0);
> +  } catch (ex) {

Does dumpLog throws the exception? at what scenario? And if you only need to test if dumpLog is available, wouldn't |if (dumpLog){...} else alert(data);| be cleaner?

::: talos/startup_test/tresize/addon/content/tresize.js
@@ +68,5 @@
> +  try {
> +    window.moveTo(10,10);
> +    window.resizeTo(windowSize,windowSize);
> +    resizeTest();
> +  } catch(ex) { finish([ex + '\n']); }

Why is this inside a try block? I.e. could it ever throw an exception?

::: talos/startup_test/tresize/addon/install.rdf
@@ +6,5 @@
> +<em:version>1.0.1</em:version>
> +
> +<em:targetApplication>
> +    <Description>
> +        <em:id>{ec8030f7-c20a-464f-9b0e-13a3a9e97384}</em:id>

This GUID is already in use for TART. You should use a unique/new GUID.
Attachment #8528606 - Flags: review?(avihpit) → review-
Comment on attachment 8528607 [details] [diff] [review]
make tresize work as a standalone test (2.0)

Review of attachment 8528607 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks :)
Attachment #8528607 - Flags: review?(avihpit) → review+
(In reply to Joel Maher (:jmaher) from comment #22)
> ... Now if you click here:
> http://hg.mozilla.org/users/jmaher_mozilla.com/tpain/raw-file/9a41f427ba59/
> talos/generate-tresize-xpi.html

Working nicely!, and without the need to clone talos! :)

The alert, however, looks like this on my system:

__start_report22.90593593714741__end_report
,__startTimestamp1417158578464__endTimestamp

I'm guessing the result is 22.90593593714741 ? If yes, it would probably be nicer to display only the result. Maybe

data.split("__start_report")[1].split("__end_report")[0]

?
(In reply to Avi Halachmi (:avih) from comment #23)
> ...
> Does dumpLog throws the exception? at what scenario? And if you only need to
> test if dumpLog is available, wouldn't |if (dumpLog){...} else alert(data);|
> be cleaner?

if (typeof dumpLog != "undefined") {... , and the same for Profiler if we're moving it outside of the addon. Assuming it's the existence of dumpLog which throws rather than what it does inside the function.
The only concern or unknown which I have remaining is the issue of the profiler and the quit.js/mozillafilelogger.js.

so for quit.js, it is references as a relative path in the http url when we run in automation.  Running standalone (with or without cloning talos) would yield a failure and it should be- we don't want this to close the browser and log to an unknown file while it is running standalone.

After stating that, what should we do.  I was going on the 'silent failure' technique for the standalone addon version.  Is there a better way?

On the subject of the profiler.js, we need that in chrome space (or so I believe) and this is why it is loaded in the framescript.js.  Should we make that a silent failure when it comes to the standalone addon as well?
When running outside automation, I don't think the user expects the browser to quit when the test is done, and also, the user can profile it without any help from he addon, and obviously, the logger is not required.

So since all three of those are only needed during automation runs, I think it's ok that they don't work elsewhere.

Just provide some instructions on how to run it outside of automation, and make sure the test runs, completes and the result becomes visible when running at this setup.
* window.close() - this is needed to close the current tab otherwise we have two tabs (session restore) which ends up running the test twice
* try/catch stuff - cleaned up, and checking typeof dumplog works great!
* GUID - changing the guid to a /usr/bin/uuidgen, yielded addon compatibility errors - in this case, I used toolkit@mozilla.org and it works great!
* Profiler.js - from what I understand this needs to be in the actual code that does the resizing and measurements.  For this reason, we need to reference it from tresize-overlay.xul.  This presents a problem by doing src="../../../scripts/Profiler.js", we don't load .xul via http.

Quite possibly there is a way to solve this by doing some magic in a chrome.manifest file to get Profiler.js available?

Do we need Profiler.js in the chrome space as I have made an ass[umption|ertion] above?

From my perspective the only thing remaining here is sorting out the profiler.  I would love to not have it duplicated and included in the addon, as it stands I am not familiar with ways to not do that.

When I upload the final patch and land it, the docs will be updated here to outline how to run the test locally:
https://wiki.mozilla.org/Buildbot/Talos/Tests#tresize
(In reply to Joel Maher (:jmaher) from comment #29)
> * GUID - changing the guid to a /usr/bin/uuidgen, yielded addon
> compatibility errors - in this case, I used toolkit@mozilla.org and it works
> great!

Not sure where would compat issues come from, but bug1102479@mozilla.org">bug1102479@mozilla.org would probably be better than toolkit@... . Addon ID uniqueness is important.


> * Profiler.js - from what I understand this needs to be in the actual code
> that does the resizing and measurements.  For this reason, we need to
> reference it from tresize-overlay.xul.  This presents a problem by doing
> src="../../../scripts/Profiler.js", we don't load .xul via http.
> 
> Quite possibly there is a way to solve this by doing some magic in a
> chrome.manifest file to get Profiler.js available?

Not sure why it needs to be referenced from the xul file. If during automation we load the $talos/.../test-tresize.html, then it can reference Profiler.js directly like it does with quit.js and the logger file. Couldn't it?

> When I upload the final patch and land it, the docs will be updated here to
> outline how to run the test locally:
> https://wiki.mozilla.org/Buildbot/Talos/Tests#tresize

In order to better assess the patch, it'll be easier to understand the automation invocation and stand alone invocation if their descriptions appeared at this bug first.
(In reply to Avi Halachmi (:avih) from comment #23)

> ::: talos/startup_test/tresize/addon/install.rdf
> @@ +6,5 @@
> > +<em:version>1.0.1</em:version>
> > +
> > +<em:targetApplication>
> > +    <Description>
> > +        <em:id>{ec8030f7-c20a-464f-9b0e-13a3a9e97384}</em:id>
> 
> This GUID is already in use for TART. You should use a unique/new GUID.

My bad on this one. The addon id is tresize@mozilla.org, which is fine, and this GUID is the target application, AKA Firefox, and should be the same for TART or any other addon which targets Firefox.
updated patch to convert tresize:
* addonid is all good
* profiler.js is in tresize-test.html only, not in the addon!

This is how it is intended to be used:
1) run via automation - expected to run in cycles, quit between cycles, log to a file, ability for profiler
2) run locally - click a link to a hg repo to generate and install addon, browse to simple url.  Runs once and alerts the results.

Running this locally (Ubuntu 14.04 x64) vs without the patch, I see similar numbers:
with patch: 12.99 -> 14.72
without patch: 13.07 -> 14.23

So we are not adjusting our numbers (or if so very minor).  I have pushed to try server (we are on v.37, so a new push was needed):
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=4e6796e08448

I can compare this to mainline and see how we are doing.

To run this locally, please visit:
http://hg.mozilla.org/users/jmaher_mozilla.com/tpain/raw-file/tip/talos/generate-tresize-xpi.html
Attachment #8528606 - Attachment is obsolete: true
Attachment #8533198 - Flags: review?(avihpit)
   testname            m-c low  - > m-c high; try avg
Linux:
:( tresize           	   18.6	-> 22.9	      23.1
Linux64:
:( tresize           	   15.9	-> 18.3	      21.9
Win:
:) tresize           	   19.9	-> 20.7	      19.1
WinXP:
:( tresize           	   12.1	-> 12.7	      22.8
Win8:
   tresize           	   12.1	-> 15.7	      12.8
OSX64:
:( tresize           	   20.2	-> 21.7	      32.0


in summary we are really messing with the numbers.  the try avg currently is the average value reported from 3 data points.
So m-c is the "old" tresize on non-e10s, and the try avg is the new tresize on e10s? or is try avg the new tresize but on non-e10s?
Comment on attachment 8533198 [details] [diff] [review]
convert tresize to be e10s compatible (2.3)

Review of attachment 8533198 [details] [diff] [review]:
-----------------------------------------------------------------

w00t!
Attachment #8533198 - Flags: review?(avihpit) → review+
Thanks for the review. I am glad we made so much progress on this bug.

Looking at the raw numbers coming out of try server, we have regressions on WinXP, OSX 10.6, linux32, and linux64.  Win7 yields a win and Win8 is neutral.  

After doing an experiment by adding the 20ms sleep between resizes, we yield similar numbers as we had before.

The questions becomes "do we want 20ms sleep between resizes, or resize as fast as possible"?  Ideally we want to fix this test if needed to be as useful and accurate as possible.
Flags: needinfo?(jmathies)
Flags: needinfo?(avihpit)
(In reply to Joel Maher (:jmaher) from comment #36)
> After doing an experiment by adding the 20ms sleep between resizes, we yield
> similar numbers as we had before.

Seems like we don't need it then.. :)
Flags: needinfo?(jmathies)
:jimm - to clarify- the numbers stay the same if we leave the 20ms sleep in, we tried removing the 20ms to get maximum resize time and this yielded a different set of results.
(In reply to Joel Maher (:jmaher) from comment #38)
> :jimm - to clarify- the numbers stay the same if we leave the 20ms sleep in,
> we tried removing the 20ms to get maximum resize time and this yielded a
> different set of results.

Hmm, how skewed were the result? If the numbers are totally "wack" without the delay, we should experiment with putting a delay back in. Maybe try a shorter delay to see what we can get away with. 

a 20ms wait isn't a big deal imo if we get more reliable data out of it. 10ms might work too.
so the data is reliable, it is just much different from what we have now.  I am leaning towards no 20ms delay.  We should figure out what makes the most sense to measure with Firefox as it is today and go from there.  Quite possibly there are good reasons for the 20ms delay between resizes.

Maybe avih has some good ideas.
(In reply to Joel Maher (:jmaher) from comment #33)
>    testname            m-c low  - > m-c high; try avg
> Linux:
> :( tresize           	   18.6	-> 22.9	      23.1
> Linux64:
> :( tresize           	   15.9	-> 18.3	      21.9
> Win:
> :) tresize           	   19.9	-> 20.7	      19.1
> WinXP:
> :( tresize           	   12.1	-> 12.7	      22.8
> Win8:
>    tresize           	   12.1	-> 15.7	      12.8
> OSX64:
> :( tresize           	   20.2	-> 21.7	      32.0
> 

Are these the numerical differences you're concerned about? I would expect things to be slower on e10s, so this doesn't surprise me. However we had this discussion about how the 20ms delay changed the numbers too, and I don't have a reference on that to judge?
right now I am comparing:
tresize non e10s plain html file <- to -> tresize non e10s with addon

I haven't looked at the delta between non-e10s and e10s yet (at least we can run the tests now)
(In reply to Joel Maher (:jmaher) from comment #42)
> right now I am comparing:
> tresize non e10s plain html file <- to -> tresize non e10s with addon
> 
> I haven't looked at the delta between non-e10s and e10s yet (at least we can
> run the tests now)

Oh I see, so these differences are simply due to test changes.

Personally I don't see any issue with updating tests when they need it even if the numbers fluctuate a bit. as a result. True we open a small window where we might not pick up on a regression since we don't have a nice long run of data to look at with the new numbers, but its minimal risk.
Well, I think 0ms delay will give us a better estimation of resize iteration speed than when adding a 20ms idle between resize iterations. And if there's a difference, then it's expected that with 0ms the results would be worse.

However, I can think of few reasons for us to consider keeping it with 20ms idle anyway:

1. We know the old tresize with 20ms is reliable and provides good results, and the new tresize with 20ms idle seems to provide the same kinds of results.

2. Changing to 0ms will change the results and therefore make it harder to make comparisons between old tresize results and new results.

3. We don't know for a fact that 0ms is better (= more reliable, less noisy, etc), we just estimate that it is - because we haven't been using 0 so far.


So if this was a new test, I'd have said we should use 0, but since we're modifying an existing test, then the reasons above have some counter-weight.

What if we call this test tresize2, run it in parallel to the "old" tresize for a while to see if its results are as useful as the old tresize (specifically see if they reflect improvements and regressions similarly, and that tresize2's noise level is not worse than tresize's), and then deprecate the old tresize (and possible rename tresize2 back to tresize?)?
Flags: needinfo?(avihpit)
Lets do this:

1) add a delay to the new test (10-20ms)
2) get that into production and drop the old test
3) file a follow up on removing the delay and keeping good numbers.

My reasoning:

1) we know we get good results with the delay in both tests
2) avoids all the overhead running two tests in parallel, revisiting to update, etc..
3) if we have the cycles to look at the delay we can now, but if we don't we can get tresize working with e10s now, and get past this bug.
Sounds like a plan.
look at comment 45 in this bug, you can see the plan of action.

Bug 1109876 is filed to resolve this.
Attachment #8535792 - Flags: review?(wlachance)
Comment on attachment 8535792 [details] [diff] [review]
do the 20ms timeout for real this time (1.0)

Review of attachment 8535792 [details] [diff] [review]:
-----------------------------------------------------------------

Looks reasonable given the discussion.
Attachment #8535792 - Flags: review?(wlachance) → review+
this is landed and deployed
Status: ASSIGNED → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
(In reply to Avi Halachmi (:avih) from comment #19)
> Maybe move the two files as generate-tart-xpi.html and
> generate-tresize-xpi.html to the talos folder (where test.py is) which
> should work since 'scripts', 'tart' and 'tresize' folders are within it.
> 
> (And later we'll also need update the link at
> https://wiki.mozilla.org/Buildbot/Talos/Tests#TART.2FCART )

Hmm.. seems the wiki was not updated, so the xpi link was 404...
I just fixed it now - to https://hg.mozilla.org/build/talos/raw-file/tip/talos/generate-tart-xpi.html

Do we also need the link for the tresize addon at the wiki? would users be able to use it by just installing the addon and visiting a URL?
You need to log in before you can comment on or make changes to this bug.