Closed Bug 608447 Opened 14 years ago Closed 12 years ago

setup automated tests for the IE9 testdrive performance demos

Categories: Testing :: General (defect)
Priority: Not set
Severity: normal
Status: RESOLVED FIXED
Tracking: blocking2.0 -
People: Reporter: gal, Unassigned
References: Blocks 1 open bug

Microsoft is hosting a bunch of nice HTML5 performance demos on the IE9 Test Drive page:

http://ie.microsoft.com/testdrive/

We under-perform on some of these tests compared with IE on Windows and also Safari on Mac. We should automate these tests, see where we stand, and then figure out how to get competitive on them.

I downloaded a local copy of the Performance/ part of the tests. All tests are self-contained and a simple wget grabs a local copy just fine.

The tests are easy to automate. Most tests don't need user input and report an FPS number. The FPS number is pretty easy to extract.

I wrote a little python web server running on port 8888:

from http.server import BaseHTTPRequestHandler, HTTPServer

class MyHandler(BaseHTTPRequestHandler):

    def do_GET(self):
        # The request path carries the report: /<browser>/<version>/<test>/<fps>
        print(self.path)
        self.send_response(200)
        self.send_header('Content-type', 'text/html')
        self.end_headers()
        self.wfile.write(b"thanks")  # the response body must be bytes

def main():
    # Listen on all interfaces, port 8888
    server = HTTPServer(('', 8888), MyHandler)
    try:
        print('started httpserver...')
        server.serve_forever()
    except KeyboardInterrupt:
        print('^C received, shutting down server')
        server.socket.close()

if __name__ == '__main__':
    main()

For the fishtank demo, for example, I hooked into the Draw function of the fpsmeter and where it updates the FPS meter I added:

                                var req = new XMLHttpRequest();
                                req.open("GET", "http://localhost:8888/" +
                                         this.browserName + "/" +
                                         this.browserVersion + "/" +
                                         "fishtank/" +
                                         this.fps);
                                req.send();

This identifies the browser and the test and reports the FPS around once a second. A script can grab the output of the python web server and then process it into a nice graph. Here is what the output looks like right now:

localhost - - [29/Oct/2010 17:16:37] "GET /Firefox/4/fishtank/1 HTTP/1.1" 200 -
/Firefox/4/fishtank/1
localhost - - [29/Oct/2010 17:16:39] "GET /Firefox/4/fishtank/1 HTTP/1.1" 200 -
/Firefox/4/fishtank/1
localhost - - [29/Oct/2010 17:16:41] "GET /Firefox/4/fishtank/1 HTTP/1.1" 200 -
/Firefox/4/fishtank/1
localhost - - [29/Oct/2010 17:16:43] "GET /Firefox/4/fishtank/1 HTTP/1.1" 200 -
/Firefox/4/fishtank/1
localhost - - [29/Oct/2010 17:16:45] "GET /Firefox/4/fishtank/1 HTTP/1.1" 200 -
/Firefox/4/fishtank/1
localhost - - [29/Oct/2010 17:16:47] "GET /Firefox/4/fishtank/1 HTTP/1.1" 200 -
/Safari/533.18/fishtank/0
localhost - - [29/Oct/2010 17:17:11] "GET /Safari/533.18/fishtank/0 HTTP/1.1" 200 -
/Safari/533.18/fishtank/13
localhost - - [29/Oct/2010 17:17:13] "GET /Safari/533.18/fishtank/13 HTTP/1.1" 200 -
/Safari/533.18/fishtank/16
localhost - - [29/Oct/2010 17:17:14] "GET /Safari/533.18/fishtank/16 HTTP/1.1" 200 -
/Safari/533.18/fishtank/15
localhost - - [29/Oct/2010 17:17:15] "GET /Safari/533.18/fishtank/15 HTTP/1.1" 200 -
/Safari/533.18/fishtank/16
localhost - - [29/Oct/2010 17:17:16] "GET /Safari/533.18/fishtank/16 HTTP/1.1" 200 -
/Safari/533.18/fishtank/15

I ran the test for a couple seconds in Firefox and Safari respectively.
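As a rough illustration of the "script that grabs the output" step, here is a minimal sketch that groups the reported FPS values by browser and test. It assumes the four-part path layout built by the XMLHttpRequest hook above (browser, version, test, fps); the function names `collect_fps` and `mean_fps` are made up for this example, and graphing is left out.

```python
import re
from collections import defaultdict

# Matches the request path printed by the server: /<browser>/<version>/<test>/<fps>
# (layout assumed from the XMLHttpRequest URL built in the demo hook).
REPORT = re.compile(r"^/([^/]+)/([^/]+)/([^/]+)/(\d+(?:\.\d+)?)$")

def collect_fps(lines):
    """Group FPS samples by (browser, version, test), skipping access-log lines."""
    samples = defaultdict(list)
    for line in lines:
        match = REPORT.match(line.strip())
        if match:
            browser, version, test, fps = match.groups()
            samples[(browser, version, test)].append(float(fps))
    return dict(samples)

def mean_fps(samples):
    """Average each series; dropping warm-up samples is left to the caller."""
    return {key: sum(vals) / len(vals) for key, vals in samples.items()}

if __name__ == "__main__":
    # A few lines in the same shape as the server output shown above.
    log = """\
localhost - - [29/Oct/2010 17:16:37] "GET /Firefox/4/fishtank/1 HTTP/1.1" 200 -
/Firefox/4/fishtank/1
/Safari/533.18/fishtank/13
/Safari/533.18/fishtank/16
"""
    for key, avg in mean_fps(collect_fps(log.splitlines())).items():
        print(key, avg)
```

Only the bare path lines are parsed; the access-log lines repeat the same information, so they are skipped to avoid counting each sample twice. The first sample after a browser starts (such as the 0 reported by Safari) is probably warm-up and may be worth discarding before averaging.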

Automation would need to start various browsers for a little while and then collect the data.
I suggest doing this very soon, and we should block 2.0 on being reasonably competitive on these tests. Safari 5 is 16x faster on this particular test than b6. I didn't check either browser's trunk on mac, but I did compare trunk against IEPP on Windows and we are not looking good on some of the tests.

This is probably higher priority for Windows than Mac, but even on mac we shouldn't be an order of magnitude slower.
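The automation step described in this comment (start each browser for a little while, then collect the data) might look roughly like the sketch below. It only covers launching and stopping the browsers; the reporting server is assumed to be running separately. The helper names `run_for` and `run_all` are invented for illustration, and the browser command lines are whatever the caller supplies.

```python
import subprocess

def run_for(command, seconds):
    """Start `command`, let it run for up to `seconds`, then stop it.

    Returns the exit code of the process.
    """
    proc = subprocess.Popen(command)
    try:
        proc.wait(timeout=seconds)
    except subprocess.TimeoutExpired:
        # Still running after the sampling window: ask it to stop.
        proc.terminate()
        try:
            proc.wait(timeout=10)
        except subprocess.TimeoutExpired:
            # It ignored the polite request, so force it.
            proc.kill()
            proc.wait()
    return proc.returncode

def run_all(browsers, url, seconds):
    """Run each browser against `url` in turn.

    `browsers` maps a label to the argv prefix used to launch that browser.
    """
    results = {}
    for name, argv in browsers.items():
        results[name] = run_for(list(argv) + [url], seconds)
    return results
```

A call such as `run_all({"firefox": ["firefox"]}, demo_url, 30)` would then open the demo in Firefox for 30 seconds, where the executable name and `demo_url` are placeholders for whatever is installed and wherever the local copy of the demo is served. Browsers that hand the URL to an already-running instance and exit immediately would need extra handling that this sketch does not attempt.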
blocking2.0: --- → ?
> see where we stand,

Thanks to our army of awesome, we've got some of that info already (though I fully support automating this further).  See https://bugzilla.mozilla.org/buglist.cgi?quicksearch=whiteboard%3Aietestdrive

These tests tend to focus on SVG and graphics performance.  Given the state of our SVG code from a performance viewpoint, blocking 2.0 on anything related to it is no good, imo.  It'll take serious arch work to make dynamic SVG stuff fast.

For the non-SVG stuff, it's up to the gfx folks what they think of things, but I think focusing on getting our acceleration story further along will take care of some of the things on its own, and we don't really have the resources to do more than that before 2.0.
(In reply to comment #1)
> I suggest doing this very soon, and we should block 2.0 on being reasonably
> competitive on these tests. Safari 5 is 16x faster on this particular test than
> b6. I didn't check either browser's trunk on mac, but I did compare trunk
> against IEPP on Windows and we are not looking good on some of the tests.

Do you really mean we should leave our users stuck on 3.6 until we're looking good on all of these tests?
In 2.0 we should be competitive with IE9 on the canvas demos, on systems which can run IE9. For example with D3D10 (default on trunk), on my laptop, we're very close on FishIE last time I tested --- 53 vs 56 fps with 1000 fish.

SVG performance needs work, work which we cannot get done for Firefox 2.0 without slipping a lot more. Work like retained path API for cairo (bug 555877), refactoring SVG to use display lists (no bug), layer acceleration for SVG (no bug), and display-list based invalidation for SVG (no bug, but depends on display-list based invalidation, bug 539356).

A lot of SVG and canvas demos would benefit from refactoring cairo internals to be a thinner wrapper around D2D, Quartz and other high-level backends (no bug, but there's a nascent "cairo_backend_t" branch by Chris Wilson).

Beyond Windows, we will need to push on cairo-gl so we can use it to accelerate canvas for Mac and Linux so they don't get left behind. Google is doing the equivalent of this.

I actually don't know why Safari would be so much faster than us on FishIE. We should be close, unless cairo wrapping overhead is what's killing us. That is probably worth a bug of its own, and may be fixable for 2.0.
I would love to see this bug fixed by the way --- tracking the testdrive numbers automatically and catching regressions would be very helpful.
I think we should ship a 2.0 that is reasonably competitive with other browsers on fairly reasonable looking graphically intensive benchmarks (such as these). We don't have to win on all of them. But we should do reasonably well. On the fish tank demo we are 16x behind Safari on Mac. That's worrisome. If we do much better on other hardware and OSes, that's great. Let's get numbers on that. Maybe we won't get to parity, but at the very least let's understand why and where we are slow. That's what this bug is about: get a systematic handle on tracking our performance on benchmarks that people will likely measure us on once FF4 and IE9 are out.
blocking2.0: ? → -
Whiteboard: [d?]
Just so that folks don't think this is languishing, I am working on this as a side project.
Assignee: nobody → bmoss
Assignee: bmoss → mcote
In case you missed my blog posts, we've got 6 demos, plus the test262 JS conformance test, running twice a day on two Windows 7 machines in Firefox release, Nightly, Safari, IE, and Chrome.  Nightly is updated, well, nightly, and the others every 2-3 weeks-ish.  Results are here:

http://brasstacks.mozilla.com/speedtests/results.html

My posts about this are http://cloquewerk.livejournal.com/22400.html and http://cloquewerk.livejournal.com/22600.html

There's a wiki page for the project at https://wiki.mozilla.org/Auto-tools/Projects/SpeedTests

Adding Mac and Linux clients is the next thing on the list.
It's a bit hard to tell the difference between some of the lines on the plots due to colors being similar. Maybe we could use fewer colors by using the same color across releases for a given browser?
Sure, I think I can do that.  Also note that you can hide lines by clicking on the entries in the legend, so if you're trying to see a line that is very close to another, just hide the latter.
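One way to read the color suggestion is to fix a hue per browser and vary only the lightness between releases. The following is a minimal sketch of that idea, not what the results page actually does; the hue table and the function name `series_color` are hypothetical.

```python
import colorsys

# Hypothetical base hues per browser, in degrees on the color wheel.
BASE_HUES = {"Firefox": 25, "Chrome": 120, "Safari": 210, "IE": 280}

def series_color(browser, index, count):
    """Return an '#rrggbb' color: hue fixed per browser, lightness varied per release.

    `index` is the position of this release among the `count` releases plotted
    for the browser.
    """
    hue = BASE_HUES[browser] / 360.0
    # Spread lightness between 0.35 and 0.65 across this browser's releases.
    lightness = 0.5 if count <= 1 else 0.35 + 0.30 * index / (count - 1)
    r, g, b = colorsys.hls_to_rgb(hue, lightness, 0.8)
    return "#%02x%02x%02x" % (round(r * 255), round(g * 255), round(b * 255))
```

With this scheme, Firefox release and Nightly would share an orange hue but differ in lightness, which keeps browsers visually grouped while still separating their lines.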
This is a bug in the talos component; do we want to pull these tests into talos?  If not, let's move this bug outside of testing::talos.
Yeah this is not only not talos, but v1.0 was done a while ago. I'll open a new bug for looking into better colours in the legend.
Assignee: mcote → nobody
Status: NEW → RESOLVED
Closed: 12 years ago
Component: Talos → General
Resolution: --- → FIXED