Closed Bug 230697 Opened 21 years ago Closed 14 years ago

New automated regression test framework

Status:

RESOLVED WONTFIX

People

(Reporter: roc, Assigned: roc)

Details

Attachments

(2 files, 2 obsolete files)

testrunner.pl 21 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 10.81 KB, text/plain
New version 21 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 27.00 KB, text/plain
more core changes 21 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 2.59 KB, patch (bzbarsky: review+, bzbarsky: superreview+)
updated script 21 years ago Robert O'Callahan (:roc) (email my personal email if necessary) 27.06 KB, text/plain

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Description

•

21 years ago

I have been developing a new regression test framework, primarily for layout,
but since it's end-to-end it also tests HTTP, Gfx/Widget, parser, content,
views, etc. It's inspired by Hixie's test engine for Opera.

It's fairly straightforward and requires only very minor changes to the Mozilla
codebase. Basically we set up a special local Web server that feeds Mozilla a
XUL app to drive the tests and a set of testcases. The XUL app loads each
testcase and then signals the server (by requesting a magic URL) that the
testcase is loaded. The server then takes a screenshot of the Mozilla window and
replies to the URL request, causing Mozilla to move to the next testcase. The
result is a set of PNGs, one per testcase. Regression testing consists of
building a baseline set of PNGs and then rerunning the tests with a modified
Mozilla, and comparing the PNGs.
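The comparison step described above can be sketched roughly as follows. This is a minimal Python illustration, not code from the attached Perl script: the function names (`file_digest`, `compare_runs`) and the status labels are hypothetical, and it assumes both runs wrote one `.png` per testcase under matching file names.

```python
import hashlib
from pathlib import Path


def file_digest(path):
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def compare_runs(baseline_dir, candidate_dir):
    """Map each PNG file name to 'same', 'changed', 'missing', or 'new'.

    'missing' means the baseline has the file but the candidate run does not;
    'new' means the opposite.
    """
    baseline = {p.name: p for p in Path(baseline_dir).glob("*.png")}
    candidate = {p.name: p for p in Path(candidate_dir).glob("*.png")}
    report = {}
    for name in sorted(baseline.keys() | candidate.keys()):
        if name not in candidate:
            report[name] = "missing"
        elif name not in baseline:
            report[name] = "new"
        elif file_digest(baseline[name]) == file_digest(candidate[name]):
            report[name] = "same"
        else:
            report[name] = "changed"
    return report
```

Comparing file bytes is stricter than comparing pixels: two PNGs with identical pixels but different metadata or compression would be reported as changed, so a real comparison would probably decode the images first.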

This is all implemented in one big Perl script. Currently it only works on
Unix-like systems because it uses fork() and X-specific graphics commands. I
wanted to use Xvfb for headless operation and fast screenshotting, but Xvfb is
broken in RH9, so for now I'm running the tests on the current X display and
using ImageMagick's import command. Runtime is dominated by the time taken by
the screenshots so I'd like to get Xvfb working eventually.

There are some complications to the above description. Mozilla may crash or hang
on some testcases and the script needs to detect that, kill Mozilla, and resume
with the remaining tests. Some tests are unsuitable for this approach because
they're animated so screenshots will not always return the same contents. My
script supports a "classify" mode where it runs the testcases a few times and
checks that Mozilla reports a consistent image every time.
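The decision made by the "classify" mode could look something like the sketch below. It is a hypothetical Python illustration rather than the script's actual logic; it assumes each run yields either a digest of the screenshot or `None` when Mozilla crashed or hung, and it borrows the OK / MISMATCH / FAILED labels used later in this bug.

```python
def classify(run_results):
    """Classify one testcase from the results of several runs.

    run_results holds one entry per run: a screenshot digest (str), or None
    if that run crashed, hung, or never signalled that the page had loaded.
    """
    if not run_results or any(result is None for result in run_results):
        return "FAILED"
    if len(set(run_results)) == 1:
        return "OK"  # every run produced the same image
    return "MISMATCH"  # the runs disagreed, e.g. an animated testcase
```

For example, `classify(["abc", "abc", "abc"])` gives `"OK"`, while `classify(["abc", "abd", "abc"])` gives `"MISMATCH"`.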

To ensure that the screen is fully updated before the screenshot is taken, I
patched Mozilla so that when the right environment variable is set, we flush all
reflows and force repaint after firing onload, and then also print a message on
STDOUT. The server watches for this message and takes the screenshot only after
the message appears.
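Watching STDOUT for the message, while also coping with a crashed or hung browser, might be sketched as below. This is an assumption-laden Python illustration: the function name `wait_for_marker` is hypothetical, the marker text is taken from the "PAINT FORCED" message mentioned later in this bug, and for simplicity it starts one process per wait and kills it afterwards, whereas the real script keeps Mozilla running across testcases.

```python
import queue
import subprocess
import threading
import time


def wait_for_marker(command, marker="PAINT FORCED", timeout=30.0):
    """Run command and report whether a stdout line containing marker appears.

    Returns True once the marker is seen. Returns False if the process exits
    first (a crash) or the timeout expires (a hang). The process is always
    killed and reaped before returning.
    """
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, text=True)
    lines = queue.Queue()

    def pump():
        # Forward stdout lines to the queue; None signals end of output.
        for line in proc.stdout:
            lines.put(line)
        lines.put(None)

    threading.Thread(target=pump, daemon=True).start()
    deadline = time.monotonic() + timeout
    try:
        while True:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                return False  # hang: no marker before the deadline
            try:
                line = lines.get(timeout=remaining)
            except queue.Empty:
                return False  # hang: no output at all before the deadline
            if line is None:
                return False  # crash or early exit without the marker
            if marker in line:
                return True
    finally:
        proc.kill()
        proc.wait()  # reap the child so it does not linger as a zombie
```

The reader thread is there so that the timeout applies even when the child prints nothing; a plain blocking read on the pipe would wait forever on a hung process.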

This isn't completely done yet but it is usable now. I want to add an "image
comparison" feature to compare directories full of PNGs and generate a DHTML
report visually highlighting any differences.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 1

•

21 years ago

oh dear. I already checked in the paint-forcing patch by mistake!

http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsDocumentViewer.cpp&branch=&root=/cvsroot&subdir=mozilla/content/base/src&command=DIFF_FRAMESET&rev1=1.347&rev2=1.348
http://bonsai.mozilla.org/cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsPresShell.cpp&branch=&root=/cvsroot&subdir=mozilla/layout/html/base/src&command=DIFF_FRAMESET&rev1=3.679&rev2=3.680
(see call to EndUpdateViewBatch)

Well, er, that simplifies things :-)

Christopher Blizzard (:blizzard)

Comment 2

•

21 years ago

Sounds sexy.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 3

•

21 years ago

Attached file testrunner.pl (obsolete)

checkpoint of the current state of the testing script. All commands should work
as advertised. There are still a few features I need to add:
-- an image comparison report generator
-- expose 'chunk delay' option --- tells the server to pause in the middle of
feeding HTML pages to Mozilla, to test incremental reflows
-- need to change input syntax so that # introduces a line comment, and change
classifier output to report "# OK", "# FAILURE", "# MISMATCH", so you can write
echo *.html | testrunner.pl classify | grep OK | testrunner.pl -m other/mozilla


and of course this needs to be run on larger test suites and any testrunner
bugs fixed.
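The proposed input syntax could be parsed along these lines. This is a hypothetical Python sketch, not the script's parser: it assumes testcase names are whitespace-separated, that the classifier writes lines of the form `name.html # OK`, and that no testcase name itself contains a `#`.

```python
def parse_test_list(text):
    """Return the testcase names in text, ignoring '#' line comments.

    Everything from '#' to the end of a line is dropped, so a status such as
    '# OK' appended by the classifier does not reach the next stage.
    """
    names = []
    for line in text.splitlines():
        content = line.split("#", 1)[0].strip()
        if content:
            names.extend(content.split())
    return names
```

With this scheme `grep OK` can select lines by their trailing status comment, and the next testrunner invocation sees only the file names.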

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 4

•

21 years ago

A couple more notes before I forget:

> setTimeout(nextFrame, 1000);
I put this in because without it, no window ever appears. I'm not sure why. I
should try reducing 1000 to 1, but as it's only used for the first frame it's
not really an issue. Other than this there are no built-in delays. The tests
will run as fast as the system can go.

This probably leaves zombie processes around. I need to put a wait() after
close(<RUNNER>), at least.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 5

•

21 years ago

My Xft build appears to produce different antialiasing pixels in different runs.
Is there a way to stop Xft from antialiasing by setting an environment variable
or something?

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 6

•

21 years ago

I guess I can launch mozilla with a custom fonts.conf pointed to by FONTCONFIG_FILE
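One way to set that up is sketched below in Python. The fontconfig elements used (`match`, `edit name="antialias"`, `include`) are standard, but the helper name `no_antialias_env` and the system configuration path `/etc/fonts/fonts.conf` are assumptions; the include is there because a custom FONTCONFIG_FILE replaces the default configuration, font directories included.

```python
import os
import tempfile

FONTS_CONF = """<?xml version="1.0"?>
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<fontconfig>
  <!-- Pull in the system configuration (assumed path) for font directories. -->
  <include ignore_missing="yes">/etc/fonts/fonts.conf</include>
  <!-- Force antialiasing off for every matched font. -->
  <match target="font">
    <edit name="antialias" mode="assign"><bool>false</bool></edit>
  </match>
</fontconfig>
"""


def no_antialias_env(base_env=None):
    """Write FONTS_CONF to a temporary file and return an environment dict
    whose FONTCONFIG_FILE points at it. The caller removes the file."""
    env = dict(os.environ if base_env is None else base_env)
    fd, path = tempfile.mkstemp(prefix="fonts-", suffix=".conf")
    with os.fdopen(fd, "w") as handle:
        handle.write(FONTS_CONF)
    env["FONTCONFIG_FILE"] = path
    return env
```

The returned dict can be passed as the `env` argument when launching the browser, for example `subprocess.Popen(command, env=no_antialias_env())`.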

Hixie (not reading bugmail)

Comment 7

•

21 years ago

Um, that seems bad. Anti-aliasing should be completely deterministic.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 8

•

21 years ago

I have subpixel positioning on. Maybe that's doing it.

Hixie (not reading bugmail)

Comment 9

•

21 years ago

It shouldn't, assuming your window is always in the same place (you full-screen
the window, right?).

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 10

•

21 years ago

> you full-screen the window, right?

At the moment I'm setting the window to 400x600. The window is not always at the
same place.

Anyway, I've written the code to turn off antialiasing, and I've made all the
other changes mentioned here, and now I'm just polishing up the script so it's
not as write-only.

Hixie (not reading bugmail)

Comment 11

•

21 years ago

> At the moment I'm setting the window to 400x600. The window is not always at 
> the same place.

Ah. I recommend full-screening the window. :-)


> Anyway, I've written the code to turn off antialiasing

Generally for this kind of script you want the test to be as close as possible 
to what end-users are actually going to see.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 12

•

21 years ago

Making the window any given size is easy, but the bigger it is, the slower
everything runs. 400x600 seems like a good size for most testcases.

It would be nice if we could run with antialiasing, but with Xft, we can't. I
did some more tests; turning off subpixel positioning helps, but there are still
a few cases where I get different pixel values unless I turn off antialiasing
completely.

I realized that background image loads don't block onload firing. This is a
problem in some testcases. bz, if you're reading this, would it be hard to
toggle that behavior if MOZ_FORCE_PAINT_AFTER_ONLOAD is set at runtime?

I still have at least one bug to shake out that is stopping me from running the
testcases in layout/html/tests. There's another bug that is not too serious but
I don't know how to fix yet: tests with IFRAMEs fire onload events when those
IFRAMEs load, and that spits out "PAINT FORCED" messages, and I don't know how
to distinguish those from the message for the top-level document. Maybe I'll add
some goop to my nsPresShell patch.

Boris Zbarsky [:bzbarsky]

Comment 13

•

21 years ago

> would it be hard to toggle that behavior if MOZ_FORCE_PAINT_AFTER_ONLOAD is set
> at runtime?

See
http://lxr.mozilla.org/seamonkey/source/layout/base/src/nsImageLoader.cpp#120 --
you'd want to not pass the LOAD_BACKGROUND flag there if you want them to affect
onload (just pass nsIRequest::LOAD_NORMAL).

As for iframes, is this the problem with load events bubbling in XUL and such? 
Or are you using a capturing listener or something?

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 14

•

21 years ago

Great, I'll make a patch to the image loader. Thanks!

The problem is that my change to DocumentViewerImpl::LoadComplete prints a
message after onload has fired and we've finished painting --- for any IFRAME
that we load. So when we print the "paint forced" message, the server script
doesn't know whether the message refers to a child frame or to the real
testcase. Probably I should just have DocumentViewerImpl::LoadComplete include
the document URL in the message.
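On the server side, the check this would enable might look like the following Python sketch. The message format is an assumption (the marker text followed by the document URL on one line), and `is_top_level_paint` is a hypothetical name.

```python
def is_top_level_paint(line, testcase_url, marker="PAINT FORCED"):
    """Return True if line is the marker followed by exactly testcase_url.

    Messages for child frames carry a different URL and are ignored, as are
    unrelated lines on stdout.
    """
    line = line.strip()
    if not line.startswith(marker):
        return False
    return line[len(marker):].strip() == testcase_url
```

The server would then take the screenshot only for the line whose URL matches the testcase it most recently served.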

Hixie (not reading bugmail)

Comment 15

•

21 years ago

> Making the window any given size is easy, but the bigger it is, the slower
> everything runs. 400x600 seems like a good size for most testcases.

This seems weird... The Opera regression tests I did run at full-screen
1600x1200 and work fine. Is Mozilla really that much slower?

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 16

•

21 years ago

No, it's the time required for screenshotting that is the bottleneck.

Also, I have a feeling that a narrower window will induce more interesting
wrapping behaviours.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 17

•

21 years ago

Attached file New version (obsolete)

New iteration of the script. It does everything I've mentioned in this bug.
I've successfully run this over all the testcases under layout/html/tests. Of
these tests
-- 138 are classified "FAILED" (Mozilla crashed, or hung, or the onload event
failed to fire on the top level document --- this seems to happen quite often
on framesets, and it also happens on tests that try to print themselves)
-- 66 are classified "MISMATCH" (We got different results depending on the
timing of the screenshot; I need to look into these more closely, but some of
them are no doubt animated images, or scripts --- a lot of print tests fell
into this category too)
-- 1391 are classified "OK" (We got identical results over 3 iterations with
varying timing of the screenshot in each iteration)

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

21 years ago

Attachment #138877 - Attachment is obsolete: true

Boris Zbarsky [:bzbarsky]

Comment 18

•

21 years ago

Hmm... Pretty much anything in the FAILED section is a bug, no?

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 19

•

21 years ago

Yes, most FAILED testcases probably are bugs. Some of them could even be bugs in
the test framework. I'll look into it.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 20

•

21 years ago

Attached patch more core changes

Here are the additional changes that I need in nsDocumentViewer and
nsImageLoader, as discussed above.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

21 years ago

Attachment #139602 - Flags: superreview+

Attachment #139602 - Flags: review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

21 years ago

Attachment #139602 - Flags: superreview?(bz-vacation)

Attachment #139602 - Flags: superreview+

Attachment #139602 - Flags: review?(bz-vacation)

Attachment #139602 - Flags: review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 21

•

21 years ago

Attached file updated script

One more update. This makes image diff do something sensible even if the images
end up in a different format or size. Also, get rid of some alerts from the JS
controller because the alert box cripples any subsequent tests.
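What "something sensible" means for differently sized images is not spelled out here. One possible policy, sketched in Python under that caveat, is to compare over the union of both image areas and count any pixel that exists in only one image as a difference. The function name and the list-of-rows pixel representation are hypothetical.

```python
def diff_pixels(first, second):
    """Compare two images given as lists of rows of pixel values.

    Returns (width, height, differing), where width and height cover both
    images and differing lists the (x, y) positions whose pixels differ.
    A position that lies outside one of the images counts as differing.
    """
    height = max(len(first), len(second))
    width = max(
        max((len(row) for row in first), default=0),
        max((len(row) for row in second), default=0),
    )

    def pixel(grid, x, y):
        # None stands for "no pixel here" when (x, y) is outside the grid.
        if y < len(grid) and x < len(grid[y]):
            return grid[y][x]
        return None

    differing = [
        (x, y)
        for y in range(height)
        for x in range(width)
        if pixel(first, x, y) != pixel(second, x, y)
    ]
    return width, height, differing
```

A report generator could then highlight the returned positions on a canvas of the combined size.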

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Updated

•

21 years ago

Attachment #139591 - Attachment is obsolete: true

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 22

•

21 years ago

The classify run that I did had some problems, namely that my wife was using the
computer at the time and that corrupted some of the tests :-). The real numbers
are FAILED 133, MISMATCH 12, OK 1450.

A demo of the imagediff report for the 12 mismatches is here:
http://ocallahan.org/mozilla/testrunner/demo1/index.html

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 23

•

21 years ago

Note the disturbing one pixel difference in the table test case. I think Xft
just isn't very good about deterministic rendering, even though all antialiasing
is off.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 24

•

21 years ago

I'm supposed to take delivery of a new home machine today. I'm planning to make
my old machine into a headless server, running (among other things) a continuous
layout regression tester based on this code.

Christopher Blizzard (:blizzard)

Comment 25

•

21 years ago

I would actually expect that Xft would do deterministic rendering.  Keith?

Boris Zbarsky [:bzbarsky]

Comment 26

•

21 years ago

Comment on attachment 139602
more core changes

Looks fine.  r+sr=bzbarsky.

I assume the idea is to be able to run this in non-debug mode, right?

Attachment #139602 - Flags: superreview?(bz-vacation)

Attachment #139602 - Flags: superreview+

Attachment #139602 - Flags: review?(bz-vacation)

Attachment #139602 - Flags: review+

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 27

•

21 years ago

Yes, absolutely. We want to be able to do regression tests with opt builds.

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 28

•

20 years ago

checked in patch 139602

Bernd

Comment 29

•

14 years ago

Robert, this bug is obsolete?

Robert O'Callahan (:roc) (email my personal email if necessary)

Assignee

Comment 30

•

14 years ago

Absolutely.

Status: NEW → RESOLVED

Closed: 14 years ago

Resolution: --- → WONTFIX
