Closed Bug 62589 Opened 24 years ago Closed 23 years ago

the layout regression tests need improvement

Tracking

()

Status:

VERIFIED FIXED

Milestone:

mozilla0.9.1

People

(Reporter: buster, Assigned: waterson)

References

Details

Attachments

(10 files)

part 1 of the fix, for the webshell directory 24 years ago buster 3.68 KB, patch		Details \| Diff \| Splinter Review
part 2 of the fix, for the layout directory 24 years ago buster 2.03 KB, patch		Details \| Diff \| Splinter Review
bbox patch 24 years ago Bernd 1.14 KB, patch		Details \| Diff \| Splinter Review
couldn't we try something like this patch 24 years ago Bernd 2.70 KB, patch		Details \| Diff \| Splinter Review
patch cosmetics 24 years ago Bernd 2.69 KB, patch		Details \| Diff \| Splinter Review
work-in-progress, always use a timer, even if delay == 0 24 years ago Chris Waterson 9.15 KB, patch		Details \| Diff \| Splinter Review
make regression tests work better 24 years ago Chris Waterson 24.81 KB, patch		Details \| Diff \| Splinter Review
updated patch 23 years ago Chris Waterson 13.72 KB, patch		Details \| Diff \| Splinter Review
updated after jud's landing to change viewer to embed APIs 23 years ago Chris Waterson 38.81 KB, patch		Details \| Diff \| Splinter Review
get stuff working properly on Linux, too 23 years ago Chris Waterson 41.46 KB, patch		Details \| Diff \| Splinter Review

buster

Reporter

Description

•

24 years ago

currently,
1) some tests fail in unpredictable ways
2) it's hard to decipher the regression test output
3) it would be hard to use the regression tests as part of an automated process,
as in tinderbox

buster

Reporter

Comment 1

•

24 years ago

the problems can be addressed in this way:
1) separate out the tests into those that need delays because of asynch page
loading (image loading in particular)  Remove the delay for files that don't
need it (greatly speeding up the process.)  I've done this for block, and chris
is doing it for table.  Since this part of the regression test is not part of
the normal build, these changes do not need to be super-reviewed.  Chris and I
will review each other's changes to the data files.

2) I've changed the regression test output:
a) to mark any discrepancy as a failure,
b) to output more data about every failure (in particular, which frame was being
tested at the time of the failure),
c) to output in the format "regression test [passed|failed] <filename>" to make
searching for these strings easier
d) to output a summary after every url list that includes number of url's loaded
and number that failed.

3) the performance speed up from (1) plus the grep-friendly strings from (2)
should make it easy to hook up regression testing to a tinderbox.

Any other suggestions?

Status: NEW → ASSIGNED

Priority: P3 → P1

Marc Attinasi

Comment 2

•

24 years ago

Chris K. and I were discussing how the delay between the loading of URLs happens
at the wrong time. We need the delay to happen after the load but before the
regerssion data is dumped, whereas currently the delay happens after the
regression data is dumped, and before the next URL is loaded. Moving the delay
to before the regression data is dumped would allow for the extra time we want
to let the layout complete.

I believe that Chris has changes for this in his tree, so we should consider
taking those changes as well. As it is now, I cannot see how the delay makes any
difference at all (although I've heard that it does).

Another problem (on Win32 platforms at least) is that assertions stop the run
until a dialog is dismissed. Is there a way to make assertions fail the URL but
continue the tests?

buster

Reporter

Comment 3

•

24 years ago

Attached patch part 1 of the fix, for the webshell directory — Details — Splinter Review

buster

Reporter

Comment 4

•

24 years ago

Attached patch part 2 of the fix, for the layout directory — Details — Splinter Review

buster

Reporter

Comment 5

•

24 years ago

attached the diffs for nsWebCrawler.cpp and nsFrameUtil.cpp.
chris k or marc, please review
chris w, please sr.

to marc's point, chris and I had discussed this issue earlier today. chris can
either send me his changes or fold them in himself.

Marc Attinasi

Comment 6

•

24 years ago

r=attinasi with small comments for each patch:

1) in part 1: '1024-1' should probably be 'sizeof(dirName)-1' (can you say
'nit'?) Also, the case in nsWebCrawler::PerformRegressionTest where you now
return NS_ERROR_FAILURE (nsIFrameUtil not created) might deserve an assertion too.

2) in part 2: The node-dumps you put in for the cases where the state and bbox
don't match should probably be put in for the case where the style data does not
match too.

With or without the above, the changes look good.

Chris Waterson

Assignee

Comment 7

•

24 years ago

do what marc said, sr=waterson

Hixie (not reading bugmail)

Comment 8

•

24 years ago

Marking donttest to get this off the managerial radar. If this does in fact need
a reduced testcase, please remove the donttest keyword and explain what testcase
you would like. Thanks!

Keywords: donttest

Bernd

Comment 9

•

24 years ago

-Will that patch fix the  'bogus bbox differences' karnaze mentioned on
2000-10-10 in n.p.m.layout?
- is there a way to get rid of  path's like s:\mozilla\dist\win32_d.obj\bin\viewer
I don't have a path s:\ and  in order to run the tests locally (outside
netscape) I have to edit all the scripts.
- This is especially true for the filelist.txt
- Another improvement would be the possibility to dump the content of the verify
run into one big file, that could be attached to a bug as a proof, that the
regression tests have been successful.

Currently there is only one assert in the table regression test's, much more
annoying are the assertions at the start up and shutdown.

buster

Reporter

Comment 10

•

24 years ago

>>Will that patch fix the  'bogus bbox differences' karnaze mentioned...
most, if not all, bbox false hits will be fixed by a combination of better
testing code and some shuffling around of the test cases themselves

>>is there a way to get rid of  path's like s:/ ....
sure, we could use local paths.  I'll look into that separately

>>Another improvement would be the possibility to dump the content of the verify
>>run into one big file, that could be attached to a bug as a proof, that the
>>regression tests have been successful.

I'm not sure I want to ask people to attach big output files to bugs as a matter
of course.  If we didn't trust someone to run the regression tests before
checking in, they wouldn't have gotten approval to check in from me or chris.

Bernd

Comment 11

•

24 years ago

It would be good, to be able to run the regression tests as a single run, 
without leaving and restarting the viewer, by this one would have bloat 
statistics on a larger sample ( and get rid of the start and shutdown 
assertions)

buster

Reporter

Comment 12

•

24 years ago

moving to mozilla0.9

buster

Reporter

Comment 13

•

24 years ago

moving to mozilla0.9

Target Milestone: --- → mozilla0.9

buster

Reporter

Updated

•

24 years ago

Target Milestone: mozilla0.9 → mozilla0.8

Bernd

Comment 14

•

24 years ago

Attached patch bbox patch — Details — Splinter Review

Bernd

Comment 15

•

24 years ago

In my oppinion, it would be good if bbox differences would generate a failed
result for the corresponding regression test (see patch for a quick hack example).

karnaze (gone)

Comment 16

•

24 years ago

Moving to Mozilla0.9

Target Milestone: mozilla0.8 → mozilla0.9

Marc Attinasi

Comment 17

•

24 years ago

Taking this bug so I can get it checked in.

Chris, have you looked at the bbox patch attached by bernd? I'll try that out
too. It should probably be checked in separately. The first two patches, from
buster, have already been reviewed so they appear to be good to go. I'm
reapplying and retesting them now just to be sure...

Assignee: buster → attinasi

Status: ASSIGNED → NEW

Bernd

Comment 18

•

24 years ago

my patch is obsolete, it has been fixed as a part of 1.18 checkin from Steve
(2001-01-16).

Marc Attinasi

Comment 19

•

24 years ago

Accepting - moving out to Mozilla 1.0 as this is ongoing effort.

Status: NEW → ASSIGNED

Target Milestone: mozilla0.9 → mozilla1.0

Bernd

Comment 20

•

24 years ago

Attached patch couldn't we try something like this patch — Details — Splinter Review

Bernd

Comment 21

•

24 years ago

The idea behind the patch is to avoid to load the next url while the reflow has
not finished and we are still dumping regression data, may be this will reduce
the amount of false positives.

karnaze (gone)

Comment 22

•

24 years ago

r=karnaze on attachment #4 [details] [diff] [review].

Marc Attinasi

Comment 23

•

24 years ago

sr=attinasi on attachID=28738

Bernd

Comment 24

•

24 years ago

Attached patch patch cosmetics — Details — Splinter Review

Chris Waterson

Assignee

Comment 25

•

24 years ago

bernd, could you post unified diffs from now on (cvs diff -u)? Also, I don't
understand why we'd startup dumping layout regression data for a *new* page
before the *old* page has completed. This all happens on the main thread.

Chris Waterson

Assignee

Comment 26

•

24 years ago

Attached patch work-in-progress, *always* use a timer, even if delay == 0 — Details — Splinter Review

Chris Waterson

Assignee

Comment 27

•

24 years ago

With the above patch, I got four false positives (ten frame state mismatches)
running through the table regression tests. That said, I only made it through
563 of the tests, because it hangs sometimes. (That said, without the patch, I
only completed 460 of the tests due to crashes, and had 2,645 frame state
mismatches!)

The idea is to *always* do regression on a timer callback, even if the timeout
is zero. This lets all layout's timers clear before we try to validate.

Chris Waterson

Assignee

Comment 28

•

24 years ago

...and it looks like the hang is caused by a missing image file; e.g., in

  mozilla/layout/html/tests/table/core/captions3.html

we've got

  <img src="../../images/vh40" alt="Valid HTML 4.0!" height="31" width="88">

Let me see if I can fix that.

Marc Attinasi

Comment 29

•

24 years ago

Chris, I cannot tell what your patch is doing (yet), but the results sure sound
amazing! I'm applying your work-in-progress now to see for myself (and so I can
see the changes in context).

Marc Attinasi

Comment 30

•

24 years ago

Chris, I just ran your patch (attachID 29257) with the same engine for the
baseline and the verify and I got 0 (that is Z E R O) false positives over the
table regression tests! Every test passed, no bbox or framestate errors - This,
I like very, VERY much! You da man, man!

Chris Waterson

Assignee

Updated

•

24 years ago

Depends on: 74165

Chris Waterson

Assignee

Comment 31

•

24 years ago

Well, I'm not having quite so much luck :-(.

Running through the table tests, I see three hangs, two from places where we
document.write() into a table cell. We also hang in the block tests this way,
too. I'm looking at that.

I still see about a dozen frame state mismatches. The flags that are mismatches
are consistently either NS_FRAME_HAS_LOADED_IMAGES or
NS_FRAME_EXTERNAL_REFERENCE. I've squelched a bunch of bogus false positives by
whacking nsFrameUtils.cpp to ignore frame state differences in ScrollbarFrame.
(The slider frame inside the scrollbar consistently generates dozens of
NS_FRAME_HAS_LOADED_IMAGES mismatches.)

Bernd

Comment 32

•

24 years ago

>Also, I don't understand why we'd startup dumping layout regression data for a 
>*new* page before the *old* page has completed.

I have seen this several times, the usual indicator is that the page does not
make it to the screen, I have seen that a large page was unchanged on the screen
and in the status bar the URL's changed. In the previous alogorithm this would
happen under the following conditions:

LoadNextURL is executed and starts a timer 
with the OnStart observer we dump the regression data for the previous URL, and
may be this takes a longer time than the we specified for the timer. Under these
conditions the LoadNextURL routine will be executed before the URL made it to
the screen. 

This scenario became possible after the last change in the execution of
regression tests where the delay:= could be specified in the file_list.txt
files. My impression is that you are reverting to some extend this change, I
consider  this a good idea.

Marc Attinasi

Comment 33

•

24 years ago

The build I reported teh excellent results from has no new ImageLib (it is a bit
old, obviously). HTat explains the frame-state mismatches maybe, but not the
hangs...

Bernd

Comment 34

•

24 years ago

I tried Chris patch, Marc is right (You da man, man). It looks pretty promising,
but I had the feeling that the delay times inserted in the file_list.txt files
are ignored.

Chris Waterson

Assignee

Updated

•

24 years ago

Depends on: 53956

Chris Waterson

Assignee

Updated

•

24 years ago

Depends on: 75198

Chris Waterson

Assignee

Comment 35

•

24 years ago

Ok, things are getting warmer. With patches for above bugs, I'm now hanging 
only on table/bugs/bug11026.html (libpr0n! grr...) I'll try to get that nailed 
tomorrow, so I can get on with my block frame hackery...

Chris Waterson

Assignee

Comment 36

•

24 years ago

Attached patch make regression tests work better — Details — Splinter Review

Chris Waterson

Assignee

Comment 37

•

24 years ago

With the last patch, plus a couple of random fixes from pavlov, I can get the
regression tests to run all the way through, with one false positive in the
block tests and six false positives in the table tests.

I made it so that we ignore the NS_FRAME_EXTERNAL_REFERENCE frame state flag: I
think these differences occur if viewer is the top window and you move the mouse
or something. The only places that the bit gets set are 1) in the caret code and
2) in the ESM.

I also made it so that we ignore the scrollbars, which were another source of
frequent false positives.

karnaze (gone)

Comment 38

•

24 years ago

Waterson, this is great! Please check it in. I've been having problems with
linux patch files. r=karnaze.

Does this also mean that we can remove the delay:=n workarounds and speed things up?

Chris Waterson

Assignee

Comment 39

•

24 years ago

I definitely want to get this in, but this'll make the tests unusable until the 
dependent bugs are fixed. Let me put some pressure on getting the fixes in for 
the dependents...

Chris Waterson

Assignee

Comment 40

•

24 years ago

marc, I'm gonna steal this one. Hope you don't mind. With fixes for bug 75576, I
get zero false positives from bbox differences! Woo hoo! (Bug 75576 fixes some
cases where image requests weren't properly added to the load group.)

Assignee: attinasi → waterson

Status: ASSIGNED → NEW

Depends on: 75576

Target Milestone: mozilla1.0 → mozilla0.9.1

Chris Waterson

Assignee

Updated

•

24 years ago

Status: NEW → ASSIGNED

Marc Attinasi

Comment 41

•

24 years ago

Chris, I thought you owned it already - if not in name, in spirit. Looking
forward to this landing - if you need a review, just holler.

Chris Waterson

Assignee

Comment 42

•

23 years ago

Attached patch updated patch — Details — Splinter Review

Chris Waterson

Assignee

Comment 43

•

23 years ago

karnaze, dbaron, or attinasi, could you r= that latest patch?

- nsFrameUtil ignores differences in scrollbars and some (one) of the frame
  state bits

- reworked nsWebCrawler a bit, tried to clean up some of the timing code

- removed code that kills the event queue, as this was causing intermittent
  crashes on shutdown. (might be leaky, I didn't check...)

Bernd

Comment 44

•

23 years ago

Chris, is there a possibility to get rid of the lines:

http://lxr.mozilla.org/seamonkey/source/layout/base/src/nsFrameUtil.cpp#648
http://lxr.mozilla.org/seamonkey/source/layout/base/src/nsFrameUtil.cpp#650

I commented them out a long time ago, dumping the critical nodes is for me
enough information. When this happens on a long testcase you can easily dump
0.5MB of data, which is completely redundant because you have the same
information in the .rgd files.

Chris Waterson

Assignee

Comment 45

•

23 years ago

bernd: good idea. Why don't I add an extra parameter (``noisy regression data'') 
or something like that in case people want the old behavior.

Bernd

Comment 46

•

23 years ago

I would like to 'r' your patch and separate my wish from this, in order to get
your fix into the tree faster ( I have your fix applied already, but maybee
smooth running regression tests are a good argument for others to use them :-))

Chris Waterson

Assignee

Comment 47

•

23 years ago

ok, so dcone is going to be checking in the stuff bernd wanted in bug 78523, and 
jud is going to whack _all_ this stuff pretty hard in bug 15345. (After which, 
I'll need to update these patches.) Adding dependencies; and emailed jud asking 
him to land nsWebCrawler changes ASAP.

Depends on: 15345, 78523

Chris Waterson

Assignee

Comment 48

•

23 years ago

Attached patch updated after jud's landing to change viewer to embed APIs — Details — Splinter Review

Chris Waterson

Assignee

Comment 49

•

23 years ago

Attached patch get stuff working properly on Linux, too — Details — Splinter Review

Chris Waterson

Assignee

Comment 50

•

23 years ago

Ok, stuff is looking good on both Windows and Linux, so I'd like to solicit a
final review. Here is a summary of the changes:

- Remove ``-d'' option, hard-code delay to 200msec.
- Collect regression data 200msec after page load has completed
- Add support for printing regression tests in Linux harness (still doesn't
  appear to work; looks like it actually wants to print!)
- Fix some bustage that jud introduced; specifically, need OnStateChange()
  to ignore any state changes for any web-progress but topmost.
- Remove code that was killing the event queue on shutdown; I don't think it's
  necessary, and it was causing periodic crashes.
- Set XPCOM_DEBUG_BREAK=warn on Win32, so that assertions turn into printf()'s
  instead of dialog boxes
- Sprinkle some nsCOMPtr's around
- Make nsFrameUtil ignore differences in scroll frames, or differences in
  some of the frame-state flags.

karnaze (gone)

Comment 51

•

23 years ago

r=karnaze. bug 78523 should have fixed the printing problems. I have already 
checked in removal of delay:= in table file lists.

Marc Attinasi

Comment 52

•

23 years ago

I still have to pull and patch to test this all out, but from inspecting the
patches I say 'super'! sr=attinasi

Chris Waterson

Assignee

Updated

•

23 years ago

Keywords: patch

rpotts (gone)

Comment 53

•

23 years ago

r=rpotts.  The changes to nsWebCrawler::OnStateChange(...) look fine to me.

I noticed that you are only handling the success case for document completion.
Do you need to do anything in the case that the load fails (with
NS_BINDING_ABORTED for example)?  This could be a fairly common situation with
meta-refresh...

-- rick

Chris Waterson

Assignee

Comment 54

•

23 years ago

Changes checked in.

Status: ASSIGNED → RESOLVED

Closed: 23 years ago

Resolution: --- → FIXED

Chris Petersen

Comment 55

•

23 years ago

Marking verified per last comments

Status: RESOLVED → VERIFIED

You need to log in before you can comment on or make changes to this bug.

part 1 of the fix, for the webshell directory 24 years ago buster 3.68 KB, patch		Details \| Diff \| Splinter Review
part 2 of the fix, for the layout directory 24 years ago buster 2.03 KB, patch		Details \| Diff \| Splinter Review
bbox patch 24 years ago Bernd 1.14 KB, patch		Details \| Diff \| Splinter Review
couldn't we try something like this patch 24 years ago Bernd 2.70 KB, patch		Details \| Diff \| Splinter Review
patch cosmetics 24 years ago Bernd 2.69 KB, patch		Details \| Diff \| Splinter Review
work-in-progress, always use a timer, even if delay == 0 24 years ago Chris Waterson 9.15 KB, patch		Details \| Diff \| Splinter Review
make regression tests work better 24 years ago Chris Waterson 24.81 KB, patch		Details \| Diff \| Splinter Review
updated patch 23 years ago Chris Waterson 13.72 KB, patch		Details \| Diff \| Splinter Review
updated after jud's landing to change viewer to embed APIs 23 years ago Chris Waterson 38.81 KB, patch		Details \| Diff \| Splinter Review
get stuff working properly on Linux, too 23 years ago Chris Waterson 41.46 KB, patch		Details \| Diff \| Splinter Review