Closed
Bug 62589
Opened 24 years ago
Closed 23 years ago
the layout regression tests need improvement
Categories
(Core :: Layout, defect, P1)
Tracking
()
VERIFIED
FIXED
mozilla0.9.1
People
(Reporter: buster, Assigned: waterson)
References
Details
Attachments
(10 files)
3.68 KB,
patch
|
Details | Diff | Splinter Review | |
2.03 KB,
patch
|
Details | Diff | Splinter Review | |
1.14 KB,
patch
|
Details | Diff | Splinter Review | |
2.70 KB,
patch
|
Details | Diff | Splinter Review | |
2.69 KB,
patch
|
Details | Diff | Splinter Review | |
9.15 KB,
patch
|
Details | Diff | Splinter Review | |
24.81 KB,
patch
|
Details | Diff | Splinter Review | |
13.72 KB,
patch
|
Details | Diff | Splinter Review | |
38.81 KB,
patch
|
Details | Diff | Splinter Review | |
41.46 KB,
patch
|
Details | Diff | Splinter Review |
currently, 1) some tests fail in unpredictable ways 2) it's hard to decipher the regression test output 3) it would be hard to use the regression tests as part of an automated process, as in tinderbox
the problems can be addressed in this way: 1) separate out the tests into those that need delays because of asynch page loading (image loading in particular) Remove the delay for files that don't need it (greatly speeding up the process.) I've done this for block, and chris is doing it for table. Since this part of the regression test is not part of the normal build, these changes do not need to be super-reviewed. Chris and I will review each other's changes to the data files. 2) I've changed the regression test output: a) to mark any discrepancy as a failure, b) to output more data about every failure (in particular, which frame was being tested at the time of the failure), c) to output in the format "regression test [passed|failed] <filename>" to make searching for these strings easier d) to output a summary after every url list that includes number of url's loaded and number that failed. 3) the performance speed up from (1) plus the grep-friendly strings from (2) should make it easy to hook up regression testing to a tinderbox. Any other suggestions?
Status: NEW → ASSIGNED
Priority: P3 → P1
Comment 2•24 years ago
|
||
Chris K. and I were discussing how the delay between the loading of URLs happens at the wrong time. We need the delay to happen after the load but before the regerssion data is dumped, whereas currently the delay happens after the regression data is dumped, and before the next URL is loaded. Moving the delay to before the regression data is dumped would allow for the extra time we want to let the layout complete. I believe that Chris has changes for this in his tree, so we should consider taking those changes as well. As it is now, I cannot see how the delay makes any difference at all (although I've heard that it does). Another problem (on Win32 platforms at least) is that assertions stop the run until a dialog is dismissed. Is there a way to make assertions fail the URL but continue the tests?
attached the diffs for nsWebCrawler.cpp and nsFrameUtil.cpp. chris k or marc, please review chris w, please sr. to marc's point, chris and I had discussed this issue earlier today. chris can either send me his changes or fold them in himself.
Comment 6•24 years ago
|
||
r=attinasi with small comments for each patch: 1) in part 1: '1024-1' should probably be 'sizeof(dirName)-1' (can you say 'nit'?) Also, the case in nsWebCrawler::PerformRegressionTest where you now return NS_ERROR_FAILURE (nsIFrameUtil not created) might deserve an assertion too. 2) in part 2: The node-dumps you put in for the cases where the state and bbox don't match should probably be put in for the case where the style data does not match too. With or without the above, the changes look good.
Assignee | ||
Comment 7•24 years ago
|
||
do what marc said, sr=waterson
Comment 8•24 years ago
|
||
Marking donttest to get this off the managerial radar. If this does in fact need a reduced testcase, please remove the donttest keyword and explain what testcase you would like. Thanks!
Keywords: donttest
-Will that patch fix the 'bogus bbox differences' karnaze mentioned on 2000-10-10 in n.p.m.layout? - is there a way to get rid of path's like s:\mozilla\dist\win32_d.obj\bin\viewer I don't have a path s:\ and in order to run the tests locally (outside netscape) I have to edit all the scripts. - This is especially true for the filelist.txt - Another improvement would be the possibility to dump the content of the verify run into one big file, that could be attached to a bug as a proof, that the regression tests have been successful. Currently there is only one assert in the table regression test's, much more annoying are the assertions at the start up and shutdown.
Reporter | ||
Comment 10•24 years ago
|
||
>>Will that patch fix the 'bogus bbox differences' karnaze mentioned... most, if not all, bbox false hits will be fixed by a combination of better testing code and some shuffling around of the test cases themselves >>is there a way to get rid of path's like s:/ .... sure, we could use local paths. I'll look into that separately >>Another improvement would be the possibility to dump the content of the verify >>run into one big file, that could be attached to a bug as a proof, that the >>regression tests have been successful. I'm not sure I want to ask people to attach big output files to bugs as a matter of course. If we didn't trust someone to run the regression tests before checking in, they wouldn't have gotten approval to check in from me or chris.
Comment 11•24 years ago
|
||
It would be good, to be able to run the regression tests as a single run, without leaving and restarting the viewer, by this one would have bloat statistics on a larger sample ( and get rid of the start and shutdown assertions)
Reporter | ||
Comment 12•24 years ago
|
||
moving to mozilla0.9
Comment 14•24 years ago
|
||
Comment 15•24 years ago
|
||
In my oppinion, it would be good if bbox differences would generate a failed result for the corresponding regression test (see patch for a quick hack example).
Comment 17•24 years ago
|
||
Taking this bug so I can get it checked in. Chris, have you looked at the bbox patch attached by bernd? I'll try that out too. It should probably be checked in separately. The first two patches, from buster, have already been reviewed so they appear to be good to go. I'm reapplying and retesting them now just to be sure...
Assignee: buster → attinasi
Status: ASSIGNED → NEW
Comment 18•24 years ago
|
||
my patch is obsolete, it has been fixed as a part of 1.18 checkin from Steve (2001-01-16).
Comment 19•24 years ago
|
||
Accepting - moving out to Mozilla 1.0 as this is ongoing effort.
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9 → mozilla1.0
Comment 20•24 years ago
|
||
Comment 21•24 years ago
|
||
The idea behind the patch is to avoid to load the next url while the reflow has not finished and we are still dumping regression data, may be this will reduce the amount of false positives.
Comment 22•24 years ago
|
||
r=karnaze on attachment #4 [details] [diff] [review].
Comment 23•24 years ago
|
||
sr=attinasi on attachID=28738
Comment 24•24 years ago
|
||
Assignee | ||
Comment 25•24 years ago
|
||
bernd, could you post unified diffs from now on (cvs diff -u)? Also, I don't understand why we'd startup dumping layout regression data for a *new* page before the *old* page has completed. This all happens on the main thread.
Assignee | ||
Comment 26•24 years ago
|
||
Assignee | ||
Comment 27•24 years ago
|
||
With the above patch, I got four false positives (ten frame state mismatches) running through the table regression tests. That said, I only made it through 563 of the tests, because it hangs sometimes. (That said, without the patch, I only completed 460 of the tests due to crashes, and had 2,645 frame state mismatches!) The idea is to *always* do regression on a timer callback, even if the timeout is zero. This lets all layout's timers clear before we try to validate.
Assignee | ||
Comment 28•24 years ago
|
||
...and it looks like the hang is caused by a missing image file; e.g., in mozilla/layout/html/tests/table/core/captions3.html we've got <img src="../../images/vh40" alt="Valid HTML 4.0!" height="31" width="88"> Let me see if I can fix that.
Comment 29•24 years ago
|
||
Chris, I cannot tell what your patch is doing (yet), but the results sure sound amazing! I'm applying your work-in-progress now to see for myself (and so I can see the changes in context).
Comment 30•24 years ago
|
||
Chris, I just ran your patch (attachID 29257) with the same engine for the baseline and the verify and I got 0 (that is Z E R O) false positives over the table regression tests! Every test passed, no bbox or framestate errors - This, I like very, VERY much! You da man, man!
Assignee | ||
Comment 31•24 years ago
|
||
Well, I'm not having quite so much luck :-(. Running through the table tests, I see three hangs, two from places where we document.write() into a table cell. We also hang in the block tests this way, too. I'm looking at that. I still see about a dozen frame state mismatches. The flags that are mismatches are consistently either NS_FRAME_HAS_LOADED_IMAGES or NS_FRAME_EXTERNAL_REFERENCE. I've squelched a bunch of bogus false positives by whacking nsFrameUtils.cpp to ignore frame state differences in ScrollbarFrame. (The slider frame inside the scrollbar consistently generates dozens of NS_FRAME_HAS_LOADED_IMAGES mismatches.)
Comment 32•24 years ago
|
||
>Also, I don't understand why we'd startup dumping layout regression data for a
>*new* page before the *old* page has completed.
I have seen this several times, the usual indicator is that the page does not
make it to the screen, I have seen that a large page was unchanged on the screen
and in the status bar the URL's changed. In the previous alogorithm this would
happen under the following conditions:
LoadNextURL is executed and starts a timer
with the OnStart observer we dump the regression data for the previous URL, and
may be this takes a longer time than the we specified for the timer. Under these
conditions the LoadNextURL routine will be executed before the URL made it to
the screen.
This scenario became possible after the last change in the execution of
regression tests where the delay:= could be specified in the file_list.txt
files. My impression is that you are reverting to some extend this change, I
consider this a good idea.
Comment 33•24 years ago
|
||
The build I reported teh excellent results from has no new ImageLib (it is a bit old, obviously). HTat explains the frame-state mismatches maybe, but not the hangs...
Comment 34•24 years ago
|
||
I tried Chris patch, Marc is right (You da man, man). It looks pretty promising, but I had the feeling that the delay times inserted in the file_list.txt files are ignored.
Assignee | ||
Comment 35•24 years ago
|
||
Ok, things are getting warmer. With patches for above bugs, I'm now hanging only on table/bugs/bug11026.html (libpr0n! grr...) I'll try to get that nailed tomorrow, so I can get on with my block frame hackery...
Assignee | ||
Comment 36•24 years ago
|
||
Assignee | ||
Comment 37•24 years ago
|
||
With the last patch, plus a couple of random fixes from pavlov, I can get the regression tests to run all the way through, with one false positive in the block tests and six false positives in the table tests. I made it so that we ignore the NS_FRAME_EXTERNAL_REFERENCE frame state flag: I think these differences occur if viewer is the top window and you move the mouse or something. The only places that the bit gets set are 1) in the caret code and 2) in the ESM. I also made it so that we ignore the scrollbars, which were another source of frequent false positives.
Comment 38•24 years ago
|
||
Waterson, this is great! Please check it in. I've been having problems with linux patch files. r=karnaze. Does this also mean that we can remove the delay:=n workarounds and speed things up?
Assignee | ||
Comment 39•24 years ago
|
||
I definitely want to get this in, but this'll make the tests unusable until the dependent bugs are fixed. Let me put some pressure on getting the fixes in for the dependents...
Assignee | ||
Comment 40•24 years ago
|
||
marc, I'm gonna steal this one. Hope you don't mind. With fixes for bug 75576, I get zero false positives from bbox differences! Woo hoo! (Bug 75576 fixes some cases where image requests weren't properly added to the load group.)
Assignee: attinasi → waterson
Status: ASSIGNED → NEW
Depends on: 75576
Target Milestone: mozilla1.0 → mozilla0.9.1
Assignee | ||
Updated•24 years ago
|
Status: NEW → ASSIGNED
Comment 41•24 years ago
|
||
Chris, I thought you owned it already - if not in name, in spirit. Looking forward to this landing - if you need a review, just holler.
Assignee | ||
Comment 42•23 years ago
|
||
Assignee | ||
Comment 43•23 years ago
|
||
karnaze, dbaron, or attinasi, could you r= that latest patch? - nsFrameUtil ignores differences in scrollbars and some (one) of the frame state bits - reworked nsWebCrawler a bit, tried to clean up some of the timing code - removed code that kills the event queue, as this was causing intermittent crashes on shutdown. (might be leaky, I didn't check...)
Comment 44•23 years ago
|
||
Chris, is there a possibility to get rid of the lines: http://lxr.mozilla.org/seamonkey/source/layout/base/src/nsFrameUtil.cpp#648 http://lxr.mozilla.org/seamonkey/source/layout/base/src/nsFrameUtil.cpp#650 I commented them out a long time ago, dumping the critical nodes is for me enough information. When this happens on a long testcase you can easily dump 0.5MB of data, which is completely redundant because you have the same information in the .rgd files.
Assignee | ||
Comment 45•23 years ago
|
||
bernd: good idea. Why don't I add an extra parameter (``noisy regression data'') or something like that in case people want the old behavior.
Comment 46•23 years ago
|
||
I would like to 'r' your patch and separate my wish from this, in order to get your fix into the tree faster ( I have your fix applied already, but maybee smooth running regression tests are a good argument for others to use them :-))
Assignee | ||
Comment 47•23 years ago
|
||
ok, so dcone is going to be checking in the stuff bernd wanted in bug 78523, and jud is going to whack _all_ this stuff pretty hard in bug 15345. (After which, I'll need to update these patches.) Adding dependencies; and emailed jud asking him to land nsWebCrawler changes ASAP.
Assignee | ||
Comment 48•23 years ago
|
||
Assignee | ||
Comment 49•23 years ago
|
||
Assignee | ||
Comment 50•23 years ago
|
||
Ok, stuff is looking good on both Windows and Linux, so I'd like to solicit a final review. Here is a summary of the changes: - Remove ``-d'' option, hard-code delay to 200msec. - Collect regression data 200msec after page load has completed - Add support for printing regression tests in Linux harness (still doesn't appear to work; looks like it actually wants to print!) - Fix some bustage that jud introduced; specifically, need OnStateChange() to ignore any state changes for any web-progress but topmost. - Remove code that was killing the event queue on shutdown; I don't think it's necessary, and it was causing periodic crashes. - Set XPCOM_DEBUG_BREAK=warn on Win32, so that assertions turn into printf()'s instead of dialog boxes - Sprinkle some nsCOMPtr's around - Make nsFrameUtil ignore differences in scroll frames, or differences in some of the frame-state flags.
Comment 51•23 years ago
|
||
r=karnaze. bug 78523 should have fixed the printing problems. I have already checked in removal of delay:= in table file lists.
Comment 52•23 years ago
|
||
I still have to pull and patch to test this all out, but from inspecting the patches I say 'super'! sr=attinasi
Comment 53•23 years ago
|
||
r=rpotts. The changes to nsWebCrawler::OnStateChange(...) look fine to me. I noticed that you are only handling the success case for document completion. Do you need to do anything in the case that the load fails (with NS_BINDING_ABORTED for example)? This could be a fairly common situation with meta-refresh... -- rick
Assignee | ||
Comment 54•23 years ago
|
||
Changes checked in.
Status: ASSIGNED → RESOLVED
Closed: 23 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•