Closed Bug 379233 Opened 17 years ago Closed 14 years ago

Pageload tests should flush out layout before timing load end

Categories

(Testing :: General, defect, P5)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: bzbarsky, Assigned: anodelman)

References

Details

(Whiteboard: [talos][waiting for 1.9.0 EOL])

Attachments

(2 files, 5 obsolete files)

Attached patch Something like this (obsolete) — Splinter Review
See bug 379093.  I don't really think we want the pageload tests showing a time drop when bugs like that are introduced.

Unfortunately, I don't have access to the Tp code, or even the code that's used for Tp2 on the tinderboxen.  Attached patch is against the "public" Tp2 version that's in the tree.
I think that this can be integrated with the new perf tests - but I'm unsure what the standard tinderboxes are currently running.
Alice, I just cced you for the new perf tests.  rhelmer's my target for the existing ones.  ;)
FWIW, the only difference between the Tp2 that the tinderboxes are running and the one in the tree is the list of pages in the |pages| variable (which would be moved to an external file if we land the patch in bug 359994).
Yeah, I know.  But the point is that if I change the in-tree version the tinderboxes won't pick it up automatically...
Can that check be done after the timer is stopped?  I don't want this to effect the pageload time for correct pages.  Otherwise I think that I can get the suggested fix checked in with some other cycler improvements that I've been working on.
Which "that check"?  And which timer?

What do you mean by "correct pages"?

And what is your definition of "pageload time"?
I'm referring to:

var offsetWidth = frames["content"].document.documentElement.offsetWidth;

From the code chunk you provided it compares start and end time after this call and calculates the page load time from that.  I'm wondering if we can do the offsetWidth test after collecting the loadtime - just in case the call to offsetWidth causes a slow down in the page load time collected.

My concern being that if this gets integrated into the cycler code that it could cause an increase in the numbers.  I guess that that really isn't such a big deal, as long as it is a consistent increase across the board, but I'm wary of affecting the page load times for successfully loaded pages.
Again, what do you mean by "page load time"?

On trunk, we guarantee that layout is done by the time onload fires.  In the past, we had bugs so that we didn't guarantee it.  We want to do apple-to-apple comparisons of perf on trunk and branch.  Which means that the test itself needs to guarantee that layout is done before the timer stops.  The line of code in question does that.

So the whole point of the change is to get the offsetWidth before we stop timing.  On trunk I expect it to have 0 effect on the numbers, unless something is _very_ wrong.  On branches, depending on the bugginess of them, it might have an effect, but that effect is exactly what we want.
Component: Testing → Talos
Product: Core → Testing
QA Contact: testing → talos
Version: Trunk → unspecified
Moving to correct talos category.
Component: Talos → Release Engineering: Talos
Product: Testing → mozilla.org
Version: unspecified → other
This has been sitting for a long time - moving to Future.
Component: Release Engineering: Talos → Release Engineering: Future
QA Contact: talos → release
Honestly, I'm not sure why "it's taken us a long time to fix this" is good justification for "we should take even longer".  :(  It's an attitude I hate for our browser product too...
Is that patch still current?  

If so, this would need baking time on stage to see what happens to perf numbers and if it doesn't fail (the patch there seems to suggest that there is a fail state that could happen semi-frequently, but I could be wrong).
The old patch deals with the old tp infrastructure, we only do tp tests using the pageloader extension now.
Attachment #263234 - Attachment is obsolete: true
Attachment #353750 - Flags: review?(vladimir)
Alice, thanks for moving the fix over to the new infrastructure.  That looks great!  If we ever need to do non-HTML here, it's not hard to modify things to use something like getBoundingClientRect, which will work for all languages.
With that patch it will just ignore any error and keep testing - do you want that to be a real fail state that stops the test pages from cycling?
If possible, that would be great.
Assignee: nobody → anodelman
Attachment #353750 - Attachment is obsolete: true
Attachment #353753 - Flags: review?(vladimir)
Attachment #353750 - Flags: review?(vladimir)
Component: Release Engineering: Future → Release Engineering: Talos
OS: Linux → All
Priority: -- → P3
Hardware: PC → All
Hrm, the patch looks fine, but I think that we already have some non-html tests in the pageloader, e.g. Tsvg.  Those tests will break, no?
Hmm.. actually, offsetWidth would just return undefined, and no breakage.  But yeah, doing getBoundingClientRect might be better.
What would the less-breaky check look like?  I'd be happy to update the patch with it.
This should work for all our branches (HTML-only on 1.8, works for everything on 1.9 and newer):

  var docElem = frames["content"].document.documentElement;
  var width;
  if ("getBoundingClientRect" in docElem) {
    width = docElem.getBoundingClientRect().width;
  } else if ("offsetWidth" in docElem) {
    width = docElem.offsetWidth;
  }

No need for the try/catch + fail if we're using this approach, I think.
Updated patch with bz's suggested fix.
Attachment #353753 - Attachment is obsolete: true
Attachment #353901 - Flags: review?(vladimir)
Attachment #353753 - Flags: review?(vladimir)
Comment on attachment 353901 [details] [diff] [review]
[Backed out]check offset before recording page load complete (attempt #3)

Yep, looks good!
Comment on attachment 353901 [details] [diff] [review]
[Backed out]check offset before recording page load complete (attempt #3)

Checking in pageloader.js;
/cvsroot/mozilla/layout/tools/pageloader/pageloader.js,v  <--  pageloader.js
new revision: 1.15; previous revision: 1.14
done
Attachment #353901 - Attachment description: check offset before recording page load complete (attempt #3) → [Checked in]check offset before recording page load complete (attempt #3)
Attachment #353901 - Flags: checked‑in+
Attachment #353901 - Attachment description: [Checked in]check offset before recording page load complete (attempt #3) → [Backed out]check offset before recording page load complete (attempt #3)
Attachment #353901 - Flags: checked‑in+ → checked‑in-
Comment on attachment 353901 [details] [diff] [review]
[Backed out]check offset before recording page load complete (attempt #3)

This patch is causing frozen browsers on talos boxes, backed out.
(In reply to comment #25)
> (From update of attachment 353901 [details] [diff] [review])
> This patch is causing frozen browsers on talos boxes, backed out.

urr... what next to do here?
(In reply to comment #26)
> (In reply to comment #25)
> > (From update of attachment 353901 [details] [diff] [review] [details])
> > This patch is causing frozen browsers on talos boxes, backed out.
> 
> urr... what next to do here?

bz; want to try this again or should we just close?
Assignee: anodelman → nobody
Component: Release Engineering: Talos → Release Engineering: Future
Do we have any sort of output here?  As in, was there an exception thrown?  Or was this an actual freeze (in the sense of process not responding to input)?

I do still think we need to fix this bug.

I'd also love any sort of response to the talos issue I recently raised in .platform, which is related to this bug.
It froze as in did not advance to the next page in the manifest - eventually Talos notices the lack of inactivity and kills off the browser and reports as 'frozen'.
Can we try wrapping a try/catch around this code with a dump() of the exception in the catch clause?  It sounds like an exception was thrown by the added code for some reason...
I think that what we need is somebody to be assigned this bug so that the code can be tested locally to ensure that it is working correctly.  I'd like to pick it up, but I don't think that I have time in my schedule for it right now.

What sort of priority do we want to put on this?
P3 (as now) is probably about right, if my opinion matters.
Pretty much the same as before, but we were checking the offset of the wrong window.  Specified now to check browserWindow - the window that is actually cycling through pages.  Tested on my home machine and it actually advances through the pages instead of getting frozen.

Would still require a downtime for landing to ensure that a) there's nothing that I'm missing and b) in case it alters recorded tp numbers.
Assignee: nobody → anodelman
Attachment #353901 - Attachment is obsolete: true
Attachment #362168 - Flags: review?(vladimir)
Assignee: anodelman → nobody
Component: Release Engineering: Future → Release Engineering: Talos
Priority: P3 → P2
Assignee: nobody → anodelman
Comment on attachment 362168 [details] [diff] [review]
check offset before recording page load complete (attempt #4)

Checking in pageloader.js;
/cvsroot/mozilla/layout/tools/pageloader/pageloader.js,v  <--  pageloader.js
new revision: 1.17; previous revision: 1.16
done
Attachment #362168 - Flags: checked‑in+
Comment on attachment 362168 [details] [diff] [review]
check offset before recording page load complete (attempt #4)

Backed out, froze up Firefox3.0 talos boxes.
Attachment #362168 - Flags: checked‑in+ → checked‑in-
Assignee: anodelman → nobody
Component: Release Engineering: Talos → Release Engineering
Assignee: nobody → anodelman
Assignee: anodelman → nobody
Assignee: nobody → anodelman
Moving to future till either I have time to work on this or a patch presents itself.
Assignee: anodelman → nobody
Component: Release Engineering → Release Engineering: Future
Mass move of bugs from Release Engineering:Future -> Release Engineering. See
http://coop.deadsquid.com/2010/02/kiss-the-future-goodbye/ for more details.
Component: Release Engineering: Future → Release Engineering
found in triage. 

alice is this still needed?
This is still needed to do useful cross-branch comparisons, yes.

Also if we ever decide to change our core behavior to not flush right before onload, of course.
Since this only showed problems on 1.9.0 and 1.9.0 is end-of-life-ing we could attempt a reland after that branch is discontinued... assuming that it doesn't show problems on any of our new branches.
Whiteboard: [talos][waiting for 1.9.0 EOL]
Found during triage. Adding depbug, so we can investigate after 1.9.0 stops.
Depends on: 554226
Priority: P3 → P5
Component: Release Engineering → General
Product: mozilla.org → Testing
QA Contact: release → general
Version: other → unspecified
This is about changes in a talos suite, not the automation of a suite, so moving this bug to Testing:General as requested by bmoss, ctalbert.
> Also if we ever decide to change our core behavior to not flush right before
> onload, of course.

Which we want to do.  See bug 572604.  So at this point this is blocking core product changes.
Blocks: 572604
I wasn't willing to hold said core changes for this bug given its history; they needed to happen for the beta.  So starting tonight, Tp4 on m-c is ... probably less useful than it should be.  :(
We no longer test 1.9.0 with talos (bug 574062), so this should be landable now.
To get this to roll out:

- need a patch against bsmedberg's pageloader as bundle (which is what talos currently uses to install the pageloader)
- quick staging run
- downtime
Testing patch on talos-staging-master02 on mozilla-central, mozilla-1.9.1 and mozilla-1.9.2.
I'm seeing green runs on mozilla-central, 1.9.1 and 1.9.2.  No freezing
observed, which was what caused the back out of this patch last time around. 
There were some red runs, but they were due to crashes or other non-related
failures.
I took a closer look at the logs and the patch breaks tgfx - I'll look into this and see if it can be fixed.
Previous patch didn't take into account nochrome tests which do not create a secondary browserWindow.  New code handle both chrome/nochrome case.

Still don't have an error state here, but just running the check and not freezing would be a win considering the age of this bug.
Attachment #464630 - Attachment is obsolete: true
Attachment #471322 - Flags: review?(vladimir)
Comment on attachment 471322 [details] [diff] [review]
re-based patch based upon bsmedberg's pageloader bundle + fix freezing state

http://hg.mozilla.org/build/pageloader/rev/b4da709724e1
Also updated http://build.mozilla.org/talos/xpis/pageloader.xpi
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Assignee: nobody → anodelman
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: