Sunspider 0.9.1 never paints its subframe

RESOLVED FIXED in mozilla5

Status

()

Core
Layout
P1
normal
RESOLVED FIXED
7 years ago
4 years ago

People

(Reporter: bz, Assigned: bz)

Tracking

Trunk
mozilla5
x86
Windows 7
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite ?

Firefox Tracking Flags

(Not tracked)

Details

(URL)

Attachments

(1 attachment)

BUILD: Current trunk, etc

STEPS TO REPRODUCE: Click url in url field

EXPECTED RESULTS: Test names and times whizzing by in subframe.

ACTUAL RESULTS: Blank subframe for a while, then the score.

DETAILS:

In 3.6, we paint the test names but not the results in the subframe.  The reason for that is that after the test sets the innerHTML to show the result it immediately calls a function that sets a 10ms timer that destroys that subframe.  Apparently we don't get around to painting during those 10ms; not quite sure why.  The fact that both the reflow and the paint would happen async (so need two trips through the event loop) might have something to do with this.

Then in this range: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=120667a01fd2&tochange=a43e2f7eda8f

we stop painting completely.  I played with it a bit, and looks like we just don't unsuppress painting on the subframe (except via the 250ms timer).  I would guess that bug 526394 is responsible, but I don't see why we would have unsuppressed before that bug and don't now.
Aha!  So one reason we don't unsuppress there is that the test harness part that writes out a testcase looks like this:

        testFrame.contentDocument.open();
        testFrame.contentDocument.write(testContents[testIndex]);
        testFrame.contentDocument.close;

Note the lack of parens on that "close" bit.  That means that as far as we're concerned the document is still loading and we don't show it until the 250ms paint suppression timer fires.

I filed https://bugs.webkit.org/show_bug.cgi?id=47045 on that.

And as far as that goes, my bisect is now done.  A few of the changesets in roc's push don't compile, so I get:

Due to skipped revisions, the first bad revision could be any of:
changeset:   37083:4ccff5df452c
user:        Robert O'Callahan <robert@ocallahan.org>
date:        Thu Oct 08 16:01:15 2009 +1300
summary:     Bug 526394. Part 31: Move scroll implementation into nsGfxScrollFrame. r=mats

changeset:   37084:17131f693e28
user:        Robert O'Callahan <robert@ocallahan.org>
date:        Tue Jan 12 10:45:19 2010 +1300
summary:     Bug 526394. Part 31.5: Cleanup ScrolledAreaEvent stuff to be simpler. In particular we don't need to store the scrolled area, we can just compute it when we fire the event. r=mats

changeset:   37085:4384150589e0
user:        Robert O'Callahan <robert@ocallahan.org>
date:        Tue Jan 12 10:45:19 2010 +1300
summary:     Bug 526394. Part 32: Remove code that only existed so that scrollframes could have views. r=mats

changeset:   37086:2f7d76044ee8
user:        Robert O'Callahan <robert@ocallahan.org>
date:        Tue Jan 12 10:45:20 2010 +1300
summary:     Bug 526394. Part 33: Remove nsScrollPortView etc. r=mats
I still don't know why any of that would have affected behavior here, though.
Assignee: nobody → bzbarsky
Adding the missing parens for "close" fixes this for me, by the way, except in my build which has the patch for bug 598482 (because that makes invalidates come off the refresh driver and onload doesn't flush them, so we don't do those invalidates until the iframe has been destroyed).

For the scroll view stuff affecting this, it might be that painting was still suppressed on that subframe but we painted the view for some reason.  Or maybe painting got unsuppressed expeditiously from some code that got removed.  I'll check that on Monday
Note that we didn't honour paint suppression in subdocuments until recently (bug 591435).
Interesting.  So maybe we used to end up with a repaint for the toplevel document for some reason before roc's changes but not after them?

In any case, I'm tempted to say that documents whose load started with document.open() should unsuppress immediately in StartLayout.  Any async behavior here won't be due to network lag but to simple failure to call close(), or will actually be desired by the web page (e.g. multiple writes off of timeouts).
Resizing a view used to trigger (I fixed it in bug 587542 recently) an invalidation that ignores painting suppression (regular frame based invalidations have respected painting suppression for a while), so with the removing of views from scroll frames we would expect to get less ignored invalidations, which could result in less painting.
(In reply to comment #5)
> so with the
> removing of views from scroll frames we would expect to get less ignored
> invalidations, which could result in less painting.

I should have said "get less _non_-ignored invalidations"
Ah, that's very plausible.  OK, let's assume that was what happened.

Anyone object to my proposed solution from comment 4?
Sounds fine to me.
Turns out, that's not good enough, because bug 596942 made inline scripts not wait on stylesheets, but layout _does_ still wait on stylesheets.  That means that the subframe is still waiting on the sheet when the test script runs, and as soon as it runs it posts a 10ms timeout to run the next test.  So as long as the sheet load takes more than 10ms (looking into why that might be now, btw), we'll lose.
Blocks: 596942
Target Milestone: --- → mozilla2.0b7
In an opt build, the (cached!) stylesheet load sometimes, but only rarely, manages to win the race against the 10ms continue-the-test timer.  If we cached parsed sheets across documents that would help, of course, but I really don't like that race in general.

Of course the only reason we have layout blocking on the stylesheet is that we fixed bug 571298.  Perhaps we should back that patch out....
Blocks: 571298
Ok, so backing out bug 571298 fixes this bug, obviously.  There's still some flicker if we manage to pull up the sheet within 10ms, but very little, on my machine.  I think we should do that, since the slight flicker is better than what we have now.

I _do_ think we should look into why we're failing to load a file from our HTTP cache (on a fast machine with an SSD) in 10ms....  Oh, and I tried disabling the delay line stuff on timers, so it really was 10ms.
Created attachment 481584 [details] [diff] [review]
Like so
Attachment #481584 - Flags: review?(jst)
Whiteboard: [need review]
So you want us to have a flash of unstyled content, intentionally?
Yep.

The other option for fixing this bug is to figure out how to guarantee that the stylesheet loads in under 10ms no matter what; that's going to mean an in-memory cache, because we just can't rely on disks doing that for us.

Note that this is also the behavior we had in 3.6; the FOUC-avoidance here has only been on trunk since June of this year.  So I doubt it would seriously break anything, though I agree it's unfortunate.
Or, we could back out bug 596942...  but that might have its own issues.
Yeah, we need a memory-based cache, or rather to use the one we have.  That everything has to go via disk is unsane, but it's unsane for another bug.
For what it's worth, I tried forcing HTTP to always use only the memory cache, leave in the patch for bug 571298, and doing what I suggested in comment 4.  The combination of those gives me approximately the Opera behavior on this testcase: the text shows up, and doesn't FOUC, but does flicker in and out due presumably to the stylesheet load still having _some_ latency and us sometimes painting while it's loading.  We could obviously reduce the latency more if we had a cache like imagelib for stylesheets (object cache with HTTP semantics).  I wonder whether it's worth making it possible to do that....

Updated

7 years ago
OS: Mac OS X → Windows 7

Updated

7 years ago
Attachment #481584 - Flags: review?(jst) → review+
Duplicate of this bug: 604408
Blocks: 604408

Updated

7 years ago
Duplicate of this bug: 604408

Comment 20

7 years ago
It seems that by opening Tab Candy, Minefield starts painting the subframe.

Comment 21

7 years ago
(In reply to comment #20)
> It seems that by opening Tab Candy, Minefield starts painting the subframe.

Yes I can confirm.
Mozilla/5.0 (Windows NT 6.1; rv:2.0b8pre) Gecko/20101104 Firefox/4.0b8pre
> It seems that by opening Tab Candy, Minefield starts painting the subframe.

Yes, and that seems really broken.  I filed bug 609685.
Depends on: 612632
Given the data in bug 612632, I think there's nothing we can do about that flicker short of not painting precisely at those times when we're still waiting on the stylesheet...

Right now, what happen is that we insert the new iframe into the DOM; this starts a refresh driver timer.  Then we run the script in the iframe; this is typically order of 20ms, so the refresh driver fires right after that, lays out the new iframe, and triggers an invalidate, which causes a paint "asap", which is before the stylesheet (which needs at least two trips through the event loop as things stand) can possibly load no matter which cache back end is in use.  Given the run length of the testcases, and the fact that the refresh driver timer starts before the testcases run, we'd probably need the stylesheet load to take 0 trips through the event loop to make sure it's loaded before the refresh driver fires....

Updated

7 years ago
Duplicate of this bug: 621485
After opening panorama once, this bug is not seen for the rest of the session.
That is known, see comment 22.
tracking-fennec: --- → ?
Whiteboard: [need review] → [need approval]
Comment on attachment 481584 [details] [diff] [review]
Like so

This patch still applies cleanly.  It basically reverts us to the 3.6 behavior here (except at least on Mac with GL accel the flicker the code was trying to fix originally) seems to be gone anyway.  So I think this is very safe.

Brad says this is causing problems for at least some mobile viewers, and definitely looks bad on desktop, so I think we should just take this patch.
Attachment #481584 - Flags: approval2.0?
Comment on attachment 481584 [details] [diff] [review]
Like so

a- ; drivers got nervous and decided that the potential for unseen regressions outweighed the potential benefit
Attachment #481584 - Flags: approval2.0? → approval2.0-
Whiteboard: [need approval] → [need gk2 ship]
clear fennec-blocking flag.
tracking-fennec: ? → ---
Pushed http://hg.mozilla.org/mozilla-central/rev/3d8f6ccc93db
Status: NEW → RESOLVED
Last Resolved: 7 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Whiteboard: [need gk2 ship]
Target Milestone: mozilla2.0b7 → mozilla2.2
You need to log in before you can comment on or make changes to this bug.