Closed Bug 129953 Opened 22 years ago Closed 22 years ago
Regression causing "frame drop-offs" [DHTML / painting-supression]
Just go to http://www.formula-one.nu/dhtml/3d.htm - in recent builds major frame drop-offs happen. Unfortunately this is also present on the 0.9.9 trunk (testing 2002031903) Tested back builds, I find out that the regression started with build 2002021803 (the testcase is fine with build 2002021708)
Markush, please give this to the XP timer owner, pavlov. I'll help. /be
Assignee: jst → pavlov
No problems with 0.9.9 build 2002031104
A very fine testcase from Peter for playing around http://www.formula-one.nu/mozilla/timeouttest.htm You can test with different timeouts and number of elements. You can see that a small timeout causes frame-drop. You can also see that a big timeout but with many elements causes the same problem.
matic: it would help if you could produce something that measured the time between SetTimeout() and the timeout being called, and display the distribution (NOT average) of times. I suspect the problem is that for small timeouts the algorithm occasionally does some subtraction and ends up using an effective 0 timeout, causing the appearance of a jump (the reflows may get coalesced for two updates, even with the current code, before it's drawn).
jesup: I've updated the testcase with timers. (the same url) Do you think it can be used or do you want me to change it?
bug 117061 has some useful stats from an updated testcase attached. I'm duplicating a comment I just made there, since it's really more applicable to this specific bug: This test now shows we have serious problems with the SetTimeout processing, perhaps partly due to the previous attempt to fix timers. With a SetTimeout of 100ms (which is what matic used, the default settings), you shouldn't see a pattern of sometimes 250ms, sometimes 10ms. Now, the 250 may be largely due to bug 129115, but we shouldn't be seeing 10 or 20, and especially not several of them in a row. OS->All
OS: Windows 2000 → All
Hardware: PC → All
http://www.kunstforum-wien.at/index.php?area=newsletter is another really bad example. if you move the mouse-cursor over the symbol for scrolling up or down it's even hard to realise that this is animating as there seems to be no animation. if you move around with the mouse-cursor the animation takes place. as jesup already stated in bug 117061 comment #10 we are having major troubles with firing timers.
this patch makes things a lot better, but still isn't perfect..
Great - hope to see this in 1.0 as this would ease a few things.
this patch makes it up to the app's main loop code to call to turn on idle timers and makes the appshell code for mozilla on windows do it by default. other apps and platforms can make this call and do the make the calls to get this functionality, otherwise they will continue to operate as they used to.
Attachment #77190 - Attachment is obsolete: true
I'll review this soon, by tonight for sure. I'm keen to hear from Markush and others who might try it, if they can (or get an optimized build that contains this patch from pavlov). Also, I think someone should r= -- if that's me, who will sr=? /be
Updated nsAppShell.cpp patch which also processes idle timers when we are modal.
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla1.0
Comment on attachment 77335 [details] [diff] [review] better fix email@example.com
Attachment #77335 - Flags: superreview+
Comment on attachment 78150 [details] [diff] [review] Updated patch to nsAppShell.cpp (previous patch required - nsAppShell changes) firstname.lastname@example.org
Attachment #78150 - Flags: superreview+
Comment on attachment 77335 [details] [diff] [review] better fix email@example.com
Attachment #77335 - Flags: review+
Comment on attachment 78150 [details] [diff] [review] Updated patch to nsAppShell.cpp (previous patch required - nsAppShell changes) firstname.lastname@example.org
Comment on attachment 78150 [details] [diff] [review] Updated patch to nsAppShell.cpp (previous patch required - nsAppShell changes) a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #78150 - Flags: approval+
Comment on attachment 77335 [details] [diff] [review] better fix a=asa (on behalf of drivers) for checkin to the 1.0 trunk
Attachment #77335 - Flags: approval+
rjesup wanted to know if I was cool with these changes, and I am -- pav already showed them to me last week -- I was just waiting for others to r=/sr= and relieve me of contributing to volunteer effect. /be
We have r/sr/a, and brendan is ok with it. Let's commit this now! If you don't have time/etc to commit it, reassign to me and I can do it.
What kind of risk is associated with this patch and how much testing has been done. Can we get some builds to QA for testing? Can you checkin on the trunk after Moz 1.0 branches and let it bake for a day or two
Time to RC1 is short. This patch is not huge and it's well reviewed. We should get this in quickly.
i've landed this on the trunk. Will land on the branch once I get adt approval.......
Let's let it bake on the trunk for a couple of days, have QA take a look at it, and make sure there are no regressions introduced, then come back to us for approval - ADT. Removing adt1.0.0. Pls renominate once it has been on the trunk a couple days, and been tested.
I've tested the patch with 2002041003 NT4. Since I have a slow machine (PII 300), the testcase I used is a modified version of http://www.formula-one.nu/mozilla/timeouttest.htm (http://bugzilla.mozilla.org/showattachment.cgi?attach_id=78592 see bug#136688) Results show very stable numbers. Timeout 500 700 300 --------------------------------------------- Iter 1 501 701 290 " 2 501 701 301 " 3 500 701 290 " 4 501 701 291 " 5 501 701 301 " 6 500 701 300 " 7 501 701 310 " 8 501 701 311 " 9 501 701 300 " 10 500 701 310 ............................................ I'm suprised by times less than the timeout.
I'm going to be out of the office for the next week or so. I would appriciate it if someone could land this on the branch while I'm away.
How much work is needed to get this fixed for non-Windows platforms at this point (per comment 11)?
Comment on attachment 77335 [details] [diff] [review] better fix these changes were backed out due to a performance regression (see bug 136677)
Comment on attachment 78150 [details] [diff] [review] Updated patch to nsAppShell.cpp (previous patch required - nsAppShell changes) marking needs-work since this was backed out (bug 136677)
Final numbers given seem to be 1% increase in Tp. For a significant improvement in DHTML functionality (not just performance) this seems worth it. DHTML functionality was broken by the previous timer checkins to reduce latency; this fixes that regression. If there are Tp issues due to that, we can resolve them or decide that they're unresolvable; we shouldn't back out a significant regression fix for 1% Tp. Note this fixes considerable randomness in timers; without this fix they can fire way earlier than requested or way later. 100ms SetTimeout was taking 10ms to 250ms, apparently randomly. I say check it back in, mirror it for Unix/etc, and then resolve any fallout.
I'm thinking the same as Randell Jesup in comment #31. Just 1%-1.5% increase in pageload for a much wanted bugfix!? I agree focus on performance is important in current Mozilla, but...
I would rather back out the whole new timer implementation that caused many of these regressions. I don't know what good came from it, but since it's probably too late to back it out we _have_ to fix the regressions so that the stuff that depends on timers are unbroken. That it cost performance sucks, but what can we do? The timers as they are right now are just not acceptable. :-(
is bug 136447 a dup of this? It got around extremely much faster after this checkin (allthough still around 5-10 times slower than MSIE)
Speed is always an issue, but in this case I would rather see Mozilla 1.0 do the right thing a little slow than see it doing the wrong thing a little more quickly.
Since these changes can be checked in in 3 independent parts, I'm going to check them in in 3 pieces to see which piece tweaked the Linux builds (since none of them should have). I just checked in the Makefile.in / makefile.win to add the nsITimerManager.idl to the build.
OK, adding the IDL, not surprisingly, didn't increase the time. So I just checked in the rest of the xpcom/threads/ changes. (Still no factory change or Windows appshell change.)
So, as I was just discussing with bbaetz & jrgm on IRC, this code *is* used by linux. The registering of the class occurs on all platforms. The change to nsXPComInit.cpp causes an extra RegisterFactory call from inside of NS_InitXPCOM2. It doesn't seem like a single class should cause a 10ms change in page load...especially since NS_InitXPCOM2 should be called way before pageload occurs but that's the only common connection that I can see.
pavlov is out; to manager.
Assignee: pavlov → gagan
Status: ASSIGNED → NEW
adt1.0.0+ (on ADT's behalf) for checkin into the 1.0 branch. Pls check this in to the branch today. After it is checked in, pls add fixed1.0.0. Once QA has verified it on the branch, then add verified1.0.0.
Comment on attachment 78150 [details] [diff] [review] Updated patch to nsAppShell.cpp (previous patch required - nsAppShell changes) removing needs work and re-approving for the 1.0 branch
Comment on attachment 77335 [details] [diff] [review] better fix removing needs work and re-approving for the 1.0 branch.
doug can you help? :)
Assignee: gagan → dougt
@#$%!! The patch does not apply cleanly. It does not include mac or unix makefiles. Ugh. Attaching what I think the patch is suppose to look like. I am deeply conserned about the nsAppShell.cpp changes as they are not the same on the trunk.
after talking with dbaron, we agreed that the nsAppShell.cpp from the trunk should be landed.
Checking in widget/src/windows/nsAppShell.cpp; /cvsroot/mozilla/widget/src/windows/nsAppShell.cpp,v <-- nsAppShell.cpp new revision: 22.214.171.124; previous revision: 126.96.36.199 done Checking in xpcom/build/nsXPComInit.cpp; /cvsroot/mozilla/xpcom/build/nsXPComInit.cpp,v <-- nsXPComInit.cpp new revision: 188.8.131.52; previous revision: 184.108.40.206 done Checking in xpcom/threads/TimerThread.cpp; /cvsroot/mozilla/xpcom/threads/TimerThread.cpp,v <-- TimerThread.cpp new revision: 220.127.116.11; previous revision: 18.104.22.168 done Checking in xpcom/threads/makefile.win; /cvsroot/mozilla/xpcom/threads/makefile.win,v <-- makefile.win new revision: 22.214.171.124; previous revision: 126.96.36.199 done Checking in xpcom/threads/Makefile.in; /cvsroot/mozilla/xpcom/threads/Makefile.in,v <-- Makefile.in new revision: 188.8.131.52; previous revision: 184.108.40.206 done Checking in xpcom/threads/MANIFEST_IDL; /cvsroot/mozilla/xpcom/threads/MANIFEST_IDL,v <-- MANIFEST_IDL new revision: 220.127.116.11; previous revision: 18.104.22.168 done Checking in xpcom/threads/nsITimerManager.idl; /cvsroot/mozilla/xpcom/threads/nsITimerManager.idl,v <-- nsITimerManager.idl new revision: 22.214.171.124; previous revision: 1.1 done Checking in xpcom/threads/nsTimerImpl.cpp; /cvsroot/mozilla/xpcom/threads/nsTimerImpl.cpp,v <-- nsTimerImpl.cpp new revision: 126.96.36.199; previous revision: 188.8.131.52 done Checking in xpcom/threads/nsTimerImpl.h; /cvsroot/mozilla/xpcom/threads/nsTimerImpl.h,v <-- nsTimerImpl.h new revision: 184.108.40.206; previous revision: 220.127.116.11 done Landed the above files on the branch.
Status: NEW → RESOLVED
Closed: 22 years ago
Resolution: --- → FIXED
Um... are we opening other bugs for non-Windows platforms then?
Sorry but this is definitely fixed only on windows. I've opened bug 137706 for the Linux platform.
adding fised1.0.0 keyord to make sure this gets QA verification.
Reopening because there are problems with the patch! The nsAppShell.cpp patch caused dialog problems. Those that are familar with this patch please review the change which I landed last night. rpotts and rjesup, You reviewed these changes. Any idea what could cause 137796? Note that dbaron and I landed the trunk version of the nsAppShell.cpp.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
fixes the dialog problem
Comment on attachment 79482 [details] [diff] [review] better patch v.2 sr=alecf but I'd sure like a reviewer who's more familiar with this to read this.
Comment on attachment 79482 [details] [diff] [review] better patch v.2 email@example.com - this is the same as the patch I r='d before (2nd part of pav's patch)
Attachment #79482 - Flags: review+
Comment on attachment 79482 [details] [diff] [review] better patch v.2 AlecF sr=='d it already
Attachment #79482 - Flags: superreview+
Comment on attachment 79482 [details] [diff] [review] better patch v.2 i'll approve this; straight backout of this file didn't build for me.
Attachment #79482 - Flags: approval+
Checking in nsAppShell.cpp; /cvsroot/mozilla/widget/src/windows/nsAppShell.cpp,v <-- nsAppShell.cpp new revision: 18.104.22.168; previous revision: 22.214.171.124 done this should do it.
there are residual problems with other dialogs, even with your change (cookie dialogs, for instance, require that i click on ok or cancel in rapid succession to dismiss the dialog). I don't think we can ship rc1 unless we can clean up that behaviour.
This isn't fixed. We probably need to just back it out.
Checking in mozilla/xpcom/threads/nsTimerImpl.h; /cvsroot/mozilla/xpcom/threads/nsTimerImpl.h,v <-- nsTimerImpl.h new revision: 126.96.36.199; previous revision: 188.8.131.52 done Checking in mozilla/xpcom/threads/nsTimerImpl.cpp; /cvsroot/mozilla/xpcom/threads/nsTimerImpl.cpp,v <-- nsTimerImpl.cpp new revision: 184.108.40.206; previous revision: 220.127.116.11 done Checking in mozilla/xpcom/threads/MANIFEST_IDL; /cvsroot/mozilla/xpcom/threads/MANIFEST_IDL,v <-- MANIFEST_IDL new revision: 18.104.22.168; previous revision: 22.214.171.124 done Checking in mozilla/xpcom/threads/Makefile.in; /cvsroot/mozilla/xpcom/threads/Makefile.in,v <-- Makefile.in new revision: 126.96.36.199; previous revision: 188.8.131.52 done Checking in mozilla/xpcom/threads/makefile.win; /cvsroot/mozilla/xpcom/threads/makefile.win,v <-- makefile.win new revision: 184.108.40.206; previous revision: 220.127.116.11 done Checking in mozilla/xpcom/threads/TimerThread.cpp; /cvsroot/mozilla/xpcom/threads/TimerThread.cpp,v <-- TimerThread.cpp new revision: 18.104.22.168; previous revision: 22.214.171.124 done Checking in mozilla/xpcom/build/nsXPComInit.cpp; /cvsroot/mozilla/xpcom/build/nsXPComInit.cpp,v <-- nsXPComInit.cpp new revision: 126.96.36.199; previous revision: 188.8.131.52 done Checking in mozilla/widget/src/windows/nsAppShell.cpp; /cvsroot/mozilla/widget/src/windows/nsAppShell.cpp,v <-- nsAppShell.cpp new revision: 184.108.40.206; previous revision: 220.127.116.11 done bash-2.05a$ backed out.
Assignee: dougt → pavlov
Status: REOPENED → NEW
Sorry for the spam, but did this get backed out properly? bonsai shows the backout applying against a branch called MOZILLA_1_0_BRANCH (not MOZILLA_1_0_0_BRANCH). Does that matter?
Removing adt1.0.0+, as this was backed out. Pls nominate again, when there is a patch that fixes the problem.
MOZILLA_1_0_BRANCH and MOZILLA_1_0_0_BRANCH should be aliases of each other (for now, at least).
This is no longer a branch smoketest blocker since it was backed out of the branch.
Severity: blocker → major
Whiteboard: [adt2] BRANCH ONLY → [adt2]
Any chance this will be fixed for 1.0, or is it just to late now that RC1 is released ? There's so much DHTML not working properly, including some stuff I made :-( Well, sorry for the spam, but I reelly really hope mozilla.org has the nesecary patience to wait for a fix for this bug. And thanks to those people who have already been working hard trying to fixing this bug.
Since this isn't fixed shouldn't it be fixed on all platforms? I don't think that there should be so major differences between behaviour on Windows, Linux and Mac versions.
There might be something related in bug 138791. I found some problems in the timer code while making it handle interval rollover correctly.
Depends on: 138791
The reason this wasn't changed on linux or mac is because their old timer implementations did not behave this way.
Status: NEW → ASSIGNED
Do we know exactly what is holding up this patch on the branch? This does ruin alot of DHTML websites, and it would be nice for the patch to make it into 1.0. The trunk seems to be working fine with this patch on Win. What are we looking for to get approval to land on the branch?
Timer code has been fixed, but we still have problems. Please see my comment 70 from bug 117061 for some analysis.
i think most of the remaining problems are being caused by other bugs such as the various dhtml sucking due to layout problems.
If you want to get this on the branch you should add adt1.0.0 keyword.
adding adt1.0.0 keyword... We shouldn't need Netscape's permission to check into the 1.0 branch, but it can't hurt to put this on people's radars. Like Stuart said, the problems that still remain with DHTML performance are other bugs. Fixing this one will at least get DHTML animations to work as opposed to simply stalling the browser with frame-dropoffs until the animation is finished. we have a few drivers CCd on this bug already, and it is a dependency on bug 138000 (Make RC2 not suck) so unless there are some negative issues with this patch, this and probably other related bugs (bug 117061 for example) should be checked in to the 1.0 branch.
Doug - Considering taking this one after beta. What Top 100 sites are exhibiting a problem with this bug?
This bug is a major regressions causing most DHTML sites to hang and many DHTML examples have frame-drop-offs. It has already done fine on the trunk - so why not checking into 1.0 branch?!
This is a stopship for RC2 (that's why it's blocking 138000). It's been on the trunk for a long time with nothing but good results. We must get this landed before tomorrow.
rolling the dice ... adt1.0.0+ (on ADT's behalf) approval for checkin to the 1.0 branch. Pls check this in to 1.0 branch today, then add the fixed1.0.0 keyword.
marking fixed. now on trunk and branch.
Status: ASSIGNED → RESOLVED
Closed: 22 years ago → 22 years ago
Resolution: --- → FIXED
I must say, that lastest trunk for Windows is amazing. Amazing results with moving layers in most combinations (with clipping etc.). The problem still apears when Mozilla has to show many, many layers (bug #140789 ). I hope this patch wont be exluded in RC2?
It's on the 1.0 branch, so ready for RC2 :) Fixing bug 102321 helped many, many things - also a bit doing animations over a background image. Nevertheless bug 139986, bug 87808, bug 107746 still need our attention.
*** Bug 142399 has been marked as a duplicate of this bug. ***
The patch in this bug causes a pretty serious problem (see bug 291386). By suppressing timers until the windows event loop is idle, the timer thread's filter gets greatly skewed, and as a result, the timer thread ends up over-correcting like mad for the skew. It results in timers firing at 1ms intervals post page load, for example. Unfortunately, all of the testcases linked from this bug are now 404s. Can anyone help resurrect a testcase for this bug? Thanks!
The former testcases of formula-one.nu are at http://wd-testnet.world-direct.at/mozilla/dhtml/funo/
You need to log in before you can comment on or make changes to this bug.