Closed Bug 161481 Opened 18 years ago Closed 18 years ago

Page source rendering terribly slow

Categories

(Core :: Layout, defect)

PowerPC
Mac System 8.6
defect
Not set

Tracking

()

RESOLVED FIXED

People

(Reporter: benjamin, Assigned: rbs)

References

()

Details

(Keywords: perf, regression)

Attachments

(4 files, 6 obsolete files)

Since yesterday (perhaps two or three builds before...) the page source
rendering is terribly slow.

Mozilla 1.1b
Mozilla/5.0 (Macintosh; U; PPC; en-US; rv:1.1b) Gecko/2002080703

Mac8.6
Could you find out which day the regression occured on? It'd help narrow down
what broke.

I -think- these bugs go in the ViewSource component; I'm sure someone will yell
if that's wrong.
Component: Parser → ViewSource
Keywords: regression
Confirmed using Mac/2002080208/9.2.2. Viewing source of [http://www.mozilla.org/] takes 7 
seconds.
Status: UNCONFIRMED → NEW
Ever confirmed: true
Keywords: perf
Does this happen on any other platforms?
Can you give me a 12-24 hour period to work with?

As a note, for me (Linux) view-source is instantaneous on
http://www.mozilla.org/.  but I have a fast machine...
Using FizzillaCFM, this bug first appears in 2002-08-02-08-trunk. It was not
apparent in 2002-08-01-08-trunk.

(There are other bugs that first showed up in that build, too, such as bug 160887.)
Confirming with 2002080717-trunk/Mac OS 9.2 on this page.
There is not problem on 20020728.
And I don't think this is only about "View Surce".
So... is this a viewsource problem only?  Or general layout?

ccing rbs, since his is the only mac-specific checkin I see in that date range...
There are other _symptoms_ that appeared on Mac at about the same time, that may
or may not be related. In addition to the bug I mentioned, there are problems
with TEXTAREA and selection performance.
(The text typing problem is noted in bug 129857, comment 10, as a worsening of
the originally reported problem in recent builds.)
>ccing rbs, since his is the only mac-specific checkin I see in that date range...

Yes, that was for bug 107146.

I have this gut feeling that the perf upshot might just be something that is
computed over and over again -- rather than being cached. I am seriously impeded
by the fact that I don't have a Mac to explore this. Cc:ing shofield for input.
sampling mozilla shows that the slowness is caused by the use of the atsui fallback to 
measure (seemingly all) the text.  i'll be able to find out why it's using the atsui fallback 
after i get my build environment updated.  i'll report back in a few hours.
Related Bug 161766 ?
Attached patch patch (obsolete) — Splinter Review
generic fonts ('monospace,' 'serif,' etc) were not getting initialized properly
due to an oversight in the mathml changes.  this patch should fix that. 
however, i am getting assertions deep inside of nsString:


###!!! ASSERTION: data in U+0100-U+FFFF will be lost: '(*first < 256)', file
bufferRoutines.h, line 317
Break: at file bufferRoutines.h, line 317
###!!! ASSERTION: data in U+0100-U+FFFF will be lost: '(*first < 256)', file
bufferRoutines.h, line 317
Break: at file bufferRoutines.h, line 317
###!!! ASSERTION: data in U+0100-U+FFFF will be lost: '(*first < 256)', file
bufferRoutines.h, line 317
Break: at file bufferRoutines.h, line 317


at this location:


Breakpoint 1, CopyChars2To1 (aDest=0xbfffda38 "Osaka.h??p¿??H¿?¿?1E?",
anDestOffset=0, aSource=0x1393c470 "", anOffset=0, aCount=8) at
bufferRoutines.h:317
317	      NS_ASSERTION( (*first < 256), "data in U+0100-U+FFFF will be
lost");
(gdb) bt
#0  CopyChars2To1 (aDest=0xbfffda38 "Osaka.h??p¿??H¿?¿?1E?", anDestOffset=0,
aSource=0x1393c470 "", anOffset=0, aCount=8) at bufferRoutines.h:317
#1  0x0146f168 in nsStrPrivate::StrAppend (aDest=@0xbfffda2c,
aSource=@0xbfffd978, anOffset=0, aCount=8) at nsStr.cpp:204
#2  0x0147448c in nsCString::AssignWithConversion (this=0xbfffda28,
aString=0x1393c470, aCount=8) at nsString.cpp:640
#3  0x01474518 in nsCString::AssignWithConversion (this=0xbfffda28,
aString=@0x1393c460) at nsString.cpp:645
#4  0x07da216c in GetEncoding (aFontName=@0x1393c460, aValue=@0xbfffdbf8) at
nsMacUnicodeFontInfo.cpp:513
#5  0x07da23f4 in GetConverter (aFontName=@0x1393c460, aConverter=0xbfffdd60)
at nsMacUnicodeFontInfo.cpp:549
#6  0x07da2758 in nsMacUnicodeFontInfo::GetConverterAndCCMap
(aFontName=@0x1393c460, aConverter=0xbfffdd60, aCCMap=0xbfffdd64) at
nsMacUnicodeFontInfo.cpp:595
#7  0x07da3fd8 in nsUnicodeFontMappingMac::FontEnumCallback
(aFamily=@0xbfffdeec, aGeneric=1, aData=0xbfffdfe8) at
nsUnicodeFontMappingMac.cpp:227


roger, maybe you know what's going on since it's happening inside of the
GetConverter() function.
re: the assertions. They arise because this call
  const nsString* fontName =  info->GenericFontNameForScript(script,type);
is returning a double-byte fontname for certain scripts, and GetConverter()
calls GetEncoding() which first tries to convert the fontname to a single-byte,
and the assertions fire. The fix on for that side is to make GetEncoding()
double-byte safe.

re: patch, some side-notes. I am not sure what the plans are for the Mac
regarding having a list of fonts per each generic font name. What
GenericFontNameForScript() is doing is to pick a single (hardcoded) font. If a
generic font is remapped to a font list, "font1, font2", we might have to
recursively callback the font enumerator to populate the list of data (as we do
on Windows). Maybe if a list of fonts was used, it wouldn't be necessary to loop
again over the scripts. But as I said, I am not sure what are the plans here.
Cc:ing ftang so that he can keep that in these side-notes in his radar.

The patch fits with the current setup. I only had to add an "if ((BAD_SCRIPT !=
script)" to stay coherent, and make GetEncoding() double-byte safe.
Attached patch patch - take2 (obsolete) — Splinter Review
Attachment #94665 - Attachment is obsolete: true
Comment on attachment 94685 [details] [diff] [review]
patch - take2

sr=rbs (if the patch compiles :-)
Attachment #94685 - Flags: superreview+
ftang, care to r= this perf bug. The patch should also fix bug 161766, and maybe
other duplicates.
To quantify "terribly slow", see
ftp://ftp.mozilla.org/pub/data/loadtimes/daily_loadtime.html and check out the
OSX graph.
+    // XXX need tuning to list the appropriate fonts for the Mac
+    if (fontName.Equals(NS_LITERAL_STRING("Arial")) ||
+        fontName.Equals(NS_LITERAL_STRING("Times New Roman")) ||
+        fontName.Equals(NS_LITERAL_STRING("Courier New")) )

Can we fix this before we commit the patch please?
Also, we need to get Mac pageload numbers from before the MathML fixes, and
after (with this patch) and make sure that we have not regressed. We cannot let
MathML regress Mac pageload; pageload should have been tested before the changes
were landed.
More possible relateds: bug 161878 and bug 161887.
Per comment #19, just adding a printf in nsDeviceContextMac::GetSystemFont() to
get the fonts used at startup, namely the fonts used in the 'font:' short-hand:


font : caption | icon | menu | message-box | small-caption | status-bar | etc

Need help from someone who can apply the patch and report the output produced
so that the main patch can be updated.
Attached patch real helper patch (obsolete) — Splinter Review
Attachment #94874 - Attachment is obsolete: true
From comment 11, the perf problem is due to the fact that the ATSUI fallback is
used all the time -- this means that characters are processed *one at a time*!
So this fix might bring back the pageload graph close to where they were since
the dreaded ATSUI fallback won't be used anymore.

(Also hoping that this activity will help in fueling the drive for improving the
Mac font code at some stage. With the hindsight that has been built over time,
there are avenues for some overall improvement/re-organization of the code.)
No longer blocks: 161766
pageload tests on OS/X for 2002-08-01-08-trunk and 2002-08-02-08-trunk:
    8/01 - 1144 msec. (avg. med.)
    8/02 - 1357 msec. (avg. med.)
which is about a 19% increase. I should note, though, that two of the testpages 
account for the lion's share of the slowdown: lxr.mozilla.org, which went from 
~1300 ms to ~6500 ms, and www.w3.org_DOML2Core, which went from ~2230 ms to 
~3740 ms. (Those are also, not surprisingly, two of the most text-heavy 
documents in the set of test pages). Not counting those two test pages, the
increase was about 4% in time to load pages.

[I should also note that I suck for not picking this up at the time the 
regression occurred. Thanks to the people who tracked this down.]
Although informative, the tests are somewhat premature because the fix for this
bug is still out. There was an inadvertant omission in the landing, and
unfortunately it is causing one of the slowest conceivable codepath to be
executed: measuring a string one character at a time, drawing it one character
at a time, etc. I  am not sure why the ATSUI fallback was implemented that way.
The pageload tests confirm indeed that it can't be used by default. I suggest
waiting for this patch to land, as it will fix the problem that is activating
this dreaded fallback codepath. Upon checking of the fix, the execution will be
back to the initial default codepath.
rob, could you figure out the fonts used at startup so that we can be done with 
this bug? Are you using a fast machine where the perf decline wasn't apparent?
(Or alternatively, do you have a slow machine where Mozilla was/is just slow 
anyway :-)
This is a blocker to 1.1. adding to the 1.1 dependency list. Can someone give me
a time estimate here? If we can't get this corrected in a reasonably short
period of time, I'm inclined to back out the changes that caused it on the 1.1
branch. 
Blocks: 1.1
-->ViewSource
Assignee: harishd → doron
QA Contact: moied → pmac
the output on startup is:

GetSystemFont: ID=3, family=Lucida Grande
GetSystemFont: ID=3, family=Lucida Grande
GetSystemFont: ID=3, family=Lucida Grande

not viewsource. This is a layout bug. fixing summary and moving to layout.
Assignee: doron → rbs
Component: ViewSource → Layout
QA Contact: pmac → petersen
Is that the same output for both skins, classic and modern?
For information, what if you printf() in the callback function itself to see?
Re: comment #28
> Can someone give me a time estimate here?

Once the fonts used at startup are known, it should be ready to go. Today or 
tomorrow -- if reviews drag on.
You need to use different fonts for Mac OS 9 and Mac OS X. On Mac OS 9, the user
can choose their system font (Gadget/Chicago etc etc).
Attached patch patch - take 3 (obsolete) — Splinter Review
the fonts are the same for the classic and modern themes.

on my machine (classic environment) the os9 appearance control panel allows for
the "Large System Font" to be "Charcoal," "Chicago," "Capitals," "Gadget,"
"Sand," "Techo," or "Textile."	it allows the "Small System Font" to be
"Geneva."  on osx i think the system font can currently only be "Lucida
Grande," but i suppose that could change in the future.

is something like this patch what you had in mind?
Missing "Lucida Grande" in the check list.
Nope, it is there. It was the (in)famous Mac-binary attachment problem...
Any r= on schofield's patch to move it forward?
Keywords: review
Confirming additional comment # 35, these are the available Large System fonts
under Mac OS 9.2.2 English-North American.
Comment on attachment 95037 [details] [diff] [review]
patch - take 3

r=jfrancis;
note: i dont know diddly about mac font issues; so get another review if you
need assurance there.
From a line-by-line of view,it looks ok.  I do wonder if we have a hard and
fast guarantee about
::FontToScript(fontNum)) never returning a script that is greater than the size
of the mScriptFallbackFontIDs array.  Does that [] operator do bounds checking?
Attachment #95037 - Flags: review+
Good point. In fact, looking into details, I see that:

for(ScriptCode script = 0; script < 32 ; script++)
{   ^^^^^^^^^^^^^^^^^
	const nsString* fontName =  info->GenericFontNameForScript(script,type);	
[...]
        ScriptCode script = (converter ? BAD_SCRIPT : ::FontToScript(fontNum));
        ^^^^^^^^^^^^^^^^^
The running "script" was overwritten.

Updated patch coming up. Actually, since there isn't any generic font that is
mapped to a special symbolic font (as yet), the converter business isn't
necessary there.
Attached patch patch - take 4 (obsolete) — Splinter Review
patch to be checked in
Attachment #94685 - Attachment is obsolete: true
Attachment #94875 - Attachment is obsolete: true
Attachment #95037 - Attachment is obsolete: true
Attachment #95125 - Attachment is obsolete: true
Checked in. Just cvs update mozilla/gfx/src/mac to get this fix in your tree.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Following bug 107146, these are the fixups that have been made so far. The
fixups include this bug and the one-liner that was made for bug 161048.
Seeking a= on the m1.1 branch.
Attachment #95146 - Flags: approval+
Comment on attachment 95146 [details] [diff] [review]
proposed fixups against the m1.1 branch

a=asa (on behalf of drivers) for checkin to 1.1
actually, wait. Where's the super-review for this code?
Comment on attachment 95146 [details] [diff] [review]
proposed fixups against the m1.1 branch

posthumous sr=sfraser
Attachment #95146 - Flags: superreview+
Comment on attachment 95146 [details] [diff] [review]
proposed fixups against the m1.1 branch

r=scc
Attachment #95146 - Flags: review+
Per comment 18, anyone know why page load on OS X dropped so dramatically
following the spike up? The last data point is well below what it was before
MathML went in.
Fixed on the branch.
this seems to have caused bustage on the mac opt 1.1 build.
> this seems to have caused bustage on the mac opt 1.1 build.

my gcc build seems to work ok.
Mac builds use CodeWarrior, not gcc. However, Tinderbox seems to have cleared
itself up.
Blocks: 160887
My tests suggests that we regained the performance loss. These builds aren't
from the exact day of the changes but they're close and the data seems reasonable.

1.0 final - 1780 ms (average median) 1.0 datapoint
    07/31 - 1719 ms (average median) before rbs regression
1.1 08/10 - 2085 ms (average median) after rbs regression
1.1 08/14 - 1734 ms (average median) after rbs fix

I'm not sure if that means we're all the way back down but it's back to within
1% when it had deviated about 21%. I only did one run of the test and I didn't
write down the specific pages and their changes but it seems we're pretty much
back to normal. John, does this look/sound right to you?

All tests performed on my G4500 with 256MB RAM. Each test was run with a new
profile on a fresh install. 
Asa, thanks. Yeah, sounds pretty good. I can't attach significance to 
difference below 1%. I'll give this a spin on another OS/X box tomorrow,
but in the interim, looks fine.
Blocks: 161842
You need to log in before you can comment on or make changes to this bug.