Closed Bug 25963 Opened 25 years ago Closed 24 years ago

excessive string conversion resolving :visited style

Categories

(Core :: CSS Parsing and Computation, defect, P1)

defect

Tracking


VERIFIED FIXED

People

(Reporter: waterson, Assigned: dbaron)

References


Details

(Keywords: perf, Whiteboard: [HAVE FIX] [nsbeta3+])

Attachments

(5 files)

The style system code that resolves whether or not a link has the :visited
pseudo-class uses a version of NS_MakeAbsoluteURI() that does some pretty
hefty string twiddling. Specifically, it allocates temporary storage from the
heap, and does a pile of one-to-two-byte conversion (and vice versa).

Since the anchor's HREF is gonna be in double-byte, and history expects a
single-byte string, we're screwed and have to do this at least once; however, we
should make sure that we don't do it -more- than that.

I'll post Quantify data on how much time this is taking.
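The "do it at most once" idea in the report can be sketched as follows. This is a minimal illustration using modern std strings rather than the nsString classes of the era; `LinkHrefCache` and its naive truncating conversion are hypothetical stand-ins, not the actual Mozilla code.

```cpp
#include <cassert>
#include <string>

// Hypothetical sketch: convert a two-byte href to its single-byte
// form once, then reuse the cached result on every subsequent
// :visited check instead of re-converting (and re-allocating).
class LinkHrefCache {
public:
    // aHref is the two-byte attribute value from the anchor.
    const std::string& GetNarrowHref(const std::u16string& aHref) {
        if (!mValid) {
            mNarrow.clear();
            mNarrow.reserve(aHref.size());
            for (char16_t c : aHref) {
                // Naive truncation stands in for a real charset
                // conversion; URL specs are effectively ASCII here.
                mNarrow.push_back(static_cast<char>(c));
            }
            mValid = true;  // pay the conversion cost only once
        }
        return mNarrow;
    }

private:
    std::string mNarrow;
    bool mValid = false;
};
```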
Keywords: perf
Changing component to Style System
Assignee: troy → pierre
Component: Layout → Style System
QA Contact: petersen → chrisd
The description isn't very clear.
NS_MakeAbsoluteURI() is Necko code: changed component and reassigned to owner.
Assignee: pierre → gagan
Component: Style System → Networking
QA Contact: chrisd → tever
pierre: I'll look at this. The problem is that we are converting URLs back and
forth between single- and double-byte representations. The Necko APIs that the
style system chose are hog-tied and have no choice but to allocate from the heap.

So, we need to fix the style system (and possibly the nsILinkHandler interface)
to make better API choices.
Assignee: gagan → waterson
Component: Networking → Style System
Thanks Chris. As you guessed, I'd prefer to leave you the initiative on that 
problem. When a solution is defined, replacing the calls to NS_MakeAbsoluteURI() 
shouldn't be a big deal.
Status: NEW → ASSIGNED
Target Milestone: M15
Target Milestone: M15 → M16
After checking in changes to use nsIURI objects instead of raw strings, we still
see that nsCSSStyleSheet::SelectorMatches() is dominated by NS_NewURI() (~50%)
and nsWebShell::GetLinkState() (~25%). I'm going to push this bug to M18, and
revisit when we get the new string APIs landed. This will be an obvious
candidate for using the new APIs.

Target Milestone: M16 → M18
We used to see a lot of time going to getting the IO service. I presume that's 
gone now. Another thing to investigate is converting nsIURI over to talk about 
wstrings. That might help minimize the number of translations.
We're still using the global to avoid hitting the service manager.

I didn't dig into why NS_NewURI() was expensive (I assume string operations).

The 25% in GetLinkState() is dominated (80%) by nsIURI::GetSpec(); I didn't dig 
further to see why.
*** Bug 36953 has been marked as a duplicate of this bug. ***
One thought about reducing the need for matching :link and :visited :  Some of
the rules in ua.css apply to *both* :link and :visited, which makes the check
for whether the link is visited somewhat pointless.  Perhaps you could create a
new pseudo-class called ":-moz-any-link" and change all the rules in ua.css
except the ones that need to differentiate to use this new pseudo-class and
avoid the :link/:visited tests.  If you want to do this, I could figure out the
necessary ua.css changes (and how many :link/:visited selectors would be
needed).

It might also be possible to make this optimization in the style system itself
so that it could apply to web pages that use :link and :visited - by looking
(during parsing?) at rules with :link and :visited to see if they can be
condensed.
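The condensation idea above can be sketched like this: a selector that matches *any* link never needs the expensive history lookup. The enum and matching function below are hypothetical stand-ins for the real style-system types, written only to illustrate which pseudo-classes force a history query.

```cpp
#include <cassert>

// Illustrative sketch of the ":-moz-any-link" idea. An "any-link"
// pseudo-class matches links regardless of visited state, so the
// global-history query can be skipped entirely for those rules.
enum class LinkPseudo { None, Link, Visited, AnyLink };
enum class LinkState { NotLink, Unvisited, Visited };

bool SelectorMatchesLink(LinkPseudo aPseudo, LinkState aState,
                         int* aHistoryLookups) {
    if (aState == LinkState::NotLink)
        return aPseudo == LinkPseudo::None;
    switch (aPseudo) {
        case LinkPseudo::None:
            return true;
        case LinkPseudo::AnyLink:
            return true;  // no history query needed
        case LinkPseudo::Link:
            ++*aHistoryLookups;  // must consult global history
            return aState == LinkState::Unvisited;
        case LinkPseudo::Visited:
            ++*aHistoryLookups;
            return aState == LinkState::Visited;
    }
    return false;
}
```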
probably can fix with 12493.
Assignee: waterson → pierre
Status: ASSIGNED → NEW
Pushing to M20 like bug 12493

Status: NEW → ASSIGNED
Target Milestone: M18 → M20
This bug is a dogfood bug for me. It causes Mozilla to lock up my machine 
for minutes at a time when loading CVS blame pages in LXR, like:
<http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/editor/base/
nsHTMLEditor.cpp>

So I can't use Mozilla for lxr and friends, basically.
Keywords: dogfood
I checked in changes to nsCSSStyleSheet.cpp for bug 29611 that push the canonifying
code into the anchor tag, to be done once per link. This probably goes some of
the way towards fixing this bug (see
http://bugzilla.mozilla.org/show_bug.cgi?id=29611).

We could continue to improve things by creating a special "nsILinkElement"
interface (or something) that avoids copying strings altogether. I'll need to
re-profile and see how much string conversion is actually going on now.

pierre: I'm stealing this back for now.
Assignee: pierre → waterson
Status: ASSIGNED → NEW
sfraser - let us know if the partial fix waterson checked in helps enough to take 
this off of dogfood for you.  thanks!
Sorry, no bananas. Loading <http://bonsai.mozilla.org/cvsblame.cgi?file=mozilla/
editor/base/nsHTMLEditor.cpp> still makes my mac unusable for about 2 minutes. 
Offending stack is:

...
  095F0670    PPC  1C296448  CSSRuleProcessor::RulesMatching(nsIPresContext*, 
nsIAtom*, nsIContent*, nsIStyleContext*, nsISupportsArray*)+00170
  095F05D0    PPC  1C28BDD8  RuleHash::EnumerateAllRules(nsIAtom*, nsIAtom*, 
const nsVoidArray&, void (*)(nsICSSStyleRule*, void*), void*)+00278
  095F0530    PPC  1C2961A8  ContentEnumFunc(nsICSSStyleRule*, void*)+0004C
  095F04E0    PPC  1C295454  SelectorMatches(nsIPresContext*, nsCSSSelector*, 
nsIContent*, int)+00A28
  095F0110    PPC  1C328720  nsHTMLAnchorElement::GetAttribute(int, nsIAtom*, 
nsString&) const+00030
  095F00D0    PPC  1C31C224  nsGenericHTMLElement::GetAttribute(int, nsIAtom*, 
nsString&) const+00270
  095F0010    PPC  1CBF5E08  nsString::Length() const+014E8
  095EFEF0    PPC  1CC1BA9C  nsString::SetLength(unsigned int)+00038
  095EFEB0    PPC  1CC1BB30  nsString::SetCapacity(unsigned int)+00038
  095EFE70    PPC  1CC1142C  nsStr::GrowCapacity(nsStr&, unsigned int)+00060
  095EFE10    PPC  1CC1132C  nsStr::EnsureCapacity(nsStr&, unsigned int)+00038
  095EFDC0    PPC  1CC12A9C  nsStr::Realloc(nsStr&, unsigned int)+00058
  095EFD60    PPC  1CC12930  nsStr::Alloc(nsStr&, unsigned int)+0007C
  095EFD10    PPC  1CBED6A8  nsAllocator::Alloc(unsigned int)+00064
  095EFCD0    PPC  1CBED4F8  nsAllocatorImpl::Alloc(unsigned int)+00014
  095EFC90    PPC  1D7EBDC0  PR_Malloc+00014
  095EFC50    PPC  1D852758  malloc+0007C

I think we really need to start sharing strings here.

We also need to be yielding to the user event queue much more frequently here, 
to make the app more responsive. While this page is loading in 4.x, I can keep 
typing in another app. Loading it in mozilla just brings my whole machine to its 
knees.
Putting on [dogfood+] radar.
Whiteboard: [dogfood+]
Well, I guess this had better be M17 then.
Status: NEW → ASSIGNED
Priority: P3 → P1
Target Milestone: M20 → M17
It turns out 12% of the time is spent in nsVoidArray::InsertElementAt(), which 
should probably use a better allocation algorithm. I'm investigating.
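The "better allocation algorithm" for nsVoidArray presumably means geometric growth: doubling the backing store makes N inserts cost O(N) amortized instead of one reallocation per insert. A minimal sketch, with a hypothetical `GrowableArray` standing in for nsVoidArray:

```cpp
#include <cassert>
#include <cstdlib>

// Sketch of geometric (doubling) growth for a pointer array, the
// usual fix when per-insert reallocation shows up in a profile.
// Hypothetical class, not the actual nsVoidArray code.
class GrowableArray {
public:
    ~GrowableArray() { free(mData); }

    void Append(void* aElement) {
        if (mCount == mCapacity) {
            // Double the capacity rather than growing by one, so
            // reallocations happen O(log N) times for N appends.
            mCapacity = mCapacity ? mCapacity * 2 : 8;
            mData = static_cast<void**>(
                realloc(mData, mCapacity * sizeof(void*)));
        }
        mData[mCount++] = aElement;
    }

    int Count() const { return mCount; }

private:
    void** mData = nullptr;
    int mCount = 0, mCapacity = 0;
};
```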
Whiteboard: [dogfood+] → [dogfood+] investigating
Talked with sfraser about this: apparently pink's checkin fixed the problem, 
which was Mac specific. We *still* are wasteful here with string conversion; 
moving out to M20 when NEW_STRING_APIS will be better defined and we'll have 
better tools to deal with this. Marking nsbeta3.

sfraser: if this is still dogfood, let me know. also, there are a bevy of other 
bugs related to flowed documents that are too long that i'm looking into right 
now: see 19051, 26028, 26030.
Keywords: dogfood, nsbeta3
Whiteboard: [dogfood+] investigating
Target Milestone: M17 → M20
I just did a jprof profile of loading a huge hiprof profile, and decided to look
into this since SelectorMatches took almost a third of the total time spent (I
had to kill the process before it finished, so I don't know whether it ever
would have).

I'm not sure if any string conversion is needed.  nsHTMLAnchorElement::GetHref
converts one-byte to two-byte, and returns the value.   nsWebShell::GetLinkState
then converts back to one-byte.  Keeping the string one-byte throughout this
whole chain should save a good bit of time (hopefully it won't even need to be
duplicated).  Almost 20% of the time in SelectorMatches is spent in
AppendWithConversion (nsString and nsCString), and about 10% in constructors and
(mostly) destructors for the strings created.  There are also some things in
IsValidLink() that ought to be cleaned up (I don't think the author of the
function expected it to be run quite as much as it is -- so that every little
bit counts).

I should probably post this profile somewhere...
David, your idea to prevent unnecessary one-to-two-to-one-byte string conversion is 
an excellent one. We should try this out and compare the profiles (something you 
are probably already doing, right?).

Also, I am curious what you think should be cleaned up in the IsValidLink 
method... I indeed expected it to be run pretty often, but it has a lot of work 
to do when there IS a link. Please advise if you have ideas on how to improve 
it.
It's hard to tell from the hiprof profile testcase exactly how much this helps. 
The main problem is that the page won't finish loading (it crashes first).  The
first profile I ran, I stopped after a few minutes.  It seems like
SelectorMatches doesn't start getting called until about 3 minutes have passed,
and in my baseline profile I stopped the load during the phase where it was
being called.

In any case, the profile is a good bit cleaner and it looks like a good bit (40%
??) of the time in SelectorMatches has been eliminated.  (This figure is based
on the assumption that, since I cut off the first run before the end of the
styling phase, the time in nsGlobalHistory::GetLastVisitDate would stay
constant.  If I use other assumptions, I get smaller improvements, but I think
that one may be more reliable.)  That figure is still only for an obscure edge
case - a page where *all* the elements are links (within one huge PRE).  For
more normal pages, the improvement would be *much* smaller.

If I'm in the mood for backing these diffs out of my tree sometime, I'll try
nsCSSFrameConstructor.cpp in LXR as a less demanding testcase...
- How about "nsILink". Since QI() is expressing an "is-a" relationship, it makes
sense to say that "an nsHTMLAnchorElement is-a nsILink". nsILinkable makes it
sound like you can "link against it" or something weird.

- Use "const char**" rather than "const char*&"; use "PRBool*" rather than
"PRBool&". It indicates that these are clearly mutable out parameters in the
calling code.

- In nsCSSStyleSheet, since we're touching "nsIAtom* contentTag" anyway, why not
make it an nsCOMPtr and avoid the inevitable flow-of-control problems that will
ensue later.

Otherwise, this looks good!
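The second review point, preferring pointer out-parameters over references, can be shown in a two-line sketch. `GetLinkVisited` is a hypothetical function made up for illustration, not part of the patch under review.

```cpp
#include <cassert>

// With a pointer out-param, the call site must write
// GetLinkVisited(&visited), which makes the mutation visible.
// A reference version (bool& aResult) would be called as
// GetLinkVisited(visited), giving no hint that `visited` is written.
void GetLinkVisited(bool* aResult) {
    *aResult = true;
}
```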
One problem I see is that nsHTMLLinkElement::GetHrefBuffer sets 
mCanonicalBuffer to an empty string if there is no HREF, but in IsValidLink you 
check for it to be null to determine if there is no HREF. I think that is going 
to resurface bug 23209.

I cannot agree with Chris' second point, but I do agree with his other two 
points. Also, I agree that the changes look good (excepting the comment above).
Chris:

1) done.  Hopefully nobody else will want nsILink for something more 
significant...  (Should we use nsIHTMLLink??)

2) I also prefer pointers over reference variables (since I agree that 
reference parameters make code hard to understand), but I was trying to stick to 
the style already existing in the code.  In nsCSSStyleSheet IsValidLink and 
IsSimpleXLink were actually the only functions I could see (except maybe one 
other) that used reference params for out params, so I changed those.  However, 
I left GetHrefBuffer the way it was to match everything else in content nodes.

3) I guess.  I'll have to check that nsCOMPtr stuff doesn't show up in the 
profile...

Marc:
I'm using the empty string as a dummy value to say that we have a cached value 
but that cached value is null.  Even when it's set to an empty string, I still 
return null.


Actually, a bigger change in the changes is to make Area and Link elements 
return the canonical HREF.  I tested 4.x's DOM, and this is what AREA elements 
do in the 4.x DOM, so I think it's right.  I still need to get the test to work 
in Mozilla.  This probably also means there are places where NS_MakeAbsoluteURI 
could be removed elsewhere...
David, I see what you mean, I misread the statement:

if (*mCanonicalHref) {
   aBuf = mCanonicalHref;
}

Given that you are checking for a null string here and not a null pointer as I 
originally thought, is a check for a null mCanonicalHref necessary here, since 
the preceding call to strdup may return null?
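The empty-string-as-sentinel scheme dbaron describes can be sketched as below: a non-null but empty cached buffer means "we already computed the answer, and the answer is no href". `CachedHref` is a hypothetical class illustrating the pattern, not the actual element code.

```cpp
#include <cassert>
#include <cstdlib>
#include <cstring>

// Sketch of caching a possibly-absent href. A null mCanonicalHref
// means "not cached yet"; an empty string means "cached: no href".
class CachedHref {
public:
    ~CachedHref() { free(mCanonicalHref); }

    const char* GetHref(const char* aRawHref) {
        if (!mCanonicalHref) {
            // First call: cache either the href or the "" sentinel.
            mCanonicalHref = strdup(aRawHref ? aRawHref : "");
        }
        // The empty sentinel still reports "no href" to callers.
        return *mCanonicalHref ? mCanonicalHref : nullptr;
    }

private:
    char* mCanonicalHref = nullptr;
};
```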
Taking this bug so that I remember that I ought to check this stuff in when the
tree opens for nsbeta3.

Marc:  I added the additional null check in my tree.

I realize that another thing we could be doing (that would speed things up
even more) is caching the *visited state* of the link in the link element.  If
we don't do that, we should definitely work on improving the performance of
nsGlobalHistory::GetLastVisitDate (waterson said that there is a new Mork API
that allows this to be done faster).  What do others think?  Caching the visited
state could make it harder to implement re-coloring of links while a page is
shown, but then again, that seems hard to implement anyway without big
performance problems...
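The visited-state caching dbaron proposes, with an Unknown sentinel so history is consulted at most once per element, might look roughly like this. The enum name echoes the eLinkState_Unknown mentioned later in the bug, but the class and lookup callback are hypothetical.

```cpp
#include <cassert>

// Sketch: cache the link's visited state on the element itself,
// so the expensive GetLastVisitDate-style history query runs at
// most once per element instead of once per selector match.
enum LinkState { eLinkState_Unknown, eLinkState_Unvisited, eLinkState_Visited };

class LinkElement {
public:
    LinkState GetLinkState(bool (*aHistoryLookup)(const char*),
                           const char* aHref, int* aLookups) {
        if (mState == eLinkState_Unknown) {
            ++*aLookups;  // count the expensive history query
            mState = aHistoryLookup(aHref) ? eLinkState_Visited
                                           : eLinkState_Unvisited;
        }
        return mState;  // cached thereafter
    }

private:
    LinkState mState = eLinkState_Unknown;
};
```

As the comment above notes, the trade-off is staleness: a cached state cannot notice history being cleared while the page is up.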
Assignee: waterson → dbaron
Status: ASSIGNED → NEW
filed bug 45668 for global history rework.
Target Milestone: M20 → M18
[HAVE FIX] for perf issue. PDT please approve checkin.
Whiteboard: [HAVE FIX]
dbaron: did you add eLinkState_Unknown to nsILinkHandler, as well?
With these changes, plus the changes from bug 19051, total time to lay out "a 
long LXR page"

http://lxr.mozilla.org/mozilla/source/rdf/content/src/nsXULTemplateBuilder.cpp

drops from 35s to 20s. (With 19051 alone, it drops from 35s to 25s, so this 
shaves off another 5s.) For reference, 4.x lays out the page in ~18s; IE4 lays 
it out in ~16s.

I looked at the changes, and feel "familiar enough" to say r= for the whole lot 
of them. There are changes to embedding, HTML content, and style, so if 
valeski, jst, attinasi, or pierre wants to double-check, please speak up now!

Great work, dbaron!
Since we are caching the link state now, are we at risk of not reverting a 
visited link to a non-visited link when the user clears their history (while the 
same page is up)? More generally, it seems that by not going to the history to 
get the link status we lose the ability to be 'in sync' with the history 
(out-of-date may act differently too if the link goes out-of-date while the page 
is up). That said, I think the performance benefit outweighs the picky 
correctness issue with clearing history while a page is up, so I'm all for it.
Status: NEW → ASSIGNED
Keywords: review, approval
a=waterson. I've been running with these changes for a while now and they rock. 
Like Cleveland.
plusing. Run, Forrest, run!
Whiteboard: [HAVE FIX] → [HAVE FIX] [nsbeta3+]
Fix checked in, 2000-07-27 16:17.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
marking verified per engineer's comments
Status: RESOLVED → VERIFIED
*** Bug 12713 has been marked as a duplicate of this bug. ***