Closed Bug 230776 Opened 20 years ago Closed 20 years ago

Component.classes enumerator includes non-existent contract id's

Categories

(Core :: XPCOM, defect)

defect
Not set
minor

Tracking

()

RESOLVED FIXED
mozilla1.8alpha1

People

(Reporter: mbockelkamp, Assigned: malcolm-bmo)

References

Details

Attachments

(1 file, 1 obsolete file)

The about:about page seems to include recently entered mistyped about: urls. Try
this:

1. Type about:bla into the location bar and press enter
2. Type about:about into the location bar and press enter
3. See about:bla in the list of about: urls
Confirming with a current trunk build on Linux.

-> Browser/XP Apps
Assignee: general → jag
Status: UNCONFIRMED → NEW
Component: Browser-General → XP Apps
Ever confirmed: true
QA Contact: general → pawyskoczka
Just a guess (and off the top of my head). about:about calls do_GetService to 
check whether we have a factory for a given protocol. do_GetService maintains a 
hash of {contract id => factory}, and also creates entries in that hash 
for 'bogus' contract id's (giving the a no-such-factory factory).

Is the Components.classes enumerator failing to skip contract id's which don't 
have a valid factory?

All of the above gleaned after about ten minutes with LXR, so it may be 
complete rubbish.
Comment 2 sounds about right, actually.
Using xpcshell:

js> var i = 0; for (var cid in Components.classes) ++i;
960
js> Components.classes["foo"]
js> var i = 0; for (var cid in Components.classes) ++i;
961
js> var i = 0; for (var cid in Components.classes) if (!Components.classes[cid])
print (cid);
foo
js>

Note that Components.classes["foo"] is just another way to generate a 'dummy'
entry (a mapping to the kNonExistentContractID factory), even though it's not
exactly the same path that's taken for do_getService.

The code to change is somewhere downstream from here:
http://lxr.mozilla.org/seamonkey/source/xpcom/components/nsComponentManager.cpp#3382
... though it's not at all clear (to me) at what point we could insert a check
for a dummy entry.

-> XPCOM. I wonder if this affects anything else.
Component: XP Apps → XPCOM
Assignee: jag → dougt
> I wonder if this affects anything else.

This also affects ChatZilla's syntax highlighting. CZ enumerates
Components.classes to work out what schemes we support. The following
demonstrates this:

1. Start Mozilla.
2. Start CZ.
3. Join a network and /join #foo
4. Type http://example.com/foo. Note that the result is a link.
5. Type httpa://example.com/foo. Note that the result is not a link.
6. Switch back to Mozilla and navigate to httpa://example.com/foo. You'll get
the 'httpa is not a registered protocol' message, and the Contract ID will be
created in the Components.classes collection.
7. Close CZ (it caches its list of schemes) and repeat steps 2-5. This time,
httpa://example.com/foo will be a link.

I suspect that Venkman is also affected, but I don't know to what extent. I
mention this here (rather than open a new bug) since: a. it's very minor, and b.
I imagine we'll fix it by fixing the Components.classes enumerator.

What reason is there for Components.classes to behave in such an odd fashion?
Are the performance characteristics of PLDHash so strange that a search for a
non-existent entry is significantly worse than bloating the hash with
'not-found' values?

The only real comment is in nsComponentManagerImpl::GetFactoryEntry:
1608     // If no mapping found, add a special non-existent mapping
1609     // so the next time around, we dont have to waste time doing the
1610     // same mapping over and over again

Morphing bug. Old summary:
about:about displays about: entries from location bar history
Summary: about:about displays about: entries from location bar history → Component.classes enumerator includes non-existent contract id's
The non-existent entry was introduced as part of bug 94883, though there's no 
indication there as to why that part of the change was necessary.
Attached patch Remove non-existent mappings (obsolete) — Splinter Review
This patch removes the {Contract ID => 'non-existent' nsFactoryEntry} mapping
from nsComponentManagerImpl. For good measure, it also removes two unused
overloads of HashContractID and removes the unused 'cidKey' parameter from
GetFactoryEntry.

The patch:
* appears to work; I'm browsing with it now.
* fixes all the problems noted in this patch (comment 0, comment 4, comment 5).

* /intuitively/ should be faster, since we no longer bloat the hash with
'non-existent' entries. I have not tested performance explicitly, though.
Comment on attachment 139063 [details] [diff] [review]
Remove non-existent mappings

Seeking r= and sr=. Comments welcome.
Attachment #139063 - Flags: superreview?(dougt)
Attachment #139063 - Flags: review?(alecf)
Assignee: dougt → malcolm-bmo
Status: NEW → ASSIGNED
The comment explains why it was done:

-    // If no mapping found, add a special non-existent mapping
-    // so the next time around, we dont have to waste time doing the
-    // same mapping over and over again

Is there some code/script out there that does a lot of querying of non-existent
ID's?
> Is there some code/script out there that does a lot of querying of
> non-existent ID's?

I don't know (but I could add hooks to check). Even if there were, the only
situation I can see this being a win would be if:

* we have a small number of 'core' non-existent Contract ID's that we look up at
high frequency;
* we have a heavily-loaded hash with lots of collisions, so there are long
chains coming out of every bucket;
* we manage to insert the 'non-existent' entry early in the hash chain, which
implies that the first lookup must occur early; and
* to avoid bloating the hash, we only look up a small set of non-existent
Contract ID's over the lifetime of XPCOM.

That way, we could use the dummy entry as an early-abort marker, instead of
travelling down the chain each time. Now, that's a pretty unlikely scenario, in
my opinion; furthermore, our double hashing implementation can switch from
linear lists coming out of a hash bucket (normal hashing) to a double-hashing
system, which I doubt would benefit from the current behavior.

I'm not an expert in our PLDHash implementation by any means, so I might be
wrong about some of the above; brendan would know for sure.
Ok, I've done some investigation. I started from clean, waited for the load of 
http://www.mozilla.org to fail (no network connectivity on my build machine at 
the moment), loaded about:, then quit).

On first start (during auto-registration), we have about 800+ calls to 
GetFactoryEntry for non-existent (but all different) class id's. Subsequent 
startups use compreg.dat, so the classes are already registered (though there 
are still two calls that occur before autoreg occurs).

Once we've started, there are a few failed lookups for contract IDs that don't 
exist in my build: xremoteservice, xsession-roaming, xpcom/leakdetector, and 
security/entropy. There are also a larger number of failed lookups for the 
contract "@mozilla.org/rdf/resource-factory;1?name=http", and likewise for 
other schemes (particularly urn: and chrome:). These calls originate from 
RDFServiceImpl::GetResource, but (probably due to the RDF service's cache), the 
number of calls to GetFactoryEntry for this contract (about 80) was dwarfed by 
the number of calls to GetResource (about 2000 with a scheme).

In summary, I'm not seeing anything obvious that would lead me to think that 
creating 'not-found' entries in the hash would be at all helpful.
Attachment #139063 - Flags: superreview?(dougt)
Attachment #139063 - Flags: superreview?(alecf)
Attachment #139063 - Flags: review?(dougt)
Attachment #139063 - Flags: review?(alecf)
Target Milestone: --- → mozilla1.7alpha
I didn't go digging far enough in bonsai for comment 6; this optimisation was 
*first* introduced in nsRepository.cpp rev 3.47, by waterson. Then, it *was* an 
optimisation. If the Contract ID (then ProgID) to CLSID mapping failed, we hit 
the registry to look it up, every time.

Nowdays, we preload the component registry, and GetFactoryEntry has no fallback 
for looking up non-existent entries, and so this 'negative caching' entry has 
no reason for existing any longer.
*** Bug 231434 has been marked as a duplicate of this bug. ***
Comment on attachment 139063 [details] [diff] [review]
Remove non-existent mappings

interesting arguments. They seem valid to me, and I agree that the creaton of
these dummy entries seems pretty silly. Please watch Ts/Txul/Tp after this
checkin to see if this helps. 

I would hope that a hashtable-miss wouldn't be any more expensive than a
hashtable-hit, but I guess there's the crux of it - right now the code assumes
that a hit is better than a miss. 

in any case, lets give this thorough testing, as I can see how this might now
confuse RDF and the like. (in which case we might want to fix RDF rather than
keep this broken behavior)

doug? what are your thoughts.

sr=alecf
Attachment #139063 - Flags: superreview?(alecf) → superreview+
*** Bug 232049 has been marked as a duplicate of this bug. ***
Comment on attachment 139063 [details] [diff] [review]
Remove non-existent mappings

the patch looks okay and I am sure that it fixes exposing this special
non-existance id, but I am not sure if the approach is correct.

My concern is performance.... Malcolm makes good points.

Timeless, how about getting some numbers of startup and pageloading.

If they look good, r=dougt.
dougt: erm, while we have hit this at work, i don't have resources to toss at
the problem. i merely specified you as the reviewer (correcting malcom's choices).
(In reply to comment #14)
> I would hope that a hashtable-miss wouldn't be any more expensive than a
> hashtable-hit, but I guess there's the crux of it - right now the code assumes
> that a hit is better than a miss.

Alec, the code assumes a miss is more expensive because we used to force a disk 
access after every miss, hence a miss *was* more expensive. We don't do that 
any more - see comment 12.

> in any case, lets give this thorough testing, as I can see how this might now
> confuse RDF and the like.

No client should be able to notice the difference between the old and new 
behaviour (at least, I certainly hope it shouldn't). With the exception of the 
edge case that this bug was originally reported against, the code makes darn 
sure that 'matches' against the 'not-found' factory are not reported.


I'd also like to know what kind of effect this patch has on performance; from 
the analysis I've done, I'd expect that there should be little-to-no time 
difference and possibly a *slight* reduction in bloat. Is there any way that I 
can help to test any of this? (I've only got access to a Windows PC - can I run 
any of the existing tests?)

Or would a better approach be to just check this in at the start of 1.8a and 
watch the tinderboxen?

doug: Can you review the current patch for correctness?
Target Milestone: mozilla1.7alpha → mozilla1.8alpha
I've measured Ts pre- and post-patch using the information at
http://www.mozilla.org/performance/tinderbox-tests.html. The end result is that
the patch does not appear to affect Ts at all (with a CI of 95%). However, the
results I got exhibited quite a large variance (up to a second over the 15 runs
I measured), so it may be that there was too much noise.
I've made a fresh patch, as the previous one no longer cleanly applies. I've
transferred alecf's sr, and re-requested r= from doug.
Attachment #139063 - Attachment is obsolete: true
Attachment #139063 - Flags: review?(dougt)
Attachment #146037 - Flags: superreview+
Attachment #146037 - Flags: review?(dougt)
Comment on attachment 146037 [details] [diff] [review]
Remove non-existent mappings (unbitrotted)

r=dougt
Attachment #146037 - Flags: review?(dougt) → review+
bz checked this in for me. As expected, it doesn't appear to have affected Ts,
Txul, or Tp at all. Marking FIXED.
Status: ASSIGNED → RESOLVED
Closed: 20 years ago
Resolution: --- → FIXED
mkaply: Since attachment 146037 [details] [diff] [review] was checked in, the AliveTest on the OS2GCC
build has started failing. Could you check whether this checkin causes any
problems for OS/2?
We do a clobber build at midnight.

If it is still orange after that, I'll check it out on Monday.
(In reply to comment #24)
> We do a clobber build at midnight.
> If it is still orange after that, I'll check it out on Monday.

That fixed it. Weird.
*** Bug 244535 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.