Closed Bug 195262 Opened 22 years ago Closed 21 years ago

static atom tables

Categories

(Core :: XPCOM, defect, P2)

x86
Windows 2000
defect

Tracking

()

RESOLVED FIXED
mozilla1.4alpha

People

(Reporter: alecf, Assigned: alecf)

References

Details

(Keywords: memory-footprint, topembed-)

Attachments

(4 files, 7 obsolete files)

After some conversations with brendan, darin, and doug, I started to wonder how
hard it would be to use gperf's generated code to make static atom tables. Turns
out not to be too hard. 

here's how it works.

XPCOM Support
-------------
- in XPCOM, we define a new concrete class, nsStaticAtom:
  class nsStaticAtom : nsIAtom {
    ...
    const char* mValue;
    ...
  }
  In this class, AddRef/Release are no-ops (they return 2 and 1 respectively)
- we define a new function NS_RegisterAtomLookupFunction(...) which takes a
function that looks up an atom value. The function looks like:
  nsStaticAtom* lookup(const char* str, unsigned int len);
  (this is the signature of the function generated by gperf)
- when an atom is looked up, we look in the global hashtable for the atom. If it
is not found, then we call each lookup function that has been registered. If a
match is found in a lookup function, we return that. If it is not found, we
create the atom and add it to the global hashtable.

Clients
-------
- the client has a wordlist, like html-atoms.txt that looks like:
  class nsStaticAtom {};
  %%
  atom1
  atom2
  atom3
  .. etc ..
- gperf is used to generate the C++:
  % gperf --language=C++ --class-name=nsHTMLAtoms --duplicates 
    --key-positions=* --struct-type --slot-name=GetValue() --omit-struct-type 
    --global < html-atom-list.txt > nsHTMLAtomsImpl.cpp
- the generated code is included in some larger file, which handles the
registration:
  #include "nsAtomTable.h"
  #include "nsHTMLAtomsImpl.cpp"

  nsresult NS_RegisterHTMLAtoms()
  {
      return NS_RegisterStaticAtoms(nsHTMLAtoms::in_word_set);
  }

And that's about it. 

A few details:
- I initially thought that for performance, we'd call the lookup functions
first. Actually, the global hashtable should always be consulted first, because
it is possible for someone to look up an atom before all the lookup functions
are registered.
- So then I thought, what's the gain here? Is there really a performance win if
we have to consult the dynamic structure first? The real win is that we aren't
creating hundreds of atoms on the heap - instead they're permenant, readonly
structures mapped directly into memory. Currently all dynamic atoms make copies
of the strings that created them, which is bad. I see 2019 atoms created in the
tinderbox logs just for startup stuff, so even if the average string size of an
atom is 8 characters * 2 bytes/char + the 4 byte overhead of an atom thats a
potential savings of about 40k of of heap (that lives for the lifetime of the
product)
- the one problem I've run into is nsIAtom::GetUnicode() which wants shared
access to the internal unicode buffer.. which there wouldn't be for these atoms
- they hold raw char* strings. My thought was that since nsIAtom isn't frozen,
that we just dump GetUnicode() or even provide an attribute AUTF8String value;
Most atoms are just ASCII inflated into PRUnichars so the conversion would be
cheap. (and for nsStaticAtoms, we could decide that they are ASCII by
definition, making the GetValue() implementation easy)
Oh, I realized the one place this will also help us in terms of performance is
startup, because we won't have to go through specialized operators and
constructors to create these objects.
Status: NEW → ASSIGNED
Keywords: footprint, topembed
Priority: -- → P2
Target Milestone: --- → mozilla1.4alpha
alec: so, when i mentioned http yesterday, i forgot the obvious case: http needs
to flatten atoms in order to build up the http request.  so, converting an atom
to its string equivalent shouldn't be too costly.  nsHttpAtom looks like this:

struct nsHttpAtom
{
    operator const char *() { return _val; }
    //...
    const char *_val;
};

so, i'm sure if we want to convert HTTP over to this, then we are going to take
some probably tiny perf hit (unless nsI?Atom can likewise provide inline access
to the |const char *| equivalent).
so the one issue I'm having w.r.t. to direct access to the internal buffers is
that the current atom implementation stores PRUnichar*'s internally.. that means
that the two implementations are mis-matched. 

however, I looked at many of the implementations that currently call the
[shared] nsIAtom::GetUnicode(const PRUnichar**) and almost all of them want to
compare the atom to some string.

So I'm thinking that I'm going to add two methods to nsIAtom:

boolean equalsUnicode(in AString unicodeValue);
boolean equalsUTF8(in AString utf8Value);

would having an 'equals' type method help you with the string flattening stuff?

otherwise, what I may end up doing is changing the current atom implementation
to internally store UTF8, and change GetUnicode() to GetUTF8() or something.

alec:

Equals wouldn't do it.  HTTP needs to copy the string-equivalent of the atom
into a string buffer.

  nsHttpAtom::Content_Type --> "Content-Type"

so it can generate a HTTP request.

BTW: why do we need nsIAtom to be an interface?  why don't we just have nsAtom?

  class nsAtom
  {
  public:
    const char *get() const      { return mAtomString; }
    operator const char*() const { return mAtomString; }
  private:
    char *mAtomString;
  };

this could still be an abstract base class if need be.  i'm pretty sure we said
long ago at one of the API review meetings that nsIAtom would never be frozen.
Hmm... well for one, I think atoms need to remain scriptable.. I know dougt said
they shouldn't be, but they're used in JS and have been for a year or two. 

In any case I need to look at consumers. Sure, it would be nice to get HTTP
using the global atoms, but there are a lot of other consumers out there and I
need to make sure that I'm not making THEM jump through hoops to make HTTP happy.
agreed, though "nsAtom : nsIAtom" isn't inconceivable, right? ;-)
oh, heh. good point :)
yeah, I'm redesigning nsIAtom. I've made it through roughly half the consumers
and there isn't any major loss to dropping GetUnicode() and switching over to
ToString() or GetUTF8String(const char*)

(darin: I'd like to address the nsAtom : nsIAtom issue in a seperate bug - since
this design doesn't preclude what you're asking for)

Here's the updated nsIAtom, let me know what you guys think:

[scriptable, uuid(3d1b15b0-93b4-11d1-895b-006008911b81)]
interface nsIAtom : nsISupports
{
  /**
   * Get the Unicode or UTF8 value for the string
   */
  AString toString(); 
  AUTF8String toUTF8String();
  
  /**
   * Return a pointer to a zero terminated UTF8 string.
   */
  [noscript] void getUTF8String([shared, retval] out string aResult);

  /**
   * Compare the atom to a specific string value
   */
  boolean equalsUTF8(in AUTF8String aString);
  
  boolean equalsUnicode(in AString aString);
};

equalsUTF8 and equalsUnicode have been especially useful in the conversion.
how about "equals(in AString)" instead of "equalsUnicode(in AString)" ... seems
to better parallel toString/toUTF8String, etc.
alec: also, punting nsAtom : nsIAtom to later is cool by me ;-)
Discussed in edt.  We need more info.  alecf, can you quantify the potential
footprint reduction?
working to figure that out right now.
ok, here's the first patch. This changes the nsIAtom API and all callers of the
old GetUnicode() - I tried to switch to ToString / ToUTF8String / GetUTF8String
where appropriate - many times code was greatly simplified. You'll see massive
changes in the windows gfx implementation, because I changed a bunch of stuff
over to "const nsString&" from "nsString*" to simplify some code
Comment on attachment 116354 [details] [diff] [review]
update the nsIAtom API and callers

darin/doug/brendan - would love to get some reviews on this so I can move
forward with the gperf work.
Attachment #116354 - Flags: superreview?(darin)
Attachment #116354 - Flags: review?(dougt)
dbaron, I could also use your review here in case dougt/darin are busy.

dbaron: I'm contemplating adding a 2nd field to nsStaticAtom, |nsIAtom**
mSourceAtom;| which would allow the static atom table as generated by gperf, to
also function as the atom mapping table which you implemented to reduce the size
of the AddRefAtoms() stuff.
Comment on attachment 116354 [details] [diff] [review]
update the nsIAtom API and callers

A few comments without looking at this in detail:

> interface nsIAtom : nsISupports

I tend to think all new methods should be [noscript] until we make the entire
interface [noscript], which it should be.  (We should probably have a bug on
that.)

>   /**
>-   * Return a pointer to a zero terminated unicode string.
>+   * Return a pointer to a zero terminated UTF8 string.
>    */
>-   void GetUnicode([shared, retval] out wstring aResult);
>+  [noscript] void getUTF8String([shared, retval] out string aResult);

Technically this seems wrong, no, since it's really UTF-8?  Perhaps a comment
would be good?	Also, maybe it should be at the end of the interface since it's
most tied to the implementation and thus most likely to change?

>@@ -72,7 +80,9 @@ interface nsIAtom : nsISupports
>  * C string is translated into its unicode equivalent.
>  */
> extern NS_COM nsIAtom* NS_NewAtom(const char* isolatin1);
>+extern NS_COM nsIAtom* NS_NewAtom(const nsACString& isolatin1);
> extern NS_COM nsIAtom* NS_NewPermanentAtom(const char* isolatin1);
>+extern NS_COM nsIAtom* NS_NewPermanentAtom(const nsACString& isolatin1);
> 
> inline already_AddRefed<nsIAtom> do_GetAtom(const char* isolatin1)
>     { return NS_NewAtom(isolatin1); }

Please add do_GetAtom variants for any new NS_New*.

>-  AtomTableHashKey,
>+  PL_DHashStringKey,

If you're not using it anymore, remove it.  (But are there issues with null?)
dbaron: thanks for the review. Some comments:
* As for the noscript stuff, I'm not convinced that nsIAtom should be noscript,
given that atoms are used all over in the xul outliner, which needs to be
scriptable. I'd like to avoid debating that here, and leave these as scriptable
for now. That reminds me that I need to go back to fix up all the .js files that
call .GetUnicode() and switch them to .toString()/.toUTF8String()
* I'm not sure I understand your comment w.r.t. UTF8 strings. I haven't really
changed anything about the semantics of the interface, just that unicode values
go through the UTF8 converter when they are stored. I mean, there's always the
potential for someone to store a non-UTF8 value in the atom, but I think that
potential existed before as well.
* as for the rest of the stuff - good points, and I've fixed them up - if there
were issues with null, I'm not aware of them - previously we called into
nsCRT::HashCode(const PRUnichar*) which stopped at the first null unicode
character.. now that we're doing UTF8, its the same. I could potentially use
nsCRT::HashCodeAsUTF8() - I'm not sure I see the value but I could be convinced.

new patch coming up..

(new patch coming soon, 
Attachment #116354 - Flags: superreview?(darin)
Attachment #116354 - Flags: review?(dougt)
Comment on attachment 116354 [details] [diff] [review]
update the nsIAtom API and callers

marking current patch obsolete as to avoid confusing the EDT while they
evaluate topembed bugs.

I need to get some more specific numbers on this...
Attachment #116354 - Attachment is obsolete: true
ok here's a complete patch. This patch adds support for using gperf to store
atom tables, and converts the internal atom implementation to use const char*
strings since 99% of atoms are ASCII.

A few notes:
* rules.mk: in order to generate the .gperf files, we have to go from .h -> .i
-> .strings -> .gperf - this adds all the appropriate rules (and should work on
all platforms) - we have to go through the .i file to catch things like #ifdef
NS_DEBUG
* make-atom-strings.pl: this translates an existing list of atoms (such as
nsHTMLAtomList.h) into a .strings file so that gperf can process it. Docs are
in the file
* nsStaticAtoms.h will be used by consumers to declare a list of static atoms,
just like dbaron's stuff.
* Ok, now for the tricky part: it turns out that there's no way to declare a
"const" structure that has a vtable - because each entry needs a vtable pointer
that has to be set at runtime. (there are other issues as well, but that should
be sufficient) - so what I did was to create a structure very similar to
dbaron's atom list stuff, and make consumers create that. Then we have a
wrapper class (nsStaticAtomWrapper) which exists on the heap (in an arena,
since they are long-living and not refcounted) and points to these structures.
In addition, the atom pointed to by the nsStaticAtom will be updated.. this
will make initialization cheaper.
* when an atom is requested/created, it will try to first look it up in the
hashtable. 
    If it is found, the type is determined and the right object is returned. 
    If it is not found, then we look it up in the static tables. 
	If it is found, we create a wrapper object and insert that into the 
	hashtable.
	If it is not found, we fall back to just creating an atom as usual.
    Special casing of permanent atoms is handled in an appropriate manner. In
the future I'd like to roll the permanent atom stuff into the static atom
stuff.
* I had to update some .js files, just converting .GetUnicode() to .toString()
Comment on attachment 116898 [details] [diff] [review]
update atom API and callers, and add nsStaticAtom support

ok, I think this is ready for reviews.

requesting dbaron because of his previous atom work and darin, but I welcome
reviews from others as well...
Attachment #116898 - Flags: superreview?(darin)
Attachment #116898 - Flags: review?(dbaron)
Comment on attachment 116898 [details] [diff] [review]
update atom API and callers, and add nsStaticAtom support

>Index: xpcom/ds/nsAtomTable.cpp

>+struct AtomTableEntry : public PLDHashEntryHdr {
>+  // mAtom & 0x1 means (mAtom & ~0x3) points to an nsStaticAtomWrapper
>+  // else it points to an nsAtomImpl
>+  PtrBits mAtom;

ah, so this works because of the fact that the address of these structures
must be aligned... nifty :-)

but what happens on platforms with 64-bit addressing?


>+  inline PRBool IsStaticAtom() const {
>+    return !!(mAtom & 0x1);

huh?  (mAtom & 0x1) has only two possible values: 0 or 1.


>+  inline void SetStaticAtomWrapper(nsStaticAtomWrapper* aAtom) {
>+    mAtom = PtrBits(aAtom) & 0x1;
>+  }

hmm... don't you mean "or" instead of "and"?


>+  inline PRBool HasValue() const {
>+    return (mAtom & ~0x3);
>+  }

is this where you want the "!!" trick?	although i'd say never mind since
folks should never write |if (bVal == PR_TRUE)|.


>+  aBuf.Assign(NS_ConvertUTF8toUCS2(mString));

one of these days, someone is going to break down and finally implement
CopyUTF8toUCS2 ;-)


>+LookupStaticAtom(const nsACString& aString)
...
>+  for (i=0; i<count; i++) {
...
>+      lookup(PromiseFlatCString(aString).get(), aString.Length());

probably a good idea to move PromiseFlatCString and Length call outside
the loop, though i'm sure we don't loop many times.


>Index: xpcom/ds/nsStaticAtom.h

i stopped here.  will pick up later.
Attachment #116898 - Flags: superreview?(darin) → superreview-
>>+struct AtomTableEntry : public PLDHashEntryHdr {
>>+  // mAtom & 0x1 means (mAtom & ~0x3) points to an nsStaticAtomWrapper
>>+  // else it points to an nsAtomImpl
>>+  PtrBits mAtom;
> 
> ah, so this works because of the fact that the address of these structures
> must be aligned... nifty :-)
> 
> but what happens on platforms with 64-bit addressing?

As long as PtrBits scales to that width -- which I sure hope it does -- it
should all be fine.  You'll get "more" alignment, not less, so the low 2 bits
are safe.
> As long as PtrBits scales to that width -- which I sure hope it does -- it

  typedef unsigned long PtrBits;

is it true that sizeof(unsigned long) is always sufficient to hold an address? 
i thought that wasn't a requirement.
Yeah, I'm pretty sure that |unsigned long| is always big enough.  |unsigned int|
isn't always, though.
c99 has an optional intptr_t type "with the property that any valid pointer to
void can be converted to this type, then converted back to pointer to void, and
the result will compare equal to the original pointer". (Theres also an unsigned
uintptr_t variant)

However, we do have an NSPR type for that - ptrdiff_t. Which happens to be a
typedef for |unsigned long|, but we may as well use the nspr version ot be on
teh safe side.

Are we really sure that the last two bits are always zero? I have this
recollection that other places in the code assume that (and I also knwo that
valgrind doesn't provide that unless you use a special flag, which can be an
issue if you forget)...

Are we really this tied up over 2 bits per atom? (Which probably becomes one
byte, given padding)
so I just copied the PtrBits pattern from about 3 other places in layout where
we do the same thing.

I guess the !! isn't really necessary though I did get the logic backwards in
SetStaticAtomWrapper() - I just caught that myself after debugging this.. doh!

I'm welcome to use whatever for PtrBits but the key is that I want to save
myself 4 bytes (Which is what an extra bool would align to) by storing the bit
in the pointer. This is a very safe thing that we do all over the place...
Alecf, if you're using & 1 to test and | 1 to tag, then use & ~1 to mask the tag
and get the high order bits.  Unless you have some reason to require 0 mod 4
alignment (say, to use 2 as another tag bit), of course.

bbaetz: didn't you mean PRPtrdiff?  ptrdiff_t is not necessarily big enough (nor
is size_t).

/be
the rest of the patch was pretty straight forward, so I corrected stuff and
have a new patch.

I originally was using 0x3 because I thought I was going to need another flag
and pointers are always aligned to at least 2 bits. But since I only need the
lower bit, I moved everying to 0x1
Attachment #116898 - Attachment is obsolete: true
brendan: |typedef ptrdiff_t PRPtrdiff;| is in prtypes.h So I did mean PRPtrDiff,
but they are the same.

Its not 'very safe'; it just happens to work on all the systems we care about. :)
oops! I found a tiny typo in nsHTMLContentSerializer..
Attachment #116991 - Attachment is obsolete: true
stop the presses! this is all WAY more complex than it needs to be. The more I
thought about this, the more I realized that we don't need gperf at ALL - it
doesn't buy us a thing. Really, we just need one call:

NS_RegisterStaticAtoms(const nsStaticAtom*, PRUint32 aAtomCount);

this method would just iterate through every static atom, creating the dynamic
wrapper if necessary and updating the atom pointed to by the mAtom slot. I'll
have a new patch tomorrow. this is MUCH simpler and faster.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago
Resolution: --- → FIXED
alec: maybe i'm missing something... wasn't the whole point of gperf to speed up
resolving a string value to its corresponding atom?  perfect hash and all that
jazz, right?
yup. 

gperf gives you an O(1) lookup on a fixed set of values... our atom stuff has
two fundamental "problems" that conflict with that:
1) it is dynamic, meaning that you can insert any value into the table at any
time. This means that you always have to check a dynamic table before checking
any static table. It also means that if you register more than one table, you
have to iteratively look through every table in order to find your value.
2) nsIAtom. It is a reference counted COM object, so it needs a vtable, which
has to be initialized at runtime.. which means we need wrappers around
nsStaticAtoms. Basically, gperf can't map strings -> nsIAtom implementations
because nsIAtom implementations cannot be constant. 

The other issue is that the way layout uses atoms (which is to have eight
zillion nsFooAtom::atomName entries, we still have to initialize every atom at
startup, which is why the NS_RegisterStaticAtoms that I just described is still
a major improvement.

Maybe I was a little overzealous in my denouncing of gperf - in fact I see at
least 2 other places where we can benefit from it, and I will be incorporating
it into the work I do after the initial static atom work. But in any case, gperf
doesn't benefit us in the general dynamic atom case. 

The two cases are: nsStaticNameTable (since it it is neither dynamic nor does it
use vtables) and 3 classes in content/shared (nsCSSAnonBoxes,
nsCSSPseudoElements, nsCSSPseudoClasses) where we need to test if an atom is a
member of a set.

I don't know how this go marked FIXED though. reopening and changing the
description slightly.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Summary: static atom tables with gperf → static atom tables
ok, here's what I'm talking about. Its a lot simpler, and you'll see that
NS_RegisterStaticAtoms() has efficient backdoor access to the hashtable, so it
can register a list of static atoms quickly... far more quickly than it could
with the gperf callbacks.

I have left the rules.mk and make-atom-strings.pl changes in the patch because
they will become useful in the two cases I just mentioned...
Attachment #117004 - Attachment is obsolete: true
> Basically, gperf can't map strings -> nsIAtom implementations
> because nsIAtom implementations cannot be constant.

gperf can map strings to indices, no?  And then could you put the nsIAtom
pointers in a vector, with the gperf-produced indices?
Comment on attachment 117105 [details] [diff] [review]
update atom API and callers, nsStaticAtomSupport v1.2

yes, but then you have to walk a chain of lookup tables, after you've already
looked in the global hashtable. I don't know how to explain this any better -
gperf is simply not an appropriate tool for atoms, as they are used now. I
don't think we should try to bend over backwards to use gperf if its not going
to actually help us.

requesting reviews from darin/dbaron
(and yes, dbaron, this means we've essentially rolled all the work from
nsAtomListUtils into the main atom service, with higher performance.... in my
tree I've actually removed nsAtomListUtils...)
Attachment #117105 - Flags: superreview?(darin)
Attachment #117105 - Flags: review?(dbaron)
I don't see anything in the patch related to the current users of
nsAtomListUtils.  What does this have to do with it?  I'm not quite sure what
performance gain you can expect over atom lists (except perhaps by avoiding
global variable access).  Isn't the gain you're after here the performance of
converting strings to atoms?
dbaron: look at NS_RegisterStaticAtoms() - basically that does the work of
nsAtomListUtils::AddRefAtoms, but from inside the atom table, so that you get
direct access to the hashtable, rather than going through NS_NewPermanentAtom()
every time. In addition, this avoids making a copy of the string, which is what
NS_NewPermanentAtom() would result in.
So what cases are you using gperf for?
dbaron: see the bottom of comment 33 for the gperf cases (obviously not part of
this patch/bug)
Status: REOPENED → ASSIGNED
Attached file review comments
nothing major, just some nits.
Attachment #117105 - Flags: superreview?(darin) → superreview-
minusing topembed. please re-nominate topembed once we have an estimate on
footprint savings.
Keywords: topembedtopembed-
ok, here's an updated patch. I added the inline Equals/EqualsUTF8 and cleaned
up the consumers where appropriate

the only thing I didn't address was the SysAllocString() stuff - this is an
array returned via a microsoft COM interface, and thus we need to use
SysAllocString()
Attachment #117105 - Attachment is obsolete: true
Comment on attachment 117649 [details] [diff] [review]
update atom API and callers, nsStaticAtom support v1.3

ok, updated patch. I'd really like to land this for 1.4alpha
Attachment #117649 - Flags: superreview?(darin)
Attachment #117649 - Flags: review?(dbaron)
Comment on attachment 117649 [details] [diff] [review]
update atom API and callers, nsStaticAtom support v1.3

sr=darin
Attachment #117649 - Flags: superreview?(darin) → superreview+
Comment on attachment 117649 [details] [diff] [review]
update atom API and callers, nsStaticAtom support v1.3

> extern NS_COM nsIAtom* NS_NewAtom(const char* isolatin1);
>+extern NS_COM nsIAtom* NS_NewAtom(const nsACString& isolatin1);
> extern NS_COM nsIAtom* NS_NewPermanentAtom(const char* isolatin1);
>+extern NS_COM nsIAtom* NS_NewPermanentAtom(const nsACString& isolatin1);
> 
> inline already_AddRefed<nsIAtom> do_GetAtom(const char* isolatin1)
>     { return NS_NewAtom(isolatin1); }
>+inline already_AddRefed<nsIAtom> do_GetAtom(const nsACString& isolatin1)
>+    { return NS_NewAtom(isolatin1); }
>+ 
> inline already_AddRefed<nsIAtom> do_GetPermanentAtom(const char* isolatin1)
>+    { return NS_NewPermanentAtom(isolatin1); }
>+inline already_AddRefed<nsIAtom> do_GetPermanentAtom(const nsACString& isolatin1)
>     { return NS_NewPermanentAtom(isolatin1); }

"isolatin1" seems a bad name for the parameters if they're in UTF-8.  Are they?


>+// this is the "legal" way to have static classes
>+
>+static nsStaticAtomArena&
>+StaticAtomArena() {
>+  static nsStaticAtomArena arena;
>+  return arena;

Function-scope statics with destructors are bad to have in shared libraries --
gcc 2.x on Linux (and probably other platforms), when not used with
-fuse-cxa-atexit (which requires a glibc from about the past 4 years or
something like that), runs destructors of function-scope statics |atexit|,
which isn't a good thing if the library is unloaded.  I'm also not sure what
the code size overhead for running the destructor is.

>+  inline PRBool IsStaticAtom() const {
>+    return (mAtom & 0x1);
>+  }

Perhaps a "!= 0" would be good?

>+  inline void SetAtomImpl(AtomImpl* aAtom) {
>+    NS_ASSERTION(aAtom, "Setting null atom");
>+    mAtom = PtrBits(aAtom);
>+  }
>+
>+  inline void SetStaticAtomWrapper(nsStaticAtomWrapper* aAtom) {
>+    NS_ASSERTION(aAtom, "Setting null atom");
>+    mAtom = PtrBits(aAtom) | 0x1;
>+  }

Just in case (for someone porting to a weird platform; to document
assumptions), how about asserting in both of these that the pointer doesn't
have the low bit set?

>+  inline PRBool HasValue() const {
>+    return (mAtom & ~0x1);
>+  }

Perhaps "!= 0"?

>+  // get an addreffed nsIAtom
>+  // (no need to addref static atom wrappers though)
>+  inline nsIAtom* GetAtom() const {

This should return |already_AddRefed<nsIAtom>|.

>+  char* toBegin = &ii->mString[0];
>+  nsReadingIterator<char> fromBegin, fromEnd;

You didn't write this, but that should be |nsACString::const_iterator|, not
|nsReadingIterator<char>| (which shouldn't appear in string client code).

>+NS_IMETHODIMP
>+nsStaticAtomWrapper::ToString(nsAString& aBuf)
>+{
>+  // static atoms are always ASCII

Why the limitation?  Is this the only reason?  Are there assertions to ensure
that?

>+  CopyASCIItoUCS2(nsDependentCString(mStaticAtom->mString), aBuf);
>+  return NS_OK;
>+}

>+NS_COM nsresult
>+NS_RegisterStaticAtoms(const nsStaticAtom* aAtoms, PRUint32 aAtomCount)
>+{

>+    if (he->HasValue() && aAtoms[i].mAtom)
>+      // there already is an atom with this name in the table.. but we
>+      // still have to update mAtom
>+      *aAtoms[i].mAtom = he->GetAtom();
>+    else {
>+      nsStaticAtomWrapper* atom = WrapStaticAtom(&aAtoms[i]);
>+      NS_ASSERTION(atom, "Failed to wrap static atom");
>+
>+      // but even if atom is null, no real difference in code..
>+      he->SetStaticAtomWrapper(atom);
>+      if (aAtoms[i].mAtom)
>+        *aAtoms[i].mAtom = atom;

Why do you null check |aAtoms[i].mAtom| in one codepath but not the other?


More later...
Er, I wasn't looking at the code in my last comment closely enough -- but
shouldn't that null check be inside the then-clause?  Otherwise you risk
creating two atoms for the same string.
dbaron: i recommended doing away with the != 0 checks.  PRBool values shouldn't
have to be constrained to 0 or 1.  doing so is just bloat.  IMO code should
never compare a PRBool value directly against PR_TRUE.
We've had problems in the past with things like assigning the result of a
function returning PRBool into a PRPackedBool or a bitfield.  Regarding the
possibility of code bloat -- since this code is inline, it shouldn't matter,
since it should get optimized away.
Attached patch updated patch (obsolete) — Splinter Review
ok, here's yet another version.. this addresses most of dbaron's comments
except:
- I do specifically want to avoid non-ASCII static atoms unless we see some
need - there's no easy way to declare a UTF8 string in C++ anyway, so it should
be pretty hard to construct a non-ASCII static atom
- I didn't use already_AddReffed<nsIAtom> because it caused all sorts of
problems (like when I said "return foo->GetAtom();") - all the callers use raw
nsIAtom pointers and it just doesn't make sense to try to use nsCOMPtr here.
Attachment #117649 - Attachment is obsolete: true
Comment on attachment 117799 [details] [diff] [review]
updated patch

carrying over sr=darin because the changes were pretty minor, requesting
another review from dbaron. 

We're getting down to the wire here for mozilla 1.4alpha, and I still have
about 10 sites in the tree that I want to convert to nsStaticAtom, and they
depend on this patch landing!
Attachment #117799 - Flags: superreview+
Attachment #117799 - Flags: review?(dbaron)
Comment on attachment 117799 [details] [diff] [review]
updated patch


>+    if (he->HasValue() && aAtoms[i].mAtom)
>+      // there already is an atom with this name in the table.. but we
>+      // still have to update mAtom
>+      *aAtoms[i].mAtom = he->GetAtom();

Shouldn't this be (as I said above):
      if (he->HasValue()) {
	if (aAtoms[i].mAtom)
	  *aAtoms[i].mAtom = he->GetAtom();
      }
so that you avoid possible creation of two atoms for the same string?

>+    else {
>+      nsStaticAtomWrapper* atom = WrapStaticAtom(&aAtoms[i]);
>+      NS_ASSERTION(atom, "Failed to wrap static atom");
>+
>+      // but even if atom is null, no real difference in code..
>+      he->SetStaticAtomWrapper(atom);
>+      if (aAtoms[i].mAtom)
>+        *aAtoms[i].mAtom = atom;
>+    }
Actually, though, you really want even more than that, since you want to ensure
that if there's an existing atom, it's at least a permanent atom so that (1) we
don't waste time refcounting it and (2) you don't transfer a reference to a
structure that isn't planning to release it.  So what you really want is:

      if (he->HasValue()) {
        if (!he->IsStaticAtom() && !he->GetAtomImpl()->IsPermanent) {
          /* code from NS_NewPermanentAtom that does refcount business and
             placement new goes here, hopefully by moving it into an inline
             function that's called from both places */
        }
	if (aAtoms[i].mAtom)
	  *aAtoms[i].mAtom = he->GetAtom();
      } else {
        ...
Comment on attachment 117799 [details] [diff] [review]
updated patch

Of course, I missed the "()" after "IsPermanent" in my previous comment.  (And
I'd call the inline function something like "PromoteToPermanent".)

>+NS_COM
>+nsIAtom* NS_NewPermanentAtom( const nsACString& aString )
>+{
>+  AtomTableEntry *he = GetAtomHashEntry(PromiseFlatCString(aString).get());
> 
>+  if (he->HasValue() && he->IsStaticAtom())
>+    return he->GetAtom();

That last line could just be:

      return he->GetStaticAtomWrapper();

which is much simpler.

[nsNodeInfo.cpp]
>nsNodeInfo::Equals(const nsAString& aName, PRInt32 aNamespaceID) const

Could use some more indentation on the second line of the return expression.

>+  PRBool nameEqual, prefixEqual;

No need to add unused variables.  And you can remove |nullChar| too.
Comment on attachment 117799 [details] [diff] [review]
updated patch

[nsCSSStyleSheet.cpp]

The first of the changes in nsCSSStyleSheet is a change in behavior, since
you're changing things from full unicode string comparison to ASCII-only string
comparison.  (Fortunately that's what PL_strcasecmp does, rather than
ISO-8859-1 case-insensitive string comparison.)

I *think* this change is acceptable, but there's a chance it might break some
websites.  (It is just a quirk...)  And I don't see an alternative with
acceptable performance.

Judging from the changes you had to make, is it worth making |AppendToString|
and |AppendToUTF8String| methods on nsIAtom?  I think these reduce copying in a
bunch of cases, although they're not ideal.  Maybe I'll finish up my AtomString
work sometime.

The nsHTMLAttributes.cpp change is also a change in function (what type of case
comparison).  I can't answer whether it's acceptable.  [We really should have
had a UTF-8 string type.]

The second change in nsXMLElement.cpp might be a noticable performance hit
since you're moving things inside the loop and doing more work than you used
to.  Of course, the old code might have been prematurely optimized.  Likewise
for the first pair of nsXULDocument.cpp changes.

In nsXMLDocument.cpp, you don't need the added |PRBool equal;|.

The nsEditor.cpp change also changes the type of string comparison done.

I'm up to nsParserService.cpp.	More comments later...
ok, here's an updated patch. 

regarding some of the client conversions:
- the RuleHash_CIMatchEntry() conversion - I looked into this more, and
actually it looks like CSS classes are for the most part supposed to be case
sensitive, except in quirks mode (I'm sure you knew that) - but I'm also
thinking that "class" is also almost always going to be an ascii string. I
agree with your assessment. 

(One possibility is to make an nsCaseInsensitiveUTF8Comparator() class in
nsUnicharUtils.h which one-by-one promotes each pair of characters into UCS2,
and does a case-insensitive comparison.)

regarding the nsXMLElement changes, I discovered that BOTH of these atom
comparisons are just frivilous. I removed the atoms and just switched to
NS_LITERAL_STRING()

I don't see huge value in AppendToString() to be honest - just that stuff in
nsNodeInfo and I don't know how often thats really called (given that there are
a half dozen Equals() style methods, and that one in particular seems like the
last resort)

nsHTMLAttributes.cpp: this is for something called HasClass() which is checking
to see if the given attribute has a given class. I think we should just go with
the same policy we did for nsCSSStyleSheet above.
 
nsEditor.cpp: NodeIsType() is just checking the tag name... do we/should
we/should we not make assumptions that the tag name is ascii/isolatin1?
Attachment #117799 - Attachment is obsolete: true
Attachment #117887 - Flags: review?(dbaron)
Comment on attachment 117887 [details] [diff] [review]
updated patch, v1.5

In nsFontMetricsWin, there are many additional performance tweaks that could be
made, but it's not necessary at this point.

In nsFontMetricsWinA::LoadGenericFont you have the same type of case-comparison
change that I mentioned elsewhere.

The txNameTest.cpp change  changes a function that used to append to |aDest| to
overwrite |aDest|.  Is this significant?

You need an assertion somewhere that static atoms contain only ASCII.
It's not that hard to hand-code a character or two of UTF-8, and
considering that it will completely work except for
ToString(nsAString&), and that will only break if there's no non-static
atom for the string in question registered first, someone doing this
could easily think that it's OK (especially considering all the other
Atom APIs are explicitly UTF-8).

cls should review the rules.mk changes.

I'm hoping the performance cost of the conversions won't cause a problem, but I
suspect it will.  This has added a bunch of string conversions (and copying) to
some rather frequently-used places.  Have you run pageload performance tests?

r=dbaron assuming you've gone through all my comments.	I think you should
probably write a UTF-8 case comparison function, since we'll need it at some
point if we move things to UTF-8, and I think it is the safe thing to do here.
thanks! I'm going to land it now (late at night) and see what kind of
performance hit/improvement we get. I feel like the patch has  a mix of
conversions saved vs. new conversions added.. if I see any even minor
performance hit, I'll back out and see if I can narrow down the issue.
ugh, so this did cause a big performance hit to Ts/Txul/Tp :(

Have no fear, I'm going to back myself out on friday, when I'm a little more awake.
on a side note to GetUnicode, the Equals code actually created a copy for each
compare, that should, at least for users which have data bound to unicode
(transformiix is one), be slower than before. Using a nsDependentString around
a const PRUnichar* and calling Equals just looks more complicated, but calling
NS_ConvertUCS2toUTF8(aString) acutally is. That acutally loops the string twice,
just to be paranoid about the string length.
A side note, this stuff shouldn't pop up in performance hot spots, I guess, but
anyway.
nsIAtom.idl
 95 /**
 96  * Find an atom that matches the given ISO-Latin1 C string. The
 97  * C string is translated into its unicode equivalent.
 98  */

I don't think this comment is correct anymore. 
ah ha!
I figured out the cause of the Txul regression (and thus probably the Ts
regression) - it turns out as we parsed XUL we were using
nsINodeInfo::QualifiedNameEquals() on every single attribute in order to find
the nsINodeInfo associated with that particular attribute...we were calling this
in two loops in XUL, and every call to QualifiedNameEquals() would call
nsIAtom::Equals() up to 3 times, which would create an NS_ConvertUCS2toUTF8
every time. 

So my fix is to change QualifiedNameEquals to take a UTF8 string (const
nsACString&) and fix the 3 consumers of it to create the UTF8 string outside the
loops)

patch fragment forthcoming

I also figured out the tree selection regression - I had inadvertently changed
":-moz-tree-" to ":moz-tree" - duh!
here's a small, 5-file patch which shows how I changed nsINodeInfo.

Now to investigate the editor regression...
What about the case-insensitive UTF-8 string comparison?
I'm workin' on that...lets file a new bug since this one covers the basic
architectural changes.

also for those reading this, please see bug 199170, for converting nsIAtom
providers over to providing static atoms.
Status: ASSIGNED → RESOLVED
Closed: 21 years ago21 years ago
Resolution: --- → FIXED
So.. this still regressed Tp by about 1% and Ts and Txul by a bit...  in fact,
it seems to have regressed them all about as much as the original checkin had. 
See tinderbox logs for the 25th (and do it soon, since tinderbox is about to
forget them!)
ugh. I will investigate more, I must not have watched the tboxes for long enough

the good news is that bug 199170 got us a signifigant Ts improvement. 
A partial perf fix in XSLT, needed because of the NS_NewAtom slowdown, has been
documented here: http://bugzilla.mozilla.org/show_bug.cgi?id=205810 
Blocks: 215636
A possible way to have static atoms while still storing utf16 strings is being
investigated in bug 277479
*** Bug 105924 has been marked as a duplicate of this bug. ***
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: