Closed
Bug 711563
Opened 14 years ago
Closed 14 years ago
reduce space and relocations required by quickstubs tables
Categories
(Core :: XPConnect, defect)
Core
XPConnect
Tracking
()
RESOLVED
FIXED
mozilla12
People
(Reporter: froydnj, Assigned: froydnj)
Details
Attachments
(4 files, 4 obsolete files)
|
1.65 KB,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
|
8.91 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
|
10.20 KB,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
|
1.26 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
Quickstubs's tables take up more space than required due to:
1) size_t for hash table entry chaining;
2) 0-terminated property and function tables;
3) pointers to said tables, requiring relocations; and
4) pointers to strings in said tables, requiring relocations.
Patches for the first three points to be attached shortly. A patch for the 4th should not be that hard, but I don't have it ready yet.
| Assignee | ||
Comment 1•14 years ago
|
||
Our xpc_qsHashEntry table only has ~200ish entries; we don't need a full size_t to index all of them. Use a uint16_t instead, which saves a size_t for each entry. Not that big of a savings, admittedly, but still.
| Assignee | ||
Comment 2•14 years ago
|
||
xpc_qsHashEntry contains a pointer to its properties and functions. This is all well and good, except that the pointer requires relocations at startup. (And pointers are big on 64-bit platforms, too.) The individual arrays also have an unnecessary nsnull entry at the end, which we don't need.
Instead of per-entry properties and functions, just produce one big array of properties, one big array of functions, and record indices and lengths in the xpc_qsHashEntrys. No more relocations, no more nsnull entries.
16 bits of index and count should be enough for anybody! We have less than a thousand properties and functions each, so more than two orders of magnitude to grow here.
Attachment #582349 -
Flags: review?(jorendorff)
Comment 3•14 years ago
|
||
That first patch needs a PR_STATIC_ASSERT about table size _somewhere_.
| Assignee | ||
Comment 4•14 years ago
|
||
Now with PR_STATIC_ASSERTs (thanks, bz!).
Not using ArrayLength due to bizarre template errors, fwiw.
Attachment #582347 -
Attachment is obsolete: true
Attachment #582347 -
Flags: review?(jorendorff)
Attachment #582355 -
Flags: review?(jorendorff)
Updated•14 years ago
|
Attachment #582355 -
Flags: review?(jorendorff) → review+
Comment 5•14 years ago
|
||
Comment on attachment 582349 [details] [diff] [review]
use flatter property and function tables
Instead of 'mydict.has_key(key)', write 'key in mydict'.
Don't use global state like iface_propspecs and iface_funcspecs. Here are two alternatives:
- create them in one of the functions and pass them around everywhere
- instead of using dictionaries, just stick the propspecs and funcspecs on
each interface object. You can do it unconditionally at the end of
writeStubsForInterface:
iface.propspecs = propspecs
iface.funcspecs = funcspecs
There is no need to declare the new properties.
Then writeSpecs would look like this:
>+def writeSpecs(f, elementType, varname, spec_type, spec_indices, interfaces):
>+ index = 0
>+ f.write("static const %s %s[] = {\n" % (elementType, varname))
>+ for iface in interfaces:
>+ specs = getattr(iface, spec_type)
>+ if specs:
>+ spec_indices[iface.name] = index
>+ f.write(" // %s (index %d)\n" % (iface.name,index))
>+ for s in specs:
>+ f.write(" %s,\n" % s)
>+ index += len(specs)
>+ f.write("};\n\n")
and for spec_type the caller would the string 'propspecs' or 'funcspecs'.
How much of a win is this at startup?
Attachment #582349 -
Flags: review?(jorendorff) → review+
| Assignee | ||
Comment 6•14 years ago
|
||
Changes as requested by review. Carrying over r+.
Attachment #582349 -
Attachment is obsolete: true
Attachment #584824 -
Flags: review+
| Assignee | ||
Comment 7•14 years ago
|
||
Each const char * in our propspecs and funcspecs takes up sizeof(str) sizeof(const char *) + sizeof(relocation). This patch cuts out the relocations required by producing a string table containing all the names of properties and functions and storing indexes into that table in place of pointers. About 1k relocations are trimmed by doing this, or ~8k of data on x86 and ARM.
A clever implementation would have used offsetof instead of computing the indices in Python.
xpc_qsFunctionSpecs can be further reduced by better packing after this change; that patch is coming up.
Attachment #584834 -
Flags: review?(jorendorff)
| Assignee | ||
Comment 8•14 years ago
|
||
'arity' in xpc_qs{Function,Traceable}Spec doesn't need a full 32-bit word; by downsizing it, we can rearrange those structures to take up less space.
Attachment #584835 -
Flags: review?(jorendorff)
| Assignee | ||
Comment 9•14 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #5)
> How much of a win is this at startup?
I haven't tried to measure this. Overall savings are about 16K of data (.data.rel.ro and .rel.dyn; .rodata gets a smidge bigger due to the hash table entries not requiring relocations) on x86 Linux.
Comment 10•14 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #9)
> (In reply to Jason Orendorff [:jorendorff] from comment #5)
> > How much of a win is this at startup?
>
> I haven't tried to measure this.
OK, how much do we usually expect to win per relocation removed? I don't need actual cachegrind numbers, any back-of-envelope number will do.
| Assignee | ||
Comment 11•14 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #10)
> OK, how much do we usually expect to win per relocation removed? I don't
> need actual cachegrind numbers, any back-of-envelope number will do.
Mike Hommey says normal relocation processing on Android (non-elfhacked) takes ~270ms. libxul has ~240k relocations and though our other shared libraries have relocations, let's say there is only libxul. We eliminate ~1.2k relocs with this patch, so the startup wins would be on the order of 1ms, plus slightly less disk I/O.
Depressing, huh? :)
Comment 12•14 years ago
|
||
Comment on attachment 584834 [details] [diff] [review]
use flatter property and function names
In xpconnect/src/XPCQuickStubs.h:
> struct xpc_qsTraceableSpec {
>- const char *name;
>+ uint16_t name_index;
> JSNative native;
> uintN arity;
> };
Please just delete this struct. Also delete writeTraceableStub and MAX_TRACEABLE_NATIVE_ARGS from qsgen.py. All three are dead code.
>+class StringTable:
>+ def __init__(self):
>+ self.current_index = 0;
>+ self.table = {}
>+ self.reverse_table = {}
reverse_table is unnecessary.
>+ def writeDefinition(self, f, name):
>+ entries = self.reverse_table.items()
>+ entries.sort(lambda x, y: cmp(x[0], y[0]))
In this case, that turns out to be the default sort order, so you could just do:
entries.sort()
However since we're getting rid of reverse_table, you have to swap the pairs somehow, perhaps like this:
entries = [(offset, str) for (str, offset) in self.table]
entries.sort()
or like this:
entries = sorted((offset, str) for (str, offset) in self.table)
By the way -- don't use cmp-functions when sorting in Python. Use key-functions instead. Your code ends up being shorter. (Key-sorting is also faster by a constant factor, not that it matters here.) For example, your original code would look like this:
entries = self.reverse_table.items()
entries.sort(key=lambda x: x[0])
More on key functions:
http://wiki.python.org/moin/HowTo/Sorting#Key_Functions
>+ f.write("union strtab {\n")
>+ f.write(" struct f {\n")
>+ for e in entries:
>+ index = e[0]
>+ string = e[1]
>+ f.write(" char field%d[%d];\n" % (index, self.c_strlen(string)))
>+ f.write(" } f;\n")
>+ f.write(" char table[1];\n")
>+ f.write("};\n\n")
>+
>+ f.write("static const union strtab %s = { {\n" % name)
>+ for e in entries:
>+ f.write(' "%s",\n' % e[1])
>+ f.write("} };\n\n")
You can simplify this to:
f.write("static const char %s[] =\n" % name)
for pair in entries[:-1]:
f.write(' /* %5d */ "%s\0"\n' % pair)
f.write(' /* %5d */ "%s";\n' % entries[-1])
(This code intentionally does not write any commas. C/C++ glues together adjacent string literal tokens.)
Attachment #584834 -
Flags: review?(jorendorff) → review+
Comment 13•14 years ago
|
||
Comment on attachment 584835 [details] [diff] [review]
part 4 - compact function and traceable specs
r=me with the Traceable stuff removed.
Attachment #584835 -
Flags: review?(jorendorff) → review+
| Assignee | ||
Comment 14•14 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #12)
> You can simplify this to:
>
> f.write("static const char %s[] =\n" % name)
> for pair in entries[:-1]:
> f.write(' /* %5d */ "%s\0"\n' % pair)
> f.write(' /* %5d */ "%s";\n' % entries[-1])
>
> (This code intentionally does not write any commas. C/C++ glues together
> adjacent string literal tokens.)
This approach has two problems:
1) The standard does not require compilers to support arbitrary length string constants. The required lower bound on the maximum is somewhat small (~500 characters, IIRC). I don't know of any compilers we support that go quite that low, but some versions of MSVC have limits of 2k; more recent versions have a limit of 65k. (MSVC enforces its limits post-concatenation, fwiw.)
2) GCC will grouse at you for null characters in string literals. This warning is on by default and I haven't found any way to turn it off.
Avoiding both of these issues is why I went with the union way. It is more robust, if more complicated.
| Assignee | ||
Comment 15•14 years ago
|
||
I had an epiphany this morning: we can just write the string table out as a big char[], there's no need to do all the games with unions and structs.
Requesting re-review as this is not a cosmetic change.
Attachment #584834 -
Attachment is obsolete: true
Attachment #588080 -
Flags: review?(jorendorff)
| Assignee | ||
Comment 16•14 years ago
|
||
Traceable stuff nuked in part 3, as requested. Carrying over r+.
Attachment #584835 -
Attachment is obsolete: true
Attachment #588081 -
Flags: review+
| Assignee | ||
Updated•14 years ago
|
Attachment #582355 -
Attachment description: uint16_t indices for hash table chains, v2 → part 1 - uint16_t indices for hash table chains, v2
| Assignee | ||
Updated•14 years ago
|
Attachment #584824 -
Attachment description: use flatter property and function tables, v2 → part 2 -use flatter property and function tables, v2
Comment 17•14 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #11)
> (In reply to Jason Orendorff [:jorendorff] from comment #10)
> > OK, how much do we usually expect to win per relocation removed? I don't
> > need actual cachegrind numbers, any back-of-envelope number will do.
>
> Mike Hommey says normal relocation processing on Android (non-elfhacked)
> takes ~270ms. libxul has ~240k relocations and though our other shared
> libraries have relocations, let's say there is only libxul. We eliminate
> ~1.2k relocs with this patch, so the startup wins would be on the order of
> 1ms, plus slightly less disk I/O.
>
> Depressing, huh? :)
Hey, you shave off enough milliseconds and eventually you're snappy. Thanks for finding out for me.
Comment 18•14 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #14)
> (In reply to Jason Orendorff [:jorendorff] from comment #12)
> > You can simplify this to:
> >
> > f.write("static const char %s[] =\n" % name)
> > for pair in entries[:-1]:
> > f.write(' /* %5d */ "%s\0"\n' % pair)
> > f.write(' /* %5d */ "%s";\n' % entries[-1])
>
> 2) GCC will grouse at you for null characters in string literals. This
> warning is on by default and I haven't found any way to turn it off.
I wonder if this is because I wrote \0 when I should have written \\0.
In any case the char array is fine.
Comment 19•14 years ago
|
||
Comment on attachment 588080 [details] [diff] [review]
part 3 - use flatter property and function names
Looks good.
Attachment #588080 -
Flags: review?(jorendorff) → review+
| Assignee | ||
Updated•14 years ago
|
Keywords: checkin-needed
Comment 20•14 years ago
|
||
Has this passed try?
Comment 21•14 years ago
|
||
Try run for ff93001d627c is complete.
Detailed breakdown of the results available here:
https://tbpl.mozilla.org/?tree=Try&rev=ff93001d627c
Results (out of 276 total builds):
exception: 2
success: 245
warnings: 26
failure: 3
Builds (or logs if builds failed) available at:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/philringnalda@gmail.com-ff93001d627c
Comment 22•14 years ago
|
||
Behold, the twisted and mutilated visage of a successful try run.
Comment 23•14 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/15a53db6fca9
https://hg.mozilla.org/integration/mozilla-inbound/rev/1f327a1c588b
https://hg.mozilla.org/integration/mozilla-inbound/rev/dbb0ed6a5a6f
https://hg.mozilla.org/integration/mozilla-inbound/rev/9cf47a611458
Keywords: checkin-needed
Target Milestone: --- → mozilla12
Comment 24•14 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/15a53db6fca9
https://hg.mozilla.org/mozilla-central/rev/1f327a1c588b
https://hg.mozilla.org/mozilla-central/rev/dbb0ed6a5a6f
https://hg.mozilla.org/mozilla-central/rev/9cf47a611458
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•