Closed
Bug 711563
Opened 13 years ago
Closed 12 years ago
reduce space and relocations required by quickstubs tables
Categories
(Core :: XPConnect, defect)
Core
XPConnect
Tracking
()
RESOLVED
FIXED
mozilla12
People
(Reporter: froydnj, Assigned: froydnj)
Details
Attachments
(4 files, 4 obsolete files)
1.65 KB,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
8.91 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
10.20 KB,
patch
|
jorendorff
:
review+
|
Details | Diff | Splinter Review |
1.26 KB,
patch
|
froydnj
:
review+
|
Details | Diff | Splinter Review |
Quickstubs's tables take up more space than required due to: 1) size_t for hash table entry chaining; 2) 0-terminated property and function tables; 3) pointers to said tables, requiring relocations; and 4) pointers to strings in said tables, requiring relocations. Patches for the first three points to be attached shortly. A patch for the 4th should not be that hard, but I don't have it ready yet.
Assignee | ||
Comment 1•13 years ago
|
||
Our xpc_qsHashEntry table only has ~200ish entries; we don't need a full size_t to index all of them. Use a uint16_t instead, which saves a size_t for each entry. Not that big of a savings, admittedly, but still.
Assignee | ||
Comment 2•13 years ago
|
||
xpc_qsHashEntry contains a pointer to its properties and functions. This is all well and good, except that the pointer requires relocations at startup. (And pointers are big on 64-bit platforms, too.) The individual arrays also have an unnecessary nsnull entry at the end, which we don't need. Instead of per-entry properties and functions, just produce one big array of properties, one big array of functions, and record indices and lengths in the xpc_qsHashEntrys. No more relocations, no more nsnull entries. 16 bits of index and count should be enough for anybody! We have less than a thousand properties and functions each, so more than two orders of magnitude to grow here.
Attachment #582349 -
Flags: review?(jorendorff)
Comment 3•13 years ago
|
||
That first patch needs a PR_STATIC_ASSERT about table size _somewhere_.
Assignee | ||
Comment 4•13 years ago
|
||
Now with PR_STATIC_ASSERTs (thanks, bz!). Not using ArrayLength due to bizarre template errors, fwiw.
Attachment #582347 -
Attachment is obsolete: true
Attachment #582347 -
Flags: review?(jorendorff)
Attachment #582355 -
Flags: review?(jorendorff)
Updated•12 years ago
|
Attachment #582355 -
Flags: review?(jorendorff) → review+
Comment 5•12 years ago
|
||
Comment on attachment 582349 [details] [diff] [review] use flatter property and function tables Instead of 'mydict.has_key(key)', write 'key in mydict'. Don't use global state like iface_propspecs and iface_funcspecs. Here are two alternatives: - create them in one of the functions and pass them around everywhere - instead of using dictionaries, just stick the propspecs and funcspecs on each interface object. You can do it unconditionally at the end of writeStubsForInterface: iface.propspecs = propspecs iface.funcspecs = funcspecs There is no need to declare the new properties. Then writeSpecs would look like this: >+def writeSpecs(f, elementType, varname, spec_type, spec_indices, interfaces): >+ index = 0 >+ f.write("static const %s %s[] = {\n" % (elementType, varname)) >+ for iface in interfaces: >+ specs = getattr(iface, spec_type) >+ if specs: >+ spec_indices[iface.name] = index >+ f.write(" // %s (index %d)\n" % (iface.name,index)) >+ for s in specs: >+ f.write(" %s,\n" % s) >+ index += len(specs) >+ f.write("};\n\n") and for spec_type the caller would the string 'propspecs' or 'funcspecs'. How much of a win is this at startup?
Attachment #582349 -
Flags: review?(jorendorff) → review+
Assignee | ||
Comment 6•12 years ago
|
||
Changes as requested by review. Carrying over r+.
Attachment #582349 -
Attachment is obsolete: true
Attachment #584824 -
Flags: review+
Assignee | ||
Comment 7•12 years ago
|
||
Each const char * in our propspecs and funcspecs takes up sizeof(str) sizeof(const char *) + sizeof(relocation). This patch cuts out the relocations required by producing a string table containing all the names of properties and functions and storing indexes into that table in place of pointers. About 1k relocations are trimmed by doing this, or ~8k of data on x86 and ARM. A clever implementation would have used offsetof instead of computing the indices in Python. xpc_qsFunctionSpecs can be further reduced by better packing after this change; that patch is coming up.
Attachment #584834 -
Flags: review?(jorendorff)
Assignee | ||
Comment 8•12 years ago
|
||
'arity' in xpc_qs{Function,Traceable}Spec doesn't need a full 32-bit word; by downsizing it, we can rearrange those structures to take up less space.
Attachment #584835 -
Flags: review?(jorendorff)
Assignee | ||
Comment 9•12 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #5) > How much of a win is this at startup? I haven't tried to measure this. Overall savings are about 16K of data (.data.rel.ro and .rel.dyn; .rodata gets a smidge bigger due to the hash table entries not requiring relocations) on x86 Linux.
Comment 10•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #9) > (In reply to Jason Orendorff [:jorendorff] from comment #5) > > How much of a win is this at startup? > > I haven't tried to measure this. OK, how much do we usually expect to win per relocation removed? I don't need actual cachegrind numbers, any back-of-envelope number will do.
Assignee | ||
Comment 11•12 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #10) > OK, how much do we usually expect to win per relocation removed? I don't > need actual cachegrind numbers, any back-of-envelope number will do. Mike Hommey says normal relocation processing on Android (non-elfhacked) takes ~270ms. libxul has ~240k relocations and though our other shared libraries have relocations, let's say there is only libxul. We eliminate ~1.2k relocs with this patch, so the startup wins would be on the order of 1ms, plus slightly less disk I/O. Depressing, huh? :)
Comment 12•12 years ago
|
||
Comment on attachment 584834 [details] [diff] [review] use flatter property and function names In xpconnect/src/XPCQuickStubs.h: > struct xpc_qsTraceableSpec { >- const char *name; >+ uint16_t name_index; > JSNative native; > uintN arity; > }; Please just delete this struct. Also delete writeTraceableStub and MAX_TRACEABLE_NATIVE_ARGS from qsgen.py. All three are dead code. >+class StringTable: >+ def __init__(self): >+ self.current_index = 0; >+ self.table = {} >+ self.reverse_table = {} reverse_table is unnecessary. >+ def writeDefinition(self, f, name): >+ entries = self.reverse_table.items() >+ entries.sort(lambda x, y: cmp(x[0], y[0])) In this case, that turns out to be the default sort order, so you could just do: entries.sort() However since we're getting rid of reverse_table, you have to swap the pairs somehow, perhaps like this: entries = [(offset, str) for (str, offset) in self.table] entries.sort() or like this: entries = sorted((offset, str) for (str, offset) in self.table) By the way -- don't use cmp-functions when sorting in Python. Use key-functions instead. Your code ends up being shorter. (Key-sorting is also faster by a constant factor, not that it matters here.) For example, your original code would look like this: entries = self.reverse_table.items() entries.sort(key=lambda x: x[0]) More on key functions: http://wiki.python.org/moin/HowTo/Sorting#Key_Functions >+ f.write("union strtab {\n") >+ f.write(" struct f {\n") >+ for e in entries: >+ index = e[0] >+ string = e[1] >+ f.write(" char field%d[%d];\n" % (index, self.c_strlen(string))) >+ f.write(" } f;\n") >+ f.write(" char table[1];\n") >+ f.write("};\n\n") >+ >+ f.write("static const union strtab %s = { {\n" % name) >+ for e in entries: >+ f.write(' "%s",\n' % e[1]) >+ f.write("} };\n\n") You can simplify this to: f.write("static const char %s[] =\n" % name) for pair in entries[:-1]: f.write(' /* %5d */ "%s\0"\n' % pair) f.write(' /* %5d */ "%s";\n' % entries[-1]) (This code intentionally does not write any commas. C/C++ glues together adjacent string literal tokens.)
Attachment #584834 -
Flags: review?(jorendorff) → review+
Comment 13•12 years ago
|
||
Comment on attachment 584835 [details] [diff] [review] part 4 - compact function and traceable specs r=me with the Traceable stuff removed.
Attachment #584835 -
Flags: review?(jorendorff) → review+
Assignee | ||
Comment 14•12 years ago
|
||
(In reply to Jason Orendorff [:jorendorff] from comment #12) > You can simplify this to: > > f.write("static const char %s[] =\n" % name) > for pair in entries[:-1]: > f.write(' /* %5d */ "%s\0"\n' % pair) > f.write(' /* %5d */ "%s";\n' % entries[-1]) > > (This code intentionally does not write any commas. C/C++ glues together > adjacent string literal tokens.) This approach has two problems: 1) The standard does not require compilers to support arbitrary length string constants. The required lower bound on the maximum is somewhat small (~500 characters, IIRC). I don't know of any compilers we support that go quite that low, but some versions of MSVC have limits of 2k; more recent versions have a limit of 65k. (MSVC enforces its limits post-concatenation, fwiw.) 2) GCC will grouse at you for null characters in string literals. This warning is on by default and I haven't found any way to turn it off. Avoiding both of these issues is why I went with the union way. It is more robust, if more complicated.
Assignee | ||
Comment 15•12 years ago
|
||
I had an epiphany this morning: we can just write the string table out as a big char[], there's no need to do all the games with unions and structs. Requesting re-review as this is not a cosmetic change.
Attachment #584834 -
Attachment is obsolete: true
Attachment #588080 -
Flags: review?(jorendorff)
Assignee | ||
Comment 16•12 years ago
|
||
Traceable stuff nuked in part 3, as requested. Carrying over r+.
Attachment #584835 -
Attachment is obsolete: true
Attachment #588081 -
Flags: review+
Assignee | ||
Updated•12 years ago
|
Attachment #582355 -
Attachment description: uint16_t indices for hash table chains, v2 → part 1 - uint16_t indices for hash table chains, v2
Assignee | ||
Updated•12 years ago
|
Attachment #584824 -
Attachment description: use flatter property and function tables, v2 → part 2 -use flatter property and function tables, v2
Comment 17•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #11) > (In reply to Jason Orendorff [:jorendorff] from comment #10) > > OK, how much do we usually expect to win per relocation removed? I don't > > need actual cachegrind numbers, any back-of-envelope number will do. > > Mike Hommey says normal relocation processing on Android (non-elfhacked) > takes ~270ms. libxul has ~240k relocations and though our other shared > libraries have relocations, let's say there is only libxul. We eliminate > ~1.2k relocs with this patch, so the startup wins would be on the order of > 1ms, plus slightly less disk I/O. > > Depressing, huh? :) Hey, you shave off enough milliseconds and eventually you're snappy. Thanks for finding out for me.
Comment 18•12 years ago
|
||
(In reply to Nathan Froyd (:froydnj) from comment #14) > (In reply to Jason Orendorff [:jorendorff] from comment #12) > > You can simplify this to: > > > > f.write("static const char %s[] =\n" % name) > > for pair in entries[:-1]: > > f.write(' /* %5d */ "%s\0"\n' % pair) > > f.write(' /* %5d */ "%s";\n' % entries[-1]) > > 2) GCC will grouse at you for null characters in string literals. This > warning is on by default and I haven't found any way to turn it off. I wonder if this is because I wrote \0 when I should have written \\0. In any case the char array is fine.
Comment 19•12 years ago
|
||
Comment on attachment 588080 [details] [diff] [review] part 3 - use flatter property and function names Looks good.
Attachment #588080 -
Flags: review?(jorendorff) → review+
Assignee | ||
Updated•12 years ago
|
Keywords: checkin-needed
Comment 20•12 years ago
|
||
Has this passed try?
Comment 21•12 years ago
|
||
Try run for ff93001d627c is complete. Detailed breakdown of the results available here: https://tbpl.mozilla.org/?tree=Try&rev=ff93001d627c Results (out of 276 total builds): exception: 2 success: 245 warnings: 26 failure: 3 Builds (or logs if builds failed) available at: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/philringnalda@gmail.com-ff93001d627c
Comment 22•12 years ago
|
||
Behold, the twisted and mutilated visage of a successful try run.
Comment 23•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/15a53db6fca9 https://hg.mozilla.org/integration/mozilla-inbound/rev/1f327a1c588b https://hg.mozilla.org/integration/mozilla-inbound/rev/dbb0ed6a5a6f https://hg.mozilla.org/integration/mozilla-inbound/rev/9cf47a611458
Keywords: checkin-needed
Target Milestone: --- → mozilla12
Comment 24•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/15a53db6fca9 https://hg.mozilla.org/mozilla-central/rev/1f327a1c588b https://hg.mozilla.org/mozilla-central/rev/dbb0ed6a5a6f https://hg.mozilla.org/mozilla-central/rev/9cf47a611458
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•