Closed
Bug 704848
Opened 13 years ago
Closed 12 years ago
reduce space required by nsEffectiveTLDService with more preprocessing
Categories
(Core :: Networking, defect)
Core
Networking
Tracking
()
RESOLVED
FIXED
mozilla18
People
(Reporter: froydnj, Unassigned)
Details
Attachments
(1 file, 4 obsolete files)
7.88 KB,
patch
|
jduell.mcbugs
:
review+
|
Details | Diff | Splinter Review |
The space required by nsEffectiveTLDService.cpp:gEntries can be reduced significantly with a little more work in prepare_tlds.py. Patch and explanation coming up.
Reporter | ||
Comment 1•13 years ago
|
||
The pointers in ETLDEntry create two problems: 1. They (indirectly) introduce padding at the end of ETLDEntry, wasting space; 2. They require runtime relocations, increasing startup time and footprint. This patch addresses both of these issues by turning the pointers into indices into a private string table. The runtime cost is minimal: an extra arithmetic instruction on x86, probably two arithmetic instructions on ARM, with GCC. I haven't checked what MSVC does to the code, but I would hope it does something similar. After doing this: 1. ETLDEntry shrinks from 8 bytes on 32-bit platforms to 4 bytes. (The savings are even better on 64-bit systems.) 2. We no longer need runtime relocations (there are no pointers in the data), so we save on the space required by them and the startup cost needed to process them. I realize elfhack helps reduce the space of the relocations, but the best work is work that you don't do, right? All told, we win by ~30-40K by doing this (16K of table size, plus 32K of relocation entries on x86/ARM, shrunk by whatever elfhack does to them). I assume something similar happens on Windows. I realize this is sort of an odd thing to do, but looking at data that needed relocations, nsEffectiveTLDService.cpp:gEntries was the biggest offender that was easily modifiable (e.g. not vtables, not table-driven QueryInterface bits).
Assignee: nobody → nfroyd
Attachment #576520 -
Flags: review?(jduell.mcbugs)
Reporter | ||
Comment 2•13 years ago
|
||
Actually, this doesn't quite build right; other files that include nsEffectiveTLDService.h don't know where to find the now-required .inc file generated by the build process. This is handled correctly for netwerk/dns/, but not for other netwerk/ subdirectories. Will have a think about how to handle this; suggestions welcome. Going to change review? to feedback?.
Comment hidden (typo) |
Reporter | ||
Comment 4•13 years ago
|
||
New patch, this time appropriately modifying the build system so everything builds. It also includes preprocessor magic that reduces the Python modifications required and eliminates warnings about NULLs in strings from GCC (and possibly elsewhere). Also gone are the two-level arrays and extra indexing that required; any compiler that can't generate good code now is not worth using. Static asserts that our indices into the string table do not overflow are also provided, at a small cost: 1 byte for a dummy function totally unused at runtime. The placement of the actual data in nsDomainEntry is a little weird, but seeing as how putting it anywhere else has problems, it is the best solution. I looked at the generated code and it's ideal. Actually asking for review this time. Benefits: 40-50k smaller binary, nearly 2% fewer relocations to help improve startup.
Attachment #576520 -
Attachment is obsolete: true
Attachment #576520 -
Flags: feedback?(jduell.mcbugs)
Attachment #582382 -
Flags: review?(jduell.mcbugs)
Reporter | ||
Comment 5•13 years ago
|
||
Now with less idiocy on my part. We don't need a separate index array because we're storing the indices directly in the ETLDEntry struct.
Attachment #582382 -
Attachment is obsolete: true
Attachment #582382 -
Flags: review?(jduell.mcbugs)
Attachment #582445 -
Flags: review?(jduell.mcbugs)
Reporter | ||
Comment 6•12 years ago
|
||
Should mention that if the giant struct method is unpalatable, doing it the way parser/html/nsHtml5NamedCharactersInclude.h does it is also a possibility. (We're generating the #include, so adding lengths and splitting things into spaces is no problem.)
Comment 7•12 years ago
|
||
As discussed with Nathan on IRC, this patch look good but I think we can do even better using gperf (it will also let us skip initializing the ~5K element hash table at startup). I'm going to give that a go.
Assignee: nfroyd → nobody
Component: Networking → Nanojit
QA Contact: networking → nanojit
Comment 8•12 years ago
|
||
Comment on attachment 582445 [details] [diff] [review] patch v3 Clearing review--if gperf approach doesn't work, we'll revisit.
Attachment #582445 -
Flags: review?(jduell.mcbugs)
Updated•12 years ago
|
Component: Nanojit → Networking
QA Contact: nanojit → networking
Reporter | ||
Comment 9•12 years ago
|
||
FWIW, gperf's hashtable, even with -P (optimize for fewer relocations in shared libraries) produces a struct with: { char *name; bool b1; bool b2; } which is the same as what we have today. The generated code spews warnings with GCC (not even with -Wall, just -O2) and features dodgy casts like (int)(long)&chararray[index]. Hacking gperf might be an option (would be useful in other places)...
Comment 10•12 years ago
|
||
-P gives us an "int" offset (not a char *), which is still not what we want (we want an unsigned 16 bit value). I'm planning to either postprocess that with a python script, or add a flag to gconf.
> dodgy casts like (int)(long)&chararray[index]
what do you recommend? Casting to a uintptr_t, then to PRUint16?
Component: Networking → Nanojit
Reporter | ||
Comment 11•12 years ago
|
||
Doh, I skimmed the documentation for -P and didn't notice that you should change the type of the 'name' field. We still have the dodgy casts, but defining: struct dnsentry { PRUint16 offset; bool isException; bool isWild; }; works just fine and eliminates the warning spew. The generated hashtable is ~300K; it can be cut down to ~220K by using an appropriately high -m option. It's not clear that trading a couple hundred K of on-disk space to save several tens of K of runtime memory and some computation is worth it.
Comment 12•12 years ago
|
||
Nathan, Thanks for looking into this. If you've got momentum and want to write a patch that uses gperf, grab this from me and go for it. Otherwise I should get to it pretty soon.
Reporter | ||
Comment 14•12 years ago
|
||
FWIW, DMD reports the ETLD hashtable as: ==1512== Unreported: 1 block(s) in record 14 of 13721 ==1512== 135,168 bytes (131,076 requested / 4,092 slop) (this is a debug build, hence the slop) So the table we have today takes up ~128K at runtime; I don't think trading that for a gperf hashtable that's twice the size on-disk would be worthwhile.
Component: Nanojit → Networking
Reporter | ||
Comment 15•12 years ago
|
||
Comment on attachment 582445 [details] [diff] [review] patch v3 Hey Jason, it's been six months with no movement on the gperf front. I think this patch is worthwhile to go in as-is, so setting r?jduell on it.
Attachment #582445 -
Flags: review?(jduell.mcbugs)
Comment 16•12 years ago
|
||
Sounds like a good plan--I'll review, but may not happen until after B2G fork.
Assignee: jduell.mcbugs → nobody
Comment 17•12 years ago
|
||
Comment on attachment 582445 [details] [diff] [review] patch v3 Is this review something we can delegate to Steve Workman or Patrick McManus?
Comment 18•12 years ago
|
||
I can find the cycles to keep reviewing this. Sorry for the delay. So I unbitrotted the patch (mainly PRType -> stdint). Alas, it's now barfing during compilation with a warning (which, thankfully we treat as an error--yay!) that int constants are being truncated to fit into the uint16 'strtab_index' field. The culprit? effective_tld_names.dat has grown from 71K (including comments) to 98K since this patch was submitted, and the approach that stores an index into a concatenated array of domain names will no longer work with a 16 bit index. (I'm going to email Gerv to see if we can try to contain the growth rate of the ETLD database, which is getting scary). Going forward there are probably a bunch of ways to proceed that will still save us memory storage. Off the top of my head: we could generate two separate arrays, and pick which is indexed based on some simple hash (first character of domain name, etc).
Attachment #582445 -
Attachment is obsolete: true
Attachment #582445 -
Flags: review?(jduell.mcbugs)
Attachment #659968 -
Flags: feedback+
Reporter | ||
Comment 19•12 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #18) > I can find the cycles to keep reviewing this. Sorry for the delay. Thanks for the review. > Going forward there are probably a bunch of ways to proceed that will still > save us memory storage. Off the top of my head: we could generate two > separate arrays, and pick which is indexed based on some simple hash (first > character of domain name, etc). I think the easiest way is to use bitfields: struct ETLDEntry { uint32_t strtab_index : 30; uint32_t exception : 1; uint32_t wild : 1; }; 30 bits should be enough for anybody!
Reporter | ||
Comment 20•12 years ago
|
||
Here's an updated patch that compiles on my Linux box (I think previous versions were failing to define the `strings' structure, leading to link errors; not sure how I missed that...).
Attachment #659968 -
Attachment is obsolete: true
Attachment #660035 -
Flags: review?(jduell.mcbugs)
Comment 21•12 years ago
|
||
To answer jduell's question: the change is mostly due to an update from the .jp registry, which instituted a large set of fairly specific regional 3rd level domains, of which there are over 1600 :-| This sort of thing is a fairly unusual occurrence. We don't have a specific plan to limit the size of the list because it's merely a reflection of reality. The only way to reduce its size, in its raw form, would be to make it less accurate. Of course, one could do cunning things to reduce its size in processed/compiled form, as we are doing here... Gerv
Comment 22•12 years ago
|
||
Comment on attachment 660035 [details] [diff] [review] patch v5, v4 + bitfields Review of attachment 660035 [details] [diff] [review]: ----------------------------------------------------------------- Looks good! Please run through try before landing. Thanks for taking this on and being patient with reviews :) ::: netwerk/dns/prepare_tlds.py @@ +109,2 @@ > > + for (i, etld) in enumerate(getEffectiveTLDs(sys.argv[1])): So we don't use "i": any reason to use enumerate then?
Attachment #660035 -
Flags: review?(jduell.mcbugs) → review+
Reporter | ||
Comment 23•12 years ago
|
||
(In reply to Jason Duell (:jduell) from comment #22) > Thanks for taking this on and being patient with reviews :) Thanks for the review! > ::: netwerk/dns/prepare_tlds.py > @@ +109,2 @@ > > > > + for (i, etld) in enumerate(getEffectiveTLDs(sys.argv[1])): > > So we don't use "i": any reason to use enumerate then? Hm, no. Fixed in the push: https://hg.mozilla.org/integration/mozilla-inbound/rev/e7b4f8be9a4d The try run was a little odd: https://tbpl.mozilla.org/?tree=Try&rev=64db83b424a8 There was one (heretofore unknown?) intermittent orange (browser/base/content/test/browser_bug435235.js), but I'm fairly certain that it was unrelated to the specific changes made here. If the sheriffs and/or the networking folks feel differently, I'm happy to back this out.
Comment 24•12 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/e7b4f8be9a4d
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
You need to log in
before you can comment on or make changes to this bug.
Description
•