Last Comment Bug 704848 - reduce space required by nsEffectiveTLDService with more preprocessing
: reduce space required by nsEffectiveTLDService with more preprocessing
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Networking (show other bugs)
: unspecified
: All All
: -- normal (vote)
: mozilla18
Assigned To: Nobody; OK to take it and work on it
:
: Patrick McManus [:mcmanus]
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-11-23 08:46 PST by Nathan Froyd [:froydnj]
Modified: 2012-09-13 13:06 PDT (History)
9 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
patch (8.83 KB, patch)
2011-11-23 09:03 PST, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
patch, v2 (6.10 KB, patch)
2011-12-16 13:53 PST, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
patch v3 (7.39 KB, patch)
2011-12-16 17:35 PST, Nathan Froyd [:froydnj]
no flags Details | Diff | Splinter Review
v4: v3, unbitrotted, but now overflows (7.44 KB, patch)
2012-09-10 22:20 PDT, Jason Duell [:jduell] (needinfo me)
jduell.mcbugs: feedback+
Details | Diff | Splinter Review
patch v5, v4 + bitfields (7.88 KB, patch)
2012-09-11 04:49 PDT, Nathan Froyd [:froydnj]
jduell.mcbugs: review+
Details | Diff | Splinter Review

Description Nathan Froyd [:froydnj] 2011-11-23 08:46:10 PST
The space required by nsEffectiveTLDService.cpp:gEntries can be reduced significantly with a little more work in prepare_tlds.py.  Patch and explanation coming up.
Comment 1 Nathan Froyd [:froydnj] 2011-11-23 09:03:57 PST
Created attachment 576520 [details] [diff] [review]
patch

The pointers in ETLDEntry create two problems:

1. They (indirectly) introduce padding at the end of ETLDEntry, wasting space;
2. They require runtime relocations, increasing startup time and footprint.

This patch addresses both of these issues by turning the pointers into indices into a private string table.  The runtime cost is minimal: an extra arithmetic instruction on x86, probably two arithmetic instructions on ARM, with GCC.  I haven't checked what MSVC does to the code, but I would hope it does something similar.  After doing this:

1. ETLDEntry shrinks from 8 bytes on 32-bit platforms to 4 bytes.  (The savings are even better on 64-bit systems.)
2. We no longer need runtime relocations (there are no pointers in the data), so we save on the space required by them and the startup cost needed to process them.  I realize elfhack helps reduce the space of the relocations, but the best work is work that you don't do, right?

All told, we win by ~30-40K by doing this (16K of table size, plus 32K of relocation entries on x86/ARM, shrunk by whatever elfhack does to them).  I assume something similar happens on Windows.

I realize this is sort of an odd thing to do, but looking at data that needed relocations, nsEffectiveTLDService.cpp:gEntries was the biggest offender that was easily modifiable (e.g. not vtables, not table-driven QueryInterface bits).
Comment 2 Nathan Froyd [:froydnj] 2011-11-23 11:16:07 PST
Actually, this doesn't quite build right; other files that include nsEffectiveTLDService.h don't know where to find the now-required .inc file generated by the build process.  This is handled correctly for netwerk/dns/, but not for other netwerk/ subdirectories.  Will have a think about how to handle this; suggestions welcome.

Going to change review? to feedback?.
Comment 3 Nathan Froyd [:froydnj] 2011-11-23 11:16:37 PST Comment hidden (typo)
Comment 4 Nathan Froyd [:froydnj] 2011-12-16 13:53:17 PST
Created attachment 582382 [details] [diff] [review]
patch, v2

New patch, this time appropriately modifying the build system so everything builds.  It also includes preprocessor magic that reduces the Python modifications required and eliminates warnings about NULLs in strings from GCC (and possibly elsewhere).  Also gone are the two-level arrays and extra indexing that required; any compiler that can't generate good code now is not worth using.  Static asserts that our indices into the string table do not overflow are also provided, at a small cost: 1 byte for a dummy function totally unused at runtime.

The placement of the actual data in nsDomainEntry is a little weird, but seeing as how putting it anywhere else has problems, it is the best solution.  I looked at the generated code and it's ideal.

Actually asking for review this time.  Benefits: 40-50k smaller binary, nearly 2% fewer relocations to help improve startup.
Comment 5 Nathan Froyd [:froydnj] 2011-12-16 17:35:08 PST
Created attachment 582445 [details] [diff] [review]
patch v3

Now with less idiocy on my part.  We don't need a separate index array because we're storing the indices directly in the ETLDEntry struct.
Comment 6 Nathan Froyd [:froydnj] 2012-01-18 11:48:19 PST
Should mention that if the giant struct method is unpalatable, doing it the way parser/html/nsHtml5NamedCharactersInclude.h does it is also a possibility.  (We're generating the #include, so adding lengths and splitting things into spaces is no problem.)
Comment 7 Jason Duell [:jduell] (needinfo me) 2012-02-04 15:35:07 PST
As discussed with Nathan on IRC, this patch look good but I think we can do even better using gperf (it will also let us skip initializing the ~5K element hash table at startup).  I'm going to give that a go.
Comment 8 Jason Duell [:jduell] (needinfo me) 2012-02-04 15:35:51 PST
Comment on attachment 582445 [details] [diff] [review]
patch v3

Clearing review--if gperf approach doesn't work, we'll revisit.
Comment 9 Nathan Froyd [:froydnj] 2012-02-04 23:06:50 PST
FWIW, gperf's hashtable, even with -P (optimize for fewer relocations in shared libraries) produces a struct with:

{ char *name; bool b1; bool b2; }

which is the same as what we have today.

The generated code spews warnings with GCC (not even with -Wall, just -O2) and features dodgy casts like (int)(long)&chararray[index].  Hacking gperf might be an option (would be useful in other places)...
Comment 10 Jason Duell [:jduell] (needinfo me) 2012-02-05 19:37:43 PST
-P gives us an "int" offset (not a char *), which is still not what we want (we want an unsigned 16 bit value).  I'm planning to either postprocess that with a python script, or add a flag to gconf.

> dodgy casts like (int)(long)&chararray[index]

what do you recommend?  Casting to a uintptr_t, then to PRUint16?
Comment 11 Nathan Froyd [:froydnj] 2012-02-08 06:19:37 PST
Doh, I skimmed the documentation for -P and didn't notice that you should change the type of the 'name' field.  We still have the dodgy casts, but defining:

struct dnsentry { PRUint16 offset; bool isException; bool isWild; };

works just fine and eliminates the warning spew.

The generated hashtable is ~300K; it can be cut down to ~220K by using an appropriately high -m option.  It's not clear that trading a couple hundred K of on-disk space to save several tens of K of runtime memory and some computation is worth it.
Comment 12 Jason Duell [:jduell] (needinfo me) 2012-02-08 09:34:34 PST
Nathan,

Thanks for looking into this.  If you've got momentum and want to write a patch that uses gperf, grab this from me and go for it.  Otherwise I should get to it pretty soon.
Comment 13 Nathan Froyd [:froydnj] 2012-02-08 10:00:41 PST
Go for it. :)
Comment 14 Nathan Froyd [:froydnj] 2012-06-08 06:38:56 PDT
FWIW, DMD reports the ETLD hashtable as:

==1512== Unreported: 1 block(s) in record 14 of 13721
==1512==  135,168 bytes (131,076 requested / 4,092 slop)

(this is a debug build, hence the slop)

So the table we have today takes up ~128K at runtime; I don't think trading that for a gperf hashtable that's twice the size on-disk would be worthwhile.
Comment 15 Nathan Froyd [:froydnj] 2012-08-08 09:37:30 PDT
Comment on attachment 582445 [details] [diff] [review]
patch v3

Hey Jason, it's been six months with no movement on the gperf front.  I think this patch is worthwhile to go in as-is, so setting r?jduell on it.
Comment 16 Jason Duell [:jduell] (needinfo me) 2012-08-08 22:00:17 PDT
Sounds like a good plan--I'll review, but may not happen until after B2G fork.
Comment 17 Josh Aas 2012-09-08 13:42:04 PDT
Comment on attachment 582445 [details] [diff] [review]
patch v3

Is this review something we can delegate to Steve Workman or Patrick McManus?
Comment 18 Jason Duell [:jduell] (needinfo me) 2012-09-10 22:20:35 PDT
Created attachment 659968 [details] [diff] [review]
v4: v3, unbitrotted, but now overflows

I can find the cycles to keep reviewing this.  Sorry for the delay.

So I unbitrotted the patch (mainly PRType -> stdint).  Alas, it's now barfing during compilation with a warning (which, thankfully we treat as an error--yay!) that int constants are being truncated to fit into the uint16 'strtab_index' field.

The culprit?  effective_tld_names.dat has grown from  71K (including comments) to 98K since this patch was submitted, and the approach that stores an index into a concatenated array of domain names will no longer work with a 16 bit index.   (I'm going to email Gerv to see if we can try to contain the growth rate of the ETLD database, which is getting scary).

Going forward there are probably a bunch of ways to proceed that will still save us memory storage.  Off the top of my head: we could generate two separate arrays, and pick which is indexed based on some simple hash (first character of domain name, etc).
Comment 19 Nathan Froyd [:froydnj] 2012-09-11 01:02:50 PDT
(In reply to Jason Duell (:jduell) from comment #18)
> I can find the cycles to keep reviewing this.  Sorry for the delay.

Thanks for the review.

> Going forward there are probably a bunch of ways to proceed that will still
> save us memory storage.  Off the top of my head: we could generate two
> separate arrays, and pick which is indexed based on some simple hash (first
> character of domain name, etc).

I think the easiest way is to use bitfields:

struct ETLDEntry {
  uint32_t strtab_index : 30;
  uint32_t exception : 1;
  uint32_t wild : 1;
};

30 bits should be enough for anybody!
Comment 20 Nathan Froyd [:froydnj] 2012-09-11 04:49:17 PDT
Created attachment 660035 [details] [diff] [review]
patch v5, v4 + bitfields

Here's an updated patch that compiles on my Linux box (I think previous versions were failing to define the `strings' structure, leading to link errors; not sure how I missed that...).
Comment 21 Gervase Markham [:gerv] 2012-09-12 08:41:43 PDT
To answer jduell's question: the change is mostly due to an update from the .jp registry, which instituted a large set of fairly specific regional 3rd level domains, of which there are over 1600 :-| This sort of thing is a fairly unusual occurrence.

We don't have a specific plan to limit the size of the list because it's merely a reflection of reality. The only way to reduce its size, in its raw form, would be to make it less accurate. Of course, one could do cunning things to reduce its size in processed/compiled form, as we are doing here...

Gerv
Comment 22 Jason Duell [:jduell] (needinfo me) 2012-09-12 11:03:55 PDT
Comment on attachment 660035 [details] [diff] [review]
patch v5, v4 + bitfields

Review of attachment 660035 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good!  Please run through try before landing.

Thanks for taking this on and being patient with reviews :)

::: netwerk/dns/prepare_tlds.py
@@ +109,2 @@
>  
> +  for (i, etld) in enumerate(getEffectiveTLDs(sys.argv[1])):

So we don't use "i": any reason to use enumerate then?
Comment 23 Nathan Froyd [:froydnj] 2012-09-13 04:29:11 PDT
(In reply to Jason Duell (:jduell) from comment #22)
> Thanks for taking this on and being patient with reviews :)

Thanks for the review!

> ::: netwerk/dns/prepare_tlds.py
> @@ +109,2 @@
> >  
> > +  for (i, etld) in enumerate(getEffectiveTLDs(sys.argv[1])):
> 
> So we don't use "i": any reason to use enumerate then?

Hm, no.  Fixed in the push:

https://hg.mozilla.org/integration/mozilla-inbound/rev/e7b4f8be9a4d

The try run was a little odd:

https://tbpl.mozilla.org/?tree=Try&rev=64db83b424a8

There was one (heretofore unknown?) intermittent orange (browser/base/content/test/browser_bug435235.js), but I'm fairly certain that it was unrelated to the specific changes made here.

If the sheriffs and/or the networking folks feel differently, I'm happy to back this out.
Comment 24 Ed Morley [:emorley] 2012-09-13 13:06:39 PDT
https://hg.mozilla.org/mozilla-central/rev/e7b4f8be9a4d

Note You need to log in before you can comment on or make changes to this bug.