Closed Bug 1380154 Opened 8 years ago Closed 7 years ago

Improve Effective TLD service lookups

Categories

(Core :: Networking: DNS, enhancement)

enhancement
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: erahm, Assigned: erahm)

References

Details

(Whiteboard: [necko-would-take])

Attachments

(4 files, 5 obsolete files)

Currently eTLD service lookups are rather slow (see bug 1376038) and the data structure used to hold the lookup table is rather large. I have a PoC that uses Chromium's [1] DAFSA [2] technique (a specialized trie [3]) which shows a large improvement in memory usage and potential smaller perf wins. I also looked into gperf which provided ~30% perf improvement but was too large of a size tradeoff. [1] https://chromium.googlesource.com/chromium/src/net/+/6ba04a90565e236b5a420c3a5039718183ad35bd/tools/dafsa/make_dafsa.py [2] https://en.wikipedia.org/wiki/Deterministic_acyclic_finite_state_automaton [3] https://en.wikipedia.org/wiki/Trie
And a bit of history on the Chromium patch [1], I'm tempting to take the code wholesale and not think too hard about the implementation. [1] https://codereview.chromium.org/197183002/
Whiteboard: [necko-would-take]
Assignee: nobody → erahm
Status: NEW → ASSIGNED
WIP patch, fails some tests. Need to look into why dafsa lookups are failing.
MozReview-Commit-ID: 3zMBzUM9QUg
Attachment #8887254 - Attachment is obsolete: true
Nick can you take a look at this? With Nathan out you seem like the best option. TL;DR: this patch just imports the Chromium DAFSA code, adds a thin wrapper around it and some tests to make sure everything works. I'd prefer not to make any changes to the original logic in this patch, but am happy to make changes in a follow up patch. The new interface, commments, etc are fair game of course. ---------------------------------------------- This imports Chromium's `make_dafsa.py` script [1]. It takes in a gperf formatted file (note: gperf is *not* required) and converts that to a compact binary representation of the string data in the form of a deterministic acyclic finite state automaton (DAFSA) [2]. The only change made to the script was to make it handle the arguments our file generation script passes in to the `main` function. It also imports the logic for traversing the DAFSA [3] almost verbatim in `Dafsa.cpp`. A thin wrapper was added so that we can reuse the DAFSA structure for multiple tables. The only change made to the original logic was to swap in mozilla style assertions and rename the not found constant from `kNotFound` to `Dafsa::kKeyNotFound` in order to avoid a collision with `kNotFound` defined in our nsString code. [1] https://chromium.googlesource.com/chromium/src/net/+/6ba04a90565e236b5a420c3a5039718183ad35bd/tools/dafsa/make_dafsa.py [2] https://en.wikipedia.org/wiki/Deterministic_acyclic_finite_state_automaton [3] https://chromium.googlesource.com/chromium/chromium/+/a2a90a35aa5b7178e219668bde5889595c710451/net/base/registry_controlled_domains/registry_controlled_domain.cc#72 MozReview-Commit-ID: Eion9POHZm5
Attachment #8888533 - Flags: review?(n.nethercote)
Attachment #8887253 - Attachment is obsolete: true
Jason, do you mind reviewing this? You've reviewed past refactorings of this code. This replaces our giant sorted array of eTLD names with a more compact DAFSA.
Attachment #8888545 - Flags: review?(jduell.mcbugs)
Attachment #8887674 - Attachment is obsolete: true
I should note I don't know if I've done the right thing regarding copyright notices and whatnot. For `make_dafsa.py` I just kept Google's bsd-style as-is. For the lifted code I added the Google copyright in addition to ours.
Comment on attachment 8888545 [details] [diff] [review] Part 2: Generate a DAFSA and use it for eTLDs Review of attachment 8888545 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/dns/prepare_tlds.py @@ +104,4 @@ > eTLD file is then printed to output. > """ > > + # Find and load the `make_dafsa.py` script under xpcom/ds. I'm not necessarily super happy about this, but the logic is copied from `file_generate` [1] if that makes you feel any better. [1] http://searchfox.org/mozilla-central/rev/3a3af33f513071ea829debdfbc628caebcdf6996/python/mozbuild/mozbuild/action/file_generate.py#50-53
Comment on attachment 8888533 [details] [diff] [review] Part 1: Add the Chromium DAFSA generator Review of attachment 8888533 [details] [diff] [review]: ----------------------------------------------------------------- ::: xpcom/ds/Dafsa.cpp @@ +149,5 @@ > +} > + > +namespace mozilla { > + > +int Dafsa::Lookup(const nsACString& aKey) const This is the only piece I added, the rest is imported from Chromium. ::: xpcom/ds/make_dafsa.py @@ +4,5 @@ > +# found in the LICENSE file. > + > +""" > +A Deterministic acyclic finite state automaton (DAFSA) is a compact > +representation of an unordered word list (dictionary). This is all imported code, so probably not worth reviewing but the comment is pretty helpful in laying out what this thing does.
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #8) > I should note I don't know if I've done the right thing regarding copyright > notices and whatnot. For `make_dafsa.py` I just kept Google's bsd-style > as-is. For the lifted code I added the Google copyright in addition to ours. Gerv, I'm pulling in a few pieces of code from Chromium. I'm not sure what we should do from a license perspective. There are thorough details with sources in comment 6 laying out what I've done. Can you provide some guidance?
Flags: needinfo?(gerv)
Comment on attachment 8888533 [details] [diff] [review] Part 1: Add the Chromium DAFSA generator Seems reasonable. I'd like some measurements for speed, binary size and memory usage before it lands. I want to make sure there are clear benefits that outweight the increased complexity.
Attachment #8888533 - Flags: review?(n.nethercote) → review+
Unless there's an unusual license, we are pretty much set up to take code from Chromium. Find the Chromium license in about:license, find the list of directories covered, and add yours. Gerv
Flags: needinfo?(gerv)
It looks like part 2 reduces our binary size by 75K and scores as well on the 'njn test' from bug 1247835. For further testing I added a large stress test created by browsing several sites from alexa top 500 (US, Chinese, and Russian) and logging every call to |GetBaseDomainInternal|. I then converted that list to a gtest calling |GetBaseDomainFromHost| thus bypassing JS and creating nsIURI instances. On this test I saw ~30% slowdown (+/- 8%). So by previous standards we're doing well, but by a new test we do worse on perf which is unfortunate. I'm going to try some talos tests to see if anything turns up and I'll clean up the stress test and attach it to this bug for a sanity check.
Attached patch Perf testSplinter Review
MozReview-Commit-ID: 8BVD8tutO5O
Comment on attachment 8889691 [details] [diff] [review] Perf test Review of attachment 8889691 [details] [diff] [review]: ----------------------------------------------------------------- Nick, this is the stress test. Let me know what you think of it as far as being worth measuring against. Something interesting to think about given the repetitiveness of the URLs is the possibility of adding an MRU cache to quickly lookup repeated values.
Attachment #8889691 - Flags: feedback?(n.nethercote)
Comment on attachment 8889691 [details] [diff] [review] Perf test Review of attachment 8889691 [details] [diff] [review]: ----------------------------------------------------------------- It certainly is repetitive! You said you browsed several sites from alexa top 500. Is this the actual requests from that browsing? Just for kicks I ran the list through a histogram script: > ( 1) 22783 (46.0%, 46.0%): NS_LITERAL_CSTRING("lenta.ru"), > ( 2) 5387 (10.9%, 56.9%): NS_LITERAL_CSTRING("www.youtube.com"), > ( 3) 2683 ( 5.4%, 62.3%): NS_LITERAL_CSTRING("techcrunch.com"), > ( 4) 2550 ( 5.2%, 67.5%): NS_LITERAL_CSTRING("lax1-ib.adnxs.com"), > ( 5) 2190 ( 4.4%, 71.9%): NS_LITERAL_CSTRING("twitter.com"), > ( 6) 1665 ( 3.4%, 75.3%): NS_LITERAL_CSTRING("pubads.g.doubleclick.net"), > ( 7) 1470 ( 3.0%, 78.2%): NS_LITERAL_CSTRING("www.jd.com"), > ( 8) 747 ( 1.5%, 79.8%): NS_LITERAL_CSTRING("pbs.twimg.com"), > ( 9) 594 ( 1.2%, 81.0%): NS_LITERAL_CSTRING("warriors.jd.com"), > ( 10) 462 ( 0.9%, 81.9%): NS_LITERAL_CSTRING("ok.ru"), > ( 11) 364 ( 0.7%, 82.6%): NS_LITERAL_CSTRING("apx.moatads.com"), > ( 12) 343 ( 0.7%, 83.3%): NS_LITERAL_CSTRING("profile.ssp.rambler.ru"), > ( 13) 281 ( 0.6%, 83.9%): NS_LITERAL_CSTRING("abs.twimg.com"), > ( 14) 261 ( 0.5%, 84.4%): NS_LITERAL_CSTRING("s0.2mdn.net"), > ( 15) 245 ( 0.5%, 84.9%): NS_LITERAL_CSTRING("pixel.rubiconproject.com"), > ( 16) 191 ( 0.4%, 85.3%): NS_LITERAL_CSTRING("icdn.lenta.ru"), > ( 17) 188 ( 0.4%, 85.7%): NS_LITERAL_CSTRING("i.ytimg.com"), > ( 18) 174 ( 0.4%, 86.0%): NS_LITERAL_CSTRING("youtube.com"), > ( 19) 172 ( 0.3%, 86.4%): NS_LITERAL_CSTRING("ssp.rambler.ru"), > ( 20) 149 ( 0.3%, 86.7%): NS_LITERAL_CSTRING("search.yahoo.com"), > ( 21) 144 ( 0.3%, 87.0%): NS_LITERAL_CSTRING("img11.360buyimg.com"), > ( 22) 135 ( 0.3%, 87.2%): NS_LITERAL_CSTRING("www.google.com"), > ( 23) 130 ( 0.3%, 87.5%): NS_LITERAL_CSTRING("img14.360buyimg.com"), > ( 24) 128 ( 0.3%, 87.8%): NS_LITERAL_CSTRING("tap-secure.rubiconproject.com"), > ( 25) 128 ( 0.3%, 88.0%): NS_LITERAL_CSTRING("img12.360buyimg.com"), > ( 26) 128 ( 0.3%, 88.3%): NS_LITERAL_CSTRING("img13.360buyimg.com"), > ( 27) 120 ( 0.2%, 88.5%): NS_LITERAL_CSTRING("img10.360buyimg.com"), > ( 28) 108 ( 0.2%, 88.7%): NS_LITERAL_CSTRING("i.mycdn.me"), > ( 29) 106 ( 0.2%, 89.0%): NS_LITERAL_CSTRING("ib.adnxs.com"), > ( 30) 94 ( 0.2%, 89.1%): NS_LITERAL_CSTRING("secure.adnxs.com"), I dunno, seems reasonable to me, I guess, given that it's based on real usage.
Attachment #8889691 - Flags: feedback?(n.nethercote) → feedback+
(In reply to Nicholas Nethercote [:njn] from comment #17) > Comment on attachment 8889691 [details] [diff] [review] > Perf test > > Review of attachment 8889691 [details] [diff] [review]: > ----------------------------------------------------------------- > > It certainly is repetitive! You said you browsed several sites from alexa > top 500. Is this the actual requests from that browsing? Yeah it's essentially the output of adding the following to |GetBaseDomainInternal|: > printf_stderr(" NS_LITERAL_CSTRING(\"%s\"),\n", aHostname.get());
Comment on attachment 8888545 [details] [diff] [review] Part 2: Generate a DAFSA and use it for eTLDs Review of attachment 8888545 [details] [diff] [review]: ----------------------------------------------------------------- Nice to save the 75K, thanks! I wouldn't worry too much about the low-level performance here--this code doesn't get called that frequently and I'd be surprised if it shows on Talos, etc.
Attachment #8888545 - Flags: review?(jduell.mcbugs) → review+
This adds a most recently used (MRU) cache for the most common base domain requests (aAddtionalParts == 1). With a table size of 31 I saw 8777 hits and 22 misses when loading twitter, youtube, and techcrunch. In stress testing this provided a 75% reduction in run time. MozReview-Commit-ID: 3JgCwIZagMs
Attachment #8890132 - Flags: review?(jduell.mcbugs)
Comment on attachment 8890132 [details] [diff] [review] Part 3: Cache most recently used eTLD entries Review of attachment 8890132 [details] [diff] [review]: ----------------------------------------------------------------- ::: netwerk/dns/nsEffectiveTLDService.cpp @@ +341,5 @@ > +{ > + MOZ_ASSERT(NS_IsMainThread()); > + > + const uint32_t hash = HashString(aHost.BeginReading(), aHost.Length()); > + *aEntry = &mMruTable[hash % kTableSize]; Integer division is slow. Can you make kTableSize 32 and then use masking?
(In reply to Nicholas Nethercote [:njn] from comment #21) > Comment on attachment 8890132 [details] [diff] [review] > Part 3: Cache most recently used eTLD entries > > Review of attachment 8890132 [details] [diff] [review]: > ----------------------------------------------------------------- > > ::: netwerk/dns/nsEffectiveTLDService.cpp > @@ +341,5 @@ > > +{ > > + MOZ_ASSERT(NS_IsMainThread()); > > + > > + const uint32_t hash = HashString(aHost.BeginReading(), aHost.Length()); > > + *aEntry = &mMruTable[hash % kTableSize]; > > Integer division is slow. Can you make kTableSize 32 and then use masking? 31 is used because it's prime and this is a small fixed-size hash table. See bug 1383125 for an example where this was requested and proved to perform poorly.
> 31 is used because it's prime and this is a small fixed-size hash table. See > bug 1383125 for an example where this was requested and proved to perform > poorly. Huh, weird. That suggests the hash function is no good. Anyway, a comment about this would be helpful :)
(In reply to Nicholas Nethercote [:njn] from comment #23) > > 31 is used because it's prime and this is a small fixed-size hash table. See > > bug 1383125 for an example where this was requested and proved to perform > > poorly. > > Huh, weird. That suggests the hash function is no good. > > Anyway, a comment about this would be helpful :) Yes of course, it's not obvious unless you just happen to be discussing it in another bug :)
It'd be really nice to have a top-level comment explaining the two-level data structure: - Base structe is a DAFSA, which is moderately fast, and very compact because it doesn't store every string individually. - In front is a small direct-mapped cache which has a high hit rate because lookups are very repetitive in practice.
Updated per njn's comments. I'm going to redirect the review as jduell already indicated the DAFSA switch was okay in part 2. MozReview-Commit-ID: 3JgCwIZagMs
Attachment #8891142 - Flags: review?(n.nethercote)
Attachment #8890132 - Attachment is obsolete: true
Attachment #8890132 - Flags: review?(jduell.mcbugs)
Comment on attachment 8891142 [details] [diff] [review] Part 3: Cache most recently used eTLD entries Review of attachment 8891142 [details] [diff] [review]: ----------------------------------------------------------------- A bunch of nits, but overall looks good. ::: netwerk/dns/nsEffectiveTLDService.cpp @@ +212,5 @@ > if (result == PR_SUCCESS) > return NS_ERROR_HOST_IS_IP_ADDRESS; > > + // Lookup in the hash if this is a normal query. > + TLDHashEntry* index = nullptr; s/index/entry/? ::: netwerk/dns/nsEffectiveTLDService.h @@ +42,3 @@ > mozilla::Dafsa mGraph; > + > + struct TLDHashEntry Would TLDCacheEntry be a better name? This is a direct-mapped cache, not a hash table, even if a hash function is used to map strings to entries. (The same comment applies to various other uses of the word "hash" in this patch.) @@ +43,5 @@ > + > + struct TLDHashEntry > + { > + nsCString host; > + nsCString baseDomain; mHost and mBaseDomain, please. @@ +53,5 @@ > + // We first check the cache for a matching result and avoid a DAFSA lookup > + // if a match is found. Otherwise we lookup the domain in the DAFSA and then > + // cache the result. During standard browsing the same domains are repeatedly > + // fed into |GetBaseDomainInternal| so this ends up being an effective > + // mitigation. A tiny bit of data would be nice here. E.g. "testing showed a 99%+ hit rate". @@ +58,5 @@ > + // > + // A size of 31 is used rather than a more logical power-of-two such as 32 > + // since it is a prime number and provides fewer collisions when when used > + // with our hash algorithms. > + enum { kTableSize = 31 }; Is there a reason for the enum instead of |static const size_t kTableSize = 31;|? @@ +69,5 @@ > + * @param aHost The host to lookup. > + * @param aEntry Out param, the entry in the MRU table to use. > + * @return True if a match was found, false if there was a miss. > + */ > + inline bool LookupForAdd(const nsACString& aHost, TLDHashEntry** aEntry); This would be simpler: > inline TLDHashEntry* LookupForAdd(const nsACString&); where the return value is null on failure.
Attachment #8891142 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #27) > Comment on attachment 8891142 [details] [diff] [review] > Part 3: Cache most recently used eTLD entries > > Review of attachment 8891142 [details] [diff] [review]: > ----------------------------------------------------------------- > > A bunch of nits, but overall looks good. > > ::: netwerk/dns/nsEffectiveTLDService.cpp > @@ +212,5 @@ > > if (result == PR_SUCCESS) > > return NS_ERROR_HOST_IS_IP_ADDRESS; > > > > + // Lookup in the hash if this is a normal query. > > + TLDHashEntry* index = nullptr; > > s/index/entry/? Sure. > ::: netwerk/dns/nsEffectiveTLDService.h > @@ +42,3 @@ > > mozilla::Dafsa mGraph; > > + > > + struct TLDHashEntry > > Would TLDCacheEntry be a better name? This is a direct-mapped cache, not a > hash table, even if a hash function is used to map strings to entries. (The > same comment applies to various other uses of the word "hash" in this patch.) Sure. > @@ +43,5 @@ > > + > > + struct TLDHashEntry > > + { > > + nsCString host; > > + nsCString baseDomain; > > mHost and mBaseDomain, please. Will update. > @@ +53,5 @@ > > + // We first check the cache for a matching result and avoid a DAFSA lookup > > + // if a match is found. Otherwise we lookup the domain in the DAFSA and then > > + // cache the result. During standard browsing the same domains are repeatedly > > + // fed into |GetBaseDomainInternal| so this ends up being an effective > > + // mitigation. > > A tiny bit of data would be nice here. E.g. "testing showed a 99%+ hit rate". Eh, it's in the commit message but I can add it to the comment too. > @@ +58,5 @@ > > + // > > + // A size of 31 is used rather than a more logical power-of-two such as 32 > > + // since it is a prime number and provides fewer collisions when when used > > + // with our hash algorithms. > > + enum { kTableSize = 31 }; > > Is there a reason for the enum instead of |static const size_t kTableSize = > 31;|? No, I'll switch to that. Or I guess uint32_t to match the hash we're modding. > @@ +69,5 @@ > > + * @param aHost The host to lookup. > > + * @param aEntry Out param, the entry in the MRU table to use. > > + * @return True if a match was found, false if there was a miss. > > + */ > > + inline bool LookupForAdd(const nsACString& aHost, TLDHashEntry** aEntry); > > This would be simpler: > > > inline TLDHashEntry* LookupForAdd(const nsACString&); > > where the return value is null on failure. We want the entry even on failure so we can update it with the new cached value.
Updated per Nick's review. Jason can you give this a once over as a module peer?
Attachment #8891148 - Flags: review?(jduell.mcbugs)
Attachment #8891142 - Attachment is obsolete: true
(In reply to Eric Rahm [:erahm] (please no mozreview requests) from comment #1) > And a bit of history on the Chromium patch [1], I'm tempting to take the > code wholesale and not think too hard about the implementation. Yeah, I'm pretty sure they thought pretty hard about this. This is, AIUI, the third data structure they've tried for this data. Gerv
jduell: 5 day review ping!
Comment on attachment 8891148 [details] [diff] [review] Part 3: Cache most recently used eTLD entries Review of attachment 8891148 [details] [diff] [review]: ----------------------------------------------------------------- Sorry for review latency. Looks good!
Attachment #8891148 - Flags: review?(jduell.mcbugs) → review+
Backed out for failing OS X and Linux static at xpcom/ds/Dafsa.h:35 with 'constexpr constructor never produces a constant expression': https://hg.mozilla.org/integration/mozilla-inbound/rev/d1ba080ed5b0a365444c00cc038f4e8732f1f33e https://hg.mozilla.org/integration/mozilla-inbound/rev/89d10f8a22339aeccdb72398d406e2d65f889e06 https://hg.mozilla.org/integration/mozilla-inbound/rev/5e1312eafffb38f0d3eb3f27c8a7f5b8157a054d Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=27c23e731b015c389a699bfc4bc2b3d74e6759c0&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel Build log: https://treeherder.mozilla.org/logviewer.html#?job_id=120622809&repo=mozilla-inbound [task 2017-08-03T06:22:31.408024Z] 06:22:31 INFO - In file included from /home/worker/workspace/build/src/obj-firefox/xpcom/ds/Unified_cpp_xpcom_ds0.cpp:2: [task 2017-08-03T06:22:31.408088Z] 06:22:31 INFO - In file included from /home/worker/workspace/build/src/xpcom/ds/Dafsa.cpp:11: [task 2017-08-03T06:22:31.408301Z] 06:22:31 INFO - /home/worker/workspace/build/src/xpcom/ds/Dafsa.h:35:22: error: constexpr constructor never produces a constant expression [-Winvalid-constexpr] [task 2017-08-03T06:22:31.408361Z] 06:22:31 INFO - explicit constexpr Dafsa(const Graph& aData) : mData(aData) {} [task 2017-08-03T06:22:31.408825Z] 06:22:31 INFO - ^ [task 2017-08-03T06:22:31.409682Z] 06:22:31 INFO - /home/worker/workspace/build/src/xpcom/ds/Dafsa.h:35:50: note: non-literal type 'const Graph' (aka 'const Span<const unsigned char>') cannot be used in a constant expression [task 2017-08-03T06:22:31.409815Z] 06:22:31 INFO - explicit constexpr Dafsa(const Graph& aData) : mData(aData) {} [task 2017-08-03T06:22:31.409868Z] 06:22:31 INFO - ^ [task 2017-08-03T06:22:31.410189Z] 06:22:31 INFO - 1 error generated.
Flags: needinfo?(erahm)
Backed out for debug build bustage at xpcom/tests/gtest/TestExpirationTracker.cpp:83: https://hg.mozilla.org/integration/mozilla-inbound/rev/8a73a88c8a3dd35883ffd2dc3192c9e431436348 https://hg.mozilla.org/integration/mozilla-inbound/rev/830795a28f312fb69215e128611319ccd36c7330 https://hg.mozilla.org/integration/mozilla-inbound/rev/5295bf294fb3eab7d50f6f127d7c2469ba3a500c Push with bustage: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=57d1339ad1c0096ce889ce3088c54aa08ffa1412&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel Build log: https://treeherder.mozilla.org/logviewer.html#?job_id=121466099&repo=mozilla-inbound [task 2017-08-07T18:42:47.386949Z] 18:42:47 INFO - In file included from /home/worker/workspace/build/src/obj-firefox/xpcom/tests/gtest/Unified_cpp_xpcom_tests_gtest1.cpp:2:0: [task 2017-08-07T18:42:47.387050Z] 18:42:47 INFO - /home/worker/workspace/build/src/xpcom/tests/gtest/TestExpirationTracker.cpp: In member function 'void TestExpirationTracker::Tracker<K>::DoRandomOperation()': [task 2017-08-07T18:42:47.387256Z] 18:42:47 INFO - /home/worker/workspace/build/src/xpcom/tests/gtest/TestExpirationTracker.cpp:83:7: error: 'UniquePtr' was not declared in this scope etc.
Flags: needinfo?(erahm)
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Depends on: 1403115
No longer depends on: 1403115
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: