Last Comment Bug 622232 - DNS prefetches continue after a tab is closed
: DNS prefetches continue after a tab is closed
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: Networking (show other bugs)
: Trunk
: All All
: -- normal (vote)
: mozilla12
Assigned To: Steve Workman [:sworkman] (please use needinfo)
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2010-12-30 20:31 PST by Justin Lebar (not reading bugmail)
Modified: 2012-03-06 05:42 PST (History)
14 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Testcase (1.79 KB, text/html)
2010-12-30 20:32 PST, Justin Lebar (not reading bugmail)
no flags Details
Bigger testcase (240 bytes, text/html)
2011-07-14 12:13 PDT, Justin Lebar (not reading bugmail)
no flags Details
Proposed Diff (24.29 KB, patch)
2011-10-03 17:21 PDT, Steve Workman [:sworkman] (please use needinfo)
mcmanus: review+
Details | Diff | Review
Proposed Diff v2.0 (23.81 KB, patch)
2011-10-05 17:06 PDT, Steve Workman [:sworkman] (please use needinfo)
sworkman: review+
bzbarsky: superreview-
Details | Diff | Review
Proposed Diff v2.1 (23.88 KB, patch)
2011-10-12 14:24 PDT, Steve Workman [:sworkman] (please use needinfo)
sworkman: review+
Details | Diff | Review
Proposed Diff v2.2 (24.14 KB, patch)
2011-10-14 16:13 PDT, Steve Workman [:sworkman] (please use needinfo)
sworkman: review+
bzbarsky: superreview+
Details | Diff | Review
Fixed patch to deal with changes in base code (17.98 KB, patch)
2011-10-24 10:54 PDT, Steve Workman [:sworkman] (please use needinfo)
sworkman: review+
sworkman: superreview+
Details | Diff | Review
Update for checkin #2, diff v2.4; fix r and sr usernames (17.95 KB, patch)
2011-10-24 11:43 PDT, Steve Workman [:sworkman] (please use needinfo)
sworkman: review+
sworkman: superreview+
Details | Diff | Review
v2.5; fixes mochitest-1 breakage (18.34 KB, patch)
2011-10-25 15:19 PDT, Steve Workman [:sworkman] (please use needinfo)
sworkman: review+
sworkman: superreview+
Details | Diff | Review
v3.0 Should avoid perf regression on dromaeo dom-modify (26.73 KB, patch)
2011-12-30 19:04 PST, Steve Workman [:sworkman] (please use needinfo)
mcmanus: review+
Details | Diff | Review
v3.1 Should avoid perf regression on dromaeo dom-modify; updated after r+ from :mcmanus (19.69 KB, patch)
2012-01-04 12:17 PST, Steve Workman [:sworkman] (please use needinfo)
sworkman: review+
bzbarsky: superreview+
Details | Diff | Review
v3.2 Should avoid perf regression on dromaeo dom-modify; updated after sr+ from :bz (21.20 KB, patch)
2012-01-17 15:00 PST, Steve Workman [:sworkman] (please use needinfo)
sworkman: review+
sworkman: superreview+
Details | Diff | Review

Description Justin Lebar (not reading bugmail) 2010-12-30 20:31:02 PST
STR:
* Load wireshark / tcpdump, start capturing.
* Load attached page in a new tab.
* Notice that Firefox begins prefetching the domains linked to on the page (the quickbrownfox1.com, thequickbrownfox2.com, etc.)
* Close the tab, but don't close Firefox.

Actual results:

Firefox will continue to send DNS requests for domains on the page, even after it's closed.

Expected results:

I'd expect Firefox to stop prefetching domains for a tab after it's closed.  Maybe it doesn't matter, but it seems silly to keep prefetching, since pages may have many links.

Mozilla/5.0 (X11; Linux x86_64; rv:2.0b9pre) Gecko/20101230 Firefox/4.0b9pre
Comment 1 Justin Lebar (not reading bugmail) 2010-12-30 20:32:48 PST
Created attachment 500470 [details]
Testcase
Comment 2 Jason Duell [:jduell] (needinfo? me) 2011-06-15 14:38:47 PDT
Now that we keep mDNSPrefetch in nsHttpChannel, we could add a cancel() method to nsIDNSListener, and call if it the channel is cancelled?

Note that this would only cancel DNS prefetches caused by opening channels, not  those which the HTML parser issues while parsing the page (which I'd guess would stop after closing a tab, but I'm not sure how long that train takes to stop).
Comment 3 Justin Lebar (not reading bugmail) 2011-07-13 13:54:19 PDT
I wonder if this could cause a leak similar to bug 671053.
Comment 4 Kyle Huey [:khuey] (khuey@mozilla.com) 2011-07-13 13:56:31 PDT
Excellent question!
Comment 5 Jason Duell [:jduell] (needinfo? me) 2011-07-13 17:11:08 PDT
> I wonder if this could cause a leak similar to bug 671053.

Not likely--the code in nsDNSPrefetch.h doesn't hold any references to listeners or other outside objects.  AFAIK not stopping DNS prefetches only wastes a little network bandwidth, uses up some DNS cache entries, and more seriously, potentially delays other, more useful DNS queries (i.e. navigation initiated after closing the tab may wind up waiting for the tab's lingering prefetches to resolve: though we do reserve some slots for non-prefetch DNS queries). I'd say that's the main impetus for fixing this at some point (I'd give it low-to-medium priority in the DNS bug heap).

Justin: for how long did you see the host requests getting issued after closing the tab (and how many requests?  All of them on your test page?)

> Note that this would only cancel DNS prefetches caused by opening
> channels, not  those which the HTML parser issues while parsing the page 

We could also potentially have the HTML parser keep a reference to prefetches, and cancel them if they're no longer going to be needed.
Comment 6 Justin Lebar (not reading bugmail) 2011-07-14 12:04:54 PDT
> Justin: for how long did you see the host requests getting issued after closing 
> the tab (and how many requests?  All of them on your test page?)

I just re-tested, and the DNS queries now don't appear to be rate-limited. As I recall when I filed, there was a second or so between queries, but now we appear to send one out as soon as we get one back.

I think that makes this bug -- if it even still exists -- a much lower priority.  It would presumably only affect people with very slow DNS servers, who are going to have a crappy experience anyway.
Comment 7 Justin Lebar (not reading bugmail) 2011-07-14 12:13:02 PDT
Created attachment 545970 [details]
Bigger testcase

Here's a testcase with a lot more domains.

It seems to stop fetching new domains about 10s after closing the tab, but it keeps retrying ones it'd tried before for longer.  So this really isn't so bad.
Comment 8 Jason Duell [:jduell] (needinfo? me) 2011-07-14 12:20:44 PDT
So we issue as many as 8 DNS queries, 3 of which can be "non-high" priority:

 http://mxr.mozilla.org/mozilla-central/source/netwerk/dns/nsHostResolver.h#72

Prefetches of links on pages are not high: prefetches from channels that were asyncOpened but canceled before actually connecting to the server are probably high.  

I'm not too worried about slowing down future link prefetches.  It'd be nice to cancel the AsyncOpen ones, since they might actually slow down really needed (not speculative) DNS queries if there's more the 5 outstanding at once, but I'm guessing this doesn't happen often, and the infrastructure needed doesn't make this rise to the top of our TODO list anytime soon AFAICT.

10s seems like a long time, but it sounds like this is just because it's retrying queries (all the links of your pages have bogus DNS hosts, so that's an unusual case).  I'd be curious to know what sort of delay (if any) you see in resolving links if you navigate to another page like your test page (with different hosts in the links) right after you close the first one.  If those are delayed 10s, then maybe it's worth fixing something here (though again, all bogus hosts is a rarified case).
Comment 9 Steve Workman [:sworkman] (please use needinfo) 2011-10-03 17:21:19 PDT
Created attachment 564402 [details] [diff] [review]
Proposed Diff

Removes DNS prefetches for nsHTMLAnchorElement objects, which should resolve the issue at hand.

Notes: 

-1- Using testcase supplied (attachment 545970 [details] in comment #7) some DNS prefetches are still sent after tab is closed, but much fewer than the max allowed (150) AND no retries seem to be sent.  Suggest the queries that do get through are due to the size of the testcase.

-2- Did not change basic Deferral FIFO implementation in nsHTMLDNSPrefetch.cpp, i.e. it's still using a basic array, which means that removing elements leaves holes in the array.  I suggest that changing the implementation is not worth it, since SubmitQueue() resets it every call and the holes are only of impact in cases where prefetches are requested before the page is fully loaded and the tab is closed just after the page is loaded - it seems to be a corner case.
Comment 10 Patrick McManus [:mcmanus] 2011-10-04 07:59:21 PDT
Comment on attachment 564402 [details] [diff] [review]
Proposed Diff

Review of attachment 564402 [details] [diff] [review]:
-----------------------------------------------------------------

Steve - impressive work! Thanks!

Please document in a comment that the effect of this patch is to cancel all pending prefetches for the same name, even if they are pending from a different tab than the one that triggered the cancel. I think that's ok - the bookkeeping required to do it right is going to cost you more on the common non-cancel case than you are losing here. Same logic applies to leaving holes in the array - its ok in the cancel path.

A couple nits: the idl change (nsIDNSService.idl) requires you bump the uuid.. and you'll need a sr too (all idl changes need an sr).

nsDNSPrefetch.cpp seems to just have a whitespace change. Best to leave that out of the patch.

you should get a set of tp4/5 runs on try to confirm that you don't have a performance problem before landing. (don't rely on the tbpl colors for tp4/5, read the numbers - the colors don't catch performance regressions.) I say this just because changing something like  nsHTMLAnchorElement.cpp can be very sensitive to performance.

conditional on those things, r+mcmanus
Comment 11 Justin Lebar (not reading bugmail) 2011-10-04 08:03:58 PDT
FYI, if you're new to this kind of thing, our perf-comparison tools aren't very good.  The trick is figuring out whether your test results are within the current range of results.

Compare Talos [1] does only an OK job with this.  I find I have more success pulling up the graph of the test result at [2] and comparing the results by hand.

[1] http://services.forerunnerdesigns.com/compare-talos/
[2] http://graphs-new.mozilla.org/
Comment 12 Steve Workman [:sworkman] (please use needinfo) 2011-10-04 17:04:39 PDT
Thanks Patrick for the review - changes made to the code and I've pushed it to the try server for results.

Thanks Justin for the headsup re performance measurement.

I'll update the diff here once try finishes.
Comment 13 Steve Workman [:sworkman] (please use needinfo) 2011-10-04 19:10:11 PDT
FYI: I'm seeing some memory leaks for nsHostRecord from the try server tests.  So, I'm going to have to do a little bit more analysis on this tomorrow before trying to land it.
Comment 14 Mozilla RelEng Bot 2011-10-05 03:11:04 PDT
Try run for 831fb53fd28e is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=831fb53fd28e
Results (out of 229 total builds):
    exception: 1
    success: 203
    warnings: 22
    failure: 3
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/sworkman@mozilla.com-831fb53fd28e
Comment 15 Jason Duell [:jduell] (needinfo? me) 2011-10-05 11:07:30 PDT
Comment on attachment 564402 [details] [diff] [review]
Proposed Diff

Review of attachment 564402 [details] [diff] [review]:
-----------------------------------------------------------------

::: content/html/content/src/nsHTMLAnchorElement.cpp
@@ +140,5 @@
>  
>  nsHTMLAnchorElement::nsHTMLAnchorElement(already_AddRefed<nsINodeInfo> aNodeInfo)
>    : nsGenericHTMLElement(aNodeInfo),
> +    Link(this),
> +    mDNSPrefetchRequested(PR_FALSE)

nit:  the convention we use mostly for constructor lists is

  : foo(this)
  , bar(that)
  , etc(frob)

putting the comma at the front means future patches don't have to break hg blame for the previous line just to add a ',' where there wasn't one before.

@@ +210,5 @@
>  
>    // Prefetch links
>    if (aDocument && nsHTMLDNSPrefetch::IsAllowed(GetOwnerDoc())) {
>      nsHTMLDNSPrefetch::PrefetchLow(this);
> +    mDNSPrefetchRequested = PR_TRUE;

patch-wide, use 'true' and false for bools.  Use PR_TRUE only for PRBool (which we now only use for functions that are in IDLs)

@@ +226,5 @@
> +  // Cancel any DNS prefetches which were requested - removes unnecessary
> +  // prefetches from the queue (see bug #622232)
> +  if (mDNSPrefetchRequested) {
> +    nsHTMLDNSPrefetch::CancelPrefetchLow(this, NS_ERROR_ABORT);
> +    mDNSPrefetchRequested = PR_FALSE;

use 'false'.

There's a whole internal debate we have about whether to put bug #s in comments (bz and others like to avoid clutter and rely on hg blame; others find it safer to keep bug # in, especially if code is confusing/unintuitive).  I'd nix the mention of the bug # in this case--code is reasonably straightforward.  I bet you can make the comment fit into one line too:

  // Cancel any DNS prefetches.

::: content/html/content/src/nsHTMLDNSPrefetch.cpp
@@ +233,5 @@
>                                     sDNSListener, nsnull, getter_AddRefs(tmpOutstanding));
>  }
>  
>  nsresult
> +nsHTMLDNSPrefetch::CancelPrefetch(nsAString &hostname, PRUint16 flags, nsresult aReason)

nit: I'd generally put overloaded functions (2 versions of CancelPrefetch) right next to each other in the source file.

::: netwerk/dns/nsDNSService2.cpp
@@ +615,5 @@
> +    {
> +        MutexAutoLock lock(mLock);
> +
> +        if (mDisablePrefetch && (aFlags & RESOLVE_SPECULATE))
> +            return NS_ERROR_DNS_LOOKUP_QUEUE_FULL;

Seems like a weird error code to return, but prob not a big deal.
Comment 16 Steve Workman [:sworkman] (please use needinfo) 2011-10-05 17:04:10 PDT
(In reply to Jason Duell (:jduell) from comment #15)

> ::: content/html/content/src/nsHTMLAnchorElement.cpp
> @@ +140,5 @@
> >  
> >  nsHTMLAnchorElement::nsHTMLAnchorElement(already_AddRefed<nsINodeInfo> aNodeInfo)
> >    : nsGenericHTMLElement(aNodeInfo),
> > +    Link(this),
> > +    mDNSPrefetchRequested(PR_FALSE)
> 
> nit:  the convention we use mostly for constructor lists is
> 
>   : foo(this)
>   , bar(that)
>   , etc(frob)
> 
> putting the comma at the front means future patches don't have to break hg
> blame for the previous line just to add a ',' where there wasn't one before.
Understood, however the original code had the comma at the end of the line.  I've changed it so it has it at the start, so changesets from here on will not mess with hg blame.

> 
> ::: content/html/content/src/nsHTMLDNSPrefetch.cpp
> @@ +233,5 @@
> >                                     sDNSListener, nsnull, getter_AddRefs(tmpOutstanding));
> >  }
> >  
> >  nsresult
> > +nsHTMLDNSPrefetch::CancelPrefetch(nsAString &hostname, PRUint16 flags, nsresult aReason)
> 
> nit: I'd generally put overloaded functions (2 versions of CancelPrefetch)
> right next to each other in the source file.
Understood - I was trying to fit in some way with what was already in the file, which had:
Prefetch
PrefetchLow
PrefetchMedium
PrefetchHigh
Prefetch
PrefetchLow
PrefetchMedium
PrefetchHigh

I've changed it so the Cancel functions are grouped together, in the manner you requested, just after the group of Prefetch functions.

> ::: netwerk/dns/nsDNSService2.cpp
> @@ +615,5 @@
> > +    {
> > +        MutexAutoLock lock(mLock);
> > +
> > +        if (mDisablePrefetch && (aFlags & RESOLVE_SPECULATE))
> > +            return NS_ERROR_DNS_LOOKUP_QUEUE_FULL;
> 
> Seems like a weird error code to return, but prob not a big deal.
Agreed.  I'm copying the return code and flag check from already in use in nsDNSService::AsyncResolve().

Amended diff (which also includes the NS_RELEASE that caused the mem leak on try) should be uploaded next.
Comment 17 Steve Workman [:sworkman] (please use needinfo) 2011-10-05 17:06:47 PDT
Created attachment 565081 [details] [diff] [review]
Proposed Diff v2.0

Amended diff based on comments, and fixed mem leak discovered on try run.
Comment 18 Steve Workman [:sworkman] (please use needinfo) 2011-10-05 17:07:52 PDT
Not sure who can do an sr, so I just punted it to Jason for now.  Please change and let me know if need be.
Comment 19 Steve Workman [:sworkman] (please use needinfo) 2011-10-07 10:27:17 PDT
Try run (forgot to ask it to attach):
    https://tbpl.mozilla.org/?tree=Try&usebuildbot=1&rev=ef022aaf9924
Performance results look similar to past 7 days.  So, looks like this part is a go.
Comment 20 Jason Duell [:jduell] (needinfo? me) 2011-10-07 13:40:31 PDT
Comment on attachment 565081 [details] [diff] [review]
Proposed Diff v2.0

Honestly I don't think this really ought to require +sr, but it does add a function to an IDL, so I'm going to run it past Boris  (Steve: I'm not a superreviewer: for necko that's bz/biesi/josh, and if you are in a hurry and no one else is available, jst or any other sr).
Comment 21 Boris Zbarsky [:bz] 2011-10-07 19:57:55 PDT
Comment on attachment 565081 [details] [diff] [review]
Proposed Diff v2.0

The idl is fine, but the content changes could use review from a content peer.  Happy to volunteer.  ;)

Instead of adding a word to nsHTMLAnchorElement, you could use one of the existing free bits it has.  Add a PR_STATIC_ASSERT(ELEMENT_TYPE_SPECIFIC_BITS_OFFSET < 32) in nsHTMLAnchorElement, define a new flag name like so

#define HTML_ANCHOR_DNS_PREFETCH_REQUESTED \
  (1 << ELEMENT_TYPE_SPECIFIC_BITS_OFFSET) 

and then use SetFlags/UnsetFlags with that value to add/remove the flag and HasFlag() to check for that flag.
Comment 22 Mozilla RelEng Bot 2011-10-10 23:30:55 PDT
Try run for 86298f846c34 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=86298f846c34
Results (out of 184 total builds):
    exception: 3
    success: 157
    warnings: 22
    failure: 2
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/sworkman@mozilla.com-86298f846c34
Comment 23 Steve Workman [:sworkman] (please use needinfo) 2011-10-12 14:22:09 PDT
Having issues comparing my patch's Tp results.  Looks like there were several spikes in the most recent tests, and then nothing since Oct 6th or so.  Or at least that's what I see using Justin's links from comment #11.

This patch (uploading next) takes into consideration bz's comments for nsHTMLAnchorElement.  Tp5 results look similar to the spikes.

Thoughts on how to proceed?
Comment 24 Steve Workman [:sworkman] (please use needinfo) 2011-10-12 14:24:09 PDT
Created attachment 566632 [details] [diff] [review]
Proposed Diff v2.1

Amends nsHTMLAnchorElement per bz's request.
Comment 25 Boris Zbarsky [:bz] 2011-10-12 14:38:10 PDT
Comment on attachment 566632 [details] [diff] [review]
Proposed Diff v2.1

>+++ b/content/html/content/src/nsHTMLAnchorElement.cpp
>-  : nsGenericHTMLElement(aNodeInfo),
>-    Link(this)
>+  : nsGenericHTMLElement(aNodeInfo)
>+  , Link(this)

No need for this change.

The documentation of aListener for the new nsIDNSService method doesn't make much sense.  Could you please fix that?

Have you measured the performance impact on inserting/removing a large tree with lots of anchors in it?
Comment 26 Justin Lebar (not reading bugmail) 2011-10-12 14:46:39 PDT
> This patch (uploading next) takes into consideration bz's comments for 
> nsHTMLAnchorElement.  Tp5 results look similar to the spikes.

It looks like TP has been changing.  It's now Tp5MozAfterPaint?  I must not be reading whatever newsgroup this was announced on.

I'd rebase and rerun the benchmarks :|
Comment 27 Steve Workman [:sworkman] (please use needinfo) 2011-10-12 15:12:55 PDT
@bz, See Jason's comment #15 and my reply in #16 :) re the constructor.

Updated the comment for aListener in nsIDNSService ...
     * @param aListener
     *        the original listener which was to be notified about the host lookup
     *        result - used to match request information to requestor.

Sound ok?

Re testing - do you know of a specific unit test script in content that could speed this along?  I'm not familiar with those scripts.

@Justin, thanks for the headsup.  If you didn't get this info from a newsgroup, where did you get it?
Comment 28 Justin Lebar (not reading bugmail) 2011-10-12 15:17:19 PDT
I noticed on graphserver that within the past week, TP5 results on mac were gone, replaced by TP5MozAfterPaint.  The same thing happened on Linux today-ish.

Also TBPL reports TP5_paint, which I assume is the same thing?  But you should probably ping someone in #build to find out what's going on for real.
Comment 29 Boris Zbarsky [:bz] 2011-10-12 15:53:36 PDT
> Sound ok?

Yes.

> do you know of a specific unit test script in content

As I recall, one of the dromaeo tests exercises this stuff pretty heavily.  Look for the blame on patch that added the whole "wait for documents to finish loading" prefetch queue, maybe?
Comment 30 Steve Workman [:sworkman] (please use needinfo) 2011-10-12 15:58:16 PDT
Thanks bz, Justin.  I'll look into these.
Comment 31 Mozilla RelEng Bot 2011-10-14 15:21:03 PDT
Try run for 592d318af935 is complete.
Detailed breakdown of the results available here:
    https://tbpl.mozilla.org/?tree=Try&rev=592d318af935
Results (out of 39 total builds):
    success: 36
    warnings: 1
    failure: 2
Builds available at http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/sworkman@mozilla.com-592d318af935
Comment 32 Steve Workman [:sworkman] (please use needinfo) 2011-10-14 16:10:11 PDT
Finally got around to running the tests again, and comparing the results with the correct background runs.

In response to Comment 25 and 29, I think that the Tp5 tests are the most suitable for this.  Dromaeo seems to be all about JavaScript.  Bz, is there any reason why Tp wouldn't be enough?

Tp5 load times are pretty much the same as the past few runs.

FYI, the TP5 MozAfterPaint times from my try run were compared to the same tests on Try-PGO.  Apparently, normal pushes to the try server are not build-optimised using PGO.  So, comparison of results has to be for Non-PGO, otherwise it's comparing a non-optimised build with an optimised one.

Diff (including bz's comments) added next.
Comment 33 Steve Workman [:sworkman] (please use needinfo) 2011-10-14 16:13:13 PDT
Created attachment 567211 [details] [diff] [review]
Proposed Diff v2.2
Comment 34 Boris Zbarsky [:bz] 2011-10-14 18:45:51 PDT
> Dromaeo seems to be all about JavaScript.

Dromaeo includes DOM manipulation too.  I'm not saying to run the whole test suite; just the relevant subtests.

> Bz, is there any reason why Tp wouldn't be enough?

Yes; it doesn't exercise the worst-case behavior for this code, while dromaeo does.
Comment 35 Steve Workman [:sworkman] (please use needinfo) 2011-10-17 13:25:11 PDT
Ok, no problem.  I also did Dromaeo tests in that try run.  For all the DOM-related ones (DOM, jslib and CSS), results are comparable.

Bz, anything else you need for the sr?
Comment 36 Boris Zbarsky [:bz] 2011-10-17 13:28:07 PDT
Steve, I meant running the specific dromaeo subtest that exercises this, not the overall test suite (where even a large regression in a subtest can look like noise).  See comment 29.
Comment 37 Steve Workman [:sworkman] (please use needinfo) 2011-10-17 18:22:20 PDT
DNS Prefetch deferrals seem to have been added in bug 464838.  That bug was initially filed to solve performance regression for Tp3, but bug 464838, comment 20 also shows that dromaeo modify was affected.  So, that seems to be the test we want to look at here.

To try to compare accurately, I ran dromaeo modify tests (http://dromaeo.com/?dom-modify) from non-debug, optimized builds using the same .mozconfig.  Two builds were used - one with the patch, one without.  Same base code.  I ran the tests 5x each just to be certain.

With patch:
innerHTML: 160.82runs/s ±1.76%
innerHTML: 161.75runs/s ±2.01%
innerHTML: 162.13runs/s ±2.37%
innerHTML: 160.62runs/s ±1.93%
innerHTML: 158.39runs/s ±2.30%

Without patch:
innerHTML: 150.55runs/s ±3.38%
innerHTML: 151.76runs/s ±1.98%
innerHTML: 150.51runs/s ±1.55%
innerHTML: 153.59runs/s ±2.88%
innerHTML: 150.78runs/s ±2.41%

I *think* I'm reading this right, but the patch looks like it performs marginally better, right?  (More runs/s seems good to me).

Please confirm.
Comment 38 Boris Zbarsky [:bz] 2011-10-17 19:15:55 PDT
Yep, those dromaeo results look great.  Thank you for double-checking that!
Comment 39 Boris Zbarsky [:bz] 2011-10-17 19:18:03 PDT
Comment on attachment 567211 [details] [diff] [review]
Proposed Diff v2.2

sr=me
Comment 40 Steve Workman [:sworkman] (please use needinfo) 2011-10-18 11:46:22 PDT
Thanks, bz!
Comment 41 Daniel Holbert [:dholbert] 2011-10-21 13:39:22 PDT
Steve, could you suggest a checkin message for this bug? (briefly summarizing the changes)

Bonus points if you post an updated patch with headers including the checkin messages & you as the author, as described herE:
https://developer.mozilla.org/en/Mercurial_FAQ#How_can_I_generate_a_patch_for_somebody_else_to_check-in_for_me.3F

(Also, the attached patch doesn't apply cleanly to current mozilla-inbound, though it does apply with fuzz.)
Comment 42 Daniel Holbert [:dholbert] 2011-10-21 13:40:24 PDT
> including the checkin messages

er, "message" :)
Comment 43 Dão Gottwald [:dao] 2011-10-22 12:40:10 PDT
The patch fails to apply.
Comment 44 Steve Workman [:sworkman] (please use needinfo) 2011-10-24 10:54:27 PDT
Created attachment 569108 [details] [diff] [review]
Fixed patch to deal with changes in base code

Updated diff to deal with changes in base code.
Added suitable commit message; format is mq patch - should be ready for commit now :)
Comment 45 Steve Workman [:sworkman] (please use needinfo) 2011-10-24 11:43:43 PDT
Created attachment 569123 [details] [diff] [review]
Update for checkin #2, diff v2.4; fix r and sr usernames

Fix usernames for r and sr to irc nicknames, not email addresses.
Comment 46 Justin Lebar (not reading bugmail) 2011-10-24 11:46:22 PDT
Pushed to inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/0144dca654e9
Comment 47 Ed Morley [:emorley] 2011-10-24 15:55:03 PDT
Backed out of inbound for M1 permaorange:

https://hg.mozilla.org/integration/mozilla-inbound/rev/aedc89b90ba5

https://tbpl.mozilla.org/?tree=Mozilla-Inbound&rev=0144dca654e9
https://tbpl.mozilla.org/php/getParsedLog.php?id=7007831&tree=Mozilla-Inbound
{
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-lnx64-dbg/build/netwerk/dns/nsIDNService.cpp, line 314
WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /builds/slave/m-in-lnx64-dbg/build/netwerk/dns/nsIDNService.cpp, line 195
--DOMWINDOW == 418 (0x6d5c538) [serial = 1805] [outer = (nil)] [url = http://mochi.test:8888/tests/content/html/content/test/bug441930_iframe.html]
--DOMWINDOW == 417 (0x6dd9308) [serial = 1812] [outer = (nil)] [url = http://mochi.test:8888/tests/content/html/content/test/test_bug460568.html]
--DOMWINDOW == 416 (0x5890348) [serial = 1811] [outer = (nil)] [url = http://mochi.test:8888/tests/content/html/content/test/test_bug458037.xhtml]
--DOMWINDOW == 415 (0x54b0e68) [serial = 1807] [outer = (nil)] [url = http://mochi.test:8888/tests/content/html/content/test/bug441930_iframe.html]
--DOMWINDOW == 414 (0x7b942f8) [serial = 1806] [outer = (nil)] [url = http://mochi.test:8888/tests/content/html/content/test/bug441930_iframe.html]
62959 ERROR TEST-UNEXPECTED-FAIL | /tests/content/html/content/test/test_bug481335.xhtml | Test timed out.
}
Comment 48 Steve Workman [:sworkman] (please use needinfo) 2011-10-25 15:19:16 PDT
Created attachment 569522 [details] [diff] [review]
v2.5; fixes mochitest-1 breakage

Previous patch broke mochitest-1, specifically test_bug481335.xhtml.  Discussed with bz; he explained that canceling the prefetch in nsHTMLAnchorElement::UnbindFromTree() should happen before RefreshLinkState() not after.  Otherwise it causes a cached value to reappear from invalid information - this is why the test failed.
Fix applied and all tests are passing on try server:
    https://tbpl.mozilla.org/?tree=Try&rev=774faea12be8
Comment 49 Justin Lebar (not reading bugmail) 2011-10-25 15:47:46 PDT
Inbound: https://hg.mozilla.org/integration/mozilla-inbound/rev/08a63bc26c75

(Btw, try to remember to delete the "try:" goop before you upload the patch!)
Comment 50 Steve Workman [:sworkman] (please use needinfo) 2011-10-25 15:51:06 PDT
Thanks for the push to try.  Oops re "try:" goop :)
Comment 51 Ed Morley [:emorley] 2011-10-25 17:58:44 PDT
https://hg.mozilla.org/mozilla-central/rev/08a63bc26c75
Comment 52 Boris Zbarsky [:bz] 2011-10-25 21:07:38 PDT
Tinderbox says:

  Regression  Dromaeo (DOM) decrease 2.32% on MacOSX 10.6.2 (rev3) Mozilla-Inbound

and similar on OSX 10.6 and OSX 10.5.8....
Comment 53 Marco Bonardo [::mak] 2011-10-26 01:38:42 PDT
I was about to say the same as bz, so I'd rather say that I checked the regression range on graphs-new, and looks correctly pointing to this bug.
Comment 54 Boris Zbarsky [:bz] 2011-10-26 04:59:02 PDT
Marco, on graphs-new is the regression mac-only?

Which OS were the comment 37 tests done on?
Comment 55 Marco Bonardo [::mak] 2011-10-26 05:02:14 PDT
(In reply to Boris Zbarsky (:bz) from comment #54)
> Marco, on graphs-new is the regression mac-only?

no, it looks cross-platforms, Windows has a large hit on central for example, but it's not yet reported due to lack of further results.
Comment 56 Marco Bonardo [::mak] 2011-10-26 05:03:26 PDT
where by "large hit" I mean about 4%
Comment 57 Boris Zbarsky [:bz] 2011-10-26 05:04:19 PDT
So I just looked at the actual talos compare from when this patch landed on m-i.  It shows dromaeo_dom regressing on Mac and on Linux64 (Linux32 tests seem to not have completed on that changeset yet????).  And the details links show a large (~10%) regression precisely on modify.html:

http://perf.snarkfest.net/compare-talos/breakdown.html?oldTestIds=8816347&newTestIds=8816955&testName=dromaeo_dom
http://perf.snarkfest.net/compare-talos/breakdown.html?oldTestIds=8816731&newTestIds=8818138&testName=dromaeo_dom

It's not a 64-bit effect, probably, because OS X 10.5.8 shows it too....
Comment 58 Boris Zbarsky [:bz] 2011-10-26 05:04:53 PDT
Sounds to me like we need to back this out until we can mitigate the performance impact....
Comment 59 Marco Bonardo [::mak] 2011-10-26 05:10:32 PDT
(In reply to Boris Zbarsky (:bz) from comment #58)
> Sounds to me like we need to back this out until we can mitigate the
> performance impact....

I can do that, right now.
Comment 60 Marco Bonardo [::mak] 2011-10-26 05:14:29 PDT
backed out from inbound (merge pending, when trees will recover from Ash craziness)
Comment 61 Steve Workman [:sworkman] (please use needinfo) 2011-10-26 10:32:40 PDT
Just seeing all this now.  I'll start looking into it.
Comment 62 Steve Workman [:sworkman] (please use needinfo) 2011-10-26 10:35:27 PDT
Sorry - don't know what happened there - did not intend to hit resolve/fix.
Comment 63 Steve Workman [:sworkman] (please use needinfo) 2011-12-30 18:53:50 PST
Latest patch has the following results on dromaeo.com/?dom-modify

-----TEST------  --PATCH--  --BASE--
createElement:     810.87     738.70
createTextNode:    117.20     115.15
innerHTML:         161.55     157.33
cloneNode:         154.77     157.29
appendChild:     1,363.25   1,388.45
insertBefore:      790.78     780.99

Values are in "runs/s", so the patch seems to be performing well.  From my understanding, the point is to avoid performance regressions with this one, not so much look for performance improvements.

I've also submitted a try job: https://tbpl.mozilla.org/?tree=Try&rev=90076bbcccf0
I'll check those results on Tuesday.

Patch uploaded next.
Comment 64 Steve Workman [:sworkman] (please use needinfo) 2011-12-30 19:04:33 PST
Created attachment 585099 [details] [diff] [review]
v3.0 Should avoid perf regression on dromaeo dom-modify

New version aims to avoid perf regression.

Per bz's suggestion, flags are used in nsHTMLAnchorElement to indicate if the request is in the deferral queue or has already been passed to the DNS Service.  When submitting the deferral queue, nsHTMLDNSPrefetch::SubmitQueue checks the flags and determines whether a prefetch request should be sent or not.

In this way, no search needs to be done to cancel requests in the deferral queue.

In the case of canceling requests already passed to the DNS Service, the lookup is done in the DNS cache hash table, so it should be pretty quick.  Locally run tests seem to suggest that this is ok.

As mentioned in the previous comment, Try results are pending.
Comment 65 Patrick McManus [:mcmanus] 2012-01-03 06:00:21 PST
Comment on attachment 585099 [details] [diff] [review]
v3.0 Should avoid perf regression on dromaeo dom-modify

Review of attachment 585099 [details] [diff] [review]:
-----------------------------------------------------------------

Overall I couldn't find any /necko/ changes between 3.0 and 2.5, but the interdiff tool failed me (as usual) so I had to look for them.. it makes sense from the comment trail in the bug that the adjustments would be in /content/ .. based on that r+ for necko bits. If there was something please highlight.

::: netwerk/dns/nsDNSService2.h
@@ +37,5 @@
>  #include "nsPIDNSService.h"
>  #include "nsIIDNService.h"
>  #include "nsIObserver.h"
>  #include "nsHostResolver.h"
> +#include "nsICancelable.h"

a little unusual to add a include to a header with no other changes. is this needed?
Comment 66 Steve Workman [:sworkman] (please use needinfo) 2012-01-04 12:10:59 PST
Thanks, Patrick.  New diff uploaded next removing that superfluous inclusion - it was most likely left over from earlier changes I had made.

Try results: https://tbpl.mozilla.org/?tree=Try&rev=90076bbcccf0

I compared overall Dromaeo results (in addition to the Dromaeo DOM Modify tests I performed locally).  They look to be similar.

Platform---  Patch-  Inbound-   Try----
                     non-PGO
Linux Opt    176.29   173-178   174-176
 ~ 64 Opt    214.35   212-222   211-221
OS X Opt     264.33   259-264   262-265
 ~ 64 Opt    308.89   302-312   307-310
 ~ 10.7 Opt  310.38   310-314   310-313
Win (7) Opt  204.60   200-210   203-208
 ~ XP Opt    206.63   207-210   206-210

Try isn't the best measure to use, but I figure it's another benchmark to look at.
Comment 67 Steve Workman [:sworkman] (please use needinfo) 2012-01-04 12:17:25 PST
Created attachment 585849 [details] [diff] [review]
v3.1 Should avoid perf regression on dromaeo dom-modify; updated after r+ from :mcmanus

Removes the superfluous inclusion noticed by :mcmanus
Comment 68 Boris Zbarsky [:bz] 2012-01-12 13:07:20 PST
Comment on attachment 585849 [details] [diff] [review]
v3.1 Should avoid perf regression on dromaeo dom-modify; updated after r+ from :mcmanus

>+++ b/content/base/src/Link.h	Wed Jan 04 11:41:54 2012 -0800

Need to change the IID, right?

>+++ b/content/html/content/src/nsHTMLAnchorElement.cpp	Wed Jan 04 11:41:54 2012 -0800
>+// Make sure we have enough space for those bits
>+PR_STATIC_ASSERT(ELEMENT_TYPE_SPECIFIC_BITS_OFFSET < 32);

This is missing a +1 on the LHS of the compare.

>+// Indicates if a DNS Prefetch has been requested from this Anchor elem

s/if/that/

HTML_ANCHOR_DNS_PREFETCH_DEFERRED needs documentation.

> nsHTMLAnchorElement::UnbindFromTree(bool aDeep, bool aNullParent)

Note the hostname _can_ change between when the prefetch started and here.  Does it matter?  (In the common case it won't.)

>+++ b/content/html/content/src/nsHTMLDNSPrefetch.cpp	Wed Jan 04 11:41:54 2012 -0800
>+nsHTMLDNSPrefetch::CancelPrefetch(nsAString &hostname,
>+      gNeckoChild->SendCancelHTMLDNSPrefetch(nsAutoString(hostname), flags,

Do you really need the nsAutoString?  nsString or better yet PromiseFlatString() won't work?

>+  return sDNSService->CancelAsyncResolve(NS_ConvertUTF16toUTF8(hostname),
>+                                      flags | nsIDNSService::RESOLVE_SPECULATE,
>+                                      sDNSListener, aReason);

Please fix the indentation there.

>+          if (NS_OK == rv)

NS_SUCCEEDED?

>+++ b/netwerk/ipc/NeckoParent.cpp	Wed Jan 04 11:41:54 2012 -0800
>+NeckoParent::RecvCancelHTMLDNSPrefetch(const nsString& hostname,

>+  nsAutoString h(hostname);

Why do you need |h|?  Just passing hostname should work fine.

sr=me modulo above nits.
Comment 69 Steve Workman [:sworkman] (please use needinfo) 2012-01-17 15:00:24 PST
Created attachment 589314 [details] [diff] [review]
v3.2 Should avoid perf regression on dromaeo dom-modify; updated after sr+ from :bz

Updated based on bz's comments.  Also changed e10s recv function, RecvHTMLDNSPrefetch() to not use nsAutoString, same as RecvCancelHTMLDNSPrefetch().
Comment 70 Marco Bonardo [::mak] 2012-01-24 05:02:59 PST
So, looks like this has been pushed to inbound, but backed out by philor for build bustage on linux

cc1plus: warnings being treated as errors

../../../../../content/html/content/src/nsHTMLDNSPrefetch.cpp: In static member function 'static nsresult nsHTMLDNSPrefetch::CancelPrefetch(mozilla::dom::Link*, PRUint16, nsresult)':
../../../../../content/html/content/src/nsHTMLDNSPrefetch.cpp:217:12: error: unused variable 'rv'
make[8]: *** [nsHTMLDNSPrefetch.o] Error 1
make[8]: Leaving directory `/builds/slave/m-in-lnx/build/obj-firefox/content/html/content/src'
make[7]: *** [src_libs] Error 2
make[7]: Leaving directory `/builds/slave/m-in-lnx/build/obj-firefox/content/html/content'
make[6]: *** [content_libs] Error 2
make[6]: Leaving directory `/builds/slave/m-in-lnx/build/obj-firefox/content/html'
make[5]: *** [html_libs] Error 2
make[5]: *** Waiting for unfinished jobs....
Comment 71 Steve Workman [:sworkman] (please use needinfo) 2012-01-24 16:08:06 PST
Pushed to try:
    https://hg.mozilla.org/try/rev/c61151b368ab
No warnings as errors observed.
Comment 72 Steve Workman [:sworkman] (please use needinfo) 2012-01-24 16:22:12 PST
Pushed to mozilla-inbound:
    https://hg.mozilla.org/integration/mozilla-inbound/rev/49c9b4e6325b

Note You need to log in before you can comment on or make changes to this bug.