The default bug view has changed. See this FAQ.

Implement prefetch in network predictor

RESOLVED FIXED in Firefox 48

Status

()

Core
Networking
RESOLVED FIXED
3 years ago
10 months ago

People

(Reporter: nwgh, Assigned: nwgh)

Tracking

unspecified
mozilla48
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox48 fixed)

Details

(Whiteboard: [necko-active])

Attachments

(2 attachments, 8 obsolete attachments)

Comment hidden (empty)
Depends on: 1020416
Depends on: 1020419
Status: NEW → ASSIGNED
Jeremy - as part of this, it would be great if you could add telemetry for how often we're using these prefetched resources versus how often we just end up ignoring the prefetched resource. That will help us tune the algorithm over time if/as needed.
Depends on: 1029760
Created attachment 8463627 [details] [diff] [review]
bug-1016628-fix.patch

This test is still a work in progress, but confirmation that none of this is insane would be appreciated. :)
Attachment #8463627 - Flags: feedback?(hurley)
Created attachment 8464181 [details] [diff] [review]
review-1020416.patch

Just added an internal version of IsForcedValid entry which doesn't get the mutex lock first.
Attachment #8464181 - Flags: review?(michal.novotny)
wrong bug *facepalm*
Attachment #8464181 - Flags: review?(michal.novotny)
Comment on attachment 8463627 [details] [diff] [review]
bug-1016628-fix.patch

Review of attachment 8463627 [details] [diff] [review]:
-----------------------------------------------------------------

This all looks pretty good to me. The one thing I've thought about is that we *might* not want to force validity on responses with, say, 5xx (server error) statuses. Presumably the server should be making these non-cacheable, but I wouldn't be surprised to find out there are a lot of people who aren't. It's quite likely there are other status codes we shouldn't force valid (401 and 407 come to mind as potential candidates), but I'm totally hazy today, so don't trust my judgement 100% :)
Attachment #8463627 - Flags: feedback?(hurley) → feedback+
Thanks for looking that over :) I've been making some design simplications for tests and stuff, but I'll extend the status code stuff to do as you say (and incorporate Pat's comment from earlier today about not prefetching things that can't be prefetched)
Sounds good. Also, double-check with Pat about particular error codes that we don't want to force validity for. It's entirely possible that I'm wrong about all of those, and we just want to trust the channel to Do The Right Thing.
Attachment #8464181 - Attachment is obsolete: true
(In reply to Nicholas Hurley [:hurley] (Offline until 28 July) from comment #7)
> Sounds good. Also, double-check with Pat about particular error codes that
> we don't want to force validity for. It's entirely possible that I'm wrong
> about all of those, and we just want to trust the channel to Do The Right
> Thing.

I would go the other way - only prefetch things that have given 200 in the past. There are other things that are cacheable, but they are corner cases in this context.
Created attachment 8466547 [details] [diff] [review]
bug-1016628-fix.patch

https://tbpl.mozilla.org/?tree=Try&rev=210262225943
Attachment #8463627 - Attachment is obsolete: true
Attachment #8466547 - Flags: review?(hurley)
Created attachment 8466571 [details] [diff] [review]
bug-1016628-fix.patch

This is version that is less likely to burn on try. We shall see, but for now I'm going to hold on pushing to try until I address the major nits/suggestions for this patch.
Attachment #8466547 - Attachment is obsolete: true
Attachment #8466547 - Flags: review?(hurley)
Attachment #8466571 - Flags: review?(hurley)
This try build is probably the last "full" build I try to do - I made some minor changes that should allow for safe compilation on WinXP (*shakes head*). The primary goal of this build is to see if the tweaks I made to the test will now pass on OSX. If they don't, all future tweaks will be test with try builds focusing on that specific test.

https://tbpl.mozilla.org/?tree=Try&rev=1c9822288fe6
https://tbpl.mozilla.org/?tree=Try&rev=80ab5d569ddb (last test was not what I wanted so I canceled it ASAP)
Comment on attachment 8466571 [details] [diff] [review]
bug-1016628-fix.patch

Review of attachment 8466571 [details] [diff] [review]:
-----------------------------------------------------------------

This looks pretty good! I want to discuss with you (we can do this tomorrow) the possibility of making cacheable false by default in the db. This may or may not be possible, but I want some time to think about it and discuss before we go forward with this implementation.

::: netwerk/base/src/Predictor.cpp
@@ +574,5 @@
>      // have to change this to actually upgrade things as appropriate.
>      MOZ_ASSERT(currentVersion == PREDICTOR_SCHEMA_VERSION,
>                 "Invalid predictor schema version!");
>      if (currentVersion != PREDICTOR_SCHEMA_VERSION) {
> +      // Delete the old file, and make a new DB

You need to close the db connection here before removing the file.

@@ +634,5 @@
>                           "  hid INTEGER NOT NULL,\n"
>                           "  origin TEXT NOT NULL,\n"
>                           "  hits INTEGER DEFAULT 0,\n"
>                           "  last_hit INTEGER DEFAULT 0,\n"
> +                         "  cacheable INTEGER DEFAULT 1,\n"

Don't add this column to this table. We'll do some sql hackery below to make it irrelevant :)

@@ +771,5 @@
>                           "  pid INTEGER NOT NULL,\n"
>                           "  uri TEXT NOT NULL,\n"
>                           "  hits INTEGER DEFAULT 0,\n"
>                           "  last_hit INTEGER DEFAULT 0,\n"
> +                         "  cacheable INTEGER DEFAULT 1,\n"

I'm trying to think of a way we can make this false by default (to make it harder to prefetch something we shouldn't)... I haven't come up with a good idea yet, though.

@@ +1076,5 @@
> +    if (mQueryType == Predictor::QUERY_PAGE) {
> +      stmt = gPredictor->mStatements.GetCachedStatement(
> +        NS_LITERAL_CSTRING("UPDATE moz_subresources SET cacheable = "
> +          ":is_cacheable WHERE id = :id;"));
> +    } else {

Don't have this branch, as we won't have this column to update on moz_subhosts :)

@@ +1086,5 @@
> +      return NS_OK;
> +    }
> +    mozStorageStatementScoper scope(stmt);
> +
> +    // Set cacheable to false for this entry

So this comment lies, as you could, in fact, set it to true here :)

@@ +1087,5 @@
> +    }
> +    mozStorageStatementScoper scope(stmt);
> +
> +    // Set cacheable to false for this entry
> +    int32_t cacheable = mCacheable? 1 : 0;

Space before the ?

@@ +1090,5 @@
> +    // Set cacheable to false for this entry
> +    int32_t cacheable = mCacheable? 1 : 0;
> +    nsresult rv = stmt->BindInt32ByName(NS_LITERAL_CSTRING("is_cacheable"),
> +                                        cacheable);
> +    if (rv != NS_OK)

Braces around the body of the if

@@ +1094,5 @@
> +    if (rv != NS_OK)
> +      return rv;
> +
> +    rv = stmt->BindInt32ByName(NS_LITERAL_CSTRING("id"), mId);
> +    if (rv != NS_OK)

Braces around the body of the if

@@ +1161,5 @@
> +  }
> +
> +  // Get the cache entry and set the forced validation setting
> +  nsCOMPtr<nsIHttpChannel> channel = do_QueryInterface(aContext);
> +  if (!channel)

Braces around the body of the if

@@ +1171,5 @@
> +  channel->GetResponseStatus(&response);
> +  if (response == 200) {
> +    rv = channel->ForceCacheEntryValidFor(300);
> +  }
> +  else {

This should be with the } on the previous line

@@ +1196,5 @@
> +                                  nsIInputStream *aInputStream,
> +                                  uint64_t aOffset,
> +                                  uint32_t aCount)
> +{
> +  char *buffer = new char[aCount];

Just do char buffer[aCount];

@@ +1200,5 @@
> +  char *buffer = new char[aCount];
> +  uint32_t writeCount;
> +
> +  // Read the buffer so runtime doesn't complain
> +  nsresult rv = aInputStream->Read(buffer, aCount, &writeCount);

Just do "return aInputStream->Read(...);"

@@ +1204,5 @@
> +  nsresult rv = aInputStream->Read(buffer, aCount, &writeCount);
> +  return rv;
> +}
> +
> +nsresult PrefetchListener::UpdateIsCacheable(bool isCacheable)

Newline after nsresult

@@ +1622,4 @@
>    } else {
>      stmt = mStatements.GetCachedStatement(
> +        NS_LITERAL_CSTRING("SELECT id, origin, hits, last_hit, cacheable FROM "
> +                           "moz_subhosts WHERE hid = :id;"));

SELECT id, origin, hits, last_hit, 0 AS cacheable FROM moz_subhosts WHERE hid = :id

@@ +1873,5 @@
>      int32_t hitCount;
>      PRTime lastHit;
>      int baseConfidence, confidence;
>  
> +    // We use goto nextrow here instead of just failing, because we want

I SPEL GUD!

@@ +2136,3 @@
>    } else {
>      stmt = mStatements.GetCachedStatement(
> +        NS_LITERAL_CSTRING("SELECT id, hits, last_hit, cacheable FROM "

SELECT id, hits, last_hit, 0 AS cacheable ...

@@ +2189,5 @@
>    } else {
>      stmt = mStatements.GetCachedStatement(
>          NS_LITERAL_CSTRING("INSERT INTO moz_subhosts "
> +                           "(hid, origin, hits, last_hit, cacheable) VALUES "
> +                           "(:parent_id, :key, 1, :now, 1);"));

Drop the cacheable and default value 1 here, since it should no longer exist on this table.

::: netwerk/protocol/http/HttpChannelChild.cpp
@@ +1332,5 @@
>    return NS_ERROR_NOT_IMPLEMENTED;
>  }
>  
> +NS_IMETHODIMP
> +HttpChannelChild::ForceCacheEntryValidFor(uint32_t aSecondsToTheFuture) {

Newline before {

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +4325,5 @@
>      return NS_OK;
>  }
>  
> +NS_IMETHODIMP
> +nsHttpChannel::ForceCacheEntryValidFor(uint32_t aSecondsToTheFuture) {

Newline before {

::: netwerk/protocol/viewsource/nsViewSourceChannel.cpp
@@ +783,5 @@
>          mHttpChannel->RedirectTo(uri);
>  }
>  
> +NS_IMETHODIMP
> +nsViewSourceChannel::ForceCacheEntryValidFor(uint32_t aSecondsToTheFuture) {

Newline before {

::: netwerk/test/unit/test_predictor.js
@@ +304,5 @@
> +  // Now would be a good time to turn on what we're testing
> +  prefs.setBoolPref("network.predictor.enable-prefetch", true);
> +
> +  // Lets set up a server to hand back some resources
> +  httpserv = new HttpServer();

Declare httpserv somewhere, don't just rely on javascripts (junky) "global by default" behavior.
Attachment #8466571 - Flags: review?(hurley)
>@@ +1196,5 @@
>> +                                  nsIInputStream *aInputStream,
>> +                                  uint64_t aOffset,
>> +                                  uint32_t aCount)
>> +{
>> +  char *buffer = new char[aCount];
>
>Just do char buffer[aCount];

Actually this causes a build error in Win XP. It was buffer[aCount] originally. The latest try build was supposed to confirm that, but I modified a file somewhere else that was causing my warnings-as-errors build to fail (and my fix for that actually breaks on XP as well haha)
(In reply to Jeremy Poulin [:jaypoulz] from comment #14)
> Actually this causes a build error in Win XP. It was buffer[aCount]
> originally. The latest try build was supposed to confirm that, but I
> modified a file somewhere else that was causing my warnings-as-errors build
> to fail (and my fix for that actually breaks on XP as well haha)

*shakes fist* WINDOOOOOOOOOOOOWS!
Depends on: 1050613
Created attachment 8469761 [details] [diff] [review]
bug-1016628-fix.patch

Time to see if I accidentally fixed my OSX testing problems :)
This is built on top of my current 1050613 patch.

https://tbpl.mozilla.org/?tree=Try&rev=4700be290d9c
Attachment #8466571 - Attachment is obsolete: true
Attachment #8469761 - Flags: review?(hurley)
Hey Nick - post review, would you mind taking a look at some of these test failures? I think it either has to do with nsIURI, the implementation we chose for UpdateResource, or some combination of both. The const array size still whines in XP *shrug* :)
https://tbpl.mozilla.org/?tree=Try&rev=d9a5a1742a03 -- Here are some debug builds with tests. The only other modified code is revert back to 'new char[]' so XP quits whining.
Comment on attachment 8469761 [details] [diff] [review]
bug-1016628-fix.patch

Review of attachment 8469761 [details] [diff] [review]:
-----------------------------------------------------------------

Looking mostly good, but see below about the pitfalls of relying on mReferrer directly.

Let's also get the try failures worked out before coming back for the (hopefully) final round of review.

::: netwerk/base/src/Predictor.cpp
@@ +1243,5 @@
> +                                  nsIInputStream *aInputStream,
> +                                  uint64_t aOffset,
> +                                  const uint32_t aCount)
> +{
> +  char buffer[aCount];

So I know you changed this back to new char[aCount], and that's cool, but... I think I remember in the previous version of the patch that you never did a delete[] buffer before returning. You should really do that to avoid some nasty memory leaks :) (If I'm wrong, then just ignore me, as usual.)

@@ +2206,5 @@
>  
> +  int32_t cacheable;
> +  rv = stmt->GetInt32(3, &cacheable);
> +  NS_ENSURE_SUCCESS(rv, false);
> +  info.cacheable = cacheable? true : false;

space before ?

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +1220,5 @@
> +    // Notify the network predictor to see if this is a cacheable resource
> +    nsCOMPtr<nsIURI> target;
> +    GetURI((nsIURI **) &target);
> +    if (mReferrer && httpStatus) {
> +      mozilla::net::Predictor::PredictorUpdateResource(mReferrer, target,

So the problem with actually using mReferrer here is... it may not always be set (for example in the case where this is an http-schemed resource on an https-schemed page), or it may be set to some substring of the actual URI (see HttpBaseChannel::SetReferrer for some gory details).

I *think* what we want to do here (and I could be totally wrong, we should consult with people smarter than ourselves) is get at the docshell as an nsIWebNavigation (which should be gettable from mLoadGroup in a similar fashion as getting the nsILoadContext (also docshell) in one of the PredictorLearn incarnations), and get the currentURI off of the nsIWebNavigation. If that fails, we should fall back to the referrer (for the sake of testing, at least, since we won't have a docshell in tests). This will likely take some manual testing and poking at the database to make sure that we've gotten this right.
Attachment #8469761 - Flags: review?(hurley)
Mind glancing over this: https://tbpl.mozilla.org/?tree=Try&rev=0827b63ec8e7 (it can wait till Monday of course) :)
Flags: needinfo?(hurley)
So I think the problem around contentLength is... you might need to explicitly set the Content-Length header on your responses (I could be wrong, though, it's been a while since I've messed with the xpcshell http server). (This is the 10.6 debug xpcshell failure.)

The other xpcshell failure is ... *some* check going wrong in the cache listener. Unfortunately, the cache listener is kind of horrible in that there's no indication of *which* check actually went wrong, so it'll be just as easy for you to figure that one out as it would for me (add some debugging print statements to head_cache2.js and re-run the tests).

The rest of the failures (ie, the non-xpcshell ones) seem to be Not Your Fault.
Flags: needinfo?(hurley)
Patrick - would you glancing over this and giving your thoughts?

Excerpt from Comment 19:
> ::: netwerk/protocol/http/nsHttpChannel.cpp
> @@ +1220,5 @@
> > +    // Notify the network predictor to see if this is a cacheable resource
> > +    nsCOMPtr<nsIURI> target;
> > +    GetURI((nsIURI **) &target);
> > +    if (mReferrer && httpStatus) {
> > +      mozilla::net::Predictor::PredictorUpdateResource(mReferrer, target,
> 
> So the problem with actually using mReferrer here is... it may not always be
> set (for example in the case where this is an http-schemed resource on an
> https-schemed page), or it may be set to some substring of the actual URI
> (see HttpBaseChannel::SetReferrer for some gory details).
> 
> I *think* what we want to do here (and I could be totally wrong, we should
> consult with people smarter than ourselves) is get at the docshell as an
> nsIWebNavigation (which should be gettable from mLoadGroup in a similar
> fashion as getting the nsILoadContext (also docshell) in one of the
> PredictorLearn incarnations), and get the currentURI off of the
> nsIWebNavigation. If that fails, we should fall back to the referrer (for
> the sake of testing, at least, since we won't have a docshell in tests).
> This will likely take some manual testing and poking at the database to make
> sure that we've gotten this right.
Flags: needinfo?(mcmanus)
nick sounds like he's on the right tack there.. the easiest path is probably to abstract up the code nsHttpChannel::GetPerformance() uses to find the innner window in the form of a nsPIDomWindow and then you can just call GetDocumentURI on that
Flags: needinfo?(mcmanus)
Created attachment 8530522 [details] [diff] [review]
WIP-bug-1016628.patch

Does this look like a sane approach to getting at the docshell? Obviously there is more code in nsHttpChannel to call into this, but I'd like you to take a quick look at this to verify my sanity before I move ahead with make a full patch.
Attachment #8530522 - Flags: feedback?(hurley)
Comment on attachment 8530522 [details] [diff] [review]
WIP-bug-1016628.patch

Review of attachment 8530522 [details] [diff] [review]:
-----------------------------------------------------------------

This looks pretty sane... I think. Splinter isn't giving me enough context to see the whole body of GetInnerDomWindow, but since you haven't changed much, I assume it's all good :)

For the next round, let's use the new mozreview tool, since reviewboard (which is what mozreview is) is *way* nicer than splinter. I'm happy to finally have my code review system from 6 years ago back :)

::: netwerk/protocol/http/HttpBaseChannel.h
@@ +41,4 @@
>  #include "nsIHttpChannel.h"
>  #include "nsISecurityConsoleMessage.h"
>  #include "nsCOMArray.h"
> +#include "nsPIDOMWindow.h"

Don't include this here, only in the .cpp (if it isn't already included there). Just forward-declare nsPIDOMWindow like:

class nsPIDOMWindow;

and then you'll be fine in the .h

@@ +260,4 @@
>    // drop reference to listener, its callbacks, and the progress sink
>    void ReleaseListeners();
>  
> +  nsPIDOMWindow* GetInnerDomWindow();

Caps for DOM - GetInnerDOMWindow
Attachment #8530522 - Flags: feedback?(hurley) → feedback+
Comment hidden (obsolete)
Taking this, since Jeremy does not appear to be working on it any more, and we want to get prefetch landed - For Awesomeness.
Assignee: jaypoulz → hurley
Attachment #8469761 - Attachment is obsolete: true
Comment on attachment 8530522 [details] [diff] [review]
WIP-bug-1016628.patch

These patches are both wildly out of date - written back when the predictor still used sqlite for storage! I've mostly rewritten them, just need to work out some issues and do some more testing before I post new ones.
Attachment #8530522 - Attachment is obsolete: true
Comment hidden (obsolete)
Created attachment 8700087 [details] [diff] [review]
patch

This one changed rather dramatically, so I'm claiming it as my own now. But, it works.
Attachment #8700087 - Flags: review?(honzab.moz)
Comment on attachment 8700087 [details] [diff] [review]
patch

Review of attachment 8700087 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/Predictor.cpp
@@ +149,5 @@
>  // Version of metadata entries we expect
>  const uint32_t METADATA_VERSION = 1;
>  
> +// Flags available in entries
> +const uint32_t FLAG_CACHEABLE = 1 << 0;

add comment what this flag exactly does, how it is used..

@@ +1271,5 @@
> +  nsCOMPtr<nsIStreamListener> listener = new PrefetchListener(verifier, uri,
> +                                                              this);
> +  PREDICTOR_LOG(("    calling AsyncOpen listener=%p channel=%p", listener.get(),
> +                 channel.get()));
> +  rv = channel->AsyncOpen(listener, channel);

why exactly is the channel its own listener context? (comment sufficient)

@@ +2167,3 @@
>   */
>  NS_IMETHODIMP
> +Predictor::OnPredictPrefetch(nsIURI *aURI, uint32_t httpStatus) {

{ on a new line

@@ +2260,5 @@
> +  rv = httpChannel->GetResponseStatus(&httpStatus);
> +  bool isNoStore;
> +  rv = httpChannel->IsNoStoreResponse(&isNoStore);
> +  bool isNoCache;
> +  rv = httpChannel->IsNoCacheResponse(&isNoCache);

you rewrite rv two times.

@@ +2261,5 @@
> +  bool isNoStore;
> +  rv = httpChannel->IsNoStoreResponse(&isNoStore);
> +  bool isNoCache;
> +  rv = httpChannel->IsNoCacheResponse(&isNoCache);
> +  if (NS_SUCCEEDED(rv) && httpStatus == 200 && !isNoStore && !isNoCache) {

Why exactly also |&& !isNoStore && !isNoCache| ?  definitely neither of them is treated by necko as non-cachable.  no-store is held in memory and no-cache is cached but revalidated.  in case of a prefetch you can probably go with being it forced valid (however, it's disputable as it may cause problems in some cases like an early refresh of the page - sooner than in 10 seconds ; OTOH, will the referrer still be the same in that case?).  

I'm a bit worried that most of the top-level document responses will have no-cache set...

@@ +2293,5 @@
> +{
> +  nsCString buffer;
> +  buffer.SetCapacity(aCount + 1);
> +  uint32_t writeCount;
> +  return aInputStream->Read(buffer.BeginWriting(), aCount, &writeCount);

use NS_DiscardSegment

@@ +2310,5 @@
> +    PREDICTOR_LOG(("Predictor::UpdateCacheability in PB mode - ignoring"));
> +    return;
> +  }
> +
> +  RefPtr<Predictor> self = sSelf;

assert thread (probably the main thread?)

@@ +2360,5 @@
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +
> +  PREDICTOR_LOG(("CacheabilityAction::OnCacheEntryAvailable this=%p", this));
> +  if (NS_FAILED(result) || isNew) {

you will never get isNew == true with OPEN_READONLY, probably just assert isNew == false

@@ +2385,5 @@
> +    uint32_t hitCount, lastHit, flags;
> +
> +    if (!mPredictor->ParseMetaDataEntry(key, value, getter_AddRefs(uri),
> +                                        hitCount, lastHit, flags)) {
> +      PREDICTOR_LOG(("    failed to parse key=%s value=%s", key, value));

isn't this just bloating of the log?  there is a number of keys in an entry's metadata.

@@ +2402,5 @@
> +      }
> +      nsCString newValue;
> +      MakeMetadataEntry(hitCount, lastHit, flags, newValue);
> +      entry->SetMetaDataElement(key, newValue.BeginReading());
> +      break;

should you always break on ParseMetaDataEntry success?

::: netwerk/base/nsINetworkPredictorVerifier.idl
@@ +17,5 @@
>  {
>      /**
> +     * Callback for when we do a predictive prefetch
> +     *
> +     * @param uri - The URI tha twas prefetched

tha twas

::: netwerk/protocol/http/HttpBaseChannel.cpp
@@ +3046,5 @@
>      }
> +
> +    nsCOMPtr<nsPIDOMWindow> pDomWindow = GetInnerDOMWindow();
> +    if (!pDomWindow) {
> +      return nullptr;

4 spaces

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +1531,5 @@
>          bool saw_quic = (alt_protocol && PL_strstr(alt_protocol, "quic")) ? 1 : 0;
>          Telemetry::Accumulate(Telemetry::HTTP_SAW_QUIC_ALT_PROTOCOL, saw_quic);
>      }
>  
> +    // Let the predictor know whether this was a cacheable response or not

why?

@@ +1541,5 @@
> +    nsCOMPtr<nsIURI> referrer = GetReferringPage();
> +    if (!referrer) {
> +        referrer = mReferrer;
> +    }
> +    if (referrer && httpStatus) {

What exactly you expect from |httpStatus| (!= 0) as a condition?

@@ +1552,2 @@
>      LOG(("nsHttpChannel::ProcessResponse [this=%p httpStatus=%u]\n",
>          this, httpStatus));

would be nice to log at the top of the method (or as top as possible)

@@ +3306,5 @@
>      // See netwerk/cache2/nsICacheEntry.idl for details
> +    else if (isForcedValid && !mCachedResponseHead->NoStore() &&
> +             !mCachedResponseHead->NoCache() &&
> +             !mCachedResponseHead->ExpiresInPast() &&
> +             !mCachedResponseHead->MustValidateIfExpired()) {

doesn't this this break the notion of "force valid"?  if not, please explain in a comment why you are changing the condition like this.

@@ +4873,5 @@
> +    if (!mCacheEntry) {
> +        LOG(("nsHttpChannel::ForceCacheEntryValidFor found no cache entry "
> +             "for this channel [this=%p].", this));
> +    } else {
> +        mCacheEntry->ForceValidFor(aSecondsToTheFuture);

this may fail (not_implemented) in case of an appcache entry

::: netwerk/protocol/http/nsIHttpChannel.idl
@@ +385,5 @@
> +     *
> +     * @param aSecondsToTheFuture
> +     *
> +     */
> +    void forceCacheEntryValidFor(in unsigned long aSecondsToTheFuture);

should this rather be on nsICachingChannel?

::: netwerk/protocol/viewsource/nsViewSourceChannel.cpp
@@ +941,5 @@
>  
>  NS_IMETHODIMP
> +nsViewSourceChannel::ForceCacheEntryValidFor(uint32_t aSecondsToTheFuture)
> +{
> +  // disabled until something needs this

for me an argument to put this method on the nsICachingChannel interface

@@ +943,5 @@
> +nsViewSourceChannel::ForceCacheEntryValidFor(uint32_t aSecondsToTheFuture)
> +{
> +  // disabled until something needs this
> +  return NS_ERROR_NOT_IMPLEMENTED;
> +}

blank line after this line
Attachment #8700087 - Flags: review?(honzab.moz) → feedback+
About to upload a new patch. Everything is taken care of as requested, with exceptions listed here:

(In reply to Honza Bambas (:mayhemer) from comment #31)
> @@ +2261,5 @@
> > +  bool isNoStore;
> > +  rv = httpChannel->IsNoStoreResponse(&isNoStore);
> > +  bool isNoCache;
> > +  rv = httpChannel->IsNoCacheResponse(&isNoCache);
> > +  if (NS_SUCCEEDED(rv) && httpStatus == 200 && !isNoStore && !isNoCache) {
> 
> Why exactly also |&& !isNoStore && !isNoCache| ?  definitely neither of them
> is treated by necko as non-cachable.  no-store is held in memory and
> no-cache is cached but revalidated.  in case of a prefetch you can probably
> go with being it forced valid (however, it's disputable as it may cause
> problems in some cases like an early refresh of the page - sooner than in 10
> seconds ; OTOH, will the referrer still be the same in that case?).  

So I'm fairly certain (thinking back) that these were added to fix some test failures. I haven't fixed this (yet) because if that's why, then it will take more work to figure out the failing tests and modify them to not fail. That shouldn't prevent the rest of the review from going forward, though.

> I'm a bit worried that most of the top-level document responses will have
> no-cache set...

I'm not - top-level documents will never (in the current world) be candidates for prefetch (it's really hard to know we'll be loading a page until the user already starts loading it), so that's a total non-issue :)


> @@ +2385,5 @@
> > +    uint32_t hitCount, lastHit, flags;
> > +
> > +    if (!mPredictor->ParseMetaDataEntry(key, value, getter_AddRefs(uri),
> > +                                        hitCount, lastHit, flags)) {
> > +      PREDICTOR_LOG(("    failed to parse key=%s value=%s", key, value));
> 
> isn't this just bloating of the log?  there is a number of keys in an
> entry's metadata.

Not really - this is already limited earlier in the code to keys that the predictor added. So the only logging that happens here is when it really is a surprising failure.

> @@ +2402,5 @@
> > +      }
> > +      nsCString newValue;
> > +      MakeMetadataEntry(hitCount, lastHit, flags, newValue);
> > +      entry->SetMetaDataElement(key, newValue.BeginReading());
> > +      break;
> 
> should you always break on ParseMetaDataEntry success?

Nope - just b/c ParseMetaDataEntry succeeded doesn't mean it found the key we were looking for.

> @@ +3306,5 @@
> >      // See netwerk/cache2/nsICacheEntry.idl for details
> > +    else if (isForcedValid && !mCachedResponseHead->NoStore() &&
> > +             !mCachedResponseHead->NoCache() &&
> > +             !mCachedResponseHead->ExpiresInPast() &&
> > +             !mCachedResponseHead->MustValidateIfExpired()) {
> 
> doesn't this this break the notion of "force valid"?  if not, please explain
> in a comment why you are changing the condition like this.

Like above, I'm fairly certain I added this in because of failing tests (trying to get to a state where I could tell if my code had any issues left, not tests that made assumptions that are now broken). Will re-evaluate.

> @@ +4873,5 @@
> > +    if (!mCacheEntry) {
> > +        LOG(("nsHttpChannel::ForceCacheEntryValidFor found no cache entry "
> > +             "for this channel [this=%p].", this));
> > +    } else {
> > +        mCacheEntry->ForceValidFor(aSecondsToTheFuture);
> 
> this may fail (not_implemented) in case of an appcache entry

That's just fine, it's not something we particularly care about - all this is an optimization, and if we can't optimize for whatever reason (appcache), it's not the end of the world.
Created attachment 8711790 [details] [diff] [review]
patch

And here's the updated patch.
Attachment #8711790 - Flags: review?(honzab.moz)
Attachment #8700087 - Attachment is obsolete: true
Comment on attachment 8711790 [details] [diff] [review]
patch

Review of attachment 8711790 [details] [diff] [review]:
-----------------------------------------------------------------

::: netwerk/base/Predictor.cpp
@@ +152,5 @@
>  const uint32_t METADATA_VERSION = 1;
>  
> +// Flags available in entries
> +// FLAG_CACHEABLE - we have determined that this item is eligible for prefetch
> +const uint32_t FLAG_CACHEABLE = 1 << 0;

should the name then be FLAG_PREFATCHABLE ?  but up to you, you know better what exactly this flag means

@@ +1275,5 @@
> +                                                              this);
> +  PREDICTOR_LOG(("    calling AsyncOpen listener=%p channel=%p", listener.get(),
> +                 channel.get()));
> +  // The channel is its own context so the PrefetchListener can get information
> +  // from it in the On*Request callbacks

OK, but aRequest argument on On*Request is the channel as well, so I'm still not sure why you need it.

@@ +1650,5 @@
> +static void
> +MakeMetadataEntry(const uint32_t hitCount, const uint32_t lastHit,
> +                  const uint32_t flags, nsCString &newValue)
> +{
> +  newValue.AssignLiteral("");

newValue.Trancate()

@@ +1652,5 @@
> +                  const uint32_t flags, nsCString &newValue)
> +{
> +  newValue.AssignLiteral("");
> +  newValue.AppendInt(METADATA_VERSION);
> +  newValue.AppendLiteral(",");

Append(',')

and for next time, for literals with len > 1 more efficient is:

Append(NS_LITERAL_CSTRING("literal"));

@@ +2268,5 @@
> +
> +  nsresult rv = NS_OK;
> +  uint32_t httpStatus;
> +  rv = httpChannel->GetResponseStatus(&httpStatus);
> +  bool isNoStore;

just to make any sanitization checks happy, please initiate to false.

@@ +2273,5 @@
> +  if (NS_SUCCEEDED(rv)) {
> +    // Ensure we don't shadow a failure with a success
> +    rv = httpChannel->IsNoStoreResponse(&isNoStore);
> +  }
> +  bool isNoCache;

as well

@@ +2375,5 @@
> +                                                     nsIApplicationCache *appCache,
> +                                                     nsresult result)
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +  MOZ_ASSERT(!isNew);

nit: please add a comment that you are opening READONLY and hence never expect isNew == true

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +3308,5 @@
>      // See netwerk/cache2/nsICacheEntry.idl for details
> +    else if (isForcedValid && !mCachedResponseHead->NoStore() &&
> +             !mCachedResponseHead->NoCache() &&
> +             !mCachedResponseHead->ExpiresInPast() &&
> +             !mCachedResponseHead->MustValidateIfExpired()) {

if the only reason is test failures we should not land this.  you can easily turn the preload off for the failing tests (of fix them if reasonable)

@@ +3313,3 @@
>          LOG(("NOT validating based on isForcedValid being true.\n"));
> +        Telemetry::AutoCounter<Telemetry::PREDICTOR_TOTAL_PREFETCHES_USED> used;
> +        ++used;

not really the right place...  we could have a flag on cache entries to say "preloaded" but that is for a followup (please file it).  for now let's go with this.

::: netwerk/protocol/http/nsIHttpChannel.idl
@@ +379,5 @@
>       *         has been opened.
>       */
>      void redirectTo(in nsIURI aNewURI);
>  
> +

nit: leftover ws
Attachment #8711790 - Flags: review?(honzab.moz) → review+
Duplicate of this bug: 679819
Whiteboard: [necko-active]
Created attachment 8738310 [details] [diff] [review]
interdiff from old version to new

This is the interdiff between the current (r+'d) version, and the new version (incoming in next attachment)
Created attachment 8738311 [details] [diff] [review]
patch

And here's the new version, now with a rolling window for prefetch (meaning prefetch will not be useless both on sites already known by the predictor, and on sites that, at some point in their life, undergo a major redesign). Advantage of this is that it makes test changes I had to make previously irrelevant, as none of them fill up the rolling window.

Try is happy, if you are: https://treeherder.mozilla.org/#/jobs?repo=try&revision=34eb3ee5e72f
Attachment #8711790 - Attachment is obsolete: true
Attachment #8738311 - Flags: review?(honzab.moz)
Nick, I still don't fully understand how the rolling window works and why it's stored into entries.  Probably also because of lack of (any) comments in your code.  Could you please explain even more how does that work?
Flags: needinfo?(hurley)
The whole idea behind a rolling window is to allow for prefetching to occur on sites that undergo a major redesign (for example).

We only want to prefetch when we're super-confident (~100% confident) that we'll be using that resource. With the old method of calculation, it would be (quite literally) mathematically impossible to prefetch a resource that gets added to a page at some point after the first time we load that page (even if that new resource becomes a permanent part of the page).

With the rolling window for determining prefetch, we only use the last N loads (where the default value of N = 10) and ask "did we load this particular resource the last N times we loaded this page?". If the answer to that question is "yes", then we can prefetch. We have to be able to keep track of that independent of the total pageload count & total resource load count, as those have no concept of where the overlap in load counts is (so if total pageload count is 20 and total resource load count is 10, which 10 loads of the page do those resource loads correspond to? The first 10? The most recent 10? Alternating loads?) The rolling window, since it gets updated on every page load (regardless of if the resource is actually loaded) actually keeps that information for us.

Does that clarify things?
Flags: needinfo?(hurley)
Thanks.  On each sub-resource entry - the array you keep on the top-level doc cache entry - you have a bit field array that is shifted (having then 0 at the lowest bit) on every top-level doc load.  And when a sub-resource is later loaded by that page, you set 1 at the lowest bit.  And you only prefetch a sub-resource when you find all N (default N=10) bits set.

Right?
OK, I'll get to this review on Monday.  Thanks for the interdiff, btw!
(In reply to Honza Bambas (:mayhemer) from comment #40)
> Thanks.  On each sub-resource entry - the array you keep on the top-level
> doc cache entry - you have a bit field array that is shifted (having then 0
> at the lowest bit) on every top-level doc load.  And when a sub-resource is
> later loaded by that page, you set 1 at the lowest bit.  And you only
> prefetch a sub-resource when you find all N (default N=10) bits set.
> 
> Right?

Yep, that's exactly right.
Comment on attachment 8738311 [details] [diff] [review]
patch

Review of attachment 8738311 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't test the patch manually.

r+ with all comments addressed.

::: modules/libpref/init/all.js
@@ +1789,5 @@
>  pref("network.predictor.subresource-degradation.week", 10);
>  pref("network.predictor.subresource-degradation.month", 25);
>  pref("network.predictor.subresource-degradation.year", 50);
>  pref("network.predictor.subresource-degradation.max", 100);
> +pref("network.predictor.prefetch-rolling-load-count", 10);

isn't this just too hi?  would 5 or 3 be better?  up to you, tho.

::: netwerk/base/Predictor.cpp
@@ +162,5 @@
> +// We save 12 bits in the "flags" section of our metadata for actual flags, the
> +// rest are to keep track of a rolling count of which loads a resource has been
> +// used on to determine if we can prefetch that resource or not;
> +const uint8_t kRollingLoadOffset = 12;
> +const int32_t kMaxPrefetchRollingLoadCount = 20;

shouldn't all these consts be static?

@@ +1220,5 @@
> +  // on the last n (for n <= 20) loads of a page.
> +  const uint32_t kRollingLoadCountMask = (1 << mPrefetchRollingLoadCount) - 1;
> +
> +  // Figure out our new count, making sure the "most recent load" bit is clear,
> +  // since we aren't yet sure if the resource is being used on this load or not.

Rather saying few words what this method does as a whole would be better, like:

"on every page load we shift the rolling window by one bit.  the lowest bit (which is the most recent loaded/not-loaded flag) is now null.  it will be set later when the page hits the sub-resource."

or something like this.  because "Figure out our new count" makes one a headache ;)

@@ +1223,5 @@
> +  // Figure out our new count, making sure the "most recent load" bit is clear,
> +  // since we aren't yet sure if the resource is being used on this load or not.
> +  uint32_t rollingLoadCount = (flags & (kRollingLoadCountMask << kRollingLoadOffset)) >> kRollingLoadOffset;
> +  rollingLoadCount <<= 1;
> +  rollingLoadCount &= kRollingLoadCountMask;

so when someone changes the pref to say 3, this will delete the history.  isn't it waste?  also, it adds inconsistency when  user moves the pref and only some records are updated with new numbers.  

shouldn't we just keep all the history?  it doesn't waste space anyway :)  and would make this code a bit more simpler and readable.


then you could do all of this simply on one line only as:

flags = ((flags << 1) & (~kFlagsMask << 1)) | (flags & kFlagsMask);

where 
static uint32_t const kFlagsMask = ((1 << kRollingLoadOffset) - 1);

@@ +1279,5 @@
> +      PREDICTOR_LOG(("    forcing non-cacheability - no referrer"));
> +      flags &= ~FLAG_PREFETCHABLE;
> +    } else {
> +      uint32_t rollingLoadCheck = (1 << mPrefetchRollingLoadCount) - 1;
> +      rollingLoadCheck <<= kRollingLoadOffset;

Here |rollingLoadCheck| actually represents 10 (when at default) bits set, which is 10 loads in line, right?  Some more depicting name would be nice, I had to think a bit before deciphering this ;)  could also be that it was too late in the evening.

@@ +1300,5 @@
>    MOZ_ASSERT(NS_IsMainThread());
>  
> +  nsAutoCString uriStr;
> +  uri->GetAsciiSpec(uriStr);
> +  PREDICTOR_LOG(("SetupPrediction mEnablePrefetch=%d mPrefetchMinConfidence=%d mPreconnectMinConfidence=%d mPreresolveMinConfidence=%d flags=%d confidence=%d uri=%s", mEnablePrefetch, mPrefetchMinConfidence, mPreconnectMinConfidence, mPreresolveMinConfidence, flags, confidence, uriStr.get()));

nit: try 80 chars wrapping

@@ +1345,5 @@
> +  PREDICTOR_LOG(("    calling AsyncOpen listener=%p channel=%p", listener.get(),
> +                 channel.get()));
> +  // The channel is its own context so the PrefetchListener can get information
> +  // from it in the On*Request callbacks
> +  rv = channel->AsyncOpen(listener, channel);

I still don't follow why you attach the channel itself again as a context.  You really don't need to.  Just QI aRequest in On* callbacks.

@@ +1637,5 @@
>        // is update the timestamps and fetch count, and that's done for us by
>        // opening the cache entry.
> +      if (fullUri) {
> +        PREDICTOR_LOG(("    WARNING - updating rolling load count. "
> +                       "If you see this outside tests, you did it wrong"));

can you please more explain this?  seems like this whole code is here just for tests (your tests?  any test we currently run?  or what?)

actually this whole branch needs some good explanation.

@@ +1642,5 @@
> +        if (mPrefetchRollingLoadCount < 0) {
> +          mPrefetchRollingLoadCount = 0;
> +        } else if (mPrefetchRollingLoadCount > kMaxPrefetchRollingLoadCount) {
> +          mPrefetchRollingLoadCount = kMaxPrefetchRollingLoadCount;
> +        }

could you have a SanitizePrefs() method for this?

@@ +1815,5 @@
>      PREDICTOR_LOG(("    existing resource"));
>      hitCount = std::min(hitCount + 1, static_cast<uint32_t>(loadCount));
>    }
>  
> +  // Update our rolling count so prefetch will work

so, here you actually set the lowest bit of the rolling load counter when you hit the sub-resource load?

Please update this comment then to reflect and explain it.  This one is nothing but confusing.

@@ +2344,5 @@
> +    return aStatusCode;
> +  }
> +  Telemetry::AccumulateTimeDelta(Telemetry::PREDICTOR_PREFETCH_TIME, mStartTime);
> +
> +  nsCOMPtr<nsIHttpChannel> httpChannel = do_QueryInterface(aContext);

QI on aRequest will have exactly the same result.  no need to put the channel itself as a context

::: netwerk/cache2/CacheStorageService.cpp
@@ +1150,5 @@
>                                                  nsACString const &aEntryKey)
>  {
>    mozilla::MutexAutoLock lock(mForcedValidEntriesLock);
>  
> +  LOG(("Removing forced validity contextKey=%s entryKey=%s", aContextKey.BeginReading(), aEntryKey.BeginReading()));

LOG(("CacheStorageService::RemoveEntryForceValid, context='%s' entryKey=%s", aContextKey.BeginReading(), aEntryKey.BeginReading()));

::: netwerk/protocol/http/nsHttpChannel.cpp
@@ +3393,5 @@
>      // Check isForcedValid to see if it is possible to skip validation
>      // See netwerk/cache2/nsICacheEntry.idl for details
> +    else if (isForcedValid &&
> +             (!mCachedResponseHead->ExpiresInPast() ||
> +              !mCachedResponseHead->MustValidateIfExpired())) {

these need to be well explained here in a comment and also in the IDL for the forceValid method.  it may break the force-valid functionality (expectancy) and have unpredictable behavior.
Attachment #8738311 - Flags: review?(honzab.moz) → review+
https://treeherder.mozilla.org/#/jobs?repo=try&revision=50ad06b6d747

Comment 45

11 months ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/b3b5be1d2027

Updated

11 months ago
Depends on: 1266735

Comment 46

11 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/b3b5be1d2027
Status: ASSIGNED → RESOLVED
Last Resolved: 11 months ago
status-firefox48: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla48
Depends on: 1267058
Depends on: 1266671

Comment 47

11 months ago
This change appears to have triggered several telemetry alerts [1][2][3].

In short, it seems as though the time spent in the predictor went up, the number of preresolves went down, and the number of predictions went up.

Is this a good thing or a bad thing? Are these changes consistent with the patches submitted? 

[1]: http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/809/alerts/?from=2016-04-23&to=2016-04-23
[2]: http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/28/alerts/?from=2016-04-23&to=2016-04-23
[3]: http://alerts.telemetry.mozilla.org/index.html#/detectors/1/metrics/592/alerts/?from=2016-04-23&to=2016-04-23
Flags: needinfo?(hurley)

Comment 48

11 months ago
Also, is it possible that this work could be responsible for changes in metrics SPDY_SYN_RATIO, SPDY_CHUNK_RECVD and SPDY_SYN_SIZE?
The alerts in comment 47 are most definitely consistent with the patches. It is neither good nor bad (well, I suppose it's good in that it indicates things are working as expected).

As for the changes in the SPDY telemetry metrics - I suppose it could be (we've added some requests which may have slightly different characteristics), but I can't say for certain.
Flags: needinfo?(hurley)

Comment 50

11 months ago
Thank you for your prompt response! I'll go ahead and disregard the changes as a transition to the new normal.

Would you happen to know someone who would be able to comment on the SPDY changes more definitively? I have a gut feeling from reading the changelog[1] that it's this bug, but my knowledge of netwerk is rather slim.

[1]: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=0891f0fa044cba28024849803e170ed7700e01e0&tochange=37f04460ddb76d6ef4e7c32a8a6b2fbc44cb8776
For SPDY stuff, it's either me or :mcmanus for answering questions. Looking at the changelog you posted, yeah, I imagine this is the change that caused it. Again, I imagine this is "transition to a new normal" territory for the telemetry, not something to be particularly worried about.

Comment 52

11 months ago
Excellent. Will comment and subsequently ignore.

Updated

11 months ago
Depends on: 1269448
See Also: → bug 1268208
Depends on: 1271944
Depends on: 1273882
This is disabled on 48, right? Is it ready for aurora 49 or will it remain disabled? I came across this feature from looking at Talos regression bugs.
Flags: needinfo?(hurley)
It's disabled on 48, yes. I still have hopes of having this ready for aurora 49, but am ready to disable it if I don't make the headway I would like.
Flags: needinfo?(hurley)
Depends on: 1278636
Depends on: 1268208
See Also: bug 1268208
You need to log in before you can comment on or make changes to this bug.