Closed Bug 1054572 Opened 6 years ago Closed 6 years ago

HTTP cache v2: Remove redundant references to old cache IDLs (exc. FTP)

Categories

(Core :: Networking: Cache, defect)

25 Branch
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla34

People

(Reporter: sworkman, Assigned: sworkman)

References

(Blocks 1 open bug)

Details

Attachments

(3 files, 1 obsolete file)

Bug 1054418 highlighted some references to the old cache IDLs. This bug removes those references, excluding FTP code which is covered in bug 913827.

See Bug 1054418 comment #6.
Just trying to remove some of the references. Builds fine locally. Want to make sure tests are passing. Note: cachev2 is enabled in testing infra in this try push:

https://tbpl.mozilla.org/?tree=Try&rev=536459e64167
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=536459e64167
Gian-Carlo, can you take a look at this test case please? I changed nsChannelClassifier.cpp to use nsICacheEntry instead of nsICacheEntryDescriptor (cache v1 -> v2), so I'd like to add a test case to test_classifier.html to exercise this code path. Looks like this patch does it - let me know if you think something isn't right here. Thanks.
Attachment #8474859 - Flags: review?(gpascutto)
That mochitest other failure in the try run from comment 1 should be fixed by Bug 1055180 (discussed with :mmc earlier). It's just landed on m-i, so here's a new try run with the patch prepended.

https://tbpl.mozilla.org/?tree=Try&rev=ad8707db8167
https://treeherder.mozilla.org/ui/#/jobs?repo=try&revision=ad8707db8167
Comment on attachment 8474859 [details] [diff] [review]
v1.0 Add a second doUpdate in test_classifier to verify HTTP cache classification metadata

Review of attachment 8474859 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/url-classifier/tests/mochitest/classifierFrame.html
@@ +18,5 @@
> +  // Call doUpdate again to test classification metadata in HTTP cache entries.
> +  if (window.parent.firstLoad) {
> +    window.parent.info("Reloading from cache...");
> +    window.parent.firstLoad = false;
> +    window.parent.doUpdate(window.parent.testUpdate);

I don't understand what this is doing. We load the page, then force a SafeBrowsing update (again I presume, since I don't see the original doUpdate call removed), then reload the page?

What is this trying to accomplish, exactly?
This should help fix the mochitest failure in test_classified_annotations.html. Turns out that nsChannelClassifier was still using nsICacheEntryDescriptor from cache v1; so it wasn't writing or reading any cache entry metadata. Seems like nsChannelClassifier is not dependent on the cache, but we should still land this asap on m-c and try to land on aurora and beta.
Attachment #8476293 - Flags: review?(mmc)
Attachment #8476293 - Flags: review?(mmc) → review+
(In reply to Steve Workman [:sworkman] from comment #5)
> Created attachment 8476293 [details] [diff] [review]
> v1.0 Don't mark cache entries as classified if on the allow list
> 
> This should help fix the mochitest failure in
> test_classified_annotations.html. Turns out that nsChannelClassifier was
> still using nsICacheEntryDescriptor from cache v1; so it wasn't writing or
> reading any cache entry metadata. Seems like nsChannelClassifier is not
> dependent on the cache, but we should still land this asap on m-c and try to
> land on aurora and beta.

The allowlist code landed in 34, so no need to uplift if that's the only breakage.
Comment on attachment 8474003 [details] [diff] [review]
v1.0 Remove redundant references to old HTTP cache IDLs

Honza/Michal - whichever one of you gets to this one first. Mostly cleanup; the only real change is in nsChannelClassifier.cpp.
Attachment #8474003 - Flags: review?(michal.novotny)
Attachment #8474003 - Flags: review?(honzab.moz)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #4)
> Comment on attachment 8474859 [details] [diff] [review]
> v1.0 Add a second doUpdate in test_classifier to verify HTTP cache
> classification metadata

> What is this trying to accomplish, exactly?

Hi Gian-Carlo - I'm trying to have nsChannelClassifier::HasBeenClassified called and ensure that it checks the cache metadata. I'm not familiar with the classifier, so I'm reaching out a bit blindly here.

I tried calling location.reload from within classifierFrame.html, but that doesn't seem to invoke classification. So, I just called out to the parent document for it to call doUpdate(). However, playing around with things just now, I see that I can call out to the parent to set frame.src:

  window.parent.document.getElementById("testFrame").src = "classifierFrame.html";

That's a little ugly, but it works and nsChannelClassifier::HasBeenClassified checks the channel's cache metadata.

(BTW, any reason why reload doesn't invoke the classifier? Is this to be expected? Note: Even with location.reload(false) I'm not seeing HasBeenClassified check the cache).
Comment on attachment 8474003 [details] [diff] [review]
v1.0 Remove redundant references to old HTTP cache IDLs

Review of attachment 8474003 [details] [diff] [review]:
-----------------------------------------------------------------

Good for me, Steve!  Thanks a lot for this patch.  I though I caught all of the cases :/

::: dom/tests/mochitest/ajax/offline/offlineTests.js
@@ +17,5 @@
>  
>  OfflineCacheContents.prototype = {
>  QueryInterface: function(iid) {
>      if (!iid.equals(Ci.nsISupports) &&
> +        !iid.equals(Ci.nsICacheEntryOpenCallback)) {

how could this ever work?  we run these tests regularly.
Attachment #8474003 - Flags: review?(honzab.moz) → review+
Comment on attachment 8474859 [details] [diff] [review]
v1.0 Add a second doUpdate in test_classifier to verify HTTP cache classification metadata

Review of attachment 8474859 [details] [diff] [review]:
-----------------------------------------------------------------

>So, I just called out to the parent document for it to call doUpdate(). 

That really shouldn't have an effect either. If it does it must be in a very roundabout way. I'm going to r- based on this, I don't want to make code work by coincidence and magic.

>That's a little ugly, but it works and nsChannelClassifier::HasBeenClassified checks the channel's cache metadata.

Seems preferable to me, because at least forcing a page reload in a specific way (which would be expected to call HasBeenClassified) makes more sense than forcing an UrlClassifier update (which shouldn't affect this).

>(BTW, any reason why reload doesn't invoke the classifier? Is this to be expected? Note: Even with 
>location.reload(false) I'm not seeing HasBeenClassified check the cache).

I don't know. Maybe it's a bug.
Attachment #8474859 - Flags: review?(gpascutto) → review-
Comment on attachment 8474003 [details] [diff] [review]
v1.0 Remove redundant references to old HTTP cache IDLs

Review of attachment 8474003 [details] [diff] [review]:
-----------------------------------------------------------------

Clearing the review request since Honza already did the review.
Attachment #8474003 - Flags: review?(michal.novotny)
Made loadTestFrame a global function in test_classifier.html. It calls src=... and so classifierFrame.html calls out to that function to do the reload.
Attachment #8474859 - Attachment is obsolete: true
Attachment #8476834 - Flags: review?(gpascutto)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #11)
> >(BTW, any reason why reload doesn't invoke the classifier? Is this to be expected? Note: Even with 
> >location.reload(false) I'm not seeing HasBeenClassified check the cache).
> 
> I don't know. Maybe it's a bug.

I looked into this because I wanted to make sure we weren't doing something wrong: reload maps to nsIDocShell::LOAD_CMD_RELOAD which is mapped to VALIDATE_ALWAYS in nsDocShell::DoChannelLoad. Calling src="" doesn't set this flag, so the entry is just pulled from the cache without getting revalidated.

Looks like it's a timing issue: HasBeenClassified is called before the entry is revalidated, so cachingChannel->IsFromCache returns false, so we effectively don't check the cache entry's metadata for "necko:classified". In the code, this is because OpenCacheEntry is called (in nsHttpChannel::Connect) just before classifier->Start() (which calls HasBeenClassified).

So, if we're not validating (.src=""), it looks like the cache entry can be ready before HasBeenClassified. But it we are validating (.reload -> VALIDATE_ALWAYS), it doesn't get there in time.

nsChannelClassifier works either way; it just skips the cache if the cache entry is ready and classifies the URI again. I don't know if this is by design or not. In any case, I don't think it's a serious concern that needs immediate attention. Honza? Michal?
Flags: needinfo?(michal.novotny)
Flags: needinfo?(honzab.moz)
Comment on attachment 8476834 [details] [diff] [review]
v1.1 Reload classifierFrame.html in test_classifier.html to verify HTTP cache classification metadata

Review of attachment 8476834 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. Thanks for doing the analysis, though we still don't know why doUpdate was doing something here (I could easily see it affecting timing, though). 

>nsChannelClassifier works either way; it just skips the cache if the cache entry is ready and classifies the URI again.

Don't you mean "isn't ready" here? Sounds pretty wasteful otherwise.

::: toolkit/components/url-classifier/tests/mochitest/classifierFrame.html
@@ +14,5 @@
>    var elt = document.getElementById("styleCheck");
>    var style = document.defaultView.getComputedStyle(elt, "");
>    window.parent.isnot(style.visibility, "hidden", "Should not load bad css");
>  
> +  // Call doUpdate again to test classification metadata in HTTP cache entries.

Comment is out of date.
Attachment #8476834 - Flags: review?(gpascutto) → review+
(In reply to Steve Workman [:sworkman] from comment #14)
> (In reply to Gian-Carlo Pascutto [:gcp] from comment #11)
> > >(BTW, any reason why reload doesn't invoke the classifier? Is this to be expected? Note: Even with 
> > >location.reload(false) I'm not seeing HasBeenClassified check the cache).
> > 
> > I don't know. Maybe it's a bug.
> 
> I looked into this because I wanted to make sure we weren't doing something
> wrong: reload maps to nsIDocShell::LOAD_CMD_RELOAD which is mapped to
> VALIDATE_ALWAYS in nsDocShell::DoChannelLoad. Calling src="" doesn't set
> this flag, so the entry is just pulled from the cache without getting
> revalidated.
> 
> Looks like it's a timing issue: HasBeenClassified is called before the entry
> is revalidated, so cachingChannel->IsFromCache returns false, so we
> effectively don't check the cache entry's metadata for "necko:classified".
> In the code, this is because OpenCacheEntry is called (in
> nsHttpChannel::Connect) just before classifier->Start() (which calls
> HasBeenClassified).
> 
> So, if we're not validating (.src=""), it looks like the cache entry can be
> ready before HasBeenClassified. 

It will always be ready, or not?  If you mean "ready" = the metadata are set.

> But it we are validating (.reload ->
> VALIDATE_ALWAYS), it doesn't get there in time.

I'm not sure I follow.  The classifier works per channel and it's purely a channel thing to validate the entry.  This seems to me well sequenced, hence I'm not sure I can see a race condition in your explanation.

> 
> nsChannelClassifier works either way; it just skips the cache if the cache
> entry is ready and classifies the URI again. I don't know if this is by
> design or not. In any case, I don't think it's a serious concern that needs
> immediate attention. Honza? Michal?

Hmm.. depends on author of the classifier.  But reload may indicate to also recheck the page classification.  So I think we good here :)
Flags: needinfo?(honzab.moz)
(In reply to Gian-Carlo Pascutto [:gcp] from comment #15)
> Looks good. Thanks for doing the analysis, though we still don't know why
> doUpdate was doing something here (I could easily see it affecting timing,
> though).

Ah, sorry - I read my comment #14 again, and it probably wasn't clear because there are multiple layers here. The core timing issue that I'm discussing is whether the cache entry metadata is ready for reading when nsChannelClassifier::HasBeenClassified has been called. But let me clarify the different options from the test's point of view.

-- doUpdate: does some classifier DB updates and then calls src="", so yes, it's slower than calling src="" directly.
-- src="": allows the cache entry to be opened without validation.
-- reload: requires validation, so we must go to the network.

So, in test_classifier.html, reload is slower than src=""; doUpdate is slower than src=""; reload and doUpdate? I don't know, but src="" is fastest, and we don't need validation or classifier DB updates, so best to use that one :)

> >nsChannelClassifier works either way; it just skips the cache if the cache entry is ready and classifies the URI again.
> 
> Don't you mean "isn't ready" here? Sounds pretty wasteful otherwise.

Indeed, I did mean "isn't ready" - sorry about that :)

Now, the potential timing issue in nsHttpChannel...

(In reply to Honza Bambas (:mayhemer) from comment #16)
> I'm not sure I follow.  The classifier works per channel and it's purely a
> channel thing to validate the entry.  This seems to me well sequenced, hence
> I'm not sure I can see a race condition in your explanation.

It seems to work per channel - so, yes I agree with you that we're good :)

The race I see is that in BeginConnect we call classifier->Start() just after we've async'ly called cacheStorage->AsyncOpenURI(). So, all the cache APIs, starting with IsFromCache, in HasBeenClassified (called in Start()) are racing with the cache entry being opened, no? Looks like IsFromCache would return false if mCachePump isn't set, and that's not set until ReadFromCache is called in the context of OnCacheEntryAvailable.

Is there something I'm missing about how AsyncOpenURI works?

But yes, channel classifier seems to work with or without the cache entry metadata. I don't know if it's an optimization to wait for the cache entry or not ...

(clearing Michal's needinfo too).
Flags: needinfo?(michal.novotny)
(In reply to Steve Workman [:sworkman] from comment #18)
> The race I see is that in BeginConnect we call classifier->Start() just
> after we've async'ly called cacheStorage->AsyncOpenURI(). So, all the cache
> APIs, starting with IsFromCache, in HasBeenClassified (called in Start())
> are racing with the cache entry being opened, no? Looks like IsFromCache
> would return false if mCachePump isn't set, and that's not set until
> ReadFromCache is called in the context of OnCacheEntryAvailable.
> 
> Is there something I'm missing about how AsyncOpenURI works?

AsyncOpenURI may fire the onCheck and onAvail callback before it returns to the caller when the entry is in memory.  Not sure how the conditions for HasBeenClassified() changes then, but it may be different.  IMO, not that important to deal with.
I had to back these out for apparently being the cause of some ASAN mochitest-3 leaks:
https://tbpl.mozilla.org/php/getParsedLog.php?id=46579619&tree=Mozilla-Inbound

https://hg.mozilla.org/integration/mozilla-inbound/rev/603a6713b031


(The previous push was not part of the build, so hopefully didn't break this, and the push prior to that had a green ASAN m3 run. Every push since this landed has had the leak.
Flags: needinfo?(sworkman)
(In reply to Wes Kocher (:KWierso) from comment #20)
> I had to back these out for apparently being the cause of some ASAN
> mochitest-3 leaks:
> https://tbpl.mozilla.org/php/getParsedLog.php?id=46579619&tree=Mozilla-
> Inbound
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/603a6713b031

Wes, those look like intermittent leaks that should be suppressed in https://hg.mozilla.org/integration/mozilla-inbound/rev/d7e28d7d7505, no? That patch was landed after you backed out my patches so they wouldn't have been suppressed earlier. So, I'm going to go ahead and try relanding the patches..

https://hg.mozilla.org/integration/mozilla-inbound/rev/c35cdc3a9853
https://hg.mozilla.org/integration/mozilla-inbound/rev/562ad3180cf9
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf7967ad3bb8
Flags: needinfo?(sworkman)
(In reply to Steve Workman [:sworkman] from comment #21)
> (In reply to Wes Kocher (:KWierso) from comment #20)
> > I had to back these out for apparently being the cause of some ASAN
> > mochitest-3 leaks:
> > https://tbpl.mozilla.org/php/getParsedLog.php?id=46579619&tree=Mozilla-
> > Inbound
> > 
> > https://hg.mozilla.org/integration/mozilla-inbound/rev/603a6713b031
> 
> Wes, those look like intermittent leaks that should be suppressed in
> https://hg.mozilla.org/integration/mozilla-inbound/rev/d7e28d7d7505, no?
> That patch was landed after you backed out my patches so they wouldn't have
> been suppressed earlier. So, I'm going to go ahead and try relanding the
> patches..
> 
> https://hg.mozilla.org/integration/mozilla-inbound/rev/c35cdc3a9853
> https://hg.mozilla.org/integration/mozilla-inbound/rev/562ad3180cf9
> https://hg.mozilla.org/integration/mozilla-inbound/rev/cf7967ad3bb8

Yep, ASAN m3 is green on your relanding.
https://hg.mozilla.org/mozilla-central/rev/c35cdc3a9853
https://hg.mozilla.org/mozilla-central/rev/562ad3180cf9
https://hg.mozilla.org/mozilla-central/rev/cf7967ad3bb8
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla34
You need to log in before you can comment on or make changes to this bug.