Use nsIURLClassifier.asyncClassifyLocalWithTables in nsDocument::PrincipalFlashClassification

ASSIGNED
Assigned to

Status

()

Toolkit
Safe Browsing
P3
normal
ASSIGNED
6 months ago
3 days ago

People

(Reporter: hchang, Assigned: tnguyen)

Tracking

(Depends on: 2 bugs, Blocks: 3 bugs)

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [qf:p1])

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(2 attachments)

(Reporter)

Description

6 months ago
http://searchfox.org/mozilla-central/rev/1bc7720a434af3c13b68b8d69f92078cae8890c4/dom/base/nsDocument.cpp#13110
(Reporter)

Updated

6 months ago
Assignee: nobody → hchang

Updated

6 months ago
Blocks: 1342333
Status: NEW → ASSIGNED
Priority: -- → P3
Could you provide the rationale for this change?

I see that this Bug blocks Bug 1342333 - "Remove any user of nsIURLClassifier.classifyLocalWithTables on the main thread". I assume that by the main thread refers to the Chrome/parent process. However, nsDocument::PrincipalFlashClassification should be called exclusively from the Content process.

I was explicitly told that this interface to access Shavar's URL Classification, so it would be nice to know why and how the interface is changing.
Flags: needinfo?(hchang)
(Reporter)

Comment 2

5 months ago
Hi Kirk, 

The plan is to replace with the asynchronous version [1] of classifyLocalWithTables.
since that function might be blocked by I/O thread. (Precisely speaking, the function
is to post a SyncRunnable to some urlclassifier worker thread, which might be blocked
by I/O while we are updating databases.)

That said, given that this callsite will be out of content and parent process,
probably we can even lower the priority. Thanks for telling me this fact :)


[1] http://searchfox.org/mozilla-central/rev/d4adaabd6d2f88be0d0b151e022c06f6554909da/netwerk/base/nsIURIClassifier.idl#88

(In reply to Kirk Steuber [:bytesized] from comment #1)
> Could you provide the rationale for this change?
> 
> I see that this Bug blocks Bug 1342333 - "Remove any user of
> nsIURLClassifier.classifyLocalWithTables on the main thread". I assume that
> by the main thread refers to the Chrome/parent process. However,
> nsDocument::PrincipalFlashClassification should be called exclusively from
> the Content process.
> 
> I was explicitly told that this interface to access Shavar's URL
> Classification, so it would be nice to know why and how the interface is
> changing.
Flags: needinfo?(hchang)
Whiteboard: [qf:p1]
Hi Henry, just wanted to quickly check if you're still working on this, and if you have faced any issues etc.  Thanks!
Flags: needinfo?(hchang)
(Reporter)

Comment 4

5 months ago
Hi Ehsan,

I am working on other stuff irrelevant to safebrowsing lately since 
I thought I was told in comment 1 that the callsite is non-content process.
However, re-reading the comment, I seem to have misunderstood his comment:
It's "exclusively called from the content process" instead of "not on the content process".

So, it should be re-prioritized since it still potentially leads to UI jank.

Thankfully I already got a patch for e10s support for the async function in

https://bugzilla.mozilla.org/show_bug.cgi?id=1343425

Once it is in place, it should be not very difficult to replace the sync call with the async version.
Flags: needinfo?(hchang)
(Reporter)

Comment 5

5 months ago
Hi Kirk, Francois.

Do you know the flash blocking feature shipping time line?
I heard it hasn't been enabled yet but will be soon.

I am asking because the potential UI jank will not be an issue
until the feature is enabled.

Thanks :)
Flags: needinfo?(ksteuber)
Flags: needinfo?(francois)
We plan to run a test on 55 soon (in the next week or two). We are doing a Shield study on 53, planned for about a week after 53 hits release. It should be enabled for everyone starting around 55 aurora or 55 beta.
Flags: needinfo?(ksteuber)
Flags: needinfo?(francois)
Kirk, is there a bug already open for turning this on in release?

If so, we could make the current bug block that to ensure we don't get a perf regression.
Flags: needinfo?(ksteuber)
Blocks: 1351784
Flags: needinfo?(ksteuber)
(Reporter)

Comment 8

5 months ago
I'll start working on this and Bug
Depends on: 1343425
(Reporter)

Comment 9

4 months ago
Congratulation on myself that the dependent bug (asyncClassifyLocalWithTables e10s support)
has been solved by me (Self high five!) so I can start solving this bug from now on!
(Reporter)

Comment 10

4 months ago
After a deep investigation, I don't think there exists a solution to get rid of
the "synchronous-ness" thoroughly because nsDocument's member functions are almost
synchronous and there is no such "cutting point" where we can inject the async function.

A mitigation might be firing the async call to initialize mFlashClassifition
as early as possible (e.g. StartDocumentLoad). If mFlashClassifition is required
but yet to be initialized, we wait for the initialization completion.
However, the current async API does not allow us doing so because the callback
is on the main thread. It's almost [1] impossible to wait on main thread for
a callback which is going to occur on main thread, too. 

I was trying to rewrite the API so that the callback will not on the main
thread today. It's simple for the non-e10s version but impossible for e10s
if the manager protocol is PContent (because its worker thread is main thread.)
I managed to rewrite the protocol to make it managed by PBackground
but still stuck in an unknown crash :( 


[1] We can use NS_ProcessNextEvent but it might lead to side effect.
(Reporter)

Comment 11

4 months ago
To sum up for figuring out how to move forward:

1) I don't think classifyLocalWithTables could be replaced with asyncClassifyLocalWithTables
   without a big architectural change.

2) What we only can do is to minimize the main thread blocking time (but still a synchronous call
   on main thread.)
Depends on: 1353701
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Reporter)

Updated

4 months ago
Depends on: 1356211
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Reporter)

Comment 17

4 months ago
Hi Kirk,

I don't know if you can review this bug you are definitely the best one
to give me some feedback for the patch. I just submitted a patch, which
is based on Bug 1356211, to implement the "init early" we discussed in
the mail.

So, the basic idea is to call nsIURLClassifier.asyncClassifyLocalWithTables
in nsDocument::StartDocument() and take the result when needed. If the result
has not come out yet, we will be waiting on the main thread for the callback
of nsIURLClassifier.asyncClassifyLocalWithTables, which would occur on
non-main-thread.

There are two things I'd like to bring up in particular:

1) The semantics of mFlassClassification remains the same, which is derived
   from the "PrincipalFlassification()" and the parent document's state.

2) nsDocument::IsThirdParty() is relevant to its parent document so we cannot
   call it before the parent document is properly set. I have no idea the exact 
   timing when the parent document is available but I assume the timing that
   IsThirdParty() gets called in the current codebase is fine. Therefore I tried
   not to change the timing we call IsThirdParty(). Instead, when do classification,
   we include all tables and save the match lists. When PrincipalFlashClassification()
   is called, we derive IsThirdParty() and the resolve the classification result
   (allow/denied/unknown) from the match lists with "isThirdParty" taken into account.

I passed the test case "browser_flashblock_on_with_always_activate.js"
and I think this one is representative enough so I decided to submit 
the patch for getting feedback right now :)

Note that the patch depends on Bug 1356211 so if you'd like to give a try,
you need to apply patch on Bug 1356211 first. I'll continue to test other test
cases tomorrow. Just let you know the status for now.

Thanks!
Flags: needinfo?(ksteuber)
A few thoughts:

It seems like the Flash Classification is now being cached in two places: nsDocument::mFlashClassification and nsDocument::PrincipalFlashClassifier::mResult. I feel like this should not be necessary, but perhaps I am missing something?

It isn't ideal that we are checking non-third-party documents against the third party list every time. That list will most likely have by far the most content to check against. However, I do not really have any suggestions for how to avoid this, so maybe we can live with it.

I would recommend running all 6 of the tests in the same directory as "browser_flashblock_on_with_always_activate.js". Between them, they should do a pretty thorough job of making sure that all documents are classified correctly in all cases.

Thanks for doing this!
Flags: needinfo?(ksteuber)
I plan to run an A/B experiment (tracked by bug 1352224) on Nightly during next week (Apr 17 - 23), so if you could delay landing this until Apr 24 at least, to minimize the risk of this breaking or interfering with the experiment, that would be much appreciated!
(Reporter)

Comment 20

4 months ago
(In reply to Kirk Steuber [:bytesized] from comment #18)
> A few thoughts:
> 
> It seems like the Flash Classification is now being cached in two places:
> nsDocument::mFlashClassification and
> nsDocument::PrincipalFlashClassifier::mResult. I feel like this should not
> be necessary, but perhaps I am missing something?
> 

I think you are right. Only mFlashClassification needs to be cached actually.
nsDocument::PrincipalFlashClassifier::mResult works like an necessary 
intermediary result. Once we derived mFlashClassification from 
PrincipalFlashClassifier::mResult and the parent document's state, 
PrincipalFlashClassifier::mResult becomes unused at all.

> It isn't ideal that we are checking non-third-party documents against the
> third party list every time. That list will most likely have by far the most
> content to check against. However, I do not really have any suggestions for
> how to avoid this, so maybe we can live with it.
> 
> I would recommend running all 6 of the tests in the same directory as
> "browser_flashblock_on_with_always_activate.js". Between them, they should
> do a pretty thorough job of making sure that all documents are classified
> correctly in all cases.
> 

I just had the courage to run all the tests and all of them were PASSED!!

> Thanks for doing this!

No problem :)
(Reporter)

Comment 21

4 months ago
(In reply to :Felipe Gomes (needinfo me!) from comment #19)
> I plan to run an A/B experiment (tracked by bug 1352224) on Nightly during
> next week (Apr 17 - 23), so if you could delay landing this until Apr 24 at
> least, to minimize the risk of this breaking or interfering with the
> experiment, that would be much appreciated!

Sure and I need to land 1356211 first, which doesn't affect the flash block
feature at all because it's about the extension of an API that "flash block" is
not using.
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Reporter)

Updated

3 months ago
Attachment #8857311 - Flags: review?(ksteuber)
Attachment #8857311 - Flags: review?(amarchesini)
(Reporter)

Comment 24

3 months ago
Per offline discussion with baku, we decide to take the approach which mixes 
the async and sync API call: We use the async API to initialize the 
"matched list" on "StartDocumentLoad()" for further use [1]. 
If we need that "matched list" but the async initialization hasn't finished, 
we then use the sync API as the fallback.

baku, Kirk, could you please review from DOM and "flash block" perspective
respectively? I have verified the changes wouldn't screw up the test cases
under http://searchfox.org/mozilla-central/source/toolkit/components/url-classifier/tests/browser

Thanks!

[1] The classification result (a list of list where the given URI is on)
    has to be resolved with "IsThirdPary" taken into account

    http://searchfox.org/mozilla-central/rev/224cc663d54085994a4871ef464b7662e0721e83/dom/base/nsDocument.cpp#13432
 
    and IsThirdParty() relies on the parent document. If we directly resolve
    the "matched list" to the accept/reject result when the async API is 
    just notified, the parent document may have not been set properly. As
    a result, we only save the "matched list" and resolve to the actual result
    in the first use of PrincipalFlashClassifier::Result().
(Reporter)

Comment 25

3 months ago
Sorry I forgot to add the telemetry to measure how frequent the sync API is called.
Would you mind having the review first and I add the telemetry in the next round?
Comment on attachment 8857311 [details]
Bug 1345058 - Asynchronously decide if a flash document should be blocked.

https://reviewboard.mozilla.org/r/129274/#review141028

We need more comments about the workflow. Can you add a good description of how this object works just before its CTOR?

::: dom/base/nsDocument.h:1460
(Diff revision 7)
>    // The root of the doc tree in which this document is in. This is only
>    // non-null when this document is in fullscreen mode.
>    nsWeakPtr mFullscreenRoot;
>  
> +  // Classify the flash based on the principal.
> +  class PrincipalFlashClassifier final : public nsIURIClassifierCallback

move this in the cpp file, and leave here just:

class PrincipalFlashClassifier;

::: dom/base/nsDocument.cpp:2587
(Diff revision 7)
> +  // Perform a async flash classification based on the doc principal
> +  // in an early stage to reduce the blocking time.
> +  if (!mPrincipalFlashClassifier) {
> +    mPrincipalFlashClassifier = new PrincipalFlashClassifier();
> +  }
> +  mFlashClassification = FlashClassification::Unclassified;

This is the default value. You don't need to add this line.

::: dom/base/nsDocument.cpp:13233
(Diff revision 7)
>  }
>  
> -/**
> - * Helper function for |nsDocument::PrincipalFlashClassification|
> - *
> - * Takes an array of table names and a comma separated list of table names
> +namespace {
> +
> +// An object to store all preferences we need for flash blocking feature.
> +// TODO: Observe the change of string prefs.

you already do this, right?

::: dom/base/nsDocument.cpp:13433
(Diff revision 7)
> +{
> +  {
> +    nsCOMPtr<nsIPrincipal> principal = aPrincipal;
> +    nsCOMPtr<nsIURI> uri;
> +    principal->GetURI(getter_AddRefs(uri));
> +  }

why this?
Attachment #8857311 - Flags: review?(amarchesini) → review+
(Reporter)

Comment 27

3 months ago
Also, I saw a lot of crashes in the try results [1]. I guess they are all due to
same reason (maybe cycle collection or ... )

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=edaff84cd1d37d8d3f9271bcc35632d8638212fb
I'm still reviewing the patch, but I thought I would point out that it appears that the crashes in debug builds point to |mPrincipalFlashClassifier| being dereferenced while null in |nsDocument::PrincipalFlashClassification|. I guess it is possible for |nsDocument::PrincipalFlashClassification| to be called before |nsDocument::StartDocumentLoad|? I would *guess* that this happens when the determination is being made as to whether navigator.plugins should include Flash, but I have not looked into it far enough to be sure of this.

Details on the how navigator.plugins decides whether to include Flash can be found in Bug 1323064.

Comment 29

3 months ago
mozreview-review
Comment on attachment 8857311 [details]
Bug 1345058 - Asynchronously decide if a flash document should be blocked.

https://reviewboard.mozilla.org/r/129274/#review141274

I have some concerns with aspects of this patch. Some of them were brought up already by :baku. But by far the most concerning thing for me is the move back to comma-separated lists being returned by classification functions. I do not believe this patch properly addresses the needed changes that would require. I will address this in more detail in my individual comments.

::: dom/base/nsDocument.cpp:13340
(Diff revision 7)
> +               "mUriClassifier should be nonnull if mAsyncClassifying == true");
> +    // OnClassifyComplete hasn't been called. We have to unfortunately
> +    // do a sync call to get the mMatchedTables.
> +    nsTArray<nsCString> classificationResults;
> +    nsresult rv = mUriClassifier->ClassifyLocalWithTables(mClassificationURI,
> +                                                          mClassificationTables,

It appears that at this point `mClassificationTables` will contain the third-party lists even though it likely does not need them. This makes sense when we didn't know if the document was third-party or not (and therefore can't avoid this), but by the time this function is called we should know that.

The third-party lists will likely be the largest lists by far and we should really avoid classifying anything against them if it is at all avoidable.

::: dom/base/nsDocument.cpp:13342
(Diff revision 7)
> +    // do a sync call to get the mMatchedTables.
> +    nsTArray<nsCString> classificationResults;
> +    nsresult rv = mUriClassifier->ClassifyLocalWithTables(mClassificationURI,
> +                                                          mClassificationTables,
> +                                                          classificationResults);
> +    if (NS_SUCCEEDED(rv)) {

It seems like if classification failed, it is just ignored here. In the current implementation [1], that failure has consequences for the classification that do not seem to be preserved in this implementation.

[1] http://searchfox.org/mozilla-central/rev/7057a51c5cf29b5b115b1db19ace2cfe3fd83a0e/dom/base/nsDocument.cpp#13322

::: dom/base/nsDocument.cpp:13344
(Diff revision 7)
> +    nsresult rv = mUriClassifier->ClassifyLocalWithTables(mClassificationURI,
> +                                                          mClassificationTables,
> +                                                          classificationResults);
> +    if (NS_SUCCEEDED(rv)) {
> +      // Flatten the result array. Note that the return format is inconsistent
> +      // between the async and sync API.

I do not really like to see that we are moving back to a comma-separated string rather than an array. I believe I had good reasons when I moved away from this (Bug 1319571).

::: dom/base/nsDocument.cpp:13369
(Diff revision 7)
> +nsDocument::PrincipalFlashClassifier::OnClassifyComplete(nsresult /*aErrorCode*/,
> +                                                         const nsACString& aLists, // Only this matters.
> +                                                         const nsACString& /*aProvider*/,
> +                                                         const nsACString& /*aPrefix*/)
> +{
> +  if (FlashClassification::Unclassified != mResult) {

Is there any legitamate way this could happen? Perhaps this should be an assertion?

::: dom/base/nsDocument.cpp:13376
(Diff revision 7)
> +    return NS_OK;
> +  }
> +
> +  // We only populate the matched list without resolving the classification
> +  // result because we are not sure if the parent doc has been properly set.
> +  mMatchedTables = aLists;

A bit of a nit:

I keep trying to figure out exactly what function this will ultimately invoke, but I am not totally sure. The question is, does this perform an unnecessary copy of `aLists`? It appears that the underlying data could simply be taken or swapped rather than copied.

::: dom/base/nsDocument.cpp:13397
(Diff revision 7)
> +
> +  auto& prefs = GetPrefStore();
> +
> +  // Deny if on deny tables and not on exception tables.
> +  if (FindInReadable(mMatchedTables, prefs.mDenyTables) &&
> +      !FindInReadable(mMatchedTables, prefs.mDenyExceptionsTables)) {

I believe that this approach is wrong. You are searching for a comma-separated string in another comma-separated string. This will fall apart once there is more than one table stored in the pref (which there will be).

This is why I wanted to move away from comma-separated lists, and towards arrays. If we want to move back in the other direction, it will require that the array of matching tables be compiled, converted to a comma-separated string, and then separated back to individual table names in order to make these comparisons. I would really like to avoid that.

::: dom/base/nsDocument.cpp:13418
(Diff revision 7)
> +
> +void
> +nsDocument::PrincipalFlashClassifier::AsyncClassify(nsIPrincipal* aPrincipal)
> +{
> +  if (FlashClassification::Unclassified != mResult) {
> +    NS_WARNING("Previous AsyncClassify() hasn't finished.");

Is this actually what's going on here?

It seems like if `mResult` already has a classification, the issue is not that a previous `AsyncClassify()` call hasn't finished, it is that it was already called and completed and is now being called again for some reason.

Like line 13369, it seems like maybe this should be an assertion. Is there any legitamate way this could happen?

::: dom/base/nsDocument.cpp:13480
(Diff revision 7)
> -                                              results);
> +                                                   this);
>    if (NS_FAILED(rv)) {
>      if (rv == NS_ERROR_MALFORMED_URI) {
>        // This means that the URI had no hostname (ex: file://doc.html). In this
>        // case, we allow the default (Unknown plugin) behavior.
>        return FlashClassification::Unknown;

I didn't look too deeply into it, but it seems like the error I was looking for with this check will not be discovered until the Chrome process has had a chance to attempt to classify the URI.

So it seems like this possibility should be handled elsewhere, right? Like in `nsDocument::PrincipalFlashClassifier::OnClassifyComplete`?
Attachment #8857311 - Flags: review?(ksteuber) → review-
(Reporter)

Comment 30

3 months ago
mozreview-review-reply
Comment on attachment 8857311 [details]
Bug 1345058 - Asynchronously decide if a flash document should be blocked.

https://reviewboard.mozilla.org/r/129274/#review141274

Thanks for the detailed review!

As for the comma-separated lists, I might need your feedback that which approach I should take.
Please see my individual reply for the two possible approaches.

> It appears that at this point `mClassificationTables` will contain the third-party lists even though it likely does not need them. This makes sense when we didn't know if the document was third-party or not (and therefore can't avoid this), but by the time this function is called we should know that.
> 
> The third-party lists will likely be the largest lists by far and we should really avoid classifying anything against them if it is at all avoidable.

Totally makes sense. Thanks for pointing out this :)

> It seems like if classification failed, it is just ignored here. In the current implementation [1], that failure has consequences for the classification that do not seem to be preserved in this implementation.
> 
> [1] http://searchfox.org/mozilla-central/rev/7057a51c5cf29b5b115b1db19ace2cfe3fd83a0e/dom/base/nsDocument.cpp#13322

oops I should have noticed this :( sorry..

> Is there any legitamate way this could happen? Perhaps this should be an assertion?

It would happen if this callback is fired way after than Result() being called.

> A bit of a nit:
> 
> I keep trying to figure out exactly what function this will ultimately invoke, but I am not totally sure. The question is, does this perform an unnecessary copy of `aLists`? It appears that the underlying data could simply be taken or swapped rather than copied.

Oh you are right. We can just "move" aList to mMatchedList to reduce memory copy.

> I believe that this approach is wrong. You are searching for a comma-separated string in another comma-separated string. This will fall apart once there is more than one table stored in the pref (which there will be).
> 
> This is why I wanted to move away from comma-separated lists, and towards arrays. If we want to move back in the other direction, it will require that the array of matching tables be compiled, converted to a comma-separated string, and then separated back to individual table names in order to make these comparisons. I would really like to avoid that.

I was't aware that "there is more than one table stored in the pref" is possible.

In the very beginning I wanted to reuse the callback type (OnClassifyComplete)
and its semantics as much as possible. So I can either

1) Create a new callback interface for returning a string array or
2) Format the returned string with the edge case taken into account. 
   For example, for ["a,b,c", "b,c,d"] ==> we return "a,b,c;b,c,d" rather than "a,b,c,b,c,d"
   
What do you think?

> Is this actually what's going on here?
> 
> It seems like if `mResult` already has a classification, the issue is not that a previous `AsyncClassify()` call hasn't finished, it is that it was already called and completed and is now being called again for some reason.
> 
> Like line 13369, it seems like maybe this should be an assertion. Is there any legitamate way this could happen?

Line 13369 should happen and this indeed should to be an assertion to prevent
from trying to call AsyncClassify() before resetting first.

> I didn't look too deeply into it, but it seems like the error I was looking for with this check will not be discovered until the Chrome process has had a chance to attempt to classify the URI.
> 
> So it seems like this possibility should be handled elsewhere, right? Like in `nsDocument::PrincipalFlashClassifier::OnClassifyComplete`?

Yes. That kind of error is what I wasn't aware. Thank you for bring that up :)
(Reporter)

Comment 31

3 months ago
mozreview-review-reply
Comment on attachment 8857311 [details]
Bug 1345058 - Asynchronously decide if a flash document should be blocked.

https://reviewboard.mozilla.org/r/129274/#review141028

I will definitely add the workflow for PrincipalFlashClassifier. Thanks.

> you already do this, right?

I believe this is the legacy code from my old patch :p

> why this?

the legacy code again :( really sorry.
(Reporter)

Comment 32

3 months ago
(In reply to Henry Chang [:hchang] from comment #30)
> 
> > This is why I wanted to move away from comma-separated lists, and towards arrays. If we want to move back in the other direction, it will require that the array of matching tables be compiled, converted to a comma-separated string, and then separated back to individual table names in order to make these comparisons. I would really like to avoid that.
> 
> I was't aware that "there is more than one table stored in the pref" is
> possible.
> 
> In the very beginning I wanted to reuse the callback type
> (OnClassifyComplete)
> and its semantics as much as possible. So I can either
> 
> 1) Create a new callback interface for returning a string array or
> 2) Format the returned string with the edge case taken into account. 
>    For example, for ["a,b,c", "b,c,d"] ==> we return "a,b,c;b,c,d" rather
> than "a,b,c,b,c,d"
>    
  I think I got some misunderstanding here. The example should be
  matched list: [a, b, c, d, e] (represented by "a,b,c,d,e")
  table names: [b, c] (represented by "b,c")
  and to search if any of the elements from [a, b, c, d, e] is appeared in [b, c]

  So, my second approach should be converting "a,b,c,d,e" to [a,b,c,d,e]
  and do the same check as what the current code base is doing. 

> What do you think?
>
Comment hidden (mozreview-request)
(Reporter)

Comment 34

3 months ago
mozreview-review-reply
Comment on attachment 8857311 [details]
Bug 1345058 - Asynchronously decide if a flash document should be blocked.

https://reviewboard.mozilla.org/r/129274/#review141274

> I was't aware that "there is more than one table stored in the pref" is possible.
> 
> In the very beginning I wanted to reuse the callback type (OnClassifyComplete)
> and its semantics as much as possible. So I can either
> 
> 1) Create a new callback interface for returning a string array or
> 2) Format the returned string with the edge case taken into account. 
>    For example, for ["a,b,c", "b,c,d"] ==> we return "a,b,c;b,c,d" rather than "a,b,c,b,c,d"
>    
> What do you think?

The example is so wrong. It should be as stated here:

https://bugzilla.mozilla.org/show_bug.cgi?id=1345058#c32
(Reporter)

Comment 35

3 months ago
mozreview-review-reply
Comment on attachment 8857311 [details]
Bug 1345058 - Asynchronously decide if a flash document should be blocked.

https://reviewboard.mozilla.org/r/129274/#review141028

> This is the default value. You don't need to add this line.

I added this reset because StartDocumentLoad() could be called more than
once per document, shall it?
(Reporter)

Comment 36

3 months ago
Have submitted a new patch which

1) Addressed most of the review comments. 
   (For those comments which are not addressed with mentioned below)

2) Regarding the "comma-separated tables" issue, what I did to this revision is
   to keep the return format but parse it to an array of string then do
   the same check with ArrayContainsTable(). 

3) Moved nsDocument::PrincipalFlashClassifier to ::PrincipalFlashClassifier
   so that we can avoid having its declaration in nsDocument.h. Also
   comments are added to describe how it works.

4) Added a new function "GetClassificatioTables(bool aIsThirdParty)" for the case 
   where we need to call classifyLocalWithTables.

5) Moved the sanity check of nsURLClassifierDBService::AsyncClassifyLocalWithTables()
   to a little bit earlier so that we can catch the invalid input prior to
   IPC.

6) Fixed (probably) the crashes I've seen in the previous patch by creating
   mPrincipalFlashClassifier in the ctor because StartDocumentLoad() may not
   have been called.

Comment 37

3 months ago
mozreview-review
Comment on attachment 8857311 [details]
Bug 1345058 - Asynchronously decide if a flash document should be blocked.

https://reviewboard.mozilla.org/r/129274/#review141654

This is looking pretty good. Thanks for working on this!

::: dom/base/nsDocument.cpp:329
(Diff revision 8)
> +// 2) At any time you need the classification result, call Result()
> +//    and it is guaranteed to give you the result. Note that you have
> +//    to specify "aIsThirdParty" to the function so please make sure
> +//    you can already corretly decide if the document is third-party.
> +//
> +//    Behind the scence, the sync classification API

nit: scence -> scenes

::: dom/base/nsDocument.cpp:362
(Diff revision 8)
> +  mozilla::dom::FlashClassification AsyncClassifyInternal(nsIPrincipal* aPrincipal);
> +  nsCString GetClassificationTables(bool aIsThirdParty);
> +
> +  // For the fallback sync classification.
> +  nsCOMPtr<nsIURI> mClassificationURI;
> +  nsCString mClassificationTables;

It doesn't seem like `mClassificationTables` is used anymore. Can we remove it?

::: dom/base/nsDocument.cpp:2644
(Diff revision 8)
>    }
>  
> +  // Perform a async flash classification based on the doc principal
> +  // in an early stage to reduce the blocking time.
> +  mFlashClassification = FlashClassification::Unclassified;
> +  mPrincipalFlashClassifier->AsyncClassify(GetPrincipal());

Since there was previously an issue with `nsDocument::PrincipalFlashClassification` being called before `mPrincipalFlashClassifier` was set in `nsDocument:StartDocumentLoad`, I am guessing that it is possible that this line could be reached after `mResult` has been set synchronously. If this happens, `AsyncClassify()` will fail its assertion that `FlashClassification::Unclassified == mResult`.

If, on the other hand, the `nsDocument::PrincipalFlashClassification` crash was happening because `nsDocument:StartDocumentLoad` is just never called in some cases, then maybe this is ok as it is. I am just not sure what situation caused that crash to arise.

::: dom/base/nsDocument.cpp:13428
(Diff revision 8)
> +  if (mAsyncClassifying) {
> +    MOZ_ASSERT(mUriClassifier,
> +               "mUriClassifier should be nonnull if mAsyncClassifying == true");
> +    // OnClassifyComplete hasn't been called. We have to unfortunately
> +    // do a sync call to get the mMatchedTables.
> +    nsCString classificationTables = GetClassificationTables(aIsThirdParty);

Does this result in an unecessary string copy?

::: dom/base/nsDocument.cpp:13482
(Diff revision 8)
> +    }
> +    begin = iter;
> +    if (begin != end) {
> +      begin++;
> +    }
> +  }

I would still prefer not to have to split the string into an array here. Looking at the implementation of `nsUrlClassifierDBService::AsyncClassifyLocalWithTables`, I see that this data originally **was** in an array [1]. I don't like the idea of converting that array to a string, passing the data, and then converting it right back to an array.

I think my biggest problem is that I cannot find any reason why the conversion to string was ever performed in the first place. As far as I can tell, no consumer takes advantage of this format.

That being said, it appears that this conversion may be non-trivial. If necessary, I think it would be acceptable to use the implementation you have here and file a separate bug to get this fixed.

[1] http://searchfox.org/mozilla-central/rev/7057a51c5cf29b5b115b1db19ace2cfe3fd83a0e/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1919

::: dom/base/nsDocument.cpp:13554
(Diff revision 8)
>      return FlashClassification::Denied;
>    }
>  
> -  nsAutoCString allowTables, allowExceptionsTables,
> -                denyTables, denyExceptionsTables,
> -                subDocDenyTables, subDocDenyExceptionsTables,
> +  // We haven't been able to decide if it's a third party document
> +  // so just assume "yes" to take all tables into account.
> +  nsCString tables = GetClassificationTables(true);

Perhaps another unnecessary string copy?

Also, I would like the comment above to be a bit more clear. I think that it should convey *why* this function cannot determine whether the document is third-party and that `PrincipalFlashClassifier::Resolve` will be able to make that determination and ignore any third-party table matches if necessary.
Attachment #8857311 - Flags: review?(ksteuber) → review+
(Reporter)

Comment 38

3 months ago
mozreview-review-reply
Comment on attachment 8857311 [details]
Bug 1345058 - Asynchronously decide if a flash document should be blocked.

https://reviewboard.mozilla.org/r/129274/#review141654

Thanks for the review! 

For the comma-separated lists, I got your concern and the main reason I convert 
it to a string in the first place is to reuse the callback interface which takes 
a string as the the callback value. In other use cases, only the "emptiness" 
of that string matters so everything goes well until this one.

I will file a bug to try to create a new callback interface for AsyncClassifyLocalWithTables.

Thanks :)

> It doesn't seem like `mClassificationTables` is used anymore. Can we remove it?

Sorry I should've removed it :(

> Does this result in an unecessary string copy?

If the RVO (return value optimization) works properly, no string/object copy will be involved.
Comment hidden (mozreview-request)
(Reporter)

Comment 40

3 months ago
There seems to be a platform-specific issue [1] according to the try results.
test_refresh_firefox.py failed on all platforms except MacOS so I am trying
to reproduce on Linux. 

[1] https://treeherder.mozilla.org/#/jobs?repo=try&revision=07efe31617a4efd8aa345dec37766b6e1fb7ac54
(Reporter)

Comment 41

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=75ad5c3dfe78
Comment hidden (mozreview-request)

Comment 43

3 months ago
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s 88fca9f8c607 -d a1c4ad36a95f: rebasing 396373:88fca9f8c607 "Bug 1345058 - Asynchronously decide if a flash document should be blocked. r=baku,bytesized" (tip)
merging dom/base/nsDocument.cpp
merging dom/base/nsDocument.h
merging toolkit/components/url-classifier/nsUrlClassifierDBService.cpp
warning: conflicts while merging dom/base/nsDocument.cpp! (edit, then use 'hg resolve --mark')
unresolved conflicts (see hg resolve, then hg rebase --continue)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 46

3 months ago
Pushed by hchang@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/44635c073692
Asynchronously decide if a flash document should be blocked. r=baku,bytesized
Backed out on suspicion of causing intermittent xpcshell failures in toolkit/components/url-classifier/tests/unit/test_bug1274685_unowned_list.js on OS X:

https://hg.mozilla.org/integration/autoland/rev/939f52993a6bae4a040a77156c3fda0ce44d53a6

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&revision=44635c0736928ed71f080bec6d1e46778edc9cfe&filter-resultStatus=testfailed&filter-resultStatus=busted&filter-resultStatus=exception&filter-resultStatus=retry&filter-resultStatus=usercancel&filter-resultStatus=runnable
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=100401666&repo=autoland

03:47:08     INFO -  TEST-START | toolkit/components/url-classifier/tests/unit/test_bug1274685_unowned_list.js
03:52:08  WARNING -  TEST-UNEXPECTED-TIMEOUT | toolkit/components/url-classifier/tests/unit/test_bug1274685_unowned_list.js | Test timed out
03:52:08     INFO -  TEST-INFO took 300000ms
03:52:08     INFO -  >>>>>>>
03:52:08     INFO -  PID 6534 | [6534] WARNING: Couldn't get the user appdata directory. Crash events may not be produced.: file /home/worker/workspace/build/src/toolkit/crashreporter/nsExceptionHandler.cpp, line 2888
03:52:08     INFO -  (xpcshell/head.js) | test MAIN run_test pending (1)
03:52:08     INFO -  (xpcshell/head.js) | test MAIN run_test finished (1)
03:52:08     INFO -  exiting test
03:52:08     INFO -  PID 6534 | Couldn't convert chrome URL: chrome://branding/locale/brand.properties
03:52:08     INFO -  PID 6534 | [6534] WARNING: Could not get the program name for a cubeb stream.: 'NS_SUCCEEDED(rv)', file /home/worker/workspace/build/src/dom/media/CubebUtils.cpp, line 317
03:52:08     INFO -  "CONSOLE_MESSAGE: (info) No chrome package registered for chrome://branding/locale/brand.properties"
03:52:08     INFO -  <<<<<<<
Flags: needinfo?(hchang)
(Reporter)

Comment 48

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=aecaac2b808f
(Reporter)

Comment 49

3 months ago
I suspect it's due to bug 1363038 which is a known issue and the patch for this bug 
makes it make more easily. I'll just flag this one dependent on bug 1363038 first.
(Reporter)

Updated

3 months ago
Depends on: 1363038
Flags: needinfo?(hchang)
(Reporter)

Comment 50

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=fb54c504c79e
(Reporter)

Comment 51

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9f1fb8f51e78
(Reporter)

Comment 52

3 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=9fd62f73481a
(Reporter)

Comment 53

2 months ago
https://treeherder.mozilla.org/#/jobs?repo=try&revision=80fc28fed7ff
Henry, do we have progress on this? thanks.
Flags: needinfo?(hchang)
(Reporter)

Comment 55

2 months ago
No :( The patch has been landed weeks ago but backed out due to a test case failure
which I can't resolve until now. I have to resolve bug 1363038 first but I still
couldn't have a good solution after struggling with different failed approaches. 
Considering this bug would only matter when flash blocking is enabled, 
I decided to lower its priority and be working on other bugs. Will go back to
this one after I resolve 1355746 and 1348591.
Flags: needinfo?(hchang)
Blocks: 1331674
Hi Henry, sorry to bother you about this, but I was wondering if you're now actively working on this?  This is now showing up pretty high on the list of sync IPC messages that we should probably fix for 57...  Thanks!
(In reply to :Ehsan Akhgari (needinfo please, extremely long backlog) from comment #56)
> Hi Henry, sorry to bother you about this, but I was wondering if you're now
> actively working on this?  This is now showing up pretty high on the list of
> sync IPC messages that we should probably fix for 57...  Thanks!

Hi Ehsan,
Thanks for asking. Thomas will take over this bug.
Assignee: hchang → tnguyen
(Assignee)

Comment 58

4 days ago
I would like to fix bug 1363038 which could resolve the try failures first.
(Assignee)

Comment 59

3 days ago
Created attachment 8897798 [details] [diff] [review]
Asynchronously decide if a flash document should be blocked.

MozReview-Commit-ID: K91cSvIdOrb
(Assignee)

Comment 60

3 days ago
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #59)
> Created attachment 8897798 [details] [diff] [review]
> Asynchronously decide if a flash document should be blocked.
> 
> MozReview-Commit-ID: K91cSvIdOrb

Rebased and revised a bit to fix crash on wpt test, we still have to wait for bug 1363038 completes
You need to log in before you can comment on or make changes to this bug.