Closed Bug 1351472 Opened 3 years ago Closed 3 years ago

thousands of warnings about "Could not find prefix in PrefixSet during noise lookup"

Categories

(Toolkit :: Safe Browsing, defect, P3)

defect

Tracking

()

RESOLVED FIXED
mozilla55
Tracking Status
firefox55 --- fixed

People

(Reporter: jya, Assigned: tnguyen)

References

Details

(Whiteboard: #sbv4-m7)

Attachments

(1 file)

STR:
Go to www.bbc.co.uk

browse around, watch some videos.

The console shows a lot of warning such as:
[32845] WARNING: Could not find prefix in PrefixSet during noise lookup: file /Users/jyavenard/Work/Mozilla/mozilla-central/toolkit/components/url-classifier/Classifier.cpp, line 1398
[32845] WARNING: NS_ENSURE_SUCCESS(rv, rv) failed with result 0x80004005: file /Users/jyavenard/Work/Mozilla/mozilla-central/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp, line 354
Is that on Nightly 55? In a brand new profile or one that has been left running for a little while?
Flags: needinfo?(jyavenard)
Priority: -- → P3
Whiteboard: #sbv4-m6
This isn't nightly per say but a local central build (debug). Version 55

New profile
Flags: needinfo?(jyavenard)
It may be due to the fact that while we're now doing gethash calls on Nightly, we're not caching anything in the LookupCache, so there's no way to find noise entries. Once bug 1311935 has landed, we should only see the warning a few times, that is until the first gethash response gets cached.

Does my theory make sense Dimi?
Depends on: 1311935
Flags: needinfo?(dlee)
(In reply to François Marier [:francois] from comment #3)
> It may be due to the fact that while we're now doing gethash calls on
> Nightly, we're not caching anything in the LookupCache, so there's no way to
> find noise entries. Once bug 1311935 has landed, we should only see the
> warning a few times, that is until the first gethash response gets cached.
> 
> Does my theory make sense Dimi?

LookupCache contains both "update data"(PrefixSet) and "completions from gethash",
Before requesting gethash calls, we will try to find noise entries in PrefixSet, not from cache,
so this shouldn't related to caching.

Because right now we only find noise entries in 4-bytes PrefixSet, but in V4, PrefixSet may
contain variable-length prefixes, so a non-buggy scenario is that all prefixes in PrefixSet
are not 4-bytes. In this case, we may find a match in database, read noise entries from
Fixed-Length PrefixSet but no entry could be found. But I don't think this will be the case.

Hi Hean-Yves,
Since this bug is found in a local build, I think it will be useful if you can help upload
the safebrowsing database, it can be found in PROFILE_FOLDER/tmp/scratch_user/safebrowsing/

Can you help check it? Thanks!
No longer depends on: 1311935
Flags: needinfo?(dlee) → needinfo?(jyavenard)
(In reply to Dimi Lee[:dimi][:dlee] from comment #4)
> Since this bug is found in a local build, I think it will be useful if you
> can help upload
> the safebrowsing database, it can be found in
> PROFILE_FOLDER/tmp/scratch_user/safebrowsing/

There's no such folder in the profile directory.

    $ find ~/Library/Application\ Support/Firefox -name safebrowsing
    $
Flags: needinfo?(jyavenard)
(In reply to Jean-Yves Avenard [:jya] from comment #5)
> (In reply to Dimi Lee[:dimi][:dlee] from comment #4)
> > Since this bug is found in a local build, I think it will be useful if you
> > can help upload
> > the safebrowsing database, it can be found in
> > PROFILE_FOLDER/tmp/scratch_user/safebrowsing/
> 
> There's no such folder in the profile directory.
> 
>     $ find ~/Library/Application\ Support/Firefox -name safebrowsing
>     $

Sorry i thought you are working on a developer build on linux.
Could you help check the following path again? Thanks!

~/Library/Caches/Firefox/Profiles/xxx.default/safebrowsing
xxx might be something like iqmgtpms, it's different on each machine.
sorry I forget to needinfo
Flags: needinfo?(jyavenard)
Whiteboard: #sbv4-m6 → #sbv4-m7
I created a new profile, and I can reproduce it again, everytime I use a page that plays data...
It's 9MB long, I've put it there:

https://people-mozilla.org/~jyavenard/safebrowsing.tbz
Flags: needinfo?(jyavenard)
Well, I think the warning occurs when the Url hits Tracking list (e.g base-track-digest256)

The list only contains 32 bytes and the prefix_set should be empty. We dont really need to send noise for such kind of these matchings.
I think we just filter out unnecessary list before adding noise (the list should be different from tracking protect list), or just filter out "google" provider (I guess that we only need to send noise to google server).
Hi Francois, do you have any idea about that?
Flags: needinfo?(francois)
(In reply to Thomas Nguyen[:tnguyen] ni plz from comment #9)
> Well, I think the warning occurs when the Url hits Tracking list (e.g
> base-track-digest256)

Are we generating noise entries for lists that are in urlclassifier.disallow_completions?

If so, then that's certainly unnecessary and we should fix that.

> The list only contains 32 bytes and the prefix_set should be empty. We dont
> really need to send noise for such kind of these matchings.

The noise value should not be specific to the provider. We want the same privacy protection regardless of which server our users are talking to.
Flags: needinfo?(francois)
(In reply to François Marier [:francois] from comment #10)
> (In reply to Thomas Nguyen[:tnguyen] ni plz from comment #9)
> > Well, I think the warning occurs when the Url hits Tracking list (e.g
> > base-track-digest256)
> 
> Are we generating noise entries for lists that are in
> urlclassifier.disallow_completions?
> 
> If so, then that's certainly unnecessary and we should fix that.
> 
Thanks Francois, 
I see that the base-track-digest256 is in the disallow list, we should do that way :)
Assignee: nobody → tnguyen
Status: NEW → ASSIGNED
Comment on attachment 8862746 [details]
Bug 1351472 - Skip AddNoise if the table is unknown or disallowed to getHash

https://reviewboard.mozilla.org/r/134608/#review137802

I would be curious to know what the root cause of this issue is. Why are we calling `AddNoise()` for tables that are supposed to have completions disabled?

Since noise is only added when we do completions, it seems like we are attempting to do completions when we shouldn't and I'd rather fix the problem earlier.
Attachment #8862746 - Flags: review?(francois) → review-
Comment on attachment 8862746 [details]
Bug 1351472 - Skip AddNoise if the table is unknown or disallowed to getHash

https://reviewboard.mozilla.org/r/134608/#review137802

I think we don't have to worry about attempting to do completions of invalid table.
That is only a warning when we add a noise to completes array[1] before about going to getHash [2]
[1] https://dxr.mozilla.org/mozilla-central/rev/1c6fa11d1fed893b00d94b6a899c307262674613/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#283
[2] https://dxr.mozilla.org/mozilla-central/rev/1c6fa11d1fed893b00d94b6a899c307262674613/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#291

Indeed we never do gethash for invalid table, you will see the flow in  [3], [4], [5]
[3]https://dxr.mozilla.org/mozilla-central/rev/1c6fa11d1fed893b00d94b6a899c307262674613/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1076
[4]https://dxr.mozilla.org/mozilla-central/rev/1c6fa11d1fed893b00d94b6a899c307262674613/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#1100
[5]https://dxr.mozilla.org/mozilla-central/rev/1c6fa11d1fed893b00d94b6a899c307262674613/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#2264

Another approach is that we could early check of the table before going to lookupcomplete, somewhere in
[6]https://dxr.mozilla.org/mozilla-central/rev/1c6fa11d1fed893b00d94b6a899c307262674613/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#2264
Comment on attachment 8862746 [details]
Bug 1351472 - Skip AddNoise if the table is unknown or disallowed to getHash

https://reviewboard.mozilla.org/r/134608/#review137802

> Indeed we never do gethash for invalid table, you will see the flow in  [3], [4], [5]

Good, I'm glad we don't have a bug there :) Thanks for confirming.

Looking at this block:
https://dxr.mozilla.org/mozilla-central/rev/1c6fa11d1fed893b00d94b6a899c307262674613/toolkit/components/url-classifier/nsUrlClassifierDBService.cpp#279-287

It appears that we only add noise once, for the first result that's not confirmed.

This means that if we return early in `AddNoise()`, we won't add any noise at all, regardless of the other results. That's true of the existing code too. If we have two results, one from `base-track-digest256` and one from `goog-phish-shavar`, then we'll skip adding noise because `base-track-digest256` in is `disable_completions` but then we'll be doing a completion with Google on `goog-phish-shavar` without adding any noise.

I think we can fix this by changing the above if statement to:

```
if (!completes->ElementAt(i).Confirmed() && notInDisableCompletions && hasGethashUrl) {
    AddNoise(...);
    break;
}
```

In other words, moving the check you added to `AddNoise()` directly into that `for` loop.
Thanks you Francois
I just skipped hasGethashUrl idea because we may have to find a way to get gethashUrl. We may get the url from listmanager (by runnable/callback dispatch between main thread / worker thread) or directly from preference. Both of them make the code more complex but meaningless. We just fix the warning for addnoise and we don't send the getHash request when we should not
Comment on attachment 8862746 [details]
Bug 1351472 - Skip AddNoise if the table is unknown or disallowed to getHash

https://reviewboard.mozilla.org/r/134608/#review139426

That looks great. Just a few nits.

::: toolkit/components/url-classifier/nsUrlClassifierDBService.h:103
(Diff revision 2)
>    NS_DECL_THREADSAFE_ISUPPORTS
>    NS_DECL_NSIURLCLASSIFIERDBSERVICE
>    NS_DECL_NSIURICLASSIFIER
>    NS_DECL_NSIOBSERVER
>  
> +  bool CanGethash(const nsACString &tableName);

nit: I would suggest `CanComplete()`.

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:2257
(Diff revision 2)
>  
>    return mWorkerProxy->CacheMisses(results);
>  }
>  
>  bool
> +nsUrlClassifierDBService::CanGethash(const nsACString &aTableName)

We could simplify that further:

    return mGethashTables.Contains(aTableName) && !mDisallowCompletionsTables.Contains(aTableName);

::: toolkit/components/url-classifier/nsUrlClassifierDBService.cpp:2259
(Diff revision 2)
>  }
>  
>  bool
> +nsUrlClassifierDBService::CanGethash(const nsACString &aTableName)
> +{
> +  // If we don't know about this table at all, or are disallowing completions

nit: we may as well remove the comment since the check is pretty straightforward
Attachment #8862746 - Flags: review?(francois)
Thanks Francois for your review :)
Comment on attachment 8862746 [details]
Bug 1351472 - Skip AddNoise if the table is unknown or disallowed to getHash

https://reviewboard.mozilla.org/r/134608/#review139442
Attachment #8862746 - Flags: review?(francois) → review+
Added mWorker = nullptr after we shutdown firefox to avoid leak (seems we missed that) 
Try result:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=2caa6f8697b15ef4e3a09e297404bac9c0936446&selectedJob=96801464
Keywords: checkin-needed
Pushed by ihsiao@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/6afd88f235fe
Skip AddNoise if the table is unknown or disallowed to getHash r=francois
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/6afd88f235fe
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla55
You need to log in before you can comment on or make changes to this bug.