mozilla::safebrowsing::LookupCacheV4::VerifyChecksum consumes >400ms of CPU during startup

RESOLVED FIXED in Firefox 67

Status

()

enhancement
P1
normal
RESOLVED FIXED
2 years ago
23 days ago

People

(Reporter: gps, Assigned: dimi)

Tracking

(Blocks 2 bugs, {perf})

unspecified
mozilla67
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox67 fixed)

Details

(Whiteboard: [qf:p1:pageload])

Attachments

(7 attachments)

Reporter

Description

2 years ago
When profiling Firefox startup, mozilla::safebrowsing::LookupCacheV4::VerifyChecksum was in the stack a lot - over 100ms on an i7-6700K. Although it wasn't in the main thread, it seems to be using critical CPU very early in app startup.

Furthermore, it appears it is using SHA-256 to perform verification. I question whether we need to use a cryptographic hash for file integrity verification [during startup]. I understand needing to verify the integrity of the safe browsing database when it is downloaded. But for a simple integrity check when loading a file, I would think a non-cryptographic hash like xxHash (which is better than CRC32) would be sufficient. I think xxHash is over a magnitude faster than SHA-256. So using a cryptographic hash here might just be flat out wasting CPU cycles [that could be used by other Firefox threads during startup].
Assignee

Updated

2 years ago
Assignee: nobody → dlee
Status: NEW → ASSIGNED

Updated

2 years ago
Assignee: dlee → nobody
Status: ASSIGNED → NEW
Assignee

Updated

2 years ago
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Assignee

Comment 1

2 years ago
The reason safebrowsing uses SHA-256 is because google uses it[1], we will receive the SHA-256 checksum from google while doing an update. We also use the checksum while saving update data to file so we don't need to calculate checksum again.

We can use faster hash algorithm when store update data to file just like what we have already done in v2[2]. The downside of this approach is we need to calculate checksum again while saving data.

HI francois,
do you have any suggestion?

[1] https://developers.google.com/safe-browsing/v4/reference/rest/v4/threatListUpdates/fetch#checksum
[2] https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/HashStore.cpp#380 
    https://dxr.mozilla.org/mozilla-central/source/toolkit/components/url-classifier/HashStore.cpp#963
Flags: needinfo?(francois)
I don't have enough information to be confident about the priority of this optimization. Is 100ms on a background thread a big deal?

We could keep both a hash and a CRC-like thing around, but a DXR search for xxhash came out empty. It's probably overkill to add a new hash algorithm to mozilla-central just for this optimization when we already have in-tree CRC32 implementations.

Alternatively, do we have a way to mark this background thread as low priority so that it doesn't steal CPU cycles from a more important one?
Flags: needinfo?(francois) → needinfo?(gps)
Whiteboard: #sbv4-m8
Priority: -- → P3
Reporter

Comment 3

2 years ago
I don't know enough about Firefox internals to answer questions in the needinfo. My gut instinct is that this CPU so early during startup is wasteful.

xxHash is a relatively new hashing function. It is very small and very fast (much faster than CRC32) and it has sufficient "randomness" for content verification. It is used by lz4 and zstd, for example. But if we don't want to bloat scope to include it in Gecko, CRC32 should be acceptable (it is certainly faster than SHA-256). It is relatively slow compared to xxhash. But perfect is the enemy of good.
Flags: needinfo?(gps)
Assignee

Comment 4

2 years ago
not actively work on this.
Assignee: dlee → nobody
Status: ASSIGNED → NEW

Comment 5

2 years ago
Feels like this is something we should track as part of photon-perf or quantum flow. Florian, can you help get the right eyes on this?
Flags: needinfo?(florian)
Not really photon-perf, sounds more like Quantum Flow, although I'm not sure how much quantum flow cares about things that are slow but not on the main thread, ehsan?
QA Whiteboard: [qf]
Flags: needinfo?(florian) → needinfo?(ehsan)

Comment 7

2 years ago
Running non-low-priority CPU intensive work during startup directly competes with CPU intensive work that we need to do (mostly for hot startup and later stages of cold startups).  Ideally we should defer this work if possible.  Please note that we do spawn a fair number of threads pretty early on so it's not just the main thread that this competes with.

This is a hard sell for [qf:p1] but we should *really* try to fix it if possible...
QA Whiteboard: [qf] → [qf:p3]
Flags: needinfo?(ehsan)
Assignee

Comment 8

2 years ago
Since this is related to Quantum Flow, I'll work on it.
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Whiteboard: #sbv4-m8 → #sbv4-m9
Assignee

Comment 9

2 years ago
I'll take PTO during M9, remove myself from assignee.
Assignee: dlee → nobody
Status: ASSIGNED → NEW
Whiteboard: #sbv4-m9
I have an imported perf profile that (currently) shows this validation takes 400+ms on a fast dual-xeon desktop (compiled with --enable-release, local inbound pull build).  This (especially as it blocks initial page load, and chews a core when we're trying to initialize) would seem to me be a qf:p1 now at least).  Note also that 17% or so is spent in GetSmallestPrefix(), and 3% in GetPrefixes()

https://perfht.ml/2N0gCVD

I'll do some more profiles to verify.  CRC32 should be at least 50% better, perhaps more.
QA Whiteboard: [qf:p3] → [qf]

Another current profile:
https://perfht.ml/2DK57OC
400+ms on a local mozilla-inbound opt --enable-release build

Note that from IRC conversations, I believe verifying the safebrowsing database blocks actually browsing, so this is important - and adds a bunch of CPU and likely time to startup. 400ms on a 3.8GHz xeon is a lot on something less powerful, like arm64.

Summary: mozilla::safebrowsing::LookupCacheV4::VerifyChecksum consumes >100ms of CPU during startup → mozilla::safebrowsing::LookupCacheV4::VerifyChecksum consumes >400ms of CPU during startup
QA Whiteboard: [qf]
Whiteboard: [qf]

Updated

4 months ago
Whiteboard: [qf] → [qf:p2:responsiveness]

Updated

4 months ago
Whiteboard: [qf:p2:responsiveness] → [qf:p1:pageload]
Assignee

Updated

4 months ago
Assignee: nobody → dlee
Status: NEW → ASSIGNED
Assignee

Updated

4 months ago
Priority: P3 → P1

On the 2018 reference device, I captured a profile where we're spending about 8 seconds opening the safebrowsing database:

http://bit.ly/2XiQU47

Are we calculating the checksum in this case? Is the work going on in this bug likely to help the issue in this profile?

Flags: needinfo?(dlee)
Assignee

Comment 14

4 months ago

(In reply to Mike Conley (:mconley) (:⚙️) from comment #13)

On the 2018 reference device, I captured a profile where we're spending about 8 seconds opening the safebrowsing database:

http://bit.ly/2XiQU47

Are we calculating the checksum in this case? Is the work going on in this bug likely to help the issue in this profile?

Yes, this is the case. The fix I am going to work here should help :)

Flags: needinfo?(dlee)

Note that in that profile, only ~25% is verifying the hash - so even making that 0-time will only cut ~2 seconds from the startup, or less. See http://bit.ly/2E3vvU3

26% in nsresult mozilla::safebrowsing::LookupCacheV4::VerifyChecksum, 22% is LoadMetadata, and 20% is VariableLengthPrefixSet::LoadFromFile, 16% in nsUrlClassifierPrefixSet::LoadFromFile, and 15% in mozilla::safebrowsing::HashStore::Open (from two different paths)

A fair bit of those times seem to be in opening files

Assignee

Comment 16

4 months ago

(In reply to Randell Jesup [:jesup] from comment #15)

Note that in that profile, only ~25% is verifying the hash - so even making that 0-time will only cut ~2 seconds from the startup, or less. See http://bit.ly/2E3vvU3

26% in nsresult mozilla::safebrowsing::LookupCacheV4::VerifyChecksum, 22% is LoadMetadata, and 20% is VariableLengthPrefixSet::LoadFromFile, 16% in nsUrlClassifierPrefixSet::LoadFromFile, and 15% in mozilla::safebrowsing::HashStore::Open (from two different paths)

A fair bit of those times seem to be in opening files

The LoadMetadata should be removed during startup within the same patch I am working on.
The VariableLengthPrefixSet::LoadFromFile and nsUrlClassifierPrefixSet::LoadFromFile is required.

As for HashStore::Open, I may spend some time to see if this can be skipped after the work here is done.

Assignee

Comment 17

4 months ago

SafeBrowsing V4 protocol use SHA-256 as the checksum to check integrity
of update data and also the integrity of prefix files.

SafeBrowsing V2 HashStore use MD5 as the checksum to check integrity of
.sbstore

Since we are going to use CRC32 as the integrity check of V4 prefix files,
I think rename V4 "checksum" to SHA256 can improve readability.

Assignee

Comment 18

4 months ago

SHA256 is an expensive operation, we should avoid using them if
possible. SafeBrowsing prefix files are loaded during startup and
verify integrity with SHA256 which may affect the performance
especially on the low-end device.

This patch simply removes the SHA256 integrity check. CRC32 version
integrity check will be introduced in the other patch.

This patch also changes the behavior of recording
"Telemetry::URLCLASSIFIER_VLPS_LOAD_CORRUPT" a little bit.
It used to records only once per session(during startup, the first
time we load prefix set), now it records per update.

Assignee

Comment 19

4 months ago

SafeBrowsing prefix files LOAD/SAVE operations are handled in xxxPrefixSet.cpp.
It would be more clear if xxxPrefixSet.cpp only processes prefix data,
while LookupCacheV2/LookupCacheV4 which use prefix set process file.

This patch doesn't change any behavior, testcases need to update because
the LookupCache & xxxPrefixSet APIs are changed.

Assignee

Comment 20

4 months ago

After this patch, we may have the following files in SafeBrowsing
directory:

  • (v2) .sbstore : Store V2 chunkdata, for update, MD5 integrity check
    while load
  • (v2) .pset : Store V2 prefixset, for lookup, load upon startup, no
    integrity check
  • (v4) .metadata : Store V4 state, for update, no integrity check
  • (v4) .vlpset : Store V4 prefixset, for lookup, load upon startup,
    CRC32 integrity check
  • (v4) .pset : V4 prefix set before this patch, should be removed

The magic string is also added to ".vlpset" header so we can add
a telemetry to see if sanity check is good enough for prefix set
integrity check (The telemetry is not yet added). If yes, we can remove
the CRC32 in the future for even better performance.

Assignee

Comment 21

4 months ago

This patch is to cleanup old SafeBrowsing v4 prefix files.

Assignee

Updated

4 months ago
Blocks: 1531095
Assignee

Updated

4 months ago
See Also: → 1531354
Assignee

Comment 22

3 months ago

To avoid forcing a redownload of SafeBrowsing v4 list.

Comment 23

3 months ago
Pushed by dlee@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/3d326cfcd002
P1. Rename checksum used in SafeBrowsing V4 to SHA256. r=gcp
https://hg.mozilla.org/integration/autoland/rev/6b8412db5a05
P2. Do not use SHA-256 while loading the V4 prefix files. r=gcp
https://hg.mozilla.org/integration/autoland/rev/bc0b91abce9b
P3. Separate file processing and prefix data processing for SafeBrowsing prefix set. r=gcp
https://hg.mozilla.org/integration/autoland/rev/4978556a66f6
P4. Add header and CRC32 checksum to SafeBrowsing V4 prefix files. r=gcp
https://hg.mozilla.org/integration/autoland/rev/f1f29fe519cf
P5. Remove old v4 prefix files after new files are stored. r=gcp
https://hg.mozilla.org/integration/autoland/rev/71dafccc22ae
P6. Load the old prefixset(.pset) when there is no .vlpset. r=gcp

Comment 26

3 months ago
Pushed by dlee@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c2331373e107
P1. Rename checksum used in SafeBrowsing V4 to SHA256. r=gcp
https://hg.mozilla.org/integration/autoland/rev/c7a253aed450
P2. Do not use SHA-256 while loading the V4 prefix files. r=gcp
https://hg.mozilla.org/integration/autoland/rev/e083106dc24f
P3. Separate file processing and prefix data processing for SafeBrowsing prefix set. r=gcp
https://hg.mozilla.org/integration/autoland/rev/3b5da75b9c7b
P4. Add header and CRC32 checksum to SafeBrowsing V4 prefix files. r=gcp
https://hg.mozilla.org/integration/autoland/rev/aaba7c25b72b
P5. Remove old v4 prefix files after new files are stored. r=gcp
https://hg.mozilla.org/integration/autoland/rev/a87bd3f9b87d
P6. Load the old prefixset(.pset) when there is no .vlpset. r=gcp
https://hg.mozilla.org/integration/autoland/rev/9bac094cec23
P7. Add gtest to ensure .pset is correctly loaded and removed. r=gcp
Assignee

Updated

3 months ago
Blocks: 1533884
Assignee

Updated

3 months ago
Flags: needinfo?(dlee)

Thanks Dimi!

What remains to be done from comment 15:
"22% is LoadMetadata, and 20% is VariableLengthPrefixSet::LoadFromFile, 16% in nsUrlClassifierPrefixSet::LoadFromFile, and 15% in mozilla::safebrowsing::HashStore::Open (from two different paths)

A fair bit of those times seem to be in opening files"

mconley - now that this landed, can you re-profile? thanks!

Flags: needinfo?(mconley)
Flags: needinfo?(dlee)
Assignee

Comment 29

3 months ago

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #28)

What remains to be done from comment 15:
"22% is LoadMetadata, and 20% is VariableLengthPrefixSet::LoadFromFile, 16% in nsUrlClassifierPrefixSet::LoadFromFile, and 15% in mozilla::safebrowsing::HashStore::Open (from two different paths)

This patch should also remove the "LoadMetadata", so the remains are:

  1. VariableLengthPrefixSet::LoadFromFile
  2. nsUrlClassifierPrefixSet::LoadFromFile
  3. mozilla::safebrowsing::HashStore::Open

(1) and (2) are necessary otherwise users will be unprotected.
I filed bug 1531354 to check if we can skip reading the HashStore during startup. I will check this next week.

Flags: needinfo?(dlee)

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #28)

mconley - now that this landed, can you re-profile? thanks!

Done: http://bit.ly/2TVnOtE

We're still taking a very long time getting that database off of the disk...

Flags: needinfo?(mconley)

~5.1 seconds in RegenActiveTables()(don't trust "Running time", the sample rate is >1ms and inconsistent at times). Split between:

  • mozilla::safebrowsing::Classifier::GetLookupCache
  • mozilla::safebrowsing::HashStore::Open
    Then 900ms in GetTables()->TableRequest()->mozilla::safebrowsing::HashStore::Open()
    Then 700ms in mozilla::safebrowsing::Classifier::LoadMetadata()

a few seconds later there's ~200ms in mozilla::safebrowsing::Classifier::AsyncApplyUpdates

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #31)

~5.1 seconds in RegenActiveTables()(don't trust "Running time", the sample rate is >1ms and inconsistent at times).

This is not CPU time, 98% of these samples are the thread being blocked in nsLocalFile::OpenNSPRFileDesc.

~5.1 seconds in RegenActiveTables()(don't trust "Running time", the sample rate is >1ms and inconsistent at times).

This is not CPU time, 98% of these samples are the thread being blocked in nsLocalFile::OpenNSPRFileDesc.

Yup - though not one call to open file from the stacks - this calls GetLookupCache()->LookupCacheV2::Open()->HashStore::Open() and directly calls HashStore::Open(), and cycles back and forth between them repeatedly (I presume for N different sets of files of safebrowsing data).

Some way to avoid opening N files and instead open 1 or two files would be a Big Win here I suspect. It may complicate updates.

Assignee

Comment 34

3 months ago

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #31)

~5.1 seconds in RegenActiveTables()(don't trust "Running time", the sample rate is >1ms and inconsistent at times). Split between:

  • mozilla::safebrowsing::Classifier::GetLookupCache
  • mozilla::safebrowsing::HashStore::Open
    Then 900ms in GetTables()->TableRequest()->mozilla::safebrowsing::HashStore::Open()
    Then 700ms in mozilla::safebrowsing::Classifier::LoadMetadata()

a few seconds later there's ~200ms in mozilla::safebrowsing::Classifier::AsyncApplyUpdates

I think some of those file read can be skipped, I'll work on it in Bug 1531354

Assignee

Updated

23 days ago
Regressions: 1553963
You need to log in before you can comment on or make changes to this bug.