<a class="header-button" href="https://bugzilla.mozilla.org/home" title="Go to home page"> Bugzilla

Jeff Muizelaar [:jrmuizel]

Comment 3

•

10 months ago

(In reply to Markus Stange [:mstange] from comment #0)

The other half of the time is spent copying the file contents into memory a second time, and computing the CRC32 on those contents.

Is the CRC32 verification necessary?

If my understanding is correct, we originally implemented CRC32 verification to avoid cases where, if Firefox crashes while updating or saving SafeBrowsing data, we can ensure the file will be re-downloaded at the next startup.
Currently, we don't have telemetry to know how common this problem is. I think we should add telemetry for it, but until we have that data, keeping the CRC32 verification might be the safer choice.

Does it need to happen on every startup? On Android, users switch apps frequently so it's common that Firefox gets killed in the background and has to be restarted when the user switches back to it.

@Tim and i are thinking using a faster algorithm, xxhash for example. Do you think changing the algorithm would improve the performance?

Flags: needinfo?(dlee) → needinfo?(mstange.moz)

Comment 4

•

10 months ago

How big is the file?

Jeff Muizelaar [:jrmuizel]

Comment 5

•

10 months ago

An easy win would be to just the ARM64 CRC instructions: https://github.com/google/crc32c/blob/main/src/crc32c_arm64.cc

Comment 6

•

10 months ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)

How big is the file?

There are multiple files. The largest one is around ~10 MB, and the others are around a few hundred KB each. There are about 20 files in total.

Comment 7

•

10 months ago

(In reply to Jeff Muizelaar [:jrmuizel] from comment #5)

An easy win would be to just the ARM64 CRC instructions: https://github.com/google/crc32c/blob/main/src/crc32c_arm64.cc

This is a great suggestion, we will move forward with this option.

Assignee: nobody → tihuang

Status: NEW → ASSIGNED

Flags: needinfo?(mstange.moz)

Markus Stange [:mstange]

Reporter

Comment 8

•

10 months ago

Sure, faster CRC32 is something that may help a bit but it won't fix this bug. I think it should be handled in a new dependent bug.

I haven't seen a response to the first question yet. Can we change the in-memory representation so that it's one big allocation rather than many small allocations?

(In reply to Dimi Lee [:dimi] from comment #3)

Is the CRC32 verification necessary?

If my understanding is correct, we originally implemented CRC32 verification to avoid cases where, if Firefox crashes while updating or saving SafeBrowsing data, we can ensure the file will be re-downloaded at the next startup.

Isn't this problem usually avoided by writing out to a different filename and then moving the file to the final spot once all the data has been written? Or if we really want the CRC32 check, could we do it once after we change the file, rather than on every startup when we should be able to know that the file hasn't changed since the last check?

Comment 9

•

10 months ago

(In reply to Markus Stange [:mstange] from comment #8)

Sure, faster CRC32 is something that may help a bit but it won't fix this bug. I think it should be handled in a new dependent bug.

I haven't seen a response to the first question yet. Can we change the in-memory representation so that it's one big allocation rather than many small allocations?

sorry i forgot to answer this one!
So we intentionally change the in-memory representation from one big-array to array of array in Bug 1046038.
And i did update the algorithm so we only use array of array for tables with a lot of prefixes in this patch.

From the comment in Bug 1046037, we had OOM issue with 1-2M allocations (11 years ago). The entries we have is even more than 11 years ago. Taking phishing table for example, the table has around 1M entries, so we would probably need 4MB allocation with a flat array. Will that be a problem on mobile?
We might also test whether one flat array does improve the performance by setting this pref to 2 * 1024 * 1024 on mobile.

Isn't this problem usually avoided by writing out to a different filename and then moving the file to the final spot once all the data has been written? Or if we really want the CRC32 check, could we do it once after we change the file, rather than on every startup when we should be able to know that the file hasn't changed since the last check?

yes, you are right. I misread those bugs we added/updated the checksum implementation :(
We indeed move the update files to the final spot when update is done to address the crash during update problem. The checksum was implemented when we move the storage from sqlitedb to file 13 years ago and it was to prevent file corruption.

So to answer this question again - "Is the CRC32 verification necessary? Does it need to happen on every startup?"
I'd prefer not getting rid of the integrity check entirely, to avoid the case where a really unlucky user gets a bad file unnoticed and we’re never able to recover it for them. But I do think we can avoid doing the integrity check during every startup. Here's what I'd suggest:
Since SafeBrowsing updates every 30 or 45 minutes, I think we can do the integrity check only the first time we update in a session - meaning that even if the file is corrupted, it only stays corrupted for 45 minutes. I think the chance that users are not warned about a phishing site, or that a safe site is marked as phishing due to file corruption, is fairly low.

Updated

•

9 months ago

Severity: -- → S3

Priority: -- → P2

Assignee

Comment 10

•

9 months ago

Attached file Bug 1956920 - Delay the CRC32 verification on .vlp files to the list update. r?dimi! — Details

To improve the startup performance, we delay the CRC32 verfication to
the time before we send the list update to the SafeBrowsing server.

Pulsebot

Comment 11

•

9 months ago

Pushed by tihuang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/655eb9151006 Delay the CRC32 verification on .vlp files to the list update. r=dimi

Pulsebot

Comment 12

•

9 months ago

Pushed by imoraru@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/f1ee6f4335a8 Revert "Bug 1956920 - Delay the CRC32 verification on .vlp files to the list update. r=dimi" for causing multiple failures.

Iulian Moraru

Comment 13

•

9 months ago

Revert "Bug 1956920 - Delay the CRC32 verification on .vlp files to the list update. r=dimi" for causing multiple failures.

Flags: needinfo?(tihuang)

Pulsebot

Comment 14

•

9 months ago

Pushed by tihuang@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/b74768931894 Delay the CRC32 verification on .vlp files to the list update. r=dimi

https://hg.mozilla.org/mozilla-central/rev/b74768931894

Assignee

Updated

•

9 months ago

Flags: needinfo?(tihuang)

agoloman

Comment 15

•

9 months ago

bugherder

Status: ASSIGNED → RESOLVED

Closed: 9 months ago

status-firefox140: --- → fixed

Resolution: --- → FIXED

Target Milestone: --- → 140 Branch

Camelia Badau [:cbadau], Desktop Test Engineering

Assignee

Comment 16

•

9 months ago

Hi Markus,

In this bug, we introduced a pref urlclassifier.delay_prefixes_crc32_check to delay the CRC32 check. The delay is currently disabled by the pref. Would you be able to test if enabling this pref will improve the performance? Thanks.

Flags: needinfo?(mstange.moz)

Updated

•

9 months ago

QA Whiteboard: [qa-triage-done-c141/b140]

Comment 17

•

8 months ago

ni myself to measure the impact of delaying the crc32_check, urlclassifier.delay_prefixes_crc32_check

Flags: needinfo?(acreskey)

Comment 18

•

8 months ago

The results from combining early initialization of the URL classifier along with the urlclassifier.delay_prefixes_crc32_check pref are looking very promising: https://bugzilla.mozilla.org/show_bug.cgi?id=1956356#c19
I'll separate those out next.

Markus Stange [:mstange]

Reporter

Comment 19

•

8 months ago

(In reply to Tim Huang[:timhuang] from comment #16)

Hi Markus,

In this bug, we introduced a pref urlclassifier.delay_prefixes_crc32_check to delay the CRC32 check. The delay is currently disabled by the pref. Would you be able to test if enabling this pref will improve the performance? Thanks.

Thanks Tim! And sorry for the delay here. Andrew will be taking a look at validating the performance impact.
Could you file follow-up bugs about the other source of slowness identified in the profiles, and about flipping this pref to true by default? This bug got closed when the off-by-default-pref patch landed. It would have been better to land that patch in a separate bug.

Flags: needinfo?(mstange.moz)

Assignee

Updated

•

8 months ago

Blocks: 1971949

Assignee

Updated

•

8 months ago

Comment 20

•

8 months ago

Quick update -- we've discovered some bugs in our CI test scores, bug 1970614 and bug 1972093, which are delaying the evaluation of this change.

Comment 21

•

6 months ago

Still having problems evaluating the impact of enabling this pref, but have not forgotten it.

Comment 22

•

6 months ago

Apologies for the delays.

We're seeing reproducible improvements to Fenix startup with urlclassifier.delay_prefixes_crc32_check enabled.
~8 to 24ms on newssite-applink-startup applink_startup
and ~6 to 60ms on tab-restore-shopify, (Please ignore the Android p6 results as they are a test issue, bug 1985798)

Startup CPU times looks to also have improved.

Startup perf compare

I think we should enable urlclassifier.delay_prefixes_crc32_check, bug 1971949.

(Leaving my ni as Tim is on PTO and not accepting ni's).

Frank Doty [:fdoty]

Updated

•

4 months ago

QA Whiteboard: [qa-triage-done-c141/b140] → [qa-triage-done-c141/b140] [fxp]

Comment 23

•

3 months ago

Tim,
We believe that enabling urlclassifier.delay_prefixes_crc32_check would be beneficial to Fenix cold start applink performance (based on profiling and CI tests, Comment 22).
We have telemetry in place to observe any field impact.

Would you be able to enable the urlclassifier.delay_prefixes_crc32_check pref ?

Flags: needinfo?(acreskey) → needinfo?(tihuang)

Assignee

Comment 24

•

3 months ago

Sur, I will work on Bug 1971949 to enable this for mobile.

Flags: needinfo?(tihuang)

Comment 25

•

3 months ago

Thank you, I'll keep my eyes open for perf alerts and changes to telemetry.

Comment 26

•

3 months ago

I ran a perf compare of this change, bug 1971949 and it looks good.

a55, newssite-applink-startup

Difference of means: -1.75% (-37.9 ms)
Difference of medians: -0.5% (-10.76 ms)

pixel 6, newssite-applink-startup

Difference of means: -2.29% (-37.46 ms)
Difference of medians: -2.3% (-36.9 ms)

And in a higher variance test
android-hw-p6-13-0-aarch64-shippable

Difference of means: -4.27% (-75.88 ms)
Difference of medians: -6.13% (-111.34 ms)

https://sql.telemetry.mozilla.org/queries/111382/source?p_channel=nightly&p_days=20&p_earliest_build_date=2025-09-19&p_group_by=submission_date&p_minimum_entries_group=0#273316

Comment 27

•

3 months ago

And while we cannot directly tie the impact of this change to our telemetry, since landing we are seeing median time to request start for cold applinks drop from 778ms to 605ms, which is by far the lowest value recorded.