Open Bug 2012269 · Opened 2 days ago · Updated 2 days ago

Gecko Remote Settings (rsegmnoittet-es) causes extreme disk write amplification during certificate updates

Categories

(Firefox :: Remote Settings Client, defect)

Version: Firefox 147
Type: defect

Tracking

Status: UNCONFIRMED

People

(Reporter: aros, Unassigned)

Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:147.0) Gecko/20100101 Firefox/147.0

Steps to reproduce:

This report concerns the Gecko Remote Settings client, specifically the on-disk rsegmnoittet-es IndexedDB database used for certificate / CRLite-related data.

On my system, I consistently observe the following behavior every few hours:

  1. Firefox downloads a ~3 MB ZIP archive into $TEMP, named ${UUID}.zip (example: 95c9d159-cbcc-4ee4-a739-714ab0c3f8bf.zip).

  2. The archive is unpacked into

    storage/permanent/chrome/idb/3870112724rsegmnoittet-es.files/
    

    resulting in ~1,800 small files that appear to be certificate-related objects.

  3. The same data is also written into the IndexedDB SQLite database

    3870112724rsegmnoittet-es.sqlite
    

    as entries in object_data.

  4. This is followed by updates to CRLite-related files such as

    security_state/crlite.filter
    security_state/data.safe.bin
    

The net effect is that updating ~3 MB of logical data results in hundreds of megabytes of disk writes:

  • ~1,800 individual files, each padded to at least one filesystem block (typically 4 KB)
  • filesystem metadata updates for creating and later deleting those files
  • the same data stored twice, once in the .files directory and once in the SQLite database
  • additional writes for CRLite regeneration

On CoW filesystems or flash-backed storage, this results in significant write amplification. Even without considering filesystem internals, the number of small-file operations alone is substantial.
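
For a sense of scale, here is a deliberately conservative per-cycle lower bound using only the numbers above; it ignores journaling, per-file fsync and metadata traffic, CoW overhead, and CRLite regeneration, all of which push the real totals much higher:

    # Back-of-envelope lower bound for one update cycle, using the figures
    # observed above. Journaling, per-file metadata/fsync traffic, CoW
    # overhead, and CRLite regeneration are deliberately excluded.
    payload = 3 * 1024 * 1024        # ~3 MB downloaded archive
    files = 1800                     # unpacked small files
    block = 4096                     # typical filesystem block size

    zip_write = payload              # archive written to $TEMP
    files_write = files * block      # each file occupies at least one block (~7.4 MB)
    sqlite_write = payload           # same data duplicated into object_data (at least)

    total = zip_write + files_write + sqlite_write
    print(f"lower bound: {total / 1e6:.1f} MB written for {payload / 1e6:.1f} MB of payload "
          f"({total / payload:.1f}x)")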

What I'm struggling to understand is why this update path requires:

  • unpacking into thousands of discrete files,
  • duplicating the same data into IndexedDB objects,
  • and repeating this process multiple times per day.

From a data-model perspective, this appears to be a mostly monolithic dataset with relatively small incremental changes. If so, approaches such as:

  • a packed binary format,
  • chunked append-only updates,
  • or delta-based updates (e.g. xdelta-style patching)

could drastically reduce disk I/O while preserving correctness and integrity.
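
To be concrete about the last option: a minimal, purely illustrative chunk-level delta along the lines below would let an update touch roughly the changed bytes instead of rewriting and re-fsyncing thousands of files. The file names and the 64 KB chunk size are hypothetical, and real delta encoders (xdelta, bsdiff) are far more compact; this only sketches the idea:

    # Illustrative chunk-level delta: emit only the chunks that differ between
    # two versions of a packed dataset, then patch the old copy with them.
    from pathlib import Path

    CHUNK = 64 * 1024  # hypothetical fixed chunk size

    def make_delta(old: bytes, new: bytes) -> list[tuple[int, bytes]]:
        """Return (offset, data) pairs for every chunk that changed or was added."""
        delta = []
        for off in range(0, len(new), CHUNK):
            new_chunk = new[off:off + CHUNK]
            if old[off:off + CHUNK] != new_chunk:
                delta.append((off, new_chunk))
        return delta

    def apply_delta(old: bytes, delta: list[tuple[int, bytes]], new_len: int) -> bytes:
        """Rebuild the new dataset from the old snapshot plus the changed chunks."""
        out = bytearray(old[:new_len].ljust(new_len, b"\0"))
        for off, data in delta:
            out[off:off + len(data)] = data
        return bytes(out)

    if __name__ == "__main__":
        # dataset-v1.bin / dataset-v2.bin are hypothetical packed snapshots.
        old = Path("dataset-v1.bin").read_bytes()
        new = Path("dataset-v2.bin").read_bytes()
        delta = make_delta(old, new)
        changed = sum(len(data) for _, data in delta)
        print(f"{len(delta)} changed chunks, {changed} of {len(new)} bytes need rewriting")
        assert apply_delta(old, delta, len(new)) == new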

Questions:

  • Is the frequent full rewrite of this dataset intentional?
  • Are there constraints (IndexedDB, atomicity, platform portability) that prevent a more write-efficient design?
  • Has disk write amplification been measured or considered here?

I'm raising this because, at scale, this behavior has nontrivial performance and storage-longevity implications, especially on systems with limited write endurance.

Component: Untriaged → Remote Settings Client
OS: Unspecified → All
Hardware: Unspecified → All