Closed Bug 1286798 Opened 5 years ago Closed 3 years ago

[meta] Next-generation LocalStorage (DOMStorage) implementation

Categories

(Core :: Storage: localStorage & sessionStorage, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla65
Tracking Status
firefox50 --- wontfix
firefox65 --- fixed

People

(Reporter: janv, Assigned: janv)

References

(Blocks 15 open bugs)

Details

(Keywords: dev-doc-needed, meta, Whiteboard: [e10s-multi:-][wptsync upstream])

Attachments

(8 files, 81 obsolete files)

601.80 KB, patch
asuth
: review+
Details | Diff | Splinter Review
15.52 KB, patch
asuth
: review+
Details | Diff | Splinter Review
16.26 KB, patch
asuth
: review+
Details | Diff | Splinter Review
15.17 KB, patch
asuth
: review+
Details | Diff | Splinter Review
654.76 KB, patch
mrbkap
: review+
mccr8
: review+
Details | Diff | Splinter Review
2.78 KB, patch
dholbert
: review+
Details | Diff | Splinter Review
1.10 KB, patch
asuth
: review+
Details | Diff | Splinter Review
842 bytes, patch
asuth
: review+
Details | Diff | Splinter Review
The next-generation should solve a bunch of problems:
- unification with other storage APIs (using QuotaManager)
- per origin database files
- database files in temporary storage (eviction of old data)
- e10s multi support
- move from PContent to PBackground
Blocks: 742822, 666724
Depends on: 1283609
Blocks: e10s-multi
Whiteboard: [e10s-multi:M?]
Whiteboard: [e10s-multi:M?] → [e10s-multi:M1]
Attached patch latest snappy (obsolete) — Splinter Review
Assignee: nobody → jvarga
Status: NEW → ASSIGNED
My main profile's 100MB webappstore.sqlite has 2,659 distinct originKeys.

Does that imply that my profile will have a couple of thousand SQLite databases, and a WAL for each one that's open?

That might not work out on mobile, even if you manage to put these files in the Android Cache directory. Heck, it might not be too fun on my MacBook Pro…!

Am I misunderstanding your approach?
Blocks: 857888
Flags: needinfo?(jvarga)
I have a similar profile (60MB, around 1800 domains), so I'm testing with real world data.
Yes, local storage is special in many regards. It's synchronous API which makes it quite hard to plug into quota manager, but it's been here for some time and many sites use it. However, we never evicted unused data, so we ended up with 2-3 thousand distinct domains.
I'm thinking about creating separate databases only for new local storage and about converting origins with existing local storage lazily. I believe many of those distinct domains won't be ever converted, so we will have to delete them when other domains need to store more data.
Flags: needinfo?(jvarga)
And btw, I'm not going to use SQLite for new local storage implementation :)
\o/

I was worried about the per-database overhead for multi-tab mobile users! Glad to hear you have it all in hand :D
multi-tab shouldn't be such a big concern since local storage will be preloaded in memory so the per origin database can be "closed" after preloading
Blocks: 1296572
Whiteboard: [e10s-multi:M1] → [e10s-multi:?]
Whiteboard: [e10s-multi:?] → [e10s-multi:M3]
Depends on: 768074
Whiteboard: [e10s-multi:M3] → [e10s-multi:-]
See Also: → 1362190
Keywords: meta
Summary: Next-generation DOMStorage implementation → [meta] Next-generation DOMStorage implementation
Priority: -- → P3
(In reply to Jan Varga [:janv] from comment #0)
> The next-generation should solve a bunch of problems:
> - unification with other storage APIs (using QuotaManager)
> - per origin database files
> - database files in temporary storage (eviction of old data)
> - e10s multi support
> - move from PContent to PBackground

Just a quick update:
- e10s multi support landed, although there is room for improvements
- move from PContent to PBackground landed too, bug 1350637
I'm updating the subject to make this easier to find, and I want to give a dump of my current understanding of :janv's plan, as I think the design has a lot of wins, but there are trade-offs involved and we want any discussion/debate to happen sooner rather than later.

## Plan
My mental model of the overhaul plan is (Jan, please correct me if I'm wrong!):
- All (canonical) LocalStorageCache data state is held in the parent process.
- We have actors per-LocalStorageCache which are per-principal (that is, origin+attributes) instead of the current "one actor per process that is conceptually just a remoted database connection" so that we can scope events properly.
- All localStorage reads happen via sync IPC to PBackground.  (Or, if necessary, a new PLocalStorageIsDumbAndNeedsItsOwnThread.)
- localStorage writes probably also happen via sync IPC[1].
- Events happen via async IPC if there are any "storage" event listeners in a content process.  The subscription should happen synchronously for maximum correctness, but it's fine to subscribe async under the spec.  This means that a change can be perceived by reading localStorage for which a "storage" event has not yet been received.  This is fine and consistent with how LocalStorage already works in Chromium/Blink (and probably webkit too?).  It's also how we've worked for a while.
- We use QuotaManager, but the origin never gets more than 5MiB of LocalStorage space (+slack) no matter what.

1: At least some writes could be optimized to be async if we're willing to allow the origin to exceed quota.  For example, Chromium allows quota to be exceeded by 100k because of multi-process races.  Our implementation could similarly allow each process some "slack" quota that permits async write messages up to that quota (and assuming no space would be freed by overwrites) as long as we're (sufficiently) under quota.  If a write would exceed the slack size, a sync write is performed and resets the outstandingSlackBytes to 0.  In the best-case scenario, outstandingSlackBytes never exceeds the threshold, and is continually re-zeroed by async IPC messages that confirm the writes and current quota usage for the given cache.

## Upsides
We get the following wins out of this implementation strategy versus our current strategy:
- Reduced memory usage.  Right now each browser process that has a window/iframe loaded for an origin that uses LocalStorage will carry the memory burden of the origin's entire LocalStorage usage in every process with the window/iframe loaded.  For sites that use localStorage and are very popular or are embedded as 3rd-party iframes (and assuming we don't double-key them), this could be a lot of memory[2].
- Globally consistent, sequential view of localStorage across all processes (unless we make any caching optimizations).  The HTML spec provides us basically infinite wiggle room when multiple processes are involved, but I know of at least one instance of LocalStorage being used by a site as a mutex between multiple windows.  There are arguably no downsides 
- Improved page load performance where the origin has a ton of data stored in localStorage but only needs a small amount of it near startup.  If the origin has 5MiB stored and it uses localStorage at all, all 5 MiB needs to be loaded from disk and received by the thread responsible for storage.  And if we go synchronous (in the current implementation), that could mean a simple boolean check has to wait on 5 megs of data.  Based on our current DB strategy, the load is still unavoidable, but we avoid having to ship it over IPC or process that on the content process main thread.

2: Note that this regressed in bug 1285898 out of necessity.  Previously, we'd drop the storage if a page didn't actually use LocalStorage within ~20 seconds.  This could be added back without the overhaul by addressing bug 1350071.

## Downsides
- Synchronous IPC is synchronous IPC, and this approach means more of it, even if it's in smaller bite-sized pieces.  Although we can mitigate by letting Quantom DOM context-switch to other TabGroups, it's not a win if we're perceptibly stalling a foreground TabGroup the user cares about.

## Potential Mitigations
- Use some kind of clever multi-process shared-memory map to make all reads except the preload non-blocking, and all writes async.  This would increase complexity and lose the globally consistent sequence.
- Create a second mode of operation for localStorage-heavy origins.  Once an activity threshold is crossed,  cache the entirety of the state in the content process and rely on the "storage" events (which must be subscribed to even if not delivered to content pages) to update the local cache.  If there's heavy write-load, consider coalescing the writes locally by tracking dirty keys and flushing on a timer, much like we do in the database thread already.

## Meta: Metrics / Benchmarks / Synthetic Benchmarks

From my perspective, we don't have a good understanding of how LocalStorage is used in the wild or a clear goal to optimize for than to reduce the scalar telemetry LOCALDOMSTORAGE_GETVALUE_BLOCKING_MS and make the boolean telemetry LOCALDOMSTORAGE_PRELOAD_PENDING_ON_FIRST_ACCESS as false as possible.  These are mainly questions of moving the preload trigger as early as possible in the pipeline, optimizing the database queries, and reducing database fragmentation.  These are all orthogonal to whether we do the in-parent overhaul or instead massively clean-up the existing implementation.  (Although our current implementation is limited in how early we can move the preload trigger because of how broadcast is implemented.)

It might be beneficial to instrument our LocalStorage implementation with MOZ_LOG statements that can generate optionally-anonymized traces of localstorage usage.  They'd log all reads/writes as `origin "${hashy(perUserRandomKey, origin}" key "${hashy(perUserRandomKey, key)}" ${key.length} ${readOrWriteOrClear} ${value && value.length}`.  Some kind of "about:networking" "logging"-tab-style affordance could turn on the logs for helpful people, who could then provide us privately with their logs with the understanding that we could reverse things if we brute-forced the per-user-key/salt or did traffic-analysis, but that we would not do so and this would largely avoid us accidentally learning anything private when skimming or otherwise analyzing logs.

## Analytical tools.
### Per-origin via Devtools console
Paste this into a dev-tools console of any browser on any origin to get a report on what localStorage is up to:
```
(function() { var valueBytes=0; var keyBytes=0; for (var i=0; i < localStorage.length; i++) { var key = localStorage.key(i); keyBytes += key.length; valueBytes += localStorage[key].length; } console.log("localStorage has", localStorage.length, "keys and uses", keyBytes, "key bytes,", valueBytes, "value bytes,", keyBytes+valueBytes, "total bytes-ish"); })();
```
ex, my google.com in my Firefox primary profile:
localStorage has 177 keys and uses 5738 key bytes, 17683 value bytes, 23421 total bytes-ish
ex, my google.com in my Google Chrome MoCo profile:
localStorage has 168 keys and uses 3295 key bytes, 2623 value bytes, 5918 total bytes-ish

### Whole profile via browser console
Open up a browser console via ctrl-shift-j (you may need to enable it first): paste this in and run it:
```
Components.utils.import("resource://gre/modules/Sqlite.jsm"); (async function() { const conn = await Sqlite.openConnection({ path: "webappsstore.sqlite", sharedMemoryCache: false }); await conn.execute("select sum(length(key) + length(value))/1024 AS total, originkey from webappsstore2 GROUP BY originKey HAVING total > 1024 ORDER BY total ASC", [], (row) => { console.log(row.getResultByIndex(0), row.getResultByIndex(1)); }); conn.close(); console.log("That's it."); })()
```

Note that if you have no origins with over 1 meg, you'll just see "That's it."  You can adjust the query as needed.

### Whole profile via sqlite3 command line tool
Same as the browser console invocation above.

Run this inside a `sqlite3 webappsstore.sqlite` to get a rough list of per (reversed) origin localStorage usage in KiB for sites using more than a meg:
```
select sum(length(key) + length(value))/1024 AS total, originkey from webappsstore2 GROUP BY originKey HAVING total > 1024 ORDER BY total ASC;
```
or direct from the command-line in your profile directory:
```
sqlite3 webappsstore.sqlite "select sum(length(key) + length(value))/1024 AS total, originkey from webappsstore2 GROUP BY originKey HAVING total > 1024 ORDER BY total ASC;"
```
Summary: [meta] Next-generation DOMStorage implementation → [meta] Next-generation LocalStorage (DOMStorage) implementation
(In reply to Andrew Sutherland [:asuth] from comment #8)

Thanks for the plan description, nice written!
As others can see, this is not trivial at all.

I have some more comments:
1. Regarding sync/async writes. We could avoid the slack if decide to keep the precise quota checks strategy by making QuotaObject::MaybeUpdateSize to work on PBackground thread (or special LocalStorage thread dedicated for IPC in the parent process).
The idea is that in LocalStorage it's easy to calculate how much data (or at least maximum) will be stored on disk. It's a simple key.length + value.length calculation. So we can try to synchronously pre-allocate the size (which would check the group limit and possibly free some space). Writes would then be asynchronously finished and after we know real size we can again update quota object (we could also try to compress in content process and send compressed bytes across IPC).
This can't work well with SQLite because it's hard to estimate how much data will actually be stored on disk, so I'm also considering to drop SQLite in LocalStorage and replace it with own simple db system.
All we need is to write key/value pairs using snappy compression and splitting these pairs across multiple files (to avoid 5 MB writes when only 1 byte is changed in the worst case). The splitting can be done based on hashing the key or something like that.
I already experimented with the simple db system. (Note this is not simpledb, bug 1361330).

2. Hopefully sync IPC reads won't be a performance problem since it makes everything much cleaner and less complicated.

3. Local storage db can contain thousands of unique domains. The current quota manager storage placement on disk is a directory per origin and these directories must be scanned during default/temporary storage initialization.
However, we don't need to create all of them. We can do it on demand when they are actually needed. Something like lazy upgrade of old Local Storage db.
This can be improved even more and have a zip file with compressed (unused) origin directories and unzip them when needed.
Finally, we can also cover a case when there's plenty of space on disk so origin directories never get evicted, but the number of origin directories gets quite big after some time. We can use the same archive/zip file to archive unused origin directories.

Of course, we don't have to implement everything at once.
(In reply to Jan Varga [:janv] from comment #9)
> This can't work well with SQLite because it's hard to estimate how much data
> will actually be stored on disk, so I'm also considering to drop SQLite in
> LocalStorage and replace it with own simple db system.
> All we need is to write key/value pairs using snappy compression and
> splitting these pairs across multiple files (to avoid 5 MB writes when only
> 1 byte is changed in the worst case). The splitting can be done based on
> hashing the key or something like that.
> I already experimented with the simple db system. (Note this is not
> simpledb, bug 1361330).

Can we dig into this a bit more?

I agree that given our usage pattern for localStorage, the main thing SQLite is doing is bounding I/O for small writes, as you say.  But I don't view the SQLite disk overhead as particularly concerning[1].  At least not worth the maintenance overhead of implementing a persistence layer that has to juggle multiple files.

That's not to say there isn't a gap in our set of persistence solutions, especially from C++ code.  We basically have SQLite and the ServiceWorkerRegistrar-style "sketchily read and write a text-file that's a downgrade nightmare" strategies.  JS is in better shape because it has easy JSON flat-file serialization and IndexedDB.

My thoughts in this area were that now that we can write rust code we can use its "serde" serialization/deserialization library to allow our platform code to do easily do JSON-style serialization.  And it seems conceivable that there is some middle-ground between flat-file and SQLite, but I think we want there to be an existing open source project we build around that's aligned with our use-case.  Or at least, it should be more than just you taking on an embedded database engine as part of your day job :)

Note that MoCo is a member of the SQLite consortium (https://sqlite.org/consortium.html) which means we actually can get a lot of their time; we're already paying for it, we're just not using it.  I'm sure they'd be happy to advise us on how to optimize for the localStorage case.  And it's worth pointing out that SQLite has a lot of extensibility.  For example, there's the CSV virtual table at https://sqlite.org/csv.html, plus, as you know, there's the VFS layer.  And as a virtual table, we'd get to use all of our existing mozStorage bindings, any rust bindings we adopt, etc.

1: The localStorage 5meg limit is in many ways orthogonal to the QuotaManager limit.  It's a policy decision from the pre-QuotaManager era that now is more about bounding memory use from LocalStorage-sites than it's about disk space.  To avoid breaking on sites that depend on the limit as we previously implemented it, we need to give them that space as we originally measured it, roughly speaking.  QuotaManager does need to deal in real-world disk usage, but there's no reason it has to insist on the localStorage on-disk usage being exactly 5 megabytes.  It can give it 10.  It should enforce our more generous upper limits, which we may need to boost slightly with the localStorage move to QM, but those are more abstract as far as the various origins are concerned.
Flags: needinfo?(jvarga)
I haven't made any decision regarding database system for LocalStorage, I just experimented with own one :)
I'm all for an existing open source system that fits our needs.
2¢ on this: adding another database tool of almost any kind is a non-trivial installer size bump, and that really matters on mobile, particularly if it's a tool that's only used for one specific case.

That means I tend to see fewer middle grounds between flat files (which have lots of problems) and SQLite (which solves lots of problems) -- the only acceptable middle grounds are those that have a future across a number of components and kill a bunch of other code, because we don't want to pay a few MB of library size just for one feature.

I say that as someone who very much thinks adding a suitable database tool to the codebase is a good idea!
If we go the SQLite route here's proposed db schema:

CREATE TABLE database (
  origin TEXT NOT NULL
  last_vacuum_time INTEGER NOT NULL DEFAULT 0,
  last_analyze_time INTEGER NOT NULL DEFAULT 0,
  last_vacuum_size INTEGER NOT NULL DEFAULT 0
)  WITHOUT ROWID;

CREATE TABLE data (
  key TEXT PRIMARY KEY,
  value TEXT NOT NULL,
  keyCompressed INTEGET NOT NULL,
  valueCompressed INTEGET NOT NULL,
  lastAccessTime INTEGER NOT NULL DEFAULT 0
);

CREATE INDEX data_key ON data(key);

Features (some of them just for consideration):
- WAL is probably not needed for such small databases
- queue changes coming from content processes and apply them to SQLite periodically or when the size gets over a threshold
- after a database update, check if it's worth to vacuum the database (or vacuum only when we get the idle notification)
- compress keys and values if the length is greater than a threshold, if compressed data isn't significantly smaller, keep key/value uncompressed
- compression/decompression would happen in content processes
- pre-loaded data could stay compressed to send less bytes across IPC either for normal operations or for distributing the change events
- every key/value pair would have last access time, if such a pair is not used for long time, it doesn't have to be preloaded next time we preload data for entire origin
- lastAccessTime may also serve for clearing session data based on time constraints ?

It would be nice if we could calculate the size needed by SQLite on disk for given key/value pair in the worst case (no free blocks in the database).
That way we could just synchronously pre-allocate space using QuotaManager before setItem() call is finished.
The pre-allocated space would then be freed once a database update is finished.
Flags: needinfo?(jvarga)
Depends on: 1405270
Attached patch Part 1 (obsolete) — Splinter Review
backup
Attached patch Part 2 (obsolete) — Splinter Review
backup
Attached patch Part 3 (obsolete) — Splinter Review
backup
Blocks: 1318713
Attached patch Part 3 (obsolete) — Splinter Review
Attachment #8926885 - Attachment is obsolete: true
Attachment #8934520 - Attachment is patch: true
Attached patch Part 4 (obsolete) — Splinter Review
Attached patch Part 5 (obsolete) — Splinter Review
Attached patch Part 3 (obsolete) — Splinter Review
Attachment #8934520 - Attachment is obsolete: true
Attached patch Part 4 (obsolete) — Splinter Review
Attachment #8934523 - Attachment is obsolete: true
Attached patch Part 5 (obsolete) — Splinter Review
Attachment #8934525 - Attachment is obsolete: true
Attached patch Part 6 (obsolete) — Splinter Review
Ok, Part 6 implements persistence on disk

TODO (may miss some items):
- quota checks
- migration of old data
- support for private browsing mode
- storage events
:janv update the bug with latest info please?
Flags: needinfo?(jvarga)
Attachment #8926883 - Attachment is obsolete: true
Flags: needinfo?(jvarga)
Attachment #8926884 - Attachment is obsolete: true
Attachment #8941046 - Attachment is obsolete: true
Attachment #8941047 - Attachment is obsolete: true
Attachment #8941048 - Attachment is obsolete: true
Attachment #8941049 - Attachment is obsolete: true
(In reply to Marion Daly [:mdaly] from comment #25)
> :janv update the bug with latest info please?

Sure, here are current patches with some description:

Part 1: Allow BackgroundChild::GetOrCreateForCurrentThread() to use a custom main event target

Part 2: Add IsOnDOMFileThread() and AssertIsOnDOMFileThread() generic helper methods

Part 3: New basic (memory only) implementation of LocalStorage;

The implementation is based on a cache (datastore) living in the parent process and sync IPC calls initiated from content processes.
IPC communication is done using per principal/origin database actors which connect to the datastore.
The synchronous blocking of the main thread is done by creating a nested event target and spinning the event loop.

Part 4: Basic integration with QuotaManager

This adds a new quota client implementation, but only implements ShutdownWorkThreads.
At shutdown we wait for all running operations to finish including database actors which are closed by using an extra IPC message which avoids races between the parent and child.
Databases are dropped on the child side as soon as they are not used (e.g. after unlinking by the cycle collector).

Part 5: More integration with QuotaManager

Preparation of datastores now creates real database files on disk. The LocalStorage directory is protected by a directory lock.
Infrastructure for database schema upgrades is in place too.
Database actors are now explicitely tracked by the datastore. When the last actor finishes the directory lock is released.
Added also implementation for QuotaClient::GetUsageForOrigin() and QuotaClient::AbortOperations().

Part 6: Fix a dead lock in the single process case

In the single process case, pass the nested main event target across IPC.
The event target is then used by the parent side to process runnables which need to run on the main thread.
After this change we don't have use hacks like getting profile directory path on the child side and sending it to the parent. The parent side can now do it freely even in the single process case.

Part 7: Teach QuotaManager's OriginParser to handle file://UNIVERSAL_FILE_URI_ORIGIN

See bug 1340710 which introduced this kind of URI.

Part 8: Persist datastores to disk; r=asuth

Introduced a Connection and a ConnectioThread object.
All I/O requests are processed by a new single thread managed by ConnectionThread object.
Connection objects are prepared by the prepare datastore operation and then kept alive by datastores just like directory locks.
All datastore I/O operations are wrapped in a transaction which automaticaly commits after 5 seconds.
Datastore preparation now also loads all items from the database.


So most of the stuff that is mentioned in my initial description of this bug and in comment 8 is implemented except:
- storage event notifications
- quota checks
- private browsing
- data migration

I'm currently working on:
- storage event notifications

In patch Part 3 I disabled tests that currently don't pass on try, they are mostly related to storage events not working.

My goal is to have all of this in FF 61, if I have time, I'd like to also implement data compression and some other performance and memory enhancements.
From the last update in comment 24 I was mostly working on:
1. Datastore/cache sharing by multiple tabs
This took a bit more time than I originally thought, but it's now working nicely. A datastore object is not tight to an actor, so it can be "preloaded" much sooner or can be kept alive for some time.

2. Bullet-proof quota client shutdown
I paid extra attention to this because we see many problems/crashes related to this. Some of the ideas/approaches/fixes can be used in other quota clients, especially in IndexedDB

3. Make it pass on try
I had to fix a problem with quota manager initialization ending in a dead lock in the single process case. Hopefully I fixed it in a way that we won't have to use hacks to handle the single process case.
Then I discovered that we introduced file://UNIVERSAL_FILE_URI_ORIGIN and some LocalStorage tests use that, so I had to fix the OriginParser. We might find out that some IndexedDB initialization issues are related to this special URI.
And a bunch of other small problems.

4. Code cleanup
I cleaned up the code and added many comments. The attached patches are ready to reviewed I think
Progress update:

1. storage event notifications
- implemented, but needs cleanup

2. private browsing
- fixed

TODO:
- quota checks
- data migration
- support for session only local storage (can we just drop this feature ?)
- fix clearing triggered by extension API, preferences, etc.
Session only local storage: research if others use it, talk to :asuth about it, if no one uses it we can safely dump it. Who would be stakeholders here?
Flags: needinfo?(bugmail)
Priority: P3 → P2
We will have to do something for CPOWs, some tests still use it.

Currently it can lead to a dead lock when local storage in the content process is synchronously blocking the main thread.

Here's the scenario:
- a content process touches local storage for the first time
- the content process issues a request to the parent process to prepare a datastore for local storage
- the content process synchronously blocks the main thread while waiting for a response
- the parent process sends a CPOW synchronous message (on the main thread) to the content process 
- the parent process receives the local storage request (on the PBackground thread) and starts a state machine
- one of the steps of the state machine is work on the main thread in the parent process, so a runnable is dispatched to the main thread
- the runnable is not executed because the main thread is doing a synchronous CPOW request
- the state machine can't continue, so both ends wait for each other

I found out that if I send a synchronous message from the content process to the parent while spinning the event loop, then the CPOW request gets processed. However, I'm not sure if it's safe and we want to allow it.
My understanding is that we want to get rid of CPOWs at some point and they are already disabled for normal web content.
So we should rewrite those tests that use local storage using CPOWs and somehow disable CPOWs when local storage is going to synchronously block the main thread (to rather throw an exception instead of deadlocking).
(In reply to Marion Daly [:mdaly] from comment #37)
> Session only local storage: research if others use it, talk to :asuth about
> it, if no one uses it we can safely dump it. Who would be stakeholders here?

I think the consensus reached at the Storage UX meeting in December at MozAustin, as I reported at bug 1400678 comment 3, is that clear-on-shutdown is sufficient for our session-only semantics.

I think it's also somewhat agreed that the current semantics of session-only LocalStorage bootstrapping from whatever is already persisted to LocalStorage on disk is weird and defies user expectations.  So, specifically, if the user goes to the canonical private browsing example of an engagement ring site, does something that persists data, then switches the site (or global preference) to session storage, the session storage will be bootstrapped from that already-persisted data each time, which is unexpected.

So:
- If we implement QuotaManager clear-on-shutdown in bug 1400678, I think we can remove the concept of session-only storage from LocalStorage here.
- If we don't implement bug 1400678, we might be able to simplify session-only to non-persistent in-memory-only.  (And since PrivateBrowsing is segregated by OriginAttribute, that further eliminates the need for slots.)
Flags: needinfo?(bugmail)
Thanks Andrew, that makes perfect sense.
Latest try looks good:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=7e5604be21b713dc4f0fe82a1e6679c07fe27b1c

Progress since last update:
- storage events implementation optimized, cleaned up and heavily tested
- local storage clearing fixed except browser_ext_browsingData_localStorage.js
- implemented cancellable datastore preparation to fix problems with CPOWs
- removed tests for session only support
- added checks for dom.storage.enabled pref and StorageAllowedForPrincipal()
- added support for low disk space mode

Tests that are still temporarily disabled:
test_bug600307-DBOps.html
test_localStorageFromChrome.xhtml
browser_ext_browsingData_localStorage.js
test_localStorageQuota.html
test_localStorageQuotaPrivateBrowsing_perwindowpb.html
dom/tests/browser/browser_localStorage_e10s.js

which results in this TODO list:
- broadcast domstorage-test-flushed message
- fix LocalStorage in chrome
- enhance nsIQuotaManagerService::clearStoragesForPrincipal() to accept a quota client argument
- finish quota checks
- database migration
Depends on: 1402254
(In reply to Jan Varga [:janv] from comment #41)
> - fix LocalStorage in chrome

fixed

> - enhance nsIQuotaManagerService::clearStoragesForPrincipal() to accept a quota client argument

fixed
(In reply to Jan Varga [:janv] from comment #41)
> - broadcast domstorage-test-flushed message

fixed

> - finish quota checks

origin quota checks - done
quota manager quota checks:
- group limit - done
- global limit - needs more work

All tests now pass except some devtools related ones that still use CPOWs

Yesterday asuth and I had a pre-review meeting to discuss current patches.

The new implementation is not fully complete (migration is still missing), but I'll start submitting patches for review following days.
Attachment #8946606 - Attachment is obsolete: true
Attachment #8946607 - Attachment is obsolete: true
Attachment #8946608 - Attachment is obsolete: true
Attachment #8946609 - Attachment is obsolete: true
Attachment #8946611 - Attachment is obsolete: true
Attachment #8946612 - Attachment is obsolete: true
Attachment #8946613 - Attachment is obsolete: true
Attachment #8946614 - Attachment is obsolete: true
Andrew, the new local storage implementation uses a nested event queue for synchronous blocking of the main thread in content processes. At the same time we dispatch to a DOM file thread that communicates with the parent process/PBackground thread asynchronously. The problem happens when GetOrCreateForCurrentThread() is called on the DOM file thread and a new runnable is dispatched to the main thread. The runnable connects the newly created PBackgroundChild with the parent process. However, the runnable never gets processed since we created a nested event queue on the main thread. This patch allows GetOrCreateForCurrentThread() to use a custom main event target, so we can pass our nested event queue.
Attachment #8956397 - Flags: review?(continuation)
Attachment #8956402 - Flags: review?(bugmail)
Sorry, attaching a new version, the previous patch doesn't build with latest m-c.
Attachment #8956402 - Attachment is obsolete: true
Attachment #8956402 - Flags: review?(bugmail)
Attachment #8956437 - Flags: review?(bugmail)
Attachment #8956448 - Flags: review?(bugmail)
Attachment #8956477 - Flags: review?(bugmail)
Attachment #8956489 - Flags: review?(bugmail)
Comment on attachment 8956397 [details] [diff] [review]
Part 1: Allow BackgroundChild::GetOrCreateForCurrentThread() to use a custom main event target

Review of attachment 8956397 [details] [diff] [review]:
-----------------------------------------------------------------

I don't know much about event targets but this looks reasonable.

::: ipc/glue/BackgroundChild.h
@@ +11,5 @@
>  #include "mozilla/Attributes.h"
>  #include "mozilla/ipc/Transport.h"
>  
>  class nsIDOMBlob;
> +class nsIEvenTarget;

nsIEventTarget

::: ipc/glue/BackgroundImpl.cpp
@@ +1460,2 @@
>  {
> +  MOZ_ASSERT_IF(NS_IsMainThread(), !aMainEventTarget);

Should this assert also be on BlockAndGetResults()? That also does a dispatch of this new argument.
Attachment #8956397 - Flags: review?(continuation) → review+
(In reply to Andrew McCreight [:mccr8] from comment #52)

Thanks for quick review!

> ::: ipc/glue/BackgroundImpl.cpp
> @@ +1460,2 @@
> >  {
> > +  MOZ_ASSERT_IF(NS_IsMainThread(), !aMainEventTarget);
> 
> Should this assert also be on BlockAndGetResults()? That also does a
> dispatch of this new argument.

Yeah, but I think that method shouldn't be called on the main thread at all, I added |AssertIsNotOnMainThread()| there.
Attachment #8956730 - Flags: review?(bugmail)
Attachment #8956744 - Flags: review?(bugmail)
Attachment #8956962 - Flags: review?(bugmail)
What sort performance regressions should we expect from this?
Blocks: 1341070
(In reply to Marion Daly [:mdaly] from comment #57)
> What sort performance regressions should we expect from this?

Page load time can be affected by this.
In theory, with all patches applied, especially the one that triggers preloading in ContentParent::AboutToLoadHttpFtpWyciwygDocumentForChild(), the synchronous waiting for I/O should happen less.
Storage events are also distributed in a more efficient manner.

Anyway, I'll have to do some talos testing.
Andrew, patches 1-10 implement most of the stuff and most of the tests pass.
Here's latest try for pathes 1-10:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=6b0cd9168e2b8476f76c06fc66139814eaf5a478

I'll be now attaching patches to fix the rest of tests and also patches with some improvements.
I'll attach them one by one as I finish final cleanup and self-review.
Attachment #8957058 - Flags: review?(bugmail)
Attachment #8957082 - Flags: review?(bugmail)
Attachment #8957112 - Flags: review?(bugmail)
Attachment #8957126 - Flags: review?(bugmail)
Attachment #8957133 - Flags: review?(bugmail)
Comment on attachment 8957303 [details] [diff] [review]
Part 19: Implement a helper method for datastore preloading verification

Review of attachment 8957303 [details] [diff] [review]:
-----------------------------------------------------------------

Andrew, I did only the minimum in browser_localStorage_e10s.js
Feel free to suggest what needs to be changed/removed or you can even do it yourself if you have a bit of time.
It's your test and you know best what needs to be done :)
Attachment #8957315 - Flags: review?(bugmail)
Comment on attachment 8957133 [details] [diff] [review]
Part 18: Verify that data is persisted on disk

Andrea, the Storage.webidl changes need to be reviewed by a DOM peer. Can you take a look ?
Attachment #8957133 - Flags: review?(amarchesini)
Bobby, the universal file origin was introduced in bug 1340710.
The base domain is currently not handled if strict file origin policy is not in effect which confuses quota manager.
I noticed this problem when I was testing new local storage implementation, but I guess it could already cause (hidden) problems with IndexedDB too.
Attachment #8957324 - Flags: review?(bobbyholley)
Comment on attachment 8957133 [details] [diff] [review]
Part 18: Verify that data is persisted on disk

Review of attachment 8957133 [details] [diff] [review]:
-----------------------------------------------------------------

r+ for the webidl. I leave all the rest to Andrew.
Attachment #8957133 - Flags: review?(amarchesini) → review+
Comment on attachment 8957324 [details] [diff] [review]
Part 21: Base domain needs to be handled too if strict file origin policy is not in effect

Review of attachment 8957324 [details] [diff] [review]:
-----------------------------------------------------------------

Hm, why do we need this boolean on every codebase principal, rather than just querying dynamically? Is it to handle the pref changing mid-run? I don't think I really care to support that, so how about just checking GetStrictFileOriginPolicy directly in GetBaseDomain?
Attachment #8957324 - Flags: review?(bobbyholley)
(In reply to Bobby Holley (:bholley) from comment #74)
> Comment on attachment 8957324 [details] [diff] [review]
> Part 21: Base domain needs to be handled too if strict file origin policy is
> not in effect
> 
> Review of attachment 8957324 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Hm, why do we need this boolean on every codebase principal, rather than
> just querying dynamically? Is it to handle the pref changing mid-run? I
> don't think I really care to support that, so how about just checking
> GetStrictFileOriginPolicy directly in GetBaseDomain?

Yeah, that's what I did initially, but I got assertions on try, because some
doc group code gets the base domain before the pref for file
origin policy changes and then gets it again and compares with the previous
value after the pref changes.
So I think the only way is to snapshot the pref setting.
Flags: needinfo?(bobbyholley)
Comment on attachment 8957324 [details] [diff] [review]
Part 21: Base domain needs to be handled too if strict file origin policy is not in effect

Review of attachment 8957324 [details] [diff] [review]:
-----------------------------------------------------------------

Ugh, ok. Add a comment explaining why this is there, then?
Attachment #8957324 - Flags: review+
Flags: needinfo?(bobbyholley)
(In reply to Bobby Holley (:bholley) from comment #77)
> Comment on attachment 8957324 [details] [diff] [review]
> Part 21: Base domain needs to be handled too if strict file origin policy is
> not in effect
> 
> Review of attachment 8957324 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Ugh, ok. Add a comment explaining why this is there, then?

Ok, will do. Thanks!
Attachment #8958425 - Flags: review?(bugmail)
Attachment #8959654 - Flags: review?(bugmail)
Ok, the only patch that is not yet ready for review is data migration.
It should be ready in coming days.
Depends on: 1446037, 1446310
Jan, I haven't had a chance to investigate this patchset, but will it add a concept of last-accessed timestamps to localStorages so that we can implement support for the "since" parameter of the browsingData extension API for clearing localStorages? If not, do you think it's worth adding it now to reduce the number of necessary data migrations down the line?
Attached patch data migration WIP (obsolete) — Splinter Review
(In reply to Thomas Wisniewski from comment #84)
> Jan, I haven't had a chance to investigate this patchset, but will it add a
> concept of last-accessed timestamps to localStorages so that we can
> implement support for the "since" parameter of the browsingData extension
> API for clearing localStorages? If not, do you think it's worth adding it
> now to reduce the number of necessary data migrations down the line?

There's a lastAccessTime column, but it's not used right now. I could add another patch that initializes it, but I wouldn't bet that it's enough for the feature you are talking about.
(In reply to Thomas Wisniewski from comment #84)
> Jan, I haven't had a chance to investigate this patchset, but will it add a
> concept of last-accessed timestamps to localStorages so that we can
> implement support for the "since" parameter of the browsingData extension
> API for clearing localStorages? If not, do you think it's worth adding it
> now to reduce the number of necessary data migrations down the line?

Can this be a feature we implement in a future release? It isn't necessarily related to this patch set.
>Can this be a feature we implement in a future release?

Oh I'm certainly not against that, and wouldn't expect the full feature to be implemented right now. I was just wondering if it was worth minimizing potential data-migrations (and starting to let the data be annotated in user profiles ahead of time), if it was trivial to extend this patch right now.

>I could add another patch that initializes it, but I wouldn't bet that it's enough for the feature you are talking about.

That's fine, I just think it's worth giving it a shot since we know it will be necessary, and we want to implement the feature anyway. But if it's not trivial and will delay this patchset, then let's just skip it for now.
Thomas, something you could be interesting in, is bug 1252998 where I'm introducing a new service able to store which principal has done storage operation. This is meant to be used for clear history, but probably it's fine for you too.

Those patches are not landed because I don't have time to fix 1 test failure... but if you want to help, that would be great!
Flags: needinfo?(wisniewskit)
Thanks baku, I'll see if I can scrounge up the time to help rebase that patchset and get it landed (but I'll be posting my XHR implementation patchset first).

And with that bug in mind, I suppose it's not worth delaying this one, so I retract my request.
Flags: needinfo?(wisniewskit)
(In reply to Jan Varga [:janv] from comment #85)
> Created attachment 8960540 [details] [diff] [review]
> data migration WIP

So, in the end I didn't implement a zip archive, I just move webappsstore.sqlite to <profile>/storage/ls-archive.sqlite and then lazily migrate data from it when it's needed. It works quite well. It's nice that SQLite supports transaction across multiple databases, so ensuring that stuff is done atomically is quite easy.
I also believe that we don't have to do a major storage upgrade (only minor), so it won't affect users switching between release/beta/nightly.
I'll attach a new patch once I finish final cleanup.

I also have a patch that removes old local storage implementation, just to be sure that everything uses new implementation.
(In reply to Jan Varga [:janv] from comment #91)
> (In reply to Jan Varga [:janv] from comment #85)
> > Created attachment 8960540 [details] [diff] [review]
> > data migration WIP
> 
> So, in the end I didn't implement a zip archive, I just move
> webappsstore.sqlite to <profile>/storage/ls-archive.sqlite and then lazily
> migrate data from it when it's needed. It works quite well. It's nice that
> SQLite supports transaction across multiple databases, so ensuring that
> stuff is done atomically is quite easy.
> I also believe that we don't have to do a major storage upgrade (only
> minor), so it won't affect users switching between release/beta/nightly.
> I'll attach a new patch once I finish final cleanup.
> 
> I also have a patch that removes old local storage implementation, just to
> be sure that everything uses new implementation.

any performance implicates either positive or negative with this change in strategy?
(In reply to Marion Daly [:mdaly] from comment #92)
> (In reply to Jan Varga [:janv] from comment #91)
> > (In reply to Jan Varga [:janv] from comment #85)
> > > Created attachment 8960540 [details] [diff] [review]
> > > data migration WIP
> > 
> > So, in the end I didn't implement a zip archive, I just move
> > webappsstore.sqlite to <profile>/storage/ls-archive.sqlite and then lazily
> > migrate data from it when it's needed. It works quite well. It's nice that
> > SQLite supports transaction across multiple databases, so ensuring that
> > stuff is done atomically is quite easy.
> > I also believe that we don't have to do a major storage upgrade (only
> > minor), so it won't affect users switching between release/beta/nightly.
> > I'll attach a new patch once I finish final cleanup.
> > 
> > I also have a patch that removes old local storage implementation, just to
> > be sure that everything uses new implementation.
> 
> any performance implicates either positive or negative with this change in
> strategy?

The performance is much better since the old database is just moved during a minor
storage upgrade (done by quota manager after startup).
Then, during a preload, old data is accessed using an index and inserted into new
per origin database.

The downside of this approach is that the old database is kept around uncompressed.
My multi year app profile contains a 125 MB database with roughly 3000 unique origins.

So, performance should be much better overall, we just keep a bit more data in user's
profile.

I have some ideas what to do with the old database long term. For example, we can try
to shrink it by vacuuming when the app is idle or just remove the whole file if it's
not touched for 6 months (or so).
Per https://bugzilla.mozilla.org/show_bug.cgi?id=1373244#c30 it sounds like we're not going to have profile-per-channel in time for the landing of this, so we do need to be more concerned about "profile from the future" than I was hoping.  Setting needinfo on myself to explicitly address the migration logic in the context of this.

:janv, please feel free to chime in with any thoughts too.

And for context for others, our general plan for situations like this has been :bkelly's idea to do the following when we encounter "from the future" things:
- Invoke a hook that can be used by system extensions to attempt to handle the data from the future and make it work in the current, downgraded version.
- If there was no hook or the hook failed/gave up, wipe the data in question.

In the context of QuotaManager we'd expect the above to happen at a few possible granularity levels: profile-wide, origin-wide, QM-client-for-an-origin.


A general issue is that in an ideal world, we'd need the hook mechanism and wipe mechanism in place for at least one release prior to landing anything that would make things look like they're from the future.  Although there's always uplifts...
Flags: needinfo?(bugmail)
Andrew, hopefully we won't have this problem with new local storage since I think a *minor* storage upgrade should be enough.
See comment 91.
Would this be prefed on or off in FF 61?
So I found performance regression on linux tp6_google. Obviously google.com uses LocalStorage and tp6_google on linux is very low number (the page loads very fast compared to other platforms), so any additional overhead is noticeable.
I already optimized PBackgroundIDBDatabase creation, now there's only once actor per origin and per process (there use to be one per DOM global) and various other small optimizations. However, it seems it's still not enough.
Originally I wanted to do another optimization in a follow up bug, but due to the tp6_google regression on linux I changed my mind. I want to do what Andrew described in comment 8

> ... consider coalescing the writes locally by tracking
> dirty keys and flushing on a timer, much like we do in the database thread
> already.

I'll try to do a basic (lightweight) version of this, I just want to create a small per origin and content process cache that coalesces setItem/removeItem calls and flush changes to the parent process after let's say 100ms.
(In reply to Marion Daly [:mdaly] from comment #96)
> Would this be prefed on or off in FF 61?

We have roughly 5 weeks until m-c becomes FF 62, I think it's still feasible to have it enabled in FF 61.
See Also: → 1411908
Status?
(In reply to Marion Daly [:mdaly] from comment #99)
> Status?

I implemented the micro cache on the content side and I also verified that all tests pass cross platform:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d9412df3564c5db60e41d184090bb314c65bb288

I also tested performance and it looks good:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=73348ff8d0338cc4261b2575e4891b39997a759d
https://treeherder.mozilla.org/perf.html#/compare?originalProject=mozilla-central&newProject=try&newRevision=73348ff8d0338cc4261b2575e4891b39997a759d&framework=1&selectedTimeRange=172800

However, the micro cache currently supports only "full snapshots", that is that all data is sent to the content process.
I plan to implement lazy loading of data for micro cache if data is over 4K.
So that should be the last patch.

I'm currently trying to solve a shutdown crash related to quota manager, once I fix it, I'll get back to the micro cache.
:jan will this be a pref?
Flags: needinfo?(jvarga)
(In reply to Marion Daly [:mdaly] from comment #101)
> :jan will this be a pref?

Yeah, I'm leaning towards to have a pref for this, but I have to investigate what to do with migrated data. We will need a new patch for that.
Flags: needinfo?(jvarga)
Can you create a bug for the pref then and add it as a dependency?
(In reply to Marion Daly [:mdaly] from comment #103)
> Can you create a bug for the pref then and add it as a dependency?

Existing patches for new implementation depend on the fact that the old implementation is replaced with the new one. So I'm reworking stuff here in this bug to have a pref for new LS implementation.
status?
(In reply to Marion Daly [:mdaly] from comment #105)
> status?

Finishing work on the shadow database writing, so the new LS implementation will write data changes to new per origin databases and also to the old one (webappstore.sqlite).
Attachment #8956398 - Attachment is obsolete: true
Attachment #8956398 - Flags: review?(bugmail)
Attachment #8979507 - Flags: review?(bugmail)
Attachment #8956437 - Attachment is obsolete: true
Attachment #8956437 - Flags: review?(bugmail)
Attachment #8979508 - Flags: review?(bugmail)
Attachment #8956448 - Attachment is obsolete: true
Attachment #8956448 - Flags: review?(bugmail)
Attachment #8979510 - Flags: review?(bugmail)
Attachment #8956477 - Attachment is obsolete: true
Attachment #8956477 - Flags: review?(bugmail)
Attachment #8979511 - Flags: review?(bugmail)
Attachment #8956489 - Attachment is obsolete: true
Attachment #8956489 - Flags: review?(bugmail)
Attachment #8979512 - Flags: review?(bugmail)
Attachment #8956533 - Attachment is obsolete: true
Attachment #8956533 - Flags: review?(bugmail)
Attachment #8979513 - Flags: review?(bugmail)
Attachment #8956730 - Attachment is obsolete: true
Attachment #8956730 - Flags: review?(bugmail)
Attachment #8979514 - Flags: review?(bugmail)
Attachment #8956744 - Attachment is obsolete: true
Attachment #8956744 - Flags: review?(bugmail)
Attachment #8979515 - Flags: review?(bugmail)
Attachment #8956962 - Attachment is obsolete: true
Attachment #8956962 - Flags: review?(bugmail)
Attachment #8979516 - Flags: review?(bugmail)
Attachment #8957058 - Attachment is obsolete: true
Attachment #8957058 - Flags: review?(bugmail)
Attachment #8979519 - Flags: review?(bugmail)
Attachment #8957061 - Attachment is obsolete: true
Attachment #8957061 - Flags: review?(bugmail)
Attachment #8979520 - Flags: review?(bugmail)
Attachment #8957082 - Attachment is obsolete: true
Attachment #8957082 - Flags: review?(bugmail)
Attachment #8979521 - Flags: review?(bugmail)
Attachment #8957111 - Attachment is obsolete: true
Attachment #8957111 - Flags: review?(bugmail)
Attachment #8979522 - Flags: review?(bugmail)
Attachment #8957112 - Attachment is obsolete: true
Attachment #8957112 - Flags: review?(bugmail)
Attachment #8979524 - Flags: review?(bugmail)
Attachment #8957120 - Attachment is obsolete: true
Attachment #8957120 - Flags: review?(bugmail)
Attachment #8979525 - Flags: review?(bugmail)
Attachment #8957126 - Attachment is obsolete: true
Attachment #8957126 - Flags: review?(bugmail)
Attachment #8979526 - Flags: review?(bugmail)
r=baku
Attachment #8957133 - Attachment is obsolete: true
Attachment #8957133 - Flags: review?(bugmail)
Attachment #8979527 - Flags: review?(bugmail)
Attachment #8979527 - Flags: review+
Attachment #8957303 - Attachment is obsolete: true
Attachment #8957303 - Flags: review?(bugmail)
Attachment #8979528 - Flags: review?(bugmail)
Attachment #8979529 - Flags: review?(bugmail)
Attachment #8957315 - Attachment is obsolete: true
Attachment #8957315 - Flags: review?(bugmail)
r=bholley
Attachment #8957324 - Attachment is obsolete: true
Attachment #8979531 - Flags: review+
Attachment #8958425 - Attachment is obsolete: true
Attachment #8958425 - Flags: review?(bugmail)
Attachment #8979532 - Flags: review?(bugmail)
Attachment #8958445 - Attachment is obsolete: true
Attachment #8958445 - Flags: review?(bugmail)
Attachment #8979534 - Flags: review?(bugmail)
Attachment #8959654 - Attachment is obsolete: true
Attachment #8959654 - Flags: review?(bugmail)
Attachment #8979535 - Flags: review?(bugmail)
Attachment #8772366 - Attachment is obsolete: true
Attachment #8960540 - Attachment is obsolete: true
Timeframe for bringing into nightly?
Flags: needinfo?(jvarga)
(In reply to Marion Daly [:mdaly] from comment #132)
> Timeframe for bringing into nightly?

Given the fact that we can land it preffed off, I think 2 weeks is realistic timeframe.
Flags: needinfo?(jvarga)
Attached patch Part 27: Share database actors (obsolete) — Splinter Review
Attachment #8980048 - Flags: review?(bugmail)
Attachment #8980537 - Flags: review?(bugmail)
Attachment #8981412 - Flags: review?(bugmail)
Ok, there are only two patches remaining, one for e10s testing of snapshotting and a patch for shadow database writing. Both are coming today/tomorrow, I just need to do final cleanup and testing of those patches.
Here's the latest plan for landing:

We will land a special patch for current m-c (FF 62) that updates quota manager to know about new quota client (quota client for local storage).
New implementation of local storage will land early in the FF 63 cycle (initially preffed off), just to have enough time for fixing possible regressions.
We have a pref for the new implementation, so we can switch to old implementation anytime we want, but still, it's better to have more time for fixing possible regressions (keep in mind that local storage is used by sites a lot).
Landing a pre-patch in previous cycle will ensure that users which switch between nightly and beta will be able to use the same profile without any problems. This also covers users which switch between release and beta.
If we enable new implementation by defaut in FF 64, then users will be able to use the same profile with all 3 channels (nightly/beta/release).
Liz: Does that work for you?
Flags: needinfo?(lhenry)
That sounds good. I'd like to make sure we have QA set up to cover this as a feature for 63. Can you file a PI request?
Flags: needinfo?(mdaly)
Flags: needinfo?(jvarga)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #143)
> That sounds good. I'd like to make sure we have QA set up to cover this as a
> feature for 63. Can you file a PI request?

Happily, I'll file one, I need to learn how to do it anyway.
Flags: needinfo?(mdaly)

(In reply to Marion Daly [:mdaly] from comment #144)
> (In reply to Liz Henry (:lizzard) (needinfo? me) from comment #143)
> > That sounds good. I'd like to make sure we have QA set up to cover this as a
> > feature for 63. Can you file a PI request?
> 
> Happily, I'll file one, I need to learn how to do it anyway.

Thanks Marion. You can follow the below link and submit a PI Request. 
https://mana.mozilla.org/wiki/display/PI/PI+Request

Also, for any future reference, we track all the Milestones and Test Results in Trello.

Thank you!
Bug fixes.
Attachment #8981412 - Attachment is obsolete: true
Attachment #8981412 - Flags: review?(bugmail)
Attachment #8982508 - Flags: review?(bugmail)
try:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=3640163c666208d149e8bd8b5b0154bf5751590c&selectedJob=181300055

One failure has been fixed by a new patch (part 33) and the other one by updating patch part 16.
Flags: needinfo?(jvarga)
(In reply to Tania Maity from comment #145)
> 
> (In reply to Marion Daly [:mdaly] from comment #144)
> > (In reply to Liz Henry (:lizzard) (needinfo? me) from comment #143)
> > > That sounds good. I'd like to make sure we have QA set up to cover this as a
> > > feature for 63. Can you file a PI request?
> > 
> > Happily, I'll file one, I need to learn how to do it anyway.
> 
> Thanks Marion. You can follow the below link and submit a PI Request. 
> https://mana.mozilla.org/wiki/display/PI/PI+Request
> 
> Also, for any future reference, we track all the Milestones and Test Results
> in Trello.
> 
> Thank you!

Also adding the Trello card link for this:
https://trello.com/c/RAxskkWh/543-next-generation-localstorage-domstorage-implementation#
Jan can you verify the pref we are using to turn this on/off?

Is it dom.storage.enabled ?
Flags: needinfo?(jvarga)
(In reply to Marion Daly [:mdaly] from comment #151)
> Jan can you verify the pref we are using to turn this on/off?
> 
> Is it dom.storage.enabled ?
That's for LocalStorage in general.

The pref for new LS implementation is: dom.storage.next_gen
Flags: needinfo?(jvarga)
Attached patch Folded patch (obsolete) — Splinter Review
Andrew, here's a folded patch, it contains everything. Ready for your review.
You can also use the try push to see all individual patches.
Attachment #8987370 - Flags: review?(bugmail)
Commands used for testing locally:

./mach xpcshell-test dom/localstorage/test/unit dom/quota/test/unit/test_removeLocalStorage.js dom/quota/test/unit/test_specialOrigins.js toolkit/components/cleardata/tests/unit toolkit/components/extensions/test/xpcshell/test_ext_localStorage.js toolkit/forgetaboutsite/test/unit/test_removeDataFromDomain.js

./mach wpt testing/web-platform/tests/webstorage

./mach mochitest --headless browser/components/contextualidentity/test/browser/browser_forgetaboutsite.js browser/components/contextualidentity/test/browser/browser_forgetAPI_quota_clearStoragesForPrincipal.js browser/components/extensions/test/browser/browser_ext_browsingData_indexedDB.js browser/components/extensions/test/browser/browser_ext_browsingData_localStorage.js browser/components/originattributes/test/browser/browser_localStorageIsolation.js browser/components/preferences/in-content/tests/siteData/browser_siteData.js browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js browser/components/privatebrowsing/test/browser/browser_privatebrowsing_localStorage_before_after.js browser/components/sessionstore/test/browser_464199.js devtools/client/storage/test/browser_storage_delete.js devtools/client/storage/test/browser_storage_delete_all.js devtools/client/storage/test/browser_storage_delete_tree.js devtools/client/storage/test/browser_storage_delete_usercontextid.js devtools/client/storage/test/browser_storage_dynamic_updates_localStorage.js devtools/client/storage/test/browser_storage_dynamic_updates_sessionStorage.js devtools/client/storage/test/browser_storage_localstorage_add.js devtools/client/storage/test/browser_storage_localstorage_edit.js devtools/client/storage/test/browser_storage_localstorage_error.js devtools/client/storage/test/browser_storage_localstorage_rapid_add_remove.js devtools/server/tests/browser/browser_storage_dynamic_windows.js devtools/server/tests/browser/browser_storage_updates.js dom/tests/browser/browser_localStorage_e10s.js dom/tests/browser/browser_localStorage_snapshotting_e10s.js dom/base/test/test_bug1043106.html dom/tests/mochitest/localstorage dom/tests/mochitest/storageevent toolkit/components/extensions/test/mochitest/test_ext_unlimitedStorage.html toolkit/components/extensions/test/mochitest/test_ext_unlimitedStorage_legacy_persistent_indexedDB.html toolkit/components/extensions/test/mochitest/test_ext_storage_cleanup.html
Depends on: 1470811
Blocks: 1446992
Component: DOM → DOM: Web Storage
Blocks: 1436128
QA Contact: timea.babos
Blocks: 1410701
Attached patch Folded patchSplinter Review
Attachment #8979506 - Attachment is obsolete: true
Attachment #8979507 - Attachment is obsolete: true
Attachment #8979508 - Attachment is obsolete: true
Attachment #8979510 - Attachment is obsolete: true
Attachment #8979511 - Attachment is obsolete: true
Attachment #8979512 - Attachment is obsolete: true
Attachment #8979513 - Attachment is obsolete: true
Attachment #8979514 - Attachment is obsolete: true
Attachment #8979515 - Attachment is obsolete: true
Attachment #8979516 - Attachment is obsolete: true
Attachment #8979519 - Attachment is obsolete: true
Attachment #8979520 - Attachment is obsolete: true
Attachment #8979521 - Attachment is obsolete: true
Attachment #8979522 - Attachment is obsolete: true
Attachment #8979524 - Attachment is obsolete: true
Attachment #8979525 - Attachment is obsolete: true
Attachment #8979526 - Attachment is obsolete: true
Attachment #8979527 - Attachment is obsolete: true
Attachment #8979528 - Attachment is obsolete: true
Attachment #8979529 - Attachment is obsolete: true
Attachment #8979531 - Attachment is obsolete: true
Attachment #8979532 - Attachment is obsolete: true
Attachment #8979533 - Attachment is obsolete: true
Attachment #8979534 - Attachment is obsolete: true
Attachment #8979535 - Attachment is obsolete: true
Attachment #8980046 - Attachment is obsolete: true
Attachment #8980048 - Attachment is obsolete: true
Attachment #8980070 - Attachment is obsolete: true
Attachment #8980537 - Attachment is obsolete: true
Attachment #8980538 - Attachment is obsolete: true
Attachment #8982508 - Attachment is obsolete: true
Attachment #8982509 - Attachment is obsolete: true
Attachment #8982511 - Attachment is obsolete: true
Attachment #8987370 - Attachment is obsolete: true
Attachment #8979507 - Flags: review?(bugmail)
Attachment #8979508 - Flags: review?(bugmail)
Attachment #8979510 - Flags: review?(bugmail)
Attachment #8979511 - Flags: review?(bugmail)
Attachment #8979512 - Flags: review?(bugmail)
Attachment #8979513 - Flags: review?(bugmail)
Attachment #8979514 - Flags: review?(bugmail)
Attachment #8979515 - Flags: review?(bugmail)
Attachment #8979516 - Flags: review?(bugmail)
Attachment #8979519 - Flags: review?(bugmail)
Attachment #8979520 - Flags: review?(bugmail)
Attachment #8979521 - Flags: review?(bugmail)
Attachment #8979522 - Flags: review?(bugmail)
Attachment #8979524 - Flags: review?(bugmail)
Attachment #8979525 - Flags: review?(bugmail)
Attachment #8979526 - Flags: review?(bugmail)
Attachment #8979527 - Flags: review?(bugmail)
Attachment #8979528 - Flags: review?(bugmail)
Attachment #8979529 - Flags: review?(bugmail)
Attachment #8979532 - Flags: review?(bugmail)
Attachment #8979533 - Flags: review?(bugmail)
Attachment #8979534 - Flags: review?(bugmail)
Attachment #8979535 - Flags: review?(bugmail)
Attachment #8980046 - Flags: review?(bugmail)
Attachment #8980048 - Flags: review?(bugmail)
Attachment #8980070 - Flags: review?(bugmail)
Attachment #8980537 - Flags: review?(bugmail)
Attachment #8980538 - Flags: review?(bugmail)
Attachment #8982508 - Flags: review?(bugmail)
Attachment #8982509 - Flags: review?(bugmail)
Attachment #8982511 - Flags: review?(bugmail)
Attachment #8987370 - Flags: review?(bugmail)
Blocks: 1498393
Comment on attachment 9009154 [details] [diff] [review]
Folded patch

Review of attachment 9009154 [details] [diff] [review]:
-----------------------------------------------------------------

Here's a checkpoint of review comments mainly relating to privacy APIs that consume QM and tests external to LSng.

::: browser/components/extensions/parent/ext-browsingData.js
@@ +109,5 @@
>      return Promise.reject(
>        {message: "Firefox does not support clearing localStorage with 'since'."});
>    }
> +
> +  if (Services.lsm.nextGenLocalStorageEnabled) {

Please add a comment here along the lines of:

Ideally we could reuse the logic in Sanitizer.jsm or nsIClearDataService, but this API exposes an ability to wipe data at a much finger granularity than those APIs.  So custom logic is used here to wipe only the QM localStorage client (when in use).

@@ +115,5 @@
> +
> +    await new Promise(resolve => {
> +      quotaManagerService.getUsage(request => {
> +        if (request.resultCode != Cr.NS_OK) {
> +          // We are probably shutting down. We don't want to propagate the error,

This comment seems like it was copied and pasted.  Errors here and below should be propagated since this is a privacy API.

::: browser/components/extensions/test/browser/browser_ext_browsingData_indexedDB.js
@@ +52,5 @@
>      return new Promise(resolve => {
>        let origins = [];
>        Services.qms.getUsage(request => {
>          for (let i = 0; i < request.result.length; ++i) {
> +          if (request.result[i].usage == 0) {

nit: triple-equals unless you have a good reason (and a comment explaining the good reason ;)

::: caps/ContentPrincipal.h
@@ +49,5 @@
>    extensions::WebExtensionPolicy* AddonPolicy();
>  
>    nsCOMPtr<nsIURI> mDomain;
>    nsCOMPtr<nsIURI> mCodebase;
> +  bool mStrictFileOriginPolicy;

If retained, please copy the general comment we have about the universal file origin here (that was also duplicated in ContentPrincipal.cpp).

But it's not clear this should be retained.  The existing idiom seems to be to check nsScriptSecurityManager::GetStrictFileOriginPolicy() all over the place rather than latching the value.  If there's a necessary reason to latch the value I think there should be a comment here explaining that and the existing non-static callers (ContentPrincipal::MayLoadInternal) should be updated.

A concern I have is that this needs to be serialized/deserialized as part of PrincipalInfo to be truly safe.  This seems like a band-aid to work-around a problem in tests that are partially broken.  It seems better to just fix the tests.  If you do want to keep this, I think you may want to consult with :baku on serializing the value and whether the change is necessary.

A comment might be:
"""
The "security.fileuri.strict_origin_policy" preference controls whether all files are considered to be same-origin (false) or not.  This value is exposed via nsScriptSecurityManager::GetStrictFileOriginPolicy() and is allowed to change at runtime.  In order to guarantee consistent behavior for the holder of the principal for subsystems like QuotaManager in the face of a pref change, it is necessary to latch this value when the principal is created.

SOMETHING ABOUT WHY this doesn't get serialized/deserialized in PrincipalInfo.
"""

::: dom/quota/ActorsParent.cpp
@@ +2695,5 @@
>    if (NS_WARN_IF(NS_FAILED(rv))) {
>      return rv;
>    }
>  
> +  NextGenLocalStorageEnabled();

This should likely have "Unused <<" at the front and a comment that explains this is being done to ensure the value is cached.

::: dom/quota/nsIQuotaManagerService.idl
@@ +87,5 @@
>    clear();
>  
>    /**
> +   * Removes all storages stored for the given pattern. The files may not be
> +   * deleted immediately depending on prohibitive concurrent operations.

Can you elaborate on this comment a little?  Tracing these operations to the code they trigger can be a hassle, so being able to elaborate on what this does in terms of locks could be useful.

@@ +90,5 @@
> +   * Removes all storages stored for the given pattern. The files may not be
> +   * deleted immediately depending on prohibitive concurrent operations.
> +   *
> +   * @param aPattern
> +   *        A pattern for the origins whose storages are to be cleared.

Please elaborate on what form this pattern takes.  For example:

A pattern to be parsed by OriginAttributesPattern.  Currently this is expected to be a JSON representation of the OriginAttributesPatternDictionary defined in ChromeUtils.webidl.

@@ +93,5 @@
> +   * @param aPattern
> +   *        A pattern for the origins whose storages are to be cleared.
> +   */
> +  [must_use] nsIQuotaRequest
> +  clearStoragesForPattern(in AString aPattern);

I suggest renaming this to clearStoragesForOriginAttributesPattern.  It's long but less ambiguous about what's going on.  I know we shorten this under the hood in OriginScope (and there's no need to change that), but I think it helps for clarity with data clearing.

@@ +102,5 @@
>     *
>     * @param aPrincipal
>     *        A principal for the origin whose storages are to be cleared.
>     * @param aPersistenceType
>     *        An optional string that tells what persistence type of storages

Existing comment, but I'd suggest enumerating the legal values here, expressing the impact of not specifying, and explaining why one would want to specify the value.  Something like:

If omitted (or void), all persistence types will be cleared for the principal.  If a single persistence type ("persistent", "temporary", or "default") is provided, then only that persistence directory will be considered.  Note that "persistent" is different than being "persisted" via navigator.storage.persist() and is only for chrome principals.  See bug 1354500 for more info.  In general, null is the right thing to pass here.

@@ +105,5 @@
>     * @param aPersistenceType
>     *        An optional string that tells what persistence type of storages
>     *        will be cleared.
> +   * @param aClientType
> +   *        An optional string that tells what client type of storages

Suggest re-phrasing this a little to something like the following to make it clear that a list can't be used and how to better express how all clients can be cleared:

If omitted (or void), all client types will be cleared for the principal.  If a single client type is provided from Client.h, then only that client's storage will be cleared.  If you want to clear multiple client types (but not all), then you must call this method multiple times.

@@ +138,5 @@
> +   * a no-op.
> +   *
> +   * @param aPrincipal
> +   *        A principal for the origin whose storages are to be reset.
> +   * @param aPersistenceType

Suggest propagating the changes from above to here.

::: dom/webidl/Storage.webidl
@@ +34,5 @@
>    readonly attribute boolean isSessionOnly;
>  };
> +
> +// Testing only.
> +partial interface Storage {

Any reason not to mark the partial interface with the pref rather than each individual method?

::: gfx/ipc/PVsyncBridge.ipdl
@@ +5,5 @@
>  
>  using class mozilla::TimeStamp from "mozilla/TimeStamp.h";
>  using mozilla::layers::LayersId from "mozilla/layers/LayersTypes.h";
>  
> +include "mozilla/layers/LayersMessageUtils.h";

POSSIBLE REBASE ERROR: It's not clear why your changes would necessitate this change.

::: gfx/thebes/gfxMacPlatformFontList.mm
@@ +1891,5 @@
>      CFMutableArrayRef urls =
>          ::CFArrayCreateMutable(kCFAllocatorDefault, 0, &kCFTypeArrayCallBacks);
>  
> +    nsCOMPtr<nsIFile> file;
> +    while (NS_SUCCEEDED(e->GetNextFile(getter_AddRefs(file))) && file) {

POSSIBLE REBASE ERROR.  (I put it in caps so it stands out!  If you meant to do this, maybe do it in a different bug.)

::: ipc/ipdl/sync-messages.ini
@@ +920,5 @@
>  description =
>  [PBackgroundStorage::Preload]
>  description =
> +[PBackgroundLSDatabase::PBackgroundLSSnapshot]
> +description =

Note that since bug 1340440 you'll need an IPC peer's review on these changes.  While it looks like convention is to not populate the description field here, I think it's appropriate to have comments in the IPDL file that document where/when the sync messages are used or references to comements elsewhere in the codebase.  Right now only PBackgroundLSSnapshot::Ping has such a comment.  (Which is a nice comment!)

I have in-flight comment suggestions if you want to wait on those.

::: modules/libpref/init/all.js
@@ +1248,5 @@
>  pref("dom.disable_open_click_delay", 1000);
>  pref("dom.serviceWorkers.disable_open_click_delay", 1000);
>  
>  pref("dom.storage.enabled", true);
> +#ifdef NIGHTLY_BUILD

Please add a comment here that explains what this pref does and links to the bug to pref on by default everywhere.

::: toolkit/components/cleardata/ClearDataService.js
@@ +292,5 @@
>  const QuotaCleaner = {
>    deleteByPrincipal(aPrincipal) {
> +    if (!Services.lsm.nextGenLocalStorageEnabled) {
> +      // localStorage
> +      Services.obs.notifyObservers(null, "browser:purge-domain-data",

This block wants a comment along the lines of the one that wants to exist in `test_removeDataFromDomain.js`.  A short simple option might be:

The legacy LocalStorage implementation that will eventually be removed depends on this observer notification to clear by principal.  Only generate it if we're using the legacy implementation.

@@ +317,5 @@
>  
>    deleteByHost(aHost, aOriginAttributes) {
> +    if (!Services.lsm.nextGenLocalStorageEnabled) {
> +      // localStorage
> +      Services.obs.notifyObservers(null, "browser:purge-domain-data", aHost);

same deal with a comment here.

@@ +338,5 @@
>                                       .createCodebasePrincipal(httpsURI, aOriginAttributes);
> +        let promises = [];
> +        promises.push(new Promise(aResolve => {
> +          let req = Services.qms.clearStoragesForPrincipal(httpPrincipal, null, null, true);
> +          req.callback = () => { aResolve(); };

We need to be rejecting or setting "exceptionThrown" to true in all three of these clearing cases in event of an error.  If we somehow can't reject or these errors are somehow benign, there needs to be comments here and/or references to bugs to address the situation.  NB: I do realize the existing code was ignoring the possibility of errors.

@@ +346,5 @@
> +          req.callback = () => { aResolve(); };
> +        }));
> +        if (Services.lsm.nextGenLocalStorageEnabled) {
> +          promises.push(new Promise(aResolve => {
> +            Services.qms.getUsage(aRequest => {

Please add a comment that explains the mechanism here.  For example:
deleteByHost has the semantics that "foo.example.com" should be wiped if we are provided an aHost of "example.com".  QuotaManager doesn't have a way to directly do this, so we use getUsage() to get a list of all of the origins known to QuotaManager and then check whether the domain is a sub-domain of aHost.

@@ +347,5 @@
> +        }));
> +        if (Services.lsm.nextGenLocalStorageEnabled) {
> +          promises.push(new Promise(aResolve => {
> +            Services.qms.getUsage(aRequest => {
> +              if (aRequest.resultCode != Cr.NS_OK) {

Please add a comment explaining why this would happen and what this means in terms of clearing user data in addition to the rejection mentioned above.  It might also make sense to merge this comments with the above in a block comment.  For example:

In the event of a failure, QuotaManager failed to initialize.  QuotaManager reports such failures via telemetry and the profile problem should be addressed via that mechanism.  Additionally, this method will reject to propagate the error upwards.

@@ +356,5 @@
> +              let promises = [];
> +              for (let item of aRequest.result) {
> +                let principal = Services.scriptSecurityManager.createCodebasePrincipalFromOrigin(item.origin);
> +                let originNoSuffix = principal.originNoSuffix;
> +                if (eTLDService.hasRootDomain(originNoSuffix, aHost)) {

originNoSuffix will look like "https://foo.example.com" while aHost will look like "example.com".  Given how HasRootDomain is written, this currently works, but I think you should be using `principal.URI.host` instead.

@@ +359,5 @@
> +                let originNoSuffix = principal.originNoSuffix;
> +                if (eTLDService.hasRootDomain(originNoSuffix, aHost)) {
> +                  promises.push(new Promise(aResolve => {
> +                    let clearRequest = Services.qms.clearStoragesForPrincipal(principal, null, "ls");
> +                    clearRequest.callback = () => { aResolve(); };

The request has a `result` so you should be either using it or commenting why you're not using it.  Again for the above comments, this probably wants to potentially reject.

@@ +427,1 @@
>                    req.callback = () => { aResolve(); };

Error handling or comments here as well, please.

::: toolkit/components/extensions/Extension.jsm
@@ +253,5 @@
>        Services.qms.clearStoragesForPrincipal(storagePrincipal);
>  
>        ExtensionStorageIDB.clearMigratedExtensionPref(addon.id);
>  
> +      if (!Services.lsm.nextGenLocalStorageEnabled) {

This should have a comment explaining that the qms clear above is taking care of this.

::: toolkit/forgetaboutsite/test/unit/test_removeDataFromDomain.js
@@ +313,5 @@
>    check_permission_exists(TEST_URI, false);
>  }
>  
>  function waitForPurgeNotification() {
> +  if (Services.lsm.nextGenLocalStorageEnabled) {

As I dig into the history of the existence of this method[1], it seems like this method dates from when this test was driven by a generator and the method may not need to exist anymore.  Specifically, it seems like the code wanted to make sure that SiteDataManager got to notifying "browser:purge-domain-data" but that directly invoking the generator at that point would cause the test to resume running and this would happen before the LocalStorage logic got a chance to run, hence the explicit turn of the event loop.

Given all the refactorings and cleanups that have happened to introduce nsIClearDataService and make it properly async, I think you could probably remove waitForPurgeNotification() and calls to it and everything would still work.  That's a little bit of scope creep though, so if you don't want to try that maybe add a block comment to this method like:

This method probably can be removed.  It stems from the semantics of direct-driven generators and building on top of an ordering hack involving synchronous observer notifications.  The hack still works, it's just not needed, and in the case of next-generation LocalStorage, the observer notification will never be generated, so we just resolve() directly.  Please clean this code up, kind reader!

1: https://hg.mozilla.org/integration/mozilla-inbound/rev/acf32763c61d#l5.66
Blocks: 1504142
Depends on: 1505798
Off-bug I'd expressed concern over us latching mDocumentURI.  We differ from Chrome in latching this value, but now I'm not sure that's actually a problem.  First, our existing pre-LSNG implementation latches the document URI.  Second, for the Clients API exposed by ServiceWorkers, the Client exposes a `url` that is the latched "creation URL" from HTML as defined at https://html.spec.whatwg.org/multipage/webappapis.html#concept-environment-creation-url.  While the "storage" event doesn't have the same logistical concerns as the `Client` interface, I think it's reasonable for us to not enhance LocalStorage further.

I'd propose we file a follow-up to consider implementing the change and maybe just leave it open so people can find it if it does turn out to be a webcompat problem.  We should also double-check that the bug doesn't already exist.
Comment on attachment 9009154 [details] [diff] [review]
Folded patch

Review of attachment 9009154 [details] [diff] [review]:
-----------------------------------------------------------------

r=asuth with my previous review comments in comment 163 addressed, plus the review fixes applied plus comments from the extra commits on my empty try push at https://treeherder.mozilla.org/#/jobs?repo=try&revision=019b18aec556c809b868318e9f9d616459d8ce24.  Note that when I re-based I messed up and included your try push as a base revision, so you'll need to avoid grafting that or histedit it out or what not.

Please feel free to fold the review fixes into your code as you desire (or not), but ideally keep my comments in their own commit with any corrections/fix-ups in a commit with your authorship so that I don't get confused when looking at the blame.

I think the 2 things then, as discussed at the team meeting are to get r=mrbkap for the WebIDL and r=mccr8 for the Sync IPC changes that need peer sign-off.
Attachment #9009154 - Flags: review+
Thanks a lot, I really appreciate all the fixes and comments.
Andrew, I found some typos, added more comments in few places and removed empty comments /* */.
Let me know if I should keep those empty /* */
Attachment #9027641 - Flags: review?(bugmail)
Comment on attachment 9027641 [details] [diff] [review]
patch for comments

Review of attachment 9027641 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/localstorage/PBackgroundLSRequest.ipdl
@@ +27,5 @@
>  };
>  
>  /**
> + * Discriminated union which can contain an error code (`nsresult`) or
> + * particular request responses.

This should be probably "response", not "responses"
Comment on attachment 9027641 [details] [diff] [review]
patch for comments

Review of attachment 9027641 [details] [diff] [review]:
-----------------------------------------------------------------

It absolutely makes sense to remove the extra /* */'s.  What would happen is I'd start to document something, then realize I needed to do an investigation, but never quite made it back to the area of code I started at.  I can put those back in with actual comments in follow-up patches :)

::: dom/localstorage/PBackgroundLSRequest.ipdl
@@ +27,5 @@
>  };
>  
>  /**
> + * Discriminated union which can contain an error code (`nsresult`) or
> + * particular request responses.

The singular is probably more correct here, yeah.
Attachment #9027641 - Flags: review?(bugmail) → review+
I still need to take of the rest, but since you are traveling tomorrow I'm submitting what I have. I'll try to finish remaining fixes tonight.


(In reply to Andrew Sutherland [:asuth] from comment #163)
> Comment on attachment 9009154 [details] [diff] [review]
> Folded patch
> 
> Review of attachment 9009154 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> Here's a checkpoint of review comments mainly relating to privacy APIs that
> consume QM and tests external to LSng.
> 
> ::: browser/components/extensions/parent/ext-browsingData.js
> @@ +109,5 @@
> >      return Promise.reject(
> >        {message: "Firefox does not support clearing localStorage with 'since'."});
> >    }
> > +
> > +  if (Services.lsm.nextGenLocalStorageEnabled) {
> 
> Please add a comment here along the lines of:
> 
> Ideally we could reuse the logic in Sanitizer.jsm or nsIClearDataService,
> but this API exposes an ability to wipe data at a much finger granularity
> than those APIs.  So custom logic is used here to wipe only the QM
> localStorage client (when in use).

Added.

> 
> @@ +115,5 @@
> > +
> > +    await new Promise(resolve => {
> > +      quotaManagerService.getUsage(request => {
> > +        if (request.resultCode != Cr.NS_OK) {
> > +          // We are probably shutting down. We don't want to propagate the error,
> 
> This comment seems like it was copied and pasted.  Errors here and below
> should be propagated since this is a privacy API.

Yeah, I reworked it by adding reject handling.

> 
> :::
> browser/components/extensions/test/browser/
> browser_ext_browsingData_indexedDB.js
> @@ +52,5 @@
> >      return new Promise(resolve => {
> >        let origins = [];
> >        Services.qms.getUsage(request => {
> >          for (let i = 0; i < request.result.length; ++i) {
> > +          if (request.result[i].usage == 0) {
> 
> nit: triple-equals unless you have a good reason (and a comment explaining
> the good reason ;)

Fixed.

> 
> ::: caps/ContentPrincipal.h
> @@ +49,5 @@
> >    extensions::WebExtensionPolicy* AddonPolicy();
> >  
> >    nsCOMPtr<nsIURI> mDomain;
> >    nsCOMPtr<nsIURI> mCodebase;
> > +  bool mStrictFileOriginPolicy;
> 
> If retained, please copy the general comment we have about the universal
> file origin here (that was also duplicated in ContentPrincipal.cpp).
> 
> But it's not clear this should be retained.  The existing idiom seems to be
> to check nsScriptSecurityManager::GetStrictFileOriginPolicy() all over the
> place rather than latching the value.  If there's a necessary reason to
> latch the value I think there should be a comment here explaining that and
> the existing non-static callers (ContentPrincipal::MayLoadInternal) should
> be updated.
> 
> A concern I have is that this needs to be serialized/deserialized as part of
> PrincipalInfo to be truly safe.  This seems like a band-aid to work-around a
> problem in tests that are partially broken.  It seems better to just fix the
> tests.  If you do want to keep this, I think you may want to consult with
> :baku on serializing the value and whether the change is necessary.
> 
> A comment might be:
> """
> The "security.fileuri.strict_origin_policy" preference controls whether all
> files are considered to be same-origin (false) or not.  This value is
> exposed via nsScriptSecurityManager::GetStrictFileOriginPolicy() and is
> allowed to change at runtime.  In order to guarantee consistent behavior for
> the holder of the principal for subsystems like QuotaManager in the face of
> a pref change, it is necessary to latch this value when the principal is
> created.
> 
> SOMETHING ABOUT WHY this doesn't get serialized/deserialized in
> PrincipalInfo.
> """

See comment 74, comment 75 and comment 76.
Unfortunatelly the try link doesn't work anymore, but I'll test in on try again if we still need to latch the value.

> 
> ::: dom/quota/ActorsParent.cpp
> @@ +2695,5 @@
> >    if (NS_WARN_IF(NS_FAILED(rv))) {
> >      return rv;
> >    }
> >  
> > +  NextGenLocalStorageEnabled();
> 
> This should likely have "Unused <<" at the front and a comment that explains
> this is being done to ensure the value is cached.

Fixed.

> 
> ::: dom/quota/nsIQuotaManagerService.idl
> @@ +87,5 @@
> >    clear();
> >  
> >    /**
> > +   * Removes all storages stored for the given pattern. The files may not be
> > +   * deleted immediately depending on prohibitive concurrent operations.
> 
> Can you elaborate on this comment a little?  Tracing these operations to the
> code they trigger can be a hassle, so being able to elaborate on what this
> does in terms of locks could be useful.

Yeah, I added comments for locking and an example too.

> 
> @@ +90,5 @@
> > +   * Removes all storages stored for the given pattern. The files may not be
> > +   * deleted immediately depending on prohibitive concurrent operations.
> > +   *
> > +   * @param aPattern
> > +   *        A pattern for the origins whose storages are to be cleared.
> 
> Please elaborate on what form this pattern takes.  For example:
> 
> A pattern to be parsed by OriginAttributesPattern.  Currently this is
> expected to be a JSON representation of the
> OriginAttributesPatternDictionary defined in ChromeUtils.webidl.

Added.

> 
> @@ +93,5 @@
> > +   * @param aPattern
> > +   *        A pattern for the origins whose storages are to be cleared.
> > +   */
> > +  [must_use] nsIQuotaRequest
> > +  clearStoragesForPattern(in AString aPattern);
> 
> I suggest renaming this to clearStoragesForOriginAttributesPattern.  It's
> long but less ambiguous about what's going on.  I know we shorten this under
> the hood in OriginScope (and there's no need to change that), but I think it
> helps for clarity with data clearing.

Ok, I changed the name.

> 
> @@ +102,5 @@
> >     *
> >     * @param aPrincipal
> >     *        A principal for the origin whose storages are to be cleared.
> >     * @param aPersistenceType
> >     *        An optional string that tells what persistence type of storages
> 
> Existing comment, but I'd suggest enumerating the legal values here,
> expressing the impact of not specifying, and explaining why one would want
> to specify the value.  Something like:
> 
> If omitted (or void), all persistence types will be cleared for the
> principal.  If a single persistence type ("persistent", "temporary", or
> "default") is provided, then only that persistence directory will be
> considered.  Note that "persistent" is different than being "persisted" via
> navigator.storage.persist() and is only for chrome principals.  See bug
> 1354500 for more info.  In general, null is the right thing to pass here.

Ok, added.

> 
> @@ +105,5 @@
> >     * @param aPersistenceType
> >     *        An optional string that tells what persistence type of storages
> >     *        will be cleared.
> > +   * @param aClientType
> > +   *        An optional string that tells what client type of storages
> 
> Suggest re-phrasing this a little to something like the following to make it
> clear that a list can't be used and how to better express how all clients
> can be cleared:
> 
> If omitted (or void), all client types will be cleared for the principal. 
> If a single client type is provided from Client.h, then only that client's
> storage will be cleared.  If you want to clear multiple client types (but
> not all), then you must call this method multiple times.

Re-phrased.

> 
> @@ +138,5 @@
> > +   * a no-op.
> > +   *
> > +   * @param aPrincipal
> > +   *        A principal for the origin whose storages are to be reset.
> > +   * @param aPersistenceType
> 
> Suggest propagating the changes from above to here.

Done.

> 
> ::: dom/webidl/Storage.webidl
> @@ +34,5 @@
> >    readonly attribute boolean isSessionOnly;
> >  };
> > +
> > +// Testing only.
> > +partial interface Storage {
> 
> Any reason not to mark the partial interface with the pref rather than each
> individual method?

I get this error:
WebIDL.WebIDLError: error: Unknown extended attribute Pref on partial interface

> 
> ::: gfx/ipc/PVsyncBridge.ipdl
> @@ +5,5 @@
> >  
> >  using class mozilla::TimeStamp from "mozilla/TimeStamp.h";
> >  using mozilla::layers::LayersId from "mozilla/layers/LayersTypes.h";
> >  
> > +include "mozilla/layers/LayersMessageUtils.h";
> 
> POSSIBLE REBASE ERROR: It's not clear why your changes would necessitate
> this change.

It used to be a problem, sometimes there are weird dependencies between ipdl files, but this somehow got fixed in the meantime so I removed it.

> 
> ::: gfx/thebes/gfxMacPlatformFontList.mm
> @@ +1891,5 @@
> >      CFMutableArrayRef urls =
> >          ::CFArrayCreateMutable(kCFAllocatorDefault, 0, &kCFTypeArrayCallBacks);
> >  
> > +    nsCOMPtr<nsIFile> file;
> > +    while (NS_SUCCEEDED(e->GetNextFile(getter_AddRefs(file))) && file) {
> 
> POSSIBLE REBASE ERROR.  (I put it in caps so it stands out!  If you meant to
> do this, maybe do it in a different bug.)

Yeah, sorry, it was a different patch. Removed.

> 
> ::: ipc/ipdl/sync-messages.ini
> @@ +920,5 @@
> >  description =
> >  [PBackgroundStorage::Preload]
> >  description =
> > +[PBackgroundLSDatabase::PBackgroundLSSnapshot]
> > +description =
> 
> Note that since bug 1340440 you'll need an IPC peer's review on these
> changes.  While it looks like convention is to not populate the description
> field here, I think it's appropriate to have comments in the IPDL file that
> document where/when the sync messages are used or references to comements
> elsewhere in the codebase.  Right now only PBackgroundLSSnapshot::Ping has
> such a comment.  (Which is a nice comment!)
> 
> I have in-flight comment suggestions if you want to wait on those.
> 

I added references here and I also added comments for remaining sync methods.
Attached patch Review fixes 1Splinter Review
Attachment #9027976 - Flags: review?(bugmail)
(In reply to Jan Varga [:janv] from comment #174)
> > ::: caps/ContentPrincipal.h
> > @@ +49,5 @@
> > >    extensions::WebExtensionPolicy* AddonPolicy();
> > >  
> > >    nsCOMPtr<nsIURI> mDomain;
> > >    nsCOMPtr<nsIURI> mCodebase;
> > > +  bool mStrictFileOriginPolicy;
> > 
> > If retained, please copy the general comment we have about the universal
> > file origin here (that was also duplicated in ContentPrincipal.cpp).
> > 
> > But it's not clear this should be retained.  The existing idiom seems to be
> > to check nsScriptSecurityManager::GetStrictFileOriginPolicy() all over the
> > place rather than latching the value.  If there's a necessary reason to
> > latch the value I think there should be a comment here explaining that and
> > the existing non-static callers (ContentPrincipal::MayLoadInternal) should
> > be updated.
> > 
> > A concern I have is that this needs to be serialized/deserialized as part of
> > PrincipalInfo to be truly safe.  This seems like a band-aid to work-around a
> > problem in tests that are partially broken.  It seems better to just fix the
> > tests.  If you do want to keep this, I think you may want to consult with
> > :baku on serializing the value and whether the change is necessary.
> > 
> > A comment might be:
> > """
> > The "security.fileuri.strict_origin_policy" preference controls whether all
> > files are considered to be same-origin (false) or not.  This value is
> > exposed via nsScriptSecurityManager::GetStrictFileOriginPolicy() and is
> > allowed to change at runtime.  In order to guarantee consistent behavior for
> > the holder of the principal for subsystems like QuotaManager in the face of
> > a pref change, it is necessary to latch this value when the principal is
> > created.
> > 
> > SOMETHING ABOUT WHY this doesn't get serialized/deserialized in
> > PrincipalInfo.
> > """
> 
> See comment 74, comment 75 and comment 76.
> Unfortunatelly the try link doesn't work anymore, but I'll test in on try
> again if we still need to latch the value.

I think it still needs to be latched, take a look here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=bbfd3e2e26a744494f7ee7522d10558f4f0991cb&selectedJob=214186615
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=214186615&repo=try&lineNumber=39299
the failing test is: filter-extref-differentOrigin-01.svg
It has something to do with document groups.

also see:
https://searchfox.org/mozilla-central/source/layout/reftests/svg/reftest.list#205

So document was already created with a doc group before the pref was set.
Then GetDocGroup was called and it found a mismatch because GetBaseDomain returned a different result.

I don't know, maybe MayLoadInternal should use the latched value too (and be serialized too) ?

Who should decide this ?
Blocks: 1510410
(In reply to Andrew Sutherland [:asuth] from comment #163)
> ::: modules/libpref/init/all.js
> @@ +1248,5 @@
> >  pref("dom.disable_open_click_delay", 1000);
> >  pref("dom.serviceWorkers.disable_open_click_delay", 1000);
> >  
> >  pref("dom.storage.enabled", true);
> > +#ifdef NIGHTLY_BUILD
> 
> Please add a comment here that explains what this pref does and links to the
> bug to pref on by default everywhere.

Done.

> 
> ::: toolkit/components/cleardata/ClearDataService.js
> @@ +292,5 @@
> >  const QuotaCleaner = {
> >    deleteByPrincipal(aPrincipal) {
> > +    if (!Services.lsm.nextGenLocalStorageEnabled) {
> > +      // localStorage
> > +      Services.obs.notifyObservers(null, "browser:purge-domain-data",
> 
> This block wants a comment along the lines of the one that wants to exist in
> `test_removeDataFromDomain.js`.  A short simple option might be:
> 
> The legacy LocalStorage implementation that will eventually be removed
> depends on this observer notification to clear by principal.  Only generate
> it if we're using the legacy implementation.

Added.

> 
> @@ +317,5 @@
> >  
> >    deleteByHost(aHost, aOriginAttributes) {
> > +    if (!Services.lsm.nextGenLocalStorageEnabled) {
> > +      // localStorage
> > +      Services.obs.notifyObservers(null, "browser:purge-domain-data", aHost);
> 
> same deal with a comment here.

Added.

> 
> @@ +338,5 @@
> >                                       .createCodebasePrincipal(httpsURI, aOriginAttributes);
> > +        let promises = [];
> > +        promises.push(new Promise(aResolve => {
> > +          let req = Services.qms.clearStoragesForPrincipal(httpPrincipal, null, null, true);
> > +          req.callback = () => { aResolve(); };
> 
> We need to be rejecting or setting "exceptionThrown" to true in all three of
> these clearing cases in event of an error.  If we somehow can't reject or
> these errors are somehow benign, there needs to be comments here and/or
> references to bugs to address the situation.  NB: I do realize the existing
> code was ignoring the possibility of errors.

Ok, I added rejecting everywhere.

> 
> @@ +346,5 @@
> > +          req.callback = () => { aResolve(); };
> > +        }));
> > +        if (Services.lsm.nextGenLocalStorageEnabled) {
> > +          promises.push(new Promise(aResolve => {
> > +            Services.qms.getUsage(aRequest => {
> 
> Please add a comment that explains the mechanism here.  For example:
> deleteByHost has the semantics that "foo.example.com" should be wiped if we
> are provided an aHost of "example.com".  QuotaManager doesn't have a way to
> directly do this, so we use getUsage() to get a list of all of the origins
> known to QuotaManager and then check whether the domain is a sub-domain of
> aHost.

Added.

> 
> @@ +347,5 @@
> > +        }));
> > +        if (Services.lsm.nextGenLocalStorageEnabled) {
> > +          promises.push(new Promise(aResolve => {
> > +            Services.qms.getUsage(aRequest => {
> > +              if (aRequest.resultCode != Cr.NS_OK) {
> 
> Please add a comment explaining why this would happen and what this means in
> terms of clearing user data in addition to the rejection mentioned above. 
> It might also make sense to merge this comments with the above in a block
> comment.  For example:
> 
> In the event of a failure, QuotaManager failed to initialize.  QuotaManager
> reports such failures via telemetry and the profile problem should be
> addressed via that mechanism.  Additionally, this method will reject to
> propagate the error upwards.

I added a simple comment above. I didn't mention quota manager storage init
failure nor telemetry, because I think getUsage() and clearStoragesForPrincipal()
don't need to init storage.

> 
> @@ +356,5 @@
> > +              let promises = [];
> > +              for (let item of aRequest.result) {
> > +                let principal = Services.scriptSecurityManager.createCodebasePrincipalFromOrigin(item.origin);
> > +                let originNoSuffix = principal.originNoSuffix;
> > +                if (eTLDService.hasRootDomain(originNoSuffix, aHost)) {
> 
> originNoSuffix will look like "https://foo.example.com" while aHost will
> look like "example.com".  Given how HasRootDomain is written, this currently
> works, but I think you should be using `principal.URI.host` instead.

Ok, changed to principal.URI.host.

> 
> @@ +359,5 @@
> > +                let originNoSuffix = principal.originNoSuffix;
> > +                if (eTLDService.hasRootDomain(originNoSuffix, aHost)) {
> > +                  promises.push(new Promise(aResolve => {
> > +                    let clearRequest = Services.qms.clearStoragesForPrincipal(principal, null, "ls");
> > +                    clearRequest.callback = () => { aResolve(); };
> 
> The request has a `result` so you should be either using it or commenting
> why you're not using it.  Again for the above comments, this probably wants
> to potentially reject.

Ok.

> 
> @@ +427,1 @@
> >                    req.callback = () => { aResolve(); };
> 
> Error handling or comments here as well, please.

Ok.

> 
> ::: toolkit/components/extensions/Extension.jsm
> @@ +253,5 @@
> >        Services.qms.clearStoragesForPrincipal(storagePrincipal);
> >  
> >        ExtensionStorageIDB.clearMigratedExtensionPref(addon.id);
> >  
> > +      if (!Services.lsm.nextGenLocalStorageEnabled) {
> 
> This should have a comment explaining that the qms clear above is taking
> care of this.

Yeah, added a comment.

> 
> ::: toolkit/forgetaboutsite/test/unit/test_removeDataFromDomain.js
> @@ +313,5 @@
> >    check_permission_exists(TEST_URI, false);
> >  }
> >  
> >  function waitForPurgeNotification() {
> > +  if (Services.lsm.nextGenLocalStorageEnabled) {
> 
> As I dig into the history of the existence of this method[1], it seems like
> this method dates from when this test was driven by a generator and the
> method may not need to exist anymore.  Specifically, it seems like the code
> wanted to make sure that SiteDataManager got to notifying
> "browser:purge-domain-data" but that directly invoking the generator at that
> point would cause the test to resume running and this would happen before
> the LocalStorage logic got a chance to run, hence the explicit turn of the
> event loop.
> 
> Given all the refactorings and cleanups that have happened to introduce
> nsIClearDataService and make it properly async, I think you could probably
> remove waitForPurgeNotification() and calls to it and everything would still
> work.  That's a little bit of scope creep though, so if you don't want to
> try that maybe add a block comment to this method like:
> 
> This method probably can be removed.  It stems from the semantics of
> direct-driven generators and building on top of an ordering hack involving
> synchronous observer notifications.  The hack still works, it's just not
> needed, and in the case of next-generation LocalStorage, the observer
> notification will never be generated, so we just resolve() directly.  Please
> clean this code up, kind reader!
> 
> 1: https://hg.mozilla.org/integration/mozilla-inbound/rev/acf32763c61d#l5.66

Yeah, removed.
Attached patch Review fixes 2Splinter Review
Attachment #9028099 - Flags: review?(bugmail)
Attachment #9028099 - Attachment is patch: true
Attachment #9028099 - Attachment mime type: application/octet-stream → text/plain
(In reply to Jan Varga [:janv] from comment #174)
> I get this error:
> WebIDL.WebIDLError: error: Unknown extended attribute Pref on partial
> interface

Hm, maybe the spec allows it but our binding generator doesn't.  Okay.
(In reply to Jan Varga [:janv] from comment #176)
> (In reply to Jan Varga [:janv] from comment #174)
> > > ::: caps/ContentPrincipal.h
> > > @@ +49,5 @@
> > > >    extensions::WebExtensionPolicy* AddonPolicy();
> > > >  
> > > >    nsCOMPtr<nsIURI> mDomain;
> > > >    nsCOMPtr<nsIURI> mCodebase;
> > > > +  bool mStrictFileOriginPolicy;
> > > 
> > > If retained, please copy the general comment we have about the universal
> > > file origin here (that was also duplicated in ContentPrincipal.cpp).
> > > 
> > > But it's not clear this should be retained.  The existing idiom seems to be
> > > to check nsScriptSecurityManager::GetStrictFileOriginPolicy() all over the
> > > place rather than latching the value.  If there's a necessary reason to
> > > latch the value I think there should be a comment here explaining that and
> > > the existing non-static callers (ContentPrincipal::MayLoadInternal) should
> > > be updated.
> > > 
> > > A concern I have is that this needs to be serialized/deserialized as part of
> > > PrincipalInfo to be truly safe.  This seems like a band-aid to work-around a
> > > problem in tests that are partially broken.  It seems better to just fix the
> > > tests.  If you do want to keep this, I think you may want to consult with
> > > :baku on serializing the value and whether the change is necessary.
> > > 
> > > A comment might be:
> > > """
> > > The "security.fileuri.strict_origin_policy" preference controls whether all
> > > files are considered to be same-origin (false) or not.  This value is
> > > exposed via nsScriptSecurityManager::GetStrictFileOriginPolicy() and is
> > > allowed to change at runtime.  In order to guarantee consistent behavior for
> > > the holder of the principal for subsystems like QuotaManager in the face of
> > > a pref change, it is necessary to latch this value when the principal is
> > > created.
> > > 
> > > SOMETHING ABOUT WHY this doesn't get serialized/deserialized in
> > > PrincipalInfo.
> > > """
> > 
> > See comment 74, comment 75 and comment 76.
> > Unfortunatelly the try link doesn't work anymore, but I'll test in on try
> > again if we still need to latch the value.
> 
> I think it still needs to be latched, take a look here:
> https://treeherder.mozilla.org/#/
> jobs?repo=try&revision=bbfd3e2e26a744494f7ee7522d10558f4f0991cb&selectedJob=2
> 14186615
> https://treeherder.mozilla.org/logviewer.html#/
> jobs?job_id=214186615&repo=try&lineNumber=39299
> the failing test is: filter-extref-differentOrigin-01.svg
> It has something to do with document groups.
> 
> also see:
> https://searchfox.org/mozilla-central/source/layout/reftests/svg/reftest.
> list#205
> 
> So document was already created with a doc group before the pref was set.
> Then GetDocGroup was called and it found a mismatch because GetBaseDomain
> returned a different result.
> 
> I don't know, maybe MayLoadInternal should use the latched value too (and be
> serialized too) ?
> 
> Who should decide this ?

So, there's a reference/comment to this test located at https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/layout/reftests/svg/reftest.list#205 that reads:
"""
# This pref is normally on by default, but we turn it off in reftest runs to
# disable an unnecessary security-check. This reftest is actually testing that
# the security check works, though, so it needs the pref to be turned on:
fails-if(Android) pref(security.fileuri.strict_origin_policy,true) == filter-extref-differentOrigin-01.svg pass.svg # Bug 695385
"""

Note that this line seems to be also causing the reftest runner logic at https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/layout/tools/reftest/reftest.jsm#704 to trigger the preference change, which is where we get the line "REFTEST INFO | SET PREFERENCE pref(security.fileuri.strict_origin_policy,true)" in the test and which presumably is directly the result of the "Assertion failure: mDocGroup->MatchesKey(docGroupKey), at /builds/worker/workspace/build/src/dom/base/nsDocument.cpp:3272" due to such a fundamental invariant being changed at runtime without flushing the bfcache, etc.

Bug 695385 is explicitly about creating a mochitest to replace that test.  Because indeed, it does seem sketchy to be using file URI semantics to test cross-origin logic.  Moreso, I do want to emphasize that changing that pref at runtime without good hygiene has become increasingly dubious through the years.

And there's a comment in the test at https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/layout/reftests/svg/filter-extref-differentOrigin-01.svg#8 that similarly suggests this is a sketchy thing going on.
"""
  <!-- In case we actually _can_ access the external resource (e.g. on Android
       where reftests are served as http:// instead of file://), we include
       this <use> element to be sure the resource loads before onload so that
       we'll fail reliably. -->
"""

It looks like the hand-waving for why the pref gets turned off is found at https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/testing/profiles/reftest/user.js#88
"""
// Checking whether two files are the same is slow on Windows.
// Setting this pref makes tests run much faster there. Reftests also
// rely on this to load downloadable fonts (which are restricted to same
// origin policy by default) from outside their directory.
user_pref("security.fileuri.strict_origin_policy", false);
"""

:dholbert seems to be a good person to ping about the test test/infra given his creation of bug 695385 (8 years ago ;) and that his phonebook entry still has the word "layout" in it :).  dholbert, can you weigh in on the easiest way to avoid having this massively hacky pref-flip?  It looks like web-platform-tests implements reftests and doesn't depend on file URL serving.  Perhaps we could move the test into https://searchfox.org/mozilla-central/source/testing/web-platform/mozilla/tests where we store Mozilla tests that aren't intended for upstreaming and could use an explicitly cross-origin domain instead of using "../" and depending on the wacky file URI semantics to make that be cross-origin?

In terms of the implementation choice, :baku seems like a good person to consult about this.  :baku, per the above, I think my stance continues to be that we should not make the implementation change to latch the state of the strict file URI policy to the origin because it's a super-big-testing-only hack.  Hopefully you agree, and maybe you also have some suggestions about how to help make the reftest's hack no longer a problem.
Flags: needinfo?(dholbert)
Flags: needinfo?(amarchesini)
Attachment #9027976 - Flags: review?(bugmail) → review+
Attachment #9028099 - Flags: review?(bugmail) → review+
(In reply to Andrew Sutherland [:asuth] from comment #8)
> I'm updating the subject to make this easier to find, and I want to give a
> dump of my current understanding of :janv's plan, as I think the design has
> a lot of wins, but there are trade-offs involved and we want any
> discussion/debate to happen sooner rather than later.

The actual implementation differs a bit from the original plan, so I'm going
to provide some updates for it.

> 
> ## Plan
> My mental model of the overhaul plan is (Jan, please correct me if I'm
> wrong!):
> - All (canonical) LocalStorageCache data state is held in the parent process.

Right, and the corresponding class is called `Datastore` in the parent.

> - We have actors per-LocalStorageCache which are per-principal (that is,
> origin+attributes) instead of the current "one actor per process that is
> conceptually just a remoted database connection" so that we can scope events
> properly.

Right, the corresponding class is `Database` in the parent.
However, we create only one database actor per content process for performance
reasons.

> - All localStorage reads happen via sync IPC to PBackground.  (Or, if
> necessary, a new PLocalStorageIsDumbAndNeedsItsOwnThread.)

We found out that performance is bad with this approach (each sync IPC call
needs to dispatch a runnable to an internal IPC thread, the calling thread is
then blocked using a monitor, something similar happens on the other side and
there's some other overhead; all in all, the notable sync IPC overhead can't be
avoided with current IPC implementation).
We decided to implement a snapshotting mechanism that greatly improves performance
and also correctness since run-to-completion semantics are honored (if JS calls
LS API, the first call creates a snapshot which captures datastore state and that
state is then used in content until we get to a stable point). We've done many
optimizations to improve snapshotting to make it really fast (for example, saving
modified values lazily, prefilling initial snapshot with items, reusing snapshots
if they are not "dirty", checkpointing changes in batches, pre-allocating quota
usage, etc.)
Regarding the special thread in the parent, we haven't found a reason to create a
new PBackground like thread in the parent.

> - localStorage writes probably also happen via sync IPC[1].

When we get to a stable state, changes are asynchronously checkpointed to the
parent. The only thing that needs to be (sometimes) synchronous is quota
checking. However, due to quota usage pre-allocation, this happens only sometimes
when pre-allocated quota usage can't fulfill given write request.

> - Events happen via async IPC if there are any "storage" event listeners in
> a content process.  The subscription should happen synchronously for maximum
> correctness, but it's fine to subscribe async under the spec.  This means
> that a change can be perceived by reading localStorage for which a "storage"
> event has not yet been received.  This is fine and consistent with how
> LocalStorage already works in Chromium/Blink (and probably webkit too?). 
> It's also how we've worked for a while.

Yes, storage events happen via async IPC if there are any "storage" event
listeners in a content process. There's a special per origin `Observer` actor
independent from the `Database` actor.
The subscription is synchronous.

> - We use QuotaManager, but the origin never gets more than 5MiB of
> LocalStorage space (+slack) no matter what.

Yes, we use QuotaManager which was quite challenging given LS is synchronous
API and QuotaManager is all async.
We don't have "slack" quota, but we do have the usage pre-allocation. So we never
allow more than 5 MB (or quota manager's group limit).
The usage is still the logical usage (key.length + value.length), not physical
size of sqlite databases on disk.
LS writes can also trigger origin eviction.

> 
> 1: At least some writes could be optimized to be async if we're willing to
> allow the origin to exceed quota.  For example, Chromium allows quota to be
> exceeded by 100k because of multi-process races.  Our implementation could
> similarly allow each process some "slack" quota that permits async write
> messages up to that quota (and assuming no space would be freed by
> overwrites) as long as we're (sufficiently) under quota.  If a write would
> exceed the slack size, a sync write is performed and resets the
> outstandingSlackBytes to 0.  In the best-case scenario,
> outstandingSlackBytes never exceeds the threshold, and is continually
> re-zeroed by async IPC messages that confirm the writes and current quota
> usage for the given cache.

We decided for the quota usage pre-allocation.

> 
> ## Upsides
> We get the following wins out of this implementation strategy versus our
> current strategy:
> - Reduced memory usage.  Right now each browser process that has a
> window/iframe loaded for an origin that uses LocalStorage will carry the
> memory burden of the origin's entire LocalStorage usage in every process
> with the window/iframe loaded.  For sites that use localStorage and are very
> popular or are embedded as 3rd-party iframes (and assuming we don't
> double-key them), this could be a lot of memory[2].

Yes, and it also fixes many long-standing problems with cache synchronization
across content processes.

> - Globally consistent, sequential view of localStorage across all processes
> (unless we make any caching optimizations).  The HTML spec provides us
> basically infinite wiggle room when multiple processes are involved, but I
> know of at least one instance of LocalStorage being used by a site as a
> mutex between multiple windows.  There are arguably no downsides 

Well, I would still call it globally consistent, given that it would be weird
for a JS function that does multiple get/set operations to see changes from
other processes.

> - Improved page load performance where the origin has a ton of data stored
> in localStorage but only needs a small amount of it near startup.  If the
> origin has 5MiB stored and it uses localStorage at all, all 5 MiB needs to
> be loaded from disk and received by the thread responsible for storage.  And
> if we go synchronous (in the current implementation), that could mean a
> simple boolean check has to wait on 5 megs of data.  Based on our current DB
> strategy, the load is still unavoidable, but we avoid having to ship it over
> IPC or process that on the content process main thread.

Yes, and we trigger pre-loading sooner and directly in the parent process, in
ContentParent::AboutToLoadHttpFtpWyciwygDocumentForChild

> 
> 2: Note that this regressed in bug 1285898 out of necessity.  Previously,
> we'd drop the storage if a page didn't actually use LocalStorage within ~20
> seconds.  This could be added back without the overhaul by addressing bug
> 1350071.

Yes, we added it back.

I would add some other upsides:
- datastore preloading doesn't synchronously block the PBackground thread
- it works correctly with multiple private windows for the same origin
- quota is shared with IndexedDB and DOM Cache
- unified usage/storage reporting in preferences UI
- less prone to database corruption (if it happens, it only affects given origin)
- data is evicted along with IndexedDB and DOM Cache
- unified API for origin clearing
- navigator.storage.estimate() also includes LS
- a good base for reworking sessionStorage to persist data for session restore
  (needed for fission)

> ## Downsides
> - Synchronous IPC is synchronous IPC, and this approach means more of it,
> even if it's in smaller bite-sized pieces.  Although we can mitigate by
> letting Quantom DOM context-switch to other TabGroups, it's not a win if
> we're perceptibly stalling a foreground TabGroup the user cares about.

Yeah, but we mitigated this by implementing the snapshotting.

> 
> ## Potential Mitigations
> - Use some kind of clever multi-process shared-memory map to make all reads
> except the preload non-blocking, and all writes async.  This would increase
> complexity and lose the globally consistent sequence.

I investigated using a shared-memory for the cache, but it turned out to be
quite hard and complex.

> - Create a second mode of operation for localStorage-heavy origins.  Once an
> activity threshold is crossed,  cache the entirety of the state in the
> content process and rely on the "storage" events (which must be subscribed
> to even if not delivered to content pages) to update the local cache.  If
> there's heavy write-load, consider coalescing the writes locally by tracking
> dirty keys and flushing on a timer, much like we do in the database thread
> already.

Yeah, the snapshotting is like a micro-cache, we can prefill it with all the
values if LS is relatively small and then operations in content are really fast
(no need for sync IPC). The cache doesn't need to be synchronized, because it's
not worth it, we just throw it away and create a new one when needed.

There are other downsides, some of them were already mitigated:
1. Conversion of old database to per origin databases.
This was mitigated by doing the conversion on demand/lazily.

2. Initial quota manager's storage directory sweep can take longer.
This hasn't been mitigated yet, but according to our testing and testing done by
QA, there are no signs of big performance problems so far. However, this is
something we should monitor after landing/flipping the pref. We have a telemetry
for that.

3. Due to database shadowing, profiles will contain redundant data.
The shadowing is needed, so we can enable/disable new implementation anytime
without losing any data (even if a profile is used with older FF versions).
However, this is just temporary until we get rid of the old implementation.
Ok, I'm now going to incorporate review fixes into corresponding patches.
After that, I'll request reviews for Storage.webidl and sync-messages.ini.
Hopefully, we solve the strict file origin policy soon, so I can land it (preffed-off) this week.
Comment on attachment 9028262 [details] [diff] [review]
Final folded patch

Blake, could you please review Storage.webidl changes ?

This is a folded patch, but we will be landing individual patches, you can find them here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d634ea50e9c0af40166fdce5aad783cce032b217

Storage.webidl changes we made gradually in these patches:
Part 18 https://hg.mozilla.org/try/rev/174d6c04d756
Part 31 https://hg.mozilla.org/try/rev/f6860b5fcf78
Part 53 https://hg.mozilla.org/try/rev/e8c6c1ba618f
Attachment #9028262 - Flags: review?(mrbkap)
Comment on attachment 9028262 [details] [diff] [review]
Final folded patch

Andrew, could you please review sync-messages.ini changes ?

This is a folded patch, but we will be landing individual patches, you can find them here:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=d634ea50e9c0af40166fdce5aad783cce032b217

sync-messages.ini changes we made gradually in these patches:
Part 3 https://hg.mozilla.org/try/rev/8363a5e52f54
Part 29 https://hg.mozilla.org/try/rev/23b6e5feb512
Part 31 https://hg.mozilla.org/try/rev/f6860b5fcf78
Part 53 https://hg.mozilla.org/try/rev/e8c6c1ba618f
Attachment #9028262 - Flags: review?(continuation)
I think we should try to remove that pref completely.
We are very inconsistent in how NS_SecurityCompareURIs() is used:

Sometimes it's true (both are correct because they have a previous check or because they are dealing with HTTP URLs):
https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/dom/security/nsCSPContext.cpp#1500
https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/netwerk/protocol/http/nsHttpChannelAuthProvider.cpp#1015

Sometimes it's false:
https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/uriloader/prefetch/OfflineCacheUpdateParent.cpp#114

Sometimes we cache it:
https://searchfox.org/mozilla-central/rev/0859e6b10fb901875c80de8f8fc33cbb77b2505e/uriloader/prefetch/nsOfflineCacheUpdate.cpp#1127-1133

About storing the pref in the principal, I don't think it's the right way to go. We need to find a policy about comparing 2 principals with 2 different pref values and that would be like an hack on top of an hack.

I prefer to see the tests fixed.
Flags: needinfo?(amarchesini)
(In reply to Jan Varga [:janv] from comment #72)
> Created attachment 8957324 [details] [diff] [review]
> Part 21: Base domain needs to be handled too if strict file origin policy is
> not in effect
> 
> Bobby, the universal file origin was introduced in bug 1340710.
> The base domain is currently not handled if strict file origin policy is not
> in effect which confuses quota manager.
> I noticed this problem when I was testing new local storage implementation,
> but I guess it could already cause (hidden) problems with IndexedDB too.

I forgot to mention that it's not just the ref test which is failing, there's a problem
with quota manager too. I forgot details, but I'm running additional tests on try to find
out what the problem was.
I'm waiting for more data from try server, but it seems it's just the test that needs the latched value.
Can we just disable the test for now ?
> Can we just disable the test for now ?

I suggest to disable the test. Let's file a bug to re-enable the test after bug 695385.
Comment on attachment 9028262 [details] [diff] [review]
Final folded patch

Review of attachment 9028262 [details] [diff] [review]:
-----------------------------------------------------------------

r=me for the sync-messages.ini change
Attachment #9028262 - Flags: review?(continuation) → review+
Attachment #9028262 - Flags: review?(mrbkap) → review+
Daniel, can you review the test disabling ?
This is the last thing that blocks LSNG from landing. I would like to land it today or tomorrow.
Attachment #9028591 - Flags: review?(dholbert)
This shows up in my latest try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=019900fadd4ed2f12dd0566f8fc0ce12cf32395a

I tried to fix it, but it's a mess, I would rather disable it. It's disabled on linux and windows anyway.
Attachment #9028684 - Flags: review?(bugmail)
Attachment #9028684 - Attachment is patch: true
Attachment #9028684 - Attachment mime type: application/octet-stream → text/plain
Attachment #9028684 - Flags: review?(bugmail) → review+
This also shows up in my latest try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=019900fadd4ed2f12dd0566f8fc0ce12cf32395a

The test is not very stable, there are many intermittent failures already, bug 1436395, so the problem is even worse in verify mode.
Attachment #9028699 - Flags: review?(bugmail)
Comment on attachment 9028699 [details] [diff] [review]
Disable browser/components/preferences/in-content/tests/siteData/browser_siteData.js in verify mode

Review of attachment 9028699 [details] [diff] [review]:
-----------------------------------------------------------------

Restating: The test is flakey and already has an intermittent that fails in this manner and is actively being looked into on bug 1415037 by :jaws so this is being handled and disabling "verify" runs is reasonable.  I've also cc'ed myself on that bug to help ensure the test is made reliable, because it is an important test.
Attachment #9028699 - Flags: review?(bugmail) → review+
Comment on attachment 9028591 [details] [diff] [review]
Part 21: Base domain needs to be handled too if strict file origin policy is not in effect

Review of attachment 9028591 [details] [diff] [review]:
-----------------------------------------------------------------

::: layout/reftests/svg/reftest.list
@@ +203,4 @@
>  # This pref is normally on by default, but we turn it off in reftest runs to
>  # disable an unnecessary security-check. This reftest is actually testing that
>  # the security check works, though, so it needs the pref to be turned on:
> +#fails-if(Android) pref(security.fileuri.strict_origin_policy,true) == filter-extref-differentOrigin-01.svg pass.svg # Bug 695385

r=me on the reftest tweak -- though rather than commenting it out, please replace
  fails-if(Android)
with
  "skip"

So it should look like:
skip pref(...etc)

I'll take a look today at rewriting this test in bug 695385.
Attachment #9028591 - Flags: review?(dholbert) → review+
(In reply to Daniel Holbert [:dholbert] from comment #196)
> r=me on the reftest tweak -- though rather than commenting it out, please
> replace
>   fails-if(Android)
> with
>   "skip"
> 
> So it should look like:
> skip pref(...etc)

Ok, changed.
Final try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=39da6866129eac6d9b42b2c9093da98e93712ea2

I'm going to land on mozilla inbound in a moment.
Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/c58b7f9f8773
Part 1: Allow BackgroundChild::GetOrCreateForCurrentThread() to use a custom main event target; r=mccr8,asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb1fc7fd2fd2
Part 2: Add IsOnDOMFileThread() and AssertIsOnDOMFileThread() generic helper methods; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/2a6e7e64cec1
Part 3: New basic (memory only) implementation of LocalStorage; r=asuth,mccr8
https://hg.mozilla.org/integration/mozilla-inbound/rev/2aaa8d6bbaf1
Part 4: Basic integration with QuotaManager; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/25c5ff3cec8f
Part 5: More integration with QuotaManager; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/0b0a03cf7b64
Part 6: Fix a dead lock in the single process case; r=asuth,janv
https://hg.mozilla.org/integration/mozilla-inbound/rev/d5f293542508
Part 7: Teach QuotaManager's OriginParser to handle file://UNIVERSAL_FILE_URI_ORIGIN; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/1a6c4f6a6b5c
Part 8: Persist datastores to disk; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/30bf6870616b
Part 9: Support for private browsing; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/75c28b78f8ee
Part 10: Support for storage events; r=asuth,janv
https://hg.mozilla.org/integration/mozilla-inbound/rev/4e938b6813e3
Part 11: Enable tests for session only mode (but only for the old local storage implementation); r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/a4324eecfb06
Part 12: Honor the storage preference and cookie settings in all LocalStorage API methods; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/c63189db48cb
Part 13: Preparation for quota checks; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/20606f9de28f
Part 14: Enhance clearStoragesForPrincipal() to support clearing of storages for specific quota client; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb1594f32fae
Part 15: Fix clearLocalStorage() in browser extensions; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/22ede10fb76d
Part 16: Adjust ClearDataService for new local storage implementation; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/03b477c00c4f
Part 17: Fix a test failing in --verify mode; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/61ce2d795e8e
Part 18: Verify that data is persisted on disk; r=asuth,mrbkap
https://hg.mozilla.org/integration/mozilla-inbound/rev/5714205a7bf9
Part 19: Implement a helper method for datastore preloading verification; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/615640a58a97
Part 20: Add checks for the 5 MB origin limit; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/bb3a3cc424cd
Part 21: Base domain needs to be handled too if strict file origin policy is not in effect; r=bholley,asuth,dholbert
https://hg.mozilla.org/integration/mozilla-inbound/rev/076cc59fa793
Part 22: Add support for preloading of datastores; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/bd8fd9a22dc0
Part 23: Fix GetOrCreateForCurrentThread() when the runnable for calling SendInitBackground() on the main thread was already dispatched, but the main thread then entered a nested event loop; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/c0d40572c29b
Part 24: A new exclusive directory lock shouldn't invalidate existing internal directory locks; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/8c37e877e231
Part 25: Add checks for the group and global limit; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/2620df4a91da
Part 26: Implement a lazy data migration from old webappsstore.sqlite; r=asuth,janv
https://hg.mozilla.org/integration/mozilla-inbound/rev/e75a175ddf88
Part 27: Share database actors; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/17a4f3ac425a
Part 28: Add more QuotaClient::IsShuttingDownOnBackgroundThread() and MayProceed() checks; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/70ba8b2410f8
Part 29: Implement implicit snapshotting of databases; r=asuth,mccr8
https://hg.mozilla.org/integration/mozilla-inbound/rev/e2b5bee9812c
Part 30: Preserve key order when sending items to a content process; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/bc288ab2655c
Part 31: Support for lazy loading of items; r=asuth,mrbkap,mccr8
https://hg.mozilla.org/integration/mozilla-inbound/rev/1309fe77cfaa
Part 32: Add a test for snapshotting verification in multi-e10s setup; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/eba8447d3935
Part 33: Restrict localStorage from being available on some pages; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/3adfef6668aa
Part 34: Queue database operations until update batch ends; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/6214aafc061f
Part 35: Implement database shadowing; r=asuth,janv
https://hg.mozilla.org/integration/mozilla-inbound/rev/d5f866efde44
Part 36: Allow snapshot initialization to a specific load state; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/52e5be69837d
Part 37: Always preallocate quota when initializing a snapshot; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/ef64949fc1aa
Part 38: Cache items in an array; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/4f699604c960
Part 39: Reduce number of hash lookups; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/f2cdc8e4ef39
Part 40: Increase initial snapshot prefill to 16KB; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/270fc081f01f
Part 41: Implement QuotaClient::AbortOperationsForProcess; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/398f80b485a9
Part 42: Implement snapshot reusing; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/5dbbe4015d16
Part 43: Coalesce database operations before they are applied to disk; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/c29b8b8fbf4d
Part 44: Switch Connection to use WriteOptimizer too; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/fc0bce55c20d
Part 45: Delay flushing to disk using a timer; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/6710732a8d07
Part 46: Add a pref for database shadowing; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/b9badcc1709d
Part 47: Add AboutToClearOrigins() method to the quota client interface; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/f0721a859c74
Part 48: Add ParseOrigin() method to the quota manager; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/66d175df8106
Part 49: Add clearStoragesForPattern() method to the quota manager service; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/351e6a4eed9a
Part 50: Add support for clearing of the archive and shadow database; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/e21dc8a8b5b9
Part 51: Add tests for archive and shadow database clearing; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/d4212b87659b
Part 52: Rework tests to use async functions; r=asuth
https://hg.mozilla.org/integration/mozilla-inbound/rev/fd65b14ab7e8
Part 53: Review code comments; r=janv,mrbkap,mccr8
https://hg.mozilla.org/integration/mozilla-inbound/rev/11c709d6c6d9
Part 54: Disable LSNG by default; r=asuth
(In reply to Jan Varga [:janv] from comment #181)
> > - We use QuotaManager, but the origin never gets more than 5MiB of
> > LocalStorage space (+slack) no matter what.
> 
> Yes, we use QuotaManager which was quite challenging given LS is synchronous
> API and QuotaManager is all async.
> We don't have "slack" quota, but we do have the usage pre-allocation. So we
> never
> allow more than 5 MB (or quota manager's group limit).
> The usage is still the logical usage (key.length + value.length), not
> physical
> size of sqlite databases on disk.
> LS writes can also trigger origin eviction.

Forgot to mention that the 5MB limit is not used within eTLD+1 domains
anymore. It's been replaced by quota manager's group limit which uses
eTLD+1 too, but it's much bigger and dynamically set according to free
disk space. The maximum is currently 2GB.
It's the same thing that is used for IndexedDB and DOM Cache.
https://hg.mozilla.org/mozilla-central/rev/c58b7f9f8773
https://hg.mozilla.org/mozilla-central/rev/bb1fc7fd2fd2
https://hg.mozilla.org/mozilla-central/rev/2a6e7e64cec1
https://hg.mozilla.org/mozilla-central/rev/2aaa8d6bbaf1
https://hg.mozilla.org/mozilla-central/rev/25c5ff3cec8f
https://hg.mozilla.org/mozilla-central/rev/0b0a03cf7b64
https://hg.mozilla.org/mozilla-central/rev/d5f293542508
https://hg.mozilla.org/mozilla-central/rev/1a6c4f6a6b5c
https://hg.mozilla.org/mozilla-central/rev/30bf6870616b
https://hg.mozilla.org/mozilla-central/rev/75c28b78f8ee
https://hg.mozilla.org/mozilla-central/rev/4e938b6813e3
https://hg.mozilla.org/mozilla-central/rev/a4324eecfb06
https://hg.mozilla.org/mozilla-central/rev/c63189db48cb
https://hg.mozilla.org/mozilla-central/rev/20606f9de28f
https://hg.mozilla.org/mozilla-central/rev/bb1594f32fae
https://hg.mozilla.org/mozilla-central/rev/22ede10fb76d
https://hg.mozilla.org/mozilla-central/rev/03b477c00c4f
https://hg.mozilla.org/mozilla-central/rev/61ce2d795e8e
https://hg.mozilla.org/mozilla-central/rev/5714205a7bf9
https://hg.mozilla.org/mozilla-central/rev/615640a58a97
https://hg.mozilla.org/mozilla-central/rev/bb3a3cc424cd
https://hg.mozilla.org/mozilla-central/rev/076cc59fa793
https://hg.mozilla.org/mozilla-central/rev/bd8fd9a22dc0
https://hg.mozilla.org/mozilla-central/rev/c0d40572c29b
https://hg.mozilla.org/mozilla-central/rev/8c37e877e231
https://hg.mozilla.org/mozilla-central/rev/2620df4a91da
https://hg.mozilla.org/mozilla-central/rev/e75a175ddf88
https://hg.mozilla.org/mozilla-central/rev/17a4f3ac425a
https://hg.mozilla.org/mozilla-central/rev/70ba8b2410f8
https://hg.mozilla.org/mozilla-central/rev/e2b5bee9812c
https://hg.mozilla.org/mozilla-central/rev/bc288ab2655c
https://hg.mozilla.org/mozilla-central/rev/1309fe77cfaa
https://hg.mozilla.org/mozilla-central/rev/eba8447d3935
https://hg.mozilla.org/mozilla-central/rev/3adfef6668aa
https://hg.mozilla.org/mozilla-central/rev/6214aafc061f
https://hg.mozilla.org/mozilla-central/rev/d5f866efde44
https://hg.mozilla.org/mozilla-central/rev/52e5be69837d
https://hg.mozilla.org/mozilla-central/rev/ef64949fc1aa
https://hg.mozilla.org/mozilla-central/rev/4f699604c960
https://hg.mozilla.org/mozilla-central/rev/f2cdc8e4ef39
https://hg.mozilla.org/mozilla-central/rev/270fc081f01f
https://hg.mozilla.org/mozilla-central/rev/398f80b485a9
https://hg.mozilla.org/mozilla-central/rev/5dbbe4015d16
https://hg.mozilla.org/mozilla-central/rev/c29b8b8fbf4d
https://hg.mozilla.org/mozilla-central/rev/fc0bce55c20d
https://hg.mozilla.org/mozilla-central/rev/6710732a8d07
https://hg.mozilla.org/mozilla-central/rev/b9badcc1709d
https://hg.mozilla.org/mozilla-central/rev/f0721a859c74
https://hg.mozilla.org/mozilla-central/rev/66d175df8106
https://hg.mozilla.org/mozilla-central/rev/351e6a4eed9a
https://hg.mozilla.org/mozilla-central/rev/e21dc8a8b5b9
https://hg.mozilla.org/mozilla-central/rev/d4212b87659b
https://hg.mozilla.org/mozilla-central/rev/fd65b14ab7e8
https://hg.mozilla.org/mozilla-central/rev/11c709d6c6d9
Status: ASSIGNED → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla65
Created web-platform-tests PR https://github.com/web-platform-tests/wpt/pull/14305 for changes under testing/web-platform/tests
Whiteboard: [e10s-multi:-] → [e10s-multi:-][wptsync upstream]
No longer depends on: 1505798
(In reply to Andrew Sutherland [:asuth] from comment #180)
> Bug 695385 is explicitly about creating a mochitest to replace that test. 
> Because indeed, it does seem sketchy to be using file URI semantics to test
> cross-origin logic.

Actually, Bug 695385 is about *duplicating* that test, for robustness, so that we would end up testing *both of the following things*:
* Expectation #1: generally, pages can't use SVG filters from other origins (from the hypothetical mochitest).
* Expectation #2: more specifically, local files aren't allowed to use files in higher directories as SVG filters/etc (which we might allow them to do by mistake if we put too much faith in the IS_LOCAL_RESOURCE flag, as I did in a r-minus'ed patch from 7 years ago -- see  bug 686013 comment 19, "on the face of it IS_LOCAL_RESOURCE is the wrong thing here".  That's what prompted me to write this test).

In particular, we can't test (2) at all on Android, and that's what prompted me to file Bug 695385, so that we could at least test a more general security guarantee there.

> It looks like the hand-waving for why the pref gets turned off is found at [...]

Yeah -- I traced back the history and it seems we turned off the pref in bug 846890, specifically for perf-on-windows reasons (and, less importantly, for convenience of allowing reftests to share resources without having to add "HTTP" annotations in reftest.list)

> dholbert, can you weigh in on the
> easiest way to avoid having this massively hacky pref-flip?  It looks like
> web-platform-tests implements reftests and doesn't depend on file URL
> serving.  Perhaps we could move the test into [web-platform-tests]

Sadly, web-platform-tests do not use file:// URIs, so they can't test the more specific "Expectation #2".  Only reftests can do that, and they can only do that if we have this pref at its default (enabled/secure) setting.

I've filed bug 1511209 on perhaps restoring the pref to its default value on non-Windows platforms (I hope), so that we can reenable this reftest on most platforms.  And in the meantime, I'll still try to fix Bug 695385 so that we can test the more general "Expectation #1" (though I would very much still like to be able to test Expectation #2.)
Flags: needinfo?(dholbert)
Depends on: 1513553
See Also: → 1510410
MDN documentation for Web Storage needs to be updated along these lines:
- quota is shared with IndexedDB and DOM Cache (however, per origin limited is still 5MB)
- unified usage/storage reporting in preferences UI
- data is evicted along with IndexedDB and DOM Cache
- navigator.storage.estimate() also includes LS
- stored in per origin sqlite databases
- eTLD+1 groups are not limited by 5MB
Keywords: dev-doc-needed
Depends on: 1516277
Depends on: 1516282
Depends on: 1516018
Blocks: 1064466
Blocks: 1517090
Depends on: 1535298
No longer depends on: 1535298
Blocks: 1436231
Blocks: 1544466

Can you suggest a release note for 70 if you think it's a good idea to include this in the relnotes? Thanks!

Flags: needinfo?(jvarga)

I'm still on PTO officially and I'm not that good at writing this kind of stuff, maybe :asuth could help if he has time.

Flags: needinfo?(jvarga)
Flags: needinfo?(bugmail)
Flags: needinfo?(bugmail)

Suggested Release Note:

"""
LocalStorage now enforces quota on a per-origin basis. Previously "example.com", "foo.example.com", and "bar.example.com" all shared the same 5 megabyte limit even though they could not see each other's data. Now each origin has its own 5 megabyte limit. Memory and disk usage have also been improved.
"""

There have also been improvements in multi-process coherency under private browsing and when "Delete cookies and site data when Firefox is closed" is enabled, but it would be reasonable to expect those to work in the first place and the scenarios in play were very much edge cases.

Flags: needinfo?(bugmail)
No longer depends on: 1513553
Regressions: 1513553
Depends on: 1575891
Blocks: 1569296
Regressions: 1577202

Clearing this for now - feel free to re-request when LSNG is ready to ship.

See Also: → 1685495
You need to log in before you can comment on or make changes to this bug.