Closed Bug 1534222 Opened 6 years ago Closed 5 years ago

LSNG: Check for maximum message size before calling SendCheckpoint()

Tracking

()

Status:

RESOLVED WORKSFORME

Tracking Flags:

Tracking

Status

firefox-esr60

---

unaffected

firefox66

---

unaffected

firefox67

---

disabled

firefox67.0.1

---

disabled

firefox68

---

wontfix

firefox69

---

fix-optional

firefox70

---

fix-optional

People

(Reporter: janv, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: crash, regression)

Crash Data

Jan Varga [:janv]

Reporter

Description

•

6 years ago

See this crash:
https://crash-stats.mozilla.org/report/index/7c83afc2-0f17-44b3-92f6-792bd0190311

Jan Varga [:janv]

Reporter

Updated

•

6 years ago

Blocks: 1517090

Jan Varga [:janv]

Reporter

Updated

•

6 years ago

Crash Signature: [mozilla::dom::PBackgroundLSSnapshotChild::SendCheckpoint(nsTArray<mozilla::dom::LSWriteInfo> const&)]

Jan Varga [:janv]

Reporter

Updated

•

6 years ago

Crash Signature: [mozilla::dom::PBackgroundLSSnapshotChild::SendCheckpoint(nsTArray<mozilla::dom::LSWriteInfo> const&)] → [@ mozilla::ipc::ProcessLink::SendMessage | IPC_Message_Name=PBackgroundLSSnapshot::Msg_Checkpoint]

Yaron Tausky [:ytausky]

Updated

•

6 years ago

Priority: -- → P2

Jan Varga [:janv]

Reporter

Updated

•

6 years ago

Assignee: nobody → jvarga

Priority: P2 → P1

Pascal Chevrel:pascalc

Updated

•

6 years ago

status-firefox66: --- → unaffected

status-firefox68: --- → affected

status-firefox-esr60: --- → unaffected

Keywords: crash, regression

Jan Varga [:janv]

Reporter

Updated

•

6 years ago

Status: NEW → ASSIGNED

Andrew Overholt [:overholt]

Updated

•

6 years ago

Blocks: 1539835

Jan Varga [:janv]

Reporter

Updated

•

6 years ago

No longer blocks: 1517090

Jan Varga [:janv]

Reporter

Updated

•

6 years ago

Blocks: 1540402

Neha Kochar [:neha]

Comment 1

•

6 years ago

Jan, could you please provide an estimate timeline for the patch?

Flags: needinfo?(jvarga)

Jan Varga [:janv]

Reporter

Comment 2

•

6 years ago

(In reply to Neha Kochar [:neha] from comment #1)

Jan, could you please provide an estimate timeline for the patch?

Couple of days, likely next Monday.

Flags: needinfo?(jvarga)

Jan Varga [:janv]

Reporter

Comment 3

•

6 years ago

•

Edited

There's still the 5 MB limit, but a page can cause a lot of local storage changes during execution of a JS function.
The limit is not reached in the end, but all the changes can create a huge "database log".

For example, this loop will cause the crash (max message size is 256MB):
for (var i = 0; i < 26; i++) {
randomlyChange(largeAlmost5MegaByteBuffer);
localStorage.setItem("foo", largeAlmost5MegaByteBuffer);
}

Andrew, here are options for fixing this bug, we need to decide which approach to take:

Use the WriteOptimizer on the child side too
This would decrease memory requirements I believe, but so far we always distributed all storage events across content processes. This could regress some web sites.
Use data compression
This would also decrease memory requirements, but it's not trivial to implement and we need to watch for max message size anyway.
Support sending partial checkpoints
The child would have to create a new separate write info array, once the size of existing array is close to max message size.
Then when the stable state callback is called, we call SendCheckpoint multiple times and the last one would signalize that it's the last message. Parent would have allocate separate arrays too and apply changes only when everything has been received.
The downside is that this can temporarily cause huge memory allocations if a page does something bad intentionally.
The upside is that snapshots would be still consistent as we originally planned.
Send checkpoint when the size of write info array is close to max message size
We would just track the size of current write info array and call SendCheckpoint when the size is close to max message size, we wouldn't wait for the stable state callback.
The downside is that while the existing snapshot would still see the original state, any new snapshots created in other content processes would see "partial" changes from the original content process.
The upside is that this doesn't need a lot of memory.

If you know about any other options, let me know.

Flags: needinfo?(bugmail)

Andrew Overholt [:overholt]

Comment 4

•

6 years ago

(LSNG isn't shipping in 67 so updating the tracking flags accordingly)

status-firefox67: affected → unaffected

Jan Varga [:janv]

Reporter

Comment 5

•

6 years ago

We still need to check the max message size, but fixing bug 1546310 and bug 1513937 may mitigate this issue.

Andrew Sutherland [:asuth] (he/him)

Comment 6

•

6 years ago

I think using the WriteOptimizer in the client is the right way to go.

Flags: needinfo?(bugmail)

Randell Jesup [:jesup] (needinfo me)

Comment 7

•

6 years ago

Are we still planning to land a max-message-size test in 68?

Flags: needinfo?(jvarga)

Jan Varga [:janv]

Reporter

Comment 8

•

6 years ago

No, we're going to implement a WriteOptimizer that will coalesce operations, so final IPC message would never reach the IPC maximum message size. That should happen right after bug 1540777.

Flags: needinfo?(jvarga)

Randell Jesup [:jesup] (needinfo me)

Comment 9

•

6 years ago

We were looking at this in Regression Triage today; are we still planning on fixing this in 68 (as it's a P1)? Thanks!

Flags: needinfo?(jvarga)

Jan Varga [:janv]

Reporter

Comment 10

•

6 years ago

Yes.

Flags: needinfo?(jvarga)

Jan Varga [:janv]

Reporter

Comment 11

•

6 years ago

Andrew, I'm afraid that we can't use write optimizer in the client, take a look at this test:
https://searchfox.org/mozilla-central/source/testing/web-platform/tests/webstorage/event_local_removeitem.html

Flags: needinfo?(bugmail)

Andrew Sutherland [:asuth] (he/him)

Comment 12

•

6 years ago

As per the discussion in a vidyo call, I think only do the write optimizer for IPC purposes is a workable simplification. Behavior within the same process would be the same, so that test would still pass. And as we start doing origin-per-process, it's less likely that a site would perceive dubious behavior.

Alternately, we could allow some level of redundancy... but I think it's clear that if an origin is generating 256MB of deltas that it's appropriate to engage in some type of https://github.com/WICG/interventions.

Flags: needinfo?(bugmail)

Jan Varga [:janv]

Reporter

Comment 13

•

6 years ago

Yeah, but it will take some time to have origin-per-process. Andrew also mentioned that parent could tell the child if there are cross-process listeners at the time the snapshot is created. If there are no listeners we can safely use the write optimizer on the child side too. I think that would be a good compromise since the chance of having two or more tabs open for the same origin and having a cross-process listener for the storage event and having a function that produces a lot of changes is not very high.

[:philipp]

Updated

•

6 years ago

status-firefox67: unaffected → disabled

status-firefox67.0.1: --- → disabled

Jan Varga [:janv]

Reporter

Updated

•

5 years ago

No longer blocks: 1539835

Jan Varga [:janv]

Reporter

Updated

•

5 years ago

Priority: P1 → P2

Jan Varga [:janv]

Reporter

Updated

•

5 years ago

Assignee: jvarga → nobody

Status: ASSIGNED → NEW

Liz Henry (:lizzard) (relman/hg->git project)

Comment 14

•

5 years ago

Too late for a fix in 68 but we could still likely take a patch for 69.

status-firefox68: affected → wontfix

status-firefox69: --- → affected

Liz Henry (:lizzard) (relman/hg->git project)

Comment 15

•

5 years ago

Dropping the issue from regression triage at this point; it has a priority set and the team is aware of the issue.

status-firefox69: affected → fix-optional

status-firefox70: --- → fix-optional

BugBot [:suhaib / :marco/ :calixte]

Comment 16

•

5 years ago

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED

Closed: 5 years ago

Resolution: --- → WORKSFORME

You need to log in before you can comment on or make changes to this bug.