Closed Bug 1632887 Opened 1 year ago Closed 1 year ago

Perma Late Beta TEST-UNEXPECTED-FAIL | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | public window should only see public items - Got 0, expected 1 when Gecko 77 merges to Beta on 2020-05-04

Categories

(Core :: Storage: localStorage & sessionStorage, defect, P2)

defect

Tracking

()

RESOLVED FIXED
mozilla78
Tracking Status
firefox-esr68 --- unaffected
firefox75 --- unaffected
firefox76 --- unaffected
firefox77 --- fixed
firefox78 + fixed

People

(Reporter: bogdan_tara, Assigned: baku)

References

(Regression)

Details

(Keywords: intermittent-failure, regression)

Attachments

(2 files)

[Tracking Requested - why for this release]:
Central as Beta simulation.

https://treeherder.mozilla.org/#/jobs?repo=try&selectedJob=299244012&revision=c4bb503ef027ace38df36d56b358c32921a96b02&searchStr=bc4

https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=299244012&repo=try&lineNumber=4719

https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/RcNFAjmpRaWvABB354f1hA/runs/0/artifacts/public/logs/live_backing.log

[task 2020-04-24T13:35:09.974Z] 13:35:09     INFO - TEST-START | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js
[task 2020-04-24T13:35:10.639Z] 13:35:10     INFO - GECKO(6670) | JavaScript error: resource://gre/modules/remotepagemanager/MessagePort.jsm, line 483: TypeError: this.window is null
[task 2020-04-24T13:35:10.639Z] 13:35:10     INFO - GECKO(6670) | JavaScript error: chrome://browser/content/aboutPrivateBrowsing.js, line 8: InvalidStateError: An exception was thrown
[task 2020-04-24T13:35:11.391Z] 13:35:11     INFO - TEST-INFO | started process screentopng
[task 2020-04-24T13:35:11.724Z] 13:35:11     INFO - TEST-INFO | screentopng: exit 0
[task 2020-04-24T13:35:11.724Z] 13:35:11     INFO - Buffered messages logged at 13:35:09
[task 2020-04-24T13:35:11.724Z] 13:35:11     INFO - Entering test bound setup
[task 2020-04-24T13:35:11.724Z] 13:35:11     INFO - Leaving test bound setup
[task 2020-04-24T13:35:11.724Z] 13:35:11     INFO - Entering test bound test
[task 2020-04-24T13:35:11.724Z] 13:35:11     INFO - Buffered messages logged at 13:35:10
[task 2020-04-24T13:35:11.724Z] 13:35:11     INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://mochi.test:8888/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent_page.html?action=set&name=test&value=value&initial=true" line: 0}]
[task 2020-04-24T13:35:11.726Z] 13:35:11     INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://mochi.test:8888/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent_page.html?action=set&name=test2&value=value2" line: 0}]
[task 2020-04-24T13:35:11.727Z] 13:35:11     INFO - TEST-PASS | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | public window shouldn't see private storage - 
[task 2020-04-24T13:35:11.728Z] 13:35:11     INFO - TEST-PASS | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | public window should only see public items - 
[task 2020-04-24T13:35:11.732Z] 13:35:11     INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://mochi.test:8888/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent_page.html?action=get&name=test2" line: 0}]
[task 2020-04-24T13:35:11.732Z] 13:35:11     INFO - TEST-PASS | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | private window shouldn't see public storage - 
[task 2020-04-24T13:35:11.732Z] 13:35:11     INFO - TEST-PASS | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | private window should only see private items - 
[task 2020-04-24T13:35:11.732Z] 13:35:11     INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://mochi.test:8888/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent_page.html?action=get&name=test" line: 0}]
[task 2020-04-24T13:35:11.733Z] 13:35:11     INFO - Console message: [JavaScript Error: "TypeError: this.window is null" {file: "resource://gre/modules/remotepagemanager/MessagePort.jsm" line: 483}]
[task 2020-04-24T13:35:11.734Z] 13:35:11     INFO - Console message: [JavaScript Error: "InvalidStateError: An exception was thrown" {file: "chrome://browser/content/aboutPrivateBrowsing.js" line: 8}]
[task 2020-04-24T13:35:11.734Z] 13:35:11     INFO - Buffered messages logged at 13:35:11
[task 2020-04-24T13:35:11.735Z] 13:35:11     INFO - TEST-PASS | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | public window shouldn't see cleared private storage - 
[task 2020-04-24T13:35:11.736Z] 13:35:11     INFO - Buffered messages finished
[task 2020-04-24T13:35:11.737Z] 13:35:11     INFO - TEST-UNEXPECTED-FAIL | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | public window should only see public items - Got 0, expected 1
[task 2020-04-24T13:35:11.737Z] 13:35:11     INFO - Stack trace:
[task 2020-04-24T13:35:11.737Z] 13:35:11     INFO - chrome://mochikit/content/browser-test.js:test_is:1297
[task 2020-04-24T13:35:11.738Z] 13:35:11     INFO - chrome://mochitests/content/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js:test:78
[task 2020-04-24T13:35:11.738Z] 13:35:11     INFO - chrome://mochikit/content/browser-test.js:Tester_execTest/<:1039
[task 2020-04-24T13:35:11.739Z] 13:35:11     INFO - chrome://mochikit/content/browser-test.js:Tester_execTest:1074
[task 2020-04-24T13:35:11.739Z] 13:35:11     INFO - chrome://mochikit/content/browser-test.js:nextTest/<:904
[task 2020-04-24T13:35:11.740Z] 13:35:11     INFO - chrome://mochikit/content/tests/SimpleTest/SimpleTest.js:SimpleTest.waitForFocus/waitForFocusInner/focusedOrLoaded/<:918
[task 2020-04-24T13:35:11.743Z] 13:35:11     INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://mochi.test:8888/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent_page.html?action=get&name=test2" line: 0}]
[task 2020-04-24T13:35:12.287Z] 13:35:12     INFO - TEST-PASS | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | private window should only see new private items - 
[task 2020-04-24T13:35:12.291Z] 13:35:12     INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://mochi.test:8888/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent_page.html?action=set&name=test3&value=value3" line: 0}]
[task 2020-04-24T13:35:12.360Z] 13:35:12     INFO - Leaving test bound test
[task 2020-04-24T13:35:12.376Z] 13:35:12     INFO - Console message: [JavaScript Error: "The character encoding of the HTML document was not declared. The document will render with garbled text in some browser configurations if the document contains characters from outside the US-ASCII range. The character encoding of the page must be declared in the document or in the transfer protocol." {file: "http://mochi.test:8888/browser/browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent_page.html?final=true" line: 0}]
[task 2020-04-24T13:35:12.452Z] 13:35:12     INFO - GECKO(6670) | MEMORY STAT | vsize 3594MB | residentFast 623MB | heapAllocated 336MB
[task 2020-04-24T13:35:12.452Z] 13:35:12     INFO - TEST-OK | browser/components/privatebrowsing/test/browser/browser_privatebrowsing_concurrent.js | took 2472ms
Flags: needinfo?(amarchesini)

Andrea, you mentioned it's not reproducible locally for you. What is the plan for this failure affecting late beta? Starting Wednesday, the 77 beta code will be in late beta mode and these failures occur. Recent central-as-late-beta simulation (we don't do early-as-late-beta ones): https://treeherder.mozilla.org/#/jobs?repo=try&selectedTaskRun=Kz6PQfWWRaKoyxehbC54PQ-0&resultStatus=testfailed%2Cbusted%2Cexception%2Cretry%2Cusercancel%2Crunnable&revision=b1a1f77b17feb13d46bf8aee3c6a6ba2a7670115&searchStr=browser-chrome

Flags: needinfo?(wleung)

This seems to be a localStorage intermittent failure. I wrote a simplified version of the test. I'm going to ask asuth to take a look.

Flags: needinfo?(amarchesini) → needinfo?(bugmail)

If you replace this with browser_privatebrowsing_concurrent.js, you will see the bug in a late-beta release.

I tried to get this to reproduce with self-serve pernosco on :baku's try push at https://treeherder.mozilla.org/#/jobs?repo=try&revision=0ce675e8694ecdba17e6e4645faf1f18ef88d7fb and the comment 0 failure, but pernosco couldn't reproduce. I'm now trying to reproduce locally.

Based on analysis with baku on slack, it seems likely that what's happening here is that:

  • Since this is late beta, LSNG gets turned off. This means we're dealing with the old LS implementation.
  • The old LS implementation uses PBackgroundLocalStorageCache to propagate changes separately from database storage. This mechanism avoids sending data to the same process based on LocalStorageCacheParent object identity not process id.
  • Because the tests are all single-process because the processCount is explicitly set to 1 by the test, we expect that a situation arises where:
    • The holder of the LocalStorageCache is destroyed, which tears down the actor after it has broadcast its operations over the PBackgroundLocalStorageCache channel.
    • A new LocalStorageCache comes into existence and it hears the clear() operation from its predecessor, even though it probably shouldn't. (Like, the sequence of events should be impossible, as it should require 2 LocalStorageCache{Parent,Child} instances for the same process to exist at the same time since the LSCP::RecvNotify synchronously re-dispatches.) This leads to a state inconsistency as the LocalStorage will be seeing the correct disk state, but is also hearing a partial replay of events that happened immediately beforehand. In particular, the clear operation is the most dramatic LS operation possible.
Component: Private Browsing → Storage: localStorage & sessionStorage
Priority: -- → P2
Product: Firefox → Core

Ah, so, the test's attempt to set the processCount to 1 isn't actually doing what the test was hoping would happen, as the pernosco trace shows. It needs to take more proactive action to ensure only one process is used.

:baku's revised test is basically a loop of:

  • In tab A, do a LS clear followed by setting test=value
  • In tab A, do a LS get of "test2" and a length check (in a new page load from the previous)
  • Open another window, and in tab B, do a LS get of "test2" and a length check.
  • Close the window and thereby tab B.

In our first iteration of the test, it ends up that tab A is in process X and tab B is in process Y. The gotcha is that the LocalStorageCache instantiated in process Y by B:1 (tab B iteration 1) stays alive through to B:2 and because there was no window active anymore to process the propagated events from process X, the state in Y is out-of-date.

This is somewhat a known limitation of the legacy LocalStorage implementation, except that I think we generally would have expected the LocalStorageCache to have been destroyed in that case. Presumably something is hanging onto that LocalStorageCache that we don't want to be hanging onto it. Maybe some kind of bfcache / undo closed window thing is active?

There's a little more investigation to do here, but in the meantime, it's probably safe to disable the test because the test is broken. It's not clear there's actually any regression in legacy LocalStorage, although there could be a change related to sessionstore and/or session history and/or bfcache.

And to summarize re: PB: There is no privatebrowsing problem. There is only the potential that the broken test has revealed a change in window retention and the window's retention of its LocalStorage instance and its LocalStorageCache.

Flags: needinfo?(bugmail)

What's the way forward here? Disabling the test?

Flags: needinfo?(amarchesini)

Yes, I would disable the test. :asuth, can we reenable the test when LSNG is enabled, correct?

Flags: needinfo?(amarchesini) → needinfo?(bugmail)

Yes, disabling seems reasonable. It looks like skip-if = release_or_beta is the best option we have to match the current state of things per https://firefox-source-docs.mozilla.org/build/buildsystem/mozinfo.html#mozinfo-attributes ?

Flags: needinfo?(bugmail)
Assignee: nobody → amarchesini
Status: NEW → ASSIGNED
Pushed by amarchesini@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/086d3ca5e98e
Disable browser_privatebrowsing_concurrent.js in beta and release, r=asuth
Status: ASSIGNED → RESOLVED
Closed: 1 year ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla78
You need to log in before you can comment on or make changes to this bug.