Closed Bug 1746996 Opened 4 years ago Closed 4 years ago

Intermittent browser/base/content/test/general/browser_invalid_uri_back_forward_manipulation.js | Test timed out | Found a tab after previous test timed out: about:blank

Categories

(Core :: XML, defect, P5)

defect

Tracking

()

RESOLVED FIXED
97 Branch
Tracking Status
firefox-esr91 --- unaffected
firefox95 --- unaffected
firefox96 --- fixed
firefox97 --- fixed

People

(Reporter: intermittent-bug-filer, Assigned: bholley)

References

(Blocks 1 open bug, Regression)

Details

(Keywords: intermittent-failure, regression, Whiteboard: [retriggered][stockwell needswork:owner])

Attachments

(1 file)

Filed by: mlaza [at] mozilla.com
Parsed log: https://treeherder.mozilla.org/logviewer?job_id=361947251&repo=autoland
Full log: https://firefox-ci-tc.services.mozilla.com/api/queue/v1/task/QmX86im-Rp6Ha8sx2J_c9g/runs/0/artifacts/public/logs/live_backing.log


[task 2021-12-21T00:55:07.516Z] 00:55:07     INFO - TEST-INFO | started process screencapture
[task 2021-12-21T00:55:07.641Z] 00:55:07     INFO - TEST-INFO | screencapture: exit 0
[task 2021-12-21T00:55:07.641Z] 00:55:07     INFO - Buffered messages logged at 00:53:37
[task 2021-12-21T00:55:07.642Z] 00:55:07     INFO - Entering test bound checkBackFromInvalidURI
[task 2021-12-21T00:55:07.642Z] 00:55:07     INFO - Loaded about:robots
[task 2021-12-21T00:55:07.643Z] 00:55:07     INFO - Console message: [JavaScript Error: "Could not record event: Error: Event must be defined, unless input was pasted/dropped" {file: "resource:///modules/UrlbarController.jsm" line: 781}]
[task 2021-12-21T00:55:07.643Z] 00:55:07     INFO - record@resource:///modules/UrlbarController.jsm:781:10
[task 2021-12-21T00:55:07.643Z] 00:55:07     INFO - handleNavigation@resource:///modules/UrlbarInput.jsm:589:37
[task 2021-12-21T00:55:07.643Z] 00:55:07     INFO - handleCommand@resource:///modules/UrlbarInput.jsm:461:10
[task 2021-12-21T00:55:07.644Z] 00:55:07     INFO - checkBackFromInvalidURI@chrome://mochitests/content/browser/browser/base/content/test/general/browser_invalid_uri_back_forward_manipulation.js:23:11
[task 2021-12-21T00:55:07.644Z] 00:55:07     INFO - 
[task 2021-12-21T00:55:07.645Z] 00:55:07     INFO - Console message: [JavaScript Error: "[Exception... "Couldn't build a valid uri"  nsresult: "0x804b000a (NS_ERROR_MALFORMED_URI)"  location: "JS frame :: resource://gre/modules/URIFixup.jsm :: getFixupURIInfo :: line 439"  data: no]"]
[task 2021-12-21T00:55:07.645Z] 00:55:07     INFO - getFixupURIInfo@resource://gre/modules/URIFixup.jsm:439:13
[task 2021-12-21T00:55:07.645Z] 00:55:07     INFO - PUIU_markPageAsTyped@resource:///modules/PlacesUIUtils.jsm:563:25
[task 2021-12-21T00:55:07.645Z] 00:55:07     INFO - addToUrlbarHistory@resource:///modules/UrlbarUtils.jsm:276:21
[task 2021-12-21T00:55:07.645Z] 00:55:07     INFO - _loadURL@resource:///modules/UrlbarInput.jsm:2424:19
[task 2021-12-21T00:55:07.645Z] 00:55:07     INFO - pickResult@resource:///modules/UrlbarInput.jsm:1058:10
[task 2021-12-21T00:55:07.645Z] 00:55:07     INFO - handleNavigation/<@resource:///modules/UrlbarInput.jsm:633:16
[task 2021-12-21T00:55:07.645Z] 00:55:07     INFO - 
[task 2021-12-21T00:55:07.646Z] 00:55:07     INFO - Console message: [JavaScript Error: "NS_ERROR_MALFORMED_URI: Couldn't build a valid uri" {file: "resource://gre/modules/URIFixup.jsm" line: 439}]
[task 2021-12-21T00:55:07.646Z] 00:55:07     INFO - getFixupURIInfo@resource://gre/modules/URIFixup.jsm:439:13
[task 2021-12-21T00:55:07.646Z] 00:55:07     INFO - _loadURI@chrome://browser/content/browser.js:1556:27
[task 2021-12-21T00:55:07.646Z] 00:55:07     INFO - openLinkIn@chrome://browser/content/utilityOverlay.js:555:21
[task 2021-12-21T00:55:07.646Z] 00:55:07     INFO - openUILinkIn@chrome://browser/content/utilityOverlay.js:274:13
[task 2021-12-21T00:55:07.646Z] 00:55:07     INFO - openTrustedLinkIn@chrome://browser/content/utilityOverlay.js:191:15
[task 2021-12-21T00:55:07.646Z] 00:55:07     INFO - _loadURL@resource:///modules/UrlbarInput.jsm:2486:19
[task 2021-12-21T00:55:07.647Z] 00:55:07     INFO - pickResult@resource:///modules/UrlbarInput.jsm:1058:10
[task 2021-12-21T00:55:07.647Z] 00:55:07     INFO - handleNavigation/<@resource:///modules/UrlbarInput.jsm:633:16
[task 2021-12-21T00:55:07.647Z] 00:55:07     INFO - 
[task 2021-12-21T00:55:07.648Z] 00:55:07     INFO - Console message: [JavaScript Error: "XML Parsing Error: mismatched tag. Expected: </div>.
[task 2021-12-21T00:55:07.648Z] 00:55:07     INFO - Location: jar:file:///opt/worker/tasks/task_164004768083453/build/application/Firefox%20NightlyDebug.app/Contents/Resources/browser/omni.ja!/chrome/browser/content/browser/certerror/aboutNetError.xhtml
[task 2021-12-21T00:55:07.648Z] 00:55:07     INFO - Line Number 90, Column 131:" {file: "jar:file:///opt/worker/tasks/task_164004768083453/build/application/Firefox%20NightlyDebug.app/Contents/Resources/browser/omni.ja!/chrome/browser/content/browser/certerror/aboutNetError.xhtml" line: 90 column: 131 source: "        <div id="es_nssBadCert_MOZILLA_PKIX_ERROR_NOT_YET_VALID_ISSUER_CERTIFICATE">&certerror.expiredCert.whatCanYouDoAboutIt2;</div>"}]
[task 2021-12-21T00:55:07.648Z] 00:55:07     INFO - Buffered messages finished
[task 2021-12-21T00:55:07.649Z] 00:55:07     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/general/browser_invalid_uri_back_forward_manipulation.js | Test timed out - 
[task 2021-12-21T00:55:07.649Z] 00:55:07     INFO - GECKO(1363) | MEMORY STAT | vsize 17535MB | residentFast 538MB | heapAllocated 147MB
[task 2021-12-21T00:55:07.650Z] 00:55:07     INFO - TEST-OK | browser/base/content/test/general/browser_invalid_uri_back_forward_manipulation.js | took 90112ms
[task 2021-12-21T00:55:07.650Z] 00:55:07     INFO - Not taking screenshot here: see the one that was previously logged
[task 2021-12-21T00:55:07.650Z] 00:55:07     INFO - TEST-UNEXPECTED-FAIL | browser/base/content/test/general/browser_invalid_uri_back_forward_manipulation.js | Found a tab after previous test timed out: about:blank - 
[task 2021-12-21T00:55:07.651Z] 00:55:07     INFO - GECKO(1363) | [Parent 1363, Main Thread] WARNING: '!inner', file /builds/worker/checkouts/gecko/dom/ipc/jsactor/JSWindowActorProtocol.cpp:184
[task 2021-12-21T00:55:07.651Z] 00:55:07     INFO - GECKO(1363) | [Parent 1363, Main Thread] WARNING: '!inner', file /builds/worker/checkouts/gecko/dom/ipc/jsactor/JSWindowActorProtocol.cpp:184
[task 2021-12-21T00:55:07.652Z] 00:55:07     INFO - GECKO(1363) | [Child 1369: Main Thread]: I/DocShellAndDOMWindowLeak ++DOCSHELL 10e34b000 == 1 [pid = 1369] [id = 53]
[task 2021-12-21T00:55:07.652Z] 00:55:07     INFO - GECKO(1363) | [Child 1369: Main Thread]: I/DocShellAndDOMWindowLeak ++DOMWINDOW == 1 (10e39ee40) [pid = 1369] [serial = 121] [outer = 0]
[task 2021-12-21T00:55:07.653Z] 00:55:07     INFO - GECKO(1363) | [Child 1369: Main Thread]: I/DocShellAndDOMWindowLeak ++DOMWINDOW == 2 (10e50a000) [pid = 1369] [serial = 122] [outer = 10e39ee40]
[task 2021-12-21T00:55:07.653Z] 00:55:07     INFO - GECKO(1363) | [Parent 1363, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, false) failed with result 0x804B0050 (NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS): file /builds/worker/checkouts/gecko/toolkit/components/antitracking/URLQueryStringStripper.cpp:129
[task 2021-12-21T00:55:07.654Z] 00:55:07     INFO - GECKO(1363) | [Child 1369, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, false) failed with result 0x804B0050 (NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS): file /builds/worker/checkouts/gecko/toolkit/components/antitracking/URLQueryStringStripper.cpp:129
[task 2021-12-21T00:55:07.654Z] 00:55:07     INFO - GECKO(1363) | [Child 1369, Main Thread] WARNING: NS_ENSURE_SUCCESS(rv, false) failed with result 0x804B0050 (NS_ERROR_INSUFFICIENT_DOMAIN_LEVELS): file /builds/worker/checkouts/gecko/toolkit/components/antitracking/URLQueryStringStripper.cpp:129
[task 2021-12-21T00:55:07.655Z] 00:55:07     INFO - checking window state
[task 2021-12-21T00:55:07.655Z] 00:55:07     INFO - TEST-START | browser/base/content/test/general/browser_lastAccessedTab.js

The Bugbug bot thinks this bug should belong to the 'Core::Privacy: Anti-Tracking' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.

Component: General → Privacy: Anti-Tracking
Product: Firefox → Core
Summary: Intermittent [tier 2] browser/base/content/test/general/browser_invalid_uri_back_forward_manipulation.js | Test timed out | Found a tab after previous test timed out: about:blank → Intermittent browser/base/content/test/general/browser_invalid_uri_back_forward_manipulation.js | Test timed out | Found a tab after previous test timed out: about:blank

Looks like this started with changes in Bug 1745239, retriggers. Peter, could you have a look over these? Thank you.

Flags: needinfo?(peterv)
Regressed by: 1745239
Has Regression Range: --- → yes
Whiteboard: [retriggered][stockwell needswork:owner]

Set release status flags based on info from the regressing bug 1745239

The test failures show XML parsing failures in aboutNetError.xhtml, e.g.:

[task 2021-12-25T13:23:54.013Z] 13:23:54 INFO - Console message: [JavaScript Error: "XML Parsing Error: mismatched tag. Expected: </div>.
[task 2021-12-25T13:23:54.013Z] 13:23:54 INFO - Location: jar:file:///opt/worker/tasks/task_164043367019293/build/application/Firefox%20NightlyDebug.app/Contents/Resources/browser/omni.ja!/chrome/browser/content/browser/certerror/aboutNetError.xhtml
[task 2021-12-25T13:23:54.013Z] 13:23:54 INFO - Line Number 55, Column 52:" {file: "jar:file:///opt/worker/tasks/task_164043367019293/build/application/Firefox%20NightlyDebug.app/Contents/Resources/browser/omni.ja!/chrome/browser/content/browser/certerror/aboutNetError.xhtml" line: 55 column: 52 source: " <div id="ed_netReset">&netReset.longDesc;</div>"}]
[task 2021-12-25T13:23:54.013Z] 13:23:54 INFO - Buffered messages finished

and

[task 2021-12-25T13:21:31.217Z] 13:21:31 INFO - Console message: [JavaScript Error: "XML Parsing Error: mismatched tag. Expected: </div>.
[task 2021-12-25T13:21:31.217Z] 13:21:31 INFO - Location: jar:file:///opt/worker/tasks/task_164043414713850/build/application/Firefox%20NightlyDebug.app/Contents/Resources/browser/omni.ja!/chrome/browser/content/browser/certerror/aboutNetError.xhtml
[task 2021-12-25T13:21:31.217Z] 13:21:31 INFO - Line Number 90, Column 131:" {file: "jar:file:///opt/worker/tasks/task_164043414713850/build/application/Firefox%20NightlyDebug.app/Contents/Resources/browser/omni.ja!/chrome/browser/content/browser/certerror/aboutNetError.xhtml" line: 90 column: 131 source: " <div id="es_nssBadCert_MOZILLA_PKIX_ERROR_NOT_YET_VALID_ISSUER_CERTIFICATE">&certerror.expiredCert.whatCanYouDoAboutIt2;</div>"}]
[task 2021-12-25T13:21:31.218Z] 13:21:31 INFO - Buffered messages finished

So different points in the file in those two cases, but each time closing a div after processing a dtd.

Presumably the finer-grained chunking is exposing some edge cases, as we intended it to.

I did a try push with some logging [1]. I haven't looked at the results closely but my sense is that the nondeterminism here is coming from the parser getting interrupted at different points (reflected in the non-deterministic ParseBuffer(null) calls). So presumably if we can log exactly when the parser gets interrupted during the failing runs, we should be able to make the failure deterministically reproducible.

[1] https://treeherder.mozilla.org/jobs?repo=try&revision=dbd452fde7b98fd41898b67cc2015221b555c607

(In reply to Bobby Holley (:bholley) from comment #6)

So presumably if we can log exactly when the parser gets interrupted during the failing runs, we should be able to make the failure deterministically reproducible.

I did this [1], which shows an interrupt at token 659 on the failing run. Sure enough, if I instrument nsContentSink::DidProcessATokenImpl to deterministically interrupt at exactly the 659th token, I get a YSOD trying to load about:neterror locally.

Probably easiest to debug this in pernosco with the WASM sandbox turned off, so I've done another push to that effect [2].

[1] https://treeherder.mozilla.org/jobs?repo=try&revision=c1654fbc58c4bc10ffbc360e133116cc98dfe960
[2] https://treeherder.mozilla.org/#/jobs?repo=try&revision=e47b0cd8e1a4148fccf494f417d7fec51ff0fde2

To avoid copying in the common case, expat points directly into the
(internal) input buffer for strings referenced from the tag stack. When
expat unwinds back to the caller, it attempts to call storeRawNames() to
walk the tag stack and copy any such strings into persistent memory
before the caller potentially invokes XML_Parse() again and shuffles the
input buffer, thereby invalidating these references.

Unfortunately, it doesn't do it in all the right places. Because
different parsing states set |processor| to different callbacks (so that
parsing can resume in the right context), there are a number of
non-obvious entry points. In this case, parsing was interrupted in the
middle of processing an internal entity [1], so |processor| was set to
internalEntityProcessor(), which invokes doContent() but does not call
storeRawNames(). Expat then returned to the caller with a tag stack
still referencing the input buffer, which got clobbered in the next call
to XML_Parse, causing a tag mismatch on the next close tag.

Conceptually, this optimization should be managed by doContent(), and I
believe the only reason that isn't the case is that doContent() has so
many return paths (and we don't have RAII in C). We can fix this by
wrapping doContent() in a helper.

[1] &certerror.expiredCert.whatCanYouDoAboutIt2;

Assignee: nobody → bholley
Status: NEW → ASSIGNED
Blocks: 1748118
Pushed by bholley@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/521f2f003843 Ensure that storeRawNames is always called. r=peterv
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → 97 Branch
Component: Privacy: Anti-Tracking → XML

Is there a user impact which justifies uplift consideration for the 96 RC build or can this ride 97?

Flags: needinfo?(peterv) → needinfo?(bholley)

Comment on attachment 9257187 [details]
Bug 1746996 - Ensure that storeRawNames is always called.

Beta/Release Uplift Approval Request

  • User impact if declined: There is a small chance that Firefox could intermittently fail to parse an XML document. The chance of this happening in release builds is relatively low, because we set the chunking threshold to 128k (rather than the 1k we were using in debug builds which succeeded in its goal of surfacing edge-case bugs in CI). However, it still could happen, leading to unpredictable and near-impossible-to-diagnose breakage.
  • Is this code covered by automated tests?: No
  • Has the fix been verified in Nightly?: Yes
  • Needs manual test from QE?: No
  • If yes, steps to reproduce:
  • List of other uplifts needed: None
  • Risk to taking this patch: Low
  • Why is the change risky/not risky? (and alternatives if risky): The patch is pretty low-risk. I'm generally reluctant to uplift anything this late in the beta cycle, but I think the chance of this patch breaking things is lower than the chance of expat misbehaving due to the bug being fixed here.

Expat is RLBoxed, so any theoretical regressions would be scoped to XML parsing and couldn't cause security bugs.

  • String changes made/needed:
Flags: needinfo?(bholley)
Attachment #9257187 - Flags: approval-mozilla-beta?

Comment on attachment 9257187 [details]
Bug 1746996 - Ensure that storeRawNames is always called.

Approved for 96.0rc2

Attachment #9257187 - Flags: approval-mozilla-beta? → approval-mozilla-release+
Blocks: 1758626
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: