Cannot load marketplace dev in the browser - crash during load

VERIFIED FIXED in Firefox 19

Status

Firefox OS
General
P1
critical
VERIFIED FIXED
6 years ago
6 years ago

People

(Reporter: jsmith, Assigned: jdm)

Tracking

({crash, regression, reproducible})

unspecified
B2G C3 (12dec-1jan)
ARM
Gonk (Firefox OS)
crash, regression, reproducible
Dependency tree / graph

Firefox Tracking Flags

(blocking-basecamp:+, firefox19 fixed, firefox20 fixed, b2g18 fixed)

Details

Attachments

(2 attachments, 2 obsolete attachments)

(Reporter)

Description

6 years ago
Steps:

1. Go to marketplace-dev.allizom.org in the browser

Expected:

The marketplace dev server should load.

Actual:

The browser tab crashes. 100% reproducible. Wasn't reproducing this yesterday, so this is a regression on today's build.

Interesting piece of logcat:

12-21 16:35:28.906: I/Gecko(110): [Parent 110] WARNING: waitpid failed pid:527 errno:10: file ../../../gecko/ipc/chromium/src/base/process_util_posix.cc, line 260
12-21 16:35:28.906: I/Gecko(110): [Parent 110] WARNING: Failed to deliver SIGKILL to 527!(3).: file ../../../gecko/ipc/chromium/src/chrome/common/process_watcher_posix_sigchld.cc, line 118
(Reporter)

Comment 1

6 years ago
Created attachment 694864 [details]
Logcat
(Reporter)

Updated

6 years ago
blocking-basecamp: --- → ?
Keywords: regression
(Reporter)

Updated

6 years ago
Keywords: crash, reproducible
(Reporter)

Comment 2

6 years ago
I need this fixed asap - all testing on marketplace dev is blocked on device without this, including privileged app testing.

Comment 3

6 years ago
(In reply to Jason Smith [:jsmith] from comment #0)
> The browser tab crashes. 100% reproducible. Wasn't reproducing this
> yesterday, so this is a regression on today's build.

Can you give us a crash ID?

You can find a list of your submitted reports with
adb shell ls -l /data/b2g/mozilla/Crash\ Reports/submitted/

Updated

6 years ago
Assignee: nobody → fabrice
blocking-basecamp: ? → +

Updated

6 years ago
Priority: -- → P1
I reproduce something that looks similar with the staging marketplace app.  It's being killed by SIGTERM, which means the b2g process killed it on purpose.  I suspect a permission issue but don't see anything in logcat.
We're failing to deserialize an actor ID from a PLayer constructor message ...
Bug 782542 is written all over this.
Assignee: fabrice → jones.chris.g
Workaround:

$ git checkout 7de93615636d5cdb13ed50232dc60ed58b182a9c
Severity: blocker → critical
Created attachment 695067 [details] [diff] [review]
Pref off sending CSP reports until we sort out bug 824179

This is an || review.
Attachment #695067 - Flags: review?(sstamm)
Attachment #695067 - Flags: review?(fabrice)
Attachment #695067 - Flags: review?(sstamm)
Attachment #695067 - Flags: review?(fabrice)
Attachment #695067 - Flags: review+
Based on what I see in the logcat: the marketplace is loading a data: uri even though the CSP doesn't permit that URI.  The policy needs to be changed to allow images from *everywhere* or at least include "data:" in the list of valid image sources.
https://hg.mozilla.org/integration/mozilla-inbound/rev/5aa4d9daa13c

But this will fail a test.  Instead of backing out and relanding with the test fix I'm going to just push the test fix.
Test fixes are nontrivial.  Out she comes.
Comment on attachment 695067 [details] [diff] [review]
Pref off sending CSP reports until we sort out bug 824179

It appears that this code runs off the main thread, so this patch is incorrect.
Attachment #695067 - Flags: review+ → review-
Comment on attachment 695067 [details] [diff] [review]
Pref off sending CSP reports until we sort out bug 824179

Scratch that ... we're touching docshells and so forth in other places in the same file.

This is off the critical backout path now so we can do a non-emergency review.
Attachment #695067 - Flags: review- → review?(sstamm)
Created attachment 695091 [details] [diff] [review]
Update tests to deal with disabled CSP reporting

I didn't see a way to get an appropriate context for the asyncOpen() here, but if you know how to do it, it would be nice not to whack this feature :/.
Attachment #695091 - Flags: review?(sstamm)
> Bug 782542 is written all over this...
> I suspect a permission issue but don't see anything in logcat.

Does NS_WARNING not print to logcat?  Should I change the necko security checks to use printf_stderr()?

Note that bug 782542 didn't affect CSP in any way, it only checks for TabParent and nsILoadContext in necko channels' callbacks, so I'm not certain the crashes were caused by it and we may not need to block 782542 on this.
Target Milestone: --- → B2G C3 (12dec-1jan)
Sorry missed your comment!

(In reply to Jason Duell (:jduell) from comment #16)
> > Bug 782542 is written all over this...
> > I suspect a permission issue but don't see anything in logcat.
> 
> Does NS_WARNING not print to logcat?  Should I change the necko security
> checks to use printf_stderr()?
> 

Only in debug builds.  For an error like this, it would be better to use printf_stderr(), which is always enabled.

> Note that bug 782542 didn't affect CSP in any way, it only checks for
> TabParent and nsILoadContext in necko channels' callbacks, so I'm not
> certain the crashes were caused by it and we may not need to block 782542 on
> this.

The problem is that if there's a CSP violation in a subprocess, and the CSP specifies report-uri, then contentSecurityPolicy.js will initiate a network request without a load context.
(Assignee)

Comment 18

6 years ago
It looks to me like scanRequestData is always called with a valid channel object shortly after CSP creation, so we should be able to stash the load context from the channel in a private var and use that in sendReports later.
(Assignee)

Comment 19

6 years ago
Created attachment 696068 [details] [diff] [review]
Make CSP reports part of the originating document's loadgroup.

This makes the aborts go away. I'd like to get Jonas' feedback on this, to make sure that adding the report channel to the document's loadgroup won't have unintended side-effects.
Attachment #696068 - Flags: review?(sstamm)
(Assignee)

Updated

6 years ago
Attachment #696068 - Flags: review?(jonas)
Attachment #695067 - Flags: review?(sstamm) → review+
Comment on attachment 695091 [details] [diff] [review]
Update tests to deal with disabled CSP reporting

Review of attachment 695091 [details] [diff] [review]:
-----------------------------------------------------------------

Approach looks good, and I think that was all the tests that rely on reporting.
Attachment #695091 - Flags: review?(sstamm) → review+
Comment on attachment 696068 [details] [diff] [review]
Make CSP reports part of the originating document's loadgroup.

Review of attachment 696068 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine to me.  Jonas, will this have funny side-effects?  I'm wondering if the report sending will get cancelled if the page somehow navigates before the transmission is complete.

::: content/base/src/contentSecurityPolicy.js
@@ +311,5 @@
>            // so we can tell it to not follow redirects when posting the reports
>            chan.notificationCallbacks = new CSPReportRedirectSink(this._policy);
> +          if (this._docRequest) {
> +            chan.loadGroup = this._docRequest.loadGroup;
> +          }

This won't affect CSP's operations and seems good.

Favor while you're editing this file: can you tweak the comment above LOAD_ANONYMOUS to say "report URI" instead of "policy URI"?  It's wrong.
Attachment #696068 - Flags: review?(sstamm) → review+
(Assignee)

Updated

6 years ago
Assignee: jones.chris.g → josh
My finger is on the "obsolete" button for the hack patches ... ;)
Comment on attachment 696068 [details] [diff] [review]
Make CSP reports part of the originating document's loadgroup.

Review of attachment 696068 [details] [diff] [review]:
-----------------------------------------------------------------

For what it's wroth, this will have the effect that if the user leaves the page before we have time to send the CSP report, the report won't be sent.

As soon as the user leaves a page, all requests associated with a loadgroup will be cancelled and new ones connected to the loadgroup will be prevented from being started.

This doesn't seem like a big problem to me. And certainly a smaller problem than the problem this bug is filed on. But I thought it was worth mentioning. Something that definitely lessens this problem is that it doesn't really matter much if the request is cancelled after it has been initiated since we don't really care what the response of the request is. All that matters is that the request reaches the server.
Attachment #696068 - Flags: review?(jonas) → review+
(Assignee)

Comment 25

6 years ago
https://hg.mozilla.org/releases/mozilla-aurora/rev/1eecfd35a273
https://hg.mozilla.org/releases/mozilla-b2g18/rev/f58409e312c8
status-b2g18: --- → fixed
status-firefox19: --- → fixed
status-firefox20: --- → fixed
https://hg.mozilla.org/mozilla-central/rev/787938d80556
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Reporter)

Comment 27

6 years ago
Verified on 1/5 - marketplace dev is loading in the browser fine.
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.