Intermittent test_csp_redirects.html,test_policyuri_regression_from_multipolicy.html | application crashed [@ nsCSPContext::SendReports(nsISupports*, nsIURI*, nsAString_internal&, unsigned int, nsAString_internal&, nsAString_internal&, unsigned int)]

RESOLVED FIXED in Firefox 34

Status

()

defect
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: KWierso, Assigned: ckerschb)

Tracking

({crash, intermittent-failure})

unspecified
mozilla36
x86_64
macOS
Points:
---

Firefox Tracking Flags

(firefox34 fixed, firefox35 fixed, firefox36 fixed, firefox-esr31 unaffected)

Details

(crash signature)

Attachments

(1 attachment)

Reporter

Description

5 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=48912227&tree=Mozilla-Inbound
Rev4 MacOSX Snow Leopard 10.6 mozilla-inbound opt test mochitest-1 on 2014-09-25 16:50:08 PDT for push 72e22f3d7cc6

slave: t-snow-r4-0119



16:52:28     INFO -  90 INFO TEST-OK | /tests/content/base/test/csp/test_multi_policy_injection_bypass.html | took 115ms
16:52:29     INFO -  91 INFO TEST-START | /tests/content/base/test/csp/test_nonce_source.html
16:52:29     INFO -  92 INFO TEST-OK | /tests/content/base/test/csp/test_nonce_source.html | took 192ms
16:52:29     INFO -  93 INFO TEST-START | /tests/content/base/test/csp/test_policyuri_regression_from_multipolicy.html
16:52:31     INFO -  TEST-INFO | Main app process: killed by SIGHUP
16:52:31     INFO -  94 ERROR TEST-UNEXPECTED-FAIL | /tests/content/base/test/csp/test_policyuri_regression_from_multipolicy.html | application terminated with exit code 1
16:52:31     INFO -  runtests.py | Application ran for: 0:00:29.197250
16:52:31     INFO -  zombiecheck | Reading PID log: /var/folders/OB/OBLZlmZXH7igb1IHOVwzCE+++-k/-Tmp-/tmpUruTSupidlog
16:52:31     INFO -  mozcrash Downloading symbols from: https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-macosx64/1411684960/firefox-35.0a1.en-US.mac.crashreporter-symbols.zip
16:53:03     INFO -  mozcrash Saved minidump as /builds/slave/talos-slave/test/build/blobber_upload_dir/D2D2D7D1-6882-47FA-AE0F-D0DBB0374C65.dmp
16:53:03     INFO -  mozcrash Saved app info as /builds/slave/talos-slave/test/build/blobber_upload_dir/D2D2D7D1-6882-47FA-AE0F-D0DBB0374C65.extra
16:53:03  WARNING -  PROCESS-CRASH | /tests/content/base/test/csp/test_policyuri_regression_from_multipolicy.html | application crashed [@ nsCSPContext::SendReports(nsISupports*, nsIURI*, nsAString_internal&, unsigned int, nsAString_internal&, nsAString_internal&, unsigned int)]
16:53:03     INFO -  Crash dump filename: /var/folders/OB/OBLZlmZXH7igb1IHOVwzCE+++-k/-Tmp-/tmp2JqvTP.mozrunner/minidumps/D2D2D7D1-6882-47FA-AE0F-D0DBB0374C65.dmp
16:53:03     INFO -  Operating system: Mac OS X
16:53:03     INFO -                    10.6.8 10K549
16:53:03     INFO -  CPU: amd64
16:53:03     INFO -       family 6 model 23 stepping 10
16:53:03     INFO -       2 CPUs
16:53:03     INFO -  Crash reason:  EXC_BAD_ACCESS / 0x0000000d
16:53:03     INFO -  Crash address: 0x0
16:53:03     INFO -  Thread 0 (crashed)
16:53:03     INFO -   0  XUL!nsCSPContext::SendReports(nsISupports*, nsIURI*, nsAString_internal&, unsigned int, nsAString_internal&, nsAString_internal&, unsigned int) [nsTArray.h:72e22f3d7cc6 : 329 + 0x7]
16:53:03     INFO -      rbx = 0x000000011f28aaa0   r12 = 0x000000012bee81d0
16:53:03     INFO -      r13 = 0x00007fff5fbfce00   r14 = 0x000000012c947860
16:53:03     INFO -      r15 = 0x0000000000000000   rip = 0x0000000103241310
16:53:03     INFO -      rsp = 0x00007fff5fbfc470   rbp = 0x00007fff5fbfcbb0
16:53:03     INFO -      Found by: given as instruction pointer in context
16:53:03     INFO -   1  XUL!CSPReportSenderRunnable::Run() [nsCSPContext.cpp:72e22f3d7cc6 : 835 + 0x29]
16:53:03     INFO -      rbx = 0x000000011f9b1c00   r12 = 0x000000010038c708
16:53:03     INFO -      r13 = 0x00007fff5fbfce00   r14 = 0x00000001003fdbf0
16:53:03     INFO -      r15 = 0x0000000000000000   rip = 0x0000000103266658
16:53:03     INFO -      rsp = 0x00007fff5fbfcbc0   rbp = 0x00007fff5fbfcd30
16:53:03     INFO -      Found by: call frame info
16:53:03     INFO -   2  XUL!nsThread::ProcessNextEvent(bool, bool*) [nsThread.cpp:72e22f3d7cc6 : 823 + 0x5]
16:53:03     INFO -      rbx = 0x0000000100337530   r12 = 0x000000010038c708
16:53:03     INFO -      r13 = 0x00007fff5fbfce00   r14 = 0x0000000100337558
16:53:03     INFO -      r15 = 0x0000000100337598   rip = 0x0000000101c3b9b0
16:53:03     INFO -      rsp = 0x00007fff5fbfcd40   rbp = 0x00007fff5fbfcdf0
16:53:03     INFO -      Found by: call frame info
16:53:03     INFO -   3  XUL!NS_ProcessPendingEvents(nsIThread*, unsigned int) [nsThreadUtils.cpp:72e22f3d7cc6 : 207 + 0xd]
16:53:03     INFO -      rbx = 0x0000000000000000   r12 = 0x0000000000087376
16:53:03     INFO -      r13 = 0x00007fff5fbfce07   r14 = 0x0000000000000014
16:53:03     INFO -      r15 = 0x0000000100337530   rip = 0x0000000101c5ae01
16:53:03     INFO -      rsp = 0x00007fff5fbfce00   rbp = 0x00007fff5fbfce30
16:53:03     INFO -      Found by: call frame info
16:53:03     INFO -   4  XUL!nsBaseAppShell::NativeEventCallback() [nsBaseAppShell.cpp:72e22f3d7cc6 : 98 + 0xe]
16:53:03     INFO -      rbx = 0x000000010038c700   r12 = 0x0000000000000000
16:53:03     INFO -      r13 = 0x0000000000000001   r14 = 0x0000000100337530
16:53:03     INFO -      r15 = 0x000000010038c700   rip = 0x0000000103137267
16:53:03     INFO -      rsp = 0x00007fff5fbfce40   rbp = 0x00007fff5fbfce60
16:53:03     INFO -      Found by: call frame info
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Flags: needinfo?(mozilla)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
I am investigating, but still haven't found the root cause of the problem. Can't reproduce locally - let's see how we can fix that.
Flags: needinfo?(mozilla)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Summary: Intermittent test_policyuri_regression_from_multipolicy.html | application crashed [@ nsCSPContext::SendReports(nsISupports*, nsIURI*, nsAString_internal&, unsigned int, nsAString_internal&, nsAString_internal&, unsigned int)] → Intermittent test_csp_redirects.html,test_policyuri_regression_from_multipolicy.html | application crashed [@ nsCSPContext::SendReports(nsISupports*, nsIURI*, nsAString_internal&, unsigned int, nsAString_internal&, nsAString_internal&, unsigned int)]
Chris: Can we put some argument checks at the top of nsCSPContext::SendReports to spit out warnings/assertions when the required arguments or nsCSPContext member variables are null?
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment hidden (Legacy TBPL/Treeherder Robot)
Sid, looking at the output in Comment 0, the crash is related to an nsTArray, so I think we have to update [1] to be:
>  nsTArray<nsCOMPtr<nsCSPPolicy*> > mPolicies;

Since SendReports runs in a different Thread, we probably deref a nullptr in that array here [2]:

Do you agree?

[1] http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsCSPContext.h#66
[2] http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsCSPContext.cpp#647
Flags: needinfo?(sstamm)
yeah, possibly.  I wonder if we can easily pass *all* the data into the constructor of CSPReportSenderRunnable so it's not dependent on the context...

Another option is to just do a nullcheck and punt on reporting if things are null.
Flags: needinfo?(sstamm)
My initial guess from Comment 26 is not completely right, but nevertheless helped me find the real problem here. Since nsCSPContext was a 'regular' pointer in SenderRunnable, most likely the GC got called in between and cleaned up the CSPContext; calling the destructor of CSPContext [1]; deleting all the policies within the nsTarray which leads to exactly the problem reported in comment 0.

I think this patch should fix the problem.
Sid, any reason mCSPContext was not a refPtr in the first place?

[1] http://mxr.mozilla.org/mozilla-central/source/content/base/src/nsCSPContext.cpp#238
Attachment #8505658 - Flags: review?(sstamm)
Comment hidden (Legacy TBPL/Treeherder Robot)
Comment on attachment 8505658 [details] [diff] [review]
bug_1073316_use_nsrefptr_to_store_context_in_senderrunnable.patch

Review of attachment 8505658 [details] [diff] [review]:
-----------------------------------------------------------------

This looks like a fine change, I can't think of a reason it was a not refcounted before.  Can we trigger this problem even pseudo-reliably?  (I'm trying to understand if there's a good way to check if this fixes things.)
Attachment #8505658 - Flags: review?(sstamm) → review+
(In reply to Sid Stamm [:geekboy or :sstamm] from comment #30)
> Comment on attachment 8505658 [details] [diff] [review]
> bug_1073316_use_nsrefptr_to_store_context_in_senderrunnable.patch
> 
> Review of attachment 8505658 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> This looks like a fine change, I can't think of a reason it was a not
> refcounted before.  Can we trigger this problem even pseudo-reliably?  (I'm
> trying to understand if there's a good way to check if this fixes things.)

I tried to create a simplified testcase but I think there is no easy way to check if that fixes things. I guess we have to land the patch and then wait for a couple of days if that TBPL error reoccurs.
Assignee: nobody → mozilla
Status: NEW → ASSIGNED
https://hg.mozilla.org/mozilla-central/rev/7ae7a60dcda8
Status: ASSIGNED → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Please request Aurora and Beta approval on this when you get a chance :)
Flags: needinfo?(mozilla)
Comment on attachment 8505658 [details] [diff] [review]
bug_1073316_use_nsrefptr_to_store_context_in_senderrunnable.patch

Approval Request Comment
[Feature/regressing bug #]: New CSP backend; Bug 925004 and it's dependents.

[User impact if declined]: occasional browser crashes when sending CSP reports.

[Describe test coverage new/current, TBPL]: Reoccuring TBPL crashes. There is no easy way to trigger that race condition that causes the browser crash.

[Risks and why]: low, just adding a refcount to a pointer.

[String/UUID change made/needed]: no
Attachment #8505658 - Flags: approval-mozilla-aurora?
Comment on attachment 8505658 [details] [diff] [review]
bug_1073316_use_nsrefptr_to_store_context_in_senderrunnable.patch

Approval Request Comment
[Feature/regressing bug #]: New CSP backend; Bug 925004 and it's dependents.

[User impact if declined]: occasional browser crashes when sending CSP reports.

[Describe test coverage new/current, TBPL]: Reoccuring TBPL crashes. There is no easy way to trigger that race condition that causes the browser crash.

[Risks and why]: low, just adding a refcount to a pointer.

[String/UUID change made/needed]: no
Flags: needinfo?(mozilla)
Attachment #8505658 - Flags: approval-mozilla-beta?
Comment on attachment 8505658 [details] [diff] [review]
bug_1073316_use_nsrefptr_to_store_context_in_senderrunnable.patch

Beta+
Aurora+
Attachment #8505658 - Flags: approval-mozilla-beta?
Attachment #8505658 - Flags: approval-mozilla-beta+
Attachment #8505658 - Flags: approval-mozilla-aurora?
Attachment #8505658 - Flags: approval-mozilla-aurora+
Can anyone please land/uplift the patch to aurora and beta for me?
Keywords: checkin-needed
Yes, don't worry about it :) We have bug queries for uplifts.
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.