Closed Bug 754572 Opened 8 years ago Closed 7 years ago

Crash [@ ns_if_addref<nsIScreen*>] with use-after-free

Categories

(Firefox for Android :: General, defect)

All
Android
defect
Not set

Tracking

()

RESOLVED WORKSFORME
Tracking Status
blocking-fennec1.0 --- -
fennec + ---

People

(Reporter: decoder, Assigned: bjacob)

References

(Blocks 1 open bug)

Details

(Keywords: sec-critical, Whiteboard: [leave open])

Attachments

(2 files)

Filing this without a test as discussed on IRC. I hit this crash while fuzzing on mobile (mozilla-central, debug build, rev 448f554f6acb) but I cannot reproduce. minidump_stackwalk isn't able to give much information:

Operating system: Linux
                  0.0.0 Linux 2.6.32.9-00002-gd8084dc-dirty #1 SMP PREEMPT Wed Feb 2 11:32:06 PST 2011 armv7l
CPU: arm
     0 CPUs

Crash reason:  SIGSEGV
Crash address: 0x5a5a5a5e

Thread 4 (crashed)
 0  libxul.so + 0xd44278
     r4 = 0x531f1400    r5 = 0x4e302ce8    r6 = 0x4e302c7c    r7 = 0x4e3027b0
     r8 = 0x83afefb0    r9 = 0x4fc03700   r10 = 0x82d4426d    fp = 0x00000000
     sp = 0x4e3027b0    lr = 0x825c7291    pc = 0x82d44278
    Found by: given as instruction pointer in context
 1  libxul.so + 0x5c7290
     sp = 0x4e3027b8    pc = 0x825c7291
    Found by: stack scanning
 2  libxul.so + 0x1afefaf
     sp = 0x4e3027c8    pc = 0x83afefb0
    Found by: stack scanning
 3  libxul.so + 0x758d58
     sp = 0x4e3027d8    pc = 0x82758d59
    Found by: stack scanning
 4  libxul.so + 0x41374c
     sp = 0x4e3027e8    pc = 0x8241374d
    Found by: stack scanning
 5  libxul.so + 0x5c75fa
     sp = 0x4e3027f8    pc = 0x825c75fb
    Found by: stack scanning
 6  libxul.so + 0x1b0b47f
     sp = 0x4e302874    pc = 0x83b0b480
    Found by: stack scanning
 7  libxul.so + 0x1b0b47f
     sp = 0x4e3028b0    pc = 0x83b0b480
    Found by: stack scanning
 8  libnspr4.so!quorem [prdtoa.c:448f554f6acb : 2599 + 0x7]
     sp = 0x4e302900    pc = 0x81410000
    Found by: stack scanning


Not sure why minidump_stackwalk cannot resolve the symbols here. My own implementation using addr2line is able to symbolize the trace at least partly using the original binary for reconstruction:

#0      ns_if_addref<nsIScreen*> at /home/decoder/LangFuzz/mozilla-central-browser/objdir-droid/widget/android/../../dist/include/nsISupportsUtils.h:94
#1      $t at nsContentUtils.cpp:0
#2      ?? at ??:0
#3      nsHTMLDocument::Release() at libgcc2.c:0
#4      ~nsCOMPtr at /home/decoder/LangFuzz/mozilla-central-browser/objdir-droid/layout/base/../../dist/include/nsCOMPtr.h:522
#5      $t at nsContentUtils.cpp:0
#6      0x813fffff at 
#7      $d at nsGkAtoms.cpp:0


In logcat, I spotted this after the fuzzer was done with it's last command:

I/Gecko   (18634): void mozilla::AndroidBridge::HandleGeckoMessage(const nsAString_internal&, nsAString_internal&)
I/Gecko   (18634): leaving void mozilla::AndroidBridge::HandleGeckoMessage(const nsAString_internal&, nsAString_internal&)
I/Gecko   (18634): AndroidBridge::SetFullScreen


So I checked and it seems to fuzzer did play with mozRequestFullscreen before the crash, so it might be related to the fullscreen API (which caused security problems before, e.g. bug 725770).

I'll try to reproduce this again when I have the addon installed that allows us to GC(), but right now I can't reproduce this (and therefore also not confirm it's mobile specific).
For later (internal) reference:

Fuzzer ID: tegra-287
Issue ID: 3f28fee5-5397-683a-111179a7-6cbdbc9c
Benoit, is this related to the nsScreenManagerAndroid non-thread safety bug you were looking at?
Forgot to mention that there are two assertions popping up several times immediately before end of log:

I/Gecko   (18634): ###!!! ASSERTION: nsScreenManagerAndroid not thread-safe: '_mOwningThread.GetThread() == PR_GetCurrentThread()', file /home/decoder/LangFuzz/mozilla-central-browser/widget/android/nsScreenManagerAndroid.cpp, line 98
I/Gecko   (18634): ###!!! ASSERTION: nsBaseScreen not thread-safe: '_mOwningThread.GetThread() == PR_GetCurrentThread()', file /home/decoder/LangFuzz/mozilla-central-browser/widget/xpwidgets/nsBaseScreen.cpp, line 43
I
Blocks: adbfuzz
Keywords: sec-critical
QA Contact: general → choller
Whiteboard: [sg:needinfo]
blocking-fennec1.0: --- → ?
tracking-fennec: --- → 14+
blocking-fennec1.0: ? → .N+
Did you have multiple screens?  Ie did you have a HDMI out or something similar?
(In reply to Naoki Hirata :nhirata from comment #4)
> Did you have multiple screens?  Ie did you have a HDMI out or something
> similar?

Not sure about that, this happened on the Tegra pool :)
By default wouldn't that mean that there is no screen attached what-so-ever?
(In reply to Naoki Hirata :nhirata from comment #6)
> By default wouldn't that mean that there is no screen attached what-so-ever?

That might be the case, yes. I don't think that's relevant though, as these boards are the reference devices for all of our automated mobile tests.
Whiteboard: [sg:needinfo]
Benoit, weren't you looking at this?
Assignee: nobody → bjacob
Perhaps this is just bug 725770? In any case, not very likely that someone else will find a testcase for this.
My guess is that you would need to be on a tegra or panda board in order to reproduce this crasher.
I don't think it's quite the same crasher on any device because the test case from bug 725770 will crash with [nsDocShell::RestoreFromHistory]
https://crash-stats.mozilla.com/report/index/bp-7bb03e4e-1be7-4174-a2a6-3ab282120619
What's the status on this bug?
AFAIK, we don't have test case, nor steps to reproduce.  Decoder cannot reproduce this bug either after running into it at one time.
tracking-fennec: 14+ → +
blocking-fennec1.0: .N+ → -
This assertion should be fatal because:
 - it covers a plain Gecko programming mistake (doing thing from the wrong thread)
 - the error it covers can lead to crashes and other bad things
 - the present bug is an example of how the non-fatality of this assertion is making our life harder
Attachment #637229 - Flags: review?(bzbarsky)
By the way, I don't understand what this macro is doing in the XPCOM_GLUE case. It is then empty, so NS_CheckThreadSafe(condition, msg)  evaluates to (condition, msg), so the condition is evaluated even in release builds. I doubt that's intentional??
Proposed fix for the issue raised in the previous comment.
Attachment #637230 - Flags: review?(bzbarsky)
Comment on attachment 637229 [details] [diff] [review]
add options to use fatal assertions in NS_CheckThreadSafe, use it in nsBaseScreen and nsScreenManagerAndroid

This seems like something Benjamin should look at.
Attachment #637229 - Flags: review?(bzbarsky) → review?(benjamin)
Comment on attachment 637230 [details] [diff] [review]
let NS_CheckThreadSafe(x,y) be empty in XPCOM_GLUE case, as opposed to letting it evaluate to (x,y)

r=me
Attachment #637230 - Flags: review?(bzbarsky) → review+
Christian, can you please try the tryserve build from comment 20? Does it fix it for you?
Ahem, sorry, I meant to ask: what does the crash look like with this build? Certainly these patches don't fix the crash, they just hopefully will give more understandable crashes.
Attachment #637229 - Flags: review?(benjamin) → review+
Filed bug 771148 about making the assertion unconditionally fatal.
Christian, please retry with current mozilla-central now!
(In reply to Benoit Jacob [:bjacob] from comment #27)
> Christian, please retry with current mozilla-central now!

I'm rebuilding now from central and will restart some domfuzz instances on it then. I'll keep an eye on the thread safety assertions and report back :)
I didn't hit this so far, so I'm closing as WFM. But I'll reopen when I see the assertion.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
Group: core-security → core-security-release
Group: core-security-release
You need to log in before you can comment on or make changes to this bug.