Nightly crashes during YT Feedback's snapshot phase in xpc_FastGetCachedWrapper .

Status: RESOLVED WORKSFORME
Type: defect
Priority: --
Severity: critical
Opened: 7 years ago
Updated: 2 years ago
Reporter: rob1weld
Assignee: Unassigned
Keywords: crash, regression
Version: 15 Branch
Platform: x86, Windows XP
Points: ---
Firefox Tracking Flags: firefox15- affected
Whiteboard: fixed by bug 752340?, crash signature
Attachments: 1 attachment

User Agent: Mozilla/5.0 (Windows NT 5.1; rv:14.0) Gecko/20120511 Firefox/14.0a2
Build ID: 20120511042006

Steps to reproduce:

Usually when I use the YouTube "Report a Bug" link at the bottom of their Pages, the "Snapshot Phase" causes Nightly to crash. Only rarely does Nightly survive it, and I've yet to see the crash occur on Aurora. Every URL at YT with a "Report a Bug" Link works, but here is a URL to follow: https://www.youtube.com/user/LowLightVideos

This started a week ago (AFAIK) but admittedly I don't use the "RaB" Feature very often.

bp-494552eb-3508-4fb0-84fe-c4eb92120507

bp-461f562f-2219-4c47-b988-edb452120512
bp-3914d636-c605-44fb-9e73-d0e492120512
bp-5d3b9f4e-d718-4547-b386-27fe32120512


Crashing Thread
Frame 	Module 	Signature 	Source
0 	xul.dll 	xpc_FastGetCachedWrapper 	obj-firefox/dist/include/xpcpublic.h:148
1 	xul.dll 	nsIDOMNode_GetNextSibling 	obj-firefox/js/xpconnect/src/dom_quickstubs.cpp:5276
2 	mozjs.dll 	js::Shape::get 	js/src/jsscopeinlines.h:321
...


Different nearby problem: bp-d32ddc09-46df-464c-81c6-1811c2120511
Component: Untriaged → XPConnect
Product: Firefox → Core
Ah so this is NOT Nightly but actually Aurora???  Nightly is version 15.0a1.  You are running 140.a2 which is NOT Nightly.
Component: XPConnect → Untriaged
Product: Core → Firefox
Summary: Nightly crashes during YT Feedback's snapshot phase in xpc_FastGetCachedWrapper . → Aurora crashes during YT Feedback's snapshot phase in xpc_FastGetCachedWrapper .
Component: Untriaged → XPConnect
Product: Firefox → Core
QA Contact: untriaged → xpconnect
Hmm typo was trying to say you are running 14.0a2.
Oops. Ok i did this on the wrong bug.  Sorry, your crash reports are on nightly.  I apologize.
Summary: Aurora crashes during YT Feedback's snapshot phase in xpc_FastGetCachedWrapper . → Nightly crashes during YT Feedback's snapshot phase in xpc_FastGetCachedWrapper .
Severity: normal → critical
Status: UNCONFIRMED → NEW
Crash Signature: [@ xpc_FastGetCachedWrapper(nsWrapperCache*, JSObject*, JS::Value*)]
Ever confirmed: true
Keywords: crash
Version: 14 Branch → Trunk
Reported at YT also (but it is not theirs).


If you get this far then you passed where the crash occurs (hit [BACK]+"X" to avoid sending 'Test Feedback' to YT).

----------
Google Feedback

Click or drag on the page to help us better understand your feedback. You can move this dialog if it's in the way.

[] Highlight areas relevant to your feedback.

[] Black out any personal information.
----------
(In reply to Bill Gianopoulos [:WG9s] from comment #3)
> Oops. Ok i did this on the wrong bug.  Sorry, your crash reports are on
> nightly.  I apologize.

Bill,

The "User Agent" in Comment 0 is created based on the Browser _used_ to make a Bug Report - regardless of which version of the Browser _actually_ contains the Bug.

I did set the "Version" to "15 Branch" when I reported this. It was likely changed (by someone else) to "Trunk" (and Critical) to make certain we fix this quick.

I have tested Nightly and Aurora but not "FF Release".


It would have been helpful if BugZilla had noticed that the 'version was forced' and omitted the "User Agent" line from the first comment to avoid confusion (and SPAMming of Comment 0). 


You could file an RFE against Bugzilla, last time I tried it was:

Product: mozilla.org → bugzilla.mozilla.org
Component: Bugzilla: Other b.m.o Issues → General


PS: Thanks for your keen interest Bill.
(In reply to Rob from comment #5)
> Reported at YT also (but it is not theirs).
> 
> 
> If you get this far then you passed where the crash occurs (hit [BACK]+"X"
> to avoid sending 'Test Feedback' to YT).
> 
> ----------
> Google Feedback
> 
> Click or drag on the page to help us better understand your feedback. You
> can move this dialog if it's in the way.
> 
> [] Highlight areas relevant to your feedback.
> 
> [] Black out any personal information.
> ----------

This time when I tested on Nightly I got a _different_ message from YT.

The popup said something to the effect of: 'The Snapshot failed but you can continue to file a Bug Report without the Snapshot. [BACK] [NEXT]'.


When I hit [BACK] (in the popup, not the Browser) and clicked the [X] I got this (NEW problem) 'BP': bp-41db414e-04ad-4645-92dd-61f9c2120515


[@ XPCConvert::NativeInterface2JSObject(XPCLazyCallContext&, JS::Value*, nsIXPConnectJSObjectHolder**, xpcObjectHelper&, nsID const*, XPCNativeInterface**, bool, unsigned int*) ] 

Crashing Thread
Frame 	Module 		Signature 				Source
0 	xul.dll 	XPCConvert::NativeInterface2JSObject 	js/xpconnect/src/XPCConvert.cpp:943
1 	xul.dll 	mozilla::dom::XPCOMObjectToJsval 	dom/bindings/BindingUtils.cpp:285
2 	xul.dll 	mozilla::dom::binding::Wrap<nsIContent> 	js/xpconnect/src/dombindings.cpp:122
3 	xul.dll 	mozilla::dom::binding::ListBase<mozilla::dom::binding::ListClass<nsIHTMLCollecti 	js/xpconnect/src/dombindings.cpp:620
4 		@0xffffff85 	
5 	xul.dll 	xpc::XrayUtils::IsTransparent 	js/xpconnect/wrappers/XrayWrapper.cpp:988
...

-----

I tried again and got this (same problem as originally reported) "BP": bp-17bab00f-20a9-47f4-84ed-d6f402120515


[@ xpc_FastGetCachedWrapper(nsWrapperCache*, JSObject*, JS::Value*) ] 

Crashing Thread
Frame 	Module 	Signature 	Source
0 	xul.dll 	xpc_FastGetCachedWrapper 	obj-firefox/dist/include/xpcpublic.h:148
1 	xul.dll 	nsIDOMNode_GetNextSibling 	obj-firefox/js/xpconnect/src/dom_quickstubs.cpp:5276
2 	mozjs.dll 	js::Shape::get 	js/src/jsscopeinlines.h:321
...

-----

Thus, this problem can trigger a Bug in two different places (as a result of a single user action). 

Do you want a separate Report for bp-41db414e-04ad-4645-92dd-61f9c2120515 { XPCConvert::NativeInterface2JSObject() }, a comment in another BR, or should we keep them together while we wait for the persons fixing this to take a peek?


Bug "XPCConvert::NativeInterface2JSObject()" might be Bug 704556, Bug 674568, Bug 480682, etc.
Duplicate of this bug: 756683
Does this happen on Aurora?  If not, we should look for a regression range here.
I got a different backtrace for the same STR: https://crash-stats.mozilla.com/report/index/bp-cab10ddc-09a3-4a86-b556-c6f2c2120518

0 	xul.dll 	xpc_MorphSlimWrapper 	js/xpconnect/src/nsXPConnect.cpp:1238
1 	xul.dll 	nsNodeUtils::CloneAndAdopt 	content/base/src/nsNodeUtils.cpp:430
2 	xul.dll 	nsNodeUtils::CloneAndAdopt 	content/base/src/nsNodeUtils.cpp:591
3 	xul.dll 	nsNodeUtils::CloneAndAdopt 	content/base/src/nsNodeUtils.cpp:591
(In reply to Kyle Huey [:khuey] (khuey@mozilla.com) from comment #9)
> Does this happen on Aurora?  If not, we should look for a regression range
> here.

I have never seen it on Aurora nor can I cause it to happen now.

> regression range
Less than 6 weeks, possibly less than 3. Has anyone else used "RaB" on Nightly in the last month, prior to 2012-05-07 19:48:53?

---
about:addons

Shockwave Flash 	11.2.202.235
Last Updated 			Wednesday, May 09, 2012
A regressing changeset would be very helpful here.
Crash Signature: [@ xpc_FastGetCachedWrapper(nsWrapperCache*, JSObject*, JS::Value*)] → [@ xpc_FastGetCachedWrapper(nsWrapperCache*, JSObject*, JS::Value*)] [@ xpc_MorphSlimWrapper(JSContext*, nsISupports*)]
Keywords: regression
Version: Trunk → 15 Branch
Crash Signature: [@ xpc_FastGetCachedWrapper(nsWrapperCache*, JSObject*, JS::Value*)] [@ xpc_MorphSlimWrapper(JSContext*, nsISupports*)] → [@ xpc_FastGetCachedWrapper(nsWrapperCache*, JSObject*, JS::Value*)] [@ xpc_MorphSlimWrapper(JSContext*, nsISupports*)] [@ XPCConvert::NativeInterface2JSObject(XPCLazyCallContext&, JS::Value*, nsIXPConnectJSObjectHolder**, xpcObjectHelper&, nsID const* …
Regression window(m-c)
Not crash:
http://hg.mozilla.org/mozilla-central/rev/2db9df42823d
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/15.0 Firefox/15.0a1 ID:20120504030509
Crash:
http://hg.mozilla.org/mozilla-central/rev/e1a40027dc7e
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/15.0 Firefox/15.0a1 ID:20120504014349
Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=2db9df42823d&tochange=e1a40027dc7e

Regression window(m-i)
Not crash
http://hg.mozilla.org/integration/mozilla-inbound/rev/ac00c792933e
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/15.0 Firefox/15.0a1 ID:20120503001006
Crash:
http://hg.mozilla.org/integration/mozilla-inbound/rev/bed8c4e3dfdf
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:15.0) Gecko/15.0 Firefox/15.0a1 ID:20120503001301
Pushlog:
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ac00c792933e&tochange=bed8c4e3dfdf
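
For anyone repeating this bisection, here is a minimal sketch of how the pushlog links above can be rebuilt from a good/bad changeset pair. The helper names `pushlog_url` and `parse_build_id` are hypothetical (not part of any Mozilla tool), and reading the `ID:` field as a `YYYYMMDDHHMMSS` timestamp is an assumption based on how the IDs in this comment look.

```python
from datetime import datetime
from urllib.parse import urlencode


def pushlog_url(repo_path, good_rev, bad_rev):
    """Build a pushloghtml URL in the same shape as the links in this comment.

    repo_path is the path under hg.mozilla.org, for example
    "mozilla-central" or "integration/mozilla-inbound".
    """
    query = urlencode({"fromchange": good_rev, "tochange": bad_rev})
    return f"http://hg.mozilla.org/{repo_path}/pushloghtml?{query}"


def parse_build_id(build_id):
    """Interpret a build ID as a timestamp (assumed format: YYYYMMDDHHMMSS)."""
    return datetime.strptime(build_id, "%Y%m%d%H%M%S")


# Values copied from the mozilla-inbound window reported above.
inbound_window = pushlog_url(
    "integration/mozilla-inbound", "ac00c792933e", "bed8c4e3dfdf"
)
last_good_build = parse_build_id("20120503001006")
first_bad_build = parse_build_id("20120503001301")
```

Comparing the two parsed timestamps is a quick sanity check that the "not crash" build really precedes the "crash" build before trusting the window.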
OK, totally reproducible.  In js::GetObjectClass we have:

(gdb) p ((class 'js::shadow::Object'*)obj)->shape->base
$13 = (struct BaseShape *) 0xdadadadadadadada

which is ... not happy.

In fact:

(gdb) p/x *((class 'js::shadow::Object'*)obj)->shape
$26 = {
  base = 0xdadadadadadadada, 
  _1 = {
    asBits = 0xdadadadadadadada
  }, 
  slotInfo = 0xdadadada, 
  static FIXED_SLOTS_SHIFT = 0x1b
}

And further:

(gdb) p/x *((class 'js::shadow::Object'*)obj)  
$27 = {
  shape = 0x14c009340, 
  type = 0x14c009fff, 
  slots = 0xdadadadadadadada, 
  _1 = 0xdadadadadadadada
}

So the shape and type pointers are OK, but the rest of the object looks overwritten.  Similarly, if I look inside the type, some parts are OK but some are 0xdada....
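
A small sketch of the check being done by eye here: flag every dumped field whose bytes are all the same fill value. Treating 0xda as a fill byte is an assumption drawn from the values printed above, and `is_fill_pattern` is a hypothetical helper, not a gdb or SpiderMonkey API.

```python
def is_fill_pattern(word, fill_byte=0xDA, width=8):
    """Return True if every one of the `width` bytes in `word` equals fill_byte."""
    return word.to_bytes(width, "little") == bytes([fill_byte]) * width


# Field values copied from the `p/x *((class 'js::shadow::Object'*)obj)` output above.
dumped_object = {
    "shape": 0x14C009340,
    "type": 0x14C009FFF,
    "slots": 0xDADADADADADADADA,
    "_1": 0xDADADADADADADADA,
}

# Names of the fields that look overwritten rather than holding real pointers.
suspect = [name for name, value in dumped_object.items() if is_fill_pattern(value)]
```

The 32-bit `slotInfo` value from the shape dump can be checked the same way with `is_fill_pattern(0xDADADADA, width=4)`.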
Assignee: nobody → general
Group: core-security
Component: XPConnect → JavaScript Engine
QA Contact: xpconnect → general
We need a StackTrace.

Note: Used these commands before and after, then "gn" and "gh" to continue.

|* ~* kp
|* !analyze -v -f
|* lm
|* !analyze -v -hang



PS: MANY lines containing following were deleted from the Log:
http://symbols.mozilla.org/firefox/firefox.exe/4FBC305Fe1000/firefox.exe not found
http://symbols.mozilla.org/firefox/mozjs.dll/4FBC1F371f9000/mozjs.dll not found
http://symbols.mozilla.org/firefox/xul.dll/4FBC2FDFed0000/xul.dll not found

Is the Server missing some Symbols ?
The stack trace is just us calling the .firstChild quickstub via a cross-compartment wrapper, then grabbing the JSObject* from the nsWrapperCache of the node and trying to get its class in xpc_FastGetCachedWrapper.
Whiteboard: [js:p1:fx15]
This sounds a lot like bug 752340 to me. Both involve getting a GCed object out of the wrapper cache and both have CPG in the regression range.
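
To make the suspected failure mode concrete, here is a toy model (not XPConnect or SpiderMonkey code) of a cache that keeps a raw address after the collector has reclaimed the object behind it. `ToyHeap`, `ToyWrapperCache` and the 0xDA poison value are all assumptions made for illustration.

```python
POISON = 0xDA  # assumed fill value, matching the 0xdada... seen in the gdb dump


class ToyHeap:
    """A fake heap: addresses map to payloads, and collection poisons the cell."""

    def __init__(self):
        self.cells = {}
        self.next_addr = 0x1000

    def allocate(self, payload):
        addr = self.next_addr
        self.next_addr += 0x10
        self.cells[addr] = payload
        return addr

    def collect(self, addr):
        # Simulate the GC reclaiming the cell and overwriting its contents.
        self.cells[addr] = POISON

    def read(self, addr):
        return self.cells[addr]


class ToyWrapperCache:
    """Holds a raw address for a wrapper, with no ownership of the object."""

    def __init__(self):
        self.wrapper_addr = None

    def set_wrapper(self, addr):
        self.wrapper_addr = addr

    def clear_wrapper(self):
        self.wrapper_addr = None

    def get_wrapper(self):
        return self.wrapper_addr


heap = ToyHeap()
cache = ToyWrapperCache()

addr = heap.allocate({"class": "Node"})
cache.set_wrapper(addr)

# The object is collected, but nothing tells the cache about it.
heap.collect(addr)

# Reading through the stale cached address yields poison, not an object.
stale = heap.read(cache.get_wrapper())

# The safe ordering: drop the cache entry whenever the object is collected.
cache.clear_wrapper()
after_clear = cache.get_wrapper()
```

In this model the crash corresponds to asking `stale` for its class: the cached address is still well-formed, but what it points at no longer is.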
(In reply to Bill McCloskey (:billm) from comment #17)
> This sounds a lot like bug 752340 to me. Both involve getting a GCed object
> out of the wrapper cache and both have CPG in the regression range.

I am not authorized to access bug #752340, so I can not comment.

-----

While I am here ...

Reading http://hg.mozilla.org/mozilla-central/rev/e1a40027dc7e my $0.0002 Cents is:

In xpc_CreateSandboxObject() (@ http://hg.mozilla.org/mozilla-central/rev/e1a40027dc7e#l360.36) we _previously_:

1. Used the Parameter "*identityPtr" when we called the Function.
2. Later (360.56) we tested "*identityPtr" and IF it were null we allocated "identity" and then pointed "*identityPtr" to it.
3. Then in 360.62 we used "*identityPtr" (which had TWO chances to be allocated) to call xpc_CreateGlobalObject() .
4. Then in 360.101 we set "*identity = nsnull;".
5. Later (360.143) we _might_ set "identity = compartmentPrivate->key->GetPtr();"
6. Then we use "identity" (which might be null or assigned) in 360.162 .


After the change (in xpc_CreateSandboxObject() ) we:

1. Do not use "*identityPtr" in the Function's Parameters (360.38).
2. In 360.64 we use "identity" (which _might_ have been successfully allocated in 360.63) instead of "identityPtr" (which had TWO chances to be allocated), and that is its last use; we no longer use "identity" in 360.163 .


With a quick look (and poor familiarity with the Code) it looks like we used to:

1. Make a double effort to set or allocate "identityPtr" and use it in 360.62 .
2. Set "identity" to nsnull in 360.101 .
3. We _might_ reallocate "identity" in 360.143, else it is nsnull .
4. We use "identity" in 360.162 .

Now we:

1. Make a single effort to allocate "identity" to the Function "Identity()" assume it was OK, and use it immediately in 360.64 .
2. Neither "identityPtr" nor "identity" is used after that.
3. Do not have a chance that whatever called xpc_CreateSandboxObject() set "*identityPtr" to something (for subsequent use by "identity") and instead we hope that the call to "Identity()" worked and that "New" doesn't have a bad_alloc exception.

The test in 360.57 gave us a greater assurance of a non-null value and we probably would have set a correct value when we called xpc_CreateSandboxObject() now we use 360.63 to replace that; but there is a small chance it could fail.


We've gone from using a Parameter which was likely set correctly (by the calling Function, unless there are two Bugs ;( ) and testing it, to using a Variable that might not be set correctly and performing no tests on it (in the unlikely event that "New" or "Identity()" fail) before using it. Then off we go to call (recursively) xpc_CreateSandboxObject() in 360.163 .


I hope that was well explained and an interesting diversion.
I have same issue related to bug 752340.
I realize that, but so is this one.  If people expect others to follow along with what they are saying and possibly help with a solution, then perhaps cc'ing people would help. Just sayin'!
If not, then taking this bug and kind of duping it to a hidden bug without copying the cc list is kind of counter to what I thought was the openness that was supposed to be at the heart of this project.  Also just sayin'!
This bug and bug 752340 don't affect release builds or even Beta builds. Why are they hidden?
Because we don't want our nightly users being exploited either?
(In reply to Boris Zbarsky (:bz) from comment #24)
> Because we don't want our nightly users being exploited either?
I don't think there are malicious people scanning Bugzilla to find security holes in unreleased builds. Malicious people exploit known vulnerabilities for users without the latest release version or unknown vulnerabilities.
> I don't think there are malicious people scanning Bugzilla to find security holes in
> unreleased builds.

There have been in the past.  Why do you think there aren't now?
(In reply to Scoobidiver from comment #25)
> (In reply to Boris Zbarsky (:bz) from comment #24)
> > Because we don't want our nightly users being exploited either?
> I don't think there are malicious people scanning Bugzilla to find security
> holes in unreleased builds. Malicious people exploit known vulnerabilities
> for users without the latest release version or unknown vulnerabilities.

(some) People (sh/w)ould look for whichever 'hole' is available wherever they can find it and may utilize the knowledge to apply it elsewhere (called a '0 Day Exploit'). We should give no free ride.

OTOH people with access to this Bug should be considered for access to the other as mentioned in Comment 22. It is less work for me not to have the access and less help for you to refuse it.
I also found this by Googling: bp-d254aeb1-f62b-4096-80cc-47ced2120513 (on Linux) with a slightly differing Thread path leading to xpc_FastGetCachedWrapper() in jsfriendapi.h:356 (without a Socorro generated Link).

Some of the Links provided by Socorro in that "BP" give a "mozilla-central - error" and "An error occurred while processing your request
obj-firefox/js/xpconnect/src/dom_quickstubs.cpp@c758cc9b60e5: not found in manifest" .


Should we recheck the Regression Window (this Bug may have come and gone, then come back again)?

These files: https://crash-analysis.mozilla.com/rkaiser/2012-04-20/ff-13-crashcount.csv https://crash-analysis.mozilla.com/rkaiser/2012-02-22/2012-02-22.firefox.esr.10.0.components.html show a crash count for xpc_FastGetCachedWrapper(nsWrapperCache*, JSObject*, JS::Value*) and are dated prior to the dates given in Comment 13.
Whiteboard: [js:p1:fx15] → [js:p1:fx16]
This WFM on 14, 15, and 16. Is anyone still seeing this bug?
Whiteboard: [js:p1:fx16] → [js:waitingforinfo][js:p1:fx16]
Whiteboard: [js:waitingforinfo][js:p1:fx16] → [js:waitingforinfo]
Here is the only recent crash report I have seen: bp-be3e50f0-e972-4194-833f-2f6792120718. nsIDOMNode_GetNextSibling is not in the stack trace.
(In reply to David Mandelin [:dmandelin] from comment #29)
> This WFM on 14, 15, and 16. Is anyone still seeing this bug?

Given this and comment 30, can we untrack for upcoming versions?
(In reply to Scoobidiver from comment #23)
> This bug and bug 752340 don't affect release builds or even Beta builds. Why
> are they hidden?

Not only are there a lot of Nightly and Aurora users who would be interesting to attack should someone want to nail a specific target (people who can check in? people who can see all the other security bugs?), we sometimes don't get the bugs fixed before they hit Beta or Release.
(In reply to Alex Keybl [:akeybl] from comment #31)
> (In reply to David Mandelin [:dmandelin] from comment #29)
> > This WFM on 14, 15, and 16. Is anyone still seeing this bug?
> 
> Given this and comment 30, can we untrack for upcoming versions?

Unhide or untrack? On hiding, I'll defer to Dan in comment 32. On tracking, I don't think we need to track it for now.
Given comment 33, un-tracking for 15.
This still sounds like a dupe of bug 752340 to me, so it makes sense to unhide them at the same time.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → WORKSFORME
Whiteboard: [js:waitingforinfo] → fixed by bug 752340?
Group: core-security → core-security-release
Group: core-security-release