Closed Bug 854614 Opened 8 years ago Closed 8 years ago

Root quick-stubs

Categories

(Core :: XPConnect, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla23

People

(Reporter: evilpie, Assigned: evilpie)

References

Details

Attachments

(3 files)

Should reduce the total count already.
Attachment #729276 - Flags: review?(terrence)
Assignee: nobody → evilpies
Status: NEW → ASSIGNED
Attachment #729284 - Flags: review?(terrence)
Comment on attachment 729276 [details] [diff] [review]
xpc_qsUnwrapThis - v1

Review of attachment 729276 [details] [diff] [review]:
-----------------------------------------------------------------

Nice!
Attachment #729276 - Flags: review?(terrence) → review+
You should probably get an XPConnect peer to look at these, though they do look pretty simple.  Maybe peterv or bholley.
Attachment #729276 - Flags: review?(bobbyholley+bmo)
Comment on attachment 729284 [details] [diff] [review]
XPCLazyCallContext - v1

Review of attachment 729284 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/xpconnect/src/xpcprivate.h
@@ +1319,5 @@
>          : mCallBeginRequest(DONT_CALL_BEGINREQUEST),
>            mCcx(&ccx),
>            mCcxToDestroy(nullptr)
> +          , mObj(ccx.GetJSContext(), nullptr)
> +          , mFlattenedJSObject(ccx.GetJSContext(), nullptr)

Eww, you should probably move all the commas to the front, or fix up the moved lines.
Attachment #729284 - Flags: review?(terrence) → review+
Attachment #729284 - Flags: review?(bobbyholley+bmo)
Hmm.  Shouldn't JS_THIS_OBJECT return a rooted object in some way?  Having to root it again is a bit weird....
(In reply to Boris Zbarsky (:bz) from comment #5)
> Hmm.  Shouldn't JS_THIS_OBJECT return a rooted object in some way?  Having
> to root it again is a bit weird....
We need to root JSObject*, in addition to that the function sometimes computes a this object.
JS_THIS_OBJECT guarantees that vp[1] is a Value holding the JSObject*.  And vp is already completely rooted, right?

That's assuming that one sticks to using JS_THIS_OBJECT on the vp of a JSNative, of course, which doesn't seem like an insane assumption...
(In reply to Boris Zbarsky (:bz) from comment #7)
> JS_THIS_OBJECT guarantees that vp[1] is a Value holding the JSObject*.  And
> vp is already completely rooted, right?

Well, the object will at least be kept alive by the vp reference. This doesn't really help much, however, if we do a compacting or nursery collection in the meantime. In order to support moving collectors, the GC needs to know about - and be able to update - all live pointers in the system.

> That's assuming that one sticks to using JS_THIS_OBJECT on the vp of a
> JSNative, of course, which doesn't seem like an insane assumption...

CallArgs is much better from a rooting perspective because we can easily return a HandleValue referencing the already-rooted Value. This, unfortunately, doesn't help much if you need to get a new JSObject* reference out of the Value reference out and store it on the C stack.
> In order to support moving collectors, the GC needs to know about - and be able to update
> - all live pointers in the system.

Yes, but I assume the GC already has to know about vp[1], right?

> CallArgs is much better from a rooting perspective because we can easily return a
> HandleValue referencing the already-rooted Value.

That's generally fine with me; there are existing bugs on converting codegen to CallArgs, I think.

There is no need to store this object on the C stack per se; we just need to be able to get at it easily...
(In reply to Boris Zbarsky (:bz) from comment #9)
> > In order to support moving collectors, the GC needs to know about - and be able to update
> > - all live pointers in the system.
> 
> Yes, but I assume the GC already has to know about vp[1], right?

Correct. The protocol is that the caller of the JSNative must ensure that the vp array is rooted.

> > CallArgs is much better from a rooting perspective because we can easily return a
> > HandleValue referencing the already-rooted Value.
> 
> That's generally fine with me; there are existing bugs on converting codegen
> to CallArgs, I think.
> 
> There is no need to store this object on the C stack per se; we just need to
> be able to get at it easily...

That's usually fine. However, you do have to be extremely careful when using temporaries because C++ puts very few limitations on ordering. One of the more nightmarish examples that we ran into when rooting SpiderMonkey was this one-liner:

fun->script = MethodThatOccasionallyGCs(); // Occasional crash!

When compiled by GCC 3.2 the order of operations was:
1) RootedFunction::operator-> (stores fun to C stack)
2) lookup JSFunction::script (stores fun->script to C stack)
3) MethodThatMayRarelyGC() (corrupts fun and script on C stack)
4) HeapPtrScript::operator= (|this| is the corrupted script pointer)
5) JSScript::writeBarrierPre
6) Cell::compartment (attempt to access the arena header through a broken pointer accesses an unmapped address)

Most other compilers seem to emit this line in the more natural ordering where (3) comes first, however, both orderings are fine by my reading of the C++ spec.

I don't expect to hit anything this nasty when rooting the browser because we expose so few internals, but this sort of thing does still keep me awake at night. I'll sleep much better if we root aggressively and don't try to too hard to outsmart C++ in the general case. My hope is that if we focus our root minimization efforts on the generated code we will get more bang for our buck with less overt danger. We can handle any other hot-paths as once-off specializations as we discover that they are problematic. Does this make sense from your perspective?
Yep.  My main concern, really, is what happens with the WebIDL bindings.  The quickstubs we have left at this point are not as performance-critical, so the extra Rooted there is not a huge deal.  

Just to check, when you say "generated code" do you mean JIT-generated code or WebIDL-codegen generated code?

My one remaining question is this.  My understanding of Handle<JSObject*> is that it's really a JSObject** under the hood, pointing to a JSObject* that the GC knows to update.  So it seems like having a Handle<JSObject*> on the stack should be OK, right?  I mean, that's what a Handle<JSObject*> function argument is...
(In reply to Boris Zbarsky (:bz) from comment #11)
> Yep.  My main concern, really, is what happens with the WebIDL bindings. 
> The quickstubs we have left at this point are not as performance-critical,
> so the extra Rooted there is not a huge deal.  

Excellent!

> Just to check, when you say "generated code" do you mean JIT-generated code
> or WebIDL-codegen generated code?

WebIDL-codegen. Sorry, should have been more specific, considering how much code we generate around here.

> My one remaining question is this.  My understanding of Handle<JSObject*> is
> that it's really a JSObject** under the hood, pointing to a JSObject* that
> the GC knows to update.  So it seems like having a Handle<JSObject*> on the
> stack should be OK, right?  I mean, that's what a Handle<JSObject*> function
> argument is...

Yes, that is absolutely correct.

Now that I've given it some more thought, I think the two issues we had with Handle<JSObject*> cannot be a problem in the browser. JSObject* is opaque there, making operator-> is useless and |this| will never be a GC thing. Of course, that doesn't preclude other problems cropping up, but there's no need to get ahead of ourselves.
Comment on attachment 729276 [details] [diff] [review]
xpc_qsUnwrapThis - v1

Review of attachment 729276 [details] [diff] [review]:
-----------------------------------------------------------------

This stuff used to be insanely hot, such that the rooting here might introduce something like a 5% dromaeo regression. But I don't think it is anymore, now that we have the new bindings.

r=bholley contingent on implementing whatever gets decided about the typedefs in bug 854955.
Attachment #729276 - Flags: review?(bobbyholley+bmo) → review+
Comment on attachment 729284 [details] [diff] [review]
XPCLazyCallContext - v1

Review of attachment 729284 [details] [diff] [review]:
-----------------------------------------------------------------

The review comments in comment 13 apply verbatim here.
Attachment #729284 - Flags: review?(bobbyholley+bmo) → review+
So we are actually okay with rooting this now?
Seems like everybody agree to just root and be done with it. Going to land after some try running.
I had to remove the #ifdef DEBUG optimization in XPCLazyCallContext, because otherwise it wouldn't compile on the linux builders.

https://hg.mozilla.org/integration/mozilla-inbound/rev/b9d17aa000e7
https://hg.mozilla.org/integration/mozilla-inbound/rev/2b47c18653da
Backed out for Windows crashes.
https://hg.mozilla.org/integration/mozilla-inbound/rev/6a31b8520842

https://tbpl.mozilla.org/php/getParsedLog.php?id=21486597&tree=Mozilla-Inbound

14:05:43     INFO -  ### XPCOM_MEM_BLOAT_LOG defined -- logging bloat/leaks to c:\users\cltbld~1.t-w\appdata\local\temp\tmpuepis1\runtests_leaks.log
14:05:44     INFO -  WARNING: Failed to load startupcache file correctly, removing!: file e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/startupcache/StartupCache.cpp, line 235
14:05:44     INFO -  WARNING: This method is lossy. Use GetCanonicalPath !: file e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/xpcom/io/nsLocalFileWin.cpp, line 3271
14:05:46     INFO -  1365195946089	Marionette	INFO	MarionetteComponent loaded
14:05:46     INFO -  1365195946092	Marionette	INFO	marionette not enabled
14:05:46     INFO -  ++DOCSHELL 065C41A8 == 1 [id = 1]
14:05:46     INFO -  ++DOMWINDOW == 1 (0657AA08) [serial = 1] [outer = 00000000]
14:05:46     INFO -  ++DOMWINDOW == 2 (05950148) [serial = 2] [outer = 0657AA08]
14:05:47     INFO -  pldhash: for the table at address 09B9BE68, the given entrySize of 144 definitely favors chaining over double hashing.
14:05:48     INFO -  ++DOCSHELL 0A121798 == 2 [id = 2]
14:05:48     INFO -  ++DOMWINDOW == 3 (05951488) [serial = 3] [outer = 00000000]
14:05:48     INFO -  ++DOMWINDOW == 4 (059516F0) [serial = 4] [outer = 05951488]
14:05:48     INFO -  ++DOMWINDOW == 5 (05951958) [serial = 5] [outer = 0657AA08]
14:05:49     INFO -  WARNING: Unable to test style tree integrity -- no content node: file e:/builds/moz2_slave/m-in-w32-d-0000000000000000000/build/layout/base/nsCSSFrameConstructor.cpp, line 8246
14:05:50  WARNING -  TEST-UNEXPECTED-FAIL | automation.py | Exited with code -1073741819 during test run
14:05:50     INFO -  INFO | automation.py | Application ran for: 0:00:07.244000
14:05:50     INFO -  INFO | zombiecheck | Reading PID log: c:\users\cltbld~1.t-w\appdata\local\temp\tmppibjxfpidlog
14:05:57  WARNING -  PROCESS-CRASH | automation.py | application crashed [@ xpcObjectHelper::~xpcObjectHelper() + 0xa]
14:05:57     INFO -  Crash dump filename: c:\users\cltbld~1.t-w\appdata\local\temp\tmpuepis1\minidumps\c8371ac2-a98f-46cd-8ace-f08ea5e14c55.dmp
14:05:57     INFO -  Operating system: Windows NT
14:05:57     INFO -                    6.2.9200
14:05:57     INFO -  CPU: x86
14:05:57     INFO -       GenuineIntel family 6 model 30 stepping 5
14:05:57     INFO -       8 CPUs
14:05:57     INFO -  Crash reason:  EXCEPTION_ACCESS_VIOLATION_READ
14:05:57     INFO -  Crash address: 0xffffffff90000000
14:05:57     INFO -  Thread 0 (crashed)
14:05:57     INFO -   0  xul.dll!xpcObjectHelper::~xpcObjectHelper() + 0xa
14:05:57     INFO -      eip = 0x715e59a9   esp = 0x00fce7dc   ebp = 0x00fcdead   ebx = 0x0474dc01
14:05:57     INFO -      esi = 0x00fcde91   edi = 0x09da9cd8   eax = 0x90000000   ecx = 0x00fcde91
14:05:57     INFO -      edx = 0x00000000   efl = 0x00010286
14:05:57     INFO -      Found by: given as instruction pointer in context
14:05:57     INFO -   1  xul.dll!mozilla::dom::HandleNewBindingWrappingFailureHelper<nsDOMEvent,0>::Wrap(JSContext *,JSObject *,nsDOMEvent &,JS::Value *) [BindingUtils.h:2b47c18653da : 696 + 0x3e]
14:05:57     INFO -      eip = 0x72542b66   esp = 0x00fce7e4   ebp = 0x00fce9d4
14:05:57     INFO -      Found by: call frame info
14:05:57     INFO -   2  xul.dll + 0x2120183
14:05:57     INFO -      eip = 0x72ff0184   esp = 0x00fce9dc   ebp = 0x7020512d
14:05:57     INFO -      Found by: call frame info
14:05:57     INFO -   3  xul.dll!BloatEntry::AccountRefs() [nsTraceRefcntImpl.cpp:2b47c18653da : 275 + 0x12]
14:05:57     INFO -      eip = 0x7288171e   esp = 0x00fce9e4   ebp = 0x7020512d
14:05:57     INFO -      Found by: stack scanning
14:05:57     INFO -   4  nss3.dll!PL_HashTableRawLookup [plhash.c:2b47c18653da : 146 + 0x16]
14:05:57     INFO -      eip = 0x7020512d   esp = 0x00fce9f4   ebp = 0x7020512d
14:05:57     INFO -      Found by: stack scanning
14:05:57     INFO -   5  xul.dll!BloatEntry::AccountRefs() [nsTraceRefcntImpl.cpp:2b47c18653da : 275 + 0x12]
14:05:57     INFO -      eip = 0x7288171e   esp = 0x00fcea10   ebp = 0x7020512d
14:05:57     INFO -      Found by: call frame info
14:05:57     INFO -   6  nss3.dll!_PR_MD_UNLOCK [w95cv.c:2b47c18653da : 365 + 0x6]
14:05:57     INFO -      eip = 0x701ffcbf   esp = 0x00fcea38   ebp = 0x7020512d
14:05:57     INFO -      Found by: stack scanning
14:05:57     INFO -   7  nss3.dll!PR_Unlock [prulock.c:2b47c18653da : 315 + 0x8]
14:05:57     INFO -      eip = 0x701fbae1   esp = 0x00fcea40   ebp = 0x7020512d
14:05:57     INFO -      Found by: call frame info
14:05:57     INFO -   8  xul.dll!BloatEntry::AccountRefs() [nsTraceRefcntImpl.cpp:2b47c18653da : 275 + 0x12]
14:05:57     INFO -      eip = 0x7288171e   esp = 0x00fcea58   ebp = 0x7020512d
14:05:57     INFO -      Found by: call frame info
Not sure how this is caused by the patch. I am not thrilled.
Well, for a start that stack is garbage...  I'm willing to posit frame 0 is right, but the rest seems stomped-on or something.  What's odd is it's always stomped the same way?!?

Some of the other stacks show it as the ~nsCOMPtr<nsIClassInfo> under the ~xpcObjectHelper, which is not unreasonable, but are the same above that.

Of interest is that only debug Windows builds are affected.

Do a try push (just the build on debug Windows should do) of just the first patch here to see which of these two patches causes the problem and then we'll have more info for digging into it?
Thank you for taking a look. I didn't even think about the stack being garbage. Curious how reproducible it is, and than only on debug.

I already did that, and they try push for the first patch is green. So I am going to push that.
Yeah, the reproducibility confuses me too, but the stack is clearly somewhat nonsense.  :(  Maybe it's just whatever symbolicates it screwing up?

As far as the second patch goes... what if we preserve the order of the initializers in the constructor?  I have no idea why that would matter on Windows only, but it's a debug-only issue for sure.
I fixed the windows issue. I changed the Rooted initializer to use XPCCallContext passed as parameter instead of the member. https://tbpl.mozilla.org/?tree=Try&rev=644d66b8cf77
https://hg.mozilla.org/mozilla-central/rev/8bcd935cbcfe
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Er...  The patch on this bug uses the parameter too.

The try patch seems to restore the old order, though, no?
(In reply to Boris Zbarsky (:bz) from comment #26)
> Er...  The patch on this bug uses the parameter too.
> 
> The try patch seems to restore the old order, though, no?

You are right, so the only change is that I removed the #ifdef DEBUG and fixed the ordering.
I just pushed the fixed patch. At the end this is not really different from what was reviewed.
https://hg.mozilla.org/integration/mozilla-inbound/rev/3fd383ca51c8
https://hg.mozilla.org/mozilla-central/rev/3fd383ca51c8
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
There is still more to be done. Patch coming up, but still needs to go through try.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #735776 - Flags: review?(bzbarsky)
Comment on attachment 735776 [details] [diff] [review]
Root arg and result in generated stubs

>+        if typeName == "[jsval]":
>+            argPtr = name + ".address()"

That doesn't make sense.  The only thing that saves it, as far as I can tell, is that argPtr is never used in the [jsval] case...

I would prefer you leave argPtr as it was, or set it to something that will definitely trigger a compiler error in the jsval case if someone uses it.

r=me with that fixed.
Attachment #735776 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky (:bz) from comment #32)
> Comment on attachment 735776 [details] [diff] [review]
> Root arg and result in generated stubs
> 
> >+        if typeName == "[jsval]":
> >+            argPtr = name + ".address()"
> 
> That doesn't make sense.  The only thing that saves it, as far as I can
> tell, is that argPtr is never used in the [jsval] case...
> 
> I would prefer you leave argPtr as it was, or set it to something that will
> definitely trigger a compiler error in the jsval case if someone uses it.
> 
> r=me with that fixed.

Thanks for the quick review. I am going to set it to None, so it might already fail in python.
Rev 2ceca4816688 was backed out because we think it was the cause of intermittent mochitest-a11y crashes.
https://hg.mozilla.org/integration/mozilla-inbound/rev/ed781d87fc46

https://tbpl.mozilla.org/php/getParsedLog.php?id=21683764&tree=Mozilla-Inbound

07:27:11     INFO -  7752 INFO TEST-PASS | chrome://mochitests/content/a11y/accessible/tests/mochitest/states/test_aria_widgetitems.html | state bits should not be present in ID [ 'div@id="aria_tab3" node', address: [object HTMLDivElement] ]!
07:27:11     INFO -  7753 INFO TEST-PASS | chrome://mochitests/content/a11y/accessible/tests/mochitest/states/test_aria_widgetitems.html | Focussed [ 'div@id="aria_tab3" node', address: [object HTMLDivElement] ] must be focusable!
07:27:11  WARNING -  TEST-UNEXPECTED-FAIL | chrome://mochitests/content/a11y/accessible/tests/mochitest/states/test_aria_widgetitems.html | Exited with code 11 during test run
07:27:11     INFO -  INFO | automation.py | Application ran for: 0:01:32.523391
07:27:11     INFO -  INFO | zombiecheck | Reading PID log: /tmp/tmpnm1b0Ipidlog
07:27:11     INFO -  ==> process 2576 launched child process 2641
07:27:11     INFO -  INFO | zombiecheck | Checking for orphan process with PID: 2641
07:27:11     INFO -  mozcrash INFO | Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64/1365685870/firefox-23.0a1.en-US.linux-x86_64.crashreporter-symbols.zip
07:27:11     INFO -  Downloading symbols from: http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-inbound-linux64/1365685870/firefox-23.0a1.en-US.linux-x86_64.crashreporter-symbols.zip
07:27:32  WARNING -  PROCESS-CRASH | chrome://mochitests/content/a11y/accessible/tests/mochitest/states/test_aria_widgetitems.html | application crashed [@ JSAutoCompartment::JSAutoCompartment]
07:27:32     INFO -  Crash dump filename: /tmp/tmp_eykxC/minidumps/15daba40-6e30-68ab-6389fba9-4c15c5cd.dmp
07:27:32     INFO -  Operating system: Linux
07:27:32     INFO -                    0.0.0 Linux 2.6.31.5-127.fc12.x86_64 #1 SMP Sat Nov 7 21:11:14 EST 2009 x86_64
07:27:32     INFO -  CPU: amd64
07:27:32     INFO -       family 6 model 23 stepping 10
07:27:32     INFO -       2 CPUs
07:27:32     INFO -  Crash reason:  SIGSEGV
07:27:32     INFO -  Crash address: 0x0
07:27:32     INFO -  Thread 0 (crashed)
07:27:32     INFO -   0  libxul.so!JSAutoCompartment::JSAutoCompartment [jsapi.cpp:2bb26d742f5f : 7282 + 0x0]
07:27:32     INFO -      rbx = 0x00007fff684102e8   r12 = 0x0000000000000000
07:27:32     INFO -      r13 = 0x0000000000000000   r14 = 0x00007f2f0f840900
07:27:32     INFO -      r15 = 0x00007fff684103d0   rip = 0x00007f2f27c2356f
07:27:32     INFO -      rsp = 0x00007fff6840ffc0   rbp = 0x00007fff6840ffc0
07:27:32     INFO -      Found by: given as instruction pointer in context
07:27:32     INFO -   1  libxul.so!XPCWrappedNative::GetNewOrUsed(XPCCallContext&, xpcObjectHelper&, XPCWrappedNativeScope*, XPCNativeInterface*, XPCWrappedNative**) [Util.h:2bb26d742f5f : 181 + 0x4]
07:27:32     INFO -      rip = 0x00007f2f271f2682   rsp = 0x00007fff6840ffd0
07:27:32     INFO -      rbp = 0x00007fff68410180
07:27:32     INFO -      Found by: stack scanning
07:27:32     INFO -   2  libxul.so!nsHTMLDocument::QueryInterface(nsID const&, void**) [nsHTMLDocument.cpp:2bb26d742f5f : 253 + 0x5]
07:27:32     INFO -      rip = 0x00007f2f26ea88b0   rsp = 0x00007fff68410030
07:27:32     INFO -      rbp = 0x00007fff68410180
07:27:32     INFO -      Found by: stack scanning
07:27:32     INFO -   3  libxul.so!nsNodeSH::PreCreate(nsISupports*, JSContext*, JSObject*, JSObject**) [nsDOMClassInfo.cpp:2bb26d742f5f : 5765 + 0x5]
07:27:32     INFO -      rip = 0x00007f2f26f86f91   rsp = 0x00007fff68410050
07:27:32     INFO -      rbp = 0x00007fff68410180
07:27:32     INFO -      Found by: stack scanning
Whiteboard: [leave open]
The crash is still happening even with the backout. It's being tracked in bug 860903.
Comment on attachment 735776 [details] [diff] [review]
Root arg and result in generated stubs

This wasn't the cause of bug 860903. Re-landed.
https://hg.mozilla.org/integration/mozilla-inbound/rev/afb6e4e2d67d
Whiteboard: [leave open]
https://hg.mozilla.org/mozilla-central/rev/afb6e4e2d67d
Status: REOPENED → RESOLVED
Closed: 8 years ago8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.