Closed Bug 997440 Opened 6 years ago Closed 6 years ago

Use a sane entry script for XPCWrappedJS invocation

Categories

(Core :: XPConnect, defect)

x86
macOS
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla32

People

(Reporter: bholley, Assigned: bholley)

References

Details

Attachments

(9 files, 2 obsolete files)

3.08 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
1.10 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
2.14 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
5.78 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
3.03 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
1.54 KB, patch
bholley
: review+
Details | Diff | Splinter Review
1.78 KB, patch
bholley
: review+
Details | Diff | Splinter Review
6.64 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
6.98 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
In bug 988383 comment 13, Bob ran into failures when he tried to remove the cx push in nsGlobalWindow::SetNewDocument. As it turns out, we associated XPCWrappedNativeScope instances with the stack-top cx at creation time, and fall back to that for a JSContext in XPCWrappedJS invocation if there's nothing on the stack.

So when we use the SafeJSContext in SetNewDocument, that spreads to the XPCWrappedNativeScope, causing us to use the SafeJSContext to invoke XPCWrappedJS instances, which gives them no useful base URI. This causes us to fail things like docshell/test/test_bug660404.html.

I'll attach some patches to fix this all up.
(In reply to Bobby Holley (:bholley) from comment #0)

> So when we use the SafeJSContext in SetNewDocument, that spreads to the
> XPCWrappedNativeScope, causing us to use the SafeJSContext to invoke
> XPCWrappedJS instances, which gives them no useful base URI.

Ah, I'd looked at that code a couple of times, where it does a bare |new XPCWrappedNativeScope|.
Couldn't quite see how it was causing the uncaught exceptions though.
Depends on: 997987
Depends on: 998377
Depends on: 999213
(In reply to Bobby Holley (:bholley) from comment #3)
> https://tbpl.mozilla.org/?tree=Try&rev=19c964a954a1

I know these may not be the final patches, but should these usages of AutoEntryScript have comments about why we need them, linking to the spec or stating that they are Gecko specific?
(In reply to Bob Owen (:bobowen) from comment #4)
> (In reply to Bobby Holley (:bholley) from comment #3)
> > https://tbpl.mozilla.org/?tree=Try&rev=19c964a954a1
> 
> I know these may not be the final patches, but should these usages of
> AutoEntryScript have comments about why we need them, linking to the spec or
> stating that they are Gecko specific?

Good point! I've added them.
Attachment #8420248 - Flags: review?(bzbarsky)
This failure is currently being silently squelched on trunk.
Attachment #8420249 - Flags: review?(bzbarsky)
This basically means that baseURI will be relative to global in which the code
is defined, which seems reasonable.
Attachment #8420252 - Flags: review?(bzbarsky)
Comment on attachment 8420254 [details] [diff] [review]
Part 7 - Remove cx pushing in SetNewDocument. v1

Review of attachment 8420254 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/base/nsGlobalWindow.cpp
@@ +2279,5 @@
>    }
>  #endif
>  
> +  AutoJSAPI jsapi;
> +  JSContext *cx = jsapi.cx();

* to the left

@@ +2341,3 @@
>    // Check if we're near the stack limit before we get anywhere near the
>    // transplanting code.
>    JS_CHECK_RECURSION(cx, return NS_ERROR_FAILURE);

This is in a bit of a weird place now. Should it be right after the declaration of cx?
Comment on attachment 8420248 [details] [diff] [review]
Part 1 - Push some more cxes in jsd. v1

Review of attachment 8420248 [details] [diff] [review]:
-----------------------------------------------------------------

::: js/jsd/jsd_stak.cpp
@@ +269,5 @@
>      JSD_LOCK_THREADSTATES(jsdc);
>  
>      if( jsd_IsValidFrameInThreadState(jsdc, jsdthreadstate, jsdframe) )
>      {
> +        AutoPushJSContext cx(jsdthreadstate->context);

Er ... aren't we trying to get rid of these?
(In reply to Bob Owen (:bobowen) from comment #16)
> >      if( jsd_IsValidFrameInThreadState(jsdc, jsdthreadstate, jsdframe) )
> >      {
> > +        AutoPushJSContext cx(jsdthreadstate->context);
> 
> Er ... aren't we trying to get rid of these?

Yes, but the entire jsd directory is going to be rm -f ed sometime this cycle or next, so we just want to whack the mole a little bit longer. ;-)
Bob, there were a few errors in the try push in comment 14. Do you have the cycles to take a quick look at them and see what's going on? I'm pretty swamped at the moment.
Flags: needinfo?(bobowencode)
(In reply to Bobby Holley (:bholley) from comment #18)
> Bob, there were a few errors in the try push in comment 14. Do you have the
> cycles to take a quick look at them and see what's going on? I'm pretty
> swamped at the moment.

I'll take a look and see what I can do.
Flags: needinfo?(bobowencode)
(In reply to Bob Owen (:bobowen) from comment #19)
> (In reply to Bobby Holley (:bholley) from comment #18)
> > Bob, there were a few errors in the try push in comment 14. Do you have the
> > cycles to take a quick look at them and see what's going on? I'm pretty
> > swamped at the moment.
> 
> I'll take a look and see what I can do.

I've been ruling out all the extraneous failures throughout the day while doing other things finally got chance to have a look at the real ones.

So the xpcshell test that is failing is definitely because of Part 5.
It looks like it's observing extra exceptions that it is not expecting.
Possibly something to do with directory services.

Sorry, haven't had much experience with xpcshell tests and I'm having trouble using gdb.
I get the following:

.../xpcshell: error while loading shared libraries: libxul.so: cannot open shared object file: No such file or directory
I've got gdb working now, it appears to be the change in nsXPCWrappedJSClass::CallMethod that is causing the problem.
I'll carry on digging while I have time.
I've been working on Windows today, so I've just had a look at the browser chrome failure and it looks like it is also caused by the change in nsXPCWrappedJSClass::CallMethod.

We appear to be ending up with a different JSContext* from the one we get with the old code.
Not sure if that is intentional.
(In reply to Bob Owen (:bobowen) from comment #22)
> We appear to be ending up with a different JSContext* from the one we get
> with the old code.
> Not sure if that is intentional.

Yeah, that's intentional. The current JSContext is very arbitrary, and that machinery needs to go away (see part 6). So the replacement is to use the JSContext we from the AutoEntryScript, which makes much more sense.

So we basically just need to understand why this is failing in the new world (presumably, because there was an error that was squelched in the old world but reported in the new world), and hammer it down appropriately (a la Part 3).
(In reply to Bobby Holley (:bholley) from comment #23)

> So we basically just need to understand why this is failing in the new world
> (presumably, because there was an error that was squelched in the old world
> but reported in the new world), and hammer it down appropriately (a la Part
> 3).

Right, I see.
I'll look at doing that tomorrow.
All of my time got consumed by others things yesterday.

This test uses a fake AddData dir to hold the test crash reports, it also registers its own nsIDirectoryServiceProvider to provide the fake path.

This fake provider was also getting requests for plugin.scan.Acrobat and plugin.scan.Quicktime, and as a result threw exceptions.
This used to get lost, but after patch Part 5 doesn't.

I think I could have just added a SimpleTest.ignoreAllUncaughtExceptions to the test, but this seems more targeted.
Also, I decided to return null rather than delegate to the real provider as it seemed closest to the existing behaviour.

Interestingly, as I discovered while trying to debug this, if you delayed the start of the test you didn't get the errors, presumably because the getFile calls had already been made.
If you started the tests soon enough the errors got reported, but always in the same place, as if they were queued up waiting to be released.

Try push with this patch:
https://tbpl.mozilla.org/?tree=Try&rev=9a1665905367

I'll switch to Linux now as I'm hoping I have a fix for that one as well.
Attachment #8423865 - Flags: review?(bobbyholley)
This is a very similar change to Part 3, although I'm not entirely sure what is causing the new errors that have surfaced.
Do you think they should be followed up in another bug?

Here's a try push for the xpcshell tests with this patch:
https://tbpl.mozilla.org/?tree=Try&rev=67ade2710bd4

Oh, and patch Part 4 doesn't apply cleanly at the moment because of the rename of GetSubjectPrincipal and GetObjectPrincipal to SubjectPrincipal and ObjectPrincipal.
Attachment #8423925 - Flags: review?(bobbyholley)
Assignee: bobbyholley → bobowencode
Status: NEW → ASSIGNED
Whoops, bzexport stole the bug for me.
Assignee: bobowencode → bobbyholley
The build from that last try push was failing on Linux debug.
Looks like Part 5 lost the move of a using statement from the try push in comment 14.
As the patch holds the original author, I assume I can just add this.

Here's a try push just for the failing build:
https://tbpl.mozilla.org/?tree=Try&rev=9415d6c65e04

I'll upload Part 4 with the surrounding function name changes I mentioned, as well.
Attachment #8420252 - Attachment is obsolete: true
Attachment #8420252 - Flags: review?(bzbarsky)
Attachment #8424467 - Flags: review?(bzbarsky)
Same as version 1 of this patch, but just with a couple of renames of functions on the surrounding context code, so it applies cleanly.
Attachment #8420251 - Attachment is obsolete: true
Attachment #8420251 - Flags: review?(bzbarsky)
Attachment #8424469 - Flags: review?(bzbarsky)
Attachment #8423865 - Flags: review?(bobbyholley) → review+
Attachment #8423925 - Flags: review?(bobbyholley) → review+
Awesome, thanks Bob! Looks like this is all green now, so we're just waiting on Boris' review.
Comment on attachment 8420248 [details] [diff] [review]
Part 1 - Push some more cxes in jsd. v1

>@@ -287,19 +288,18 @@ jsd_GetScopeChainForStackFrame(JSDContext* jsdc,

Is it OK to still be in the request when we do JSD_NewValue?

>@@ -311,19 +311,18 @@ jsd_GetThisForStackFrame(JSDContext* jsdc,

Likewise.

r=me modulo that
Attachment #8420248 - Flags: review?(bzbarsky) → review+
Comment on attachment 8420249 [details] [diff] [review]
Part 2 - Fix an XPConnect test bug. v1

r==me
Attachment #8420249 - Flags: review?(bzbarsky) → review+
Comment on attachment 8420250 [details] [diff] [review]
Part 3 - Fix some error reporting nastiness in the CPS tests. v1

r=me
Attachment #8420250 - Flags: review?(bzbarsky) → review+
Comment on attachment 8424469 [details] [diff] [review]
Part 4 - Introduce a special null principal to return when we're not in any compartment at all. v2

r=me
Attachment #8424469 - Flags: review?(bzbarsky) → review+
Comment on attachment 8424467 [details] [diff] [review]
Part 5 - Use the WrappedJS's global as an entry global. v2

r=me.  This is the guts of the change, right?  ;)
Attachment #8424467 - Flags: review?(bzbarsky) → review+
Comment on attachment 8420253 [details] [diff] [review]
Part 6 - Stop associating XPCWrappedNativeScopes with an XPCContext. v1

r=me
Attachment #8420253 - Flags: review?(bzbarsky) → review+
Comment on attachment 8420254 [details] [diff] [review]
Part 7 - Remove cx pushing in SetNewDocument. v1

r=me, I think
Attachment #8420254 - Flags: review?(bzbarsky) → review+
(In reply to Boris Zbarsky [:bz] from comment #31)
> Is it OK to still be in the request when we do JSD_NewValue?

AFAIK, it's really never a problem to be in a request. The main reasons we have requests is so that we have some way of running GCs with the conservative stack scanner disabled (in configurations where we still use that).
Depends on: 1014350
Blocks: 1039881
Depends on: 1031303
You need to log in before you can comment on or make changes to this bug.