Bug 786852 - Getting the interface for Ci.nsIWebNavigation on a modal window fails with "Permission denied for DOMAIN to create wrapper for object of class UnnamedClass"
Status: RESOLVED FIXED
Whiteboard: [qa-]
Keywords: regression
Product: Core
Classification: Components
Component: Security
Version: 14 Branch
Platform: All All
Importance: -- critical
Target Milestone: mozilla18
Assigned To: Bobby Holley (busy)
Depends on: 904807
Blocks: 757046 786249
Reported: 2012-08-29 15:43 PDT by Henrik Skupin (:whimboo)
Modified: 2013-08-19 06:50 PDT
bobbyholley: in‑testsuite-


Attachments
assertion stack (4.54 KB, text/plain)
2012-08-30 03:18 PDT, Henrik Skupin (:whimboo)
gdb stack (38.58 KB, text/plain)
2012-08-30 14:45 PDT, Henrik Skupin (:whimboo)

Description Henrik Skupin (:whimboo) 2012-08-29 15:43:14 PDT
As you can read on bug 786249 there is a possible regression in the permission manager in Firefox 17.0a1.

The first bad revision is:
changeset:   103203:c1e3da499d87
user:        Bobby Holley <bobbyholley@gmail.com>
date:        Thu Aug 23 11:45:28 2012 -0700
summary:     Bug 757046 - Convert enablePrivilege into an insecure test-only construct (preffed off everywhere but in automation). r=bz

A Mozmill testcase attached to bug 786249 demonstrates this failure after about the tenth iteration. It does not happen immediately, and the iteration at which it fails varies.

Boris proposed debugging this, which I will do tomorrow:

> If you're willing to debug, you can breakpoint in the failure case in
> nsScriptSecurityManager::CanCreateWrapper and get the JS or C++ stacks. 
> That might give me some idea of what's going on...
Comment 1 Henrik Skupin (:whimboo) 2012-08-30 03:02:43 PDT
So running the Mozmill testcase with a debug build I get an assertion:

###!!! ASSERTION: About to remove a different wrapper with the same nsISupports identity! This will most likely cause serious problems!: '!wrapperInMap || wrapperInMap == wrapper', file /Volumes/data/code/firefox/nightly/js/xpconnect/src/XPCMaps.h, line 130
Comment 2 Henrik Skupin (:whimboo) 2012-08-30 03:18:42 PDT
Created attachment 656797
assertion stack

Assertion stack as shown with 'export XPCOM_DEBUG_BREAK=stack' and fixed via 'tools/rb/fix-macosx-stack.pl'
Comment 3 Henrik Skupin (:whimboo) 2012-08-30 08:24:19 PDT
Later today I will try to get it into my debugger and collect all the data Boris requested.
Comment 4 Henrik Skupin (:whimboo) 2012-08-30 13:17:23 PDT
(In reply to Henrik Skupin (:whimboo) from comment #0)
> > If you're willing to debug, you can breakpoint in the failure case in
> > nsScriptSecurityManager::CanCreateWrapper and get the JS or C++ stacks. 
> > That might give me some idea of what's going on...

Boris, this method is called a couple of hundred times, which makes it hard to find the right call. Is there a condition I could add to the breakpoint? Or how would I configure gdb to stop at the wanted call?
Comment 5 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-08-30 13:20:12 PDT
Just breakpoint on the failure case.  So "b nsScriptSecurityManager.cpp:2637" or whatever line http://hg.mozilla.org/mozilla-central/file/790fb17b1fe3/caps/src/nsScriptSecurityManager.cpp#l2637 is in your local version of the file.
Comment 6 Henrik Skupin (:whimboo) 2012-08-30 14:45:24 PDT
Created attachment 657050
gdb stack

I hope that helps. Also, it looks like the assertion is not directly related to this issue.
Comment 7 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-08-30 20:47:34 PDT
What about the JS stack?  "call DumpJSStack()" in gdb.
Comment 8 Henrik Skupin (:whimboo) 2012-08-31 01:01:10 PDT
The JS stack is not helpful. It only lists the code from within my testcase:

(gdb) call DumpJSStack()
0 mdObserver_findWindow() ["resource://mozmill/modules/frame.js -> file:///Volumes/data/code/mozmill-tests/nightly/testPermissions.js":108]
    this = [object Object]
1 mdObserver_observe() ["resource://mozmill/modules/frame.js -> file:///Volumes/data/code/mozmill-tests/nightly/testPermissions.js":157]
    this = [object Object]

    var requestor = window.QueryInterface(Ci.nsIInterfaceRequestor);
    dump ("*** requestor: " + requestor + "\n");
>   var navigation = requestor.getInterface(Ci.nsIWebNavigation);
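To record which iteration fails and with what message, the marked call could be wrapped in a try/catch. This is only a minimal sketch: the helper name getWebNavigation and its log parameter are hypothetical and not part of the testcase, and the chrome window and the Ci shorthand are assumed to be passed in by the caller.

```javascript
// Hypothetical helper (not in the original testcase): returns the
// nsIWebNavigation for a window, or null if the call throws, as it
// does intermittently in this bug.
function getWebNavigation(window, Ci, log) {
  try {
    var requestor = window.QueryInterface(Ci.nsIInterfaceRequestor);
    return requestor.getInterface(Ci.nsIWebNavigation);
  } catch (e) {
    // Log the exception text so the failing iteration can be identified.
    log("*** getInterface failed: " + e);
    return null;
  }
}
```

In the testcase this might be called as getWebNavigation(window, Ci, dump), so the message ends up next to the existing dump() output.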
Comment 9 Henrik Skupin (:whimboo) 2012-08-31 01:28:35 PDT
Would I have to switch threads because the above code is executed via a timer?
Comment 10 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-08-31 02:51:25 PDT
No, there's only one thread involved.

OK, so the code on the stack is the frame.js thing.  That sure looks like it should have a chrome principal!  And it definitely shouldn't have the principal you mentioned in bug 786249.
Comment 11 Henrik Skupin (:whimboo) 2012-08-31 03:53:23 PDT
So could it be related to the securable module loader we are making use of in frame.js? But it has the 'system' principal set:

https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/extension/resource/modules/frame.js#L38

Its code can be found here:

https://github.com/mozilla/mozmill/blob/master/mozmill/mozmill/extension/resource/stdlib/securable-module.js
Comment 12 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-08-31 09:41:40 PDT
It could be, sure.  It's really hard to tell without just debugging this to see what's going on.
Comment 13 Henrik Skupin (:whimboo) 2012-09-03 01:57:31 PDT
Boris, would you be able to help out with debugging? I could provide a fully set up environment which lets you run Mozmill tests with debugger support.
Comment 14 Boris Zbarsky [:bz] (Out June 25-July 6) 2012-09-03 13:08:43 PDT
This bug really needs bholley or mrbkap to look at it.  Unfortunately, bholley is on vacation....
Comment 15 Henrik Skupin (:whimboo) 2012-09-03 13:18:12 PDT
OK, so here are the steps to set up a working Mozmill environment that someone can use to debug this issue. Keep in mind that the timeout value has to be set high enough that mozrunner does not kill the browser.

1. Create a virtual environment: "virtualenv mozmill"
2. Clone the mozmill repository: git clone git://github.com/mozilla/mozmill.git
3. Run ./setup_development.py on the master branch
4. Download the minimized testcase from bug 786249 (attachment 656449)
5. Run the testcase: mozmill -b /path/to/firefox -t testPermissions.js --timeout=10000000
Comment 16 Henrik Skupin (:whimboo) 2012-09-03 13:26:19 PDT
(In reply to Henrik Skupin (:whimboo) from comment #15)
> 5. Run the testcase: mozmill -b /path/to/firefox -t testPermissions.js
> --timeout=10000000

To run Firefox with gdb enabled you will have to use the following command:

mozmill -b /path/to/firefox -t testPermissions.js --timeout=10000000 --debugger=gdb
Comment 17 Alex Keybl [:akeybl] 2012-09-05 16:13:58 PDT
(for whenever Bobby's back from PTO)
Comment 18 Bobby Holley (busy) 2012-09-06 16:21:23 PDT
Hm, when I follow the STR here, I get this output:
http://pastebin.mozilla.org/1808006

Followed by a crash with the following stack:
http://pastebin.mozilla.org/1808046

Looks like the crash has to do with someone doing some faulty I/O with js-ctypes. Not sure about the specifics though.

I'm on mac 10.7. Any ideas on how to reproduce the permission error and/or avoid the js-ctypes crash here?
Comment 19 Henrik Skupin (:whimboo) 2012-09-06 22:01:02 PDT
Hey Bobby, most likely you are already getting this failure but it's not printed to the console. You can add '--console-level=DEBUG' to the above command line to see the details of the test result.

I have also seen this assertion a couple of times but it sounds like another issue. Mozmill, or rather JSBridge, uses js-ctypes for its low-level network communication between the Python CLI and the JS extension. I haven't seen such an assertion in the past, so it could be something new too. I will try to check that in the next few days and report it separately.
Comment 20 Bobby Holley (busy) 2012-09-07 11:47:04 PDT
The js-ctypes thing is a crash rather than an assertion, so it prevents me from going any further (and running more iterations? I only see iteration 0).

When I add --console-level=DEBUG, I get:
http://pastebin.mozilla.org/1809379

In particular, the error seems to be: "could not find element ID: addon".
Comment 21 Henrik Skupin (:whimboo) 2012-09-07 15:21:55 PDT
(In reply to Bobby Holley (:bholley) from comment #20)
> In particular, the error seems to be: "could not find element ID: addon".

Oh, that's because of bug 786985. Mozmill starts too early for debug builds and it will fail because no window is open yet. As a workaround, please put the following line right in front of the for loop inside the test method:

controller.sleep(10000);

It will wait 10s before starting the code that accesses the UI elements. Just raise the value if you still see it.
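The placement of that line might look roughly like this. It is only a sketch: runTest, runIteration and the iteration count are hypothetical stand-ins for the test method and loop body of the minimized testcase, and controller is assumed to be the Mozmill controller the test already uses.

```javascript
// Sketch of the workaround: pause once before the loop so the first
// browser window has time to open in slow debug builds (bug 786985).
function runTest(controller, iterations, runIteration) {
  controller.sleep(10000); // 10 s; raise it if the element is still not found

  for (var i = 0; i < iterations; i++) {
    runIteration(i);
  }
}
```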
Comment 22 Henrik Skupin (:whimboo) 2012-09-07 15:28:20 PDT
Bobby, which code landed yesterday? None of the tests we ran today show this problem anymore.
Comment 23 Henrik Skupin (:whimboo) 2012-09-09 09:29:48 PDT
So one of the patches from Bobby on the following pushlog fixed it. I will check tomorrow so I can figure out the right one.

Pushlog:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=6705e131aeaa&tochange=0c4fa25f637b
Comment 24 Bobby Holley (busy) 2012-09-09 22:11:57 PDT
(In reply to Henrik Skupin (:whimboo) from comment #23)
> So one of the patches from Bobby on the following pushlog fixed it. I will
> check tomorrow so I can figure out the right one.

It was almost certainly bug 774633. This is the one that I suggested might fix this (on IRC). I guess I must have asked a day too early. :-)

Anyway, sounds like this is fixed. Happy day! :-)
Comment 25 Henrik Skupin (:whimboo) 2012-09-11 04:11:56 PDT
The first good revision is:
changeset:   104319:32ecf0cec2c3
user:        Bobby Holley <bobbyholley@gmail.com>
date:        Wed Sep 05 11:32:06 2012 -0700
summary:     Bug 774633 - Move the call to SetInitialPrincipalToSubject into nsAppShellService::RegisterTopLevelWindow. r=jst

That means that attachment 646207 (Part 5 - Move the call to SetInitialPrincipalToSubject into nsAppShellService::RegisterTopLevelWindow. v1) fixed the problem here.

As mentioned before, it works fine now on Aurora and Nightly. So I am setting the right flags and closing the bug as resolved fixed. It would be great to get an automated test for it in m-c.
Comment 26 Bobby Holley (busy) 2012-11-04 07:20:05 PST
I'm not going to write an automated test for this. If someone wants to, please file a separate bug.
