As described in bug 786249, there is a possible regression in the permission manager in Firefox 17.0a1.
The first bad revision is:
user: Bobby Holley <email@example.com>
date: Thu Aug 23 11:45:28 2012 -0700
summary: Bug 757046 - Convert enablePrivilege into an insecure test-only construct (preffed off everywhere but in automation). r=bz
A Mozmill testcase has been added to bug 786249 which demonstrates this failure after about the tenth iteration. So it doesn't happen immediately, and the iteration in which it fails varies.
Boris proposed debugging this, which I will do tomorrow:
> If you're willing to debug, you can breakpoint in the failure case in
> nsScriptSecurityManager::CanCreateWrapper and get the JS or C++ stacks.
> That might give me some idea of what's going on...
So running the Mozmill testcase with a debug build I get an assertion:
###!!! ASSERTION: About to remove a different wrapper with the same nsISupports identity! This will most likely cause serious problems!: '!wrapperInMap || wrapperInMap == wrapper', file /Volumes/data/code/firefox/nightly/js/xpconnect/src/XPCMaps.h, line 130
Created attachment 656797
Assertion stack as shown with 'export XPCOM_DEBUG_BREAK=stack' and fixed via 'tools/rb/fix-macosx-stack.pl'
I will try later today to get it into my debugger and collect all the data Boris requested.
(In reply to Henrik Skupin (:whimboo) from comment #0)
> > If you're willing to debug, you can breakpoint in the failure case in
> > nsScriptSecurityManager::CanCreateWrapper and get the JS or C++ stacks.
> > That might give me some idea of what's going on...
Boris, this method is called a couple of hundred times, which makes it hard to find the right call. Is there a condition I could add to the breakpoint? Or how would I configure gdb to stop at the wanted call?
Just breakpoint on the failure case. So "b nsScriptSecurityManager.cpp:2637" or whatever line http://hg.mozilla.org/mozilla-central/file/790fb17b1fe3/caps/src/nsScriptSecurityManager.cpp#l2637 is in your local version of the file.
Created attachment 657050
I hope that helps. Also, it looks like the assertion is not directly related to this issue.
What about the JS stack? "call DumpJSStack()" in gdb.
The JS stack is not really helpful. It only lists the code from within my testcase:
(gdb) call DumpJSStack()
0 mdObserver_findWindow() ["resource://mozmill/modules/frame.js -> file:///Volumes/data/code/mozmill-tests/nightly/testPermissions.js":108]
this = [object Object]
1 mdObserver_observe() ["resource://mozmill/modules/frame.js -> file:///Volumes/data/code/mozmill-tests/nightly/testPermissions.js":157]
this = [object Object]
var requestor = window.QueryInterface(Ci.nsIInterfaceRequestor);
dump ("*** requestor: " + requestor + "\n");
> var navigation = requestor.getInterface(Ci.nsIWebNavigation);
Would I have to switch threads because the above code is executed via a timer?
No, there's only one thread involved.
OK, so the code on the stack is the frame.js thing. That sure looks like it should have a chrome principal! And it definitely shouldn't have the principal you mentioned in bug 786249.
So could it be related to the securable module loader we are making use of in frame.js? But it has the 'system' principal set:
Its code can be found here:
It could be, sure. It's really hard to tell without just debugging this to see what's going on.
Boris, would you be able to help out with debugging? I could provide a fully set-up environment which lets you run Mozmill tests with debugger support.
This bug really needs bholley or mrbkap to look at it. Unfortunately, bholley is on vacation....
OK, so here are the steps to set up a working Mozmill environment which someone can use to debug this issue. Keep in mind that the timeout value has to be set high enough that mozrunner does not kill the browser.
1. Create a virtual environment: "virtualenv mozmill"
2. Clone the mozmill repository: git clone git://github.com/mozilla/mozmill.git
3. Run ./setup_development.py on the master branch
4. Download the minimized testcase from bug 786249 (attachment 656449 [details])
5. Run the testcase: mozmill -b /path/to/firefox -t testPermissions.js --timeout=10000000
(In reply to Henrik Skupin (:whimboo) from comment #15)
> 5. Run the testcase: mozmill -b /path/to/firefox -t testPermissions.js
To run Firefox with gdb enabled you will have to use the following command:
mozmill -b /path/to/firefox -t testPermissions.js --timeout=10000000 --debugger=gdb
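For convenience, the two invocations above can be wrapped in a small shell helper. This is only a sketch: build_mozmill_cmd is a hypothetical function name, not part of Mozmill, and it merely prints the command line using the flags already given in this bug (-b, -t, --timeout, --debugger=gdb) so it can be checked before running.

```shell
#!/bin/sh
# Hypothetical helper (not part of Mozmill): print the mozmill command line
# from this bug. Usage: build_mozmill_cmd BINARY TESTFILE [debug]
build_mozmill_cmd() {
    binary=$1
    test_file=$2
    mode=${3:-}
    # Very large timeout so mozrunner does not kill the browser
    # while it is stopped in the debugger.
    cmd="mozmill -b $binary -t $test_file --timeout=10000000"
    if [ "$mode" = "debug" ]; then
        # Start Firefox under gdb, as described above.
        cmd="$cmd --debugger=gdb"
    fi
    printf '%s\n' "$cmd"
}

# Example: print the debugger variant for the minimized testcase.
build_mozmill_cmd /path/to/firefox testPermissions.js debug
```

Printing rather than executing keeps the helper side-effect free; pipe the output to sh once the paths are correct for your checkout.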
(for whenever Bobby's back from PTO)
Hm, when I follow the STR here, I get this output:
Followed by a crash with the following stack:
Looks like the crash has to do with someone doing some faulty I/O with js-ctypes. Not sure about the specifics though.
I'm on Mac OS X 10.7. Any ideas on how to reproduce the permission error and/or avoid the js-ctypes crash here?
Hey Bobby, most likely you are already getting this failure but it's not printed to the console. You can add '--console-level=DEBUG' to the above command line to see the details of the test result.
I have also seen this assertion a couple of times, but it sounds like a different issue. Mozmill, or rather JSBridge, uses js-ctypes for its low-level network communication between the Python CLI and the JS extension. I haven't seen such an assertion in the past, so it could be something new too. I will try to check that in the next few days and report it separately.
The js-ctypes thing is a crash rather than an assertion, so it prevents me from going any further (and running more iterations? I only see iteration 0).
When I add --console-level=DEBUG, I get:
In particular, the error seems to be: "could not find element ID: addon".
(In reply to Bobby Holley (:bholley) from comment #20)
> In particular, the error seems to be: "could not find element ID: addon".
Oh, that's because of bug 786985. Mozmill starts too early for debug builds, and it will fail because no window is open yet. As a workaround, please put the following line right in front of the for loop inside the test method:
It will wait 10s before starting the code which accesses the UI elements. Just raise the value if you still see it.
Bobby, which code landed yesterday? None of the tests we ran today show this problem anymore.
So one of the patches from Bobby on the following pushlog fixed it. I will check tomorrow so I can figure out the right one.
(In reply to Henrik Skupin (:whimboo) from comment #23)
> So one of the patches from Bobby on the following pushlog fixed it. I will
> check tomorrow so I can figure out the right one.
It was almost certainly bug 774633. This is the one that I suggested might fix this (on IRC). I guess I must have asked a day too early. :-)
Anyway, sounds like this is fixed. Happy day! :-)
The first good revision is:
user: Bobby Holley <firstname.lastname@example.org>
date: Wed Sep 05 11:32:06 2012 -0700
summary: Bug 774633 - Move the call to SetInitialPrincipalToSubject into nsAppShellService::RegisterTopLevelWindow. r=jst
That means that attachment 646207 (Part 5 - Move the call to SetInitialPrincipalToSubject into nsAppShellService::RegisterTopLevelWindow. v1) fixed the problem here.
As mentioned before, it works fine now on Aurora and Nightly. So I'm setting the right flags and closing the bug as RESOLVED FIXED. It would be great to get an automated test for it in m-c.
I'm not going to write an automated test for this. If someone wants to, please file a separate bug.