Closed
Bug 825802
Opened 12 years ago
Closed 12 years ago
Memory leak in Marionette when calling execute_script() in CONTENT context
Categories
(Remote Protocol :: Marionette, defect, P1)
Tracking
(blocking-b2g:tef+, blocking-basecamp:-, firefox19 wontfix, firefox20 wontfix, firefox21 fixed, b2g18 fixed, b2g18-v1.0.0 fixed)
RESOLVED
FIXED
B2G C4 (2jan on)
People
(Reporter: m1, Assigned: jgriffin)
References
Details
(Whiteboard: [cr 436777] [triage:1/16])
Attachments
(5 files, 2 obsolete files)
After many iterations of MT calls or Browser app open/close cycles, we are seeing what looks to be unbounded (or at least not well bounded) memory usage growth in the b2g process.
Before:
PID PR CPU% S #THR VSS RSS PCY UID Name
127 0 6% S 37 168824K 42000K fg root /system/b2g/b2g
488 0 0% S 11 61292K 23184K fg app_488 /system/b2g/plugin-container
474 0 0% S 11 59176K 20520K fg app_474 /system/b2g/plugin-container
632 0 0% S 10 55980K 17232K fg app_632 /system/b2g/plugin-container
364 0 0% S 11 65396K 16764K fg app_364 /system/b2g/plugin-container
After
PID PR CPU% S #THR VSS RSS PCY UID Name
127 0 0% S 36 221392K 102140K fg root /system/b2g/b2g
19093 0 15% S 14 80976K 24224K fg app_1909 /system/b2g/plugin-container
364 0 0% S 11 65396K 14596K fg app_364 /system/b2g/plugin-container
19270 0 0% S 10 55980K 9848K fg app_1927 /system/b2g/plugin-container
The rest of the process memory usage in the system is basically unchanged from Before to After (couple KBs here and there).
This leads to unexpected killing of content processes that cause stability test to fail.
Reporter | ||
Comment 1•12 years ago
|
||
(Working on getting more details about what the maps file looks like for the b2g process After, but wanted to file early in case other folks had seen this as well and/or to qawanted it.)
Whiteboard: [cr 436777]
Comment 2•12 years ago
|
||
m1, should we set up some queries for the whiteboard content you're using here?
blocking-basecamp: ? → +
Comment 3•12 years ago
|
||
Hey Nicholas,
Is this something you can look at in the next few days?
Assignee: nobody → n.nethercote
Comment 4•12 years ago
|
||
njn doesn't have a b2g phone.
> After many iterations of MT calls
What's an MT call?
Standard procedure when reporting memory usage issues is to attach a memory report. You can generate one by running tools/get-about-memory.py (just attach the whole folder it creates).
It's also helpful if you can additionally attach the memory report generated by running get-about-memory.py --minimize, which triggers a gc.
Updated•12 years ago
|
Assignee: n.nethercote → justin.lebar+bug
Comment 5•12 years ago
|
||
I tried repeating (1) and (2) below
1) opening the browser app, pressing home, and then opening it again, and
2) opening the browser app, long-pressing home, killing the browser app, then opening the browser again.
I was unable to get main-process memory usage to go over 69mb in either case.
Looks like it's up to someone else to capture a memory report here.
Updated•12 years ago
|
Flags: needinfo?(mvines)
Reporter | ||
Comment 6•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] (away 12/21-1/2) from comment #4)
> What's an MT call?
mobile terminated.
> Standard procedure when reporting memory usage issues is to attach a memory
> report. You can generate one by running tools/get-about-memory.py (just
> attach the whole folder it creates).
Can't find this tool. Got a better link? For us to be able to use it in a widespread manner it needs to be work on Windows BTW.
Flags: needinfo?(mvines)
Comment 7•12 years ago
|
||
> Can't find this tool.
The relevant tools directory is in the root B2G repository.
I have no idea if it will work on Windows, but feel free to file bugs or send pull requests.
Reporter | ||
Comment 8•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] (away 12/21-1/2) from comment #7)
> > Can't find this tool.
>
> The relevant tools directory is in the root B2G repository.
Thanks.
> I have no idea if it will work on Windows, but feel free to file bugs or
> send pull requests.
--
mvines@cros-bs-p3:~/ws/b2g/device/qcom/b2g_common/mozilla-b2g/tools$ ./get_about_memory.py
This script requires Python 2.7.
--
--
C:\Python27>python.exe u:\mem\get_about_memory.py
Traceback (most recent call last):
File "u:\mem\get_about_memory.py", line 35, in <module>
import fix_b2g_stack
File "u:\mem\fix_b2g_stack.py", line 42, in <module>
import fcntl
ImportError: No module named fcntl
--
:(
Comment 9•12 years ago
|
||
fix_b2g_stack is definitely not going to work on Windows (not because of fcntl, which is used for file locking which really isn't important, but because you don't have an addr2line binary). But we don't need that for get_about_memory.py without DMD, so there's still some hope.
We have other essential tools which also aren't going to work on Windows (e.g. the profile script), so it seems to me that the simplest thing to do is for you to spin up a Linux VM and use the tools there. If you use these tools on Windows, you're going to be the sole consumers on that platform, which sucks for obvious reasons.
If using *nix is not an option, please file a bug and we'll work through this. I don't think there's anything intrinsic to get_about_memory.py that should prevent us from making it work on Windows.
Reporter | ||
Comment 10•12 years ago
|
||
*nix is not an option for the test folks unfortunately, VM or real hardware.
Comment 11•12 years ago
|
||
Do the instructions in bug 826272 suffice for getting an about:memory dump here? It's not ideal, but if we consider this bug a blocker, we shouldn't spin our wheels...
Reporter | ||
Comment 12•12 years ago
|
||
Ah, yeah that should work. Is there a "standalone" tool to post proccess the json dump?
Comment 13•12 years ago
|
||
There isn't a standalone tool to do the post-processing, but it's not necessary. The only post-processing we do is to merge all all of the processes' about:memory dumps into one file. This is convenient because you can open them all at once in Firefox.
But if all you do is gunzip the files and open them in about:memory in Nightly, that should work without any other post-processing. (Or you can just attach the separate files to a bug.)
Reporter | ||
Comment 14•12 years ago
|
||
This maps file is from the "Browser_B2G" test which failed at the 190th iteration this time. The test simply navigates to two different URLs per iteration (opening browser instance once per iteration).
Check out the end of the maps file. Lots of anonymous maps, some 1-2MB in size. The way that the anonymous maps increase in size towards the end of the file kindof looks like a hash table or some other data structure scaling up it's memory usage as it gets more loaded.
Comment 15•12 years ago
|
||
Both jemalloc and the JS engine allocate in 1 and 2mb blocks. Moreover, I'm pretty sure maps coalesces adjacent mappings in /maps, so I'd expect to see larger blocks towards the end, where we haven't unmapped as much stuff as towards the beginning.
(In reply to Michael Vines [:m1] from comment #14)
> Created attachment 698051 [details]
> a /proc maps file
>
> This maps file is from the "Browser_B2G" test which failed at the 190th
> iteration this time. The test simply navigates to two different URLs per
> iteration (opening browser instance once per iteration).
Does "browser instance once per iteration" mean
- close browser app, open browser app
or
- leave browser app open, close tab, open tab
Also, are the two URLs the same in all iterations, or different pairs of URLs in each iteration?
Reporter | ||
Comment 18•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #13)
> open them in about:memory in Nightly
Sorry for being a n00b. I don't see way in about:memory in Nightly (Linux) to open these.
Comment 19•12 years ago
|
||
> I don't see way in about:memory in Nightly (Linux) to open these.
At the bottom of about:memory, there's a button to load from a file. (Be sure to gunzip first.)
> Sorry for being a n00b.
Sorry for my bad UI!
Reporter | ||
Comment 20•12 years ago
|
||
(In reply to Chris Jones [:cjones] [:warhammer] from comment #16)
>
> Does "browser instance once per iteration" mean
> - close browser app, open browser app
> or
> - leave browser app open, close tab, open tab
Leave the browser app open w/ single tab, just swapping the url between cnn and yahoo once per iteration.
(In reply to Justin Lebar [:jlebar] from comment #19)
> At the bottom of about:memory, there's a button to load from a file. (Be
> sure to gunzip first.)
Thanks, turns out my nightly was not from last night. :(
The state of the system was unfortunately destroyed so I've asked for a rerun which will happen over the weekend. This last run was apparently performed a build from December 17th (when the issue was first discovered) so I've asked they make the next attempt on the tip.
Comment 21•12 years ago
|
||
The ball is in Michel Vines' court here to get us more info. We can't do anything here until we get more data.
Flags: needinfo?(mvines)
Reporter | ||
Comment 22•12 years ago
|
||
Withdrawing this for now. We can't seem to reproduce this at the tip anymore. Will reopen later if it pops up again
Status: NEW → RESOLVED
Closed: 12 years ago
Flags: needinfo?(mvines)
Resolution: --- → WORKSFORME
Sigh. And yay.
Comment 24•12 years ago
|
||
So I was working on this today and I got the device fully locked up so it didn't react on any input anymore. Even the power button was not functional. Thankfully it restarted after about 30s or so.
After talking to Chris and Justin, I'm now doing an analysis on it. As we just figured out this issue as reported in comment 0 is still visible with an engineering build from yesterday. I'm now working closely together with Justin to get this investigated. Expect an update soon.
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
Comment 25•12 years ago
|
||
Not blocking basecamp on this, but blocking-b2g: tef+. *Please* keep working on this thought, we must resolve this before releasing v1.
blocking-b2g: --- → tef+
blocking-basecamp: + → -
Comment 26•12 years ago
|
||
In case it's important later that's the output via adb logcat at the freeze situation:
I/InputReader( 109): Detected input event buffer overrun for device atmel-touchscreen.
D/memalloc( 109): /dev/pmem: Freeing buffer base:0x4adc2000 size:614400 offset:1843200 fd:146
D/memalloc( 109): /dev/pmem: Freeing buffer base:0x4aeee000 size:614400 offset:3072000 fd:133
I/Gecko ( 109):
I/Gecko ( 109): ###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv
I/Gecko ( 109):
I/Gecko ( 109):
I/Gecko ( 109): ###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv
I/Gecko ( 109):
I/Gecko ( 109):
I/Gecko ( 109): ###!!! [Parent][AsyncChannel] Error: Channel error: cannot send/recv
I/Gecko ( 109):
Status: REOPENED → ASSIGNED
Comment 27•12 years ago
|
||
Comment 28•12 years ago
|
||
Comment 29•12 years ago
|
||
> 66.75 MB (60.76%) -- compartment([System Principal], chrome://browser/content/shell.xul)
Um.
How does marionette inject these events? If it runs any code in the shell.xul compartment, it looks like it might be leaking.
Comment 30•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #29)
> > 66.75 MB (60.76%) -- compartment([System Principal], chrome://browser/content/shell.xul)
>
> Um.
>
> How does marionette inject these events? If it runs any code in the
> shell.xul compartment, it looks like it might be leaking.
Mdas or jgriffin, could one of you reply to this? I will try to figure that out but you have indeed better knowledge how marionette works.
Comment 31•12 years ago
|
||
(In reply to Justin Lebar [:jlebar] from comment #29)
> > 66.75 MB (60.76%) -- compartment([System Principal], chrome://browser/content/shell.xul)
>
> Um.
>
> How does marionette inject these events? If it runs any code in the
> shell.xul compartment, it looks like it might be leaking.
since we are trying to emulate how a user does things we try and fire all the events that the user was doing. e.g. for click its http://mxr.mozilla.org/mozilla-central/source/testing/marionette/atoms/atoms.js#80 for typing its http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-sendkeys.js or for touch we use http://mxr.mozilla.org/mozilla-central/source/testing/marionette/client/marionette/touch/synthetic_gestures.js.
All of this is running through the remote debugger so the net needs to be quite wide.
Comment 32•12 years ago
|
||
> http://mxr.mozilla.org/mozilla-central/source/testing/marionette/atoms/atoms.js#80
Do we have an unminified version of this kicking around somewhere?
Comment 33•12 years ago
|
||
Henrick and I looked at the JS GC heap dump from the main process.
I was surprised to see so much memory given to scripts:
> 18.92 MB (17.22%) ── scripts [2]
To put this into perspective, gmail has ~3mb of scripts.
I'm not an expert on reading these JS GC heap dumps, but we saw 200,000 occurrences of
> 0xdeadbeef B script chrome://specialpowers/content/specialpowersAPI.js:53
for varying addresses (i.e., the first address was not always the same).
I don't know exactly what this means, but I bet Gregor or Andrew can tell us.
This looks like a bug in Marionette, although it could be a bug in how JS deals with what Marionette is doing. At the very least, it does not appear to be a bug in the system. So either this bug should be bb-, or we should move the Marionette testing out of this bug and into a new bug. What we do may depend on how well we believe our Marionette tests here approximate what QC was doing. (Are they identical?)
Comment 34•12 years ago
|
||
Specialpowers is a test-harness only thing. We've had problems in it before that caused terrible memory usage in Mochitests. Of course, it could be a platform issue, too. I think bholley or jmaher are the people to ask about that.
Comment 35•12 years ago
|
||
Also, if you could email or attach a larger snippet of the GC heap dump I can help you interpret it. That single line isn't really enough context. It does seem odd that the address is DEADBEEF...
Comment 36•12 years ago
|
||
> Specialpowers is a test-harness only thing
Sorry, I should have asked a more specific question: What does it mean that we have 200,000 script objects all pointing to X.js:53?
Comment 37•12 years ago
|
||
Oh, is it not literally DEADBEEF? Okay, that makes more sense...
You'd have to ask a JS person, but I'm guessing that it is loading the script 200,000 times, and not freeing it properly.
Using my log analysis tool from here under gc/find_roots.py you should be able to figure out why those scripts are alive. eg what is rooting them.
https://github.com/amccreight/heapgraph
Comment 38•12 years ago
|
||
Here's the gc log: http://people.mozilla.com/~jlebar/gc-edges.108.log.xz
Comment 39•12 years ago
|
||
So what actually happens in the test script is that we import a JS script at the beginning of the test. It contains the functions for random key events we execute later via execute_script(). All events are getting triggered directly via windowutils and not via Marionette! So sorry for the false information above.
I have flashed the unagi device with a debug build now and I will try further to debug it once it's in the same state again.
Comment 40•12 years ago
|
||
We've had bugs in the past where SpecialPowers has caused us to leak every window ever, like bug 794420. Not sure if that's what's happening here though.
Comment 41•12 years ago
|
||
I run the gc heap graph script against the address 0x485878f8 which is one of the instances of specialpowerapi.js:53. That's the result of it. Given that there is no documentation how to analyze it I'm not fully sure how to interpret those results.
Comment 42•12 years ago
|
||
(In reply to Bobby Holley (:bholley) from comment #40)
> We've had bugs in the past where SpecialPowers has caused us to leak every
> window ever, like bug 794420. Not sure if that's what's happening here
> though.
No, I don't think so. Or at least, if we are leaking all windows, that's not the main problem here. One of the main problems is 200,000 JS script objects.
Comment 43•12 years ago
|
||
> I'm not fully sure how to interpret those results.
Each entry shows a chain from a root to the object. For instance, in the first one, there's a root jar:file:///system/b2g/omni.ja!/components/Webapps.js that has a field with the label FakeBackstagePass in the log.
This line:
--[_browserElementChildMap]-> 0x43524d60 [Object <no private>]
...means that the object on the previous line has a field labelled _browserElementChildMap that contains the value 0x43524d60. The label of the object 0x43524d60 in the log is Object <no private>.
All of the shape, base, parent stuff is JS internal stuff. Basically, once you get into a particular clump of JS, you can get into many things in that clump.
Reporter | ||
Comment 44•12 years ago
|
||
How are we doing here?
Comment 45•12 years ago
|
||
Someone needs to own this, and I think it should be Henrik, since he's able to reproduce the issue, or the person who wrote the script that's causing the leak. This doesn't appear to be a memory leak in anything other than the test harness, so I don't think I'm the right owner.
Andrew, are all of the 200,000 (to a first approximation) script objects rooted at Webapps.jsm's FakeBackstagePass?
Assignee: justin.lebar+bug → nobody
Comment 46•12 years ago
|
||
Last Friday we have seen a couple of issues with the script and it needs to be changed drastically. We are not sure yet if such a change will affect the memory behavior. We will discuss that in our automation development meeting today and we will come back here with our plan of attack. It will not be me who will continue here but most likely Rob.
Here some of the behaviors in question we figured out:
* The script doesn't wait until the application has been loaded and focused. This is causing bug 829514. The letter 'u' is doing strange things which looks like restarting Gaia. This also might have an influence of the memory consumption here. Doug wanted to explore the other bug once he is back from traveling back.
* We send events not through the interfaces Marionette provides but use artificial scripts injected into the systems scope on the b2g device. Each time we trigger an event we call execute_script(). Given that we have seen leaks with scripts in Marionette it might be related.
I think that we should get the script fixed and retest everything. The current version of the script can be used in parallel to find possible leaks in Marionette which have to be fixed too. But the latter should not block any further work for possible b2g leaks.
Reporter | ||
Comment 47•12 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #46)
> * We send events not through the interfaces Marionette provides but use
> artificial scripts injected into the systems scope on the b2g device. Each
> time we trigger an event we call execute_script(). Given that we have seen
> leaks with scripts in Marionette it might be related.
oic! Can you please let us know what interface should be used here instead?
Comment 48•12 years ago
|
||
See bug 809245 for some touch related events you can directly send via the Python code. Otherwise there is a send_keys() method available too. Hm, interesting is that those touch events are also making use of execute_script internally. So we might not be able to workaround a possible leak in that method.
Updated•12 years ago
|
Assignee: nobody → mvines
Reporter | ||
Comment 49•12 years ago
|
||
I can reproduce the not-well-bounded memory growth by removing all input event injection, so doesn't seem to be that. Here's a simple snippet that reproduces this issue for me at least.
marionette.set_context(Marionette.CONTEXT_CONTENT);
while 42:
marionette.execute_script("return 0;");
execute_script() is leaking after all? This doesn't occur w/ CONTEXT_CHROME.
Assignee: mvines → nobody
Comment 50•12 years ago
|
||
Thanks Michael for that helpful information. So it might be really Marionette related then. I did a quick check if I can find something in the Marionette code and the following code is interesting:
Here is the internal execute() method which gets executed when execute_script() is called from the Python side. The appropriate actor method is executeScript():
http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-actors.js#718
So in line 726 we differentiate a call to context content vs. chrome:
http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-actors.js#726
When we are in content we are triggering a call to this.sendAsync() which sends the same event to the same actor command again. It doesn't happen for chrome! So the sync call to executeScript form execute_script() is turned into a repeating async call and the original call to execute() returns so that we continue in the Python code, while more and more sendAsync() calls are getting fired.
That only stops because we change the context on the Python side to CHROME again for a short moment. That let the loop quit triggering new sendAsync() calls. But that means we are firing the event against the chrome scope now but not content. So it's not the application which receives the event but the system application. That might explain the behavior we can see in bug 829514.
Next we call getCurrentWindow() which itself calls createExecuteSandbox:
http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-actors.js#326
http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-actors.js#623
In the latter method we load various script files if the specialPowers variable has been set:
http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-actors.js#651
I'm not sure if we reach this code given that I haven't had the time to investigate that. But someone might have to check that. If it's the case we create a new special powers object each time we fire off an event and that way the dubious line in specialpowersapi.js gets called:
https://mxr.mozilla.org/mozilla-beta/source/testing/specialpowers/content/specialpowers.js#13
https://mxr.mozilla.org/mozilla-beta/source/testing/specialpowers/content/specialpowersAPI.js#31
https://mxr.mozilla.org/mozilla-beta/source/testing/specialpowers/content/specialpowersAPI.js#53
Please keep in mind that I haven't tested the above so my analysis might be wrong. But by checking the code might it be that we leak those sandboxes?
Comment 51•12 years ago
|
||
Lets move over the bug to the Marionette component for further investigation and a fix.
Component: General → Marionette
Product: Boot2Gecko → Testing
Summary: /system/b2g/b2g consuming too much memory during stability test → Memory leak in Marionette when calling execute_script() in CONTENT context
Comment 52•12 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #50)
> Here is the internal execute() method which gets executed when
> execute_script() is called from the Python side. The appropriate actor
> method is executeScript():
> http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-
> actors.js#718
>
> So in line 726 we differentiate a call to context content vs. chrome:
> http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-
> actors.js#726
>
> When we are in content we are triggering a call to this.sendAsync() which
> sends the same event to the same actor command again. It doesn't happen for
> chrome! So the sync call to executeScript form execute_script() is turned
> into a repeating async call and the original call to execute() returns so
> that we continue in the Python code, while more and more sendAsync() calls
> are getting fired.
>
> That only stops because we change the context on the Python side to CHROME
> again for a short moment. That let the loop quit triggering new sendAsync()
> calls. But that means we are firing the event against the chrome scope now
> but not content. So it's not the application which receives the event but
> the system application. That might explain the behavior we can see in bug
> 829514.
So up to this point I think we basically need to:
726 - if (this.context == "content") {
726 + if (this.context == "content" && inChromeScope) {
and then modify the rest of the execute function to handle running scripts from content space. I don't think this will fix the leak however, that'll have to be taken care of separately. I'm not familiar with this code base, so I think I should leave it to mdas or jgriffin.
> That only stops because we change the context on the Python side to CHROME
> again for a short moment.
Henrik, out of curiosity, if you put a sleep right before this happens, does the number of references to specialpowers rise or remain at a constant 200,000?
Flags: needinfo?(hskupin)
Reporter | ||
Updated•12 years ago
|
Flags: needinfo?(mdas)
Flags: needinfo?(jgriffin)
Reporter | ||
Updated•12 years ago
|
Severity: normal → blocker
Assignee | ||
Comment 53•12 years ago
|
||
I'll take this.
Assignee: nobody → jgriffin
Flags: needinfo?(mdas)
Flags: needinfo?(jgriffin)
Comment 54•12 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #50)
> Thanks Michael for that helpful information. So it might be really
> Marionette related then. I did a quick check if I can find something in the
> Marionette code and the following code is interesting:
>
> Here is the internal execute() method which gets executed when
> execute_script() is called from the Python side. The appropriate actor
> method is executeScript():
> http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-
> actors.js#718
>
> So in line 726 we differentiate a call to context content vs. chrome:
> http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-
> actors.js#726
>
> When we are in content we are triggering a call to this.sendAsync() which
> sends the same event to the same actor command again. It doesn't happen for
> chrome! So the sync call to executeScript form execute_script() is turned
> into a repeating async call and the original call to execute() returns so
> that we continue in the Python code, while more and more sendAsync() calls
> are getting fired.
>
That's not what's happening actually. So the sendAsync method is actually sending the a 'Marionette:executeScript' message through the MessageManager. This gets picked up by the appropriate content listener, and the executeScript call in marionette-listener.js will be executed. So control goes from 'execute' function in marionette-actors.js to 'executeScript' function in marionette-listener.js, and we return to the client from marionette-listener.js. So there is no loop here. The leak is somewhere else...
Assignee: jgriffin → nobody
Severity: blocker → normal
Assignee | ||
Updated•12 years ago
|
Assignee: nobody → jgriffin
Assignee | ||
Comment 55•12 years ago
|
||
I don't think mdas meant to change the priority.
Severity: normal → blocker
Assignee | ||
Comment 56•12 years ago
|
||
This patch seems to fix the problem for me locally. It was suggested by Ted - it just changes the way we bind DOMWindowUtils to SpecialPowers. We'll need to run this on try to make sure it doesn't break anything.
Attachment #702495 -
Flags: review?
Assignee | ||
Comment 57•12 years ago
|
||
Pushed to try: https://tbpl.mozilla.org/?tree=Try&rev=a664f79dd642
(Who did you mean to hook for review?)
Assignee | ||
Updated•12 years ago
|
Attachment #702495 -
Flags: review? → review?(ted)
\o/
Comment 60•12 years ago
|
||
Comment on attachment 702495 [details] [diff] [review]
Improve SpecialPowersAIP.bindDOMWindowUtils() so it doesn't leak in Marionette,
Review of attachment 702495 [details] [diff] [review]:
-----------------------------------------------------------------
Not sure why bholley didn't rip all this code out when he added SpecialPowers.wrap in the first place.
Attachment #702495 -
Flags: review?(ted) → review+
Reporter | ||
Comment 61•12 years ago
|
||
Patch is looking good so far in my setup, b2g process memory is definitely not growing like it was before
Assignee | ||
Comment 62•12 years ago
|
||
This patch seems to break test_domWindowUtils.html; maybe the test just needs to be updated.
Assignee | ||
Comment 64•12 years ago
|
||
The try run on my previous patch is a sea of orange; apparently quite a few tests depend on the ghetto method that SpecialPowers uses to bind DOMWindowUtils. Tracking down and fixing all those cases will take a while, especially since I'm not familiar with any of the relevant code. Since time seems to be critical here, here's an alternate patch. This causes Marionette not to expose SpecialPowers to script unless the caller specifies special_powers=True in calls to execute_(async_)script. Most of the plumbing for this already existed, but it wasn't completely hooked up for content calls. This doesn't fix the problem, but causes the monkey script not to trigger it.
I suggest we move the SpecialPowers issue to a separate bug, assuming that Michael Vines confirms that this patch resolves his issue.
Attachment #702581 -
Flags: review?(philipp)
Assignee | ||
Comment 65•12 years ago
|
||
Comment on attachment 702581 [details] [diff] [review]
Don't expose SpecialPowers unless specifically requested,
Whoever gets to it first...
Attachment #702581 -
Flags: review?(mdas)
Assignee | ||
Comment 66•12 years ago
|
||
Reporter | ||
Comment 67•12 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #64)
> I suggest we move the SpecialPowers issue to a separate bug, assuming that
> Michael Vines confirms that this patch resolves his issue.
Patch looks real good so far! I don't see the same leaks after a couple hours of monkey on my machine. Our test guys have a couple stations running right now too. Ship it!
Comment 68•12 years ago
|
||
Comment on attachment 702581 [details] [diff] [review]
Don't expose SpecialPowers unless specifically requested,
Review of attachment 702581 [details] [diff] [review]:
-----------------------------------------------------------------
::: testing/marionette/marionette-listener.js
@@ +307,5 @@
> });
>
> + if (specialPowers) {
> + sandbox.SpecialPowers = new SpecialPowers(aWindow);
> + }
Could also make it a lazy getter instead of lugging the 'specialPowers' flag around, no? I don't see where that flag is ever enabled, anyway. Or am I missing something?
Anyway, I guess r+ if it works and is urgent...
Attachment #702581 -
Flags: review?(philipp) → review+
Assignee | ||
Comment 69•12 years ago
|
||
(In reply to Michael Vines [:m1] from comment #67)
> (In reply to Jonathan Griffin (:jgriffin) from comment #64)
> > I suggest we move the SpecialPowers issue to a separate bug, assuming that
> > Michael Vines confirms that this patch resolves his issue.
>
> Patch looks real good so far! I don't see the same leaks after a couple
> hours of monkey on my machine. Our test guys have a couple stations running
> right now too. Ship it!
Could you try patch 2 in place of patch 1? Patch 1 causes a lot of test failures that will prevent us from landing it quickly.
Assignee | ||
Comment 70•12 years ago
|
||
(In reply to Philipp von Weitershausen [:philikon] from comment #68)
> Comment on attachment 702581 [details] [diff] [review]
> Don't expose SpecialPowers unless specifically requested,
>
> Review of attachment 702581 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> ::: testing/marionette/marionette-listener.js
> @@ +307,5 @@
> > });
> >
> > + if (specialPowers) {
> > + sandbox.SpecialPowers = new SpecialPowers(aWindow);
> > + }
>
> Could also make it a lazy getter instead of lugging the 'specialPowers' flag
> around, no? I don't see where that flag is ever enabled, anyway. Or am I
> missing something?
>
> Anyway, I guess r+ if it works and is urgent...
We were already passing specialPowers=True/False from execute calls (see http://mxr.mozilla.org/mozilla-central/source/testing/marionette/client/marionette/marionette.py#465), but we weren't acting on it in marionette-listeners, just marionette-actors (see http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-actors.js#739). This patch just extends that to control SpecialPowers exposure in marionette-listeners as well.
Comment 71•12 years ago
|
||
Comment on attachment 702581 [details] [diff] [review]
Don't expose SpecialPowers unless specifically requested,
Review of attachment 702581 [details] [diff] [review]:
-----------------------------------------------------------------
lgtm
::: testing/marionette/marionette-listener.js
@@ +307,5 @@
> });
>
> + if (specialPowers) {
> + sandbox.SpecialPowers = new SpecialPowers(aWindow);
> + }
The specialPowers stuff is sent over from the client (http://mxr.mozilla.org/mozilla-central/source/testing/marionette/client/marionette/marionette_test.py#213)
We set it so that chrome calls would have SpecialPowers since content had it by default. We no longer have it by default in content, but since we set special_powers=True, it'll work as expected after this lands.
Attachment #702581 -
Flags: review?(mdas) → review+
Reporter | ||
Comment 72•12 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #69)
> Could you try patch 2 in place of patch 1? Patch 1 causes a lot of test
> failures that will prevent us from landing it quickly.
patch 2 seems to be holding up good as well.
Comment 73•12 years ago
|
||
(In reply to Jonathan Griffin (:jgriffin) from comment #70)
> We were already passing specialPowers=True/False from execute calls (see
> http://mxr.mozilla.org/mozilla-central/source/testing/marionette/client/
> marionette/marionette.py#465), but we weren't acting on it in
> marionette-listeners, just marionette-actors (see
> http://mxr.mozilla.org/mozilla-central/source/testing/marionette/marionette-
> actors.js#739). This patch just extends that to control SpecialPowers
> exposure in marionette-listeners as well.
(In reply to Malini Das [:mdas] from comment #71)
> The specialPowers stuff is sent over from the client
> (http://mxr.mozilla.org/mozilla-central/source/testing/marionette/client/
> marionette/marionette_test.py#213)
>
> We set it so that chrome calls would have SpecialPowers since content had it
> by default. We no longer have it by default in content, but since we set
> special_powers=True, it'll work as expected after this lands.
Gotcha. Thanks for the explanation.
My point about a lazy getter remains, but I guess most Marionette tests will need some form of SpecialPowers anyway, so it probably won't matter much.
Assignee | ||
Comment 74•12 years ago
|
||
Missed executeJSScript, this fixes that and is good to land.
Assignee | ||
Updated•12 years ago
|
Attachment #702581 -
Attachment is obsolete: true
Assignee | ||
Comment 75•12 years ago
|
||
Comment on attachment 702664 [details] [diff] [review]
Don't expose SpecialPowers unless specifically requested,
Carry r+ forward.
Attachment #702664 -
Flags: review+
Assignee | ||
Updated•12 years ago
|
Attachment #702495 -
Attachment is obsolete: true
Assignee | ||
Comment 76•12 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/0d2969ae8f71
I'll handle the lazy getter in a separate bug, so we don't have to wait X hours for a new try run.
Flags: needinfo?(hskupin)
Comment 77•12 years ago
|
||
Good to see the progress on it guys! I will also test it once it has been merged to m-c.
QA Contact: hskupin
Comment 78•12 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #50)
> That only stops because we change the context on the Python side to CHROME
> again for a short moment. That let the loop quit triggering new sendAsync()
> calls. But that means we are firing the event against the chrome scope now
> but not content. So it's not the application which receives the event but
> the system application. That might explain the behavior we can see in bug
> 829514.
I don't think that this is a correct behavior. Jonathan, shall I file a new bug about that, given it will not be part of the remaining fix?
Comment 79•12 years ago
|
||
Did you file a bug on the DOMWindowUtils thing?
Comment 80•12 years ago
|
||
(In reply to Henrik Skupin (:whimboo) from comment #78)
> (In reply to Henrik Skupin (:whimboo) from comment #50)
> > That only stops because we change the context on the Python side to CHROME
> > again for a short moment. That let the loop quit triggering new sendAsync()
> > calls. But that means we are firing the event against the chrome scope now
> > but not content. So it's not the application which receives the event but
> > the system application. That might explain the behavior we can see in bug
> > 829514.
>
> I don't think that this is a correct behavior. Jonathan, shall I file a new
> bug about that, given it will not be part of the remaining fix?
As Malini pointed out in comment 54, this isn't actually happening.
Comment 81•12 years ago
|
||
Oh, I missed that comment. Thanks Andrew.
Updated•12 years ago
|
Whiteboard: [cr 436777] → [cr 436777] [triage:1/16]
Assignee | ||
Comment 82•12 years ago
|
||
(In reply to Ted Mielczarek [:ted.mielczarek] from comment #79)
> Did you file a bug on the DOMWindowUtils thing?
Just filed as bug 831367
Comment 83•12 years ago
|
||
Status: ASSIGNED → RESOLVED
Closed: 12 years ago → 12 years ago
Resolution: --- → FIXED
Comment 84•12 years ago
|
||
status-b2g18:
--- → fixed
status-firefox19:
--- → wontfix
status-firefox20:
--- → wontfix
status-firefox21:
--- → fixed
Comment 85•12 years ago
|
||
Landed on mozilla-b2g18/gaia master prior to the 1/25 branching to mozilla-b2g18_v1_0_0/v1.0.0, updating status-b2g-v1.0.0 to fixed.
status-b2g18-v1.0.0:
--- → fixed
Updated•2 years ago
|
Product: Testing → Remote Protocol
You need to log in
before you can comment on or make changes to this bug.
Description
•