Last Comment Bug 833518 - New HTMLElement bindings caused ~20MB regression in peak memory consumption on areweslimyet
: New HTMLElement bindings caused ~20MB regression in peak memory consumption o...
Status: RESOLVED FIXED
[MemShrink:P2]
: regression
Product: Core
Classification: Components
Component: DOM (show other bugs)
: Trunk
: All All
: -- normal with 2 votes (vote)
: ---
Assigned To: Peter Van der Beken [:peterv]
:
:
Mentors:
Depends on:
Blocks: 812333 818219
  Show dependency treegraph
 
Reported: 2013-01-22 12:18 PST by John Schoenick [:johns]
Modified: 2013-04-01 19:52 PDT (History)
40 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
unaffected
+
fixed
+
fixed


Attachments
Don't install new bindings for on* properties (6.44 KB, patch)
2013-01-25 08:09 PST, Peter Van der Beken [:peterv]
bzbarsky: review+
Details | Diff | Splinter Review
Don't install new bindings for on* properties (6.39 KB, patch)
2013-01-25 12:20 PST, Peter Van der Beken [:peterv]
peterv: review+
n.nethercote: review+
lukasblakk+bugs: approval‑mozilla‑aurora+
Details | Diff | Splinter Review
AWSY on Feb 6 (210.63 KB, image/png)
2013-02-05 19:57 PST, Nicholas Nethercote [:njn]
no flags Details
AWSY on Feb 19 (147.79 KB, image/png)
2013-02-19 13:54 PST, Nicholas Nethercote [:njn]
no flags Details

Description John Schoenick [:johns] 2013-01-22 12:18:01 PST
+++ This bug was initially created as a clone of Bug #820602 +++

https://areweslimyet.com/

When bug 818219 landed AWSY shows a 20MiB regression in the "RSS: After TP5 [+30s, forced GC]" (Dark green) line, which measures memory usage with 30 TP5 sites open after waiting 30 seconds and forcing a GC.

More specifically, bug 818219 landed as 79fc9d732bf7, which regressed by 18MiB, then was backed out as a675dff7f8c4 which returned to previous levels, before regressing again in the e42b1de74b4d .. 8a0c82cf76ab range, which included this landing.
Comment 1 Boris Zbarsky [:bz] (still a bit busy) 2013-01-22 12:42:17 PST
Hmm.  Do we have any good data on where the memory is used?

I would have expected some extra memory usage here because of all the getters/setters on HTMLElement now getting JSFunctions instead of being JSPropertyOps.

I guess we're setting this up on every single HTML element prototype, so we're looking at something like (100+ functions) * (however many prototypes) * (however many pages)...  If so, then finishing the conversion of HTML element classes to WebIDL will naturally help here.  Have we observed drops when such conversions have landed, by any chance?

(Note: it's a lot of functions because of all the on* properties.)
Comment 2 John Schoenick [:johns] 2013-01-22 13:19:35 PST
There haven't been any significant decreases in the past months :(

The last significant drop in usage was in the range:
https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=58c8080a1a7c&tochange=8586bd350875

If you zoom in to the Dec 7th range on AWSY and click on the points just before/after the jump you can view the full about:memory information of each test run. The explicit data is much less noisy than resident.

Looking at f23cff8ec6c9 vs 79fc9d732bf7:
- explicit/window-objects regressed by 14MiB, and explicit/js-non-window by ~4MiB
- The myspace.com compartment went from ~12MiB to ~14MiB
  - gc-heap/objects/function up ~700KiB,
  - unused-gc-things up ~500KiB
  - shapes/base by ~300KiB
  - shapes/dict by ~150KiB
Comment 3 John Schoenick [:johns] 2013-01-22 13:28:30 PST
It looks like bug 812333 also caused a smaller ~7MiB regression in explicit.

Prior to this 20MiB spike, we climbed from 262MiB to 282MiB from Oct 27th to Dec 7th, largely in small increments.
Comment 4 Nicholas Nethercote [:njn] 2013-01-22 14:56:58 PST
> If so, then finishing the conversion of HTML
> element classes to WebIDL will naturally help here.

Why is that?

This is a bad regression, and it's on Aurora now.  khuey says we could turn this off for the moment.  Do we have other, longer-term options?
Comment 5 :Ms2ger (⌚ UTC+1/+2) 2013-01-23 01:11:59 PST
(In reply to Nicholas Nethercote [:njn] from comment #4)
> > If so, then finishing the conversion of HTML
> > element classes to WebIDL will naturally help here.
> 
> Why is that?

Because we install the JSFunctions on the leaf prototypes, but only for those classes that aren't converted yet. The new bindings follow WebIDL and inherit them from HTMLElement.prototype. That is, for XPConnect HTML elements, we have

Element.prototype -> Node.prototype -> EventTarget.prototype (each with its own members)
^
|
HTMLElement.prototype (also with its own members)
^
|
(Leaf class).prototype (with all members from the leaf class, *and* the members from all classes along the prototype chain)

whereas for WebIDL binding HTML elements, we have


Element.prototype -> Node.prototype -> EventTarget.prototype (each with its own members)
^
|
HTMLElement.prototype (also with its own members)
^
|
(Leaf class).prototype (with *just* the members from the leaf class)
Comment 6 Peter Van der Beken [:peterv] 2013-01-23 03:09:49 PST
(In reply to Nicholas Nethercote [:njn] from comment #4)
> This is a bad regression, and it's on Aurora now.  khuey says we could turn
> this off for the moment.  Do we have other, longer-term options?

I don't think we should turn this off, I think we should try filtering out all the on* properties first. As said the longer-term option is converting to the new bindings.
Comment 7 Peter Van der Beken [:peterv] 2013-01-23 06:55:47 PST
I'm trying out a patch that reduces the number of properties by filtering the on* properties. Though I have to say it's a bit baffling how hard it is to be able to reproduce these test results, given the importance we seem to give to this test.
Comment 8 Boris Zbarsky [:bz] (still a bit busy) 2013-01-23 07:02:25 PST
> There haven't been any significant decreases in the past months :(

Well, it depends on how you define "significant".

There is a definite drop of 6+MB on this pushlog, afaict: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=6172df451c1c&tochange=6503c6bee6f8 which is what I'd expect from the changes there.  It's just not clear whether that drop is noise or due to the changes....

> Do we have other, longer-term options?

The primary long-term option is finishing the conversion of HTML elements to WebIDL.  This is a highly parallelizable and mechanical task with two exceptions (embed/object and form), and I welcome people helping out.  ;)  Seriously, though, long-term this should give us what we want.  To the extent that it doesn't, we'll need to fix the JS engine to use less memory for these functions...

Peter, for comment 6 the idea is to try not installing quickstubs for on*?  We could certainly try that; it should be a pretty small behavior change, I think...
Comment 9 Justin Lebar (not reading bugmail) 2013-01-23 07:08:32 PST
> Though I have to say it's a bit baffling how hard it is to be able to reproduce these test results, 
> given the importance we seem to give to this test.

I imagine that if you send johns a patch, he can run it through AWSY on his machine, which is probably much easier than trying to set it up locally.
Comment 10 Andrew McCreight [:mccr8] 2013-01-23 07:38:53 PST
You can also look at the individual reports on AWSY and probably dig around in the about:memory report for each to figure out which pages are hit the worst, then get the TP5 set.  For this particular problem that does not involve anything transient, I imagine "load a page, click minimize memory usage, wait a few seconds" should show the problem.
Comment 11 Justin Lebar (not reading bugmail) 2013-01-23 07:40:16 PST
(The TP5 pageset is available by asking on #releng.)
Comment 12 John Schoenick [:johns] 2013-01-23 08:37:38 PST
(In reply to Peter Van der Beken [:peterv] from comment #7)
> Though I have to say it's a bit baffling how hard it is
> to be able to reproduce these test results, given the importance we seem to
> give to this test.

Adding the ability to the AWSY control panel to automatically test try builds is on my todo list, in the meantime I can run the test on any patches pretty easily.

For running the test locally, there's notes at bug 820602 comment 22. It requires setting up the TP5 pageset (which cannot be distributed) and issuing one command, so I'm not sure it's unreasonably difficult compared to our other test suites.
Comment 13 Nicholas Nethercote [:njn] 2013-01-23 14:33:44 PST
> I'm trying out a patch that reduces the number of properties by filtering
> the on* properties.

Great!  I've assigned this bug to you accordingly.  If the filtering doesn't work, I guess we can turn this off on Aurora, and then hopefully do enough WebIDL conversion during the rest of this cycle to fix Nightly?

> Though I have to say it's a bit baffling how hard it is
> to be able to reproduce these test results, given the importance we seem to
> give to this test.

Good memory consumption tests are really hard to write;  that's life.  John's done a great job getting AWSY as useful and usable as it is.  Just post the patch when you're done and he'll be able to do a test run.  Thanks!
Comment 14 Jeff Walden [:Waldo] (remove +bmo to email) 2013-01-23 15:38:51 PST
In the long term -- nowhere close to short term -- one possibility for dealing with the webidl function-size memory increase is to institutionalize (in the disputably good way) the quickstubs __lookupGetter__/__lookupSetter__ hack, so that the relevant properties include a JSNative initially, then on Object.getOwnProperty inspection they get mutated to contain an actual function.

This is a substantial amount of work of considerable complexity, in terms of implementation, in terms of how it affects other changes we want to make (specifically, making our MOP be like the ES6 MOP -- it makes using Property Descriptor in the MOP harder, because you have to be able to rewrite the Shape sometimes, but not always), and in terms of not affecting perf.  I did some (but not all) of this work in bug 560072, back in the day, so I *know* it's complex and tricky and difficult and error-prone.  But it's possible it's what we'll have to do.  :-\
Comment 15 Lukas Blakk [:lsblakk] use ?needinfo 2013-01-23 17:03:01 PST
We can track this for release if the proposal is to backout bug 818219 from branches and work on a proper fix as per comment 14, but if a backout isn't an option this doesn't seem fit to be a release-blocking tracked bug since it looks like a much more involved effort that will span several versions.
Comment 16 Boris Zbarsky [:bz] (still a bit busy) 2013-01-24 05:35:23 PST
> I guess we can turn this off on Aurora

I would be strongly opposed to that unless this is a critical stop-ship problem.  Doing that will make methods with the same name have different behavior on different HTML elements, which is ... highly suboptimal.  In my opinion.
Comment 17 Justin Lebar (not reading bugmail) 2013-01-24 06:55:50 PST
We were giving gps a hard time for a 2.5mb memory regression with FHR; it's only fair to give you 8x more of a hard time.  :)  (Actually, this is probably worse than 8x, because it scales with additional memory usage; it's not a fixed cost.)

Do we have any options here to reduce the size of this memory regression other than backing the patch out?
Comment 18 Boris Zbarsky [:bz] (still a bit busy) 2013-01-24 08:45:33 PST
Um... read the bug?  ;)  Seriously, see comment 7.  Peter is working on it.
Comment 19 Justin Lebar (not reading bugmail) 2013-01-24 08:47:26 PST
(In reply to Boris Zbarsky (:bz) from comment #18)
> Um... read the bug?  ;)  Seriously, see comment 7.  Peter is working on it.

Sorry, I thought that wasn't targeted at Aurora.  Is it?
Comment 20 Boris Zbarsky [:bz] (still a bit busy) 2013-01-24 08:57:24 PST
It would be Aurora-only, actually.  On m-c we're just going to convert more elements and see how things look.

Note that since we're converting them a few at a time you won't see a big drop, but rather a bunch of small decreases.  Assuming our analysis of the situation is correct.

I suppose at the end we can test things by turning off the quickstub bits in a test build and seeing how much memory that wins at that point, if any.
Comment 21 Peter Van der Beken [:peterv] 2013-01-25 08:09:39 PST
Created attachment 706405 [details] [diff] [review]
Don't install new bindings for on* properties

With this MaxMemoryResidentForceGCV2 goes from 405344256 to 393580544.
Comment 22 Boris Zbarsky [:bz] (still a bit busy) 2013-01-25 08:55:55 PST
OK, so 12MB?  And we won another 6MB or so, afaict, when we converted the dromaeo-used elements.  So that sounds like most of the difference right there.
Comment 23 Peter Van der Beken [:peterv] 2013-01-25 11:43:46 PST
Comment on attachment 706405 [details] [diff] [review]
Don't install new bindings for on* properties

I think we should do this on aurora, and continue converting to new bindings on trunk.
Comment 24 Boris Zbarsky [:bz] (still a bit busy) 2013-01-25 11:53:28 PST
Comment on attachment 706405 [details] [diff] [review]
Don't install new bindings for on* properties

>+        JSBool ok;

This is never really used.  Just get rid of it, please.  So you can make the for loop look like this:

  for (JSPropertySpec* ps = props->specs; px->name; ++ps)

r=me with that.
Comment 25 Peter Van der Beken [:peterv] 2013-01-25 12:20:24 PST
Created attachment 706533 [details] [diff] [review]
Don't install new bindings for on* properties
Comment 26 Peter Van der Beken [:peterv] 2013-01-25 12:21:20 PST
Comment on attachment 706533 [details] [diff] [review]
Don't install new bindings for on* properties

We'll need a JS peer to sign off on the new friend api. We only want to add this to aurora though.
Comment 27 Nicholas Nethercote [:njn] 2013-01-25 12:39:21 PST
Comment on attachment 706533 [details] [diff] [review]
Don't install new bindings for on* properties

Review of attachment 706533 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks for fixing this, peterv!

::: js/src/jsapi.cpp
@@ +4021,5 @@
>  }
>  
> +namespace js {
> +
> +JS_FRIEND_API(bool)

Can you add a comment about the temporariness of this?  E.g. "This function was added on the Aurora branch as a short-term fix for bug 833518.  It's not present on trunk."
Comment 28 Nicholas Nethercote [:njn] 2013-01-25 12:40:10 PST
(In reply to Boris Zbarsky (:bz) from comment #22)
> OK, so 12MB?  And we won another 6MB or so, afaict, when we converted the
> dromaeo-used elements.

When/where did that happen?  Is it on Aurora?
Comment 29 Boris Zbarsky [:bz] (still a bit busy) 2013-01-25 12:41:33 PST
> When/where did that happen?

See comment 8.  The relevant bugs are bug 824823 and bug 824907

>  Is it on Aurora?

Yes.
Comment 30 Nicholas Nethercote [:njn] 2013-01-25 12:52:45 PST
johns, once peterv's patch has landed, is it possible to do an AWSY run of Aurora?  It'd be good to confirm the improvement on the actual branch.
Comment 31 Peter Van der Beken [:peterv] 2013-01-26 12:23:34 PST
Comment on attachment 706533 [details] [diff] [review]
Don't install new bindings for on* properties

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 818219
User impact if declined: higher memory usage
Testing completed (on m-c, etc.): pushed to try, was green. We're not planning on landing this on trunk because we want to fix it in a different way there.
Risk to taking this patch (and alternatives if risky): there's a small risk for web compat issues because we're switching some properties on some elements back to using XPConnect, which has small differences in behaviour. Because we've limited it to on* properties the risk should be pretty low.
String or UUID changes made by this patch: none, but adds a JS friend API that should only exist during one cycle
Comment 32 Lukas Blakk [:lsblakk] use ?needinfo 2013-01-30 12:47:45 PST
Comment on attachment 706533 [details] [diff] [review]
Don't install new bindings for on* properties

Approving for Aurora, understood that it will only land there and a different fix will be forthcoming for trunk.  Let's land this now so we can get some bake time before merge to Beta.
Comment 33 Peter Van der Beken [:peterv] 2013-01-31 07:03:30 PST
https://hg.mozilla.org/releases/mozilla-aurora/rev/6f2c8a3ad078
Comment 34 Nicholas Nethercote [:njn] 2013-02-05 19:57:18 PST
Created attachment 710493 [details]
AWSY on Feb 6

The attached AWSY screenshot shows that we're at about the same place we were between Dec 7--17, after this regression happened.  I.e. no net improvement on Nightly.  (Peter landed an improvement on Aurora, of course).

Are there many more conversions to happen between now and the end of this dev cycle (Feb 18)?  I wonder if we'll need to land Peter's fix on the next Aurora as well.
Comment 35 Boris Zbarsky [:bz] (still a bit busy) 2013-02-06 00:52:05 PST
> Are there many more conversions to happen between now and the end of this dev cycle

Yes.
Comment 36 Peter Van der Beken [:peterv] 2013-02-19 12:06:08 PST
A bunch more elements were converted. It does seem like we've slowly improved, I don't think we're completely back where we were but we're getting close. Personally I don't think we need to take this on the next aurora, but let us know if you disagree.
Comment 37 Nicholas Nethercote [:njn] 2013-02-19 13:54:19 PST
Created attachment 715688 [details]
AWSY on Feb 19

It is improving.  E.g. in the attachment, the blue line was about 250 MiB before the regression (the first jump on the left) and it's been creeping on the right side, and is now at about 260 MiB.
Comment 38 Andrew McCreight [:mccr8] 2013-02-19 14:29:42 PST
njn said he'd do a AWSY run to figure out how much the Aurora patch buys us now.

Peter, is there any particular reason not to apply this patch?
Comment 39 Boris Zbarsky [:bz] (still a bit busy) 2013-02-19 16:45:53 PST
Well, apart from the reason in comment 16?  Though for the on* properties the behavior differences are pretty subtle, I'd guess.
Comment 40 bhavana bajaj [:bajaj] 2013-03-25 23:31:37 PDT
(In reply to Peter Van der Beken [:peterv] from comment #31)
> Comment on attachment 706533 [details] [diff] [review]
> Don't install new bindings for on* properties
> 
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): bug 818219
> User impact if declined: higher memory usage
> Testing completed (on m-c, etc.): pushed to try, was green. We're not
> planning on landing this on trunk because we want to fix it in a different
> way there.

My understanding here is we were going to land something different on aurora.What's the status of the patch Or are we planning on landing the same patch we landed for Fx20 ? We are less than a week away from the merge and will be helpful to have the next steps here.
Comment 41 Nicholas Nethercote [:njn] 2013-03-26 02:15:04 PDT
I think we're close enough to where we were originally -- within about 4 MiB on AWSY -- that we don't need to do anything on Aurora.
Comment 42 Boris Zbarsky [:bz] (still a bit busy) 2013-04-01 19:52:57 PDT
I'm going to declare this fixed on 21, then.

Note You need to log in before you can comment on or make changes to this bug.