Closed
Bug 43457
Opened 24 years ago
Closed 24 years ago
MEM: Style Context Sharing needs to be improved to increase sharing and reduce performance degradation
Categories
(Core :: CSS Parsing and Computation, defect, P2)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
FIXED
Future
People
(Reporter: attinasi, Assigned: pierre)
References
(Blocks 1 open bug)
Details
(Keywords: embed, perf, Whiteboard: [whitebox])
Attachments
(21 files)
1.27 KB,
text/plain
|
Details | |
175.62 KB,
patch
|
Details | Diff | Splinter Review | |
1.02 KB,
text/plain
|
Details | |
255.84 KB,
patch
|
Details | Diff | Splinter Review | |
137.10 KB,
text/plain
|
Details | |
25.98 KB,
text/plain
|
Details | |
8.65 KB,
text/plain
|
Details | |
137.20 KB,
text/plain
|
Details | |
252.90 KB,
patch
|
Details | Diff | Splinter Review | |
257.70 KB,
patch
|
Details | Diff | Splinter Review | |
15.74 KB,
image/gif
|
Details | |
252.95 KB,
patch
|
Details | Diff | Splinter Review | |
11.40 KB,
text/plain
|
Details | |
6.31 KB,
patch
|
Details | Diff | Splinter Review | |
281.25 KB,
patch
|
Details | Diff | Splinter Review | |
11.96 KB,
patch
|
Details | Diff | Splinter Review | |
15.10 KB,
image/gif
|
Details | |
14.43 KB,
image/gif
|
Details | |
15.10 KB,
image/gif
|
Details | |
207.66 KB,
patch
|
Details | Diff | Splinter Review | |
5.93 KB,
patch
|
Details | Diff | Splinter Review |
Style Context Data sharing currently saves a lot of memory at the expense of a
moderate performance hit. We need to improve the sharing design to minimize the
performance hit and to increase the space savings.
Some initial ideas include:
- Allocate StyleContextData out of a Pool or Arena to minimize heap allocations
- Share style data across documents: presumably this means storing the style
context cache (or a style data cache) at a more global scope than the style set
currently used to root the collection.
- Enable the FastCaching algorithm: this requires closing the encapsulation-hole
of GetMutableStyleData and having the style system manage changes to style data
effectively (updating the CRC, updating the style context cache, possibly
validating the style data is consistent)
This bug is intended to track ideas and log results of experimentation and
prototyping of designs that improve style context sharing in terms of
performance and memory footprint (including bloat).
Reporter | ||
Comment 1•24 years ago
|
||
Added depends on bug 39618 since there is valuable information in that bug that
pertains to this one.
Reporter | ||
Comment 4•24 years ago
|
||
Moving all non-nsbeta3 bugs to future milestone: these will be worked on after
beta3/rtm.
Target Milestone: M20 → Future
Comment 5•24 years ago
|
||
adding myself to interest list
Reporter | ||
Comment 6•24 years ago
|
||
Setting milestone.
Adding additional idea too: Sharing is now all or nothing (all style data for a
context is shared or not). A finer granularity of sharing, where each style
struct can be shared seperately, would increase the sharing potential.
Target Milestone: Future → mozilla0.8
Assignee | ||
Comment 7•24 years ago
|
||
I broke up nsStyleSpacing into 4 smaller structs (nsStyleMargin, nsStylePadding,
nsStyleBorder and nsStyleOutline). It is the first step to reduce the memory
footprint as described in
news://news.mozilla.org/3A5A072F.5F0B752D%40netscape.com
Assignee | ||
Comment 8•24 years ago
|
||
Assignee | ||
Comment 9•24 years ago
|
||
Reporter | ||
Comment 10•24 years ago
|
||
I have reviewed the code and I have only a couple of minor questions/comments:
Zeroth, I'm amazed at the amount of changes, and the lack of problems - well done!
First, I noticed that nsButtonFrameRenderer uses two methods for getting the
borderAndPadding, one is to get border and padding individually and then combine
them, and the other is to use the new shortcut
eStyleStruct_BorderAndPaddingShortcut. Is this intentional? I would have thought
that the shortcut would be the way to go in all cases.
Secondly, I started seeing this #ifdef XP_MAC \ pragma mark - \ #endif construct
with no documentation: what is it, and should it be removed or commented?
Third, it looks like nsCSSRendering has an 'overloaded' argument type now, since
there are casts of aBorderStyle to nsStyleOutline. This is seriously ugly, and
should probably be changed to pass in both types and check for one being null.
At the very least, document the hackery.
Fourth, I saw several examples where there is a call like
'this->CalcBorderPadding(...)'. Why bother with the 'this->'? It is equivalent
to 'CalcBorderPadding(...)'. See the nsGfxCheckboxControlFrame.cpp diff for the
first example of this construct.
Fifth, nsHTMLReflowState.h introduced forward declarations for
nsStyleBorder,Margin, and Padding, but none was there previously for
nsStyleSpacing. I'm curious why we need the forward declarations now.
r=attinasi, assuming these points are addressed, especially point (3).
Assignee | ||
Comment 11•24 years ago
|
||
1) Fixed.
2) "#pragma mark -" puts separators in the list of functions of the Mac editor.
You can see them in other particularly long files that are often modified by Mac
folks.
3) Correct, I made uglier something that was already there. It's fixed.
4) Fixed, but this notation can be found in many places.
5) There *was* a forward declaration of nsStyleSpacing. You comments allowed me
to make a global search on nsStyleSpacing and to remove it entirely, including in
comments.
Reporter | ||
Comment 12•24 years ago
|
||
Performance regression caused by this checkin is in bug 66263
Reporter | ||
Comment 13•24 years ago
|
||
Important Comment from bug 66263
------ Additional Comments From Axel Hecht 2001-01-24 23:24 -------
This checkin has caused a regression in the XSLT code, which caused some worries
to both Peter and me over the last few days. The backing out made our life
happy again.
To be specific, we couldn't style a xhtml body tag anymore.
If code like the one backed out should go back in, could you (pierre?) please
speak out load, so we can test in advance?
Thanx
Axel
Reporter | ||
Comment 14•24 years ago
|
||
Moving to Mozilla 1.0 - this cannot be completed in the next week or two...
Target Milestone: mozilla0.8 → mozilla1.0
Comment 15•24 years ago
|
||
I tried to apply this patch (to test XSLT), to no avail.
I got around the whitespace mess, but the filenames are just not found.
I need a commandline to apply this on unix, I tried all the commandline opts
I know.
Btw, XSLT is targeted to go into the default builds for 0.9, if this patch is
the code that caused us to regress this will be pretty serious.
Axel
Comment 16•24 years ago
|
||
Is the performance problems solved now?
It seems that the patches have been checked in on Feb 7. Does XSLT work now?
This also caused some troubles for us trying to break content from layout, but I
believe peterv has solved them for that case(?)
Comment 18•24 years ago
|
||
XSLT didn't regress this time.
Axel
Reporter | ||
Comment 19•24 years ago
|
||
From an email about some profiling data:
Brendan Eich wrote:
- 27% of this time (9msec) is spent in 70 calls to
nsXBLBinding::GenerateAnonymousContent(). About half of this
time (5msec) goes to nsXMLElement::CloneNode() (copying the
binding).
- 20% of the time (17msec) is spent in
StyleSetImpl::ResolveStyleFor().
- Most of the time (9msec) is spent in 160 calls to
NS_NewStyleContext(). The single biggest consumer is
AccumlateCRC() (1msec).
1msec is not big in light of other numbers, but it's still a lot of cycles.
Marc, wouldn't a rotating xor a la
http://lxr.mozilla.org/mozilla/source/js/src/jsdhash.c#74 be sufficient? I
don't see the need for full-on Galois field arithmetic here! I'm all for making
the CRC calculation faster, however we do need to make sure that we still get a
good ration of unique CRCs to total CRCs. In fact, I bet we can also limit the
CRC calculation to only select fields int he StyleContextData structures, and
still get most of the benefit.
Me thinks that this is worth looking into - being carefule to track the CRC
uniqueness.
Assignee | ||
Comment 20•24 years ago
|
||
I will soon have another big checkin. Reassigned to myself.
Set milestone to mozilla0.8.1 (March 13th).
Assignee: attinasi → pierre
Status: ASSIGNED → NEW
Priority: P3 → P2
Summary: PERF: Style Context Sharing needs to be improved to increase sharing and reduce performance degradation → MEM: Style Context Sharing needs to be improved to increase sharing and reduce performance degradation
Target Milestone: mozilla1.0 → mozilla0.8.1
Assignee | ||
Comment 22•24 years ago
|
||
I've got something that apparently works nicely. The StyleData, which accounts
for about half the memory used by the style system, is reduced by 65% or 70%.
The overall reduction for the style system is around 35%, more on complex pages.
The reduction can reach a couple of hundreds K on pages like Netscape.com.
Apparently again, the performance wasn't hurt much, if at all. I'll post more
accurate data later. I'm going to attach the patch for other people to start
play with it.
Assignee | ||
Comment 23•24 years ago
|
||
Assignee | ||
Comment 24•24 years ago
|
||
Assignee | ||
Comment 25•24 years ago
|
||
For those who want to review the code or test the diffs, the basic principles are
the following:
1) Change of interfaces to modify the style data:
GetMutableStyleData() was replaced by 2 methods: ReadMutableStyleData() and
WriteMutableStyleData(). Look into nsIMutableStyleContext.h for more info on the
helper classes and the implementation.
2) Memory footprint reduction:
In nsStyleContext.cpp, the nsStyleContextData now contains pointers to the 15
individual style structures, instead of the structures themselves. When a
pointer is null, it means that we are inheriting the data from the parent.
3) Simplification of the code:
The 15 pointers in nsStyleContextData are stored as a union of individual
pointers and an array of generic pointers. The array allows to loop through the
structures instead of duplicating the code 15 times in switch() statements. This
simplification required a slightly unfortunate change in the definition of
nsStyleStruct (in nsStyleStruct.h) where we now expose methods that should be
hidden from everybody but the StyleContext. It's a small drawback for a big
gain.
Comment 26•24 years ago
|
||
I'm trying to build with the latest patch, and I get
mozilla/content/base/src/nsStyleContext.cpp:4218:35: nsStyleContextDebug.cpp: No
such file or directory
Did you miss a file in the diff?
Comment 27•24 years ago
|
||
In nsStyleContext.cpp, methods StyleDisplayImpl::SetFrom and
StyleDisplayImpl::CopyTo, you haven't made the same change within the #ifdef
IBMBIDI block that you made in the surrounding code.
Another tiny point: at the very beginning of the same file, there is a
regression from "Bidi" to "BiDi" in a comment. Please change this back to "Bidi"
to retain consistency.
Assignee | ||
Comment 28•24 years ago
|
||
Thanks to Mike and Simon for pointing out these errors. I'm going to attach the
files themselves instead of the diffs.
Note that the debug features in nsStyleContextDebug.cpp/.h can only be activated
on the Mac for now. Someone has to adapt IsTimeToDumpDebugData() for other
platforms.
Assignee | ||
Comment 29•24 years ago
|
||
Assignee | ||
Comment 30•24 years ago
|
||
Assignee | ||
Comment 31•24 years ago
|
||
Assignee | ||
Comment 32•24 years ago
|
||
nsStyleContextDebug.cpp and nsStyleContextDebug.h should be stored in the same
directory as nsStyleContext.cpp (mozilla/content/base/src)
Comment 33•24 years ago
|
||
Adding myself to the list.
Rather than exposing some methods, couldn't you make StyleContext a friend of
nsStyleStruct ? (Didn't look at the patch, so dunno if this makes sense.)
Comment 35•24 years ago
|
||
If you broke the XUL stuff, you'll know just by running the product. If a
navigator window and a prefs window comes up and looks ok, then the XUL struct
must work.
Assignee | ||
Comment 36•24 years ago
|
||
Heikki: I could have declared the methods private and make StyleContextImpl and
nsStyleContextData friends of nsStyleStruct but it seemed to me that it wasn't
much of an improvement. It would also require to move some debugging code into a
wrapper class and declare it as friend of nsStyleStruct.
Assignee | ||
Comment 37•24 years ago
|
||
Dave & others: Here is yet another version of nsStyleContext.cpp that fixes some
obvious problems with XUL...
Assignee | ||
Comment 38•24 years ago
|
||
Reporter | ||
Comment 39•24 years ago
|
||
Pierre, what testing have you done on this? Specifically, have you run the
style/block/table regression tests? Have you run John Morrison's Load Time
tests? Just curious.
Assignee | ||
Comment 40•24 years ago
|
||
I did not test against any test suites yet. Please check Block & Tables as they
require a Window machine and I'll take care of the CSS test suites.
The tests I did until now, and that I just completed, where about memory
footprint and performance in Debug and Optimized builds, before and after my
changes. As you can see, we have a slowdown of around 5% as a trade-off to
reduce the memory by 30% to 40% in the Style System overall, which represents
about 80k for Yahoo, 160k for CNN and Netscape and 275k for MSN.
The savings for the chrome are a bit disappointing, around 15%, which represents
64k for Navigator and 100k for Messenger. The percentage is so low in the chrome
because most of the memory allocated for the style system in XUL files is taken
by CSSDeclarations and StyleRules. The XUL files generate between 2 and 2.5
times the average number of declarations and style rules that we find in web
pages.
Slowdown/DBG(%) Slowdown/OPT(%) Footprint(%) Footprint(Kb)
Yahoo 2 2 -30 -83
Slashdot 4 6 -35 -140
CNN 5 0 -31 -158
Netscape 1 2 -35 -170
MSN 3 5 -40 -275
Navigator.xul n/a n/a -14 -64
Messenger.xul n/a n/a -18 -100
CSSDeclarationImpl CSSStyleRuleImpl
Yahoo 332 582
Slashdot 339 584
CNN 466 710
Netscape 380 625
MSN 402 644
Navigator.xul 928 1389
Messenger.xul 1002 1457
Comment 42•24 years ago
|
||
I see no r= or sr=, but understand from pierre that marc has not tested the
patch (or perhaps neither tested nor reviewed it? The message forwarded to me
isn't as clear as an r= in the bug).
Has anyone tried the patch on Linux?
Should these changes carpool in, to get verification and followup fixes if
needed? Today is not a good day to avoid a carpool.
/be
Assignee | ||
Comment 43•24 years ago
|
||
I posted a request for review last Monday but got basically no response. My
partner Marc is way too busy fighting other fires in Layout.
Could someone please help me to review these changes? I would especially
appreciate if someone could build and play a bit with Mozilla on Unix. Thanks!
Comment 44•24 years ago
|
||
Sounds like testing is more important than reviewing at this stage. Pierre, I
can try an up-to-date patch -- should I just detach the diffs and the whole
files and use them, or are they now slightly out of date with respect to top of
trunk?
Cc'ing jrgm, who should run the patch too, if he's able.
/be
Comment 45•24 years ago
|
||
To build it, I need (?):
attach_id=28072 -- diffs to layout/content, but remove the
nsStyleContext*.(h|cpp) stuff
attach_id=28118 -- nsStyleContextDebug.cpp
attach_id=28119 -- nsStyleContextDebug.h
attach_id=28142 -- nsStyleContext.cpp (updated version)
where the last three are whole files, and go in mozilla/content/base/src.
Is that correct?
I will do this on win32, as my linux box is (currently) the test
server, and my mac build-fu is weak.
However, if someone can, running the gecko perf tests (which are more specific
to this bug); and the block/table/layout regression tests would be good if
people are concerned.
Assignee | ||
Comment 46•24 years ago
|
||
I haven't build in the past 4 days but according to Bonsai & CVS, it should still
be ok. To merge the changes:
- apply the diffs
- add nsStyleContextDebug.cpp/.h to mozilla/content/base/src
- overwrite nsStyleContext.cpp
Comment 47•24 years ago
|
||
sorry. I didn't get to this quite yet, but I will do it by 3pm tomorrow.
Comment 48•24 years ago
|
||
once we get a build of some sort, we can coordinate to run the
block/table/layout regression tests which Pierre wasn't able to. I'll get this
coordinate with the cia@netscape.com group. Thanks.
Assignee | ||
Comment 49•24 years ago
|
||
I checked in nsStyleContextDebug.cpp/.h and updated my local tree. I'm going to
attach a new set of diffs in a single file. To try the new code, just do an
update and apply the patch.
Assignee | ||
Comment 50•24 years ago
|
||
Comment 51•24 years ago
|
||
Sorry, I finally got these run done. As pierre mentioned, the times are
about 4% slower, on either the first load (variable range), or a subsequent
visit when cached (fairly consistent across the board). (This was for a
tree pulled ~11pm last night, built, ran test, patched, built, ran test 2,
on win2k).
Comment 52•24 years ago
|
||
Hrm, that's a pretty steep price to pay. What were the savings again?
Comment 53•24 years ago
|
||
jrgm,
still got that build?
can you provide a pointer to it?
how 'bout we get curt to to a special mem test run to look at that graph?
Comment 54•24 years ago
|
||
Yes, I still have that build on win2k. It's not on a shared drive, but I
can make it available (eek! virus). But, those memory tests run on linux,
don't they? Are you really looking for a linux build (which waterson made).
Comment 55•24 years ago
|
||
I can potentially do a round of builds on three platforms for this once we get
scc's string changes tested and in.
Assignee | ||
Comment 56•24 years ago
|
||
I have a fix for the two regressions detected by QA in the test builds that were
made last week. I sent Leaf the new diffs this morning in order to generate new
test builds. I'm going to attach them there too.
Assignee | ||
Comment 57•24 years ago
|
||
Assignee | ||
Comment 58•24 years ago
|
||
People started to inquire about performance lately. Here is a copy...
Chris Waterson wrote:
>
> Asa Dotzler wrote:
>
> > It looks like Pierre is working to get the Style Context Sharing changes
> > landed for Mozilla 0.9. drivers is looking for input on this one. Do
> > you think it's something we want to do. Without further objection I
> > suspect that they are going to try to land it this week. I'm getting
> > ready to start waving off major landings until after 0.9 real soon now
> > but this could make it in sooner than that.
>
> > --Asa
>
> Pierre, have you fixed the performance problems yet? If so, then let's
> take the changes if you can land them by tomorrow.
>
> If not, then I think these should be waved off for mozilla-0.9. We're
> simply out of time to fix the performance problems. (I'd originally
> endorsed ``just landing the changes'' because the landing date was early
> last week.)
>
> My $0.02.
>
> chris
It was never in my plans to fix the 4% performance drop before landing
because nobody objected that the trade-off, memory footprint vs performance,
was not very much acceptable. Phil inquired about the perf yesterday. Here
is my response (you are mentioned in it, Chris). It also describes two ways
to improve the speed a bit: one which consists in tuning up the memory
allocators (sfraser usually does that on the Mac, I don't know for other
platforms), another in adding arenas.
As for checking in, I've been waiting for the Build team to make some test
builds since Thursday morning. They promised them for sometime today,
Tuesday. QA needs to do a last round of verifications, then I'll checkin.
When doing a first round of verification 10 days ago, QA found 2 regressions.
It's fixed now so if you like I can skip this second round of tests and just
checkin.
Pierre
Pierre Saslawsky wrote:
>
> Phil Peterson wrote:
>
> > Chris and Pierre, I heard some talk today that the CSS footprint work
> > comes with some performance degradation. Is this the case? If so, could
> > we talk sometime tomorrow about the details of how much heap space is
> > saved and how much performance is lost? I'd like to understand this
> > before the work gets checked in.
> >
> > -- Phil.
>
> Basically, we save 70% of the StyleData which represents half the footprint
> of the style system. It means that we save 30% to 40% of the entire memory
> used by the style system.
>
> The savings represent approximately:
>
> * 80k on Yahoo
> * 160k on Netscape and CNN
> * 275k on MSN
> * 64k per empty Navigator window (no sidebar)
> * 100k per empty Messenger window (no lists, just the chrome)
>
> I originally estimated the slowdown at less than 5%. John Morrison
> confirmed that it is in fact 4%. Note that this degradation has been
> measured without doing any optimization at all.
>
> Chris Waterson had the same question and concern as you:
>
> ------- Additional Comments From Chris Waterson 2001-03-27 23:59
> -------
> Hrm, that's a pretty steep price to pay. What were the savings
> again?
>
> When I gave him the numbers, his response was:
>
> Chris Waterson wrote:
> >
> > Wow, that's great.
> >
> > chris
>
> Anyhow, the easiest way to improve the performance a bit would be to tune
> up the memory allocators to reserve larger pools of blocks that match the
> individual style structures (we no longer allocate the StyleData as a
> single block of 800 bytes but as 14 or 15 smaller blocks that correspond to
> the style structures like font, margin, color etc...). Tuning up the
> memory allocators is something we should always do before shipping a
> version anyhow (and that we did, afaik). It would be a good idea to do it
> after I checkin as the pattern of the block sizes we allocate across the
> app would change quite a bit for at least 5 of them (out of the 15).
>
> Another way to improve the performance would be to get these style
> structures out of arenas. It involves a little bit of work. I haven't
> looked into it but at first sight it's something that can be done in a few
> days.
>
> Below are the comments and chart from John Morrison:
>
> ------- Additional Comments From John Morrison 2001-03-27 21:35
> -------
> Sorry, I finally got these run done. As pierre mentioned, the
> times are
> about 4% slower, on either the first load (variable range), or a
> subsequent
> visit when cached (fairly consistent across the board). (This was
> for a
> tree pulled ~11pm last night, built, ran test, patched, built,
> ran test 2,
> on win2k).
>
> More details are at http://bugzilla.mozilla.org/show_bug.cgi?id=43457
>
> Pierre
>
>
> [Image]
>
>
>
Assignee | ||
Comment 59•24 years ago
|
||
Assignee | ||
Comment 60•24 years ago
|
||
A point to consider is that when I landed the first part of these changes back in
February, I did a 1-line fix that improved the overall performance by 5%.
Could we consider the two parts of my changes as a whole, and say that a 5%
improvement minus a 4% slowdown is still a 1% improvement?
Another idea: why don't we land my changes and disable the style context sharing?
Mine reduce the StyleData by 30 to 40%, the style context sharing by 15 to 40%.
Mine cause a 4% performance drop across the board, the style context sharing a
10% performance drop inside the style system (http://bugzilla.mozilla.org/
showattachment.cgi?attach_id=10050). We need to figure out what was the overall
performance degradation of the style context sharing across the app. It was
enabled them on December 14th 2000 according to the bug report. Was there any
page load measurements taken back then?
Reporter | ||
Comment 61•24 years ago
|
||
Pierre has an interesting idea here, about disabling style context sharing when
we enable his style data sharing. However, it is not clear what the savings
would be for style data sharing if style context sharing is disabled - it may
not yield the same level of memory benefit. Also, style context sharing is
actually faster than what I had originally measured. This is largely due to the
faster StyleContextCache being put in, which I believe resulted in only a 5%
slowdown for Viewer (and even less for SeaMonkey, reported anecdotally). I'd
love to see what the performance cost of style conext sharing is for SeaMonkey,
and now we have the tools to do that (eg. jrgm).
I suggest running tests to characterize the performance and memory gains/losses
across two dimensions:
With and Without context sharing
With and Without data sharing
If we see that the footprint gains due to style context sharing alone are less
than the gains due to style data sharing alone, and the performance gain from
disabling style context sharing make style data sharing's performance loss
acceptable (or zero), then we should turn off context sharing and turn on style
data sharing. I think this is a good approach because unlike the style context
sharing can be enabled or disabled with a #define in one file, so it ie easy to
keep in the tree while we work on performance issues in style data sharing. (I
tried to make this more confusing, but this is as obfuscated as I could get it...)
Comment 62•24 years ago
|
||
[mid-air collision - resubmittig - however, I noted that marc raised similar
points to mine]
BTW, there hasn't been much comments about the interaction of your "StyleData
sharing" and the previous "StyleContext sharing". Could you brief us a little
bit on what is happening on this front? Is there any inter-dependence? When both
are used, do your changes entail an extra overhead in the implementation of the
style context sharing? If both are used, does that lead to "double" (or some
other factor) extra savings. Do your "low-level" styledata changes provide an
independent alternative that aptly removes the need for the most expensive
"higher-level" style-context sharing? [I am just asking these questions as an
indication of the things that some might find useful to understand. Thanks]
Assignee | ||
Comment 63•24 years ago
|
||
StyleContext and StyleData sharing are independant. I'll try to get some numbers
today.
Assignee | ||
Comment 64•24 years ago
|
||
Here is the first batch of numbers...
In terms of memory footprint, the StyleData sharing represents an improvement of
7% to 21% over the StyleContext sharing. I compared the overall size used by the
style system. If we compare just the style data, the difference is twice
as much. I collected the numbers from 3 test pages only: Yahoo, Netscape and
MSN.
Now I have to compare the performance. That's where I fear some problems because
we have much more memory allocations with the StyleData sharing. We'll see... It
will take a couple of hours: I have to make some optimized builds.
================================================================
Count TotalSize MinSize MaxSize AvgSize
================================================================
Yahoo - StyleContext sharing only
StyleContextData 177 135228 764 764 764
StyleContextImpl 295 20954 56 110 71
*** Total *** 2459 291325
----------------------------------------------------------------
Yahoo - StyleData sharing only
StyleContextData 293 74164 68 884 253
StyleContextImpl 293 20810 56 110 71
*** Total *** 2573 230053
================================================================
Netscape - StyleContext sharing only
StyleContextData 369 281916 764 764 764
StyleContextImpl 1035 72902 56 138 70
*** Total *** 3613 508247
----------------------------------------------------------------
Netscape - StyleData sharing only
StyleContextData 1035 247012 68 884 238
StyleContextImpl 1035 72902 56 138 70
*** Total *** 4279 473279
================================================================
MSN - StyleContext sharing only
StyleContextData 587 448468 764 764 764
StyleContextImpl 1418 105952 56 110 74
*** Total *** 4324 716391
----------------------------------------------------------------
MSN - StyleData sharing only
StyleContextData 1411 319808 68 884 226
StyleContextImpl 1411 105464 56 110 74
*** Total *** 5141 587179
================================================================
Results:
Yahoo Data/Context = 230053/291325 = 79% or -21%
Netscape Data/Context = 473279/508247 = 93% or -7%
MSN Data/Context = 587179/716391 = 82% or -18%
Comment 65•24 years ago
|
||
A quick food for thought: if the style data are shared (i.e., using pointers to
the actual values), then may be the style context sharing logic can be re-worked
a little bit so as to determine the sharing based on these shared data pointers.
This way, no cost will be incurred in accummulating the CRC with the actual
values. Seems like combining the two in this way might be of some help in the
-4% perf problem.
Assignee | ||
Comment 66•24 years ago
|
||
It's already the way it works now: the CRC is not computed for these style data
that are inherited from the parent. But, wait, I've got some good news.
Here is the second batch of numbers...
In terms of performance, the StyleData sharing represents an improvement of
2% to 4% over the StyleContext sharing. I compared the overall performance
of the application. If we compare just the time spent inside the style system,
the difference should be at least 3 times higher. I collected the numbers
from 2 test pages only: Yahoo and CNN. I could not test Netscape and MSN
like before because of a regression that was in the tree when I checked
out last week, which causes some pages not to post a notification when the
page is fully loaded.
It appears that the removal of CRC calculations and table lookup that
we have in the StyleContext sharing has offset the larger number of
memory allocations in the StyleData sharing.
It's important to note that in my version, the nsStyleContextData is
still allocated separately from the StyleContextImpl. If I put it
this structure back into the StyleContextImpl, we'll allocate larger
blocks overall but half as many. This could represent an additional
improvement in performance. I'll see if I can work something out
relatively quickly.
==================================================================
Yahoo - StyleContext sharing only
814+812+817+817+814+815+815+815+811/9 = 814 ms
------------------------------------------------------------------
Yahoo - StyleData sharing only
801+792+794+797+793+796+796+799+796/9 = 796 ms
==================================================================
CNN - StyleContext sharing only
1315+1301+1363+1275+1320+1309+1262+1277+1274+1301/10 = 1300 ms
------------------------------------------------------------------
CNN - StyleData sharing only
1238+1242+1241+1246+1254+1260+1257+1260+1244+1256/10 = 1250 ms
==================================================================
Results:
Yahoo StyleData/StyleContext = 796/814 = 97.8% or -2.2%
CNN StyleData/StyleContext = 1250/1300 = 96.2% or -3.8%
Note: In both cases, I dropped 1 out of the 10 measures taken with Yahoo
because it was clearly out of bounds (something like 900ms instead of 800).
Assignee | ||
Comment 67•24 years ago
|
||
Conclusion:
Without any tune-up of any kind, the StyleData sharing represents improvements
over the StyleContext sharing of:
7% to 21% in terms of memory footprint in the style system overall,
2% to 4% in terms of performance across the application.
So... may I checkin now? I would feel more comfortable with another round
verifications but that would require making it a priority for the Build team to
generate some builds on Win32 and Unix, and for QA to play with them a little
bit. Otherwise if the time is running short, I can just checkin. After all,
that's the way everybody else proceeds.
Who wants to sign on this?
Comment 68•24 years ago
|
||
Assume the following situation:
- you build your own mozilla
- you have plenty of memory
- you want to tune for speed instead of size
Will there be a way to turn off these two things (style data sharing / style
context sharing)? Or is the performance loss irreversible?
Comment 69•24 years ago
|
||
From pierre analysis, it appears that StyleData sharing is faster and more
memory efficient than StyleContext sharing. This can be understood that there is
usually more chance for two things (style contexts) to share a common "subpart"
(style data) than to be identical as a whole. Plus, there is a cost in
determining if sharing is possible in the first place. And it appears that this
determination is less expensive with StyleData sharing that with StyleContext
sharing.
So pierre, what is your call, do you recommend to enable StyleData sharing and
disable StyleContext sharing by default?
Assignee | ||
Comment 70•24 years ago
|
||
Sure, it's smaller and faster. I'm fairly confident it won't break anything
because we already went through a first round of QA verifications. But a second
one wouldn't hurt either. I'll let the drivers decide how to proceed because it
would require the Build and QA team to spend some extra time on it.
And of course, I'll be glad to check in. I don't like to have the code sitting
in my local tree. If it's checked in, it would allow me or others to start
playing with an idea I had earlier this week. I'll post it here when I can get
access to my email account again, or maybe I'll open a related bug report.
Basically it consists in storing the style data into hash tables. It would be a
killer in terms of memory footprint but there are some risks in terms of
performance. It's worth doing some instrumentation and get some numbers at
least. After moz0.9, of course.
Assignee | ||
Comment 71•24 years ago
|
||
To answer Andreas's question, I think it is out of the question to disable both
optimizations. According to Marc in bug 39618, the StyleContexts + StyleData on
CNN used to take about 1Mb. With either method we can divide it by 3.
Comment 72•24 years ago
|
||
linux bits in http://ftp.mozilla.org/pub/mozilla/nightly/latest-43457/
windows builds are running now, and will appear in the same directory.
Comment 73•24 years ago
|
||
I tried out the win32-talback drop (in fact I am sending this comment with it).
No weirdness to report so far. Yep, pierre, now that instrumentation has shown
an improved perf, your call makes sense. Indeed, the style data sharing is
"inclusive" of the style context sharing (i.e., the very fact that two style
contexts are identical means that all their style data are indentical and shared).
By single-mindely focussing on style data sharing, maybe more consolidation and
efficiency can be implemented in due time before 1.0.
Comment 74•24 years ago
|
||
So pierre is measuring a speedup, yet at one point I thought Netcsape QA
measured a slowdown (I'm guessing with jrgm's test suite). Pierre, I presume
you've only tried this on Mac? Might other platforms be affected differently?
(e.g., because malloc() is more expensive on Win32/Linux due to lock
contention?) It'd be good to undesrtand why we're seeing different reports.
(Maybe the delta is just too small to realisitically resolve with our tools, in
which case, performance should be considered a non-issue.)
Out of curiosity, what do you see on a larger pages? (e.g.,
nsCSSFrameConstructor.cpp in LXR, tinderbox, table stress test.)
Do we know the impact of this stuff on chrome? (Menus, tree, hover, tooltips,
etc.) Maybe hyatt could help you assess that.
Are you still looking for r= and sr=? Or are those covered?
Comment 75•24 years ago
|
||
Apparently this is the first time it is tested in isolation (without SC sharing)
Assignee | ||
Comment 76•24 years ago
|
||
Chris,
- It is indeed the first time StyleData sharing is tested independently from
StyleContext sharing.
- I tested on the Mac only.
- Results may vary depending on your platform but if experience is of any
indication, they should really not vary by a lot. For instance, when I predicted
a slowdown lower than 5% for SC + SD sharing on the Mac, jrgm found around 4% on
other platforms.
- I will test larger pages later today.
- I don't know the impact on the chrome but if experience is of any indication,
it will be good.
- There has not been any formal review because _you_ and other people declined to
reply to my repeated requests. So with Brendan's suggestion, I went through the
process of a landing, with verification builds handed over to QA before checking
in. I also supported Marc's standpoint that for these changes a thorough testing
would be much more important that a formal review. Anyhow, if you'd like to
review now, please be my guest: the diffs are attached. If you want the exact
code, I can attach new diffs where StyleContext sharing is disabled.
Comment 77•24 years ago
|
||
Were jrgm's tests run using optimized builds with these changes, or debug? Was
the comparison made to a debug or optimized build? Could that account for the
performance hit?
Comment 78•24 years ago
|
||
pierre, I apologize if you felt that I ignored you. (IIRC, when you asked me to
sr=, I replied, asking if Marc liked the changes. You said something along the
lines of ``let's just get a Unix build to work first''.) If you'd still like me
to sr= these changes, attach the latest diffs to the bug and I'll go through
them.
Assignee | ||
Comment 79•24 years ago
|
||
Leaf, all the performance measurements are done with optimized bits. The debug
builds are at least 30% slower, often much more than that.
Chris, thanks for your offer. I'll attach the diffs within the next hour or so.
Look at my comments from [2001-03-18 19:30] for more info.
Comment 80•24 years ago
|
||
So to make sure I understand, the current plan is to disable style context
sharing and enable style data sharing.
Assignee | ||
Comment 81•24 years ago
|
||
Yes, the plan is to disable StyleContext sharing and use StyleData sharing
instead because StyleData sharing results in a smaller memory footprint and a
better performance.
We'll address the performance problem and re-enable StyleContext sharing later.
The benefit of having StyleContext sharing together with StyleData sharing is
that both optimizations work separately. StyleData sharing reduces the memory
used by the nsStyleContextData by 70%, while StyleContext sharing reduces it by
60% or so. The two reductions are almost cumulative in real life tests.
Chris: I'm going to attach new diffs with StyleData sharing only. I had to do a
complete update (meaning: you shouldn't have any problem to apply the patch).
Assignee | ||
Comment 82•24 years ago
|
||
Reporter | ||
Comment 83•24 years ago
|
||
Pierre, I'm confused by the new patch (30514)... Are you just removing the
#define SHARE_STYLECONTEXTS or removing the context sharing code? For example,
what happended to the GETSCDATA macros? Like I said, I'm confused.
Reporter | ||
Comment 84•24 years ago
|
||
Never mind - I see what is going on. Sorry for the noise.
Comment 85•24 years ago
|
||
I think you should talk to scc about the nsMutableStyleXXX stuff. I'd
bet he could help you design a solution that avoids the C-style casts,
and is therefore type-safe. As it stands, you're forcing a programmer
to circumvent the compiler's type-checking mechanisms to use this
stuff!
Personally, I'd have overridden |operator->()| and |operator*()|; e.g.,
class nsMutableStyleBorder {
public:
nsMutableStyleBorder(nsIMutableStyleContext*) {...}
nsMutableStyleBorder(nsIStyleContext*) {...}
nsStyleBorder* operator->() {
return NS_STATIC_CAST(nsStyleBorder*, mStyleStruct); }
const nsStyleBorder* operator->() const {
return NS_STATIC_CAST(const nsStyleBorder*, mStyleStruct); }
nsStyleBorder& operator*() {
return *NS_STATIC_CAST(nsStyleBorder*, mStyleStruct); }
const nsStyleBorder& operator*() const {
return *NS_STATIC_CAST(const nsStyleBorder*, mStyleString); }
};
s.t. a user could do:
nsMutableStyleBorder styleBorder(mStyleContext);
styleBorder->mTextAlign = /*whatever*/;
MapHTMLBorderStyle(aPresContext, *styleBorder, tableFrame);
It might even read more nicely as
nsMutableStyleBorder styleBorder = GetStyleBorderFrom(mStyleContext);
Kinda makes it more clear that you're extracting something from the
style context, rather than ``constructing'' something. (Maybe
not. Take it or leave it.)
You could probably do this with templates and typedefs with a lot less
typing, too. (But that's really scc's domain...)
+#ifdef XP_MAC
+#pragma mark -
+#endif
What does this do?
struct nsStyleStruct {
+ // These declarations should not be here but they make
+ // the code in nsStyleContext.cpp so much easier!...
+ // It allows to store the style data as a union of individual
+ // data structures and an array of generic data structures.
+ virtual void ResetFrom(const nsStyleStruct* aParent, nsIPresContext*
aPresContext) {};
+ virtual void CopyTo(nsStyleStruct* aDest) const {};
+ virtual PRInt32 CalcDifference(const nsStyleStruct* aOther) const {return
0;};
+ virtual PRUint32 ComputeCRC32(PRUint32 aCrc) const {return 0;};
};
Should these be pure virtual? Or do you really have subclasses that
delegate up to these implementations?
Is AssignFrom() a better name than ResetFrom()? Would symmetry be
better if you had a single |CopyTo()| method to copy structures, and a
|GetDefaultsFrom(nsIPresContext*)| to fill in default values? e.g.,
if (aParent)
aParent->CopyTo(STYLESTRUCT[i]);
else
STYLESTRUCT[i]->GetDefaultsFrom(aPresContext);
There are a couple places where you currently do
STYLESTRUCT[i]->CopyFrom(nsnull, aPresContext);
which is a bit weird, and might make more sense using the
GetDefaultsFrom() scheme. Certainly gets rid of a bit of duplicate
assignment code.
+ switch (aSID) {
+ case eStyleStruct_Font:
+ if (aPresContext) {
+ result = new DATAIMPL(Font)(aPresContext->GetDefaultFontDeprecated(),
aPresContext->GetDefaultFixedFontDeprecated());
+ }
+ else {
+ result = new DATAIMPL(Font);
+ }
+ break;
+ case eStyleStruct_Color: result = new DATAIMPL(Color);
break;
+ case eStyleStruct_List: result = new DATAIMPL(List);
break;
+ case eStyleStruct_Position: result = new DATAIMPL(Position);
break;
+ case eStyleStruct_Text: result = new DATAIMPL(Text);
break;
+ case eStyleStruct_Display: result = new DATAIMPL(Display);
break;
+ case eStyleStruct_Table: result = new DATAIMPL(Table);
break;
+ case eStyleStruct_Content: result = new DATAIMPL(Content);
break;
+ case eStyleStruct_UserInterface: result = new DATAIMPL(UserInterface);
break;
+ case eStyleStruct_Print: result = new DATAIMPL(Print);
break;
+ case eStyleStruct_Margin: result = new DATAIMPL(Margin);
break;
+ case eStyleStruct_Padding: result = new DATAIMPL(Padding);
break;
+ case eStyleStruct_Border: result = new DATAIMPL(Border);
break;
+ case eStyleStruct_Outline: result = new DATAIMPL(Outline);
break;
Could you use a function table here to avoid the fact that switch
degenerates to branch-and-test? (Not sure if this is called a lot; if
not, don't worry about it.)
+static PRBool CanShareStyleData(nsIStyleContext* aStyleContext)
{
-#define max_structs eStyleStruct_Max
+ PRBool bSharingSupported = PR_TRUE;
Hungarian notation. Yuck.
+nsStyleStruct* gStructPointer[eStyleStruct_Max][kStructIndexMax];
+bool gStructBusy[eStyleStruct_Max][kStructIndexMax];
Please use PRBool, or PRPackedBool.
+ static bool initialized = false;
+ if (!initialized) {
Ibid.
+ short structType = aSID - 1;
+ for (short structIndex = 0; structIndex < kStructIndexMax; structIndex++) {
+ if (!gStructBusy[structType][structIndex]) {
+ result = gStructPointer[structType][structIndex];
+ gStructBusy[structType][structIndex] = true;
PR_TRUE.
+ NS_ASSERTION(false, "Can't release mutable style struct");
NS_ERROR(...)
I don't really understand why you've decided to pool the style structs
for purposes of nsMutableStyleXXX reading and writing? Why didn't you
just make a member variable in each nsMutableStyleXXX class that would
serve as the temporary storage for reading and writing the struct?
+ if (aSID == eStyleStruct_BorderPaddingShortcut) {
+/*
+ nsMargin border, padding;
+ if (GETDATA(Border)->GetBorder(border)) {
+ if (GETDATA(Padding)->GetPadding(padding)) {
+ border += padding;
+ ((nsStyleBorderPadding&)aStruct).SetBorderPadding(border);
+ }
+ }
+*/
+ nsMargin border, padding;
Use #if 0, and comment as to why this is not currently part of the code.
Index: mozilla/layout/xul/base/src/outliner/src/nsOutlinerBodyFrame.h
===================================================================
RCS file:
/m/pub/mozilla/layout/xul/base/src/outliner/src/nsOutlinerBodyFrame.h,v
retrieving revision 1.25
diff -u -2 -r1.25 nsOutlinerBodyFrame.h
--- nsOutlinerBodyFrame.h 2001/04/11 02:44:25 1.25
+++ nsOutlinerBodyFrame.h 2001/04/12 00:15:59
@@ -29,4 +29,5 @@
#include "nsIScrollbarMediator.h"
#include "nsIWidget.h"
+#include "nsHashtable.h"
Why does nsOutlineBodyFrame now need nsHashTable.h?
Ok, that's all I can do for now. I'll hope marc has covered the hard stuff. :-)
Comment 86•24 years ago
|
||
I'm gonna have to back Chris up on this one, Pierre. The preferred way to make
an object act like a pointer to another object is to implement |operator->| and
|operator*|. For a non-pointer object, these have no built-in implementation,
so you're not removing functionality. |operator&| is defined for _every_
object, so redefining that takes away a power permanently, and for no reason.
Someday, someone might want to pass one by address. If you override
|operator&|, they can't. And you can do the same job better (more easily used
by clients) with |operator->| and |operator*|; plus it'll be that much easier
for someone trying to understand your code, because that's the idiom for
smart-pointer style implementations.
Two other things Chris pointed out are worth re-iterating: (1) avoid C-style
casts; new style casts catch errors that old casts hide; and (2) |bool|, |true|,
and |false|, aren't defined on every platform I'm ashamed to say.
Good use of the stack-based transaction idiom for style modification though. I
like it. I only looked at the auto-style objects; I've got to go examine the
mutable-style cast situation now. More comments when I have a better
understanding of the big picture.
Assignee | ||
Comment 87•24 years ago
|
||
1) Override the operators in nsMutableStyleXXX instead of nsAutoStyleStruct:
----------------------------------------------------------------------------
scc was copied on the review requests too. My fault if I wasn't more specific
when I contacted him because that's where his input would have been the most
valuable: your proposal is the kind of solution that I hoped he had made. I like
the solution but if you don't mind, I'd prefer to implement it after checking in
because it is only a syntactical change to enforce the type-checking.
2) GetStyleXXXFrom(mStyleContext):
----------------------------------
I prefer the current notation where it is clear that the helper classes are in
fact constructing something, because it is the destructor that causes the
modified data to be written in the style context. You might have seen a couple
of places, once in Table code and maybe once in Block layout, where I added
blocks {...} to make sure the data was written before we continued the
processing.
3) Templates:
-------------
Same thing as (1). I'd say let's checkin and work on the notation later if scc
comes with a better idea.
4) #pragma mark -:
------------------
It draws a separator in the list of functions that pops up in the CodeWarrior
editor. Very convenient... You can see it in large files written or maintained
by folks who use the Mac as main developement machine.
5) Pure virtual functions in nsStyleStruct:
-------------------------------------------
It was my idea at first, of course. Unfortunately it would have required all the
classes that inherit from nsStyleStruct to declare and implement these methods.
The breakup of Layout into 2 DLLs made the task a bit more complex, but
especially the result would have been much worse.
6) Is AssignFrom() a better name than ResetFrom()?
--------------------------------------------------
Maybe... It has been there forever, I kept it. I did not like it much at first
too but I got used to it. Other names could have been InheritFrom() or
InitializeFrom(). I usually try to be flexible when I work on code developed by
someone else, and after all ResetFrom() makes sense too.
7) Symetry of ResetFrom() and CopyTo():
---------------------------------------
There is no symetry between ResetFrom() and CopyTo(). The symetrical function to
CopyTo() was SetFrom(). It was so symetrical that the code was duplicated (but
reversed) in the 15 implementations. One function copied member variables from |
this| to the destination; the other function from the origin to |this|. It's
something I described to Marc and you 2 or 3 weeks ago: I removed
nsIMutableStyleContext::SetStyle() which wasn't used anywhere and it allowed me
to remove the 15 SetFrom() methods, and replace the calls to SetFrom() with calls
to CopyTo() in the 2 places where it was done. It doesn't sound obvious at first
but the two issues were indeed related.
8) Replace ResetFrom() with CopyTo() + GetDefaultsFrom():
---------------------------------------------------------
We cannot replace ResetFrom() with the following lines, as you suggest.
if (aParent)
aParent->CopyTo(STYLESTRUCT[i]);
else
STYLESTRUCT[i]->GetDefaultsFrom(aPresContext);
ResetFrom() defines the inheritance. Some properties are inherited, some others
are not. We cannot replace that with a CopyTo().
9) CopyFrom(nsnull):
--------------------
There is no CopyFrom(nsnull), but we have some ResetFrom(nsnull) and they make
sense. It means that we create a style structure that doesn't inherit from any
parent. We need this, for instance, to resolve the style within tables in quirks
mode (Nav4x tables act almost like embedded documents: they don't inherit
anything but a couple of properties).
10) Replace switch() with jump table:
-------------------------------------
I believe good compilers do it automatically when the number of cases within the
switch() reaches a certain threshold.
11) Yucky "bSharingSupported" Hungarian notation:
-------------------------------------------------
Yes, I agree. Marc indeed developed a few bad habits in previous lives but they
are so negligible compared to the quantity and the quality of the work he
otherwise produces that I would have felt really bad pointing out that kind of
things to him. So I kept the name.
12) Replace bool with PRBool:
-----------------------------
Correct.
13) Replace ASSERTION(false) with NS_ERROR():
---------------------------------------------
Correct.
14) Member variable instead of pooling:
---------------------------------------
To minimize the impact on the existing code, I wanted ReadMutableStyleData() to
behave exactly like the current GetMutableStyleData() with the only difference
that the caller must call WriteMutableStyleData() when he's done with the
changes.
15) Commented-out code:
-----------------------
Correct: I should have removed the these lines instead of commenting them out.
16) nsHashTable.h in nsOutlineBodyFrame:
----------------------------------------
nsOutlineBodyFrame always used a nsHashTable but it was getting the declaration
from nsIStyleSet.h. The problem is that nsStyleSet.h doesn't include
nsHashTable.h when SHARE_STYLECONTEXTS is not defined so when I disabled the
StyleContext sharing, I prefered to add the #include "nsHashTable.h" in
nsOutlineBodyFrame which is the real customer, instead of always including it in
nsIStyleSet.h, even when not needed.
====
Thanks a bunch for this very thorough review. The actions I'm proposing to take
are:
1) Give some builds with the current code to jrgm to evaluate the performance.
My own evaluation shows that we can expect a small gain but following your
recommendation, I'm going to test with very large test pages.
2) Fix the small points: #12 + #13 + #15.
3) Checkin if that's ok with everybody.
4) Ask scc if he's got a better solution than the one you proposed in #1 for
type-checkin, plus possibly something with templates as suggested in #3.
5) Check in these changes for type-checking whenever we want or can.
6) Continue our merry way towards new adventures.
I'm not planning to change anything on the following points: #4, #7, #8, #9, #14,
#16.
I think my explanations are sufficient on #5 but I may adapt if someone comes up
with a better solution. I promise I tried very hard.
I think my explanations are sufficient on the following points but I will adapt
if you insist: #2, #6, #10, #11.
Assignee | ||
Comment 88•24 years ago
|
||
Scott: I posted my comments before I read yours. I agree with you and Chris on
#1. If you prefer, I can implement it before checking in but since it doesn't
change anything to the execution, I would not want that to cause another round of
verification builds for QA.
About the mutable style, my main guideline in all the declarations was to have as
little an impact as possible in the code outside the StyleContext.
GetMutableStyleData() is called in approximately 130 places in 35 files, and the
pointers to the mutable style are used in probably hundreds of places (and
fortunately nowhere in a persistent way).
The thing that will make you frown the most is probably #5. If you want to give
it a try, you're welcome. I sincerely tried many solutions with pure virtual
methods or just virtual methods (like having a second base class that the
StyleXXXImpl structures could inherit from, etc...) but given the facts that:
- customers had to be able to use an object created by the StyleContext
(ReadMutableStyleData)
- the StyleContext had to be able to use an object created by the customer
(GetStyle)
- customers could be in another DLL (Layout)
I believe I ended up with a very honorable solution but if there is a better one,
I'm confident you will find it (beware of crashes when virtual tables don't
match).
Comment 89•24 years ago
|
||
I will cast my vote for #1 to be addressed before check-in. Motivation: strike
the iron while it is hot... Now that all eyes are on this, let's make (or at
least try to make) the most of the brainstorming that is going on and the
eagerness around w.r.t. reaping the savings of this style data sharing.
Comment 90•24 years ago
|
||
> 1) Override the operators in nsMutableStyleXXX instead of nsAutoStyleStruct:
What rbs said: you're essentially redefining the API at this point (for all
practical purposes, callers must use your nsMutableStyleXXX objects). If we
don't fix this now, you'll be chasing the tree as people begin to start writing
new code that uses the new API.
One thing that I forgot to ask last night was, what is the linkage impact of
these new classes? Does layout need to link against content now? (I guess not,
since you've not changed any makefiles AFAICT.) Which would mean the code
probably gets duplicated in each of content and layout. (Tangent: we need to
redouble our efforts to get content & layout converted to using
NS_IMPL_NSGETMODULE so they can be recombined for release.)
> 2) GetStyleXXXFrom(mStyleContext):
> ----------------------------------
> I prefer the current notation where it is clear that the helper classes are in
> fact constructing something,
Agreed. That makes sense.
> 5) Pure virtual functions in nsStyleStruct:
> -------------------------------------------
> It was my idea at first, of course. Unfortunately it would have required
> all the classes that inherit from nsStyleStruct to declare and implement
> these methods.
What style structs *don't* need this? Why?
> 14) Member variable instead of pooling:
> ---------------------------------------
> To minimize the impact on the existing code, I wanted ReadMutableStyleData()
> to behave exactly like the current GetMutableStyleData() with the only
> difference that the caller must call WriteMutableStyleData() when he's
> done with the changes.
But because of the nsMutableStyleXXX classes, no sane person would call
[Read|Write]MutableStyleData() by hand, would they? (Are we bending over
backwards to support a non-use case?) I'll have to read the code more carefully
to really understand what you're trying to do here. More comments later.
Assignee | ||
Comment 91•24 years ago
|
||
1) Override the operators in nsMutableStyleXXX instead of nsAutoStyleStruct:
----------------------------------------------------------------------------
Agreed but for the reason rbs mentions.
I did redefine the API but not the way you seem to think.
The previous API was:
nsStyleStruct* GetMutableStyleData(nsStyleStructID aSID)
Now it is:
nsStyleStruct* ReadMutableStyleData(nsStyleStructID aSID)
nsresult WriteMutableStyleData(nsStyleStructID aSID, nsStyleStruct*
aStyleStruct)
The difference being that everywhere there is a Get(), we have to replace it with
a pair of Read()/Write(). The nsMutableStyleXXX objects were defined only as
helper classes to ease the transition of the existing code to the new API. My
goal was to change all the GetMutableStyleData() to something else that fits in
only 1 or 2 lines. I did not want the Write() calls to appear because it would
have made the task much more complex (ie. make sure Write() is called before
every |return| or |break| etc...)
However callers are not required to use the helper classes. The following
snippet is perfectly valid:
nsStyleBorder* border = mContext->ReadMutableStyleData(eStyleStruct_Border);
border->SetBorderStyle(0,NS_STYLE_BORDER_STYLE_SOLID);
nsresult result = mContext->WriteMutableStyleData(eStyleStruct_Border, border);
1.1) Linkage impact:
--------------------
None.
1.2) Duplication of the code:
-----------------------------
Yes, the code of the helper classes is duplicated in Content and Layout. It
doesn't take much space, it is fairly straightforward and it doesn't do much
anyhow (for those who haven't looked at the diffs, the helper classes call
Read()/Write() when the objects are created/destructed).
<gentle_settling_of_accounts>
Honestly it doesn't bother me as much (and far from it) as the big duplication of
"real" code that you approved in nsIStyleContext.h, without my reviewing, in
order to enable the split between Content and Layout: http://bonsai.mozilla.org/
cvsview2.cgi?diff_mode=context&whitespace_mode=show&file=nsIStyleContext.h&root=/
cvsroot&subdir=mozilla/layout/base/public&command=DIFF_FRAMESET&rev1=3.88&rev2=
3.89
</gentle_settling_of_accounts>
5) Pure virtual functions in nsStyleStruct:
-------------------------------------------
I haven't made myself clear, sorry. One of the aspects of the problem is related
to the linkage (or absence of) between Content and Layout. GetMutableStyleData()
is called in a few places in Layout (in the CSSFrameConstructor mainly) so if
nsStyleStruct had pure virtual functions, we would have to link the two
libraries.
14) Member variable instead of pooling:
---------------------------------------
See the code snippet in #1. The helper classes are not mandatory, not using them
removes a little overhead. Some folks may like the "Read/Modify/Write" paradigm
better, after all it's a very common one in computing. Besides we can't always
force people to do the way we like: nsCOMPtrs have been there for ages and we can
still see some NS_RELEASE() checked in everyday.
----
As an incidental note, I very much enjoy this particular code-review. French
people have a thing about arguing. I wish all the big changes were under the
same scrutiny (as much as I wish the small ones were given some slack, but that's
another debate) so thank you, and scc and rbs, for your input.
Comment 92•24 years ago
|
||
pierre: my 2 cents, but they were gained over years at much greater expense:
Don't treat the Read/Write pattern as a good low-level device for the masses.
Do consider making it private/protected/friend/whatever-restricts-it and
mandating the helper classes. nsXPIDLString, nsCOMPtr, etc. and their lack show
the folly of letting low-level, relatively-error-prone tools loose in a large,
mixed-strength community such as Mozilla's.
What is the overhead of the helper classes? It should be noise to zero, I hope.
/be
Comment 93•24 years ago
|
||
> <gentle_settling_of_accounts>
Touche!
> 5) Pure virtual functions in nsStyleStruct:
> ------------------------------------------
> I haven't made myself clear, sorry. One of the aspects of the problem is
> related to the linkage (or absence of) between Content and Layout.
> GetMutableStyleData() is called in a few places in Layout (in the
> CSSFrameConstructor mainly) so if nsStyleStruct had pure virtual functions,
> we would have to link the two libraries.
I don't understand. Pure virtual methods should *eliminate* linkage
dependencies, not *cause* them. Could you please elaborate?
> 14) Member variable instead of pooling:
> ---------------------------------------
> See the code snippet in #1. The helper classes are not mandatory, not
> using them removes a little overhead. Some folks may like the
> "Read/Modify/Write" paradigm better, after all it's a very common one
> in computing. Besides we can't always force people to do the way we like:
> nsCOMPtrs have been there for ages and we can still see some NS_RELEASE()
> checked in everyday.
But my point is, your scheme isn't required to support the read/modify/write
paradigm. I believe that I've now understood the code a bit better, and I
think I see what you're trying to accomplish now. (I think I could even guess at
the code geneology, and how you've arrived at the implementation you did.)
I think you should:
1. Change ReadStyleData() to copy the style data into a caller-provided
struct. WriteStyleData() will write data back from the struct.
2. Your auto-classes should each have an appropriate struct as a member.
Each will use its member to call [Read|Write]StyleData.
3. Any macho men that want to use [Read|Write]StyleData directly will just
have to have their own style struct on the stack as a buffer.
This should simplify the back-end implementation of [Read|Write]StyleData. It
also removes the artificial ``only four live requests'' at a time. It even makes
it so that we could one day hope to have a thread-safe layout engine, etc. (Your
current implementation is just one more stake in thread-safety's heart here!)
FWIW, I am enjoying this too. It's been a good learning experience. Also, I
*don't* think I'm hen-pecking here: I claim that I *am* arguing about the big
changes. You're fundamentally altering how clients will interact with the style
system, and your implementation will have far-reaching consequences.
Reporter | ||
Comment 94•24 years ago
|
||
Chris Waterson wrote:
>1. Change ReadStyleData() to copy the style data into a caller-provided
> struct. WriteStyleData() will write data back from the struct.
This is what the old methods on nsIMutableStyleContext
NS_IMETHOD GetStyle(nsStyleStructID aSID, nsStyleStruct& aStruct) const;
NS_IMETHOD SetStyle(nsStyleStructID aSID, const nsStyleStruct& aStruct);
did, and Peter Linss put those in to facilitate migration from GetMutableStyle
calls to a better encapsulation that does not expose the internal data
structures of the style context.
Comment 95•24 years ago
|
||
I think you're missing my point. I want ReadStyleData() to copy the style data
into a *caller-provided buffer*. (I think the caller can do a better job
managing the memory that ReadStyleData() can; specifically, the caller can do it
in a re-entrant way.)
Reporter | ||
Comment 96•24 years ago
|
||
Chris, GetStyle(...) does exactly that, which is why I brought it up. The caller
creates their style struct and passes it to GetStyle, then mucks with it, and
then calls SetStyle passing it back in. Does that make sense, or am I missing
something?
example usage -
nsStyleDisplay disp;
GetStyle(eStyleStruct_Display, disp);
disp.mDisplay = NS_STYLE_DISPLAY_NONE;
SetStyle(eStyleStruct_Display, disp);
Comment 97•24 years ago
|
||
Am I being dense? What is the point of the style struct pool then? I've got the
diffs applied in a tree. I'll spend some more time looking at them.
Assignee | ||
Comment 98•24 years ago
|
||
Brendan: sure, the overhead isn't much higher than, let's say, the one we have in
switch() statements vs jump tables, so no problem: we can enforce the use of the
helper classes.
Marc & Chris: There is an important point that you may have missed and you will
see how the issues of interfaces and pure virtual methods are intertwined...
In the current implementation, the nsStyleXXX structures don't have any real base
class other than the empty nsStyleStruct which is basically a void*. This leads
to some aberrations like big duplications of code or useless switch() statements.
Compare the current implementation of StyleContextImpl::CalcStyleDifference():
http://lxr.mozilla.org/seamonkey/source/content/base/src/nsStyleContext.cpp#3658
with what I propose. Same thing for GetStyle(), GetStyleData() etc... or even
RemapStyle().
In the current code, none of the 5 basic methods of the nsStyleXXX structures
(ResetFrom, CopyTo, SetFrom, CalcDifference and ComputeCRC32) is inherited. They
could very well have completely different names and different parameters from a
nsStyleXXX struct to another. Yucky, isn't it? It was really important in my
opinion to clean that up in order to improve the legibility of the current code
and ease:
- The implementation of the StyleData sharing.
- The support for more data structures in the style context, like the one added
recently for XUL. We may have some more for CSS2/3 modules that we don't support
yet (aural, webfonts, ruby and whatnot) or external modules like SVG.
- Implement the memory model we want in the future.
The cleanup allowed to declare the nsStyleContextData as a union of individual
data structures (StyleXXXImpl*) and an array of generic data structures
(nsStyleStruct*).
So... I made these 5 functions (later reduced to 4, having removed SetFrom) part
of the base class nsStyleStruct and declared them pure virtual.
The problem is that:
1) In public interfaces, nsStyleStruct is the base class for the structures that
are manipulated by the customers (nsIStyleContext.h declares nsStyleXXX
structures which all inherit from nsStyleStruct).
2) Some customers in other DLLs (read "Layout" but it could be other ones)
declare some of these structures on the stack, for instance to call GetStyle()
and fill up the structure they are passing by reference.
3) None of these public interfaces implements these 5 functions. The only place
where the functions are implemented is in the StyleXXXImpl classes inside
nsStyleContext.cpp (StyleFontImpl, StyleColorImpl, etc...)
4) Result: You get link errors in these other DLLs (er, "Layout") because you
construct objects with pure virtual methods that are not implemented anywhere.
A "solution" would have been to somehow declare the nsStyleStruct as an XPCOM
object and provide constructors in the StyleXXXImpl, or something along that
line... Another "solution" could have been to link back Content and Layout...
Nothing we wanted anyhow because another requirement was that the customers
should be able to continue to access directly the member variables of the
nsStyleXXX objects which are defined the public interfaces.
If you have another idea, I would be interested to look it but I would prefer if
you played with it a little bit in the comfort of your own tree. I think I
already spent enough time on it and I still believe that my solution is close to
the best you can get. C'mon, I double-dare you :-)
...That was about the pure virtual methods. Now about the APIs...
I maintain (you will be interested!) that SetStyle() is evil, as well as any API
where we would have to accept a structure allocated by the customer with the
purpose of modifying the style context. This is for almost the same reason as
above: the implementations are done in the StyleXXXImpl, which are a certain way
private to the StyleContext, and when writing into the StyleContext we need to
receive back objects that we have created in order to be able to talk to them
(ie. objects that implement our code). The principle behind it is fairly clear.
From the StyleContext standpoint, we publish interfaces to these objects but we
must remain the entity that creates them. It's nothing new: COM objects work the
same.
Basically, my points are:
1) We need a base class with the 5, or 4, basic methods for the reasons listed
above.
2) Because of (1), the base class nsStyleStruct must come with its own dummy
implementation of these methods so that (a) other DLLs can link correctly and (b)
the virtual tables are identical everywhere and customers can access the member
variables directly inside the nsStyleXXX structures.
3) Because of (1), we generally cannot accept objects that we don't create. The
only exception (I think, but there is maybe another one) is GetStyle() because we
fill-up the structure passed by the customer by doing "ourOwnStruct->CopyTo(&
aCustomerStruct)".
Comment 99•24 years ago
|
||
Assignee | ||
Comment 100•24 years ago
|
||
Damn, it looks like the diffs for nsCSSStyleRule.cpp are missing.
Assignee | ||
Comment 101•24 years ago
|
||
Assignee | ||
Comment 102•24 years ago
|
||
To be honest, if you are looking for better solutions for the declaration of
nsStyleStruct, the closest thing I found involved a big reshuffling of all the
StyleXXXImpl and quite a bit of modifications in the StyleContext too:
- Leave nsStyleStruct the way it is.
- Use the nsStyleXXX structures for the only purpose of passing data back and
forth with the customers, meaning that when we receive nsStyleXXX, we have to
copy it into a StyleXXXImpl before we do anything (perf?).
- Make the StyleXXXImpl structures not inherit from nsStyleStruct but from
another base class that implements these 4 functions.
- Declare a nsStyleXXX as a member variable in its corresponding StyleXXXImpl.
It sounds simple like that but I think there was a problem (besides the
reshuffling) and I can't remember what.
Comment 103•24 years ago
|
||
With the patches applied, looking at a tree, it's generally much easier to
understand. :-)
> I maintain (you will be interested!) that SetStyle() is evil, as well as
> any API where we would have to accept a structure allocated by the
> customer with the purpose of modifying the style context. This is for
> almost the same reason as above: the implementations are done in the
> StyleXXXImpl, which are a certain way private to the StyleContext, and
> when writing into the StyleContext we need to receive back objects that
> we have created in order to be able to talk to them (ie. objects that
> implement our code).
I find this argument to be pretty weak. I think the *real* issue is that this
has been designed around trying to use C++ vtable dispatch in the back-end
implementation, and that's lead you down this garden path.
The style structs are the interface by which your consumers communicate with you
about style data. Even if you did, say, need to compute a whole bunch of extra
information for your own private use (which you don't), I still don't think the
argument holds water.
You don't need to *create* the style struct to be able to ``talk'' to it
efficiently. You need to know its style struct ID. Think of *that* as your
vtable. If you find the pervasive `switch' logic too burdensome (I would!), then
roll your own dispatch! Places where you've used C++'s build-in virtual dispatch
should simply use the style struct ID to do hand-rolled dispatch: use the style
struct IDs indexes into a table of function pointers, and a *lot* of the ugly
switch statement code will go away.
I don't think that this would really entail much change to what you've already
got. But, it would
1. Get rid of the need for you to have the nsStyleStruct implement any
virtual methods. (This is unsighly anyway: it's propagated implementation
detail out to a place where nobody should know or care about it.)
2. Get rid of the ad hoc style struct pool, which is another nail in the
layout re-entrancy coffin.
3. Still allow you to achieve switch-less nirvana.
4. Still allow you to use your own private storage, if the day comes where
style maintains ``extra'' computed information that's not visible to
consumers of the API.
What do you think?
Assignee | ||
Comment 104•24 years ago
|
||
You are correct about the origin of the problem, I wanted to keep the C++
dispatch. And I still do: I'll give a second look at what I described above
because when I tried it, I had in fact the StyleXXXImpl inheriting both from a
base class that declared the pure virtual methods and from the 'empty'
nsStyleStruct.
If we cut the inheritance between nsStyleStruct and StyleXXXImpl, we are left
with quite a bit of reorg in the StyleXXXImpl, but I prefer to put the burden on
these objects rather than on the rest of the code. The benefits would be the
same, with in addition the C++ dispatch and inheritance, and less changes in
StyleContext and StyleContextData.
Same question: what do you think?
And what do you mean by "Still allow you to use your own private storage..."?
How can I have private storage if I receive a structure that I did not create?
Assignee | ||
Comment 105•24 years ago
|
||
Never mind... I guess I can put a "void* mPrivate" in nsStyleStruct and store
whatever I want in it during ReadMutableStyleData().
Comment 106•24 years ago
|
||
> You are correct about the origin of the problem, I wanted to keep the C++
> dispatch. And I still do:
Why? What is it buying you? What I'm proposing is the same design you've got
now, with a slightly different mechanism. AFAICT, C++ is just getting in the way
of your design here.
I could understand if you were doing virtual dispatch all over the place (and
C++'s syntactic sugar was worth it), but you have:
- four places where you call the virtualized CopyTo()
- four places where you call ResetFrom()
- three places where you call CalcDifference()
- two places where you call ComputeCRC32()
So the syntactic brain-print of doing your own dispatch is marginal.
And it's all *internal* to the style context's implementation: you're not
forcing this mechanism on your users.
As you note, a top-to-bottom C++ design where the nsStyleStruct is an abstract
base class won't work here because of linkage issues, even though it might have
been the most natural. But the alternative that you suggest seems like the worst
of both worlds: you'll still have plenty of places where you need the big ugly
switch statements, and you're only taking advantage of the C++ syntactic sugar
in a dozen places. (Well, a baker's dozen.)
> And what do you mean by "Still allow you to use your own private storage..."?
The structures used to *communicate* style information need not have anything to
do with the structures that you use to *store* that style information in the
style context.
Assignee | ||
Comment 107•24 years ago
|
||
>As you note, a top-to-bottom C++ design where the nsStyleStruct is an abstract
>base class won't work here because of linkage issues,
I haven't made myself clear maybe: nsStyleStruct and the nsStyleXXX structures
would not change compared to what is currently in the tree. Only the
StyleXXXImpl would change and since they are internal, I can do pretty much what
I want with them, including not inherit from nsStyleStruct at all, and inherit
from a base class with pure virtual methods. It's not the worst of both worlds
at all: of course, we won't have the ugly switch statements back. The solution I
propose is identical in all points to yours in its interfaces with the outside
but in addition, it keeps the benefit of using true C++ objects in the inside
(and we are going to need it for the next round of optimizations).
As for the private storage, I guess you meant it could be handled through the
helper classes. If not, please stop playing games and tell me what's on your
mind (or tell me you want to play and I'll gladly play too :-).
Comment 108•24 years ago
|
||
> The solution I propose is identical in all points to yours in its
> interfaces with the outside but in addition, it keeps the benefit
> of using true C++ objects in the inside (and we are going to need
> it for the next round of optimizations).
Okay, I guess I'm most concerned about the external interface, so I'd be
willing to cave on this. I still think that the mechanism that I propose
would be simpler and cleaner. But, I don't have to do the typing. :-)
So we're clear, could you elaborate as to what the benefit of using ``true C++
objects'' are here? How will it affect your future optimizations?
> As for the private storage, I guess you meant it could be handled through the
> helper classes. If not, please stop playing games and tell me what's on your
> mind (or tell me you want to play and I'll gladly play too :-).
My point was that the storage need not have anything to do with the interface
used to communicate the style data. Since I'm not sure *why* you'd want to do
this, my example will be rather bizarre, extreme, and contrived but...here goes.
Let's say, for example, that instead of storing the font information like we do
now, we wanted (for unknown some reason) to store a handle to the resolved font
on the operating system. A referent to the following would be what would be
stored in the style context:
struct nsStyleFontPrivate {
nsIFont* mFont;
};
To get information *into* this object, your ``write'' dispatch for
eStyleStruct_Font would examine the nsStyleFont struct, use GFX to find an
appropriate font, and stuff it into an nsStyleFontPrivate structure.
Conversely, to get information *out of* this object, your ``read'' dispatch
would use the nsIFont object to fill in the nsStyleFont structure.
Like I said, contrived. But the point is, the storage need not have anything to
do with the structures used to communicate about the storage.
Assignee | ||
Comment 109•24 years ago
|
||
About C++:
I sent you and other people earlier this week a description of a memory model
that could bring huge gains in terms of memory footprint. It consists in storing
the style data (nsStyleXXX) in a library composed of hash tables, with one hash
table per type of style data. We'd have one for all the fonts used in the
document, one for all the colors, etc... To implement it, we would need to add
some methods and member variables to a base class that all the StyleXXXImpl
inherit from. For instance, to store the CRC locally, or refcount these objects,
or maybe add some debug functions on the way... Overall, I don't know if a
system like this one would be viable in terms of performance but if it's not,
we'll find something else which may or may not benefit from having a base class
for StyleXXXImpl. I just want to keep the possibility open.
About the storage:
Understood. We were not talking about the same thing. The kind of private
storage that I had in mind was a tag that we would set in ReadMutable() and
retrieve in WriteMutable(). It could be very useful for some optimizations. For
instance, we could store a direct pointer to the data inside the library, or the
CRC of the previous data. With the helper classes, we can pass something like
that between Read() and Write(). It would be a strong argument in favor of
Brendan's suggestion for not exposing the 2 methods but just the helper classes.
Comment 110•24 years ago
|
||
Ok, so I'll look forward to seeing what you come up with. (I have to admit I was
a little bit befuddled by those emails that were flying around; I'll probably be
able to understand them better now.)
> It would be a strong argument in favor of Brendan's suggestion for not
> exposing the 2 methods but just the helper classes.
I'm all for it.
Comment 111•24 years ago
|
||
scc/waterson, have you had any chance to look at the template magic that was
alluded to earlier? It looks something elegant might be possible in this area.
Assignee | ||
Comment 112•24 years ago
|
||
I hope the code will be easier to understand than some of my comments. The diffs
are following. As a summary:
- there is no pooling anymore (the helper classes contain the style structure
that is used for Read/Write)
- the APIs are easier to use (1 line for each call to GetMutableStyleData)
- private methods are no longer exposed (nsStyleStruct is an empty struct like
before).
All the other points are fixed too, except #5 about the templates: we'll see if
scc makes a proposal. My concern is the performance. According to some
measurements, it should be less than 2% slower than the current code and maybe 2
or 3% faster. It's difficult to tell because it seems that something happened in
the tree lately that causes larger fluctuations in the results I'm getting.
I did several implementations because of the performance:
- In the first one, the style data (nsStyleXXX) were embedded as a member
variable inside the StyleBlobs (formerly StyleXXXImpl). It was too slow (between
5 and 10% slower).
- In the second one, the StyleBlobs inherited from both the style data (as they
do now) and from a base class. It was still slightly too slow.
- In the third one, just as a test, I put back in the pooling of style structs
that are used for Read/Write. It was slightly faster than the current
implementation.
- In the last implementation (diffs attached), I took back the second
implementation and streamlined the helper classes to win some cycles. It seems
to have worked a bit. My guess is that it is slightly slower than the current
builds but depending on the platform it may very well be slightly faster too.
Chris & Jrgm: if you do any kind of performance evaluation, please don't use a
daily build to compare with a fresh pull over which you apply my changes. Use
the same pull for both (ie. pull, build, take measurements, apply the diffs over
the same build, take more measurements).
Chris: If you think that performance matters more than trying to keep Layout
thread-safe (or at least the style resolution part of Layout), I can attach some
diffs with the pooling for Read/Write. The rest would not change, it's hidden by
the helper classes.
Assignee | ||
Comment 113•24 years ago
|
||
Comment 114•24 years ago
|
||
I'll do test builds again today with the latest patch.
Comment 115•24 years ago
|
||
Comment 116•24 years ago
|
||
Just an idea. Might this be a more elegant way to declare the helper classes?
Comment 117•24 years ago
|
||
Thanks for your speedy turnaround on this stuff, pierre! I think the
helper classes are much more natural now.
- I'm surprised that you see a speedup with your pooled contexts. What
is happening differently there? Seems like the only difference would
be whether we were copying to/from the stack versus copying to/from
your pool.
Could you post a patch that illustrates how the pooled
implementation works, too? (If you still have it around.)
- I'm a bit fuzzy on what the difference is between nsIStyleContext
and nsIMutableStyleContext. Is the distinction not important once
these changes have been made? If so, should we log a task to remove
one or the other?
- I just noticed that there's another re-entrancy problem in
RemapStyle(). Sorry that I didn't notice that before :-(. Rather
than making the temporary remap data in the static global area, how
about making the PresContext own it? Anyway, I'd be willing to let
this slide if you'll commit to fixing it eventually.
- I still think the tail is wagging you with respect to your dogged
insistence on using C++ virtual dispatch. That said, I'm flattered
that you were able to try so many alternative implementations over
the weekend and were still able to eschew my proposal. :-)
As an example of this, note that it's unfortunate that you need both
SetFrom() and CopyTo() methods on your nsStyleBlob class. In each
place where you call these routines, you have the style struct ID
available to you! The code blaot is minor, but you were doing this
to avoid maintentance hassle, remember?
Remind me again why C++ is going to make things so great here
someday?
- There's still some use of C++ |bool| in here that may bite you on
some of the ports.
- Consider use of a C++ template to define your helper classes; see
above patch.
Assignee | ||
Comment 118•24 years ago
|
||
1) Speedup with pooled strutures:
I can't really explain it. Most of the constructors/destructors are null. The
difference was about 10ms over 800ms to load Yahoo, and 30ms over 1300ms to load
CNN. I can send you the diffs but it would take a little bit of time to merge
them again. If you just want an illustration of how it works, you can look at
the previous diffs. Otherwise would you seriously consider this solution or did
you ask out of curiosity? We could put these structures in the PresContext like
the rest of the permanent data. I was planning to keep it as a backup solution
if jrgm finds some performance problems.
2) Difference between nsIStyleContext and nsIMutableStyleContext:
The goal, I guess, was to limit the scope of what people can do during style
resolution. I don't think it would be a good idea to change it but you can file
a bug to remove nsIMutableStyleContext if you like.
3) Pool in RemapStyle():
Sure, it can be moved to the PresContext. I'll have other things to store there
too. Besides it will allow us to break WrapRemapStyle() in two as it should be.
4) C++:
I worked with jump tables long ago, before C++. Jump tables can be really nifty,
they ensure a very robust code that can easily sustain the burden of the years.
The best example is probably jwz's mime engine, one of the only pieces of code
that's survived since 3.0. I gave you twice a description of the evolution that
I would like to see for the memory model in the style context, but really it
doesn't matter. Whatever we do and whenever we do it, I'd like to keep it open
and flexible so I'll stick with C++. Unless you veto it, of course. As always,
it's all a question of what we want to see in the tree and when we want to see it
but nobody else issued an opinion so I'll let you decide. Thanks for
volunteering, by the way.
5) bools:
They only ones left are in a debug code that I did not maintain. I did not
decide yet whether I'll update it or remove it. Not part of the build, as they
say.
6) Templates:
Really neat, thanks! I thought only scc could speak template...
Comment 119•24 years ago
|
||
> The goal, I guess, was to limit the scope of what people can do
> during style resolution.
I.e., during style resolution you get the ``immutable''
nsIStyleContext?
> I don't think it would be a good idea to change it but you can file
> a bug to remove nsIMutableStyleContext if you like.
Not sure what this means. It seems to me that either:
1. nsIStyleContext and nsIMutableStyleContext are redundant: all style
contexts are immutable, all the time. In which case let's do away
with one of extra interfaces.
2. nsIStyleContext indicates an immutable style context, and therefore
should not have a GetStyleData() member that allows mutation. In
which case, the helper classes are conspiring with a bunch of
naughty callers to write to supposedly ``immutable'' contexts.
Given your comments, it looks like we sort of tried to pay lip-service
to the latter (e.g., nsIStyleContext returns a |const nsStyleData*|),
but we then turn around and write to the ``immutable'' contexts anyway!
(The helper classes all toss the |const| and write data back
regardless of whether the original context was mutable or not).
I'm just trying to understand what the intent of the API is here. I'm
not asking you to fix it right now; rather, I'm asking you to sketch
out a roadmap of where things ought be, and asking you to commit to
Making It Right Some Fine Day (i.e., file a bug).
> Sure, it can be moved to the PresContext. I'll have other things to
> store there too. Besides it will allow us to break WrapRemapStyle()
> in two as it should be.
Ok, great. Again, bug filed please.
> I worked with jump tables long ago, before C++. Jump tables can be
> really nifty, they ensure a very robust code that can easily sustain
> the burden of the years. The best example is probably jwz's mime
> engine, one of the only pieces of code that's survived since 3.0. I
> gave you twice a description of the evolution that I would like to
> see for the memory model in the style context, but really it doesn't
> matter. Whatever we do and whenever we do it, I'd like to keep it
> open and flexible so I'll stick with C++.
This either is a non sequitor, or an extremely sarcastic remark. I'm
guessing the latter. :-)
I'll happily sr= the code as it stands (I can't _force_ you to do
anything, but at least the impact is constrained). To be frank, I
jump tables would provide a more elegant solution here. As it stands,
1. The C++ vtable is redundant with the style struct ID.
2. You need a translation layer to switch back and forth between the
SIDs and the vtables anyway.
3. There are only a handful of places where the C++ syntactic sugar is
even used.
4. You've ended up with redundant code (CopyTo, SetFrom).
But hey, it's your name on the cvs blame! :-)
> They only ones left are in a debug code that I did not maintain. I
> did not decide yet whether I'll update it or remove it. Not part of
> the build, as they say.
Ok.
> Really neat, thanks! I thought only scc could speak template...
I'm an amateur. If you're seriously considering them, let's let scc
look at it first.
So, a couple rhetorical questions left:
1. nsIStyleContext vs. nsIMutableStyleContext. Need roadmap and a bug,
or a rebuttal as to why it's good as it is.
2. Need a bug for re-entrancy issue in WrapRemapStyle().
3. Need blessing from scc if you decide to use templates.
sr=waterson. Thanks for your patience.
Comment 120•24 years ago
|
||
Comment 121•24 years ago
|
||
Comment 122•24 years ago
|
||
Comment 123•24 years ago
|
||
I ran with the builds that waterson passed on to me, for linux and win2k.
I ran (1) win2k/500Mhz/128MB with normal settings, looking at 'already cached'
load times, (2) win2k/500MHz/128MB with disk cache set to 0KB, forcing a
network request for all elements of a page, (3) on a slower linux system
(266MHz/128MB) looking at 'already cached' load times, and (4) on a 800Mhz
linux system (results not shown but they're the same story).
I all cases, the code with pierre's changes measured in as "something like"
2% slower than the baseline build. But the reason I say "someting like" is
that I don't think this page loading test is precise to that level. This is
a wash (with a small hint of small difference if tested on a very quiet network
under very careful initial conditions ;-).
Comment 124•24 years ago
|
||
I tried running jrgm's tests (default settings) on an Extremely Crappy Machine
(133MHz PC with 48MB of RAM running Win98). Here's what I found:
AVG MED MAX MIN TIMES
before 7442 7206 19838 1363 8568 7061 6912 7009 7037
after 7805 7591 20357 1613 9005 7457 7362 7436 7351
delta -363 -385 -519 -250 -437 -396 -450 -427 -314
%improvement -4.8 -5.3 -2.6 -20.9 -5.1 -5.6 -6.5 -6.1 -4.5
This was run from inside Mozilla proper; I'll take another set of measurements
in the mfcEmbed shell. Win32 and Linux Builds from the tree this morning are
available (internally only, sorry) at ftp://btek.mcom.com/pub/.
pierre, could you produce another set of patches that use the global contexts
instead of the stack-based ones? If there really is a delta, I'll roll up my
sleeves and profile.
Comment 125•24 years ago
|
||
Actually, before anyone goes off hacking more code, maybe we should get a 2nd
opinion. Could somebody else confirm or refute my results on a low-end machine?
Assignee | ||
Comment 126•24 years ago
|
||
I'd say: low-end, yes, but with a little bit more memory. I think that if the
difference is more important than on the other machines (the slowdown is 5 or 6%
instead of 2%), it's because we allocate around 3 times as many blocks as we used
to (the blocks are way smaller) and we hit much more often I don't know what
Win32 function that cleans up the memory. I'm no Windows specialist so I don't
know if it makes sense...
Assignee | ||
Comment 127•24 years ago
|
||
Still, I would be curious to see if what I noticed with globals vs stack can be
confirmed on other platforms. I'll make diffs but not until late.
Comment 128•24 years ago
|
||
Regarding the [im]mutable question, let me chime in to point to a comment of
Peter Linss that I once saw in bug 1230. Basically, he was saying that there
were two pathways to the style structs, and only some blessed objects were
allowed to use the mutable pathway. On the same topic, in a reply to an e-mail
message where I asked if MathML frames could use the GetMutable API to alter
their resolved style data [I asked this because I was hoping that the MathML
frames could change their style contexts so as to reflect MathML attributes that
are not supported by the Style System], he replied something along the lines:
"Nope, unless you are a style rule". And he said that he was planning to remove
that method (that's why it was deprecated BTW), and via this process to fix the
bad callers (I guess by this he meant calls that are modifying the style context
outside the style resolution -- these can be seen on pierre's migration patch:
the mutations at the frame-level, e.g., [nsGfxTextControl,Table,nsCSS]Frame).
Pierre might better explain the Right Thing that should to be done for these
random mutations: it is a costly style re-resolution in the associated sub-trees
to bring the style tree back in sync. I guess that maybe why Peter Linss wanted
to avoid the profileration of this practice, and to only let the possibility of
straight mutations to the blessed objects during the controlled style resolution
process. Hope these comments provide some insights into waterson's point 1 and
to the genesis of the problem.
Comment 129•24 years ago
|
||
Also, unless there are some trickery that I am not aware, these mutations
that a frame makes outside the style resolution are lost when a re-resolve is
fired from one of its ancestor frames.
Comment 130•24 years ago
|
||
I kicked off a test run(s) with a Not Quite As **** 64MB/166MHz Windows ME
system, and I'll update when I have that info.
Assignee | ||
Comment 131•24 years ago
|
||
Correct on all points. If we want to simplify the API, we should leave
nsIMutableStyleContext the way it is and remove Read/WriteMutableStyle from
nsIStyleContext. We have been bitten with Marc with various problems coming from
gfx scrollframes that hack into the style context. If you look at the diffs, you
will notice that both StyleContext sharing and StyleData sharing are disabled for
the contexts that correspond to a small list of scrolling elements..
Chris, I'm going to attach a partial diffs that uses globals in Read/Write. I
had a full backup so it should compile ok. The files included in the diffs are:
nsStyleContext.cpp
nsIMutableStyleContext.h
nsIStyleContext.h
nsIStyleSet.h
Assignee | ||
Comment 132•24 years ago
|
||
Comment 133•24 years ago
|
||
Okay, so that system was not (is not?) well. I had to clean a bunch of stuff
up to get it to run sanely at all (e.g., a virus program that meant that
deleting an old profile took >5min). I'm still suspicious of the state of
that machine, so I don't think these numbers mean a heck of a lot (e.g., it
is nominally a faster machine with more memory than what waterson tested, but
the times are slower than what he measured). But, here they are anyways (I
only did 3 cycles; first is uncached, 2 and 3 are cached):
avg med max min #1 #2 #3
before 9674 9694 25181 1694 9864 9283 9878
after 9842 9772 26890 1691 9851 9871 9805
-168 -78 -1709 3 13 -588 73
-1.7% -0.8% -6.8% 0.2% 0.1% -6.3% 0.7%
Assignee | ||
Comment 134•24 years ago
|
||
I'll check in tonight if nobody objects.
Chris, will you have enough time to get some performance data about using globals
instead of stack variables?
Comment 135•24 years ago
|
||
Pierre, drivers@mozilla.org already decided to keep this change out of 0.9, so
we don't have too many confounding variables in trying to unregress performance
(we have regressions already, as well as stability bugs and other things to
fix). waterson sees a 5-6% slowdown on a low-end machine running jrgm's tests,
but his results need to be confirmed. I understand he's gonna help sort out the
performance effects with you. Until they're well in hand, and especially right
now, please do not check this patch into the trunk.
/be
Assignee | ||
Comment 136•24 years ago
|
||
I would like to checkin the change of APIs: that's all the files in the diffs,
except for nsStyleContext.cpp and part of .h files. The reason is that I'm
leaving on sabbatical in a little bit more than 3 weeks and I expect not to have
the time to work on performance or memory footprint again before mid-july. I
have to take care of the bug list that really got out of hands in the last
several weeks.
The change of API that I would like to checkin should not have any effect on
performance whatsoever. The nsMutableXXX helper class will be reduced to a
simple constructor that calls the current GetMutableStyleData().
Please advise.
Assignee | ||
Comment 137•24 years ago
|
||
Note also that the platform that Waterson tested on does not match the
requirements for Netscape 6 (64Mb)
http://home.netscape.com/browsers/6/datasheet/index.html?cp=n6i#platforms
Jrgm's tests on a 64Mb machine shows an average slowdown of 1.7% and a medium
slowdown of 0.8%.
Assignee | ||
Comment 138•24 years ago
|
||
Jrgm's tests also show a speedup of 0.1% on first load on a machine and what
appears like a negligible slowdown on another one (no numbers for this one, but a
graph at http://bugzilla.mozilla.org/showattachment.cgi?attach_id=31092)
Assignee | ||
Comment 139•24 years ago
|
||
Ooops, I'm leaving on sabbatical in a bit more than 2 weeks, not 3. Yeehaaa!
Comment 140•24 years ago
|
||
drivers@mozilla.org don't see the need to get the API changes in for 0.9 -- the
patches here are backed up, as well as on your machine.
It's very hard right now to get any review and testing cycles, because everyone
is rushing for the 0.9 deadline. We'll have plenty of confounding and causative
variables to untangle tomorrow, and for a week (until 0.9 releases). Do we know
exactly what the performance effects of the API-only patch is? Unless you get a
windows build and jrgm tests going fast, I don't know those effects. They
should be zero, but Murphy was an optimist.
We don't need any more risk here, even if it is tiny -- we don't even have the
"brainprint" to take these changes into the mix. Sorry, and we will get them in
after 0.9, with performance improvements too, I hope -- possibly even next week
if you, waterson, et al. can pin down where the cycles are going and optimize
them away.
/be
Assignee | ||
Comment 141•24 years ago
|
||
Sorry, I can't commit myself to work a week on this or even 3 days, I think. I
have more than a hundred bugs to triage before I leave (out of 170+), and more
than a couple of them would be nice to fix for 0.9.1 or 1.0. I'm not sure I'll
be back before the tree closes for 1.0.
Checking in the API would allow someone else to work on the style system memory
model much more easily, as only 3 files would be different. The change of API
would consist in changing the calls from:
nsStyleXXX myStyle = myContext->GetMutableStyleData(eStyleStruct_XXX);
to:
nsMutableStyleXXX myStyle(myContext);
where the template for nsMutableStyleXXX would be:
template <class T, nsStyleStructID SID>
class basic_nsAutoMutableStyle
{
protected:
T mStyleStruct;
public:
basic_nsAutoMutableStyle(nsIMutableStyleContext* aContext) {
aContext->GetMutableStyleData(SID, &mStyleStruct);
}
basic_nsAutoMutableStyle(nsIStyleContext* aContext) {
aContext->GetMutableStyleData(SID, &mStyleStruct);
}
T* get() { return &mStyleStruct; }
T* operator->() { return get(); }
T& operator*() { return *get(); }
};
As you can see, it's pretty much empty. A point to consider is that whatever the
risk of checking in this empty code might be, it would better be managed if I
were still around for a couple of weeks to cover any problems that may arise.
After all, if risk there is, it would make even less sense to finally proceed to
do the checkin the day before I leave. I would check it in now without
hesitation but it's up to drivers@mozilla.org to balance the risk involved by any
code, even empty, and the benefit to the Community of making it possible for its
members to work on something important.
Anyhow, API or not, now or later, a full resolution is in the Future for me.
Marking as such.
Status: NEW → ASSIGNED
Target Milestone: mozilla0.9 → Future
Assignee | ||
Comment 142•24 years ago
|
||
Of course my point was that this mostly empty code doesn't need any performance
testing and to answer your concerns, it was reviewed (and even contributed to) by
waterson.
Assignee | ||
Comment 143•24 years ago
|
||
Based on the current API, an even emptier code would be:
template <class T, nsStyleStructID SID>
class basic_nsAutoMutableStyle
{
protected:
T* mStyleStruct;
public:
basic_nsAutoMutableStyle(nsIMutableStyleContext* aContext) {
mStyleStruct = aContext->GetMutableStyleData(SID);
}
basic_nsAutoMutableStyle(nsIStyleContext* aContext) {
mStyleStruct = aContext->GetMutableStyleData(SID);
}
T* get() { return mStyleStruct; }
T* operator->() { return get(); }
T& operator*() { return *get(); }
};
Comment 144•24 years ago
|
||
pierre, I understand that you're going on sabbatical in short order, but I don't
think that means we need to rush this checkin. The patches aren't going anywhere
(they're in the bug, as well as two of my trees). The tree is going to be locked
down so there won't be much bit-rot with respect to the new APIs.
I think this should be a no-brainer to land as soon as the tree opens for
mozilla-0.9.1 (Friday? Monday?). By then we'll have sorted out this performance
weirdness. (I'm not leaving you high and dry on that; just need to knock out my
own 0.9 bugs.)
Comment 145•24 years ago
|
||
On the Very Crappy Machine (133MHz Win98, but with more memory this time,
160MB), results with mfcEmbed show little difference in page load time with
jrgm's test; in fact, it looks like style sharing has a slight edge:
AVG MED MAX MIN ---------- TIMES -----------
before 3653 3486 10512 643 4602 3369 3449 3425 3420
after 3520 3456 10538 644 3765 3471 3429 3460 3475
delta 133 30 -26 -1
%improv 3.6 0.8 -0.2 -0.1
I'll next try to repeat the seamonkey results to see if there's something
happening with chrome that's introducing the 5% penalty that my previous run
indicated.
Assignee | ||
Comment 146•24 years ago
|
||
It confirms what I have been seeing. The fluctuations in performance seem to
have increased in the past 2 weeks, and the difference between the before/after
results are beyond the margin of error. So why not apply the old mantra and keep
recounting until we get the numbers we want?
Comment 147•24 years ago
|
||
I'm all for that!
Comment 148•24 years ago
|
||
Any follow-up on this? Is it in a branch somewhere?
[Reading the code, it looks like there could be room for futher tweaks to make
it more space-time efficient in subsequent iterations.]
Assignee | ||
Comment 149•24 years ago
|
||
Nothing changed. According to jrgm, the code I'm proposing is on average 1.7%
slower than the original code. It means that a page that takes 2 seconds to load
would take 2.034 seconds. That's 3.4 hundredth of a second more and this is
totally unacceptable to drivers@mozilla.org.
I guess my vision of performance and memory improvement in software engineering
was a bit outdated. I always thought that to improve memory a lot, performance
had to be compromised a little bit and vice versa, to improve performance a lot,
one had to use a little bit more memory. As a result by doing successive steps
(- a lot of memory + a bit of cpu - a lot of cpu + a bit of memory) we still had
in the end a lot less memory a lot less cpu.
Mozilla is different. We don't compromise.
With my best smileys ;-) of course... :-)
Comment 150•24 years ago
|
||
Would that be possible to have a branch with the latest? Continuous large diffs
are scary. With a branch, anyone could pull, and try, and perhaps add or refine.
And QA could easily pull and try from there too if some progress is made.
Assignee | ||
Comment 151•24 years ago
|
||
My first preference is that drivers@mozilla.org treat the bug fairly and allow
the code to go in. We can even use APIs with globals instead of stack variables
and it will reduce the 1.7% slowdown a bit, or maybe even go faster than the
current code as I measured on the Mac.
If it is not possible, my second preference is to checkin the change of APIs so
that the entire changes would be limited to just a few files (nsStyleContext.cpp/
.h, nsIMutableStyleContext.h and maybe nsIStyleSet.h).
If it is not possible either, a branch is nice but it would rate fairly low on my
list of priorities for moz0.9.1. Anybody can do it anyhow. I would be able to
help, though, for instance to update the diffs. Otherwise I'm leaving this
Friday, I'll be back July 9th, and as Chris wrote: the patches aren't going
anywhere.
Note that if I checkin this thing, the support for the aural stylesheets will
move on top of my list. Accessibility issues have a high priority in 6.5.
Comment 152•24 years ago
|
||
pierre, I don't think anyone is averse to this going in for mozilla-0.9.1, but
maybe you are privvy to some email from drivers@ that I am not.
Assignee | ||
Comment 153•24 years ago
|
||
Not really: the only things I heard from drivers were the conclusions copied on
this bug report. What you are saying seems like good news. If a checkin is ok,
I can get in any solution: StyleData sharing with or without StyleContext
sharing, APIs with or without variables on the stack. As far as I can tell, with
StyleData sharing only and with global variables, it uses less memory and it goes
a tad faster than the current code base.
Comment 154•24 years ago
|
||
Pierre, I commented for drivers on 2001-04-17. At 19:33, I wrote:
"...we will get them in after 0.9, with performance improvements too, I hope --
possibly even next week if you, waterson, et al. can pin down where the cycles
are going and optimize them away."
This by way of asking you to hold off until after 0.9 had branched, with an
apology for the request that you wait.
Previously that day, I had commented:
"Pierre, drivers@mozilla.org already decided to keep this change out of 0.9, so
we don't have too many confounding variables in trying to unregress performance
(we have regressions already, as well as stability bugs and other things to
fix). waterson sees a 5-6% slowdown on a low-end machine running jrgm's tests,
but his results need to be confirmed. I understand he's gonna help sort out the
performance effects with you. Until they're well in hand, and especially right
now, please do not check this patch into the trunk."
It sounds like you guys have sorted things out, and that the performance effects
are well in hand. Sorry to leave any doubt in your mind about checking into the
trunk, but really, I don't think you can say that drivers was clearly against
you checking in a 1.7% performance hit, since you didn't know that was the size
of the hit when I wrote my comments. Please be fair here, and by all means
check in.
/be
Comment 155•24 years ago
|
||
Time to weigh in here. I am of the opinion that all of the efforts taken so
far to reduce the style system's footprint, although admirable, are moving in
the wrong direction.
None of the patches that have been produced have addressed the fundamental
problem of the style system, namely that it eagerly resolves all possible
values for every style context.
The sharing strategies employed thus far continue to work under a fundamentally
broken assumption: that all style data must be computed and only after a match
is found can data be thrown away. This patch adds even more complicated
sharing logic, and although it achieves a runtime footprint reduction, it does
nothing to reduce the obscene amount of churn that the style system generates
as it resolves all of these property values.
I have been working on a proof of concept for a new system of rule matching
that uses lazy resolution and that uses other tricks to reduce both footprint
and performance costs. I've gotten far enough with the code that I'm confident
it will be not only a large performance win, but that it will also be a large
footprint win.
I would like to work on developing this new rule-matching strategy for real,
but the problem I have now is that I'm working off the current trunk, and have
touched the same files (in particular, I've rewritten style contexts and the
way rules are matched).
If this patch goes in, I'll be in a world of hurt from a merge perspective.
Given that I'd like to continue work on this new system, I'd prefer it if this
patch did not go into the tree. I don't think it moves in the right direction,
and I don't think it addresses the fundamental performance problems that plague
the style system.
Comment 156•24 years ago
|
||
Actually, if you want to check it in, I won't hold it up. I'll just deal. :)
Comment 157•24 years ago
|
||
If it is a bigger performance and footprint win, as well as more simple and
correct in logic and is within a timeframe of landing in, oh, say the next 2-3
weeks without getting pushed back and back and back further for testing, then I
say wait for it. Perhaps Pierre should take a look at your work Doug and see
if it warrants scrapping the landing coming late today?
Assignee | ||
Comment 158•24 years ago
|
||
I checked in.
Dave: As long as we have a StyleContext, you and everybody else will benefit from
the change of API. You can overwrite my version of nsStyleContext.cpp with yours
and update the mutable APIs. It should not take you more than a couple of hours
to do so. I did it a couple times when switching between the previous version
and the new one.
About the process, I think it has always been clear that Marc, myself and now
Daniel welcome and encourage anybody to help on the style system. You should
have contacted us way sooner, before doing any implementation. We learned by
chance last Sunday that you were in the midst of some serious rework in the style
system by reading a chat on IRC:
<hyatt> argh
<hyatt> com sucks
<hyatt> darnit
<hyatt> the style system uses too much COM
<hyatt> i'm trying not to comify my new rule nodes.
<hyatt> and i can't get around it
<bryner> hyatt: what are you rewriting?
<hyatt> rule matching.
<bryner> do you have a way to speed it up?
<hyatt> hell yes
<hyatt> and to make it take much less memory
I asked you what was going on...
>You mentioned in IRC that you were speeding up rule matching. Is it internal
>to XUL only? Or is it something like nsICSSPseudoComparator which changes a
>little bit the style system in order to make XUL much faster? Or something else?
>
>Just curious. The nsICSSPseudoComparator is maybe not the most elegant thing
>one can come across in the style system but it is very localized and so much
>worth it! Are you working on something like that?
And you replied...
> I'm doing what I talked about with you and Marc in an earlier email...
> namely replacing the mRules member of StyleContextImpl with a lexicographic
> tree. Right now every style context holds on to all the rules it matched as
> an nsISupportsArray... I'm replacing that with a structure that enables
> sharing...
I, and apparently Marc too, don't recall this mail. Or at least we were not
aware that you had started an implementation. Besides, the 'mRules' member is
one of the things that Marc and I did not change. It is fairly separate from the
work we did when reducing the memory footprint of the computed style.
As I said, it will take you a couple of hours to update your tree. I can give
you some pointers if you want. For the rest, you have a CVS account and you can
checkin pretty much what you want as long as you have it approved by the drivers
and go through the same kind of testing we did for StyleContext and StyleData
sharing, but I think you are missing some opportunities by not collaborating with
those who usually work on that code, not to mention the risk of redundant work.
Assignee | ||
Comment 159•24 years ago
|
||
Apparently the bloat went down 3 Mb on Linux and 4 Mb on other flavors of Unix.
I'm stunned, I did not expect the numbers to be so high. People say the
application still works. Let's wait for some more testing and performance
metrics from QA.
Comment 160•24 years ago
|
||
Here is the email that I sent to you, brendan, attinasi, and waterson. Maybe
you don't recall it, but I sent it straight to you, and you even responded to
it.
=======
I was thinking about the style system sharing stuff last night, and I
came up with an idea on how to generalize techniques used in the
outliner widget to make our style sharing code even faster.
What I would propose is that the style system have two levels of cache:
the first line of defense would be a new cache that I'm going to outline
momentarily, and the second line of defense would be the style sharing
code that pierre and attinasi have done. This new cache complements the
existing cache, and both are necessary for maximal footprint reduction
and allocation reduction. I'll call the new cache the level 1 cache,
and the existing cache, the level 2 cache.
Here's the basic idea for the level 1 cache. The style system maintains
a state machine that treats a sorted list of rules as an input word.
This state machine transitions on the rules in the input word until all
have been processed. The resultant final state contains all of the
style data for that input word. On a miss, i.e., when no style data is
found at the final state, you then check the level 2 cache. You have to
allocate the style data and peform t he CRC computation to find matching
style data structs that can then be cached at the final state in the
level 1 cache.
The first level cache will allow us to avoid even doing the allocations
of temporary data that we then throw away, and we won't have to do a CRC
computation (which gets expensive since there are many things to compare).
Thoughts? I would be very interested in translating the outliner state
machine into the style system if you guys think it's a good idea. I
have all the data structures for this already implemented and tested. I
think this works really well, because it complements the code you guys
have already written. Level 1 (the new cache) makes sure the same set
of rules gives you an immediate answer with no allocation. Level 2 (the
existing cache) ensures that different sets of rules that map to similar
style data end up being shared.
Dave
(hyatt@netscape.com)
=====
I presented this idea to you on 4/4/2001, one month ago. Both you and attinasi
responded. I describe the exact approach in this email of treating the sorted
list of rules as an input word, fueling a state machine/lexicographic tree.
To claim that I have been working on this in a vacuum without mentioning my
ideas is completely ridiculous.
Comment 161•24 years ago
|
||
Attinasi's response to the input word paragraph. Again, you are copied on the
mail along with brendan and waterson.
==============
"This sounds like a very nice mechanism. I'm wondering if it is
essentially similar to the sibling style sharing that happens now,
before the style context data sharing phase [1]. Currently, the style
system looks at all child style contexts to see if there are any with
exactly the same rules (comparison by hash, but not a CRC) and if it
finds such a context, it shares it immediatly, without having to fully
resolve style for it. Your level 1 cache sounds like a more global
version of this (not restricted to just sharing among siblings)."
==================
> Apparently the bloat went down 3 Mb on Linux and 4 Mb on other flavors of Unix.
> I'm stunned, I did not expect the numbers to be so high. People say the
> application still works. Let's wait for some more testing and performance
> metrics from QA.
If you're basing this on the bloat numbers from tinderbox, I think they went
down because you didn't add logging for the classes that are no longer part of
the StyleContextData object and you removed (#ifdef-ed out) the logging for
StyleContextData.
Assignee | ||
Comment 163•24 years ago
|
||
David (Baron), I may have been wrong but the calculation is as following: in
StyleContextImpl::SizeOf(), we had the size of everything that belong to the
style context, minus the style data which is counted separately (line 3970)
// count the style data seperately
if (mStyleData) {
mStyleData->SizeOf(aSizeOfHandler,localSize);
}
Then inside nsStyleContextData::SizeOf(), we had the size of all the style data
structures that are not null (line 2344)
// add the sizes of the individual style blobs
if (mData.mBlobs.mFont) aSize += sizeof(StyleFontBlob);
if (mData.mBlobs.mColor) aSize += sizeof(StyleColorBlob);
if (mData.mBlobs.mList) aSize += sizeof(StyleListBlob);
I did add logging for the blobs and I ifdef'd out counting the StyleContextData
in the StyleContextImpl otherwise it would have been counted twice: once in line
2342 in nsStyleContextData::SizeOf():
// get the size of an empty instance and add to the sizeof handler
aSize = sizeof(*this);
...and a second time in line 3955 in StyleContextImpl::SizeOf()
// get the size of an empty instance and add to the sizeof handler
aSize = sizeof(*this);
Where is the bug?
Assignee | ||
Comment 164•24 years ago
|
||
Read "we add" instead of "we had". Twice. Time to sleep.
StyleContextImpl::SizeOf has nothing to do with the bloat logging done by
nsTraceRefcnt. That uses MOZ_COUNT_CTOR/MOZ_COUNT_DTOR or
NS_LOG_ADDREF/NS_LOG_RELEASE. You commented out (I think) the
NS_LOG_ADDREF/NS_LOG_RELEASE pair for the StyleContextData, and you didn't add
any MOZ_COUNT_CTOR/MOZ_COUNT_DTOR for it or for any of the other structs that
are now separately allocated.
Assignee | ||
Comment 166•24 years ago
|
||
Gasp! Correct. But at least the StyleContextData is counted as part of the
StyleContext (with a waste of 1 pointer per StyleContext). The blobs however are
*not* counted.
Assignee | ||
Comment 167•24 years ago
|
||
Assignee | ||
Comment 168•24 years ago
|
||
dbaron, please review. If my estimates are correct, and if the Blobs indeed take
3 to 4 Mb, the gains should be around 500K to 800K. It will be much more when we
address the performnce problems and re-enable StyleContext sharing.
Assignee | ||
Comment 169•24 years ago
|
||
David (Hyatt): I remember this discussion but you were then talking about
complementary mechanisms (level 1 cache & level 2 cache) and yesterday, you
described what Marc and I had done (the level 2 cache) as "moving in the wrong
direction, not addressing the fundamental problem, and working under a
fundamentally broken assumption". The change of tone made me assume that you
were off on something completely different. This being said, reducing the memory
footprint by 40% then by 60% doesn't seem to me as moving in the wrong direction.
Ok, too bad the two things are not enabled at the same time yet, but what kind of
savings do you think you will get, and how does it compare with the proposal I
sent last month?
I propose we get together tomorrow, put down ideas, list problems, define a
roadmap, and open a separate bug report or a thread on the newsgroups to allow
other people to participate. I really am excited about what you are working
on, I just don't see the antagonism with the work that has been done by different
persons over the past few years, and as often I believe that it is out of
exchanges and compromises that the best solutions can be found. Especially after
a good fight. Take your bat.
Assignee | ||
Comment 170•24 years ago
|
||
I fixed the nsTraceRefcnt but the tinderboxen still show a gain of 2Mb, sometimes
3Mb. I think all the structures are accounted for, if someone wants to double
check. Otherwise it's not 4Mb but I'll still take it.
An interesting test would be to get the bloat stats for StyleContext + StyleData
sharing. Any volunteer? Enable SHARE_STYLECONTEXTS in nsIStyleSet.h line #47
and recompile Content, Layout and Editor.
r=dbaron on attachment 33121 [details] [diff] [review], although technically you should have a
MOZ_DECL_CTOR_COUNTER for each one as well (but it's currently a no-op and I
don't think that's likely to change)
Assignee | ||
Comment 172•24 years ago
|
||
I checked in the MOZ_COUNT_CTOR yesterday (my apologies for this crass ignorance)
and as I wrote, the bloat seems to be 2 or 3Mb lower. Here is the data I
collected from the various tinderboxen:
The "before" numbers comes from before my checkin on 05/03 at 06:12.
The "after" numbers comes from after my checkin on 05/03 at 19:18.
-------------------------------------------------------
T-BOX OS BEFORE AFTER GAIN
-------------------------------------------------------
coffee linux 26 - 27 24 -(25) -2 (+)
shrike linux 29 -(30) (27)- 28 -2 (-)
speedracer sun 27 - 28 24 - 25 -3
bismark sun 29 26 - 27 -2 (+)
cement irix 31 29 -2
monkeypox linux 29 26 -3
muerte bsd 30 27 -3
nebiros sun 29 26 -3
senna linux 29 26 -3
-------------------------------------------------------
2Mb or 3Mb less bloat is enough of a step in the right direction.
Closed as Fixed.
The debate shall continue under bug 78961 and in the n.p.m.style newsgroup at
news://news.mozilla.org/3AF314D7.4C896990%40netscape.com
I would like to thank all the 34 people copied on this bug for their interest and
their participation, with special thanks to the QA folks for their dedication,
jrgm for his kindness, Brendan for his wisdom, and Waterson for his volunterism.
If you want to continue to follow the debate, don't forget to add your email to
the CC list under bug 78961.
Status: ASSIGNED → RESOLVED
Closed: 24 years ago
Resolution: --- → FIXED
Comment 173•24 years ago
|
||
I actualy does not see any mem usage improvements... For me it is now worse than
it was.
I can go very quickly to 70Mb mem used by mozilla after the check-in.
Can someone confirm?
PS On Win98.
Updated•22 years ago
|
Whiteboard: [whitebox]
You need to log in
before you can comment on or make changes to this bug.
Description
•