Open Bug 553456 Opened 15 years ago Updated 2 months ago

Unified expandable data blocks for CSS declarations

Categories

(Core :: CSS Parsing and Computation, defect)

defect

Tracking

()

People

(Reporter: zwol, Unassigned)

References

(Blocks 1 open bug)

Details

Attachments

(10 files, 2 obsolete files)

24.43 KB, patch
Details | Diff | Splinter Review
248.36 KB, patch
Details | Diff | Splinter Review
45.12 KB, patch
Details | Diff | Splinter Review
3.26 KB, patch
Details | Diff | Splinter Review
10.63 KB, patch
Details | Diff | Splinter Review
33.15 KB, patch
Details | Diff | Splinter Review
59.55 KB, patch
Details | Diff | Splinter Review
24.24 KB, patch
Details | Diff | Splinter Review
104.42 KB, patch
Details | Diff | Splinter Review
63.52 KB, application/pdf
Details
Attached patch work in progress patch (obsolete) — Splinter Review
This in-progress patch merges nsCSSCompressedDataBlock, nsCSSExpandedDataBlock, and nsCSSDeclaration. The new "mozilla::css::Declaration" structure is stored compactly but expandably, as a vector of css::PropertyValue objects, which can hold pointers to any of the various types of root-level nsCSSValue object (nsCSSValue, nsCSSRect, nsCSSValuePair, nsCSSValueList, nsCSSValuePairList). This should dramatically cut down on specified-value copying. We still have the expanded structure used by nsRuleNode (nsRuleData). A downside is that all specified-value objects that make it into the declaration must be heap-allocated, and to avoid refcounting them, the semantics of css::PropertyValue's copy constructor are abnormal. Some of this could be mitigated by making the other four root-level specified-value objects be special units for nsCSSValue, then nsCSSValue could be directly embedded in PropertyValue, but I'd rather do that as a follow-up. I'm posting this now in case anyone wants to take an early look at it, but it doesn't work; tons of reftests fail (starting with reftest-sanity/test-async.html) and then it crashes. I think the problem is that we're missing EnsureMutable(), but I don't fully understand how style rules and declarations interact, and I'm out of brain for the day.
Attached patch WIP v2 (obsolete) — Splinter Review
I now have style rule updates working. There are still a ton of reftest failures, crashtest assertions (but I think no longer actual *crashes*), and if you just run the browser with this patch, the chrome doesn't look quite right. I could really use more advice - I think we're down to subtle issues with XUL, SVG, and tables at this point, and I don't (for instance) see how these changes in the generic code manage to break reftest-sanity/test-async.xul but not test-async.html, or what's going wrong with reftest-sanity/filter-[12].xhtml.
Attachment #433465 - Attachment is obsolete: true
Attachment #439431 - Flags: feedback?(dbaron)
Attachment #439431 - Flags: feedback?(bzbarsky)
Forgot to mention that the current patch may have a structural dependency on the patch in bug 559715.
So some of the copying was actually pretty critical to the algorithm: none of the values associated with property parsing should actually be committed to the declaration until the complete property parse is successful. How does your new approach handle that? The heap allocation seems disturbing to me -- like it could cost us much of the gain here (if not more). What would you think of an alternative approach in which we: * basically kept the current nsCSSCompressedDataBlock but threw out the expanded data block * keep the current tempdata/data splitting (the copying there should never involve ctors/dtors except for calling dtors on overwrite (which we need to do at some point anyway) and is fundamental to the algorithm) * parsed into a compressed data block (and manipulated them dynamically) by: * keeping a parallel nsCSSPropertySet to show what properties we have in mData and thus what we need to overwrite rather than append for * allowing compressed data blocks to *temporarily* have holes in them (including at the end), and then having a "compress" method that cleans up the holes * allocating compressed data blocks out of an arena so that during initial parse, they have a huge hole at the end, and we just give the extra space back to the arena This is just a brief brainstorm; I'm not particularly committed to it.
(In reply to comment #3) > So some of the copying was actually pretty critical to the algorithm: none of > the values associated with property parsing should actually be committed to the > declaration until the complete property parse is successful. How does your new > approach handle that? That's why the parser still has a mTempData member. We merge from there into the real declaration object only if the complete property parse is successful. If I understand things right, this is only necessary for shorthands, but I'm having enough trouble debugging this patch without adding extra wrinkles to the parser. > The heap allocation seems disturbing to me -- like it could cost us much of the > gain here (if not more). [...] I'd prefer to get this patch to the point where I can actually benchmark it before changing the approach so radically, but if the extra allocations do turn out to hurt, something like what you suggest seems like a plausible alternative (I'd probably try to put nsCSSValue objects directly into the expandable array, first - that wouldn't require quite as much surgery).
> If I understand things right, this is only necessary for shorthands, As implemented, as I recall. For example: color: red ! foopy; would end up in mTempData and then get tossed.
Comment on attachment 439431 [details] [diff] [review] WIP v2 I'm not sure there's additional feedback needed here until you benchmark. I'm sort of unhappy about nsCSSCompressedDataBlock going away, though. I guess we'll see whether your new approach is faster. (I sometimes don't know whether to mark a feedback? request + or -.)
Attachment #439431 - Flags: feedback?(dbaron) → feedback+
Comment on attachment 439431 [details] [diff] [review] WIP v2 The "feedback" I was looking for, here, was whether either you or bz could spot the bug that's making the patch-as-is screw up the UA stylesheet to the point where benchmarking is pointless. Maybe that's not how I should be using the flag. Anyway, I'm engaged in breaking up the patch into lots of tiny pieces and hopefully I'll be able to figure it out in the process.
Attachment #439431 - Flags: feedback?(bzbarsky)
Attachment #439431 - Flags: feedback+
My best guess given comment 7 is that something goes awry in the cloning (since that's something that we always do with UA.css). Can you reproduce the problem with this testcase: <link rel=stylesheet media=print href="something.css"> <link rel=stylesheet media=screen href="something.css"> (the two href values need to be equal) and some rules in something.css?
Depends on: 569719
Depends on: 576044
Here we go with a new patch series for this bug. This works at least as far as passing the layout/style mochitests and the reftests, but has not yet been benchmarked or verified to pass the full test suite. It depends on the work in bug 576044. This chunk just replaces the ad-hoc block of memory in an nsCSSCompressedDataBLock with an nsTArray.
Attachment #439431 - Attachment is obsolete: true
Attachment #465932 - Flags: review?(dbaron)
This makes shorthand and longhand properties' internal code numbers be values of two different enumerations, thus enforcing in the type system a bunch of constraints that formerly needed runtime assertions. It also pulls some of the "internal use only" cases out of ParseSingleValueProperty. The motivation for this change will be clearer when we get to part 8 and 9.
Attachment #465933 - Flags: review?(dbaron)
Now there are two TArrays in a CompressedDataBlock, one for the normal and one for the important properties. It might be better to mash the arrays together and put a flag bit on each entry (and then we wouldn't need mOrder over in css::Declaration, either) but that would want benchmarking. The rationale for this will perhaps be clearer when we get to part 7.
Attachment #465934 - Flags: review?(dbaron)
Now that data blocks hold TArrays, we can use their mutability to avoid compress/expand cycles. This does the easiest one of those, in Declaration::RemoveProperty. Note that I don't try to keep mStyleBits up to date over a remove - my impression is that mStyleBits is just an optimization so this is safe.
Attachment #465935 - Flags: review?(dbaron)
Next piece of avoiding compress/expand cycles: TryReplaceValue can now succeed all the time (so I renamed it TransferValue) and that means the ParseProperty parser entrypoint doesn't need to expand or compress.
Attachment #465936 - Flags: review?(dbaron)
With a little tweaking to the TransferValue interface, it can be used for regular parsing as well, which means we no longer need expand/compress at all. (Expand could have gone away in the previous patch but I thought it would be clearer to take them out at the same time.) Also, the parser can now put successfully-parsed properties directly into the declaration object, it doesn't need the mData intermediate.
Attachment #465937 - Flags: review?(dbaron)
The Declaration object is more-or-less just a wrapper around a single nsCSSCompressedDataBlock object at this point; squash them together.
Attachment #465938 - Flags: review?(dbaron)
With a little reorganization, the CSS parser doesn't need mTempData at all anymore when parsing longhand properties.
Attachment #465939 - Flags: review?(dbaron)
This final patch avoids the need for mTempData when parsing shorthand properties; and that's the last use of nsCSSExpandedDataBlock, so that too can go away. This is less robust than I would ideally like; the parser is now relying extensively on the order of entries in the SubpropertyEntryFor() arrays, and is not validating them as carefully as it could. But anything else I could think of seemed unwieldy, and we already do have extensive dependencies on the order of those arrays, so maybe it's okay. Note also the use of alloca, which appears to be a first in layout - but we really don't want to be allocating this stuff from the heap, or initializing ~30 nsCSSValue objects when we only need three.
Attachment #465941 - Flags: review?(dbaron)
(In reply to comment #17) > Note also the use of alloca, which > appears to be a first in layout - but we really don't want to be allocating > this stuff from the heap, or initializing ~30 nsCSSValue objects when we only > need three. How about using nsAutoTArray?
Attached file benchmark results
I promised benchmark results. They're a little disappointing. Overall, the patch series appears to be a win, but the initial patches have significant cost which is only barely made back later; also, time to parse colors didn't go back down, which suggests that there's something wrong, there. Also it occurred to me in the shower yesterday that parsing declaration blocks is now quadratic in the number of longhand properties. It might have been already, because of the order vector, but it's probably worse now. And I should probably be microbenchmarking MapRuleInfoInto as well as the parser - can anyone suggest how to do that? So I think probably, for the next beta, we want bug 576044 but not this bug, and I'm not promising to do anything more with this bug anytime soon, although I might if I get an idea.
Attachment #466127 - Attachment mime type: application/empty → application/pdf
So, based on perhaps slightly-too-brief examination of the patches, I think I like the idea of patches 3, 4, 5, 7, 8, and 9, and dislike the idea of patches 1 and 6, and sort of neutral about patch 2. I think for patches 1 and 6, we're going to need to make CSS declarations order-preserving. (That will, in turn, allow removing the directional-shorthand stuff.) And I tend to think there's value in using stack scratch buffers and exact-sized heap buffers. In patch 2, the idea of separate types is nice, but I might have kept the enum values separate and tested IsShorthand with >=. And I feel like there should be a simpler way to do what you've done in patches 8 and 9. (In fact, I'd almost be tempted to do it while preserving an mTempData object exposing the same API that nsCSSExpandedDataBlock does, but handling its storage an entirely different way. We could essentially just parse into the relevant single value or alloca-allocated array of values.) But I should look into this in more detail at some point...
(Sorry for delay in responding, I was traveling.) Please keep in mind when evaluating these changes that the principal goal is to remove copies, and the secondary goal is to remove the giant ExpandedDataBlock objects from nsCSSParser (so that Parser objects are cheap to create, and we don't need to cache them, and can dust off the interface cleanup in an obsolete patch in bug 523496). This is why I was trying very hard not to have scratch buffers on the stack. I think we are stuck with one copy of each nsCSSValue for correctness, so that .foo { property: value; property: syntax error; } behaves correctly, but I want very much to have only one copy. I like the idea of making CSS declarations order-preserving (some of the stuff in here goes in that direction, in fact - I think if we merged the important and normal arrays, we'd be 99% there) and I think it would make some of the stuff in this patch less hairy, but I'm not sure I see any practical way to do it while keeping the ExpandedDataBlock interface. Could you expand a little on your observation that this would eliminate the directional-shorthand stuff? I don't really understand that stuff (you're talking about the hidden -source properties and related, right?) and don't immediately see what it has to do with order preservation. Patches 8 and 9 are big and hairy but I think the state of the code after they go in is superior to the way it was before -except- for the reliance on order within arrays, and possibly also the use of alloca(). If you have thoughts on how to get away from those I'm all ears. (In particular, I am actively in favor of passing the destination storage area as an argument to all value-parsing functions, shorthand or longhand, rather than having some of them write directly to storage in the parser object.) Here's an off-the-wall idea for avoiding reallocations in the common case: If we did bug 541496, we could prescan each declaration block and count all the semicolons outside comments, then preallocate space for (semicolons + 1) property-value pairs. We'd still want resizable blocks for DOM manipulation and not wasting space when some of the properties get discarded, though, I think. I do have time to work on this stuff in the next couple months, if you don't mind providing some more concrete guidance on how you want it to look long-term. (I still live in MV if you'd like to discuss it in person some time.)
> I do have time to work on this stuff in the next couple months, if you > don't mind providing some more concrete guidance on how you want it to > look long-term. (I still live in MV if you'd like to discuss it in > person some time.) Sorry for taking so long here; in any case, I think the key design points I'd like to see satisfied are: * taking the same space in compressed form than the current storage (note that the current storage has changed to store the properties and values separately, since the properties compress better) (Note that we can consider properties of the allocator when doing this.) * preserving order of properties within a data block * being reasonably efficient to both parse and (less important) mutate (probably no worse than today at least for common cases) Some more vague thoughts on how to achieve this, but that I'm less tied to: * I don't think removing copies during parsing should be a goal in itself; that said, I could imagine reducing the mTempData -> mData -> datablock process from two copies to one, but this doesn't feel critical to me, and seems like it's probably more trouble than it's worth (at least based on the possible solutions I can think of right now). On the other hand, eliminating the use of nsCSSExpandedDataBlock for this seems worthwhile, as you did in patches 8 and 9 above. But I'm not especially keen on what patch 9 does to the way shorthand parsing code has to be written; I suspect it could be done by reusing the code for the next point. * It seems more valuable to me to reduce the need for expand/compress cycles during mutation. Perhaps this could be achieved by something like: + the parser parses into something that's TArray-like or an actual TArray (i.e., has a bunch of memory overhead both due to the property/value pairing and the double-on-fault allocation) + have a mechanism to turn this TArray-like thing into a compressed block or to created a new compressed block based on an existing compressed block and this TArray. (I think in all cases this would mean *appending* the TArray... which implies removing anything already in the block that's also in the TArray, and then adding new stuff to the end of the block representing what's in the TArray.) I'd hope that this could be done in a relatively straightforward way with a single allocation (which could be a realloc() if it's dealing with an existing block). + have a mechanism (like your patch 4) to handle RemoveProperty, which I think should be the only remaining mutation case. Hopefully such a mechanism would be fast enough that it would avoid the need for bz's special fast paths for single property setters, though perhaps the setters will still be enough faster that they're still worth leaving. I'm going to clear the existing review requests on these patches; I don't think I'd want to take them as they are or in a form close to that, and I'm trying to get my review queue to reflect reality. I'm really sorry about letting this drag on so long, though.
OS: Linux → All
Hardware: x86 → All
One further note: Bugzilla has a great new feature (needinfo? requests) that allows directing bug comments at a specific person (rather than bugmail-to-the-wind, as always) and putting those requests in that person's request queue. I'm considerably more reachable via needinfo? than via other forms of bugmail (other than review?, which it's on par with).
Thanks for the feedback. It is still my intention to work on this, but I may not get back to it for several more months, and I want to finish up my remaining scanner patches first (bug 413958 and bug 543151). I don't have any reaction to your comments at this stage; when I do get back to this, I'll probably try doing it just as you suggested, and see if I hit any snags.
If anyone wants to tackle this or related bugs, http://hg.mozilla.org/users/zackw_panix.com/patches/ has incomplete patches that may be useful.
Resetting owner to default per Zack's request.
Assignee: zackw → nobody
Status: ASSIGNED → NEW
Severity: normal → S3
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: