Closed
Bug 1360889
Opened 8 years ago
Closed 8 years ago
stylo: Add a custom Arc to use for style structs
Categories
(Core :: CSS Parsing and Computation, enhancement, P1)
Core
CSS Parsing and Computation
Tracking
()
RESOLVED
FIXED
People
(Reporter: bholley, Assigned: bholley)
References
Details
Attachments
(4 files)
46.87 KB,
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
49.24 KB,
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
38.60 KB,
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
2.64 KB,
patch
|
emilio
:
review+
|
Details | Diff | Splinter Review |
As discussed in bug 1360883, there are a few disadvantages of Arc for us:
* The weak refcount adds extra memory overhead (certainly on 32-bit), and we don't use it.
* We do several extra atomic RMU operations to handle the weak refcount.
* It doesn't have a free get_mut()-like API (i.e. one that's not RMU).
* We can't use a custom allocator, which makes it inappropriate for arenas, which we may want here.
I took a few minutes to prototype a forked version of Arc and it looks like it'll work fine. Need some more time to finish it up, but I should have patches here soon.
Comment 1•8 years ago
|
||
Do we really want a custom Arc? I mean, we only use make/get_mut dufing the cascade, right?
Seems to me that we could do something like:
```
enum StyleBuilderStruct<'a, T> {
Borrowed(&'a Arc<T>),
Owned(T),
}
impl<'a, T> StyleBuilderStruct<'a, T> {
pub fn make_mut(&mut self) -> &mut T {
if let StyleBuilderStruct::Borrowed(arc) = *self {
*self = StyleBuilderStruct::Owned((**arc).clone());
}
match *self {
StyleBuilderStruct::Owned(ref mut v) => v,
StyleBuilderStruct::Borrowed(..) => debug_unreachable!(),
}
}
pub fn build(self) -> Arc<T> {
match self {
StyleBuilderStruct::Borrowed(s) => s.clone(),
StyleBuilderStruct::Owned(s) => Arc::new(s),
}
}
}
```
That would avoid the initial cloning of the parent/default style structs, and make subsequent make_mut calls basically free.
Assignee | ||
Comment 2•8 years ago
|
||
But that would make StyleBuilder enormous (because it needs inline storage for 20 style structs), and trigger a bunch of extra memmoving, right? That seems really suboptimal.
I don't see what's so bad about forking arc. With the weak stuff stripped out, it's not much code at all.
Comment 3•8 years ago
|
||
> that would make StyleBuilder enormous
Is that bad, if it’s on the stack and never moves?
Assignee | ||
Comment 4•8 years ago
|
||
(In reply to Simon Sapin (:SimonSapin) from comment #3)
> > that would make StyleBuilder enormous
>
> Is that bad, if it’s on the stack and never moves?
It's certainly likely to cause a few more cache misses. Maybe that's ok and not as bad as I was imagining, but we can definitely do better.
Assignee | ||
Comment 5•8 years ago
|
||
Update: We decided to land emilio's approach in comment 1 over the weekend and build on that, which landed in [1]. We also landed [2] and [3] to significantly reduce allocations.
I just profiled obama:
* before: https://perfht.ml/2oU1fRY
* after: https://perfht.ml/2oYGJzC
The upshot is that we significantly reduced allocations, but significantly increased memmov traffic, which is roughly what I'd expect. The exact overhead of the cache misses is hard to measure, but to me the memmov traffic is enough justification to proceed with the custom Arc (also, it sounds like we will indeed need to do arenas). I'll try to get that finished up today.
[1] https://github.com/servo/servo/pull/16663
[2] https://github.com/servo/servo/pull/16661
[3] https://github.com/servo/servo/pull/16664
Comment 6•8 years ago
|
||
I think the biggest offender there now is the font-size stuff. I think we should fix that too.
Manish, you wrote in a comment that you need to cascade font-size regardless of whether it was specified to handle mathml's script min size and language changes.
Both of those are super-rare, is there a reason that shouldn't be guarded by the seen bitfield? Seems quite wasteful to cascade all the time unnecessarily.
Flags: needinfo?(manishearth)
Comment 7•8 years ago
|
||
(I believe that accounts for most of the memmove traffic, given the font struct is specially large)
Assignee | ||
Comment 8•8 years ago
|
||
Let's move the font-size discussion to bug 1361126.
Flags: needinfo?(manishearth)
Assignee | ||
Comment 9•8 years ago
|
||
This is a verbatim copy of the source at [1]. It won't compile on stable rust,
but I'm including it here to make the changes clear.
[1] https://doc.rust-lang.org/src/alloc/arc.rs.html
MozReview-Commit-ID: XEbOK4fXQX
Attachment #8863881 -
Flags: review?(emilio+bugs)
Assignee | ||
Comment 10•8 years ago
|
||
We remove most of the doc comments to minimize the number of lines
of forked code.
MozReview-Commit-ID: LehEisKxkJW
Attachment #8863882 -
Flags: review?(emilio+bugs)
Assignee | ||
Comment 11•8 years ago
|
||
MozReview-Commit-ID: flF0fv9E9M
Attachment #8863883 -
Flags: review?(emilio+bugs)
Assignee | ||
Comment 12•8 years ago
|
||
MozReview-Commit-ID: 8wSsYPEmYE4
Attachment #8863884 -
Flags: review?(emilio+bugs)
Assignee | ||
Comment 13•8 years ago
|
||
Assignee | ||
Comment 14•8 years ago
|
||
These patches improve cascade performance by about 25%. It appears to be mostly patch 4.
StyleArc itself improves release and drop performance for arcs by a couple of nanoseconds each.
Assignee | ||
Comment 15•8 years ago
|
||
here is a profile of the cascade on the obama testcase with these patches: https://perfht.ml/2p2K9RU . It's a bit noisy (a lot of which seems to depend on malloc performance), but this profile shows us a 22ms for the cascade, which is a lot better than the 35ms we were at last week.
Blocks: 1360878
Updated•8 years ago
|
Attachment #8863881 -
Flags: review?(emilio+bugs) → review+
Comment 16•8 years ago
|
||
Comment on attachment 8863882 [details] [diff] [review]
Part 2 - Strip down StyleArc to what we need. v1
Review of attachment 8863882 [details] [diff] [review]:
-----------------------------------------------------------------
It's worth perhaps to put it into a standalone crate?
::: servo/components/style/stylearc.rs
@@ +158,5 @@
> + // Note: in stable rust we can't use intrinsics::abort.
> + // Panic is good enough in practice here (it will trigger
> + // an abort at least in Gecko, and this case is degenerate
> + // enough that Servo shouldn't have code that triggers it).
> + panic!();
If you keep the panic! there's no need for unsafe {}. We can also call at ::std::process::abort, which is both stable and safe, as you prefer.
@@ -583,5 @@
> - // Use Acquire to ensure that we see any writes to `weak` that happen
> - // before release writes (i.e., decrements) to `strong`. Since we hold a
> - // weak count, there's no chance the ArcInner itself could be
> - // deallocated.
> - if this.inner().strong.compare_exchange(1, 0, Acquire, Relaxed).is_err() {
ugh, yeah, I see now why this would be way more slower than needed.
@@ +206,5 @@
> +impl<T: ?Sized> Arc<T> {
> + #[inline]
> + pub fn get_mut(this: &mut Self) -> Option<&mut T> {
> + // See make_mut() for documentation of memory ordering and threadsafety.
> + if this.inner().count.load(Relaxed) == 1 {
Can we keep this in is_unique()? Seemed a bit clear, even if it desugars to the same.
@@ +327,5 @@
> }
> }
>
> #[cfg(test)]
> mod tests {
Do we really want to keep the tests? We don't run them except on travis, perhaps it's a good time to run them from test-unit instead?
Otherwise, perhaps move the tests to tests/unit, assuming they don't rely on internals?
Attachment #8863882 -
Flags: review?(emilio+bugs) → review+
Updated•8 years ago
|
Attachment #8863883 -
Flags: review?(emilio+bugs) → review+
Comment 17•8 years ago
|
||
Comment on attachment 8863884 [details] [diff] [review]
Part 4 - Make StyleBuilder more efficient using stylearc. v1
Review of attachment 8863884 [details] [diff] [review]:
-----------------------------------------------------------------
I told you the StyleBuilder refactoring was going to be worth it regardless of this, hope this patch wasn't too unpleasant to write ;-)
::: servo/components/style/properties/properties.mako.rs
@@ +2024,5 @@
> pub enum StyleStructRef<'a, T: 'a> {
> /// A borrowed struct from the parent, for example, for inheriting style.
> Borrowed(&'a Arc<T>),
> /// An owned struct, that we've already mutated.
> + Owned(Arc<T>),
It'd be nice to compare this with and without StyleArc (that is, reverting part 3), to see what's the impact of the weak count on perf.
Attachment #8863884 -
Flags: review?(emilio+bugs) → review+
Assignee | ||
Comment 18•8 years ago
|
||
(In reply to Emilio Cobos Álvarez [:emilio] from comment #16)
> Comment on attachment 8863882 [details] [diff] [review]
> Part 2 - Strip down StyleArc to what we need. v1
>
> Review of attachment 8863882 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> It's worth perhaps to put it into a standalone crate?
We can eventually, but given that we might do arena stuff, I want to leave it all in the style creates for simplicity. Once things settle down, we can break out what makes sense.
>
> ::: servo/components/style/stylearc.rs
> @@ +158,5 @@
> > + // Note: in stable rust we can't use intrinsics::abort.
> > + // Panic is good enough in practice here (it will trigger
> > + // an abort at least in Gecko, and this case is degenerate
> > + // enough that Servo shouldn't have code that triggers it).
> > + panic!();
>
> If you keep the panic! there's no need for unsafe {}. We can also call at
> ::std::process::abort, which is both stable and safe, as you prefer.
It's only stable as of 1.17, which we don't require yet. I'll add a comment.
>
> @@ -583,5 @@
> > - // Use Acquire to ensure that we see any writes to `weak` that happen
> > - // before release writes (i.e., decrements) to `strong`. Since we hold a
> > - // weak count, there's no chance the ArcInner itself could be
> > - // deallocated.
> > - if this.inner().strong.compare_exchange(1, 0, Acquire, Relaxed).is_err() {
>
> ugh, yeah, I see now why this would be way more slower than needed.
>
> @@ +206,5 @@
> > +impl<T: ?Sized> Arc<T> {
> > + #[inline]
> > + pub fn get_mut(this: &mut Self) -> Option<&mut T> {
> > + // See make_mut() for documentation of memory ordering and threadsafety.
> > + if this.inner().count.load(Relaxed) == 1 {
>
> Can we keep this in is_unique()? Seemed a bit clear, even if it desugars to
> the same.
Done.
>
> @@ +327,5 @@
> > }
> > }
> >
> > #[cfg(test)]
> > mod tests {
>
> Do we really want to keep the tests? We don't run them except on travis,
> perhaps it's a good time to run them from test-unit instead?
Yeah, good catch. I'll just remove them.
>
> Otherwise, perhaps move the tests to tests/unit, assuming they don't rely on
> internals?
Assignee | ||
Comment 19•8 years ago
|
||
(In reply to Emilio Cobos Álvarez [:emilio] from comment #17)
> Comment on attachment 8863884 [details] [diff] [review]
> Part 4 - Make StyleBuilder more efficient using stylearc. v1
>
> Review of attachment 8863884 [details] [diff] [review]:
> -----------------------------------------------------------------
>
> I told you the StyleBuilder refactoring was going to be worth it regardless
> of this, hope this patch wasn't too unpleasant to write ;-)
Yeah totally. :-)
>
> ::: servo/components/style/properties/properties.mako.rs
> @@ +2024,5 @@
> > pub enum StyleStructRef<'a, T: 'a> {
> > /// A borrowed struct from the parent, for example, for inheriting style.
> > Borrowed(&'a Arc<T>),
> > /// An owned struct, that we've already mutated.
> > + Owned(Arc<T>),
>
> It'd be nice to compare this with and without StyleArc (that is, reverting
> part 3), to see what's the impact of the weak count on perf.
I did that. The numbers were definitely worse than with part 4, but it was a bit noisy and hard to tell how much they improved over std::sync::arc. Given my microbenchmarks (a few ns improvement on release and drop), my guess is that it isn't a huge difference.
A more interesting question is how much the performance would regress if we used the CAS get_mut() on std::sync::arc rather than the free get_mut() on stylearc::Arc. If we end up not doing arena allocation, and we at some point want to consider switching back to std::Arc, we could measure that.
Assignee | ||
Comment 20•8 years ago
|
||
Assignee | ||
Updated•8 years ago
|
Summary: stylo: Add a StyleArc to use for style structs → stylo: Add a custom Arc to use for style structs
Comment 21•8 years ago
|
||
I think this can be marked as fixed now, please reopen if it's not the case.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
You need to log in
before you can comment on or make changes to this bug.
Description
•