Closed Bug 1368922 Opened 7 years ago Closed 7 years ago

stylo: Figure out a way to set the dirty bit for DeclarationBlock thread-safety during parallel traversal

Categories

(Core :: CSS Parsing and Computation, enhancement, P2)

enhancement

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox57 --- fixed

People

(Reporter: hiro, Assigned: heycam)

References

Details

Attachments

(1 file, 1 obsolete file)

      No description provided.
Priority: -- → P2
Priority: P2 → --
Hiro, do we still need this?
Flags: needinfo?(hikezoe)
Yes, we might not need thread-safety actually, but current way is a hack.
Flags: needinfo?(hikezoe)
Priority: -- → P4
I'm missing context here. What is this bug about, and is it something we need to prioritize?
Flags: needinfo?(sphink)
The hazard reported here is:

Analyzing Gecko_UnsetDirtyStyleAttr ...
Error: Field write mozilla::DeclarationBlock.mIsDirty
Location: void mozilla::DeclarationBlock::UnsetDirty() @ /builds/worker/checkouts/gecko/obj-analyzed/dist/include/mozilla/DeclarationBlock.h#86
Stack Trace:
Gecko_UnsetDirtyStyleAttr @ /builds/worker/checkouts/gecko/layout/style/ServoBindings.cpp#467

So it appears to be writing to mIsDirty in a non-threadsafe manner. I don't know if Gecko_UnsetDirtyStyleAttr is called from multiple threads, but this is a basic unprotected heap write.
Flags: needinfo?(sphink)
Hm, it does seem like we call it during style resolution. Hiro, what makes this safe, and how can we improve it?
Flags: needinfo?(hikezoe)
Priority: P4 → P3
See bug 1361938 comment 59 and following for the whole context, where I pointed this out (and got basically ignored).

Seems like bug 1361938 comment 66 may explain why is safe(ish?). But then hiro pointed out in bug 1361938 comment 74 that "it's not thread-safe at all", though that's not clear why.

Basically, this is clearly not synchronized in any way. If there's something that makes this work, that is only the fact that we only care about that flag value for declarations that are unique... But this should be either improved, or the invariants asserted somehow.
Ok, I think we should sort this out.
Priority: P3 → P2
The reason why I think it's not thread safe is that we unconditionally unset the dirty bit, so if the condition for processing RESTYLE_STYLE_ATTRIBUTES differs from the condition for modifying the declaration block, that's the problem.  I think we should unset mIsDirty in flag DeclarationBlock::UnsetDitry() only if it's true.  Also if we have a way to know the declaration block is shared among other elements, we can add an assertion there.
Assignee: nobody → hikezoe
Status: NEW → ASSIGNED
Flags: needinfo?(hikezoe)
Cameron, do you know there is a way to tell a given declaration block is shared among other elements?
Flags: needinfo?(cam)
(In reply to Hiroyuki Ikezoe (:hiro) from comment #8)
> The reason why I think it's not thread safe is that we unconditionally unset
> the dirty bit, so if the condition for processing RESTYLE_STYLE_ATTRIBUTES
> differs from the condition for modifying the declaration block, that's the
> problem.  I think we should unset mIsDirty in flag
> DeclarationBlock::UnsetDitry() only if it's true.  Also if we have a way to
> know the declaration block is shared among other elements, we can add an
> assertion there.

How does this help? This is not a data race anyway, assuming boolean sets are atomic.
Oh? Setting a boolean value is an atomic operation? I did not know that. If so, we don't need to worrying about thread safety there, I think.
(In reply to Hiroyuki Ikezoe (:hiro) from comment #12)
> Oh? Setting a boolean value is an atomic operation? I did not know that. If
> so, we don't need to worrying about thread safety there, I think.

I'd say assuming. I'm not sure it's guaranteed at all, and it still seems super fishy.

The key is: If two elements share a style attribute, what prevents us from running that code once and make the other element not think that it has a "dirty" style attribute?
(In reply to Emilio Cobos Álvarez [:emilio] from comment #13)
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #12)
> > Oh? Setting a boolean value is an atomic operation? I did not know that. If
> > so, we don't need to worrying about thread safety there, I think.
> 
> I'd say assuming. I'm not sure it's guaranteed at all, and it still seems
> super fishy.
> 
> The key is: If two elements share a style attribute, what prevents us from
> running that code once and make the other element not think that it has a
> "dirty" style attribute?

Right, we have no way for now.

Actually when style attribute is change, we generate a new style attribute and mark it dirty, if the style attribute marked dirty and if it's shared for other elements, it's a bug, we will see broken animations just like bug 1361938. So such cases should not happen, but nothing guarantees it.  (Honestly I don't want introduce mutex lock there).
(In reply to Hiroyuki Ikezoe (:hiro) from comment #14)
> > The key is: If two elements share a style attribute, what prevents us from
> > running that code once and make the other element not think that it has a
> > "dirty" style attribute?
> 
> Right, we have no way for now.

What does Gecko do here?

(In reply to Hiroyuki Ikezoe (:hiro) from comment #10)
> Cameron, do you know there is a way to tell a given declaration block is
> shared among other elements?

The only thing you could look at is the refcount of the DeclarationBlock.  But you would have to be confident that rule nodes are the only things holding strong references to DeclarationBlocks, for that to be accurate.
Flags: needinfo?(cam)
(In reply to Cameron McCormack (:heycam) from comment #15)
> The only thing you could look at is the refcount of the DeclarationBlock. 
> But you would have to be confident that rule nodes are the only things
> holding strong references to DeclarationBlocks, for that to be accurate.

(And this is false for the style attribute of course, so...)
Assignee: hikezoe → cam
(In reply to Hiroyuki Ikezoe (:hiro) from comment #14)
> Actually when style attribute is change, we generate a new style attribute
> and mark it dirty, if the style attribute marked dirty and if it's shared
> for other elements, it's a bug, we will see broken animations just like bug
> 1361938. So such cases should not happen, but nothing guarantees it. 
> (Honestly I don't want introduce mutex lock there).

So just to confirm, for correctness we would only need to care about when mIsDirty is changed from true to false, if we are sharing that DeclarationBlock.  But as soon as we set mIsDirty to true on a shared DeclarationBlock, we'll clone it and get a unique one?  So the only problem we have here is the setting of mIsDirty to false from multiple threads.  In that case, let's just make that field an atomic.
(In reply to Cameron McCormack (:heycam) from comment #17)
> (In reply to Hiroyuki Ikezoe (:hiro) from comment #14)
> > Actually when style attribute is change, we generate a new style attribute
> > and mark it dirty, if the style attribute marked dirty and if it's shared
> > for other elements, it's a bug, we will see broken animations just like bug
> > 1361938. So such cases should not happen, but nothing guarantees it. 
> > (Honestly I don't want introduce mutex lock there).
> 
> So just to confirm, for correctness we would only need to care about when
> mIsDirty is changed from true to false, if we are sharing that
> DeclarationBlock.  But as soon as we set mIsDirty to true on a shared
> DeclarationBlock, we'll clone it and get a unique one?  So the only problem
> we have here is the setting of mIsDirty to false from multiple threads.  In
> that case, let's just make that field an atomic.

Right.  And if we could tell whether the DeclarationBlock which has the dirty flag is unique in the parallel traversal, it would be really nice, but I have no idea how to do it.
OK.  We could add a separate debug-only reference count for rule nodes that use the DeclarationBlock, but that's probably overkill.  So I think not adding an assertion should be OK.
Comment on attachment 8909624 [details]
Bug 1368922 - Set mIsDirty atomically.

https://reviewboard.mozilla.org/r/181104/#review186312

Thank you! This looks good to me, but I don't fully understand MemoryOrdering thing, (I guess MemoryOrdering::Relaxed should be OK, but I am not totally sure), so this patch shoule be reviewed by someone who is familiar with that stuff.
Attachment #8909624 - Flags: review?(hikezoe) → review+
Bobby, quick r? on the memory ordering I'm using here?
Flags: needinfo?(bobbyholley)
Attachment #8909048 - Attachment is obsolete: true
Attachment #8909048 - Flags: review?(cam)
(In reply to Cameron McCormack (:heycam) from comment #22)
> Bobby, quick r? on the memory ordering I'm using here?

Relaxed should be fine, if we just care about reading or writing to the flag itself.
So to confirm, the basic constraints here are:
* We want to be able to set the bit to false from the parallel traversal.
* Other threads might also set it to false during the parallel traversal, and we want it to be idempotent.
* No thread ever sets the bit to true during the parallel traversal.
* No thread reads the value during the parallel traversal.

If all that's the case, Relaxed should be fine. But can you please add comments in the code around the atomic? Having it in the commit message is less discoverable.
Flags: needinfo?(bobbyholley)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #24)
> So to confirm, the basic constraints here are:
> * We want to be able to set the bit to false from the parallel traversal.
> * Other threads might also set it to false during the parallel traversal,
> and we want it to be idempotent.
> * No thread ever sets the bit to true during the parallel traversal.
> * No thread reads the value during the parallel traversal.

Yes, that's right.

> If all that's the case, Relaxed should be fine.

I think my mental model of memory ordering is a little faulty.  Here is something I've been confused about.  When using Relaxed, what guarantees that, once we're back on the main thread, that we will see the false value that was written by one of the style worker threads?  Is it just that we're sure that the read will not happen simultaneously with the writes on the style worker threads, because we wait for the threads to be finished doing their work?  I can understand that with the timescales involved between the write and the read of the variable that the CPU isn't going to reorder the load after the store, but is that all we're relying on?
> <bholley> heycam: I believe the answer to that is that there's a SeqCst fence when context switching to/from the thread pool
We're sorry, Autoland could not rebase your commits for you automatically. Please manually rebase your commits and try again.

hg error in cmd: hg rebase -s 1aa023447008 -d b3619f0787cd: rebasing 421481:1aa023447008 "Bug 1368922 - Set mIsDirty atomically. r=hiro" (tip)
merging js/src/devtools/rootAnalysis/analyzeHeapWrites.js
warning: conflicts while merging js/src/devtools/rootAnalysis/analyzeHeapWrites.js! (edit, then use 'hg resolve --mark')
unresolved conflicts (see hg resolve, then hg rebase --continue)
(In reply to Cameron McCormack (:heycam) from comment #26)
> > <bholley> heycam: I believe the answer to that is that there's a SeqCst fence when context switching to/from the thread pool

Specifically, we use a condvar to switch to/from the rayon pool. And synchronization primitives like condvars and mutexes must have SeqCst fences, because they're used to protect critical regions in lock-based parallel programming where nobody thinks about stale caches.
https://hg.mozilla.org/mozilla-central/rev/1bd488e6c4d6
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla57
Thank you Cameron!
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: