AsyncPanZoomController::mState needs to be made threadsafe

NEW
Unassigned

Status

()

defect
P3
normal
6 years ago
a year ago

People

(Reporter: kats, Unassigned)

Tracking

(Blocks 1 bug, {arch})

25 Branch
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [gfx-noted])

Attachments

(3 attachments, 6 obsolete attachments)

2.39 KB, patch
Details | Diff | Splinter Review
29.08 KB, patch
Details | Diff | Splinter Review
60.82 KB, patch
Details | Diff | Splinter Review
The mState variable in APZC is accessed in all sorts of functions, some of which are run on the UI/controller thread and some of which are run on the compositor thread. And maybe some run on the gecko thread! In many places it is accessed without locking. Botond fixed one instance of this in bug 895904 but there are many more. This code needs to be made properly threadsafe, or enforce rules like "mState should only be accessed from thread X".
Assignee: nobody → ajones
Comment on attachment 782945 [details] [diff] [review]
Create a safer lock object

Although I'm not a fan of locks I decided to explore an idea of a safer lock mechanism. One where the data is inaccessible without a lock.

Similarly we need a way to ensure fields outside the lock are only accessed from a single thread.
Attachment #782945 - Flags: feedback?(bugmail.mozilla)
Comment on attachment 782945 [details] [diff] [review]
Create a safer lock object

Review of attachment 782945 [details] [diff] [review]:
-----------------------------------------------------------------

Oh neat, I like this! But I think it should live in someplace more general than gfx/layers. Also BenWa suggested something very similar while reviewing my patches recently. https://bugzilla.mozilla.org/show_bug.cgi?id=866232#c85
Attachment #782945 - Flags: feedback?(bugmail.mozilla)
Attachment #782945 - Flags: feedback?(bgirard)
Attachment #782945 - Flags: feedback+
Comment on attachment 782945 [details] [diff] [review]
Create a safer lock object

Review of attachment 782945 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. I agree with kats that this should be shared in gecko. Have you looked if there's prior art for this? Might be worth a check if someone has a better design for this.

Let me know what you think about the naming.

::: gfx/layers/ipc/LockObj.h
@@ +20,5 @@
> + * Usage:
> + *
> + * Declare your data to be locked like this:
> + *
> + *   LockObj<MyState> mStateLock;

LockObj to me is an object that performs the locking. I think it would be easier to read to have LockedObj<MyState>...

@@ +25,5 @@
> + *
> + * Access your data like this:
> + *
> + *   {
> + *     LockRef<MyState> state(mStateLock);

and have UnlockedRef<MyState>. Also I think it would be easier to read as:

UnlockRef<MyState> state = mStateLocked.unlock();

I think the wording is clearer this way: You have a 'locked' object that you can't use until you've properly 'unlocked' it.

Overall this is great. But here's a few limitations to this design:
1) It doesn't support re-entrancy.
2) It doesn't support multiple objects guarded by the same lock which we have for APZC.

1 can be solved by having a re-entrant variant.
2 can be solved by using a struct so maybe it's not really a problem. It wont work where a lock in one object is guarding data in another object but I think that's fine. It will force people to avoid these which is a better design.
Attachment #782945 - Flags: feedback?(bgirard) → feedback+
(In reply to Benoit Girard (:BenWa) from comment #4)
> I think the wording is clearer this way: You have a 'locked' object that you
> can't use until you've properly 'unlocked' it.

I'm not sure unlock() is the right name. We could call the type ExclusiveRef and the function to create it GetExclusiveRef(). It would look more like this:

auto excl = mLockedState.GetExclusiveRef();
printf("%d\n", excl->x);

> Overall this is great. But here's a few limitations to this design:
> 1) It doesn't support re-entrancy.

I was thinking along the lines of passing a LockRef object into functions that expect to be already locked. Using a re-entrant lock is simpler but less explicit. I don't feel strongly either way. I just wanted to put something up for discussion.

> 2) It doesn't support multiple objects guarded by the same lock which we
> have for APZC.

I was expecting to use a struct.
Comment on attachment 783570 [details] [diff] [review]
Fix locking inside APZC using LockObj;

This is a work in progress patch. I'd like feedback on whether this is a sensible way to go. It conflicts with https://bugzilla.mozilla.org/attachment.cgi?id=783192

There is a remaining systematic problem. Currently we lock to read PanZoomState then we lock again to update it. This could result in a lost update in between. We need to fix that as well.
Attachment #783570 - Flags: review?(bugmail.mozilla)
Attachment #783570 - Flags: review?(bgirard)
Comment on attachment 783570 [details] [diff] [review]
Fix locking inside APZC using LockObj;

Review of attachment 783570 [details] [diff] [review]:
-----------------------------------------------------------------

I have some misgivings about this approach but I'm not sure how valid they are. I think my main problem after looking at this patch is that it leans towards locking variables individually rather than locking the class as a whole. To me this seems like a potential footgun because different pieces of code running on different threads can change different variables concurrently, resulting in what could be an invalid state for the class as a whole.

The other problem this opens up is that if you have two variables that are wrapped in a LockObj and you create ExclRefs on them (in separate functions) in the opposite order, it's a classic deadlock scenario.

In order to get around the problem you described, where you have two locked accesses to the object in a function and the object can change in between, you basically have to hold the lock for the entire function. That makes the deadlock scenarios more likely because then you're more likely to hold locks on multiple variables across entire methods and you'd always have to ensure that all the locks are acquired in some consistent global ordering. In my mind going down this approach just leads to increasing complexity and may not really be worth it.
Attachment #783570 - Flags: review?(bugmail.mozilla) → review-
(That being said I do still think the LockObj class is pretty handy and is probably useful in other pieces of code, but maybe just not in something with as much state as APZC. Perhaps if we broke up APZC into smaller classes with more self-contained pieces this would work better. Also the variant where you have a wrapper around some state that ensures it is only accessed on a particular thread might still be useful here.)
This was supposed to be a feedback? not a review? so I apologise for that.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #12)
> I have some misgivings about this approach but I'm not sure how valid they
> are. I think my main problem after looking at this patch is that it leans
> towards locking variables individually rather than locking the class as a
> whole. To me this seems like a potential footgun because different pieces of
> code running on different threads can change different variables
> concurrently, resulting in what could be an invalid state for the class as a
> whole.

I agree that we shouldn't lock variables individually. We clearly don't want to do that. I don't see how LockObj or anything else I've done promotes that.

I was thinking along the lines of having a LockObj struct containing all the shared state and a ThreadObj struct for each thread. ThreadObj enforces access from a single thread. I haven't attached the patch yet.
Attachment #783570 - Flags: review?(bgirard) → feedback?(bgirard)
Comment on attachment 783570 [details] [diff] [review]
Fix locking inside APZC using LockObj;

Review of attachment 783570 [details] [diff] [review]:
-----------------------------------------------------------------

Ah that makes sense. In that case I'm fine with it.
Attachment #783570 - Flags: review- → feedback+
Comment on attachment 783564 [details] [diff] [review]
Remove Axis dependency on APZC;  in FrameMetrics rather than relying on APZC->GetFrameMetrics().

Review of attachment 783564 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/Axis.h
@@ +15,5 @@
>  
>  namespace mozilla {
>  namespace layers {
>  
> +class FrameMetrics;

I'm a bit conflicted by what I think of the changes in this file. Axis no longer depends on APZC which it shouldn't conceptually and depends on FrameMetrics which is a good thing. But we're cluttering many signature now with a FrameMetrics which is more confusing. I'm on the fence about this maybe towards yes. What do you think kats?
Attachment #783564 - Flags: review?(bgirard)
Attachment #783564 - Flags: review+
Attachment #783564 - Flags: feedback?(bugmail.mozilla)
Comment on attachment 783564 [details] [diff] [review]
Remove Axis dependency on APZC;  in FrameMetrics rather than relying on APZC->GetFrameMetrics().

Review of attachment 783564 [details] [diff] [review]:
-----------------------------------------------------------------

Yeah it is kinda cluttered. I also don't really have a strong opinion on this one way or another. If it makes future changes and/or scoping easier then go for it.
Attachment #783564 - Flags: feedback?(bugmail.mozilla)
Without this change we would have to either pass ExclRef everywhere (which would be worse) or use a re-entrant lock.

I agree that the clutter is annoying. The point of the patch is that we no longer have Axis calling back into APZC and assuming a locked state. I'd describe this as reduced coupling at the cost of increased noise.
Comment on attachment 783570 [details] [diff] [review]
Fix locking inside APZC using LockObj;

Review of attachment 783570 [details] [diff] [review]:
-----------------------------------------------------------------

::: gfx/layers/ipc/AsyncPanZoomController.cpp
@@ +325,3 @@
>      const MultiTouchInput& multiTouchInput = aEvent.AsMultiTouchInput();
>      if (multiTouchInput.mType == MultiTouchInput::MULTITOUCH_START) {
>        SetState(WAITING_LISTENERS);

I'm not convinced this is safe. We read above using a lock and in SetState we will grab a new lock and update. This mean that there's a time of use, time of check race here. I imagine that this problem is widespread throughout this patch. We need to think carefully about this. Tentatively f- unless I overlooked something.
Attachment #783570 - Flags: feedback?(bgirard) → feedback-
(In reply to Benoit Girard (:BenWa) from comment #19)
> I'm not convinced this is safe. We read above using a lock and in SetState
> we will grab a new lock and update. This mean that there's a time of use,
> time of check race here. I imagine that this problem is widespread
> throughout this patch. We need to think carefully about this. Tentatively f-
> unless I overlooked something.

I mentioned this problem in https://bugzilla.mozilla.org/show_bug.cgi?id=897017#c11 and I would prefer to deal with this in a separate patch on the same bug. I wanted to start with patch which is close to the existing behaviour before embarking on fixing the SetState() issue in order to make the patches easier to understand/review.
Ohh my bad. I guess it is a pre-existing problem.
Attachment #783570 - Flags: feedback- → feedback+
I can't close this bug without making mState threadsafe as this is actually the title of the bug. I'm just going to deal with it as a separate patch.
Comment on attachment 783563 [details] [diff] [review]
Create a safer lock object;

Review of attachment 783563 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good.

::: xpcom/glue/LockObj.h
@@ +11,5 @@
> +
> +namespace mozilla {
> +
> +/*
> + * LockObj a few advantages over a regular monitor

"has"

@@ +44,5 @@
> +    LockObj(const char* aName) : mMonitor(aName) { }
> +
> +  private:
> +    Monitor mMonitor;
> +    T mState;

We usually don't indent public:/private:.

@@ +50,5 @@
> +  friend class ExclRef<T>;
> +};
> +
> +template<typename T>
> +class ExclRef

How about we just call this Locked<T>?

I think we'll need to add Wait and Notify to Locked<T>, but we can do that in a separate patch.
Component: Graphics: Layers → Panning and Zooming
Attachment #783563 - Attachment is obsolete: true
Attachment #783563 - Flags: review?(roc)
Assignee: ajones → nobody
No longer blocks: apz-fennec
Keywords: arch
Whiteboard: [gfx-noted]
OS: Gonk (Firefox OS) → All
You need to log in before you can comment on or make changes to this bug.