[DeviceStorage] Fix thread safety design issues with volume service

RESOLVED FIXED in Firefox 43

Status

()

Core
DOM: Device Interfaces
RESOLVED FIXED
3 years ago
2 years ago

People

(Reporter: aosmond, Assigned: aosmond)

Tracking

unspecified
FxOS-S5 (21Aug)
x86_64
Linux
Points:
---

Firefox Tracking Flags

(firefox43 fixed)

Details

(Whiteboard: [NG Device Storage], [backout-asap])

Attachments

(1 attachment, 2 obsolete attachments)

Volume service is designed to be accessed off the main thread, but there are problems with the current implementation.

1) When a volume is removed via nsVolumeService::RemoveVolumeByName, the actual removal of the pointer from mVolumeArray is not protected by the array lock.

2) When a volume is updated via nsVolumeService::UpdateVolume, clients of the service may already hold a pointer to the volume object being updated and be concurrently accessing its properties, including non-primitives such as nsString. Rather than update the object in place and risk indeterminate behaviour, the volume service should store a new volume object while holding the array lock. This way the client will have a consistent (if stale) state and may retry as appropriate if the operation fails as a result (this race condition should already be considered).

As a result, from the perspective of the volume service client, a volume object's properties are immutable.

Operations such as mount, unmount and format, may only be completed in the parent process, and thus are not a concern for web workers.
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Whiteboard: [NG Device Storage]
Created attachment 8612299 [details] [diff] [review]
bug1166320.patch, v1

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=43e096e8f6c9
Attachment #8612299 - Flags: review?(dhylands)
Comment on attachment 8612299 [details] [diff] [review]
bug1166320.patch, v1

Review of attachment 8612299 [details] [diff] [review]:
-----------------------------------------------------------------

Excellent.

::: dom/system/gonk/nsVolumeService.cpp
@@ +394,5 @@
> +  {
> +    MonitorAutoLock autoLock(mArrayMonitor);
> +    nsVolume::Array::index_type volIndex;
> +    nsRefPtr<nsVolume> vol = FindVolumeByName(aVolume->Name(), &volIndex);
> +    if (vol) {

nit: They seem to like early returns, so I would be inclined to change this to be:

> if (!vol) {
>   mVolumeArray.AppendElement(aVolume);
>   return;
> }

and then unindent the rest of this function.

@@ +495,5 @@
> +    nsRefPtr<nsVolume> vol = FindVolumeByName(aName, &volIndex);
> +    if (!vol) {
> +      return;
> +    }
> +    mVolumeArray.RemoveElementAt(volIndex);

good catch!
Attachment #8612299 - Flags: review?(dhylands) → review+
(In reply to Dave Hylands [:dhylands] from comment #2)
> ::: dom/system/gonk/nsVolumeService.cpp
> @@ +394,5 @@
> > +  {
> > +    MonitorAutoLock autoLock(mArrayMonitor);
> > +    nsVolume::Array::index_type volIndex;
> > +    nsRefPtr<nsVolume> vol = FindVolumeByName(aVolume->Name(), &volIndex);
> > +    if (vol) {
> 
> nit: They seem to like early returns, so I would be inclined to change this
> to be:
> 
> > if (!vol) {
> >   mVolumeArray.AppendElement(aVolume);
> >   return;
> > }
> 
> and then unindent the rest of this function.
>

Won't returning early there skip the observer notifications?
Created attachment 8614656 [details] [diff] [review]
bug1166320.patch, v2 [carries r=dhylands]

try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d636461e155c

While I couldn't return any earlier in the section noted in the review, I did manage to reduce the LOC / indentation with some minor shuffling.
Attachment #8612299 - Attachment is obsolete: true
Attachment #8614656 - Flags: review+

Comment 5

2 years ago
https://hg.mozilla.org/integration/b2g-inbound/rev/371c3a42b25c
https://hg.mozilla.org/mozilla-central/rev/371c3a42b25c
Status: ASSIGNED → RESOLVED
Last Resolved: 2 years ago
status-firefox41: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
This is causing a smoke test blocker bug 1171768. We will need this backed out.
QA Whiteboard: [QAnalyst-Triage+]
Whiteboard: [NG Device Storage] → [NG Device Storage], [backout-asap]
Depends on: 1171768
This was backed out in bug 1171768.
https://hg.mozilla.org/integration/b2g-inbound/rev/163b199605d9
Status: RESOLVED → REOPENED
status-firefox41: fixed → ---
Resolution: FIXED → ---
Target Milestone: mozilla41 → ---
The issue seems to be related to mount locks.

Volumes are managed on the IOThread and are the "master" for all of the information about a volume, except for mount locks.

nsVolumes are managed on the main thread are are the "master" for the mount lock information. This is why the generation number exists, because its possible for updates to be going from main-thread to IOThread for the mount lock information at the same time as volume state changes coming from Volume Manager on the IOThread to change the state.

The generation number increments each time the volume transitions to the MOUNTED state and the mount lock is only valid for a particular generation.

At startup, the VolumeManager marks the volume as "locked" and expects to get an update from the VolumeService to indicate that its really locked or not.

I suspect that this bit of logic got broken in this bug.
Andrew, could you hold off on landing this for 3 weeks? We have no confirmation of this, but it's possible that the patches here caused bug 1171827. Due to the difficulty in reproing that bug, it's difficult for us to verify this causality. It's also difficult to verify that this patch won't cause it again. Thanks for your understanding.
Created attachment 8649541 [details] [diff] [review]
bug1166320.patch, v3

Tested original patch, wouldn't mount my external sdcard on an aries. Tested without and with the new patch, would mount my sdcard.
Attachment #8614656 - Attachment is obsolete: true
Attachment #8649541 - Flags: review?(dhylands)
Target Milestone: --- → FxOS-S5 (21Aug)
Attachment #8649541 - Flags: review?(dhylands) → review+

Comment 12

2 years ago
https://hg.mozilla.org/integration/b2g-inbound/rev/db0ec7465b4a
https://hg.mozilla.org/mozilla-central/rev/db0ec7465b4a
Status: REOPENED → RESOLVED
Last Resolved: 2 years ago2 years ago
status-firefox43: --- → fixed
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.