Closed Bug 1166320 Opened 9 years ago Closed 9 years ago

[DeviceStorage] Fix thread safety design issues with volume service

Categories

(Core :: DOM: Device Interfaces, defect)

x86_64
Linux
defect
Not set
normal

Tracking

()

RESOLVED FIXED
FxOS-S5 (21Aug)
Tracking Status
firefox43 --- fixed

People

(Reporter: aosmond, Assigned: aosmond)

References

Details

(Whiteboard: [NG Device Storage], [backout-asap])

Attachments

(1 file, 2 obsolete files)

Volume service is designed to be accessed off the main thread, but there are problems with the current implementation.

1) When a volume is removed via nsVolumeService::RemoveVolumeByName, the actual removal of the pointer from mVolumeArray is not protected by the array lock.

2) When a volume is updated via nsVolumeService::UpdateVolume, clients of the service may already hold a pointer to the volume object being updated and be concurrently accessing its properties, including non-primitives such as nsString. Rather than update the object in place and risk indeterminate behaviour, the volume service should store a new volume object while holding the array lock. This way the client will have a consistent (if stale) state and may retry as appropriate if the operation fails as a result (this race condition should already be considered).

As a result, from the perspective of the volume service client, a volume object's properties are immutable.

Operations such as mount, unmount and format, may only be completed in the parent process, and thus are not a concern for web workers.
Assignee: nobody → aosmond
Status: NEW → ASSIGNED
Whiteboard: [NG Device Storage]
Attachment #8612299 - Flags: review?(dhylands)
Comment on attachment 8612299 [details] [diff] [review]
bug1166320.patch, v1

Review of attachment 8612299 [details] [diff] [review]:
-----------------------------------------------------------------

Excellent.

::: dom/system/gonk/nsVolumeService.cpp
@@ +394,5 @@
> +  {
> +    MonitorAutoLock autoLock(mArrayMonitor);
> +    nsVolume::Array::index_type volIndex;
> +    nsRefPtr<nsVolume> vol = FindVolumeByName(aVolume->Name(), &volIndex);
> +    if (vol) {

nit: They seem to like early returns, so I would be inclined to change this to be:

> if (!vol) {
>   mVolumeArray.AppendElement(aVolume);
>   return;
> }

and then unindent the rest of this function.

@@ +495,5 @@
> +    nsRefPtr<nsVolume> vol = FindVolumeByName(aName, &volIndex);
> +    if (!vol) {
> +      return;
> +    }
> +    mVolumeArray.RemoveElementAt(volIndex);

good catch!
Attachment #8612299 - Flags: review?(dhylands) → review+
(In reply to Dave Hylands [:dhylands] from comment #2)
> ::: dom/system/gonk/nsVolumeService.cpp
> @@ +394,5 @@
> > +  {
> > +    MonitorAutoLock autoLock(mArrayMonitor);
> > +    nsVolume::Array::index_type volIndex;
> > +    nsRefPtr<nsVolume> vol = FindVolumeByName(aVolume->Name(), &volIndex);
> > +    if (vol) {
> 
> nit: They seem to like early returns, so I would be inclined to change this
> to be:
> 
> > if (!vol) {
> >   mVolumeArray.AppendElement(aVolume);
> >   return;
> > }
> 
> and then unindent the rest of this function.
>

Won't returning early there skip the observer notifications?
try: https://treeherder.mozilla.org/#/jobs?repo=try&revision=d636461e155c

While I couldn't return any earlier in the section noted in the review, I did manage to reduce the LOC / indentation with some minor shuffling.
Attachment #8612299 - Attachment is obsolete: true
Attachment #8614656 - Flags: review+
https://hg.mozilla.org/mozilla-central/rev/371c3a42b25c
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
This is causing a smoke test blocker bug 1171768. We will need this backed out.
QA Whiteboard: [QAnalyst-Triage+]
Whiteboard: [NG Device Storage] → [NG Device Storage], [backout-asap]
This was backed out in bug 1171768.
https://hg.mozilla.org/integration/b2g-inbound/rev/163b199605d9
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla41 → ---
The issue seems to be related to mount locks.

Volumes are managed on the IOThread and are the "master" for all of the information about a volume, except for mount locks.

nsVolumes are managed on the main thread are are the "master" for the mount lock information. This is why the generation number exists, because its possible for updates to be going from main-thread to IOThread for the mount lock information at the same time as volume state changes coming from Volume Manager on the IOThread to change the state.

The generation number increments each time the volume transitions to the MOUNTED state and the mount lock is only valid for a particular generation.

At startup, the VolumeManager marks the volume as "locked" and expects to get an update from the VolumeService to indicate that its really locked or not.

I suspect that this bit of logic got broken in this bug.
Andrew, could you hold off on landing this for 3 weeks? We have no confirmation of this, but it's possible that the patches here caused bug 1171827. Due to the difficulty in reproing that bug, it's difficult for us to verify this causality. It's also difficult to verify that this patch won't cause it again. Thanks for your understanding.
Tested original patch, wouldn't mount my external sdcard on an aries. Tested without and with the new patch, would mount my sdcard.
Attachment #8614656 - Attachment is obsolete: true
Attachment #8649541 - Flags: review?(dhylands)
Target Milestone: --- → FxOS-S5 (21Aug)
Attachment #8649541 - Flags: review?(dhylands) → review+
https://hg.mozilla.org/mozilla-central/rev/db0ec7465b4a
Status: REOPENED → RESOLVED
Closed: 9 years ago9 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: