Closed Bug 1112989 Opened 5 years ago Closed 5 years ago

sync PContent::GetVolumes() takes too much time (200+ms)

Categories

(Core :: DOM: Device Interfaces, defect)

All
Gonk (Firefox OS)
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla38
Tracking Status
firefox36 --- wontfix
firefox37 --- wontfix
firefox38 --- fixed
b2g-v2.2 --- fixed
b2g-master --- fixed

People

(Reporter: kanru, Assigned: kanru)

References

Details

Attachments

(4 files, 3 obsolete files)

From bug 1102715 we found that the sync IPC message PContent::GetVolumes takes too much time and blocks app startup. The call was async but changed to sync in 1063877. We may have to change it back to async (we should generally avoid sync messages anyway..)
Please don't change it back to async until you can eliminate the races that were causing issues by it being async in the first place.
In particular, being async caused the crash in bug 1062541 under stress tests.

I have never personally observed the crash in bug 1062451 in any of the test that we perform locally, but CAF's stress test were able to trigger it.

So tread carefully.
Actually, we could probably use a hybrid approach.

Re-introduce the async method.

When device storage goes to get the volume information (which is unfortunately synchronous), if the async method hasn't yet made the information available, then fallback to using the sync call.

This should give us the async performance, and the sync reliability when the system is under stress and the async method hasn't yet provided the needed information.

I'd like to add, that all of this is only needed so that the child can get the initial state of the volumes. Once the initial volume state is established, everything can be async from then on.

I believe that there may have also been some interactions with Nuwa, where the initial volume state being retrieved was that of when the Nuwa process was launched and not what was in effect when the child process was launched. Since Nuwa is "frozen" it never sees the async updates. When Nuwa launches it gets a copy of the volume data from the parent process, and since the process of populating this data is itself async, so we need to distinguish this case (nuwa initialized data) from data acquired by the child once a child is changed into a real child, instead of being some suspended ghostly image of the parent.

What's important is that the child seems the volume state in effect at the time the child becomes active (i.e. knows its a real child process).

Going to this hybrid approach will mean that the child also has to deal with async updates that occur before the initial state is retrieved (deal with in this case means ignore - not crash).
Since DeviceStorage API itself is async, could  we use async GetVolumes and ReturnVolumes methods so the content process could do something while the B2G process is retrieving the latest information?
Can we send all the volume information that the child needs synchronously as part of the startup message? (PBrowser::LoadURL)
As a part of LoadURL() is a simple solution, but we don't always need that.  And, we need a permission checking?!
(In reply to Thinker Li [:sinker] from comment #6)
> As a part of LoadURL() is a simple solution, but we don't always need that. 
> And, we need a permission checking?!

In the parent, before we send LoadURL, we should be able to tell if the new app will be allowed to use devicestorage, right? We just wouldn't send the info if it doesn't have permission.
Although all DeviceStorage APIs are async, navigator.getDeviceStorages is sync so we need to have all the volumes info ready. Send volumes info along with LoadURL seems a feasible solution.
Attached patch Experimental patch (obsolete) — Splinter Review
I tried to preload volumes info right after preallocated process is created to mimic the effect of sending volumes info earlier.

Before:
https://people.mozilla.org/~bgirard/cleopatra/#report=a1c8de3a1d5c83b70f1581675947a442e659a233&search=GetVolumes&selection=0,1,110,138,496,80,56,497,499,504,56,506,507,56,511,512,515,525,527

After:
https://people.mozilla.org/~bgirard/cleopatra/#report=0fbda95801d9af013a9431df3c7efb2b73bcf970&search=GetVolumes

GetVolumes disappeared completely while hasPendingMessage still takes a lot of time.
Attached patch WIP patch (obsolete) — Splinter Review
Forward known information after initialize ContentParent. Not sure what's the best way to check permissions yet.

@dhylands, it looks like device-storage has defined several permissions, do you know what's the best way to check them?
Attachment #8555051 - Flags: feedback?(dhylands)
Comment on attachment 8555051 [details] [diff] [review]
WIP patch

Review of attachment 8555051 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.cpp
@@ +1341,5 @@
> +//         "device-storage:crashes",
> +//         "device-storage:pictures",
> +//         "device-storage:videos",
> +//         "device-storage:music",
> +//         "device-storage:sdcard"

We only need to look at the volume based permissions, which are the last 4 (pictures, videos, music, and sdcard)

I modifed my ds-test app (dev_apps/ds-test) and even with no special permissions, it can call getDeviceStorages and get the list of storages. It can't really do anything with them (like look at any files, create new files, etc.)

And I didn't see any permissions checks in the code for getDeviceStorages either.

So, maybe we should just send all of the volumes all of the time? Normally there will be 1 or 2.
Attachment #8555051 - Flags: feedback?(dhylands) → feedback+
(In reply to Dave Hylands [:dhylands] from comment #11)
> Comment on attachment 8555051 [details] [diff] [review]
> WIP patch
> 
> Review of attachment 8555051 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: dom/ipc/ContentParent.cpp
> @@ +1341,5 @@
> > +//         "device-storage:crashes",
> > +//         "device-storage:pictures",
> > +//         "device-storage:videos",
> > +//         "device-storage:music",
> > +//         "device-storage:sdcard"
> 
> We only need to look at the volume based permissions, which are the last 4
> (pictures, videos, music, and sdcard)
> 
> I modifed my ds-test app (dev_apps/ds-test) and even with no special
> permissions, it can call getDeviceStorages and get the list of storages. It
> can't really do anything with them (like look at any files, create new
> files, etc.)
> 
> And I didn't see any permissions checks in the code for getDeviceStorages
> either.
> 
> So, maybe we should just send all of the volumes all of the time? Normally
> there will be 1 or 2.

Thanks. So the permission check isn't a concern, great!
Fabrice, do you remember what we do for permission checks in DeviceStorage? See comment 11? I could have sworn we had some somewhere...
Flags: needinfo?(fabrice)
Before:
https://people.mozilla.org/~bgirard/cleopatra/#report=cc2f297537c63d1505e3e49b948fe3b9feb7b04d&filter=[{"type"%3A"RangeSampleFilter","start"%3A1479,"end"%3A2144}]&selection=0,1059,1096,1136,1426,8,21,1427,1430,1436,21,1437,1438,21,1440,1441,1444,1461,1465,4

After:
https://people.mozilla.org/~bgirard/cleopatra/#report=95427facb5a72d445cb5475a6e50ba85c99848c3&filter=[{"type"%3A"RangeSampleFilter","start"%3A1440,"end"%3A2210}]&selection=0,1036,1092,1139,1472,9,10,1473,1475,1487,10,1488,1489,10,1490,1495,1497,10,1507

Note that GetVolumes and HasPendingMessage does not always block content but when they do they could take 100ms+
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #13)
> Fabrice, do you remember what we do for permission checks in DeviceStorage?
> See comment 11? I could have sworn we had some somewhere...

Looks like device-storage doesn't prevent you to see the volumes but use them.
Attachment #8554480 - Attachment is obsolete: true
Attachment #8555051 - Attachment is obsolete: true
Attachment #8555676 - Flags: review?(dhylands)
(In reply to ben turner [:bent] (use the needinfo? flag!) from comment #13)
> Fabrice, do you remember what we do for permission checks in DeviceStorage?
> See comment 11? I could have sworn we had some somewhere...

The permission checks are done on all of the device storage APIs, but not done on navigator.getDeviceStorage or navigator.getDeviceStorages
Comment on attachment 8555676 [details] [diff] [review]
Part 2. Forward volumes info to ContentChild before app startup

Review of attachment 8555676 [details] [diff] [review]:
-----------------------------------------------------------------

I think that setting mGotVolumesFromParent too early is going to affect your testing, so not giving this r+ yet.

::: dom/ipc/ContentChild.cpp
@@ +89,5 @@
>  #include "nsThreadManager.h"
>  #include "nsAnonymousTemporaryFile.h"
>  #include "nsISpellChecker.h"
>  #include "nsClipboardProxy.h"
> +#include "nsVolumeService.h"

I think that you may need this inside a #ifdef MOZ_WIDGET_GONK

This applies to the pretty much this entire patchset, since the volumes stuff is currently only available under Gonk

@@ +2141,5 @@
>  bool
> +ContentChild::RecvVolumes(nsTArray<VolumeInfo>&& aVolumes)
> +{
> +    nsRefPtr<nsVolumeService> vs = nsVolumeService::GetSingleton();
> +    vs->RecvVolumesFromParent(aVolumes);

This should probably check that vs is non-null

::: dom/system/gonk/nsVolumeService.cpp
@@ +317,5 @@
> +  if (mGotVolumesFromParent) {
> +    // We've already done this, no need to do it again.
> +    return;
> +  }
> +  mGotVolumesFromParent = true;

Isn't this too early?

It seems that when RecvVolumesFromParent is called in a few lines that it will always return doing nothing.
Attachment #8555676 - Flags: review?(dhylands)
Comment on attachment 8555677 [details] [diff] [review]
Part 3. Remove sync PContent::GetVolumes

Review of attachment 8555677 [details] [diff] [review]:
-----------------------------------------------------------------

This patch looks fine :)
Attachment #8555677 - Flags: review?(dhylands) → review+
(In reply to Dave Hylands [:dhylands] from comment #20)
> Comment on attachment 8555676 [details] [diff] [review]
> Part 2. Forward volumes info to ContentChild before app startup
> 
> Review of attachment 8555676 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> I think that setting mGotVolumesFromParent too early is going to affect your
> testing, so not giving this r+ yet.
> 
> ::: dom/ipc/ContentChild.cpp
> @@ +89,5 @@
> >  #include "nsThreadManager.h"
> >  #include "nsAnonymousTemporaryFile.h"
> >  #include "nsISpellChecker.h"
> >  #include "nsClipboardProxy.h"
> > +#include "nsVolumeService.h"
> 
> I think that you may need this inside a #ifdef MOZ_WIDGET_GONK
> 
> This applies to the pretty much this entire patchset, since the volumes
> stuff is currently only available under Gonk
> 
> @@ +2141,5 @@
> >  bool
> > +ContentChild::RecvVolumes(nsTArray<VolumeInfo>&& aVolumes)
> > +{
> > +    nsRefPtr<nsVolumeService> vs = nsVolumeService::GetSingleton();
> > +    vs->RecvVolumesFromParent(aVolumes);
> 
> This should probably check that vs is non-null
> 
> ::: dom/system/gonk/nsVolumeService.cpp
> @@ +317,5 @@
> > +  if (mGotVolumesFromParent) {
> > +    // We've already done this, no need to do it again.
> > +    return;
> > +  }
> > +  mGotVolumesFromParent = true;
> 
> Isn't this too early?
> 
> It seems that when RecvVolumesFromParent is called in a few lines that it
> will always return doing nothing.

Ah, yes. This function was not really used and removed in part 3 so I missed that.
Flags: needinfo?(fabrice)
Comment on attachment 8555675 [details] [diff] [review]
Part 1. Forward know info to ContentChild before app startup

Review of attachment 8555675 [details] [diff] [review]:
-----------------------------------------------------------------

::: dom/ipc/ContentParent.h
@@ +409,5 @@
>      virtual ~ContentParent();
>  
>      void Init();
>  
> +    // Some information could be sent to content very early, they

nit: s/they/it
Attachment #8555675 - Flags: review?(fabrice) → review+
Assignee: nobody → kchen
Attachment #8555676 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #8556367 - Flags: review?(dhylands)
This bug is referenced in bug 1086963, which is about improving the startup time of the Gallery app.  So I'll mention that in my testing, my Gaia-level code that checks the availability of all the device storage volumes takes 200ms the first time I run the app after pushing it to the phone with adb. I assume it takes that long as well after rebooting the phone.  But on subsequent runs of the Gallery app, the same code runs in 10 to 15ms.

It may still be worth optimizing, but it seems worth mentioning that the 200ms referenced in the bug title is the worst-case time, not the normal time.
Comment on attachment 8556367 [details] [diff] [review]
Part 2. Forward volumes info to ContentChild before app startup

Review of attachment 8556367 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good now.
Attachment #8556367 - Flags: review?(dhylands) → review+
Comment on attachment 8556367 [details] [diff] [review]
Part 2. Forward volumes info to ContentChild before app startup

[Approval Request Comment]
Bug caused by (feature/regressing bug #): N/A
User impact if declined: slower app-start-up time when apps use device-storage the first time
Testing completed: patches has been stuck on m-c for a while
Risk to taking this patch (and alternatives if risky): apps may fail to enumerate device-storages if the patches go wrong but it's easily detectable
String or UUID changes made by this patch: N/A
Attachment #8556367 - Flags: approval-mozilla-b2g37?
Attachment #8556367 - Flags: approval-mozilla-b2g37? → approval-mozilla-b2g37+
Specifically, RecvVolumes in this commit...
   https://hg.mozilla.org/releases/mozilla-b2g37_v2_2/rev/fe71f56782ec
...has a different type on its parameter vs. the function it's trying to overload -- 
'const nsTArray<VolumeInfo> &' vs 'nsTArray<VolumeInfo> &&'.

This OS X clang build-log has a slightly-clearer error message than the GCC build log from comment 31 -- it mentions this specific comparison ('const nsTArray<VolumeInfo> &' vs 'nsTArray<VolumeInfo> &&'):
https://treeherder.mozilla.org/logviewer.html#?job_id=40317&repo=mozilla-b2g37_v2_2

I'm guessing the parent signature comes from this line added to the .ipdl, in the same commit:
> Volumes(VolumeInfo[] volumes);

We must have changed how .ipdl generates function-signatures since mozilla-b2g37_v2_2 branched, I guess? And this patch needs fixing to account for that in order to be backported.
Flags: needinfo?(kchen)
Flags: needinfo?(kchen)
Keywords: checkin-needed
You need to log in before you can comment on or make changes to this bug.