[Buri][Settings]Bad performance with some SD card

RESOLVED FIXED in Firefox 23

Status

defect
P1
normal
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: sync-1, Assigned: dhylands)

Tracking

unspecified
1.0.1 IOT1 (10may)
ARM
Gonk (Firefox OS)

Firefox Tracking Flags

(blocking-b2g:tef+, firefox21 wontfix, firefox22 wontfix, firefox23 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 fixed)

Details

(Whiteboard: [madrid][fixed-in-birch] QARegressExclude c=)

Attachments

(3 attachments, 1 obsolete attachment)

Reporter

Description

6 years ago
Firefox os  v1.0.1
 Mozilla build ID: 20130310070203.
 
 
 +++ This bug was initially created as a clone of Bug #427323 +++
 
 DEFECT DESCRIPTION:
 (1)Insert one 8G SD card with a lot of files inside
 (2)Switch on the phone, and go to settings, then try to go back to homescreen
 (3)the home screen will be shown after about 10 seconds. -NOK1
 (4)And go back to settings, try to check the "Media storage" menu, the details size(Music, Pictures, Movies,etc)can not be shown -- NOK2
 (5)Need to wait for several minutes, to see the details come out.
 (6)And every time, if you kill the settings APP and launch it again, it will cause the Phone low performance again.
 
 more info:
 About 17000 files in the SD card, scanning time is about: 2'40"(B2G process will occupy all the CPU during this time)
 
 REPRODUCING PROCEDURES:
 
 EXPECTED BEHAVIOUR:
 
 ASSOCIATE SPECIFICATION:
 
 TEST PLAN REFERENCE:
 
 TOOLS AND PLATFORMS USED:
 
 USER IMPACT:
 serious
 
 REPRODUCING RATE:
 100%
 
 For FT PR, Please list reference mobile's behavior:
 
 ++++++++++ end of initial bug #427323 description ++++++++++

Updated

6 years ago
blocking-b2g: --- → tef?

Comment 1

6 years ago
Posted file strace log

Comment 2

6 years ago
Some comments from TCL...


Expected behavior:
=================

Whatever the content of the SD-Card, we probably can't expect the end-user to wait more than 5 or 10 seconds to get the data he/she is looking for. And that's even more true when there is no UI feedback to show the handset is actually computing some values.

Also, the handset is never supposed to become that slow / unresponsive, especially after putting the application into background.




Defect investigation:
====================


1. On the SD-Card side:
======================

The SD-Card content is roughly as follow:
- 5,000 pictures
- 500 audio files
- a few videos
- more than 10,000 files of cached data coming from misc. Android applications

While we probably prefer the end-user to start using B2G with a clean SD-Card, there is no guarantee this will happen for everybody, and we still need to provide an acceptable experience when the SD-Card has been previously carried around different devices.


Possible improvement:
--------------------
Reduce the number of files to consider.

- The easiest solution would be to limit the depth of recursion. Typically a depth of 3 seems fine to store regular pictures and music. E.g. "/music/<artist>/<album>/...".
- Possibly, one could also ignore any hidden directory (e.g. starting by ".", like ".thumbnails"), although there is no directory of this kind on our problematic SD-Card...
- We could also ignore everything which is in directories called "cache". There were thousands of files in such directories, on our SD-Card. However, this solution is probably too risky, and doesn't even work in every language...

In our case, the first solution would divide the number of files by 3 and, more importantly, it would help to focus on useful content.



2. On Gecko side:
----------------

Check attached file strace.txt

Most of the CPU time is occupied by the b2g process, with 3 threads fighting to get it: "StreamTrans #xx". At the bottom of these 3 threads, the same function: "UsedSpaceFileEvent()", a function used to get the space occupied by all the files of a certain type. One thread for each type: "pictures", "music", "video".

Sadly these 3 threads do nearly the same thing at the same time: travel the SD-Card recursively, searching for files matching their type, and getting their size. A single thread doing a single travel (but for all the file types !), would be nearly as efficient as any of these 3 threads.

Getting the file size is quite a fastidious operation too. On Linux, a single call to stat64() is enough to determine if a directory entry is a directory, a link or a regular file, and get its size. But in Gecko, due to OS abstraction, 3 or 4 calls to stat64() are required. Multiply these 3 calls to stat64() by 3 threads, you'll find we are currently doing 9 or 10 calls to stat64() when a single would be enough to get everything we need...

However... flash blocks are cached, and an analyze with strace shows that what costs the most is not calling 10 times the same function (10 x 60us), but reading a new block of flash (sometimes several ms). As a proof of that, one can observe that the first time the Settings app needs 2'40" to show the result, but the following times (when inodes and dir. entries are already cached) the duration falls down to 0'30". And if I drop inodes and dir. entries at kernel level ("echo 2 > /proc/sys/vm/drop_caches"), then the duration goes back to 2'40"... Unfortunately, the time spent loading inodes and dir. entries depends on the number of files we have, and the number of directories we are parsing.


Possible improvements:
- Avoid calling 4 times the same function (stat64) with the same argument. Or, try to make the mCachedStat variable (in nsLocalFileUnix.cpp) actually useful so that only one system call is made.
- Create a single thread to process a storage unit. This thread would collect data for all the known file types. And the different app requests for different file types would just have to wait for this thread to complete its job.
- Slow down the requests, i.e. add some sleep in the threads, to make sure other other apps can run smoothly even if heavy work is happening in background. Sadly changing the CPU policy when the Settings app goes to background has no effect here, since everything happens in the b2g process...
- Reduce the number of files to consider, as mentioned before.




3. On Gaia side:
---------------

Strangely, checking the size of "music", "pictures" and "videos" is not even needed to display the first page of the Settings app. Only the free space is required, and can be obtained very quickly by some other way (statfs()).


Possible solutions:
- Do not ask the size by file type until the user enters the detail panel for each storage location.
- Cancel the storage size requests in case the application goes to background, and when the detail panel is exited.
- Remove the feature. I.e. do not display details by file type, just the free space and used space, which can be retrieved quickly (and accurately, c.f. #853714) by statfs().
qawanted to understand how big the problem is. for example on the performance of 500 pictures/50 audio files.

partner is now reporting with 
- 5,000 pictures
- 500 audio files
About 17000 files in the SD card, scanning time is about: 2'40"(B2G process will occupy all the CPU during this time)
Keywords: qawanted
Doug may have some thoughts about DeviceStorage's performance here.
I think we need a better understanding of what we're trying to measure here.

There are not only the various media types, length of video, etc... there is also different SDcard speeds: 
https://www.sdcard.org/developers/overview/speed_class/ 

and different hardware.
Also to note, Gallery, Music, Video handles some things differently.  There has been discussions in regards to this.  Maybe djf could present some info as well?
QA Contact: nhirata.bugzilla
This may make sense to compare against an android 4.0 device and see if our timings are similar or not.? 

Since stat64() is a linear process, wouldn't the time to complete just be half if we halve the number of files on the sd card.. If we have 17k files at 2'40, we have 8.5k files at 1'20, and we can just keep halving that until we have a number we find acceptable. 

I believe the choices provided earlier still stand: Possible solutions:

- Do not ask the size by file type until the user enters the detail panel for each storage location.

- Cancel the storage size requests in case the application goes to background, and when the detail panel is exited. <--- I think this should be a requirement in this case. ---->

- Remove the feature. I.e. do not display details by file type, just the free space and used space, which can be retrieved quickly (and accurately, c.f. #853714) by statfs().
I think for feature parity with android we will want to keep this functionality. 

- possibly extending mediadb to handle this type of request more effectively ? 
ex  DeviceStorageHelper.getStat(['music','video','pictures'], function(size) {
      var melement = document.getElementById('music-space');
      var pelement = document.getElementById('pictures-space');
      var velement = document.getElementById('video-space');
      formatSize(melement, size[0]);
      formatSize(velement, size[1]);
      formatSize(pelement, size[2]);
    });
Last part is just fictional, as the api currently doesn't support a call like that, but might be able to expose an api which made this more performant.
Keywords: qawanted
QA Contact: nhirata.bugzilla → dclarke
As some of the suggestions in comment 7 imply missing functionality, I would like to get input from product team on that possibility and Kaze about technical feasibility.
Flags: needinfo?(ffos-product)
Flags: needinfo?(kaze)
(In reply to dclarke@mozilla.com  [:onecyrenus] from comment #7)
> 
> - Do not ask the size by file type until the user enters the detail panel
> for each storage location.
> 
> - Cancel the storage size requests in case the application goes to
> background, and when the detail panel is exited. <--- I think this should be
> a requirement in this case. ---->

What sort of effort and risk is involved with these potential approaches?
What is the status here? We need a decision asap?
Flags: needinfo?(dclarke)
Doug, can you provide some input on comment 10?
Flags: needinfo?(ffos-product) → needinfo?(doug.turner)
The settings app (./js/media_storage.js) is doing some dumb things.  Basically for every type, it looks like we try to calculate the total space used.  If we edit the app, and restart, it looks like we will do it again -- while the last calculation is still running.  This data isn't stored anywhere, so it is pretty easy to get into a state where we just flood the IO thread with request to calculate disk size.  Pretty awesome.

We can patch this at the work week if it's tef+.  Otherwise, we can fix it up later.
Assignee: nobody → doug.turner
Flags: needinfo?(doug.turner)
blocking-b2g: tef? → tef+
Per discussion in triage session today we need to block on this, but we don't necessarily need to be able to show thumbnails for the large images that are causing problems here. IOW, we could show an image with an x throug it or what not and avoid generating a thumbnail for such images. That's just one idea, but we need something to here to keep the gallery app working with sd cards with large images on it.
Whiteboard: [status: needs workaround, real fix post 1.0.1]
(In reply to Johnny Stenback (:jst, jst@mozilla.com) from comment #14)
> Per discussion in triage session today we need to block on this, but we
> don't necessarily need to be able to show thumbnails for the large images
> that are causing problems here. IOW, we could show an image with an x throug
> it or what not and avoid generating a thumbnail for such images. That's just
> one idea, but we need something to here to keep the gallery app working with
> sd cards with large images on it.

Don't get the relationship with the thumbnails here, can you please clarify?
Flags: needinfo?(jst)
We have performance issues when we have to display thumbnails for a large number of files, because we first need to decode them into their full resolution.  If we were to display a generic image for these files, instead of the actual preview of the image, we expect to get to better performance numbers.  We don't have the data to back that up though, but it seems a reasonable assumption, unless Comment 13 describes all that is wrong here.
The problem I think this bug was talking about was that we stat *all* files on the SD card when the settings app (Media Storage) is opened.  This may take a very long time -- much worse on slow cards, of course.

Comments 14, 15, and 16 talk about a different issue -- that we are slow because of image decoding.  I do not think that this bug should track image decoding performance issues, but could be wrong.  jst, could you comment?  (clearing other needinfo flags for now)
Flags: needinfo?(kaze)
Flags: needinfo?(dclarke)
Comment 14 was of course not meant for this bug (it was for bug 861965). Sorry for the confusion.
Flags: needinfo?(jst)
Whiteboard: [status: needs workaround, real fix post 1.0.1]
Whiteboard: [madrid]
Target Milestone: --- → Madrid (19apr)
(In reply to Doug Turner (:dougt) from comment #17)
> The problem I think this bug was talking about was that we stat *all* files
> on the SD card when the settings app (Media Storage) is opened.  This may
> take a very long time -- much worse on slow cards, of course.
> 
> Comments 14, 15, and 16 talk about a different issue -- that we are slow
> because of image decoding.  I do not think that this bug should track image
> decoding performance issues, but could be wrong.  jst, could you comment? 
> (clearing other needinfo flags for now)

Do you need a Gaia folks to help you on this? (if this is related to Gaia code).
Whiteboard: [madrid] → [madrid][status: needs new patch]
bah.  looks like jdm found a bug in this patch.  I pushed to try:
   https://tbpl.mozilla.org/?tree=Try&rev=782fb2e08184
(In reply to Doug Turner (:dougt) from comment #22)
> I think this should work
>    https://tbpl.mozilla.org/?tree=Try&rev=2b7be07d76b2

Are the few test failures here okay?
Flags: needinfo?(sync-1)
Attachment #739528 - Flags: review?(dhylands)
no test failures that are related.
Flags: needinfo?(sync-1)
Target Milestone: 1.0.1 Madrid (19apr) → 1.0.1 IOT1 (10may)
Assignee

Comment 25

6 years ago
Comment on attachment 739528 [details] [diff] [review]
patch v.1

Review of attachment 739528 [details] [diff] [review]:
-----------------------------------------------------------------

Mostly nits.

Although it looks like you broke the usedSpace for sdcard/app storage types. So you need to add a 4th number to your cache (totalUsed).

r=me with this addressed.

::: dom/devicestorage/DeviceStorageRequestParent.cpp
@@ +532,5 @@
>  nsresult
>  DeviceStorageRequestParent::UsedSpaceFileEvent::CancelableRun()
>  {
>    NS_ASSERTION(!NS_IsMainThread(), "Wrong thread!");
> +  

nit: trailing whitespace (numerous places)

@@ +561,5 @@
> +    r = new PostUsedSpaceResultEvent(mParent, mFile->mStorageType, videoUsage);
> +  }
> +  else if (mFile->mStorageType.EqualsLiteral(DEVICESTORAGE_MUSIC)) {
> +    r = new PostUsedSpaceResultEvent(mParent, mFile->mStorageType, musicUsage);
> +  }

What about other storage types? (sdcard/apps)

@@ +565,5 @@
> +  }
> +  
> +  if (r) {
> +    NS_DispatchToMainThread(r);
> +  }

It seems like it should be an error if r is null. Perhaps assert?

::: dom/devicestorage/nsDeviceStorage.cpp
@@ +67,5 @@
>  
>  nsAutoPtr<DeviceStorageTypeChecker> DeviceStorageTypeChecker::sDeviceStorageTypeChecker;
>  
> +// this needs to be threadsafe.
> +nsAutoPtr<DeviceStorageUsedSpaceCache> DeviceStorageUsedSpaceCache::sDeviceStorageUsedSpaceCache;

This should use StaticAutoPtr from xpcom/base/StaticPtr.h

@@ +94,5 @@
> +  }
> +
> +  NS_ASSERTION(NS_IsMainThread(), "Wrong thread!");
> +
> +  DeviceStorageUsedSpaceCache* result = new DeviceStorageUsedSpaceCache();

Not sure why you added result and didn't just assign sDeviceStorgeUsedSpaceCache directly...

@@ +147,5 @@
> +}
> +
> +void
> +DeviceStorageUsedSpaceCache::SetUsedSizeForType(const nsAString& aType,
> +                                                uint64_t usedSize) {

I think you should just pass in all 3 (or 4) sizes, rather than calling this for each size.

@@ +302,5 @@
> +  nsAutoString extensionMatch;
> +  extensionMatch.AssignLiteral("*");
> +  extensionMatch.Append(Substring(path, dotIdx));
> +  extensionMatch.AppendLiteral(";");
> +  

nit: trailing whitespace

@@ +746,5 @@
> +
> +  if (NS_FAILED(rv) || !e) {
> +    return;
> +  }
> +  

nit: Whole whack of trailing whitespace in this function.

@@ +801,5 @@
> +        *aVideosSoFar += size;
> +      }
> +      else if (type.EqualsLiteral(DEVICESTORAGE_MUSIC)) {
> +        *aMusicSoFar += size;
> +      }

This should also compute a totalUsed space for sdcard storage type.

@@ +811,1 @@
>  DeviceStorageFile::DirectoryDiskUsage(nsIFile* aFile, uint64_t* aSoFar, const nsAString& aStorageType)

Seems to me that this function is now redundant and should be removed?

Although that got me thinking, what happens when you call UsedSpace when the storageType is sdcard?

@@ +1712,5 @@
> +      r = new PostResultEvent(mRequest, videoUsage);
> +    }
> +    else if (mFile->mStorageType.EqualsLiteral(DEVICESTORAGE_MUSIC)) {
> +      r = new PostResultEvent(mRequest, musicUsage);
> +    }

What happened to diskUsage for sdcard or app type? It looks like that's broken after this change.

::: dom/devicestorage/nsDeviceStorage.h
@@ +63,5 @@
> +
> +  class InvalidateRunnable MOZ_FINAL : public nsRunnable
> +  {
> +    public:
> +      InvalidateRunnable(DeviceStorageUsedSpaceCache *aCache) {

nit: space on wrong side of *

@@ +68,5 @@
> +        mOwner = aCache;
> +      }
> +
> +      ~InvalidateRunnable() {}
> +    

nit: trailing whitespace

@@ +93,5 @@
> +    mIOThread->Dispatch(aRunnable, NS_DISPATCH_NORMAL);
> +  }
> +
> +  nsresult GetUsedSizeForType(const nsAString& aStorageType, uint64_t* usedSize);
> +  void SetUsedSizeForType(const nsAString& aStorageType, uint64_t usedSize);

There seems to be some inconsistency in the usage of uint64_t versus int64_t.

For example: PostFree/UsedSpaceResultEvent takes an in64_t and DirectoryDiskUsage takes int64_t ptrs.

@@ +102,5 @@
> +  nsresult GetPicturesUsedSize(uint64_t* usedSize);
> +  nsresult GetMusicUsedSize(uint64_t* usedSize);
> +  nsresult GetVideosUsedSize(uint64_t* usedSize);
> +
> +  bool mPicturesUsedSizeValid;

So it seems to me that having 3 separate valid flags is redundant. Since all 3 values are set at the same time, wouldn't it make more sense to just have one valid flag and one set routine that sets all 3 values?
Attachment #739528 - Flags: review?(dhylands) → review+
Posted patch patch v.2Splinter Review
enough changes here.  another review will not hurt.
Attachment #739528 - Attachment is obsolete: true
Attachment #742910 - Flags: review?(dhylands)
What kind of testing would we like QA to perform once this lands? SD card scanning new files from different apps, ...

30KB patch :(
Also, does this apply cleanly to b2g18 already?
Assignee

Comment 29

6 years ago
Comment on attachment 742910 [details] [diff] [review]
patch v.2

Review of attachment 742910 [details] [diff] [review]:
-----------------------------------------------------------------

After applying the patch, if I run my test app, I get no response to:

storage = navigator.getDeviceStorage('sdcard');
req = storage.usedSpace();

neither req.onerror nor req.onsuccess fires.

::: dom/devicestorage/DeviceStorageRequestParent.cpp
@@ +527,5 @@
> +    NS_DispatchToMainThread(r);
> +    return NS_OK;
> +  }
> +
> +  uint64_t picturesUsage, videosUsage, musicUsage = 0;

Wrong (this is the bug jdm pointed out)

uint64_t picturesUsage = videosUsage = musicUsage = 0;

@@ +541,5 @@
> +  else if (mFile->mStorageType.EqualsLiteral(DEVICESTORAGE_MUSIC)) {
> +    r = new PostUsedSpaceResultEvent(mParent, mFile->mStorageType, musicUsage);
> +  }
> +  else {
> +    NS_WARNING("Unknown storage type");

So I still don't understand how this works for sdcard and apps.

GetUsedSizeForType will fail for storageType of sdcard or apps, which means it will eventually get to this line.

Or am I missing something?
Attachment #742910 - Flags: review?(dhylands) → review-
Assignee

Comment 30

6 years ago
With these patches, I'd give you an r+
Attachment #743400 - Flags: review?(doug.turner)

Updated

6 years ago
Whiteboard: [madrid][status: needs new patch] → [madrid][status: needs new patch] u=user c=performance
Assignee: doug.turner → dhylands
Whiteboard: [madrid][status: needs new patch] u=user c=performance → [madrid][status: awaiting review] u=user c=performance
Attachment #743400 - Flags: review?(doug.turner) → review+
Assignee

Comment 31

6 years ago
I'll go ahead and merge the two patches and land them on the various branches...
Assignee

Comment 32

6 years ago
I had to tweak the patchsets for each branch...

https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/3f3489356bbc
https://hg.mozilla.org/releases/mozilla-b2g18/rev/326914cf630c
https://hg.mozilla.org/projects/birch/rev/b2770cf6dcbd
Whiteboard: [madrid][status: awaiting review] u=user c=performance → [madrid][fixed-in-birch] u=user c=performance
https://hg.mozilla.org/mozilla-central/rev/b2770cf6dcbd
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
We don't have access to the correct SD Card. Marking as QARegressExclude.
Whiteboard: [madrid][fixed-in-birch] u=user c=performance → [madrid][fixed-in-birch] u=user c=performance, QARegressExclude

Updated

6 years ago
Whiteboard: [madrid][fixed-in-birch] u=user c=performance, QARegressExclude → [madrid][fixed-in-birch] QARegressExclude c=
You need to log in before you can comment on or make changes to this bug.