Closed Bug 853350 Opened 11 years ago Closed 11 years ago

[Buri][Device Storage]Phone lost most functions when /data is full

Categories

(Firefox OS Graveyard :: General, defect, P1)

ARM
Gonk (Firefox OS)
defect

Tracking

(blocking-b2g:tef+, firefox21 wontfix, firefox22 wontfix, firefox23 fixed, b2g18 fixed, b2g18-v1.0.0 wontfix, b2g18-v1.0.1 verified)

RESOLVED FIXED
1.0.1 IOT1 (10may)
blocking-b2g tef+
Tracking Status
firefox21 --- wontfix
firefox22 --- wontfix
firefox23 --- fixed
b2g18 --- fixed
b2g18-v1.0.0 --- wontfix
b2g18-v1.0.1 --- verified

People

(Reporter: sync-1, Assigned: fabrice)

References

Details

(Keywords: late-l10n, Whiteboard: [madrid])

Attachments

(2 files, 7 obsolete files)

Firefox os  v1.0.1
 Mozilla build ID: 20130310070203.
 
 +++ This bug was initially created as a clone of Bug #426784 +++
 
 DEFECT DESCRIPTION:
 (1)Try to make the Device Storage full, such as:
 import 800 facebook friends: 40M
 Download some maps form here maps:40M
 Scan 500 music from SD card: 20M
 Or try to download some applications.
 
 (2)The mobile will not show any warning messages for the memory full --- NOK1
 (3)When try to start some applications, such as video player, emails, calendar,The apps will not be openned , but not show anything. And no any warnings are provided ---- NOK2
 (4)For Settings, you can not do any thange, it can not take effect.
 (5)Browser will not work at all, no page can be opened.
 (6)Received SMS, can not see the detail, show empty 
 
 In general, most functions can not work in this case. And not warning to end user. And also , no means for end user to clean the user data easily. --NOK3
 
 REPRODUCING PROCEDURES:
 
 EXPECTED BEHAVIOUR:
 
 ASSOCIATE SPECIFICATION:
 
 TEST PLAN REFERENCE:
 
 TOOLS AND PLATFORMS USED:
 
 USER IMPACT:
 serious
 
 REPRODUCING RATE:
 100%
 For FT PR, Please list reference mobile's behavior:
 
 ++++++++++ end of initial bug #426784 description ++++++++++
blocking-b2g: --- → tef?
Even worse case, is that, the mobile can not enter to homescreen for ever, when Data partitaion is only 44K free.
(tef+ as we need a way to recover from a full /data, or to prevent this from happening in the first place)
blocking-b2g: tef? → tef+
Jonas, didn't we have a meta bug on handling low storage space situations in the past?  Maybe Doug remembers?
We have bug 847720 for writing tests for this.

Clarifying summary since the /data partition is separate from the partition accessed through the DeviceStorage API, which generally is the /sdcard partition.
Summary: [Buri][Device Storage]Phone lost most functions when the Device Storage is full → [Buri][Device Storage]Phone lost most functions when /data is full
First, maybe we should define what the criteria is. I mean, when the warning message should be popped up? ex. when the available space is less than X% or X MB.
As a first step here, having the system app monitory available space on /data and put up a warning when it goes below X%.

Ideally we could even have a persistent warning in the notification center, along with an icon on the toolbar, as long as free disk space is below X%.

However we would need to also provide users with a reasonable description of what to do. For now the best thing we have is "clear private data" in the browser as well as uninstalling 3rd party applications.
adding UX into CC
(In reply to Jonas Sicking (:sicking) from comment #6)
> As a first step here, having the system app monitory available space on
> /data and put up a warning when it goes below X%.
> 
> Ideally we could even have a persistent warning in the notification center,
> along with an icon on the toolbar, as long as free disk space is below X%.
> 
> However we would need to also provide users with a reasonable description of
> what to do. For now the best thing we have is "clear private data" in the
> browser as well as uninstalling 3rd party applications.

Very good point. Thanks! 

So, what would be the scope for this tef+ case? Do we want to provide a full functionality(ex. to provide options/actions for users to choose, or even allows users to define the criteria) or we just need to do something automatically and make sure the phone can continue working?
Flags: needinfo?(jcarpenter)
For v1.0.1 I suggest we do the least work possible to ensure the device continues to function without *any* user involvement.  New UX at this point means more l10n strings and we are generally out of time for that.  We can afford to do more for v1.1, but for v1.0.1 we just need to avoid the device turning into a brick.

Do we know what parts of Gecko can cause unbounded /data growth?
Fabrice, how could we manage this at the minimum as described by Michael ?
Flags: needinfo?(fabrice)
(In reply to khu from comment #8)
> (In reply to Jonas Sicking (:sicking) from comment #6)
> > As a first step here, having the system app monitory available space on
> > /data and put up a warning when it goes below X%.
> > 
> > Ideally we could even have a persistent warning in the notification center,
> > along with an icon on the toolbar, as long as free disk space is below X%.

This sounds reasonable to me.

> > 
> > However we would need to also provide users with a reasonable description of
> > what to do. For now the best thing we have is "clear private data" in the
> > browser as well as uninstalling 3rd party applications.

A quick visit to the Settings -> Device storage panel shows that we don't show how the device storage is being used and by which applications in the same way that we do for Media storage.    I think that this is important in that it gives the user an idea of what is consuming internal phone storage and help them decide what they can clear off to free up space.   Without this, I can see that users could incorrectly clear images and videos to free up space which won't solve the low on-board storage situation because that media is stored separately onto media storage/sd-card.

The other issue is that it may not be very clear for some applications how to reduce the usage of internal storage. If for example we have a heavy email user where the email app is consuming a majority of the available internal storage, what can the user do to reduce the storage requirement for that app?   Delete email?   Change synchronization settings?   I think it would be useful to identify which apps may be at risk of utilizing extreme amounts of storage and decide what the workarounds for the user might be.

For now, the UX team will work on coming up with a low-space alert and notification UX as already suggested, but I recommend that we also work on as an upgrade to this, the ability to within device storage display per-app data usage.   I understand this was the original intention and that it was deferred for V1.
Flags: needinfo?(jcarpenter)
A warning is a good idea and should be implemented regardless (assuming there's time for the new strings, per Michael's comment above). However we should also remove the footgun. What can we do to automatically prevent device storage from filling to the point of massive functional breakage? Is it possible to impose an artificial floor on the amount of device storage available to the user?
We can probably keep some margin of available space (for instance the webapps install/update code keeps 5MB of free space), but that does not solve the root issue. What happens when you reach this floor, and how can we help the user?

The http cache could be flushed, but it's not on the /data partition.

One thing we could do is to flush the private data of 3rd party apps. It's not ideal but I don't see any other short term solution with no UI involved.
Flags: needinfo?(fabrice)
(In reply to Fabrice Desré [:fabrice] from comment #13)
> 
> One thing we could do is to flush the private data of 3rd party apps. It's
> not ideal but I don't see any other short term solution with no UI involved.

Actually that won't solve anything if the space is used by the core apps or by chrome datastores (like contacts and sms db).

What I think we can do is broadcast a "low-storage-space" system message when we are near a disk full situation. Apps could then take appropriate action. For instance, the browser app could flush its history database.
(In reply to Fabrice Desré [:fabrice] from comment #14)
> (In reply to Fabrice Desré [:fabrice] from comment #13)
> > 
> > One thing we could do is to flush the private data of 3rd party apps. It's
> > not ideal but I don't see any other short term solution with no UI involved.
> 
> Actually that won't solve anything if the space is used by the core apps or
> by chrome datastores (like contacts and sms db).
> 
> What I think we can do is broadcast a "low-storage-space" system message
> when we are near a disk full situation. Apps could then take appropriate
> action. For instance, the browser app could flush its history database.

That seems to make sense. Email could also jettison oldest stored messages.

I still think we need a warning to the user, though. 

So that's three measures we need to take, in order of severity:

1) Remove the footgun by installing artificial storage floor, preventing catastrophic app and system errors
2) Warn the user when they are approaching that artificial floor (requires new strings and logic)
3) Automatically mitigate by broadcasting "low storage space" system message so individual apps can flush themselves.
Flagging as UX Most Wanted, given severity of the problem.
Whiteboard: u=user c=keyboard s=ux-most-wanted
Some comments from TCL:

- One shall always assume the worst from end-users and 3rd-party apps. Surely it is nice to inform them about the low-storage situation. Surely some of them may be smart enough to help improve the situation. But, since there is no guarantee everybody will always behave as we wish, we shall never rely on them to solve our problems.

- From our point of view, the "Low storage" UI message is fine and needed. Ideally it would be clickable and lead to a system app showing the disk consumption for every app (/data only). But such message is merely a "hint" explaining why some features may experience troubles. The message alone is not enough; the solution must come from the firmware.

- When the user deletes some items in an indexed storage, is there some DB compaction happening ? Or the DB only grows but never shrinks ? In case compaction is supported, how is it triggered ? Doesn't it require some (more) flash space to work ? Generally speaking, any SQL operation (including deletion) requires some flash space, at least to write the DB journal... Deleting some old emails may not solve anything if DB compaction doesn't take place... Deleting some old emails may not even be possible if there is not enough space to write the journal file... (and disabling the journal file would lead to DB corruption...)

- From our point of view, the user needs to know how much data is used by each application (in the /data partition) to make cleaning choices. We also think an Android-style "clean private data" feature is a useful alternative to uninstalling applications (which is quite violent to fix a "simple" low-storage issue). Also, it is not possible to uninstall core apps, but we could at least clean related data.

- With such low cost handset, and so small space in /data, most users may spend their time living near the "disk full" limit, always trying to keep as many applications as possible installed. We would not advise to clean automatically and randomly the data of any application. Some of this data may have cost a significant amount of data traffic (and money) to retrieve (e.g. all those large and nearly useless Facebook pictures of the Contact app... you'd better be on WiFi or only have very few friends...)

- Last but not least, we would advise to secure the "critical path" going from the Boot to the Homescreen and the Settings app, so that the end-user can always arrive into the application which can display the space used by each app, and select which apps to uninstall or clear app private data (if this feature is implemented...) Globally "securing the critical path" would mean identifying all the files which need to be written along the way, then for each of these files:
    1. Either avoid writing,
    2. Or support writing failures,
    3. Or protect the file by a "reserved space" (e.g. a big dummy file that you delete just before writing and recreate just after)
Dave could you take a look at this and see if there's a relatively simple solution that would mitigate the immediate problem of device functionality?  Avoid UI and string localization would be good at this stage of the game, so perhaps a solution like:
- if the partition is over 95% full and the app writing to the partition is not a certified app, deny the write.  This would depend on having a consistent chokepoint to monitor application writes, but would keep the core device functional under these conditions.
Assignee: nobody → dhylands
I don't know that there is any simple solution.

I see 2 main things that contribute to space being consumed:

1 - Installing apps. We could probably add a threshold check somewhere so that we don't install an app if it doesn't wind up with some amount of free space.

2 - IndexedDB data. If you stick in an external SD card with many many images, then it will wind up writing lots of data into the db.

So, I'll reassign to bent (since he's the indexeddb "guy") and put a needinfo on fabrice re the app installation.
Assignee: dhylands → bent.mozilla
Flags: needinfo?(fabrice)
The webapps install/update code already keeps 5MB of free space (see https://mxr.mozilla.org/mozilla-central/source/dom/apps/src/AppDownloadManager.jsm#22).
Flags: needinfo?(fabrice)
Generally speaking, I don't know that doing something like "reserve 5MB for if we run low on space" is going to work unless we have something very special in mind for those 5MB.

However it is a very good point that if we run too low on storage we might not be able to delete data since it requires a journal file to be written. So we should look into if we need to reserve space for journal files during deletion.

I agree we should enable the user to see how much data each app consumes, but that requires big changes to gecko internals, so not something that we will finish in time for v1.0 or v1.1.

I think our main task for v1.1 is to ensure that our internal apps continue to work even if we are out of space on /data. And we should ensure that the user can perform certain storage-reducing operations such as clearing private data in the browser or uninstalling an app.

And add low-disk UI of course.
Whatever UI we come up with here is surely going to require new strings though. Which is a big schedule risk.
Ben pointed out that applications can always use .deleteDatabase to free up space.

This function deletes a whole database and so never requires any journals to be created. So it's safe to use even if the disk is full to the brim.

If we make sure that
* Our internal apps use .deleteDatabase as a fallback, possibly even by putting non-critical data in separate databases
* Add UI for "low on disk space"
* Ensure that we can delete 3rd party apps even when we're out of disk space

Then I think we're in good shape here. At least for v1.0
Keywords: late-l10n
I'm almost certainly the wrong assignee for this bug. As I understand it we need lots of product/UX design here as well as maybe exposing more info to web apps through new DOM APIs (remaining free space of whole disk, space used by a particular app, etc?).

There are some preventative measures we can take for indexedDB databases specifically, so I've created bug 858674 to address those options.

This needs a new owner.
Assignee: bent.mozilla → nobody
(In reply to Dave Hylands [:dhylands] from comment #19)
> I see 2 main things that contribute to space being consumed:
> 
> 1 - Installing apps. We could probably add a threshold check somewhere so
> that we don't install an app if it doesn't wind up with some amount of free
> space.
> 
> 2 - IndexedDB data. If you stick in an external SD card with many many
> images, then it will wind up writing lots of data into the db.

Isn't the offline cache also written in the /data partition?

(In reply to Jonas Sicking (:sicking) from comment #23)
> If we make sure that
> * Our internal apps use .deleteDatabase as a fallback, possibly even by
> putting non-critical data in separate databases
> * Add UI for "low on disk space"
> * Ensure that we can delete 3rd party apps even when we're out of disk space
> 
> Then I think we're in good shape here. At least for v1.0

I am afraid that with these three points an user might still brick her device. Note that even if our internal apps purge its data, the user is notified about the low disk space situation and we still allow her to remove 3rd party apps, she might, for example, browse to a website that stores offline cache data or make a phone call that would be logged in the dialer app indexed db. So, I would also add refusing apps to write in the /data partition while the device is in a low disk space situation. I guess that means to refuse storing indexedDB data, new apps and new offline cache.
Specs attached. There's a few questions in there for devs. 

* We need to determine the low storage thresholds that trigger these prompts.
* We need to confirm that the recommendations made in these prompts are accurate and helpful.
Whiteboard: u=user c=keyboard s=ux-most-wanted → u=user c=system s=ux-most-wanted
Attachment #734288 - Flags: review?(dhylands)
Attachment #734288 - Flags: review?(jonas)
Attachment #734288 - Flags: review?(fabrice)
Adding Jonas, Fabrice and Dave for feasibility review of the specs.
Comment on attachment 734288 [details]
Low Storage UX specs, v2

Drive-by review from a QA perspective.

review- for the proposed change on the apps side - there's already an acceptable UI firing indicating to the user that they are out of space, even though it's not perfectly polished UI. We should only pose l10n risk at this point only when it's absolutely necessary.

I question if it's also a good idea to pollute a user flow going to marketplace to start firing low storage prompts immediately - I'd only fire the no space warnings when it's absolutely necessary. For example, when I go to marketplace, how do you know I'm going there primarily to install an app? Maybe I have a different smaller purpose in the short term?

In an effort to balance l10n risk here, why can't the existing low storage prompt be used here in any of these places? It's not necessarily the most polished UX, but it will balance l10n risk here.

The banner looks fine, but I'm wondering if it's worthwhile firing a "low storage" noise as well when that banner appears.
Attachment #734288 - Flags: review-
So per comment 28 is there already some localized UI for out-of-space situations?  Maybe we're just not firing it here.
AFAIW, we have navigator.getDeviceStorage(TYPE).freeSpace API already,
but it's now implemented individually in each app to display their own notification/prompts.

I also don't know if it is worthy to show system banner if some app is already has its own notify prompt and some doesn't.

Josh, let's discuss this more in detail if you'd like to have a system-wide OO-Space UI.
So you're going to allow the stability of the system to depend on each and every app getting it right?
(In reply to Lucas Adamski [:ladamski] (plz needinfo) from comment #29)
> So per comment 28 is there already some localized UI for out-of-space
> situations?  Maybe we're just not firing it here.

There is, but only related to app installations:

http://mxr.mozilla.org/gaia/source/apps/system/js/app_install_manager.js#188
http://mxr.mozilla.org/gaia/source/apps/system/locales/system.en-US.properties#84
I think we're getting very focused on the UI here, but the reality is that we need to ensure the core device continues to function regardless of whether the user sees a UI and takes action based upon it.  So we still need a technical mitigation to keep the device working, and that seems separate from the UI.
(In reply to Lucas Adamski [:ladamski] (plz needinfo) from comment #33)
> I think we're getting very focused on the UI here, but the reality is that
> we need to ensure the core device continues to function regardless of
> whether the user sees a UI and takes action based upon it.  So we still need
> a technical mitigation to keep the device working, and that seems separate
> from the UI.

AFAIK when an OS is running out of space, you would definitely see some thing strange such as you couldn't open any big app or save would failed. I wonder what we could do to this but leave some gecko/gonk guys to verify the issue is solvable or not.
Comment on attachment 734288 [details]
Low Storage UX specs, v2

I don't see anything wrong with the dialogs (except for my note below).

However, I don't see any difference between the "No space left on the device" due to just filling up /data versus installing an app.

Shouldn't the thresholds just be preferences? I would imagine that the sizes would change depending on the phone.
Attachment #734288 - Flags: review?(dhylands)
Here's my current thoughts about how to mitigate that:
- Implement on the gecko side a free space watcher component. When the free space goes below the low space threshold, flip a pref. When it goes back above the low space threshold plus some additional margin (to prevent oscillations around the threshold) flip the pref back. We'll relay that change to Gaia's system app.
- Gecko APIs that need space will abort any action that takes disk space when the low space pref is on. This includes indexedDB transactions, localStorage, offline cache, installing / updating apps.
- All Gaia apps must behave sanely when these apis will fail.

On the UI side, the dialog looks ok to me, and that will be the easiest part here anyway.
Attachment #734288 - Flags: review?(fabrice)
(In reply to Fabrice Desré [:fabrice] from comment #36)
> Here's my current thoughts about how to mitigate that:
> - Implement on the gecko side a free space watcher component. When the free
> space goes below the low space threshold, flip a pref. When it goes back
> above the low space threshold plus some additional margin (to prevent
> oscillations around the threshold) flip the pref back. We'll relay that
> change to Gaia's system app.
> - Gecko APIs that need space will abort any action that takes disk space
> when the low space pref is on. This includes indexedDB transactions,
> localStorage, offline cache, installing / updating apps.

This sounds a good idea. But, does it mean that we need more efforts on this? 
And, we also need to identify which Gecko APIs for this, right? 

> - All Gaia apps must behave sanely when these apis will fail.
> 
> On the UI side, the dialog looks ok to me, and that will be the easiest part
> here anyway.
A few comments on Fabrice's proposal.

We might be able to avoid the requirement of actively checking for free space, which seems bad for performance. Basically, if we could use http://dell9.ma.utexas.edu/cgi-bin/man-cgi?inotify+7 we could probably do something like:

1- Have a reserved space as mentioned in comment 17. We will need to decide how big does this space needs to be.

2- During the boot process, set a watcher via "inotify_add_watch" (not sure about this yet as I couldn't read the doc in detail) for /data so we can monitor when /data is modified (IN_MODIFY flag).

3- When we notice that we are running out of space, we set the pref as Fabrice mentioned, free the reserved space and notify about the low storage situation.

4- At this point, all the mentioned APIs should fail except for certified apps (or even better a set of chosen apps like settings, system, homescreen) so we can let the user free space.

5- Once we get enough device storage again, flip the pref back and recreate the reserved space.

I've seen that inotify is already being used in Gonk https://mxr.mozilla.org/mozilla-central/source/widget/gonk/libui/EventHub.cpp

Thanks to Antonio for the pointer to inotify :)
(In reply to khu from comment #37)
> This sounds a good idea. But, does it mean that we need more efforts on
> this? 
> And, we also need to identify which Gecko APIs for this, right? 

I am afraid so.
I think that using inotify won't give the results you want (it only watches particular inodes - you need to set one up for every directory in order to see all changes).

fanotify looks like it might be a more appropriate API. See the first comment (the one with the notifier.c source)
http://serverfault.com/questions/386784/linux-trigger-a-real-time-alarm-on-a-low-disk-space-condition
Thanks Dave! It seems that fanotify might work :)
Assigning myself since I have started writing a fanotify-based component.
Assignee: nobody → fabrice
Blocks: 861894
Whiteboard: u=user c=system s=ux-most-wanted → u=user c=system s=ux-most-wanted [status: workaround maybe EOW]
Whiteboard: u=user c=system s=ux-most-wanted [status: workaround maybe EOW] → u=user c=system s=ux-most-wanted [status: workaround maybe EOW][madrid]
Attached patch wip (obsolete) — Splinter Review
wip patch, using the fanotify API. The call to fanotify_mark fails with errno=22, not yet sure why.

I have a custom keon kernel is someone wants to test that.
Attached patch wip 2 (obsolete) — Splinter Review
Some progress there, fan_notify_mark is now working (thanks jst for pointing out the 64 bits issue in 32 bits syscalls). It's now crashing in MessageLoopForIO::current()->WatchFileDescriptor(), investigation is ongoing.
Attachment #738557 - Attachment is obsolete: true
Target Milestone: --- → Madrid (19apr)
Attached patch wip 3 (obsolete) — Splinter Review
This one actually works! There is still some cleanup and polish to do, but I'd like to know if my thread usage there is ok or not.

Also, the component is currently in toolkit/components. That does not feel really right to me but I don't know where to put it.
Attachment #738957 - Attachment is obsolete: true
Attachment #739313 - Flags: feedback?(bent.mozilla)
(In reply to Jason Smith [:jsmith] from comment #28)
> Comment on attachment 734288 [details]
> Low Storage UX specs, v2
> 
> Drive-by review from a QA perspective.
> 
> review- for the proposed change on the apps side - there's already an
> acceptable UI firing indicating to the user that they are out of space, even
> though it's not perfectly polished UI. We should only pose l10n risk at this
> point only when it's absolutely necessary.
> 
> I question if it's also a good idea to pollute a user flow going to
> marketplace to start firing low storage prompts immediately - I'd only fire
> the no space warnings when it's absolutely necessary. For example, when I go
> to marketplace, how do you know I'm going there primarily to install an app?
> Maybe I have a different smaller purpose in the short term?
> 
> In an effort to balance l10n risk here, why can't the existing low storage
> prompt be used here in any of these places? It's not necessarily the most
> polished UX, but it will balance l10n risk here.
> 
> The banner looks fine, but I'm wondering if it's worthwhile firing a "low
> storage" noise as well when that banner appears.

Updated (and simplified) UX is detailed in bug #861921.
Whiteboard: u=user c=system s=ux-most-wanted [status: workaround maybe EOW][madrid] → u=user c=system s=ux-most-wanted [status: needs new patch AND needs new kernel][madrid]
Comment on attachment 739313 [details] [diff] [review]
wip 3

Review of attachment 739313 [details] [diff] [review]:
-----------------------------------------------------------------

I didn't do a full review here, just tried to stick with broad issues. Overall this looks great!

I would like to review the whole thing once you've addressed these.

::: toolkit/components/diskspacewatcher/fanotify.h
@@ +1,1 @@
> +#ifndef _LINUX_FANOTIFY_H

Not sure what the deal is with this file... This should one day soon be included in the kernel headers for our device, right?

::: toolkit/components/diskspacewatcher/nsDiskSpaceWatcher.cpp
@@ +140,5 @@
> +
> +/* readonly attribute bool lowFreeSpace; */
> +NS_IMETHODIMP nsDiskSpaceWatcher::GetLowFreeSpace(bool *aLowFreeSpace)
> +{
> +  *aLowFreeSpace = mLowFreeSpace;

You're reading a value here that may be modified on the IO thread at the same time. We should avoid data races like this always. (In the benign case this could cause some consumers to think that they are in one mode while others think they are in the opposite mode for a short period of time. In the non-benign case you could return true or false here randomly!)

I would just add another variable, mMainThreadLowFreeSpace or something, that tracks the last state set by DiskSpaceNotifier. DiskSpaceNotifier::Run would set it, and then you'd return mMainThreadLowFreeSpace here.

@@ +147,5 @@
> +
> +/* readonly attribute long freeSpace; */
> +NS_IMETHODIMP nsDiskSpaceWatcher::GetFreeSpace(int64_t *aFreeSpace)
> +{
> +  *aFreeSpace = mFreeSpace;

This definitely won't work on 32-bit. (You could be half-way through writing the 64-bit value on the IO thread while you read this value on the main thread and you'd end up returning the wrong value.)

I'd add another main-thread version of the variable like above.

@@ +176,5 @@
> +    return;
> +  }
> +  printf_stderr("fanotify_mark ok");
> +
> +  MessageLoopForIO::current()->WatchFileDescriptor(

This has a bool return value that you're not checking. I expect it can fail sometimes.

@@ +185,5 @@
> +  printf_stderr("nsDiskSpaceWatcher::Start ok");
> +}
> +
> +/* void start (); */
> +NS_IMETHODIMP nsDiskSpaceWatcher::Start()

You don't currently have any protection if Start gets called more than once. See below where I don't think you need Start/Stop, but if you do, you'll need to handle this case.

@@ +226,5 @@
> +
> +void nsDiskSpaceWatcher::OnFileCanReadWithoutBlocking(int aFd) {
> +  printf_stderr("nsDiskSpaceWatcher::OnFileCanReadWithoutBlocking\n");
> +  struct fanotify_event_metadata *fem = NULL;
> +  char buf[4096];

How did you pick this buffer size?

@@ +234,5 @@
> +  len = read(aFd, buf, sizeof(buf));
> +  if (len > 0) {
> +    fem = (fanotify_event_metadata *)buf;
> +
> +    while (FAN_EVENT_OK(fem, len)) {

I don't think this will work. FAN_EVENT_OK bails out if the length is less than the expected metadata length, so if you only read a partial message here you're going to discard it and then crash on the next chunk probably.

Also, you can't use FAN_EVENT_OK until you've read at least 4 chars (to ensure that you have a valid event_len to test)...

@@ +240,5 @@
> +      if (rc < 0) {
> +        printf_stderr("Unable to stat fd %d", errno);
> +      } else {
> +        mFreeSpace = sfs.f_bavail * sfs.f_bsize;
> +        if (!mLowFreeSpace && (mFreeSpace < mMinFreeSpace)) {

So... What happens if we start up and the first time we get called back our free space is somewhere in the range [mMinFreeSpace, mMinFreeSpace * 3/2]? It looks like we'll say that we're not in low disk space mode. However, if we started up with less than mMinFreeSpace and then got up to the same range later then we would continue to say that we're in low disk space mode. I don't understand how the same amount of disk space counting as low or not-low depending on startup conditions is a good idea...

Maybe this is just a simple oversight?

@@ +250,5 @@
> +          mLowFreeSpace = false;
> +          FireNotification();
> +        }
> +      }
> +      close(fem->fd);

I don't know this API really but is it guaranteed to always give you an open fd?

::: toolkit/components/diskspacewatcher/nsIDiskSpaceWatcher.idl
@@ +11,5 @@
> +{
> +  readonly attribute bool lowFreeSpace; // true if we are low on disk space.
> +  readonly attribute long long freeSpace;    // the free space currently available.
> +  void start(); // starts watching for disk space.
> +  void stop();  // stops watching for disk space.

Are there use cases for starting/stopping this? I expect we could get away with it starting at startup, stopping at shutdown, and never changing during runtime...
Attachment #739313 - Flags: feedback?(bent.mozilla)
> I would like to review the whole thing once you've addressed these.

Unless you need review before I'm back (Apr. 24), of course. I'm sure others can do it.
Marking for 5/10, since taking a platform change later than that may slip our schedule due to destabilization.
Target Milestone: 1.0.1 Madrid (19apr) → 1.0.1 IOT1 (10may)
Attached patch wip 4 (obsolete) — Splinter Review
The main changes compared to the previous patch are:
- the implementation is now in hal/, and only the xpcom bits in toolkit.
- it's using 2 different preferences to set the low and high level that make us switch from low disk space to ok.
- the interface only carries the state of the service, we start as soon as possible and shutdown... on shutdown.

There's one remaining issue that I don't understand: accessing the .isDiskFull property on the service always returns "undefined", and the c++ getter doesn't run, while getting freeSpace works as expected.
Attachment #739313 - Attachment is obsolete: true
Attachment #742122 - Flags: review?(bent.mozilla)
Depends on: 866243
Comment on attachment 742122 [details] [diff] [review]
wip 4

Review of attachment 742122 [details] [diff] [review]:
-----------------------------------------------------------------

::: b2g/chrome/content/shell.js
@@ +1177,5 @@
> +                                         .navigator.getDeviceStorage("apps");
> +
> +let req = deviceStorage.freeSpace();
> +req.onsuccess = req.onerror = function statResult(e) {
> +  if (!e.target.result) {

I think you want a separate error handler... Can't freeSpace be 0? In that case you'd bail out rather than setting all these prefs and registering the observer.

@@ +1183,5 @@
> +  }
> +
> +  let freeBytes = e.target.result;
> +  Services.prefs.setIntPref("disk_space_watcher.low_threshold",
> +                            freeBytes / (1024 * 1024) + 5);

Do you want to do |Math.round(freeBytes / (1024 * 1024))| ?

@@ +1189,5 @@
> +                            freeBytes / (1024 * 1024) + 10);
> +  Services.obs.addObserver(function(aSubject, aTopic, aData) {
> +    let watcher = aSubject.QueryInterface(Ci.nsIDiskSpaceWatcher);
> +    dump("Got notification: " + aTopic + " - " + aData + " " +
> +         watcher.freeSpace + " " + watcher.isDiskFull);

I guess this should live behind some kind of DEBUG flag?

::: hal/Hal.h
@@ +250,5 @@
>  void AdjustSystemClock(int64_t aDeltaMilliseconds);
>  
>  /**
>   * Set timezone
> + * @param aTimezoneSpec The definition can be found in

You have a bunch of whitespace changes in here on lines that I doubt you want blame for ;)

::: hal/gonk/GonkDiskSpaceWatcher.cpp
@@ +54,5 @@
> +
> +  // We should never write to the fanotify fd.
> +  virtual void OnFileCanWriteWithoutBlocking(int aFd)
> +  {
> +    MOZ_NOT_REACHED("Must not write to fanotify fd");

Add a MOZ_CRASH here too?

@@ +63,5 @@
> +
> +private:
> +  ~DiskSpaceWatcher();
> +
> +

Nit: extra line

@@ +64,5 @@
> +private:
> +  ~DiskSpaceWatcher();
> +
> +
> +  int64_t mLowThreshold;

These should all be uint64_t

@@ +74,5 @@
> +
> +  bool mIsDiskFull;
> +  int64_t mFreeSpace;
> +  static int mFd;
> +  static MessageLoopForIO::FileDescriptorWatcher mReadWatcher;

Static members should use the 's' prefix. But really, these shouldn't be static. Just make DoStart/DoStop non-static and use 'NewRunnableMethod' instead of 'NewRunnableFunction' below and I think you'll be ok.

@@ +93,5 @@
> +#define WATCHER_PREF_SIZE_DELTA "disk_space_watcher.size_delta"
> +
> +static const char kWatchedPath[] = "/data";
> +
> +DiskSpaceWatcher::DiskSpaceWatcher() :

Nit: colon on the next line

@@ +111,5 @@
> +  mSizeDelta = Preferences::GetInt(WATCHER_PREF_SIZE_DELTA, 1) * 1024 * 1024;
> +  mFd = -1;
> +}
> +
> +DiskSpaceWatcher::~DiskSpaceWatcher()

As far as I can tell this will never be called because I think you're leaking it. Where should it be cleaned up? Keep in mind that you have methods running on another thread so I bet making your class have a threadsafe refcount would help.

@@ +129,5 @@
> +    NS_WARNING("Error calling inotify_init()");
> +    return;
> +  }
> +
> +  uint64_t mask = FAN_CLOSE;

Is stack variable this needed?

@@ +164,5 @@
> +  }
> +}
> +
> +void
> +DiskSpaceWatcher::NotifyUpdate() {

Nit: { on its own line, in a couple other places too.

@@ +165,5 @@
> +}
> +
> +void
> +DiskSpaceWatcher::NotifyUpdate() {
> +  gDiskSpaceWatcherService->UpdateState(mIsDiskFull, mFreeSpace);

I hope we can avoid putting this on the interface but in any event I think that the notification that your component sends happens on the main thread. Having callbacks on a random thread is pretty scary.

@@ +170,5 @@
> +}
> +
> +void
> +DiskSpaceWatcher::OnFileCanReadWithoutBlocking(int aFd) {
> +  struct fanotify_event_metadata *fem = nullptr;

Nit: * on the left

@@ +175,5 @@
> +  char buf[4096];
> +  struct statfs sfs;
> +  int32_t len, rc;
> +
> +  len = read(aFd, buf, sizeof(buf));

You probably need to handle EINTR at least

@@ +176,5 @@
> +  struct statfs sfs;
> +  int32_t len, rc;
> +
> +  len = read(aFd, buf, sizeof(buf));
> +  if (len >= sizeof(fanotify_event_metadata)) {

If the read is only partial then you're going to discard the beginning and probably crash later. I think you'll have to handle this.

@@ +190,5 @@
> +        // low and high thresholds.
> +        // Once we are in 'full' mode we send updates for all size changes with
> +        // a minimum of time between messages or when we cross a size change
> +        // threshold.
> +        if (mFreeSpace == -1) {

Is this possible? You just set mFreeSpace above...

@@ +194,5 @@
> +        if (mFreeSpace == -1) {
> +          mIsDiskFull = mFreeSpace > mLowThreshold &&
> +                        mFreeSpace < mHighThreshold;
> +          if (mIsDiskFull) {
> +            mLastTimestamp = PR_Now();

Newer code should use TimeStamp, not PR_Now.

@@ +214,5 @@
> +        }
> +        mLastFreeSpace = mFreeSpace;
> +      }
> +      close(fem->fd);
> +      fem = FAN_EVENT_NEXT(fem, len);

I don't think that this is safe... If you haven't read a complete next message then FAN_EVENT_OK may crash.

@@ +220,5 @@
> +  }
> +}
> +
> +void
> +StartDiskSpaceWatcher(nsIDiskSpaceWatcher* aService)

Assert that you're on the main thread. In Stop too.

@@ +227,5 @@
> +  if (gHalDiskSpaceWatcher != nullptr) {
> +    return;
> +  }
> +
> +  gDiskSpaceWatcherService = aService;

Assert that gDiskSpaceWatcherService is null here.

@@ +236,5 @@
> +    NewRunnableFunction(DiskSpaceWatcher::DoStart));
> +}
> +
> +void
> +StopDiskSpaceWatcher()

Shouldn't you null out gDiskSpaceWatcherService here too?

::: layout/build/nsLayoutModule.cpp
@@ +1303,5 @@
>  #endif
>  #ifdef MOZ_B2G_BT
>    { "profile-after-change", "Bluetooth Service", BLUETOOTHSERVICE_CONTRACTID },
>  #endif
> +  { "profile-after-change", "Disk Space Watcher Service", DISKSPACEWATCHER_CONTRACTID },

This should all get moved to the component registration in nsDiskSpaceWatcher.

::: toolkit/components/diskspacewatcher/moz.build
@@ +3,5 @@
> +# This Source Code Form is subject to the terms of the Mozilla Public
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +
> +#TEST_DIRS += ['test']

What's the story with testing this?

::: toolkit/components/diskspacewatcher/nsDiskSpaceWatcher.cpp
@@ +10,5 @@
> +
> +#define NS_DISKSPACEWATCHER_CID \
> +  { 0xab218518, 0xf197, 0x4fb4, { 0x8b, 0x0f, 0x8b, 0xb3, 0x4d, 0xf2, 0x4b, 0xf4 } }
> +
> +class nsDiskSpaceWatcher MOZ_FINAL : public nsIDiskSpaceWatcher,

Nit: We're not really big on the 'ns' prefix nowadays, I'd remove it.

@@ +18,5 @@
> +  NS_DECL_ISUPPORTS
> +  NS_DECL_NSIDISKSPACEWATCHER
> +  NS_DECL_NSIOBSERVER
> +
> +  nsDiskSpaceWatcher();

This should be private too since it can only be called from within FactoryCreate.

@@ +20,5 @@
> +  NS_DECL_NSIOBSERVER
> +
> +  nsDiskSpaceWatcher();
> +
> +  // Called by Get().

You don't have a Get() :)

@@ +22,5 @@
> +  nsDiskSpaceWatcher();
> +
> +  // Called by Get().
> +  static nsDiskSpaceWatcher*
> +  Create();

I don't think you need this method, you can just inline it into FactoryCreate.

@@ +33,5 @@
> +
> +  void FireObserverNotification(bool aIsDiskFull, int64_t aFreeSpace);
> +
> +  int64_t mFreeSpace;
> +  bool    mIsDiskFull;

bools pack better if they're next to each other.

@@ +40,5 @@
> +  int64_t mMainThreadFreeSpace;
> +  bool    mMainThreadIsDiskFull;
> +};
> +
> +using namespace mozilla;

I'd move this up right after the includes.

@@ +53,5 @@
> +public:
> +  DiskSpaceNotifier(const bool aIsFull, const int64_t aFreeSpace,
> +                    bool* aMTIsFull, int64_t* aMTFreeSpace)
> +                   : mIsFull(aIsFull), mFreeSpace(aFreeSpace),
> +                     mMTIsFull(aMTIsFull), mMTFreeSpace(aMTFreeSpace) {}

Nit: The initializer list should start at column 2 or 4.

@@ +55,5 @@
> +                    bool* aMTIsFull, int64_t* aMTFreeSpace)
> +                   : mIsFull(aIsFull), mFreeSpace(aFreeSpace),
> +                     mMTIsFull(aMTIsFull), mMTFreeSpace(aMTFreeSpace) {}
> +
> +  NS_IMETHOD Run()

Assert that you're running on the main thread.

@@ +58,5 @@
> +
> +  NS_IMETHOD Run()
> +  {
> +    nsCOMPtr<nsIObserverService> observerService =
> +    mozilla::services::GetObserverService();

Nit: Indent that second line.

@@ +61,5 @@
> +    nsCOMPtr<nsIObserverService> observerService =
> +    mozilla::services::GetObserverService();
> +
> +    *mMTIsFull = mIsFull;
> +    *mMTFreeSpace = mFreeSpace;

Rather than do this using pointers why not just add a method to nsDiskSpaceWatcher to update it's main thread data? Otherwise you need to null-check gDiskSpaceWatcher here because it could have been destroyed...

@@ +89,5 @@
> +};
> +
> +nsDiskSpaceWatcher::nsDiskSpaceWatcher() :
> +  mFreeSpace(-1),
> +  mIsDiskFull(false)

You need to initialize your other members.

@@ +125,5 @@
> +    gDiskSpaceWatcher = service;
> +    ClearOnShutdown(&gDiskSpaceWatcher);
> +  }
> +
> +  nsRefPtr<nsDiskSpaceWatcher> service = gDiskSpaceWatcher.get();

Nit: The get() shouldn't be necessary here.

@@ +136,5 @@
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +
> +  if (!strcmp(aTopic, "profile-after-change")) {
> +    if (!Preferences::GetBool("disk_space_watcher.enabled", false)) {

Nit: Since you're checking both cases it's simpler to check the true case first to avoid the '!'

@@ +149,5 @@
> +  return NS_ERROR_UNEXPECTED;
> +}
> +
> +/* readonly attribute bool isDiskFull; */
> +NS_IMETHODIMP nsDiskSpaceWatcher::GetIsDiskFull(bool *aIsDiskFull)

Nit: You've been putting the * on the left everywhere else in this file.

@@ +151,5 @@
> +
> +/* readonly attribute bool isDiskFull; */
> +NS_IMETHODIMP nsDiskSpaceWatcher::GetIsDiskFull(bool *aIsDiskFull)
> +{
> +  printf_stderr("nsDiskSpaceWatcher::GetIsDiskFull mIsDiskFull=%d", mIsDiskFull);

These need to get cleaned up before checkin.

@@ +157,5 @@
> +  return NS_OK;
> +}
> +
> +/* readonly attribute long freeSpace; */
> +NS_IMETHODIMP nsDiskSpaceWatcher::GetFreeSpace(int64_t *aFreeSpace)

Should these methods throw an exception if the disabled pref is set? Or should we make it so the component never exists in the first place?

@@ +175,5 @@
> +  * This method can be called on any thread, so we explicitely dispatch
> +  * the notification on the main thread.
> +  */
> +void
> +nsDiskSpaceWatcher::FireObserverNotification(bool aIsDiskFull, int64_t aFreeSpace) {

Nit: { on its own line

::: toolkit/components/diskspacewatcher/nsIDiskSpaceWatcher.idl
@@ +7,5 @@
> +[scriptable, uuid(3aceba74-2ed5-4e99-8fe4-06e90e2b8ef0)]
> +interface nsIDiskSpaceWatcher : nsISupports
> +{
> +  readonly attribute bool isDiskFull;     // True if we are low on disk space.
> +  readonly attribute long long freeSpace; // The free space currently available.

This should be unsigned.

@@ +10,5 @@
> +  readonly attribute bool isDiskFull;     // True if we are low on disk space.
> +  readonly attribute long long freeSpace; // The free space currently available.
> +
> +  // Called back by the HAL implementation to update our state.
> +  [noscript] void updateState(in bool isDiskFull, in long long freeSpace);

I'm not sure this is needed. We link HAL into libxul, right? You should be able to just call methods on nsDiskSpaceWatcher directly from C++ rather than exposing it on the interface.
Attachment #742122 - Flags: review?(bent.mozilla)
(fwiw, all of the space units in device storage are going to be moved to uin64_t.  we will convert up after having to deal with xpcom file io where required)
Flags: needinfo?(firefoxos-ux-bugzilla)
Since this is tef+, flagging Rob to take a look at this before EOD today.
Flags: needinfo?(firefoxos-ux-bugzilla) → needinfo?(rmacdonald)
The UX appears to already be defined for this here:

https://bugzilla.mozilla.org/show_bug.cgi?id=861921#c32

Please flag me if there are any additional questions or concerns.
Flags: needinfo?(rmacdonald)
Whiteboard: u=user c=system s=ux-most-wanted [status: needs new patch AND needs new kernel][madrid] → [status: needs new patch AND needs new kernel][madrid]
wonder what's the next step here? Thanks
Flags: needinfo?(fabrice)
(In reply to Joe Cheng [:jcheng] from comment #56)
> wonder what's the next step here? Thanks

I'm updating the patch to address bent's comments.
Flags: needinfo?(fabrice)
Attached patch wip 5 (obsolete) — Splinter Review
New patch with the following changes:
- Renamed files and classes to not use the 'ns' prefix (except the idl. is that ok?)
- We call the xpcom component on the main thread.
- We don't create the service at all if the pref is disabled.
- Start & Stop use profile-change notifications.
- Many comments addressed, except the following:

(In reply to ben turner [:bent] from comment #52)
 
> ::: b2g/chrome/content/shell.js

Sorry for not telling you earlier, everything in shell.js is just debug/log, and won't be part of the final patch.

> ::: hal/Hal.h
> @@ +250,5 @@
> >  void AdjustSystemClock(int64_t aDeltaMilliseconds);
> >  
> >  /**
> >   * Set timezone
> > + * @param aTimezoneSpec The definition can be found in
> 
> You have a bunch of whitespace changes in here on lines that I doubt you
> want blame for ;)

Ha, my editor just fixed them automagically. These are only comments, I don't mind much.

> Static members should use the 's' prefix. But really, these shouldn't be
> static. Just make DoStart/DoStop non-static and use 'NewRunnableMethod'
> instead of 'NewRunnableFunction' below and I think you'll be ok.

I tried switching to NewRunnableMethod but then OnFileCanReadWithoutBlocking() is never called. If we stay with the static members, I'll remove all the nsISupport code added for NewRunnableMethod
 
> @@ +176,5 @@
> > +  struct statfs sfs;
> > +  int32_t len, rc;
> > +
> > +  len = read(aFd, buf, sizeof(buf));
> > +  if (len >= sizeof(fanotify_event_metadata)) {
> 
> If the read is only partial then you're going to discard the beginning and
> probably crash later. I think you'll have to handle this.
> @@ +214,5 @@
> > +        }
> > +        mLastFreeSpace = mFreeSpace;
> > +      }
> > +      close(fem->fd);
> > +      fem = FAN_EVENT_NEXT(fem, len);
> 
> I don't think that this is safe... If you haven't read a complete next
> message then FAN_EVENT_OK may crash.

All the code I found using fanotify does something similar. For instance https://github.com/systemd/systemd/blob/master/src/readahead/readahead-collect.c#L434
It looks like we have some guarantees from the kernel.

> ::: toolkit/components/diskspacewatcher/moz.build
> @@ +3,5 @@
> > +# This Source Code Form is subject to the terms of the Mozilla Public
> > +# License, v. 2.0. If a copy of the MPL was not distributed with this
> > +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> > +
> > +#TEST_DIRS += ['test']
> 
> What's the story with testing this?

I'll write a gonk xpcshell test.

> @@ +125,5 @@
> > +    gDiskSpaceWatcher = service;
> > +    ClearOnShutdown(&gDiskSpaceWatcher);
> > +  }
> > +
> > +  nsRefPtr<nsDiskSpaceWatcher> service = gDiskSpaceWatcher.get();
> 
> Nit: The get() shouldn't be necessary here.

Unfortunately the compiler disagrees

> @@ +10,5 @@
> > +  readonly attribute bool isDiskFull;     // True if we are low on disk space.
> > +  readonly attribute long long freeSpace; // The free space currently available.
> > +
> > +  // Called back by the HAL implementation to update our state.
> > +  [noscript] void updateState(in bool isDiskFull, in long long freeSpace);
> 
> I'm not sure this is needed. We link HAL into libxul, right? You should be
> able to just call methods on nsDiskSpaceWatcher directly from C++ rather
> than exposing it on the interface.

I haven't yet looked into that change. Will do tomorrow.

One other thing to implement in this bug is to gracefully fail when running on device that have no support in their kernel. I'll check if we can do a runtime detection - if not we'll have to whitelist supported devices.
Attachment #746226 - Flags: feedback?(bent.mozilla)
Attachment #746226 - Attachment is patch: true
Attached patch patch v6 (obsolete) — Splinter Review
Ben, this patch has the changes we discussed on IRC to use NewRunnableMethod(). No assertions in DEBUG mode, and works fine in opt.

I discussed writing tests with Fernando, but we are a bit blocked on that since none of the tests target we have on tbpl support fanotify for now :(
Attachment #742122 - Attachment is obsolete: true
Attachment #746226 - Attachment is obsolete: true
Attachment #746226 - Flags: feedback?(bent.mozilla)
Attachment #746653 - Flags: review?(bent.mozilla)
Comment on attachment 734288 [details]
Low Storage UX specs, v2

Removing the request here. This seems mostly fine, but I think we're going to go for something simpler for v1.0.

Lets revisit this once we have more time for polishing this.
Attachment #734288 - Flags: review?(jonas)
Just so that we're all on the same page, I believe our plan for this bug is:

* Create a service which monitors available disk space for /data is "running low"
* When we're "running low", we'll make IndexedDB only able to delete data,
  appcache won't perform any updates, and localStorage will not permit any data to
  be written.
* We already have code in place to prevent apps from being installed when we're low
  on space. I think this is correct even for hosted apps since eventually even
  hosted apps will be using local storage (appcache, IDB, etc) to cache data.

We also need to ensure that the following things work when the device is completely out of disk space (i.e. not just low on it).

* Starting the device
* Launching all built-in apps
* Clearing cached emails from the email app
* Clearing cached info in calendar app
* Removing files from the music/gallery/video apps
* Uninstalling 3rd party apps

Is there more?
(In reply to Jonas Sicking (:sicking) from comment #61)
> Just so that we're all on the same page, I believe our plan for this bug is:
> 
> * Create a service which monitors available disk space for /data is "running
> low"
> * When we're "running low", we'll make IndexedDB only able to delete data,
>   appcache won't perform any updates, and localStorage will not permit any
> data to
>   be written.
> * We already have code in place to prevent apps from being installed when
> we're low
>   on space. I think this is correct even for hosted apps since eventually
> even
>   hosted apps will be using local storage (appcache, IDB, etc) to cache data.

and

* Notify the user about the low space situation as described in https://bugzilla.mozilla.org/show_bug.cgi?id=861921#c12
Whiteboard: [status: needs new patch AND needs new kernel][madrid] → [status: awaiting review (bent trying for May 8th)][madrid]
Comment on attachment 746653 [details] [diff] [review]
patch v6

Review of attachment 746653 [details] [diff] [review]:
-----------------------------------------------------------------

::: hal/Hal.h
@@ +575,5 @@
> + * Start monitoring disk space for low space situations.
> + *
> + * This API is currently only allowed to be used from the main process.
> + */
> +void StartDiskSpaceWatcher(nsIDiskSpaceWatcher* aService);

Let's remove this parameter. It's a little clunky because we don't actually support anything but DiskSpaceWatcher. Then you can add a static DiskSpaceWatcher::UpdateState() that will take care of everything.

::: hal/gonk/GonkDiskSpaceWatcher.cpp
@@ +46,5 @@
> +    } _mask;
> +    _mask._64 = mask;
> +    return syscall(368, fanotify_fd, flags, _mask._32[0], _mask._32[1],
> +                   dfd, pathname);
> +  } else {

Nit: No need for else after a return.

@@ +71,5 @@
> +
> +private:
> +  ~GonkDiskSpaceWatcher() {};
> +
> +

Nit: Extra line here

@@ +85,5 @@
> +
> +  int mFd;
> +  MessageLoopForIO::FileDescriptorWatcher mReadWatcher;
> +
> +  void NotifyUpdate();

Nit: How about moving this up with the destructor

@@ +119,5 @@
> +    bool mIsDiskFull;
> +    uint64_t mFreeSpace;
> +};
> +
> +GonkDiskSpaceWatcher::GonkDiskSpaceWatcher() :

You're missing an initializer for mLastFreeSpace.

@@ +120,5 @@
> +    uint64_t mFreeSpace;
> +};
> +
> +GonkDiskSpaceWatcher::GonkDiskSpaceWatcher() :
> +  mLastTimestamp(),

No need to include members with default initializers

@@ +177,5 @@
> +  NS_ASSERTION(XRE_GetIOMessageLoop() == MessageLoopForIO::current(),
> +               "Not on the correct message loop");
> +
> +  mReadWatcher.StopWatchingFileDescriptor();
> +  if (mFd != -1) {

I think you should call StopWatchingFileDescriptor() inside the if block. It only needs to be called if you have a valid fd.

@@ +183,5 @@
> +    close(mFd);
> +    mFd = -1;
> +  }
> +
> +  delete gHalDiskSpaceWatcher;

It's weird to create things on one thread and then delete them on another (it means that your members all have to be threadsafe, for example). I'd rework this so that you call delete on the main thread after DoStop() has finished.

@@ +191,5 @@
> +// before calling the xpcom object.
> +void
> +GonkDiskSpaceWatcher::NotifyUpdate()
> +{
> +  NS_DispatchToMainThread(new DiskSpaceNotifier(mIsDiskFull, mFreeSpace),

Please add a stack nsCOMPtr<nsIRunnable> here so that the runnable has a valid refcount before passing to NS_DispatchToMainThread.

@@ +207,5 @@
> +  do {
> +    len = read(aFd, buf, sizeof(buf));
> +  } while(len == -1 && errno == EINTR);
> +
> +  if ((uint32_t)len >= sizeof(fanotify_event_metadata)) {

You need to bail out if len is < 0 here... Casting it to uint32_t first will make it huge and your >= test will always pass and you'll crash.

And I still think this is dangerous. It's possible we have some guarantee from the kernel that it will only ever hand us an integral number of these but if it's not documented then I wouldn't count on it.

How about this. If our assumption (that we get some whole number of these things) is violated we're going to crash in a scary way. What if we add a release-mode MOZ_CRASH if this assumption is ever violated? That way we'll see it in crash reports and we won't crash dangerously.

@@ +215,5 @@
> +      rc = fstatfs(fem->fd, &sfs);
> +      if (rc < 0) {
> +        NS_WARNING("Unable to stat fan_notify fd");
> +      } else {
> +        bool firstRun = mFreeSpace == 0;

Isn't it possible that mFreeSpace could be 0 once the disk fills up?

@@ +223,5 @@
> +        // Once we are in 'full' mode we send updates for all size changes with
> +        // a minimum of time between messages or when we cross a size change
> +        // threshold.
> +        if (firstRun) {
> +          mIsDiskFull = mFreeSpace > mLowThreshold &&

What happens here if mFreeSpace <= mLowThreshold? That should count as having a full disk, right?

@@ +238,5 @@
> +          mIsDiskFull = false;
> +          NotifyUpdate();
> +        } else if (mIsDiskFull) {
> +          if (mTimeout < TimeStamp::Now() - mLastTimestamp ||
> +              mSizeDelta < llabs(mFreeSpace - mLastFreeSpace)) {

This is a little weird, lots of small changes very quickly won't trigger a notification since you're changing mLastFreeSpace every time here. It seems like you should move the setting of mLastTimestamp and mLastFreeSpace into NotifyUpdate()... Wouldn't that be easier?

@@ +239,5 @@
> +          NotifyUpdate();
> +        } else if (mIsDiskFull) {
> +          if (mTimeout < TimeStamp::Now() - mLastTimestamp ||
> +              mSizeDelta < llabs(mFreeSpace - mLastFreeSpace)) {
> +            mLastTimestamp = TimeStamp::Now();

Let's save the timestamp in a stack var here to avoid two quick calls to TimeStamp::Now

@@ +274,5 @@
> +void
> +StopDiskSpaceWatcher()
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +  if (gHalDiskSpaceWatcher == nullptr) {

Nit: 'if (!gHalDiskSpaceWatcher)'

@@ +290,5 @@
> +// RunnableMethodTraits needs to be specialized in the root namespace.
> +using namespace mozilla::hal_impl;
> +
> +template<>
> +struct RunnableMethodTraits<GonkDiskSpaceWatcher>

I'd put this at the top of the file, just forward-declare mozilla::hal_impl::GonkDiskSpaceWatcher.

::: toolkit/components/diskspacewatcher/DiskSpaceWatcher.cpp
@@ +5,5 @@
> +#include "mozilla/Hal.h"
> +#include "mozilla/ModuleUtils.h"
> +#include "mozilla/Preferences.h"
> +#include "mozilla/ClearOnShutdown.h"
> +#include "DiskSpaceWatcher.h"

Nit: Put this first so that you know your header compiles on its own.

@@ +26,5 @@
> +  MOZ_ASSERT(NS_IsMainThread());
> +  MOZ_ASSERT(gDiskSpaceWatcher == nullptr);
> +}
> +
> +DiskSpaceWatcher::~DiskSpaceWatcher()

Can you MOZ_ASSERT(gDiskSpaceWatcher == nullptr) here?

@@ +28,5 @@
> +}
> +
> +DiskSpaceWatcher::~DiskSpaceWatcher()
> +{
> +  mozilla::hal::StopDiskSpaceWatcher();

This shouldn't be necessary, you'll never get here if you added yourself as an observer of profile-before-change (since it adds a ref) and you will always call StopDiskSpaceWatcher() when the observer service calls you back.

@@ +47,5 @@
> +
> +  if (!gDiskSpaceWatcher) {
> +    // Create new instance, register, return
> +    nsRefPtr<DiskSpaceWatcher> service = new DiskSpaceWatcher();
> +    NS_ENSURE_TRUE(service, nullptr);

new is infallible now, no need to check 'service' here.

@@ +49,5 @@
> +    // Create new instance, register, return
> +    nsRefPtr<DiskSpaceWatcher> service = new DiskSpaceWatcher();
> +    NS_ENSURE_TRUE(service, nullptr);
> +
> +    gDiskSpaceWatcher = service;

Just do 'gDiskSpaceWatcher = new DiskSpaceWatcher();'

@@ +64,5 @@
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +
> +  if (!strcmp(aTopic, "profile-after-change")) {
> +    if (Preferences::GetBool("disk_space_watcher.enabled", false)) {

You've already checked this, no need to do it again.

@@ +101,5 @@
> +  *aFreeSpace = mFreeSpace;
> +  return NS_OK;
> +}
> +
> +/* [noscript] void updateState (in bool isDiskFull, in long long freeSpace); */

This isn't relevant any more.

@@ +123,5 @@
> +  nsCOMPtr<nsISupports> subject;
> +  CallQueryInterface(gDiskSpaceWatcher.get(), getter_AddRefs(subject));
> +  MOZ_ASSERT(subject);
> +  observerService->NotifyObservers(subject,
> +                                   "disk-space-watcher",

This would be better as a #define in the header, maybe DISK_SPACE_WATCHER_TOPIC? That way other C++ consumers can have compile-time rather than runtime errors if this ever changes.

::: toolkit/components/diskspacewatcher/DiskSpaceWatcher.h
@@ +13,5 @@
> +  NS_DECL_ISUPPORTS
> +  NS_DECL_NSIDISKSPACEWATCHER
> +  NS_DECL_NSIOBSERVER
> +
> +  DiskSpaceWatcher();

This should be private (only called within FactoryCreate)

@@ +18,5 @@
> +
> +  static already_AddRefed<DiskSpaceWatcher>
> +  FactoryCreate();
> +
> +  NS_IMETHODIMP UpdateState(bool aIsDiskFull, uint64_t aFreeSpace);

This no longer needs to be NS_IMETHODIMP. Hopefully it can be a static void.

::: toolkit/components/diskspacewatcher/Makefile.in
@@ +19,5 @@
> +  DiskSpaceWatcher.cpp \
> +  $(NULL)
> +
> +include $(topsrcdir)/config/rules.mk
> +include $(topsrcdir)/ipc/chromium/chromium-config.mk

I'm not sure you actually need chromium-config.mk here. Can you try it without?
Attachment #746653 - Flags: review?(bent.mozilla)
Whiteboard: [status: awaiting review (bent trying for May 8th)][madrid] → [status: needs new patch][madrid]
Attachment #734288 - Flags: review-
Attached patch patch v7 (obsolete) — Splinter Review
Addressing comments, should be all fixed except this one:

> > +
> > +include $(topsrcdir)/config/rules.mk
> > +include $(topsrcdir)/ipc/chromium/chromium-config.mk
> 
> I'm not sure you actually need chromium-config.mk here. Can you try it
> without?

Compiling fails without chromium-config.mk because the Hal parts need ipc support.

I used UINT64_MAX as the initial value for mFreeSpace. This should be large enough to not hit available storage in devices too soon...
Attachment #746653 - Attachment is obsolete: true
Attachment #747544 - Flags: review?(bent.mozilla)
Attached patch patch v7Splinter Review
I forgot to qref some changes. sorry
Attachment #747544 - Attachment is obsolete: true
Attachment #747544 - Flags: review?(bent.mozilla)
Attachment #747549 - Flags: review?(bent.mozilla)
Comment on attachment 747549 [details] [diff] [review]
patch v7

Review of attachment 747549 [details] [diff] [review]:
-----------------------------------------------------------------

This looks great! A handful of nits but one bigger problem below (search for UINT64). r=me with these things fixed!

::: hal/gonk/GonkDiskSpaceWatcher.cpp
@@ +111,5 @@
> +class DiskSpaceNotifier : public nsRunnable
> +{
> +public:
> +  DiskSpaceNotifier(const bool aIsDiskFull, const uint64_t aFreeSpace)
> +                   : mIsDiskFull(aIsDiskFull), mFreeSpace(aFreeSpace) {}

Nit: below you have : on preceding line and initializer list closer to the left margin, pick one :)

@@ +113,5 @@
> +public:
> +  DiskSpaceNotifier(const bool aIsDiskFull, const uint64_t aFreeSpace)
> +                   : mIsDiskFull(aIsDiskFull), mFreeSpace(aFreeSpace) {}
> +
> +  NS_IMETHOD Run()

Assert that both these runnables run on the main thread.

@@ +115,5 @@
> +                   : mIsDiskFull(aIsDiskFull), mFreeSpace(aFreeSpace) {}
> +
> +  NS_IMETHOD Run()
> +  {
> +

Nit: Nuke extra newline, here and in your other little runnable below.

@@ +120,5 @@
> +    DiskSpaceWatcher::UpdateState(mIsDiskFull, mFreeSpace);
> +    return NS_OK;
> +  }
> +
> +  private:

Nit: This should line up with 'public' above.

@@ +231,5 @@
> +    len = read(aFd, buf, sizeof(buf));
> +  } while(len == -1 && errno == EINTR);
> +
> +  // We should get an exact multiple of fanotify_event_metadata
> +  if (len <= 0 || (len % sizeof(fanotify_event_metadata) != 0)) {

Nit: Let's make this 'sizeof(*fem)' to keep the two in sync.

@@ +236,5 @@
> +    printf_stderr("About to crash: fanotify_event_metadata read error.");
> +    MOZ_CRASH();
> +  }
> +
> +  if ((uint32_t)len >= sizeof(fanotify_event_metadata)) {

Nit: No need for this check now.

@@ +237,5 @@
> +    MOZ_CRASH();
> +  }
> +
> +  if ((uint32_t)len >= sizeof(fanotify_event_metadata)) {
> +    fem = (fanotify_event_metadata *)buf;

Nit: reinterpret_cast<fanotify_event_metadata *>(buf)

@@ +250,5 @@
> +        // low and high thresholds.
> +        // Once we are in 'full' mode we send updates for all size changes with
> +        // a minimum of time between messages or when we cross a size change
> +        // threshold.
> +        TimeStamp now = TimeStamp::Now();

Nit: This is only used in the last case now so no need for the stack var any more (and you don't need to call it except in that one case)

@@ +251,5 @@
> +        // Once we are in 'full' mode we send updates for all size changes with
> +        // a minimum of time between messages or when we cross a size change
> +        // threshold.
> +        TimeStamp now = TimeStamp::Now();
> +        if (mFreeSpace == UINT64_MAX) {

This can't ever be true, right? You set mFreeSpace just above here.

::: hal/gonk/fanotify.h
@@ +1,2 @@
> +#ifndef _LINUX_FANOTIFY_H
> +#define _LINUX_FANOTIFY_H

Let's make sure we include a note about how we got this header

::: toolkit/components/diskspacewatcher/DiskSpaceWatcher.cpp
@@ +22,5 @@
> +
> +DiskSpaceWatcher::DiskSpaceWatcher()
> +{
> +  MOZ_ASSERT(NS_IsMainThread());
> +  MOZ_ASSERT(gDiskSpaceWatcher == nullptr);

Nit: !gDiskSpaceWatcher, here and below.

@@ +44,5 @@
> +    return nullptr;
> +  }
> +
> +  if (!gDiskSpaceWatcher) {
> +    // Create new instance, register, return

Nit: This comment is kinda wrong and hardly useful, let's just kill it ;)

@@ +94,5 @@
> +  return NS_OK;
> +}
> +
> +// static
> +void DiskSpaceWatcher::UpdateState(bool aIsDiskFull, uint64_t aFreeSpace)

You should bail out if gDiskSpaceWatcher is null here.

::: toolkit/components/diskspacewatcher/moz.build
@@ +3,5 @@
> +# This Source Code Form is subject to the terms of the Mozilla Public
> +# License, v. 2.0. If a copy of the MPL was not distributed with this
> +# file, You can obtain one at http://mozilla.org/MPL/2.0/.
> +
> +#TEST_DIRS += ['test']

Nit: Might as well remove this for now.

::: toolkit/components/diskspacewatcher/nsIDiskSpaceWatcher.idl
@@ +13,5 @@
> +
> +%{ C++
> +#define DISKSPACEWATCHER_CONTRACTID "@mozilla.org/toolkit/disk-space-watcher;1"
> +
> +#define DISKSPACEWATCHER_OBSERVER_TOPIC "disk-space-watcher"

This could use a comment, "Expect to see this with 'full' and 'free'", etc.
Attachment #747549 - Flags: review?(bent.mozilla) → review+
Whiteboard: [status: needs new patch][madrid] → [status: needs uplift][madrid]
I am not sure why this didn't break tbpl :\

/Volumes/SSD/dev/mozilla/mozilla-birch/hal/gonk/GonkDiskSpaceWatcher.cpp: In function 'void mozilla::hal_impl::StartDiskSpaceWatcher()':
/Volumes/SSD/dev/mozilla/mozilla-birch/hal/gonk/GonkDiskSpaceWatcher.cpp:281: error: 'aService' was not declared in this scope

Followup: https://hg.mozilla.org/projects/birch/rev/561fc560251e
https://hg.mozilla.org/mozilla-central/rev/1683f7d9f105
https://hg.mozilla.org/mozilla-central/rev/ff15e9b25bdd

Since I believe that ferjm has another follow-up fix coming for this bug, please set checkin-needed on it to ensure that it gets uplifted.
https://hg.mozilla.org/mozilla-central/rev/561fc560251e
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Keywords: verifyme
QA Contact: jsmith
https://hg.mozilla.org/releases/mozilla-b2g18/rev/a6d94d610cec
https://hg.mozilla.org/releases/mozilla-b2g18_v1_0_1/rev/ab0171b41ca7

This has the various bustage fixes rolled into it.
Flags: needinfo?(ryanvm)
Keywords: checkin-needed
Whiteboard: [status: needs uplift][madrid] → [madrid]
Flags: in-moztrap?(jsmith)
Hi Fernando, Fabrice and Ben, awesome that you got this and bug 861921 fixed on such a tight schedule. Great work!
Local storage & appcache tests definitely have revealed that the behavior we are trying to do here works - we're warning the user when we're low on storage, preventing writes when we're low on storage, and persisting a notification until the storage returns back to an okay state (> 10 MB).
Flags: in-moztrap?(jsmith) → in-moztrap-
Hi, I am coming from Bug 883621
see https://bugzilla.mozilla.org/show_bug.cgi?id=883621#c11

Is there any chance this can be implemented on linux (I am using TB)
and was bitten by disk full situation a few times in the past.
Yes, I know that the data under linux can be anywhere, but
we can try
 - the mail directory (where mail folders are stored for the current user)
 - TMP, TMPDIR, etc. (where the tempory files are created)
 - user's ~/.profile directories, 
 - where user's cache goes, etc.
At least the two directories that should be monitored
by nsOfflineCacheUpdateService.cpp, IndexedDatabaseManager.cpp
would be nice targets for such checking.

> WARNING: Could not get disk status from nsIDiskSpaceWatcher: file /REF-COMM-CENTRAL/comm-central/mozilla/uriloader/prefetch/nsOfflineCacheUpdateService.cpp, line 312
>WARNING: No disk space watcher component available!: file /REF-COMM-CENTRAL/comm-central/mozilla/dom/indexedDB/IndexedDatabaseManager.cpp, line 301

Rather than letting developers concocting ad-hoc solutions, I would like to see
a flexible infrastructure such as this one.

Oh well, I am not sure which bugzilla entries I should be submitting this RFE now.

TIA
Blocks: 1086987
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: