Closed Bug 893276 Opened 11 years ago Closed 7 years ago

sdk/simple-storage is writing to disk every 5 mins !

Categories

(Add-on SDK Graveyard :: General, defect, P2)

defect

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: evold, Unassigned)

References

Details

Attachments

(1 file, 2 obsolete files)

This is insane.
Frankly I think we should deprecate this..
Flags: needinfo?(rFobic)
Flags: needinfo?(jgriffiths)
Flags: needinfo?(dtownsend+bugmail)
I'd be happy to write a JEP for something that offers a `write` method.
Another option is to only write after setting (with a timer so that setting multiple times at once doesn't cause multiple saves, and instead only one save occurs)
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #3)
> Another option is to only write after setting (with a timer so that setting
> multiple times at once doesn't cause multiple saves, and instead only one
> save occurs)

This is what I do with Scriptish, and it this seems to be how most old school add-ons behave..
If we do keep this 5min interval then there should be no write operation if the content hasn't changed, at the very least..
Can we discuss this at the next team meeting? I don't have a good feel for what the trade-offs are.
Flags: needinfo?(jgriffiths)
Here are my thoughts on this subject:


- Lot's of IO is unfortunate, but since it happens off the main UI thread I don't consider it to be that critical.
- I don't think changing API to manual IO is a viable direction. It would feel as step backwards.
- Simple storage is no different from `localStorage` if we wanna fix this I think we should just pursue exposure of `localStorage` per add-on. There are some platform issues to be addressed to make it possible, but they're worth fixing & maybe workarounding in someways until then.
- Writes on changes would obviously be a better option that what we're currently doing. I believe it's not being done originally since there were no proxies available to make that possible. So changing implementation to use proxies and do
IO on mutations is an option, although I'd prefer localStorage option.
- I would also consider doing some benchmarking before investing a lot of time into this to be sure that these changes would have any visible impact justifying efforts put into.
Flags: needinfo?(rFobic)
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #1)
> Frankly I think we should deprecate this..

This is silly and we should not be causing disk I/O every 5 minutes, however it seems like there are ways to avoid that (only writing on change, after a delay maybe when user is idle). What would be the reasons for deprecating this module entirely?
Flags: needinfo?(dtownsend+bugmail)
There is an additional possible problem that some other addons might 'police' the profile directory, and clean out the persisted JSON file.  Right now, we have a hunch that this is happening, but need to verify it.
(Our (UX Prototyping, Test Pilot) needs around this are for storing the starts of long-running timers, experiment phase settings, and other things that in the Bad Old Days, would have been stored in settings.  A small 'persist-across-restart' store is probably enough.)
(In reply to Gregg Lind (User Research - Test Pilot) from comment #9)
> There is an additional possible problem that some other addons might
> 'police' the profile directory, and clean out the persisted JSON file. 
> Right now, we have a hunch that this is happening, but need to verify it.

What issue are you experiencing? and do you know the file is gone? or is it just the simple-storage that appears to be missing?
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #11)
> (In reply to Gregg Lind (User Research - Test Pilot) from comment #9)
> > There is an additional possible problem that some other addons might
> > 'police' the profile directory, and clean out the persisted JSON file. 
> > Right now, we have a hunch that this is happening, but need to verify it.
> 
> What issue are you experiencing? and do you know the file is gone? or is it
> just the simple-storage that appears to be missing?

This sounds unrelated to the disk I/O issue, can we spin it off into a separate bug?
(In reply to Irakli Gozilalishvili [:irakli] [:gozala] [@gozala] from comment #7)
> - Lot's of IO is unfortunate, but since it happens off the main UI thread I
> don't consider it to be that critical.

Forgive me for not being familiar with addon-sdk stuff, but what ensures that the IO happens off the main thread? The code in io/text-streams.js / io/file.js seem to use sync I/O APIs, is there something that enforces that those run in a separate process/worker? 

> - Simple storage is no different from `localStorage` if we wanna fix this I
> think we should just pursue exposure of `localStorage` per add-on. There are
> some platform issues to be addressed to make it possible, but they're worth
> fixing & maybe workarounding in someways until then.

We're trying to discourage and deprecate localStorage in general, so exposing it more widely doesn't seem like a good idea.
Switching to localStorage is a non-starter, for reasons gavin mentioned. (In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #13)
> (In reply to Irakli Gozilalishvili [:irakli] [:gozala] [@gozala] from
> comment #7)
> > - Lot's of IO is unfortunate, but since it happens off the main UI thread I
> > don't consider it to be that critical.
> 
> Forgive me for not being familiar with addon-sdk stuff, but what ensures
> that the IO happens off the main thread? The code in io/text-streams.js /
> io/file.js seem to use sync I/O APIs, is there something that enforces that
> those run in a separate process/worker?

We write the data with NetUtil.asyncCopy. Opening and closing the file are still sync I think though.

> > - Simple storage is no different from `localStorage` if we wanna fix this I
> > think we should just pursue exposure of `localStorage` per add-on. There are
> > some platform issues to be addressed to make it possible, but they're worth
> > fixing & maybe workarounding in someways until then.
> 
> We're trying to discourage and deprecate localStorage in general, so
> exposing it more widely doesn't seem like a good idea.

Yeah localStorage isn't the right solution here. We should either deprecate this module (but I haven't heard any reasons to do so yet) or clean up the redundant disk I/O.
I have a mostly working patch that switches to only writing on change unfortunately it runs afoul of bug 895223
Depends on: 895223
Attached file Pull request (obsolete) —
This makes us only write after data is modified.
Attachment #777955 - Flags: review?(evold)
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #13)
> (In reply to Irakli Gozilalishvili [:irakli] [:gozala] [@gozala] from
> comment #7)
> > - Simple storage is no different from `localStorage` if we wanna fix this I
> > think we should just pursue exposure of `localStorage` per add-on. There are
> > some platform issues to be addressed to make it possible, but they're worth
> > fixing & maybe workarounding in someways until then.
> 
> We're trying to discourage and deprecate localStorage in general, so
> exposing it more widely doesn't seem like a good idea.

Interesting, why is this?
Flags: needinfo?(gavin.sharp)
(In reply to Dave Townsend (:Mossop) from comment #8)
> (In reply to Erik Vold [:erikvold] [:ztatic] from comment #1)
> > Frankly I think we should deprecate this..
> 
> This is silly and we should not be causing disk I/O every 5 minutes, however
> it seems like there are ways to avoid that (only writing on change, after a
> delay maybe when user is idle). What would be the reasons for deprecating
> this module entirely?

It's not necessary, but I think this module is far less useful than an alternate could be, here are some issues with it:

* Must use JSON, no other format choice.
* I cannot choose when to do a save, thus it is impossible to be optimal, and I hate the idea of useless disk writing.
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #18)
> (In reply to Dave Townsend (:Mossop) from comment #8)
> > (In reply to Erik Vold [:erikvold] [:ztatic] from comment #1)
> > > Frankly I think we should deprecate this..
> > 
> > This is silly and we should not be causing disk I/O every 5 minutes, however
> > it seems like there are ways to avoid that (only writing on change, after a
> > delay maybe when user is idle). What would be the reasons for deprecating
> > this module entirely?
> 
> It's not necessary, but I think this module is far less useful than an
> alternate could be, here are some issues with it:
> 
> * Must use JSON, no other format choice.

Well JS objects, arrays and primitives, you don't need to worry about JSON. What other kinds of data would you want to store from JS?

> * I cannot choose when to do a save, thus it is impossible to be optimal,
> and I hate the idea of useless disk writing.

The patch I have gets rid of any useless disk writing, we just write whenever data is changed. We could go even further and only write on shutdown, though that loses us some protection from crashes.

I'm not against creating a better settings storage module, though we already have indexeddb too, but assuming we can solve the problems with simple-storage I don't see a need to deprecate it and force all users to switch unnecessarily right now.
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #17)
> > We're trying to discourage and deprecate localStorage in general, so
> > exposing it more widely doesn't seem like a good idea.
> 
> Interesting, why is this?

A synchronous API dependent on disk IO is just generally a bad idea. We've made some improvements recently (bug 600307, bug 807021) but the fundamental problem of page script/responsiveness being blocked on IO remains.
Flags: needinfo?(gavin.sharp)
Priority: -- → P1
Ah I just noticed that simple-storage does a sync io when reading from disk.. yet another reason to deprecate it.
(In reply to :Gavin Sharp (use gavin@gavinsharp.com for email) from comment #20)
> (In reply to Erik Vold [:erikvold] [:ztatic] from comment #17)
> > > We're trying to discourage and deprecate localStorage in general, so
> > > exposing it more widely doesn't seem like a good idea.
> > 
> > Interesting, why is this?
> 
> A synchronous API dependent on disk IO is just generally a bad idea. We've
> made some improvements recently (bug 600307, bug 807021) but the fundamental
> problem of page script/responsiveness being blocked on IO remains.

The sync API for simple-storage is regrettable.
Comment on attachment 777955 [details]
Pull request

Irakli I'm going to wait for your a+ before reviewing this optimization.  If you'd like to do the review too then be my guest, I'd rather not.
Attachment #777955 - Flags: feedback?(rFobic)
Comment on attachment 777955 [details]
Pull request

Erik: I don't mind taking over this review.

This approach is fine but does has a flaw of backwards incompatibility. Which maybe alright but I'd like to ensure we understand consequences.

For example:

```js
var point = { x: 0, y: 0 };
storage.point = point;
point.x = 1;
```

This is very simple example, but you can imagine more complicated one. Problem is attempts to save changes to point will only be made at shutdown (assuming it was clean one). But then since we write async there's no guarantees data will end up
saved. That being said current implementation has "maybe saves on shutdown" flow
too but I don't know which one would be more error prone in practice.

I think we could preserve behavior backwards compatible if instead of doing proxies we just state snapshot via `JSON.stringify` after delays and compared it to previous one if there is a diff we can do actual writes to disk.
Attachment #777955 - Flags: review?(evold)
Attachment #777955 - Flags: feedback?(rFobic)
Attachment #777955 - Flags: feedback+
(In reply to Irakli Gozilalishvili [:irakli] [:gozala] [@gozala] from comment #24)
> Comment on attachment 777955 [details]
> Pull request
> 
> Erik: I don't mind taking over this review.
> 
> This approach is fine but does has a flaw of backwards incompatibility.
> Which maybe alright but I'd like to ensure we understand consequences.
> 
> For example:
> 
> ```js
> var point = { x: 0, y: 0 };
> storage.point = point;
> point.x = 1;
> ```
> 
> This is very simple example, but you can imagine more complicated one.
> Problem is attempts to save changes to point will only be made at shutdown
> (assuming it was clean one). But then since we write async there's no
> guarantees data will end up
> saved. That being said current implementation has "maybe saves on shutdown"
> flow
> too but I don't know which one would be more error prone in practice.
> 
> I think we could preserve behavior backwards compatible if instead of doing
> proxies we just state snapshot via `JSON.stringify` after delays and
> compared it to previous one if there is a diff we can do actual writes to
> disk.

Thanks Irakli.  I probably would've missed that.
(In reply to Irakli Gozilalishvili [:irakli] [:gozala] [@gozala] from comment #24)
> Comment on attachment 777955 [details]
> Pull request
> 
> Erik: I don't mind taking over this review.
> 
> This approach is fine but does has a flaw of backwards incompatibility.
> Which maybe alright but I'd like to ensure we understand consequences.
> 
> For example:
> 
> ```js
> var point = { x: 0, y: 0 };
> storage.point = point;
> point.x = 1;
> ```

Ah yes, that is problematic.

> This is very simple example, but you can imagine more complicated one.
> Problem is attempts to save changes to point will only be made at shutdown
> (assuming it was clean one). But then since we write async there's no
> guarantees data will end up
> saved. That being said current implementation has "maybe saves on shutdown"
> flow
> too but I don't know which one would be more error prone in practice.

I think it should always save on shutdown from manager.unload

> I think we could preserve behavior backwards compatible if instead of doing
> proxies we just state snapshot via `JSON.stringify` after delays and
> compared it to previous one if there is a diff we can do actual writes to
> disk.

I considered that but after seeing that the quota was 5MB that means potentially holding onto 5MB for the lifetime of the app and I'd hoped to avoid that. Maybe it's a better solution though.
(In reply to Dave Townsend (:Mossop) from comment #26)
> (In reply to Irakli Gozilalishvili [:irakli] [:gozala] [@gozala] from
> comment #24)
> > I think we could preserve behavior backwards compatible if instead of doing
> > proxies we just state snapshot via `JSON.stringify` after delays and
> > compared it to previous one if there is a diff we can do actual writes to
> > disk.
> 
> I considered that but after seeing that the quota was 5MB that means
> potentially holding onto 5MB for the lifetime of the app and I'd hoped to
> avoid that. Maybe it's a better solution though.

We could hash the snapshots.
(In reply to Erik Vold [:erikvold] [:ztatic] from comment #27)
> (In reply to Dave Townsend (:Mossop) from comment #26)
> > (In reply to Irakli Gozilalishvili [:irakli] [:gozala] [@gozala] from
> > comment #24)
> > > I think we could preserve behavior backwards compatible if instead of doing
> > > proxies we just state snapshot via `JSON.stringify` after delays and
> > > compared it to previous one if there is a diff we can do actual writes to
> > > disk.
> > 
> > I considered that but after seeing that the quota was 5MB that means
> > potentially holding onto 5MB for the lifetime of the app and I'd hoped to
> > avoid that. Maybe it's a better solution though.
> 
> We could hash the snapshots.

I like it, I'll see how costly that is for a large data set, I can probably do it asynchronously.
Comment on attachment 781351 [details]
Pointer to Github pull request: https://github.com/mozilla/addon-sdk/pull/1134

This uses the same tests from the previous pull request but instead uses hashing to check if the data has changed. I had to do an asynchronous hash which uses up around 30ms of main thread time for the full 5MB of data, a synchronous hash of 5MB was taking around 1.2s which is too much.
Attachment #781351 - Flags: review?(rFobic)
Irakli thinks we can make simple-storage reads async, let's talk about that issue in bug 898474
Attachment #781351 - Flags: review?(rFobic) → review+
(In reply to Dave Townsend (:Mossop) from comment #30)
> Comment on attachment 781351 [details]
> Pointer to Github pull request:
> https://github.com/mozilla/addon-sdk/pull/1134
> 
> This uses the same tests from the previous pull request but instead uses
> hashing to check if the data has changed. I had to do an asynchronous hash
> which uses up around 30ms of main thread time for the full 5MB of data, a
> synchronous hash of 5MB was taking around 1.2s which is too much.

30ms on what hardware? Are we making a change to poll for changes every 5seconds and keeping 5mb of data per jetpack in memory?
(In reply to Taras Glek (:taras) from comment #32)
> (In reply to Dave Townsend (:Mossop) from comment #30)
> > Comment on attachment 781351 [details]
> > Pointer to Github pull request:
> > https://github.com/mozilla/addon-sdk/pull/1134
> > 
> > This uses the same tests from the previous pull request but instead uses
> > hashing to check if the data has changed. I had to do an asynchronous hash
> > which uses up around 30ms of main thread time for the full 5MB of data, a
> > synchronous hash of 5MB was taking around 1.2s which is too much.
> 
> 30ms on what hardware?

On my Sony Vaio Z

> Are we making a change to poll for changes every
> 5seconds and keeping 5mb of data per jetpack in memory?

No, we're making a change so that instead of writing any data in simple-storage (anywhere from 1 byte to 5MB) out to disk every 5 minutes we instead hash the data and only write it to disk if the hash has changed.
30ms jank on a fast machine is likely 60-100ms on common hardware(eg our amd E350 slow reference machines). Sounds better than doing IO every 5s, but still terrible. Should just get rid of such apis.
(In reply to Taras Glek (:taras) from comment #34)
> 30ms jank on a fast machine is likely 60-100ms on common hardware(eg our amd
> E350 slow reference machines). Sounds better than doing IO every 5s, but
> still terrible. Should just get rid of such apis.

I don't think it was a solid 30ms, that was just the total time spent on the main thread, I'll have to test more to find out how many uses of the event loop that was
Also I don't know where the 5s came from
Hardware: x86 → All
Flags: needinfo?(dtownsend+bugmail)
Comment on attachment 781351 [details]
Pointer to Github pull request: https://github.com/mozilla/addon-sdk/pull/1134

Updated to master and switch to use the fs API, warrants another review pass I think then we can land this.
Attachment #781351 - Flags: review+ → review?(rFobic)
Flags: needinfo?(dtownsend+bugmail)
Comment on attachment 781351 [details]
Pointer to Github pull request: https://github.com/mozilla/addon-sdk/pull/1134

Still made few comments, but nothing critical, I'll leave it up to you weather you wanna address them before landing.
Attachment #781351 - Flags: review?(rFobic) → review+
Is there any chance to force simple-storage to save to disk? Or make it persist after every write, and not wait for event in the future?
(For all those who are suggesting deprecating the api, some in the wild stories for what I am using this for:

- JSONable addon specific configuration and prefs that would otherwise go to 'prefs'
- small amounts of transitory data storage

The alternative (for my use cases) would be to stick things in prefs, which is gross because of stringify, integer overflow on timestamps, etc.  

A 'proxy a pref' solution (myprefs.storage) with 'auto json', that exposes a 'proxy' (that looks like the `simplestore.storage` object) would be fine.
)
(In reply to urishmorgun from comment #39)
> Is there any chance to force simple-storage to save to disk? Or make it
> persist after every write, and not wait for event in the future?

Persisting after every write isn't something we can do without breaking the API, and if we're going to do that then we'd be more likely to switch to something other than simple storage.
(In reply to Dave Townsend (:Mossop) from comment #41)
> Persisting after every write isn't something we can do without breaking the
> API, and if we're going to do that then we'd be more likely to switch to
> something other than simple storage.

Well, then a write that user initiates would be very useful
Commits pushed to master at https://github.com/mozilla/addon-sdk

https://github.com/mozilla/addon-sdk/commit/08cf8c66717bcf8992286aee702abc4c8daebbf3
Bug 893276: Only write simple-storage to disk after it has been changed.

https://github.com/mozilla/addon-sdk/commit/4f3ef6ced9a1377afd7c3dc3bf232c0183fd47c6
Merge pull request #1134 from Mossop/bug893276_2

Bug 893276: Only write simple-storage to disk after it has been changed. r=@gozala
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Assignee: nobody → dtownsend+bugmail
https://tbpl.mozilla.org/php/getParsedLog.php?id=30928521&tree=Jetpack 

One of the test runs failed with this failure. Not sure how important that is.
Flags: needinfo?(dtownsend+bugmail)
Commit pushed to master at https://github.com/mozilla/addon-sdk

https://github.com/mozilla/addon-sdk/commit/11344b27cc0a1632ab757fda3b43b320e8c6830d
Revert bug 893276 due to permanent failures.

This reverts commit 4f3ef6ced9a1377afd7c3dc3bf232c0183fd47c6, reversing
changes made to 7b5f80a40e79506293c58ba1c337c93be45423b2.
Status: RESOLVED → REOPENED
Flags: needinfo?(dtownsend+bugmail)
Resolution: FIXED → ---
Commit pushed to master at https://github.com/mozilla/addon-sdk

https://github.com/mozilla/addon-sdk/commit/4b6a467c3ce61fbe96083f1de2885a06a9b853b5
Bug 893276: sdk/simple-storage is writing to disk every 5 mins. r=gozala
After unbitrotting this and running in through try I couldn't see any failures, so let's see what happens if I land it again.
Status: REOPENED → RESOLVED
Closed: 11 years ago10 years ago
Resolution: --- → FIXED
Commit pushed to master at https://github.com/mozilla/addon-sdk

https://github.com/mozilla/addon-sdk/commit/bb87af0616a451c89cb2349235d20b160963a952
Revert "Bug 893276: sdk/simple-storage is writing to disk every 5 mins. r=gozala"

This reverts commit 4b6a467c3ce61fbe96083f1de2885a06a9b853b5.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Depends on: 996200
Attached file fixes tests
Attachment #777955 - Attachment is obsolete: true
Attachment #781351 - Attachment is obsolete: true
Attachment #8407231 - Flags: review?(rFobic)
Attachment #8407231 - Flags: review?(rFobic) → review+
Commits pushed to master at https://github.com/mozilla/addon-sdk

https://github.com/mozilla/addon-sdk/commit/7df29f8d057645ab8428ed7eacac73eeaa26fc88
Bug 893276: Only write simple-storage json file if the data has changed.

https://github.com/mozilla/addon-sdk/commit/9a82bbc2ec7d8df9023aa11acac5d5f7e8c2ece1
Merge pull request #1462 from Mossop/bug893276

Bug 893276: Only write simple-storage json file if the data has changed. r=gozala
Status: REOPENED → RESOLVED
Closed: 10 years ago10 years ago
Resolution: --- → FIXED
Depends on: 1001074
Irakli, can you revert this patch and do the uplifts?

This patch caused a regression see bug 1058840, we should also write a test for this.
Flags: needinfo?(rFobic)
Assignee: dtownsend+bugmail → nobody
Someone should probably work on this.
Priority: P1 → --
(In reply to Erik Vold [:erikvold] (work week -> pto) from comment #54)
> Someone should probably work on this.

Adding it to the triage list.
Priority: -- → P2
https://bugzilla.mozilla.org/show_bug.cgi?id=1399562
Status: REOPENED → RESOLVED
Closed: 10 years ago7 years ago
Resolution: --- → INCOMPLETE
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: