Closed Bug 988301 Opened 6 years ago Closed 6 years ago

Avoid main-thread IO in Sync code

Categories

(Firefox :: Sync, defect)

defect
Not set

Tracking

()

RESOLVED FIXED
mozilla31

People

(Reporter: rvitillo, Assigned: rvitillo)

References

(Blocks 1 open bug)

Details

(Keywords: main-thread-io)

Attachments

(1 file, 9 obsolete files)

The following files are main-thread IO offenders and appear in about 5-10% of Telemetry's submissions.

{profile}\weave\changes
{profile}\weave\failed
{profile}\weave\toFetch
{profile}\weave\changes\forms.json
{profile}\weave\changes\passwords.json
{profile}\weave\changes\bookmarks.json
{profile}\weave\changes\history.json
{profile}\weave\changes\clients.json
{profile}\weave\changes\addons.json
{profile}\weave\toFetch\tabs.json
{profile}\weave\toFetch\prefs.json
{profile}\weave\toFetch\history.json
{profile}\weave\toFetch\passwords.json
{profile}\weave\toFetch\forms.json
{profile}\weave\toFetch\bookmarks.json
{profile}\weave\toFetch\clients.json
{profile}\weave\failed\passwords.json
{profile}\weave\failed\tabs.json
{profile}\weave\failed\clients.json
{profile}\weave\failed\history.json
{profile}\weave\failed\addons.json
{profile}\weave\failed\bookmarks.json
{profile}\weave\failed\prefs.json
{profile}\weave\toFetch\addons.json
{profile}\weave\failed\forms.json
{profile}\weave\addonsreconciler.json
{profile}\weave\changes\adblockplus.json
{profile}\weave\failed\adblockplus.json
{profile}\weave\toFetch\adblockplus.json
Yeah, Sync does everything bad that you wish you could go back to 2008 and tell people not to do: main-thread IO, spinning the event loop, synchronous programming style.

Fixing this without introducing lots more event loop spinning is a big chunk of work, and I don't see anything that's changed in the org that will prioritize a rewrite. Fingers crossed.
While properly fixing Sync would involve a nearly complete rewrite, I think the I/O issues could be addressed.

It seems to me the culprit is services/sync/modules/util.js:{jsonLoad, jsonSave}. Those functions already use callbacks. We should be able to drop in OS.File to make this problem go away.
(In reply to Gregory Szorc [:gps] from comment #2)
> While properly fixing Sync would involve a nearly complete rewrite, I think
> the I/O issues could be addressed.
> 
> It seems to me the culprit is services/sync/modules/util.js:{jsonLoad,
> jsonSave}. Those functions already use callbacks. We should be able to drop
> in OS.File to make this problem go away.

Aren't NetUtil.asyncFetch and NetUtil.asyncCopy already doing IO off the main thread?
(In reply to Gregory Szorc [:gps] from comment #2)

> It seems to me the culprit is services/sync/modules/util.js:{jsonLoad,
> jsonSave}. Those functions already use callbacks. We should be able to drop
> in OS.File to make this problem go away.

Indeed, we have services/common/util.js:readJSON,writeJSON that use OS.File and return promises.
(In reply to Marco Castelluccio [:marco] from comment #3)
> Aren't NetUtil.asyncFetch and NetUtil.asyncCopy already doing IO off the
> main thread?

I think bug 928340 and bug 801599 are good answers to this (to sum up, they are not so async as the name states)
I'll take this.
Assignee: nobody → mar.castelluccio
Status: NEW → ASSIGNED
I'm mostly done. There are still a couple of failing tests, I think it is a timing problem.

For example, in test_syncengine_sync.js, |test_processIncoming_previousFailed| fails only if I also run the previous |test_processIncoming_notify_count| test.
In  |test_processIncoming_previousFailed| the failure is in the initial environment checks. So, it looks like the previous test left the environment in an unclean state (indeed if I make jsonSave a no-op, the test passes).
Attached patch Patch (obsolete) — Splinter Review
Actually the problem was that in services/sync/modules/util.js we're importing FileUtils.jsm and NetUtil.jsm and re-exporting them to users of the module. I did remove the imports because they weren't necessary in the util.js module and I thought that users of the module would break in a clear way so that I could fix them one by one.
Instead, there wasn't any error message.
Attachment #8397466 - Flags: review?(rnewman)
Comment on attachment 8397466 [details] [diff] [review]
Patch

There are still a couple of failures.
test_tracker_addChanged.js - Timeout on try, OK locally.
test_syncengine_sync.js - Fails in a seemingly random way (I think there's a race condition between the tests contained in the file).
Attachment #8397466 - Flags: review?(rnewman)
I don't have time at the moment to fix those issues, I may re-take this in the future if no one else steps in.
Assignee: mar.castelluccio → nobody
Status: ASSIGNED → NEW
Attached patch Patch (obsolete) — Splinter Review
This is the latest version of the patch.
Attachment #8397466 - Attachment is obsolete: true
Summary: Avoid main-thread IO → Avoid main-thread IO in Sync code
Assignee: nobody → rvitillo
Attachment #8398029 - Attachment is obsolete: true
Attachment #8399260 - Flags: review?(rnewman)
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #12)
> Created attachment 8399260 [details] [diff] [review]
> Avoid main-thread IO in Sync code, v3

https://tbpl.mozilla.org/?tree=Try&rev=b15402eee237
Comment on attachment 8399260 [details] [diff] [review]
Avoid main-thread IO in Sync code, v3

Review of attachment 8399260 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/sync/modules/policies.js
@@ +13,5 @@
>  Cu.import("resource://services-sync/constants.js");
>  Cu.import("resource://services-sync/engines.js");
>  Cu.import("resource://services-sync/util.js");
> +Cu.import("resource://gre/modules/FileUtils.jsm");
> +Cu.import("resource://gre/modules/NetUtil.jsm");

Do you need this here?

::: services/sync/modules/util.js
@@ +344,5 @@
>      if (that._log) {
>        that._log.trace("Loading json from disk: " + filePath);
>      }
>  
> +    if (!(yield OS.File.exists(path))) {

Actually, that's a counter-pattern.
The right way to do this is to
try {
  yield OS.File.read() // Or other reading op
} catch (ex if ex instanceof OS.File.Error
            && ex.becauseNoSuchFile) {
  // Handle non-existent file
}

@@ +381,5 @@
>     *        constant on error or null if no error was encountered (and
>     *        the file saved successfully).
>     */
> +  jsonSave: Task.async(function*(filePath, that, obj, callback) {
> +    let dir = OS.Path.join(OS.Constants.Path.profileDir, "weave");

Unless you intend to uplift, you can now simply write
 yield OS.File.makeDir(dir, {from: OS.Constants.Path.profileDir})l
this should also be faster than the code you have below.

::: services/sync/tests/unit/test_errorhandler.js
@@ +9,5 @@
>  Cu.import("resource://services-sync/service.js");
>  Cu.import("resource://services-sync/status.js");
>  Cu.import("resource://services-sync/util.js");
>  Cu.import("resource://testing-common/services/sync/utils.js");
> +Cu.import("resource://gre/modules/FileUtils.jsm");

Do you need this here?
Attachment #8399260 - Flags: review?(rnewman) → feedback+
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #12)
> Created attachment 8399260 [details] [diff] [review]
> Avoid main-thread IO in Sync code, v3

Right, I was modifying everything to use promises, but this is safer.

(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from comment #14)
> ::: services/sync/modules/policies.js
> @@ +13,5 @@
> >  Cu.import("resource://services-sync/constants.js");
> >  Cu.import("resource://services-sync/engines.js");
> >  Cu.import("resource://services-sync/util.js");
> > +Cu.import("resource://gre/modules/FileUtils.jsm");
> > +Cu.import("resource://gre/modules/NetUtil.jsm");
> 
> Do you need this here?

The FileUtils.jsm and NetUtil.jsm modules were imported and re-exported by util.js. So if we stop doing this, we need to import them directly when needed.

> ::: services/sync/modules/util.js
> @@ +344,5 @@
> >      if (that._log) {
> >        that._log.trace("Loading json from disk: " + filePath);
> >      }
> >  
> > +    if (!(yield OS.File.exists(path))) {
> 
> Actually, that's a counter-pattern.
> The right way to do this is to
> try {
>   yield OS.File.read() // Or other reading op
> } catch (ex if ex instanceof OS.File.Error
>             && ex.becauseNoSuchFile) {
>   // Handle non-existent file
> }

The old code called |callback.call(that)| for every file-related error. So, to keep the old behavior, you could just do:
try {
  yield read
} catch (e if e instanceof OS.File.Error) {
  callback.call(that);
} ...
Attachment #8399260 - Attachment is obsolete: true
Attachment #8399458 - Flags: review?(dteller)
Comment on attachment 8399458 [details] [diff] [review]
Avoid main-thread IO in Sync code, v4

Review of attachment 8399458 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/sync/modules/util.js
@@ +351,5 @@
> +      json = yield CommonUtils.readJSON(path)
> +    } catch (e) {
> +      if (that._log) {
> +        that._log.debug("Failed to load json: " +
> +                        CommonUtils.exceptionStr(e));

It is my understanding that we don't want to report an error if the file just doesn't exist.

@@ +381,5 @@
> +    let error = null;
> +
> +    try {
> +      yield OS.File.makeDir(dir, { ignoreExisting: true,
> +                                  from: OS.Constants.Path.profileDir })

Nit: I'd intend so that |from| is below |ignoreExisting|.

@@ -400,5 @@
> -    let json = typeof obj == "function" ? obj.call(that) : obj;
> -    let out = JSON.stringify(json);
> -
> -    let fos = FileUtils.openSafeFileOutputStream(file);
> -    let is = this._utf8Converter.convertToInputStream(out);

Do you still need this._utf8Converter?
Attachment #8399458 - Flags: review?(dteller) → feedback+
(In reply to David Rajchenbach Teller [:Yoric] (please use "needinfo?") from comment #17)
> @@ -400,5 @@
> > -    let json = typeof obj == "function" ? obj.call(that) : obj;
> > -    let out = JSON.stringify(json);
> > -
> > -    let fos = FileUtils.openSafeFileOutputStream(file);
> > -    let is = this._utf8Converter.convertToInputStream(out);
> 
> Do you still need this._utf8Converter?

Yes, it's needed by encodeUTF8.
Attachment #8399458 - Attachment is obsolete: true
Attachment #8399494 - Flags: review?(dteller)
Comment on attachment 8399494 [details] [diff] [review]
Avoid main-thread IO in Sync code, v5

Review of attachment 8399494 [details] [diff] [review]:
-----------------------------------------------------------------

::: services/sync/modules/util.js
@@ +353,5 @@
> +    } catch (e) {
> +      if (that._log) {
> +        that._log.debug("Failed to load json: " +
> +                        CommonUtils.exceptionStr(e));
> +      }

If you want to keep the old behavior here (no logging for IO errors):
try {
  json = yield CommonUtils.readJSON(path);
} catch (e if e instanceof OS.File.Error) {
} catch (e) {
  ...
}

Note that this is exactly equivalent to what you're doing, because readJSON will never fail with e.becauseExists.

Otherwise, if you want to log all errors:
try {
  json = yield CommonUtils.readJSON(path);
} catch (e) {
  ...
}
(In reply to Marco Castelluccio [:marco] from comment #20)
> Comment on attachment 8399494 [details] [diff] [review]
> Avoid main-thread IO in Sync code, v5
> 
> Review of attachment 8399494 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: services/sync/modules/util.js
> @@ +353,5 @@
> > +    } catch (e) {
> > +      if (that._log) {
> > +        that._log.debug("Failed to load json: " +
> > +                        CommonUtils.exceptionStr(e));
> > +      }
> 
> If you want to keep the old behavior here (no logging for IO errors):
> try {
>   json = yield CommonUtils.readJSON(path);
> } catch (e if e instanceof OS.File.Error) {
> } catch (e) {
>   ...
> }

My bad, what I really had in mind was e.becauseNoSuchFile...
Attachment #8399494 - Attachment is obsolete: true
Attachment #8399494 - Flags: review?(dteller)
Attachment #8399530 - Flags: review?(dteller)
Comment on attachment 8399530 [details] [diff] [review]
Avoid main-thread IO in Sync code, v6

Review of attachment 8399530 [details] [diff] [review]:
-----------------------------------------------------------------

r=me, with a few changes below.
As I don't have review rights on that directory, you will need review from someone who has.

::: services/sync/modules/util.js
@@ +348,5 @@
> +    let json;
> +
> +    try {
> +      json = yield CommonUtils.readJSON(path)
> +    } catch (e if e instanceof OS.File.Error) {

This actually doesn't make sense – not your fault, it didn't make sense before either.

You whould ignore only |if e instanceof OS.File.Error && e.becauseNoSuchFile|. Other errors should be logged. Also, please add a comment
 // Ignore non-existent files
to the corresponding branch.

@@ +377,5 @@
>     *        the file saved successfully).
>     */
> +  jsonSave: Task.async(function*(filePath, that, obj, callback) {
> +    let path = OS.Path.join(OS.Constants.Path.profileDir, "weave", filePath + ".json");
> +    let dir = OS.Path.dirname(path);

Can you change the order?
let dir = OS.Path.join(..., "weave")
let path = OS.Path.join(dir, filePath + ".json")

@@ +381,5 @@
> +    let dir = OS.Path.dirname(path);
> +    let error = null;
> +
> +    try {
> +      yield OS.File.makeDir(dir, { ignoreExisting: true,

You don't need ignoreExisting anymore. It's now default.

@@ +382,5 @@
> +    let error = null;
> +
> +    try {
> +      yield OS.File.makeDir(dir, { ignoreExisting: true,
> +                                   from: OS.Constants.Path.profileDir })

Nit: semi-column.
Attachment #8399530 - Flags: review?(dteller) → feedback+
By the way, perhaps we need to plug stuff into AsyncShutdown. Or is it ok if we are killed during an async write?
Flags: needinfo?(gps)
Plugging Sync into AsyncShutdown should probably be done. However, I think the failure window already exists and doesn't change with this patch (please correct me if my assumptions about how OS.File handles shutdown are accurate) and thus we should not worry about it here.
Flags: needinfo?(gps)
Attachment #8399530 - Attachment is obsolete: true
Attachment #8399913 - Flags: review?(rnewman)
Comment on attachment 8399913 [details] [diff] [review]
Avoid main-thread IO in Sync code, v7

Review of attachment 8399913 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine to me, assuming tests pass.

::: services/sync/modules/util.js
@@ +347,5 @@
>  
> +    let json;
> +
> +    try {
> +      json = yield CommonUtils.readJSON(path)

Trailing ;

@@ +349,5 @@
> +
> +    try {
> +      json = yield CommonUtils.readJSON(path)
> +    } catch (e if e instanceof OS.File.Error && e.becauseNoSuchFile) {
> +      // Ignore non-existent files

Trailing .
Attachment #8399913 - Flags: review?(rnewman) → review+
After the latest changes the following xpcshell test timeouts on all Windows builds:

TEST-UNEXPECTED-FAIL | C:\slave\test\build\tests\xpcshell\tests\services\sync\tests\unit\test_tracker_addChanged.js | test failed (with xpcshell return code: 15)
The filePath argument is a path in Unix format, maybe this is causing troubles for OS.Path.join on Windows?
Right. We expect paths to use native format. You should create filePath using OS.Path.join().
https://hg.mozilla.org/integration/fx-team/rev/da1c426d69fa
Flags: in-testsuite+
Keywords: checkin-needed
Whiteboard: [fixed-in-fx-team]
https://hg.mozilla.org/mozilla-central/rev/da1c426d69fa
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Whiteboard: [fixed-in-fx-team]
Target Milestone: --- → mozilla31
Blocks: 1044700
Component: Firefox Sync: Backend → Sync
Product: Cloud Services → Firefox
You need to log in before you can comment on or make changes to this bug.