Do automatic periodic async, off-mainthread writes for the prefs file when prefs change

NEW
Assigned to

Status

()

Core
Preferences: Backend
5 years ago
3 hours ago

People

(Reporter: vladan, Assigned: milan)

Tracking

(Depends on: 2 bugs, Blocks: 8 bugs, {main-thread-io})

16 Branch
main-thread-io
Points:
---
Dependency tree / graph
Bug Flags:
firefox-backlog +

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [Snappy:P2][qf:p1][bhr])

MozReview Requests

()

Submitter Diff Changes Open Issues Last Updated
Loading...
Error loading review requests:

Attachments

(5 attachments, 20 obsolete attachments)

2.18 KB, text/plain
Details
59 bytes, text/x-review-board-request
milan
: review?
bsmedberg
Details | Review
59 bytes, text/x-review-board-request
milan
: review?
bsmedberg
Details | Review
59 bytes, text/x-review-board-request
milan
: review?
bsmedberg
Details | Review
59 bytes, text/x-review-board-request
milan
: review?
bsmedberg
Details | Review
(Reporter)

Description

5 years ago
There is a significant number of Telemetry reports showing multi-second hangs from preferences being written to disk synchronously.

NtFlushBuffersFile (in ntdll.dll)
 -> FlushFileBuffersImplementation (in kernel32.dll)
 -> FileSync (in nspr4.dll)
 -> nsSafeFileOutputStream::Finish() (in xul.dll)
 -> nsBufferedOutputStream::Finish() (in xul.dll)
 -> mozilla::Preferences::WritePrefFile(nsIFile *) (in xul.dll)
 -> mozilla::Preferences::SavePrefFileInternal(nsIFile *) (in xul.dll)
 -> mozilla::Preferences::SavePrefFile(nsIFile *) (in xul.dll)
 -> NS_InvokeByIndex_P (in xul.dll)
 -> XPCWrappedNative::CallMethod(XPCCallContext &,XPCWrappedNative::CallMode) (in xul.dll)
I have the impression that backgrounding writes (with OS.File atomic write) would be pretty low-risk.
This stack trace is interesting, as it seems that the multi-second hangs originate from JavaScript. This would make a rewrite with OS.File rather natural, if we can serialize the preference file to a string.
(Reporter)

Updated

4 years ago
Whiteboard: [Snappy:P2]
We could use nsIBackgroundFileSaver, that stores the file through a background thread.
(In reply to Marco Castelluccio [:marco] from comment #3)
> We could use nsIBackgroundFileSaver, that stores the file through a
> background thread.

The main problem with this is that each write would require a new thread. I am almost sure that we have cases in which we modify many preferences in a row and save after each change, so this might possibly explode.
Keywords: main-thread-io

Comment 5

4 years ago
(In reply to David Rajchenbach Teller [:Yoric] from comment #4)
> (In reply to Marco Castelluccio [:marco] from comment #3)
> > We could use nsIBackgroundFileSaver, that stores the file through a
> > background thread.
> 
> The main problem with this is that each write would require a new thread. I
> am almost sure that we have cases in which we modify many preferences in a
> row and save after each change, so this might possibly explode.

I've just been pointed to this bug because of issues I am having with exactly this situation.  It occurs in instant-apply prefwindows when savePrefFile() is called after every preference value change.  While that might not sound so terrible, it can occur for every single typed (or deleted) character in a textbox.  That, or programmatically applying multiple preference changes, can result in the multi-second hangs you are seeing from the telemetry.

This particular instance could be semi-solved just by modifying the preferences binding.  That still leaves a blocking write of the prefs file at some point, just not instantly and not for every single character.  The non-instant-apply approach to writing out all the preferences when the dialog closes is to not call savePrefFile() until all the changes have been applied, controlled by a batching property on the preference element.
(In reply to Ian Nartowicz from comment #5)
> This particular instance could be semi-solved just by modifying the
> preferences binding.  That still leaves a blocking write of the prefs file
> at some point, just not instantly and not for every single character.

Yes, we should fix the preferences bindings to avoid calling savePrefFile too often (I thought there was a bug on file about that already; if not, please file one!). This bug is about the more general problem and is harder to fix.

Updated

3 years ago
Blocks: 683808

Comment 7

3 years ago
So, I would *think* that we can't just switch the API here, because on shutdown we actually need to be sure we write out the prefs, because of how the prefs system works right now.

Would switching to a regular async/OMT flush system work (ie saving automatically and asynchronously, off main thread, and saving no more often than once every 5 minutes or so (a la DelayedTask)), combining that with guarding savePrefFile to no-op iff passed "null" as the file and with no dirty state?

AFAICT this would eliminate the sync disk writes in most cases (where a pref has not been modified within 5 minutes of shutdown/backgrounding) and improve reliability in data loss scenarios.

Also, is this something we could work on as part of the backlog, Gavin? :-)
Flags: needinfo?(gavin.sharp)
If we wish to do this in JS (e.g. by rebinding Services.prefs to a PreferenceService.jsm, rather than the XPCOM implementation), I suggest taking advantage of the OS.File.writeAtomic backup mechanism http://wp.me/p52O1-mh and using it for recovery if the file is corrupted, which should go a long way towards ensuring that we do not corrupt data. Similarly, if we do it in native code, a similar rename/backup mechanism would be a good idea.

Regardless, please use this with AsyncShutdown. I hope to land the XPCOM version in August (bug 918317).

Comment 9

3 years ago
(In reply to David Rajchenbach Teller [:Yoric] (away until August 20th - use "needinfo") from comment #8)
> If we wish to do this in JS (e.g. by rebinding Services.prefs to a
> PreferenceService.jsm, rather than the XPCOM implementation)

This is generally an interesting idea - I don't know if there's a good reason not to implement the prefs system in JS instead of C++...

(we'd probably still want a small XPCOM wrapper for cpp, of course...)
If we're implementing in JS, DeferredSave.jsm could be what you're looking for - batches up "dirty" notifications for a configured delay period, then writes out the file asynchronously. It supports an explicit flush for shutdown time, and handles all the ugly edge cases of the data becoming dirty again part-way through a write and so on.

That said, prefs has so many C++ clients that I'd be pretty nervous about porting the main API to JS.
(In reply to :Gijs Kruitbosch (Bugmail catchup, needinfo if urgent) from comment #7)
> Would switching to a regular async/OMT flush system work (ie saving
> automatically and asynchronously, off main thread, and saving no more often
> than once every 5 minutes or so (a la DelayedTask)), combining that with
> guarding savePrefFile to no-op iff passed "null" as the file and with no
> dirty state?

Sounds like a good plan. 5 minutes sounds like a lot - on the order of "seconds" would probably achieve a similar effect while lowering the chances of needing to write out synchronously on shutdown.

> AFAICT this would eliminate the sync disk writes in most cases (where a pref
> has not been modified within 5 minutes of shutdown/backgrounding) and
> improve reliability in data loss scenarios.

I'm not sure that that's "most cases" (I could see some code setting prefs on shutdown regularly), but it would be interesting to find out (via telemetry)!

> Also, is this something we could work on as part of the backlog, Gavin? :-)

Certainly, if we have a clear plan. I'm less certain about the relative priority of this compared to other main-thread-IO/perf efforts, perhaps Vladan can advise?
Flags: needinfo?(gavin.sharp)

Comment 12

3 years ago
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #11)
> (In reply to :Gijs Kruitbosch (Bugmail catchup, needinfo if urgent) from
> comment #7)
> > Also, is this something we could work on as part of the backlog, Gavin? :-)
> 
> Certainly, if we have a clear plan. I'm less certain about the relative
> priority of this compared to other main-thread-IO/perf efforts, perhaps
> Vladan can advise?

Vladan? :-)
Flags: needinfo?(vdjeric)
I don't think you need telemetry to measure pref writes during shutdown: I bet you could do it just by running the browser a bit. In any case, I think we should tightly scope this bug to periodically writing the pref file off the main thread before shutdown when it's dirty. At shutdown if it's dirty we can either synchronously write it one last time from the main thread, or use some sort of shutdown blocker to wait for the last async write.

We shouldn't reimplement this in JS now. If we're thinking about a deeper prefs overhaul, we need a redesign (and I have thoughts I can dump somewhere).

I actually think it's a high priority for data quality to do the periodic save (e.g. wait 5 seconds after prefs are dirty and then write them), entirely apart from the jank associated with the main-thread write. So I'd be happy to say we should prioritize this soon.
Flags: needinfo?(vdjeric)

Comment 14

3 years ago
Sounds like there's enough consensus to include this in the backlog, at the least. I would also agree with Benjamin that it would probably make sense to prioritize soon. We can do followups (to change consumers or such) later once we at least have an async, OMT way to save prefs.
Flags: firefox-backlog+
Summary: Preferences are written to disk on the main thread → Do automatic periodic async, off-mainthread writes for the prefs file when prefs change, and no-op savePrefsFile(null) if pref state is not dirty

Updated

3 years ago
Depends on: 1047389
Let's make sure this works nicely with AsyncShutdown, so let's wait until bug 918317 has landed.
Depends on: 918317
Depends on: 1056170
We already skip writing to the default prefs file when the preferences aren't dirty: http://hg.mozilla.org/mozilla-central/annotate/ffdd1a398105/modules/libpref/Preferences.cpp#l916
Summary: Do automatic periodic async, off-mainthread writes for the prefs file when prefs change, and no-op savePrefsFile(null) if pref state is not dirty → Do automatic periodic async, off-mainthread writes for the prefs file when prefs change
(Reporter)

Comment 17

3 years ago
I can work on this. We can use this bug for the short-term fix described in the comments, and then we can address proper prefs storage (e.g. log storage) in bug 897049 and bug 956713
Assignee: nobody → vdjeric
Hey Vladan, is there any update on this? Do you think you could find someone to work on it?
Flags: needinfo?(vdjeric)
(This came up in bug 1185567 fyi)
(Reporter)

Comment 20

2 years ago
(In reply to Jared Wein [:jaws] (please needinfo? me) from comment #18)
> Hey Vladan, is there any update on this? Do you think you could find someone
> to work on it?

I wanted Aaron Klotz to work on this, but this quarter it's looking unlikely, given the lengthy list of other projects we have been asked to do. I'd be happy to turn this over to anyone interested and help them get started, discuss design, etc. If it's not done by Q4, I will ask Aaron to work on it.
Flags: needinfo?(vdjeric)
(Reporter)

Updated

2 years ago
Assignee: vdjeric → nobody

Comment 21

2 years ago
Is there other in-tree code that does OMT writes that could serve as some inspiration if I wanted to take a stab at this pre-Q4?
Flags: needinfo?(vdjeric)
(Reporter)

Comment 22

2 years ago
There's a bunch of C++ code that writes files OMT in the codebase. bug 880864 is one example, LocalStorage is another, there's also the reading of Telemetry.ShutdownTime.txt in Telemetry.cpp.

The basic approach is to send an event to the STS event queue and have the event do the I/O on the STS thread.

There's also an open question regarding acceptable delays for writing prefs. Some code (such as startupCrash detection) writes to prefs and expects the pref state to be resilient to a crash after the write call returns.
Flags: needinfo?(vdjeric)

Updated

2 years ago
Flags: needinfo?(gijskruitbosch+bugs)

Comment 23

a year ago
Created attachment 8703278 [details] [diff] [review]
do async omt pref writes - WIP, f?vladan

Happy new year!

It seems I didn't get anything done before Q4, and nobody else did anything in Q4 either, so today/yesterday I figured I'd have a poke at this.

This is a pretty rough WIP. I haven't done concurrency in C++ before. I'm mostly wondering if this is the right approach, if there are glaring things I've missed, or if there's a simpler way to do this.

Some notes:
- I left the SavePrefFile API alone (ie it is still sync), because there are some API consumers that I'm fairly sure assume that they get a sync write out of it.
- Telemetry and a number of other things write to the prefs right after startup which triggers a flurry of separate writes, so I ended up upping the delay to 15s instead of just 5
- I haven't implemented shutdown blocking yet. AFAICT, both pre- and post-patch there is no guarantee we'll write an updated file before shutdown (!). Happy to look at doing that, but it seems there's no MDN docs for the C++ version of the AsyncShutdown APIs.
- There are currently shutdown-time pref writes that don't get flushed to disk. I noticed SocialService.jsm that does this for xpcom-shutdown (ie sets a pref, potentially for no reason, but doesn't flush to disk), there might be others. So if I'm right that there's no guarantee right now, implementing such a guarantee might potentially regress shutdown performance (oh, the irony) if we don't update consumers
- I've not updated any consumers (that frequently call savePrefFile, that is) in this patch, but arguably we should do that when we land this and/or soon after.
Attachment #8703278 - Flags: feedback?(vladan.bugzilla)

Updated

a year ago
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED

Comment 24

a year ago
Created attachment 8703279 [details] [diff] [review]
Debug printfs I used

Dunno if this is helpful, or could be helpful if I converted it to use "proper" NSPR logging or something, but I did use them to debug and check that my implementation was doing vaguely sane things, so I figured I might as well add them here.
Flags: needinfo?(gijskruitbosch+bugs)
(Reporter)

Comment 25

a year ago
Comment on attachment 8703278 [details] [diff] [review]
do async omt pref writes - WIP, f?vladan

Review of attachment 8703278 [details] [diff] [review]:
-----------------------------------------------------------------

I'm on PTO this week, so redirecting to aklotz
Attachment #8703278 - Flags: feedback?(vladan.bugzilla) → feedback?(aklotz)
Comment on attachment 8703278 [details] [diff] [review]
do async omt pref writes - WIP, f?vladan

Review of attachment 8703278 [details] [diff] [review]:
-----------------------------------------------------------------

OK, I do have a few suggestions. I'm not super familiar with libpref code so it's probably best to ask one of those peers to look at this as well.

::: modules/libpref/Preferences.cpp
@@ +107,5 @@
> +  NS_LINEBREAK;
> +
> +static const uint32_t kPrefDelayedSaveTimeoutMs = 15000;
> +
> +

Nit: extra blank line

@@ +220,5 @@
> +{
> +public:
> +  PreferenceRunnableSaver(nsIFile* aFile)
> +  {
> +    mFile = aFile;

Nit: please put these in an initializer list.

@@ +304,5 @@
> +    }
> +  }
> +
> +private:
> +  RefPtr<nsIFile> mFile;

nsCOMPtr since you're typing on an interface and not a concrete class.

::: modules/libpref/Preferences.h
@@ +375,5 @@
>  
>  private:
>    nsCOMPtr<nsIFile>        mCurrentFile;
>  
> +  RefPtr<nsITimer>         mCurrentOMTSaveTimer;

This should be nsCOMPtr since it is typed on an interface instead of a concrete class.

::: modules/libpref/prefapi.cpp
@@ +987,5 @@
> +void PREF_MarkDirty()
> +{
> +  if (!gDirty) {
> +    gDirty = true;
> +    gDirtyCallback();

Make sure this is non-null before calling.
Attachment #8703278 - Flags: feedback?(aklotz) → feedback+
I hope that the write-on-shutdown will happen early and/or that there will remain a few synchronous writes because I have seen shutdown hangs in very busy circumstances (see e.g. my bug 570316, which seemed to have disappeared but has recently resurfaced).

I can think of at least the following:
- closing the prefs dialog, especially closing it by [OK] with browser.preferences.instantApply being false;
- closing the about:config tab. Maybe even refreshing it but I'm less sure.

Comment 28

a year ago
Created attachment 8708517 [details]
MozReview Request: Bug 789945 - do async omt pref writes, r?bsmedberg

Review commit: https://reviewboard.mozilla.org/r/31079/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/31079/
Attachment #8708517 - Flags: review?(benjamin)

Comment 29

a year ago
Created attachment 8708518 [details]
MozReview Request: Bug 789945 - part 2: add asyncshutdown blocker, r?bsmedberg

Review commit: https://reviewboard.mozilla.org/r/31081/diff/#index_header
See other reviews: https://reviewboard.mozilla.org/r/31081/
Attachment #8708518 - Flags: review?(benjamin)

Comment 30

a year ago
Created attachment 8708519 [details]
Debug printfs I used
Attachment #8703279 - Attachment is obsolete: true

Comment 31

a year ago
https://reviewboard.mozilla.org/r/31079/#review27911

I applied all the nits from aklotz, I think. (Thanks!)

The 2nd part implements a "proper" shutdown blocker instead of the current call to savePrefFile when profile-before-change fires. In general, this won't change anything compared to the status quo, but it will ensure that a running async save will complete before we shut down. Note that this means we'll still be doing an IO-on-the-main-thread save as we are now.

In this sense both these patches are conservative - savePrefFile() also continues to do sync main thread IO. I'd want to file followup bugs to teach the preference bindings not to be so eager to call that method themselves.

As a consequence, right now, this would often result in more disk IO than we currently do, because in some cases we'd normally change prefs and not save to disk, and we will now start async saving to disk before shut down. Some of telemetry/healthreport (particularly the "datareporting.sessions.current.totalTime" and friends prefs) writes prefs "late", meaning we almost always shutdown-write prefs anyway.

::: modules/libpref/Preferences.cpp:294
(Diff revision 1)
> +      // Notify the main thread we're done
> +      mozilla::MonitorAutoLock mon(Preferences::sPreferences->mOMTSaveMonitor);
> +      Preferences::sPreferences->mAreWaitingForOMTSave = false;
> +
> +      mon.Notify();

Note that I changed the ordering here compared to the earlier patch to match what the personal dictionary stuff was doing.

As noted earlier, it'd be good if someone actually sanity-checked me carefully, because after years of being told threads are hard, and not all that much experience with them, I'm not at all confident that this code is currently correct as written. :-)

Comment 32

a year ago
https://reviewboard.mozilla.org/r/31081/#review27995

::: modules/libpref/Preferences.cpp:1042
(Diff revision 1)
> +  if (!hadCurrentFile) {
> +    mShutdownBlocker = new PrefShutdownBlocker(
> +      NS_LITERAL_STRING("Pref service shutdown blocker"));
> +    nsCOMPtr<nsIAsyncShutdownClient> shutdownPhase = GetShutdownPhase();
> +    shutdownPhase->AddBlocker(mShutdownBlocker,
> +                              NS_LITERAL_STRING(__FILE__),
> +                              __LINE__,
> +                              NS_LITERAL_STRING("Pref service shutdown blocker"));
> +  }

Note that I'm doing this here rather than in Init() for two reasons:
1) doing it in Init() breaks the world, because AsyncShutdown tries to read preferences
2) there isn't really a point in doing writes on shutdown (or blocking shutdown for such writes) unless we have an `mCurrentFile` to write to.
Blocks: 1133405

Comment 33

a year ago
Milan, when I asked Benjamin about this patch ~3 weeks ago he suggested that perhaps you could review this if you feel comfortable doing so - do you? :-)
Flags: needinfo?(milan)
(Assignee)

Comment 34

a year ago
Sure, I'll take a look.
(Assignee)

Comment 35

a year ago
Are we now guaranteed that all the preferences setting is done on the main thread?  I know we had a big push to do that, wanted to confirm we've completed that work.

Comment 36

a year ago
(In reply to Milan Sreckovic [:milan] from comment #35)
> Are we now guaranteed that all the preferences setting is done on the main
> thread?  I know we had a big push to do that, wanted to confirm we've
> completed that work.

There are some asserts for that in the code (ifdef'd out for b2g, admittedly), so I hope/assume so, but I don't know for sure.
(Assignee)

Comment 37

a year ago
https://reviewboard.mozilla.org/r/31079/#review32313

Thought I'd send a few things while looking at the rest.

::: modules/libpref/Preferences.cpp:109
(Diff revision 1)
> +static const uint32_t kPrefDelayedSaveTimeoutMs = 15000;

I'd add a way to overwrite this, perhaps with an environment variable.  It would give us a quick way of triaging in the field.  Perhaps to the point where we're able to disable the async saving with that environment variable (e.g., zero negative value)

::: modules/libpref/Preferences.cpp:206
(Diff revision 1)
> +    MOZ_ASSERT(NS_IsMainThread());

We're in big trouble if this is ever called off main thread, so I'd go stronger than debug assert.  MOZ_CRASH or MOZ_RELEASE_ASSERT, or even guarding against it, rather than asserting.

::: modules/libpref/Preferences.cpp:210
(Diff revision 1)
> +      if (gDirty) {

gDirty will be set to true if we fail in the OMT write.  Before this patch, we would just fail, perhaps report, and be done with it.  With this patch, we are going to schedule another OMT write? That's new, are we sure we want it?

::: modules/libpref/Preferences.cpp:233
(Diff revision 1)
> +    // execute a "safe" save by saving through a tempfile

Do we want to check if gHashTable exists, or are we safe?

::: modules/libpref/Preferences.cpp:1030
(Diff revision 1)
>      if (!gDirty)

The "contract" is that when we return from SavePrefFileInternal(), the pref file on disk is up to date, correct?  If we set gDirty to false just before we dispatch, this won't be the case.  We seem to be already handling the "there is an OMT write in progress at some level" below.

This should probably stay, it's more describing why it doesn't feel right that we're setting gDirty to false in the timer callback.

::: modules/libpref/Preferences.cpp:1035
(Diff revision 1)
> +      if (!mAreWaitingForOMTSave) {

At first I thought that the monitor is protecting  mAreWaitingForOMTSave as the condition variable.  It seems to be the case in a couple of places (we get the lock, then we deal with that variable), but not in all of them?  Or am I reading the code wrong?

::: modules/libpref/Preferences.cpp:1070
(Diff revision 1)
> -  // execute a "safe" save by saving through a tempfile
> +  PreferenceRunnableSaver* saver = new PreferenceRunnableSaver(aFile);

This feels like a memory leak?

::: modules/libpref/Preferences.cpp:1093
(Diff revision 1)
> -      pref = nullptr;
> +      gDirty = false;

I'm not sure about this one.  Why reset dirty here and then change it back in the runnable's Run() method, if things go wrong?

::: modules/libpref/Preferences.cpp:1097
(Diff revision 1)
> +    } else {

The callback happens on the main thread, right? When would we ever be in a situation where gDirty is false?
(Assignee)

Comment 38

a year ago
https://reviewboard.mozilla.org/r/31079/#review32947

::: modules/libpref/Preferences.h:384
(Diff revision 1)
> +  mozilla::Monitor         mOMTSaveMonitor;

Tracing through the code, it's starting to feel like we could benefit from a "state tracking" type of enum, rather than a boolean and the existence of the runnable.  Same functionality, just more explicit on what the current state is, between not doing anything, and having sent the request and being in the middle of processing the request, etc.

::: modules/libpref/prefapi.h:195
(Diff revision 1)
> +void PREF_MarkDirty();

I like this not only for new functionality, but for hiding the direct access of gDirty.  To the point where we may want to remove gDirty from extern, leave it in Preferences (probably just as a member variable instead of a global) and have the callback take care of both setting the state and deciding what to do afterwards:

Remove PREF_MarkDirty()after replacing it with

if (gDirtyCallback) gDirtyCallback();


and make the callback check:

void
Preferences::PrefsDirtyCallback()
{
  if (sPreferences && gHashTable && gDirty) {
    sPreferences->SavePrefsToDefaultFileOMT();
  }
}

Just random thoughts :)
(Assignee)

Updated

11 months ago
See Also: → bug 1287215
(Assignee)

Updated

10 months ago
Depends on: 1287945
(Assignee)

Updated

10 months ago
Depends on: 1288519
(Assignee)

Updated

10 months ago
Flags: needinfo?(milan)
See Also: → bug 848461

Comment 39

4 months ago
Milan, I haven't had a chance to come back to this for a while. I know you've looked at this more recently than I - do you have cycles / interest in taking this over?
Flags: needinfo?(milan)
(Assignee)

Comment 40

4 months ago
I probably do have time, and have made some patches on top of yours, but I'm not fully convinced that we actually write to disk that often (today.)  Given that the original report was from 2012, perhaps things changed :), I'll take a look at telemetry.
(Assignee)

Comment 41

4 months ago
Hmm.  Where is that telemetry we mention in comment 0?  Gijs, do you know if we still collect it, and if so, what it is?
Flags: needinfo?(milan) → needinfo?(gijskruitbosch+bugs)

Comment 42

4 months ago
(In reply to Milan Sreckovic [:milan] from comment #41)
> Hmm.  Where is that telemetry we mention in comment 0?  Gijs, do you know if
> we still collect it, and if so, what it is?

I have no idea... Georg, do you know?
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(gfritzsche)

Comment 43

4 months ago
(In reply to Milan Sreckovic [:milan] from comment #40)
> I probably do have time, and have made some patches on top of yours, but I'm
> not fully convinced that we actually write to disk that often (today.) 
> Given that the original report was from 2012, perhaps things changed :),
> I'll take a look at telemetry.

FWIW, I think we might not write that often, but doing it sync and OMT might still be problematic. Getting a profile with a large prefs.js file and trying to change some things in about:preferences if disk writes are slow should trigger this. Of course, that almost never happens on our fast dev machines. :-(

Comment 44

4 months ago
(In reply to :Gijs from comment #43)
> (In reply to Milan Sreckovic [:milan] from comment #40)
> > I probably do have time, and have made some patches on top of yours, but I'm
> > not fully convinced that we actually write to disk that often (today.) 
> > Given that the original report was from 2012, perhaps things changed :),
> > I'll take a look at telemetry.
> 
> FWIW, I think we might not write that often, but doing it sync and OMT might
> still be problematic.

Err, I meant *on* main thread, of course...
(In reply to :Gijs from comment #42)
> (In reply to Milan Sreckovic [:milan] from comment #41)
> > Hmm.  Where is that telemetry we mention in comment 0?  Gijs, do you know if
> > we still collect it, and if so, what it is?
> 
> I have no idea... Georg, do you know?

This looks like chrome hangs?
Chris, do you have a better idea on what tooling is good for this now or who would know?
Flags: needinfo?(gfritzsche) → needinfo?(chutten)

Comment 46

4 months ago
A stack like that and a vague mention of "telemetry reports" means it was probably from chromeHangs or threadHangs, yes.

As for what would be good tooling in this year of 2017... probably still hangs from chromeHangs/threadHangs. A notebook could plot a histogram of hang lengths for stacks including "mozilla::Preferences" to see their distribution. From there we could determine whether multi-second hangs in mozilla::Preferences calls are a common thing or uncommon thing.
Flags: needinfo?(chutten)

Comment 47

3 months ago
(In reply to Chris H-C :chutten from comment #46)
> A stack like that and a vague mention of "telemetry reports" means it was
> probably from chromeHangs or threadHangs, yes.
> 
> As for what would be good tooling in this year of 2017... probably still
> hangs from chromeHangs/threadHangs. A notebook could plot a histogram of
> hang lengths for stacks including "mozilla::Preferences" to see their
> distribution. From there we could determine whether multi-second hangs in
> mozilla::Preferences calls are a common thing or uncommon thing.

Is this something you or someone else on the data team could take a stab at doing? My experience with spark so far has not been great...

I'll point out though that in the current state, telemetry's (hah!) own requirements mean we *always* end up writing to the prefs on shutdown, even with these patches (once they're unbitrotted etc.) because we store session length data in the prefs and so those are guaranteed to change (I think telemetry even has a shutdown observer of sorts for this).

The patch might still be a win because it does the IO off-main-thread, and it helps for various other cases where we currently force a pref save (and/or enables us to stop doing that). But that's going to be harder to measure than the shutdown hang stuff.

I'm also going to unassign this as realistically I'm not working on this at the moment. Hopefully that might spur someone else on to take this on. (Like Milan, maybe? :-) )
Assignee: gijskruitbosch+bugs → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(milan)
Flags: needinfo?(chutten)
(Assignee)

Comment 48

3 months ago
I'd be happy to take the bug (I have some stuff on top of what Gijs started with), but I would really want to see a measurement that we expect to get better, and then make sure it gets better.  Otherwise, it will be difficult to get a change like this landed - it complicates the code, and if there is no benefit we can show, it's a hard sell.

Chris, how do we measure what's mentioned above?  Is it already in place, and if not, what needs to be put in place?

Comment 49

3 months ago
There's nothing terribly high level that is already in place. A few minutes' scrambling at EOD yesterday got me this far: https://gist.github.com/chutten/0e91b0cc2cfe4cacc244a0db95a3810d 

Not sure if this is what we're expecting to see or not.

Of relevance might be the return of the BHR dashboard that :dthayer is working on (bug 1344003). Even if the dashboard's too coarse (only shows the top N hangs, may not include Preferences), the analysis might be helpful.
Flags: needinfo?(chutten)
(In reply to :Gijs from comment #47)
> I'll point out though that in the current state, telemetry's (hah!) own
> requirements mean we *always* end up writing to the prefs on shutdown, even
> with these patches (once they're unbitrotted etc.) because we store session
> length data in the prefs and so those are guaranteed to change (I think
> telemetry even has a shutdown observer of sorts for this).

Can you file a Telemetry bug with the details on this? Thanks.
Flags: needinfo?(gijskruitbosch+bugs)

Comment 51

3 months ago
(In reply to Georg Fritzsche [:gfritzsche] from comment #50)
> (In reply to :Gijs from comment #47)
> > I'll point out though that in the current state, telemetry's (hah!) own
> > requirements mean we *always* end up writing to the prefs on shutdown, even
> > with these patches (once they're unbitrotted etc.) because we store session
> > length data in the prefs and so those are guaranteed to change (I think
> > telemetry even has a shutdown observer of sorts for this).
> 
> Can you file a Telemetry bug with the details on this? Thanks.

bug 1346223.
Depends on: 1346223
Flags: needinfo?(gijskruitbosch+bugs)
Writing telemetry data to disk is showing up as long running hangs on the parent process main thread for 8 seconds or longer (search for Preferences::WritePrefFile in https://people-mozilla.org/~mlayzell/bhr/20170429/all.html).  I think we should try to fix this for QF if possible.
Whiteboard: [Snappy:P2] → [Snappy:P2][qf:p1]
Bug 1361262 has an example code path that should help with reproducing a write to the prefs database after a successful restart from a startup crash.  Milan mentioned to me in person that he had some difficulty previously reproducing these writes, perhaps that bug would be helpful here.
(Assignee)

Comment 54

23 days ago
It's more that I didn't see a lot of calls to write the prefs, as this bug was often mentioning those as well.  They don't seem to be that common - however, we're now talking about "one" possibly being a problem, perhaps because of the size, so the frequency of saves isn't that important, it's that they happen on the main thread in the first place.
Yeah I'm going to pick this up and review. Every time I look it my eyes glaze over and I need to just block out some hours to be sure I get it right.
(In reply to Milan Sreckovic [:milan] from comment #54)
> It's more that I didn't see a lot of calls to write the prefs, as this bug
> was often mentioning those as well.  They don't seem to be that common -
> however, we're now talking about "one" possibly being a problem, perhaps
> because of the size, so the frequency of saves isn't that important, it's
> that they happen on the main thread in the first place.

Oh, I see, a lot of *calls*.  Sorry I misunderstood what you said yesterday.  :-)  IINM the only entry point to this is nsIPrefService::SavePrefFile().  Looking at where those function is called from C++ these days, there is a call when we shut down, one where the computer suspends, one from the idle daily service, one from the Places Database::ForceCrashAndReplaceDatabase function (sounds scary!), and one from DriverCrashGuard::FlushPreferences() (not sure what that is used for) besides bug 1361262.  But sadly there are also JS consumers: <https://searchfox.org/mozilla-central/search?q=symbol:_ZN14nsIPrefService12SavePrefFileEP7nsIFile%2C%23savePrefFile&redirect=false>  The BHR data shows that some of the cases where we are being hurt by this are caused by JS consumers of this code, but it's hard to exactly say from which consumer.

I tried a bit, and here is one example:
https://people-mozilla.org/~mlayzell/bhr/20170429/987.html#1 comes from <https://searchfox.org/mozilla-central/rev/abe68d5dad139e376d5521ca1d4b7892e1e7f1ba/toolkit/mozapps/extensions/internal/XPIProvider.jsm#4394>; the pseudostack is:
Startup::XRE_Main
gre/modules/addons/XPIProvider.jsm:2940
gre/modules/addons/XPIProvider.jsm:4937

Looking through https://people-mozilla.org/~mlayzell/bhr/20170429/all.html it's clear that a lot of these problematic cases are coming from JS consumers but it's hard to find these in https://people-mozilla.org/~mlayzell/bhr/20170429.html.  I filed bug 1361842 in order to add a pseudo-stack entry to help make this more easily visible on the BHR data...
See Also: → bug 1361842

Comment 57

23 days ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #55)
> Yeah I'm going to pick this up and review. Every time I look it my eyes
> glaze over and I need to just block out some hours to be sure I get it right.

The patches are now more than a year old though, so they've probably bitrotted. Milan already did some pref service tidying up in bug 1287215, and more must have happened to make get*Pref take optional default args. It would probably make sense for me or Milan or *someone* to unbitrot the patches prior to review... Milan, do you have cycles? I'm pretty swamped with Photon stuff these days...
(Assignee)

Updated

23 days ago
See Also: → bug 897049
(Assignee)

Updated

23 days ago
See Also: → bug 1185567

Updated

22 days ago
Whiteboard: [Snappy:P2][qf:p1] → [Snappy:P2][qf:p1][bhr]
(Assignee)

Updated

22 days ago
See Also: → bug 1288519
(Assignee)

Comment 58

22 days ago
I'll see try to at least un-bitrot the original patches before the weekend.

Updated

22 days ago
Blocks: 1360214
(Assignee)

Updated

21 days ago
Attachment #8708517 - Attachment is obsolete: true
Attachment #8708517 - Flags: review?(benjamin)
(Assignee)

Updated

21 days ago
Attachment #8708518 - Attachment is obsolete: true
Flags: needinfo?(milan)
Attachment #8708518 - Flags: review?(benjamin)
(Assignee)

Updated

21 days ago
Attachment #8708519 - Attachment is patch: false
(Assignee)

Updated

21 days ago
Attachment #8708519 - Attachment is obsolete: true
(Assignee)

Updated

21 days ago
Assignee: nobody → milan
(Assignee)

Updated

21 days ago
Attachment #8703278 - Attachment is obsolete: true
(Assignee)

Comment 59

21 days ago
Created attachment 8864983 [details] [diff] [review]
Part 1. Move the code for writing into a separate class.  Add SavePrefFileMainThread to C++ only.

The write method keeps checking a non-protected variable for a possible "interrupt", rather than having a cancellable runnable.  Perhaps that would be a better approach.  Other than the interrupt part, the Write() method is copied from the old WritePrefFile.

There was a number of places that used XRE_IsContentProcess to catch error conditions, where we really needed !XRE_IsParentProcess, so those changes ended up here as well.
Attachment #8864983 - Flags: feedback?
(Assignee)

Comment 60

21 days ago
Created attachment 8864985 [details] [diff] [review]
Part 2. Off main thread preference save functionality, unused.

This went in a different direction that Gijs' original patches, so blame me for any bad things, and credit to Gijs for all the good things.

Runnables to send the write work to a separate thread, and one to send the error condition back to main thread, in which case we (if necessary) do the main thread write.

Off main thread writes do not make an attempt to consolidate, unless they are processed out of order.  The earlier request will not write if a later one already did.  There is otherwise no time delay suggesting we should wait for more requests.

Main thread writes try to stop the off main thread ones, and also block any OMT requests that may be in the queue, but are not being processed yet.

Open questions (that I know of):

**** Is the thread we're using, NS_STREAMTRANSPORTSERVICE_CONTRACTID, the right one to use?
Attachment #8864985 - Flags: feedback?
(Assignee)

Comment 61

21 days ago
Created attachment 8864988 [details] [diff] [review]
Part 3. Put OMT pref save behind a preference, off by default.

I'm not actually sure we would want something like this.  The idea is that we'd be able to do a telemetry experiment if this was a preference.  If we were to keep this functionality, I'm also unsure if this is the right way to do it.
Attachment #8864988 - Flags: feedback?
(Assignee)

Comment 62

21 days ago
Created attachment 8864991 [details] [diff] [review]
Part 4. Switch some of the calls to SafePrefFile (which will eventually go off main thread) to explicitly be on the main thread save

This patch needs individual domain experts.  I'm not clear which ones could be async.
Attachment #8864991 - Flags: feedback?
(Assignee)

Comment 63

21 days ago
Some things coming up:
* Will relax the lock held during the main thread write.

Some questions to answer:
* Do we want off main thread requests to interrupt earlier, non-processed ones?  I'd probably rather not bother.
* Do we want a crash report note in case we crashed with outstanding off main thread requests in the queue?
* Do we want more telemetry on... something around this functionality?
(Assignee)

Updated

21 days ago
Attachment #8864988 - Attachment is patch: true
(Assignee)

Updated

21 days ago
Attachment #8864991 - Attachment is patch: true
(Assignee)

Comment 64

21 days ago
There are two non-public methods that take a boolean to decide if they should work on the main thread, or off the main thread.  I cannot decide if "true" should mean off main thread, or if "true" should mean on main thread.  Right now, it means off.  Neither sounds quite right.  May go to the trouble of making an enum, just to avoid making the decision as to what true means.  Any suggestions are welcome.
(Assignee)

Comment 65

21 days ago
Comment on attachment 8864983 [details] [diff] [review]
Part 1. Move the code for writing into a separate class.  Add SavePrefFileMainThread to C++ only.

Review of attachment 8864983 [details] [diff] [review]:
-----------------------------------------------------------------

::: modules/libpref/Preferences.cpp
@@ +309,5 @@
> +    }
> +
> +    // Last chance to abort...
> +    if (aInterrupt && *aInterrupt) {
> +      NS_WARNING("OMT pref save interrupted 7");

Clearly, I was assuming that just abandoning an out stream without doing a Finish() call would mean nothing is ever done, whatever was "written" disappears and no changes are made to the actual file in question.

That strikes me as a very dangerous assumption, so whoever looks at this should look very carefully.
(Assignee)

Comment 66

21 days ago
(In reply to Milan Sreckovic [:milan] from comment #65)
> ...
> 
> Clearly, I was assuming that just abandoning an out stream without doing a
> Finish() call would mean nothing is ever done, whatever was "written"
> disappears and no changes are made to the actual file in question.
> 
> That strikes me as a very dangerous assumption, so whoever looks at this
> should look very carefully.

Apparently, my assumption was correct.  Not calling Finish() means nothing gets done on those streams.  Exactly what I needed.
(Assignee)

Comment 67

21 days ago
Created attachment 8865023 [details] [diff] [review]
Part 5. Enable OMT pref file writes
(Assignee)

Updated

21 days ago
Attachment #8865023 - Attachment is patch: true
(Assignee)

Comment 68

18 days ago
Created attachment 8865540 [details]
Part 0. Design decisions
(Assignee)

Comment 69

18 days ago
Created attachment 8865541 [details] [diff] [review]
Part 2. Off main thread save functionality, unused
Attachment #8864985 - Attachment is obsolete: true
Attachment #8864985 - Flags: feedback?
(Assignee)

Updated

18 days ago
Attachment #8865541 - Flags: feedback?
(Assignee)

Comment 70

18 days ago
Comment on attachment 8865540 [details]
Part 0. Design decisions

Most of the module specific review is captured in this "design description".  Benjamin, reviewing that will probably free you from detailed code reviewed until "the end", as it describes what I'm trying to do.

Of course, you can feel free to review the code as well, but I thought this may be easier.
Attachment #8865540 - Flags: review?(benjamin)
(Assignee)

Comment 71

17 days ago
Created attachment 8865982 [details] [diff] [review]
Part 1. Move the code for writing into a separate class.  Add SavePrefFileMainThread to C++ only.
Attachment #8864983 - Attachment is obsolete: true
Attachment #8864983 - Flags: feedback?
(Assignee)

Comment 72

17 days ago
Created attachment 8865983 [details] [diff] [review]
Part 2. Off main thread preference save functionality, unused.
Attachment #8865541 - Attachment is obsolete: true
Attachment #8865541 - Flags: feedback?
(Assignee)

Comment 73

17 days ago
Created attachment 8865984 [details] [diff] [review]
Part 3. Put OMT pref save behind a preference, off by default.
Attachment #8864988 - Attachment is obsolete: true
Attachment #8864988 - Flags: feedback?
(Assignee)

Comment 74

17 days ago
Created attachment 8865985 [details] [diff] [review]
Part 4. Switch some of the calls to SafePrefFile (which will eventually go off main thread) to explicitly be on the main thread save
Attachment #8864991 - Attachment is obsolete: true
Attachment #8864991 - Flags: feedback?
(Assignee)

Comment 75

17 days ago
Created attachment 8865986 [details] [diff] [review]
Part 5. Enable OMT pref file writes
Attachment #8865023 - Attachment is obsolete: true
(Assignee)

Comment 76

16 days ago
Note: If we are trying to get rid of sync save on shutdown, we have to make sure we handle any outstanding request we may have when it comes to shut down.
(Assignee)

Comment 77

16 days ago
Created attachment 8866382 [details] [diff] [review]
Part 3. Put OMT pref save behind a preference, off by default.
Attachment #8865984 - Attachment is obsolete: true
(Assignee)

Comment 78

16 days ago
Created attachment 8866494 [details] [diff] [review]
Part 2. Off main thread preference save functionality, unused.
Attachment #8865983 - Attachment is obsolete: true
(Assignee)

Comment 79

16 days ago
Created attachment 8866495 [details] [diff] [review]
Part 3. Put OMT pref save behind a preference, off by default.
Attachment #8866382 - Attachment is obsolete: true
Comment on attachment 8865540 [details]
Part 0. Design decisions

>1. New Preferences::SavePrefFileMainThread() method:
> * signature otherwise matches SavePrefFile();
> * accessible from C++ only;
> * calling this method ensures (a blocking/sync) main thread preference file save.

Reading below, but I'm hoping we either don't need this or it's a temporary porting mechanism?

>2. Preferences::SavePrefFile(), accessible from JS as savePrefFile():
> * behind a preference preferences.allow.omt-write does async preference file save on the NS_STREAMTRANSPORTSERVICE_CONTRACTID thread.

I don't know a lot about this thread? Is it ok to block this thread on disk I/O?

>3. Some common stuff:
> * Preferences maintain a dirty flag; this is cleared on successful sync save or immediately on async save request (see below how we handle an async save failure);
> * Each save request (sync and async) gets a unique ID;
> * A global variable holds the ID of the last successful save;
> * A mutex used for synchronization of some operations (more details below)
>
>4. Async saves details:
> * Async saves collect all the data on the main thread, the create a runnable that eventually does the file write, and "succeed" immediately;
> * It is assumed that the file pointer remains valid during the lifetime of an async save requst;

File pointer?

> * Async saves hold the mutex lock for the duration of the file save;

This sounds wrong. What is protected by the mutex lock?

> * A request for async save to "interrupt" (abandon all changes and return with success) is made by setting a global shared variable (not behind the lock);

What is the purpose of making this interruptible? I don't understand what the interruptions are for.

> * Async save requests are not allowed during shutdown - they are converted to sync saves.

As a long-term plan I don't think this is right. We should instead have a shutdown blocker on async save. This would avoid blocking other shutdown activities on disk I/O.

As a short-term plan if it makes the patch easier to write, sure.

Other things I think we should make clear:
* all saves are only triggered if user prefs are dirty
* all saves continue to operate using a safe output stream/atomic move.
* now that we have this async system, and user pref change should trigger saving prefs "soon"; I'd suggest 500ms just so that batches of prefs that get set near the same time don't trigger a bunch of separate async writes.

My recommendation would be to get rid of the sync save case in the code entirely, because it would significantly reduce the complexity of the system. If necessary we could have a promiseSavePrefFile method which returns a promise.

>A. There is no telemetry for items related to the success or interruption of pref file saves.

Telemetry for failure is moderately valuable, at least worth a followup bug. Telemetry for interruption doesn't seem worth it.

>B. There is no note in the crash report about "this happened during the preference file save".

Probably not important.
Attachment #8865540 - Flags: review?(benjamin)
(Assignee)

Comment 81

14 days ago
Yeah, I prefer to go gradually into a big change.  For example, I would rather land and use the ability to have some saves async, then wait until everything is done.  I think we can start seeing the results sooner, and perhaps decide that not all is the highest priority.

In particular, I wanted to keep the ability to do a sync save.  The goal of making all of them async is a good one, and we should strive for it, but the full async solution gets complicated.  In other words, I think we should enhance the system to allow async saves, not redesign it from scratch for only async saves, at least not as a blocking first step.

All saves are only triggered if the user prefs are dirty (although we currently have a path where that is not the case, and I would probably want to have that change in a separate bug.)

Also, explicitly and because I don't think we need it, I don't have the "wait around for a while, and combine multiple saves into a single one". I suppose it could be added, but I wanted to separate the "reduce the number of disk writes" from the "reduce the blocking of the main thread". I think we can do the former in a separate bug, and we would benefit from the later because the main thread is blocked for less time (note that collecting all the data is still done on the main thread, and that can take non-trivial amount of time - that's a separate optimization that can be considered, with or without the OMT saves.) 


(In reply to Benjamin Smedberg [:bsmedberg] from comment #80)
...
> > * calling this method ensures (a blocking/sync) main thread preference file save.
> 
> Reading below, but I'm hoping we either don't need this or it's a temporary
> porting mechanism?
> 
Perhaps.  If it turns out that we don't need a sync save, then we get rid of it.  If that's only after a follow up bug, then we do it at that point.


> >2. Preferences::SavePrefFile(), accessible from JS as savePrefFile():
> > * behind a preference preferences.allow.omt-write does async preference file save on the NS_STREAMTRANSPORTSERVICE_CONTRACTID thread.
> 
> I don't know a lot about this thread? Is it ok to block this thread on disk
> I/O?

I have no clue, I was hoping you'd know :)  I'll try to find out.


> > * It is assumed that the file pointer remains valid during the lifetime of an async save requst;
> 
> File pointer?

nsIFile* is what we use for saving.  I don't know what's backing it up, but I see the current code making assumptions that it's valid and there to be used when necessary.  If that's not the case, any direction would be appreciated.

> 
> > * Async saves hold the mutex lock for the duration of the file save;
> 
> This sounds wrong. What is protected by the mutex lock?

This is only needed if we're allowing sync saves.  We want to make sure that async and sync saves don't happen at the same time.  And sync save needs to wait for an async save to finish before it can take a turn.  Now, that sounds like a long time, as you'd have main thread possibly blocked on the whole process, which is where the interrupt below comes into the picture...


> > * A request for async save to "interrupt" (abandon all changes and return with success) is made by setting a global shared variable (not behind the lock);
> 
> What is the purpose of making this interruptible? I don't understand what
> the interruptions are for.

If a sync save shows up, we don't need any active async saves to actually complete - the file is about to get overwritten with fresh data anyway.  So, we want to stop the async save at a safe point during the process.



> > * Async save requests are not allowed during shutdown - they are converted to sync saves.
> 
> As a long-term plan I don't think this is right. We should instead have a
> shutdown blocker on async save. This would avoid blocking other shutdown
> activities on disk I/O.
> 
> As a short-term plan if it makes the patch easier to write, sure.

Yes, makes sense.  It would be part of getting rid of sync saves.


> My recommendation would be to get rid of the sync save case in the code
> entirely, because it would significantly reduce the complexity of the
> system. If necessary we could have a promiseSavePrefFile method which
> returns a promise.

Not sure getting rid of sync reduces complexity, but there is less code in the Preference.cpp for sure.  In particular, readUserPrefs() would need to block on a successful write; if nowhere else, we have tests that explicitly test the save+read expecting the correct data.

I'll open a few follow up bugs that separate some of the items mentioned here.  We can then decide if we want to land/ship when all or handled, or a subset.
(Assignee)

Updated

14 days ago
Blocks: 1364495
(Assignee)

Updated

14 days ago
Blocks: 1364496
(Assignee)

Updated

14 days ago
Blocks: 1364498
(Assignee)

Comment 82

14 days ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #80)
> ...
> 
> >2. Preferences::SavePrefFile(), accessible from JS as savePrefFile():
> > * behind a preference preferences.allow.omt-write does async preference file save on the NS_STREAMTRANSPORTSERVICE_CONTRACTID thread.
> 
> I don't know a lot about this thread? Is it ok to block this thread on disk
> I/O?

Gijs, I got this from your patch.  Did you investigate if the thread above was the appropriate one for pref file save/be blocked on disk i/o, or was it a placeholder in the patch?  Just don't want to repeat the investigation if one was already made.

Thanks!
Flags: needinfo?(gijskruitbosch+bugs)
(Assignee)

Comment 83

14 days ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #80)
> ... 
> My recommendation would be to get rid of the sync save case in the code
> entirely, because it would significantly reduce the complexity of the
> system. If necessary we could have a promiseSavePrefFile method which
> returns a promise.

Forgot to mention - sync saves are only available through C++ - JS always gets async saves.  The side effect is that readUserPrefs call may get you an out of date file, unless we want it to potentially block the main thread longer.
My recommendation would be, to reduce complexity:

* the actual implementation of all saves should be async. That should probably be on the general-purpose threadpool, not the socket thread.
* to support blocking saves at least temporarily, kick off an async save and either using sync dispatch or a monitor.wait.

That reduces the complexity inherent in any interruptions and cross-thread synchronization of writes; we know that only one thread is ever writing to the pref file, and you merely have to synchronize on epochs.

Other notes:

I believe that nsIFile is not a thread-safe object and so you'd need to use string paths.
(Assignee)

Updated

14 days ago
Blocks: 1364509
(Assignee)

Comment 85

14 days ago
See bug 1364496 comment 1 and bug 1364496 comment 2 for some examples of what we'd need to handle before we can make all saves async.  That will require different expertise from (probably) different people.
(Assignee)

Comment 86

14 days ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #84)
> My recommendation would be, to reduce complexity:
...

I think this just went out of my ability and time to work on - I don't have enough Gecko coding experience.  I'm going to bow out, but if there are simple things where I can help, let me know.
Assignee: milan → nobody

Comment 87

10 days ago
(In reply to Milan Sreckovic [:milan] from comment #82)
> (In reply to Benjamin Smedberg [:bsmedberg] from comment #80)
> > ...
> > 
> > >2. Preferences::SavePrefFile(), accessible from JS as savePrefFile():
> > > * behind a preference preferences.allow.omt-write does async preference file save on the NS_STREAMTRANSPORTSERVICE_CONTRACTID thread.
> > 
> > I don't know a lot about this thread? Is it ok to block this thread on disk
> > I/O?
> 
> Gijs, I got this from your patch.  Did you investigate if the thread above
> was the appropriate one for pref file save/be blocked on disk i/o, or was it
> a placeholder in the patch?  Just don't want to repeat the investigation if
> one was already made.

Well, it's now 1.5 years ago... I think I copied it from elsewhere. From a quick look, it's used by nsAsyncStreamCopier, NetUtils, service workers, HTML fileSystem stuff, nsStreamUtils, etc., for file IO. So I *think* it's a reasonable first choice, unless we think this stuff should get its own thread...

(In reply to Milan Sreckovic [:milan] from comment #86)
> (In reply to Benjamin Smedberg [:bsmedberg] from comment #84)
> > My recommendation would be, to reduce complexity:
> ...
> 
> I think this just went out of my ability and time to work on - I don't have
> enough Gecko coding experience.  I'm going to bow out, but if there are
> simple things where I can help, let me know.

OK, that's 2 of us now. Ehsan, can someone from :qf or photon perf pick this up? I think we all agree this is important, but we're struggling to find people to work on it and drive this over the finishline.
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(ehsan)
(Assignee)

Comment 88

10 days ago
To be clear, I can do the "optional async save, with sync save still available, and then the authors of sync calls have to go and decide if they can go async".  Those are the patches on this bug.  And a callback on preference write (in bug 1361262) type of thing to deal with at least the sync pref save during startup.

It's the "make all saves async, and put a delay before doing any saves" that's throwing me for a loop, especially the part where I then have to figure out what all the sync callers are now supposed to do.  Things like bug 1364496 comment 1 and bug 1364496 comment 2, for example.  Don't have the expertise for that.
(Assignee)

Comment 89

10 days ago
On a related note, I also prefer a gradual and safe approach, so that's definitely coloring my opinion on this.  I would rather put in the ability to async save, and change one place that we've identified as important, and then do the rest in the follow up bugs.
So the only parts that I feel have to happen here to have a safe and understandable codepath are doing all the actual writes on the thread. So even if that means a sync dispatch to the thread which blocks in some cases, that at least seriously reduces the complexity of synchronizing I/O on multiple threads.
(Assignee)

Comment 91

10 days ago
Talked offline over some details, should be able to do this.
Assignee: nobody → milan
Flags: needinfo?(ehsan)
(Assignee)

Updated

9 days ago
Blocks: 1361262
(Assignee)

Comment 92

7 days ago
The "always write on the same thread" simplifies some things.  The only worry I have right now - when we do need to perform a sync save (likely, this are few and far between cases, if any, tracked in bug 1364496), the main thread would be blocked on the thread performing writes - not just on the pref writes themselves, but anything that may be executed in that thread.
(Assignee)

Comment 93

7 days ago
There are some add-ons that use readUserPref method - that will block if there are pending writes.
Ugh, that API needs to die. What would happen if we just removed it now?
(Assignee)

Comment 95

2 days ago
Created attachment 8870866 [details]
Part 0. Design decisions
Attachment #8865540 - Attachment is obsolete: true
Attachment #8865982 - Attachment is obsolete: true
Attachment #8865985 - Attachment is obsolete: true
Attachment #8865986 - Attachment is obsolete: true
Attachment #8866494 - Attachment is obsolete: true
Attachment #8866495 - Attachment is obsolete: true
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Comment 100

2 days ago
mozreview-review
Comment on attachment 8870870 [details]
Bug 789945: Part 4. Explicitly switch C++ SavePrefFile callers to main thread or off main thread versions of the function. Enable OMT pref save on by default, at which point, the sync saves are more expensive than before.

https://reviewboard.mozilla.org/r/142440/#review146180

::: modules/libpref/Preferences.cpp:916
(Diff revision 1)
>    }
>  
>    nsresult rv = NS_OK;
>  
>    if (!nsCRT::strcmp(aTopic, "profile-before-change")) {
> -    rv = SavePrefFile(nullptr);
> +    // There are no async saves outstanding, we have a shutdown blocker.

I may have lied on this; I did run into a problem trying to create a shutdown blocker at this stage of the shutdown at one point, but that may have been some other problem.  I will change this to SavePrefFileOffMainThread() and see if it works.
Comment hidden (mozreview-request)
(Assignee)

Comment 102

a day ago
mozreview-review
Comment on attachment 8870870 [details]
Bug 789945: Part 4. Explicitly switch C++ SavePrefFile callers to main thread or off main thread versions of the function. Enable OMT pref save on by default, at which point, the sync saves are more expensive than before.

https://reviewboard.mozilla.org/r/142440/#review146522

::: modules/libpref/Preferences.cpp:99
(Diff revision 2)
> +static bool gInShutdownTooLateForChanges = false;
>  void
>  Preferences::DirtyCallback()
>  {
>    if (gHashTable && sPreferences && !sPreferences->mDirty) {
> +    MOZ_DIAGNOSTIC_ASSERT(!gInShutdownTooLateForChanges);

This lights up try with crashes.
(Assignee)

Comment 103

a day ago
(In reply to Milan Sreckovic [:milan] from comment #102)
> ...
> >  {
> >    if (gHashTable && sPreferences && !sPreferences->mDirty) {
> > +    MOZ_DIAGNOSTIC_ASSERT(!gInShutdownTooLateForChanges);
> 
> This lights up try with crashes.

For example, browser/modules/test/unit/social/test_social.js, using browser/modules/test/unit/social/head.js sets social.activeProviders in do_register_cleanup.  That call shows up after profile-before-change.
Comment hidden (mozreview-request)
(Assignee)

Comment 105

a day ago
(In reply to Benjamin Smedberg [:bsmedberg] from comment #94)
> Ugh, that API needs to die. What would happen if we just removed it now?

Opened bug 1367813 for this.
Blocks: 1367813
(Assignee)

Comment 106

22 hours ago
mozreview-review
Comment on attachment 8870868 [details]
Bug 789945: Part 2. Off main thread functionality for the save.  Sync save is done by dispatching a sync message to the same thread. OMT dispatches a message to the main thread with the result of the save.

https://reviewboard.mozilla.org/r/142436/#review146588

::: modules/libpref/test/unit/test_bug506224.js:14
(Diff revision 1)
>    var ps = Cc["@mozilla.org/preferences-service;1"]
>             .getService(Ci.nsIPrefService);
>    var prefs = ps.getDefaultBranch(null);
>    var userprefs = ps.getBranch(null);
>  
> +  // Make sure the preference writes are synchronous, as we are

Since async only applies to default read/save, we do not need this change in the test - it will be sync anyway.

::: services/sync/tests/unit/test_prefs_store.js:35
(Diff revision 1)
>    // back to this prefs_test_prefs_store.js directly in the obj dir. This
>    // upsets things in confusing ways :) We avoid this by explicitly telling the
>    // pref service to use a file in our profile dir.
>    let prefFile = do_get_profile();
>    prefFile.append("prefs.js");
> +

Since sync applies only to default read/save, we likely do not need this.
(Assignee)

Comment 107

22 hours ago
The review comments above were meant for the setting of ...omt-write preference, not the code that follows it.
(Assignee)

Comment 108

19 hours ago
:snorp, how aggressively do you want to switch to off main thread pref saves?  The part 4 patch above leaves two places with sync saves.  Or, we can leave the preference for allowing async saves off by default on mobile.  Thoughts?
Flags: needinfo?(snorp)
(In reply to Milan Sreckovic [:milan] from comment #108)
> :snorp, how aggressively do you want to switch to off main thread pref
> saves?  The part 4 patch above leaves two places with sync saves.  Or, we
> can leave the preference for allowing async saves off by default on mobile. 
> Thoughts?

OMT pref saves seem like they would be good on mobile, but I think we'd need a way to synchronously flush any pending saves. We get one opportunity to get into a good state when the application is backgrounded, and after that Android can just kill us without notice.
Flags: needinfo?(snorp)
You need to log in before you can comment on or make changes to this bug.