Closed Bug 789945 Opened 12 years ago Closed 7 years ago

Save preferences asynchronously

Categories

(Core :: Preferences: Backend, defect)

16 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla56
Performance Impact high
Tracking Status
firefox56 --- disabled
firefox57 --- fixed

People

(Reporter: vladan, Assigned: milan)

References

(Depends on 2 open bugs, Blocks 3 open bugs)

Details

(Keywords: main-thread-io, Whiteboard: [Snappy:P2][bhr])

Attachments

(5 files, 23 obsolete files)

59 bytes, text/x-review-board-request
benjamin
: review+
Details
59 bytes, text/x-review-board-request
benjamin
: review+
Details
59 bytes, text/x-review-board-request
benjamin
: review+
Details
59 bytes, text/x-review-board-request
benjamin
: review+
Details
2.83 KB, text/plain
Details
There is a significant number of Telemetry reports showing multi-second hangs from preferences being written to disk synchronously. NtFlushBuffersFile (in ntdll.dll) -> FlushFileBuffersImplementation (in kernel32.dll) -> FileSync (in nspr4.dll) -> nsSafeFileOutputStream::Finish() (in xul.dll) -> nsBufferedOutputStream::Finish() (in xul.dll) -> mozilla::Preferences::WritePrefFile(nsIFile *) (in xul.dll) -> mozilla::Preferences::SavePrefFileInternal(nsIFile *) (in xul.dll) -> mozilla::Preferences::SavePrefFile(nsIFile *) (in xul.dll) -> NS_InvokeByIndex_P (in xul.dll) -> XPCWrappedNative::CallMethod(XPCCallContext &,XPCWrappedNative::CallMode) (in xul.dll)
I have the impression that backgrounding writes (with OS.File atomic write) would be pretty low-risk.
This stack trace is interesting, as it seems that the multi-second hangs originate from JavaScript. This would make a rewrite with OS.File rather natural, if we can serialize the preference file to a string.
Whiteboard: [Snappy:P2]
We could use nsIBackgroundFileSaver, that stores the file through a background thread.
(In reply to Marco Castelluccio [:marco] from comment #3) > We could use nsIBackgroundFileSaver, that stores the file through a > background thread. The main problem with this is that each write would require a new thread. I am almost sure that we have cases in which we modify many preferences in a row and save after each change, so this might possibly explode.
(In reply to David Rajchenbach Teller [:Yoric] from comment #4) > (In reply to Marco Castelluccio [:marco] from comment #3) > > We could use nsIBackgroundFileSaver, that stores the file through a > > background thread. > > The main problem with this is that each write would require a new thread. I > am almost sure that we have cases in which we modify many preferences in a > row and save after each change, so this might possibly explode. I've just been pointed to this bug because of issues I am having with exactly this situation. It occurs in instant-apply prefwindows when savePrefFile() is called after every preference value change. While that might not sound so terrible, it can occur for every single typed (or deleted) character in a textbox. That, or programmatically applying multiple preference changes, can result in the multi-second hangs you are seeing from the telemetry. This particular instance could be semi-solved just by modifying the preferences binding. That still leaves a blocking write of the prefs file at some point, just not instantly and not for every single character. The non-instant-apply approach to writing out all the preferences when the dialog closes is to not call savePrefFile() until all the changes have been applied, controlled by a batching property on the preference element.
(In reply to Ian Nartowicz from comment #5) > This particular instance could be semi-solved just by modifying the > preferences binding. That still leaves a blocking write of the prefs file > at some point, just not instantly and not for every single character. Yes, we should fix the preferences bindings to avoid calling savePrefFile too often (I thought there was a bug on file about that already; if not, please file one!). This bug is about the more general problem and is harder to fix.
Blocks: 683808
So, I would *think* that we can't just switch the API here, because on shutdown we actually need to be sure we write out the prefs, because of how the prefs system works right now. Would switching to a regular async/OMT flush system work (ie saving automatically and asynchronously, off main thread, and saving no more often than once every 5 minutes or so (a la DelayedTask)), combining that with guarding savePrefFile to no-op iff passed "null" as the file and with no dirty state? AFAICT this would eliminate the sync disk writes in most cases (where a pref has not been modified within 5 minutes of shutdown/backgrounding) and improve reliability in data loss scenarios. Also, is this something we could work on as part of the backlog, Gavin? :-)
Flags: needinfo?(gavin.sharp)
If we wish to do this in JS (e.g. by rebinding Services.prefs to a PreferenceService.jsm, rather than the XPCOM implementation), I suggest taking advantage of the OS.File.writeAtomic backup mechanism http://wp.me/p52O1-mh and using it for recovery if the file is corrupted, which should go a long way towards ensuring that we do not corrupt data. Similarly, if we do it in native code, a similar rename/backup mechanism would be a good idea. Regardless, please use this with AsyncShutdown. I hope to land the XPCOM version in August (bug 918317).
(In reply to David Rajchenbach Teller [:Yoric] (away until August 20th - use "needinfo") from comment #8) > If we wish to do this in JS (e.g. by rebinding Services.prefs to a > PreferenceService.jsm, rather than the XPCOM implementation) This is generally an interesting idea - I don't know if there's a good reason not to implement the prefs system in JS instead of C++... (we'd probably still want a small XPCOM wrapper for cpp, of course...)
If we're implementing in JS, DeferredSave.jsm could be what you're looking for - batches up "dirty" notifications for a configured delay period, then writes out the file asynchronously. It supports an explicit flush for shutdown time, and handles all the ugly edge cases of the data becoming dirty again part-way through a write and so on. That said, prefs has so many C++ clients that I'd be pretty nervous about porting the main API to JS.
(In reply to :Gijs Kruitbosch (Bugmail catchup, needinfo if urgent) from comment #7) > Would switching to a regular async/OMT flush system work (ie saving > automatically and asynchronously, off main thread, and saving no more often > than once every 5 minutes or so (a la DelayedTask)), combining that with > guarding savePrefFile to no-op iff passed "null" as the file and with no > dirty state? Sounds like a good plan. 5 minutes sounds like a lot - on the order of "seconds" would probably achieve a similar effect while lowering the chances of needing to write out synchronously on shutdown. > AFAICT this would eliminate the sync disk writes in most cases (where a pref > has not been modified within 5 minutes of shutdown/backgrounding) and > improve reliability in data loss scenarios. I'm not sure that that's "most cases" (I could see some code setting prefs on shutdown regularly), but it would be interesting to find out (via telemetry)! > Also, is this something we could work on as part of the backlog, Gavin? :-) Certainly, if we have a clear plan. I'm less certain about the relative priority of this compared to other main-thread-IO/perf efforts, perhaps Vladan can advise?
Flags: needinfo?(gavin.sharp)
(In reply to :Gavin Sharp [email: gavin@gavinsharp.com] from comment #11) > (In reply to :Gijs Kruitbosch (Bugmail catchup, needinfo if urgent) from > comment #7) > > Also, is this something we could work on as part of the backlog, Gavin? :-) > > Certainly, if we have a clear plan. I'm less certain about the relative > priority of this compared to other main-thread-IO/perf efforts, perhaps > Vladan can advise? Vladan? :-)
Flags: needinfo?(vdjeric)
I don't think you need telemetry to measure pref writes during shutdown: I bet you could do it just by running the browser a bit. In any case, I think we should tightly scope this bug to periodically writing the pref file off the main thread before shutdown when it's dirty. At shutdown if it's dirty we can either synchronously write it one last time from the main thread, or use some sort of shutdown blocker to wait for the last async write. We shouldn't reimplement this in JS now. If we're thinking about a deeper prefs overhaul, we need a redesign (and I have thoughts I can dump somewhere). I actually think it's a high priority for data quality to do the periodic save (e.g. wait 5 seconds after prefs are dirty and then write them), entirely apart from the jank associated with the main-thread write. So I'd be happy to say we should prioritize this soon.
Flags: needinfo?(vdjeric)
Sounds like there's enough consensus to include this in the backlog, at the least. I would also agree with Benjamin that it would probably make sense to prioritize soon. We can do followups (to change consumers or such) later once we at least have an async, OMT way to save prefs.
Flags: firefox-backlog+
Summary: Preferences are written to disk on the main thread → Do automatic periodic async, off-mainthread writes for the prefs file when prefs change, and no-op savePrefsFile(null) if pref state is not dirty
Depends on: 1047389
Let's make sure this works nicely with AsyncShutdown, so let's wait until bug 918317 has landed.
Depends on: 918317
We already skip writing to the default prefs file when the preferences aren't dirty: http://hg.mozilla.org/mozilla-central/annotate/ffdd1a398105/modules/libpref/Preferences.cpp#l916
Summary: Do automatic periodic async, off-mainthread writes for the prefs file when prefs change, and no-op savePrefsFile(null) if pref state is not dirty → Do automatic periodic async, off-mainthread writes for the prefs file when prefs change
I can work on this. We can use this bug for the short-term fix described in the comments, and then we can address proper prefs storage (e.g. log storage) in bug 897049 and bug 956713
Assignee: nobody → vdjeric
Hey Vladan, is there any update on this? Do you think you could find someone to work on it?
Flags: needinfo?(vdjeric)
(In reply to Jared Wein [:jaws] (please needinfo? me) from comment #18) > Hey Vladan, is there any update on this? Do you think you could find someone > to work on it? I wanted Aaron Klotz to work on this, but this quarter it's looking unlikely, given the lengthy list of other projects we have been asked to do. I'd be happy to turn this over to anyone interested and help them get started, discuss design, etc. If it's not done by Q4, I will ask Aaron to work on it.
Flags: needinfo?(vdjeric)
Assignee: vdjeric → nobody
Is there other in-tree code that does OMT writes that could serve as some inspiration if I wanted to take a stab at this pre-Q4?
Flags: needinfo?(vdjeric)
There's a bunch of C++ code that writes files OMT in the codebase. bug 880864 is one example, LocalStorage is another, there's also the reading of Telemetry.ShutdownTime.txt in Telemetry.cpp. The basic approach is to send an event to the STS event queue and have the event do the I/O on the STS thread. There's also an open question regarding acceptable delays for writing prefs. Some code (such as startupCrash detection) writes to prefs and expects the pref state to be resilient to a crash after the write call returns.
Flags: needinfo?(vdjeric)
Flags: needinfo?(gijskruitbosch+bugs)
Happy new year! It seems I didn't get anything done before Q4, and nobody else did anything in Q4 either, so today/yesterday I figured I'd have a poke at this. This is a pretty rough WIP. I haven't done concurrency in C++ before. I'm mostly wondering if this is the right approach, if there are glaring things I've missed, or if there's a simpler way to do this. Some notes: - I left the SavePrefFile API alone (ie it is still sync), because there are some API consumers that I'm fairly sure assume that they get a sync write out of it. - Telemetry and a number of other things write to the prefs right after startup which triggers a flurry of separate writes, so I ended up upping the delay to 15s instead of just 5 - I haven't implemented shutdown blocking yet. AFAICT, both pre- and post-patch there is no guarantee we'll write an updated file before shutdown (!). Happy to look at doing that, but it seems there's no MDN docs for the C++ version of the AsyncShutdown APIs. - There are currently shutdown-time pref writes that don't get flushed to disk. I noticed SocialService.jsm that does this for xpcom-shutdown (ie sets a pref, potentially for no reason, but doesn't flush to disk), there might be others. So if I'm right that there's no guarantee right now, implementing such a guarantee might potentially regress shutdown performance (oh, the irony) if we don't update consumers - I've not updated any consumers (that frequently call savePrefFile, that is) in this patch, but arguably we should do that when we land this and/or soon after.
Attachment #8703278 - Flags: feedback?(vladan.bugzilla)
Assignee: nobody → gijskruitbosch+bugs
Status: NEW → ASSIGNED
Attached patch Debug printfs I used (obsolete) — Splinter Review
Dunno if this is helpful, or could be helpful if I converted it to use "proper" NSPR logging or something, but I did use them to debug and check that my implementation was doing vaguely sane things, so I figured I might as well add them here.
Flags: needinfo?(gijskruitbosch+bugs)
Comment on attachment 8703278 [details] [diff] [review] do async omt pref writes - WIP, f?vladan Review of attachment 8703278 [details] [diff] [review]: ----------------------------------------------------------------- I'm on PTO this week, so redirecting to aklotz
Attachment #8703278 - Flags: feedback?(vladan.bugzilla) → feedback?(aklotz)
Comment on attachment 8703278 [details] [diff] [review] do async omt pref writes - WIP, f?vladan Review of attachment 8703278 [details] [diff] [review]: ----------------------------------------------------------------- OK, I do have a few suggestions. I'm not super familiar with libpref code so it's probably best to ask one of those peers to look at this as well. ::: modules/libpref/Preferences.cpp @@ +107,5 @@ > + NS_LINEBREAK; > + > +static const uint32_t kPrefDelayedSaveTimeoutMs = 15000; > + > + Nit: extra blank line @@ +220,5 @@ > +{ > +public: > + PreferenceRunnableSaver(nsIFile* aFile) > + { > + mFile = aFile; Nit: please put these in an initializer list. @@ +304,5 @@ > + } > + } > + > +private: > + RefPtr<nsIFile> mFile; nsCOMPtr since you're typing on an interface and not a concrete class. ::: modules/libpref/Preferences.h @@ +375,5 @@ > > private: > nsCOMPtr<nsIFile> mCurrentFile; > > + RefPtr<nsITimer> mCurrentOMTSaveTimer; This should be nsCOMPtr since it is typed on an interface instead of a concrete class. ::: modules/libpref/prefapi.cpp @@ +987,5 @@ > +void PREF_MarkDirty() > +{ > + if (!gDirty) { > + gDirty = true; > + gDirtyCallback(); Make sure this is non-null before calling.
Attachment #8703278 - Flags: feedback?(aklotz) → feedback+
I hope that the write-on-shutdown will happen early and/or that there will remain a few synchronous writes because I have seen shutdown hangs in very busy circumstances (see e.g. my bug 570316, which seemed to have disappeared but has recently resurfaced). I can think of at least the following: - closing the prefs dialog, especially closing it by [OK] with browser.preferences.instantApply being false; - closing the about:config tab. Maybe even refreshing it but I'm less sure.
Attached file Debug printfs I used (obsolete) —
Attachment #8703279 - Attachment is obsolete: true
https://reviewboard.mozilla.org/r/31079/#review27911 I applied all the nits from aklotz, I think. (Thanks!) The 2nd part implements a "proper" shutdown blocker instead of the current call to savePrefFile when profile-before-change fires. In general, this won't change anything compared to the status quo, but it will ensure that a running async save will complete before we shut down. Note that this means we'll still be doing an IO-on-the-main-thread save as we are now. In this sense both these patches are conservative - savePrefFile() also continues to do sync main thread IO. I'd want to file followup bugs to teach the preference bindings not to be so eager to call that method themselves. As a consequence, right now, this would often result in more disk IO than we currently do, because in some cases we'd normally change prefs and not save to disk, and we will now start async saving to disk before shut down. Some of telemetry/healthreport (particularly the "datareporting.sessions.current.totalTime" and friends prefs) writes prefs "late", meaning we almost always shutdown-write prefs anyway. ::: modules/libpref/Preferences.cpp:294 (Diff revision 1) > + // Notify the main thread we're done > + mozilla::MonitorAutoLock mon(Preferences::sPreferences->mOMTSaveMonitor); > + Preferences::sPreferences->mAreWaitingForOMTSave = false; > + > + mon.Notify(); Note that I changed the ordering here compared to the earlier patch to match what the personal dictionary stuff was doing. As noted earlier, it'd be good if someone actually sanity-checked me carefully, because after years of being told threads are hard, and not all that much experience with them, I'm not at all confident that this code is currently correct as written. :-)
https://reviewboard.mozilla.org/r/31081/#review27995 ::: modules/libpref/Preferences.cpp:1042 (Diff revision 1) > + if (!hadCurrentFile) { > + mShutdownBlocker = new PrefShutdownBlocker( > + NS_LITERAL_STRING("Pref service shutdown blocker")); > + nsCOMPtr<nsIAsyncShutdownClient> shutdownPhase = GetShutdownPhase(); > + shutdownPhase->AddBlocker(mShutdownBlocker, > + NS_LITERAL_STRING(__FILE__), > + __LINE__, > + NS_LITERAL_STRING("Pref service shutdown blocker")); > + } Note that I'm doing this here rather than in Init() for two reasons: 1) doing it in Init() breaks the world, because AsyncShutdown tries to read preferences 2) there isn't really a point in doing writes on shutdown (or blocking shutdown for such writes) unless we have an `mCurrentFile` to write to.
Blocks: 1133405
Milan, when I asked Benjamin about this patch ~3 weeks ago he suggested that perhaps you could review this if you feel comfortable doing so - do you? :-)
Flags: needinfo?(milan)
Sure, I'll take a look.
Are we now guaranteed that all the preferences setting is done on the main thread? I know we had a big push to do that, wanted to confirm we've completed that work.
(In reply to Milan Sreckovic [:milan] from comment #35) > Are we now guaranteed that all the preferences setting is done on the main > thread? I know we had a big push to do that, wanted to confirm we've > completed that work. There are some asserts for that in the code (ifdef'd out for b2g, admittedly), so I hope/assume so, but I don't know for sure.
https://reviewboard.mozilla.org/r/31079/#review32313 Thought I'd send a few things while looking at the rest. ::: modules/libpref/Preferences.cpp:109 (Diff revision 1) > +static const uint32_t kPrefDelayedSaveTimeoutMs = 15000; I'd add a way to overwrite this, perhaps with an environment variable. It would give us a quick way of triaging in the field. Perhaps to the point where we're able to disable the async saving with that environment variable (e.g., zero negative value) ::: modules/libpref/Preferences.cpp:206 (Diff revision 1) > + MOZ_ASSERT(NS_IsMainThread()); We're in big trouble if this is ever called off main thread, so I'd go stronger than debug assert. MOZ_CRASH or MOZ_RELEASE_ASSERT, or even guarding against it, rather than asserting. ::: modules/libpref/Preferences.cpp:210 (Diff revision 1) > + if (gDirty) { gDirty will be set to true if we fail in the OMT write. Before this patch, we would just fail, perhaps report, and be done with it. With this patch, we are going to schedule another OMT write? That's new, are we sure we want it? ::: modules/libpref/Preferences.cpp:233 (Diff revision 1) > + // execute a "safe" save by saving through a tempfile Do we want to check if gHashTable exists, or are we safe? ::: modules/libpref/Preferences.cpp:1030 (Diff revision 1) > if (!gDirty) The "contract" is that when we return from SavePrefFileInternal(), the pref file on disk is up to date, correct? If we set gDirty to false just before we dispatch, this won't be the case. We seem to be already handling the "there is an OMT write in progress at some level" below. This should probably stay, it's more describing why it doesn't feel right that we're setting gDirty to false in the timer callback. ::: modules/libpref/Preferences.cpp:1035 (Diff revision 1) > + if (!mAreWaitingForOMTSave) { At first I thought that the monitor is protecting mAreWaitingForOMTSave as the condition variable. It seems to be the case in a couple of places (we get the lock, then we deal with that variable), but not in all of them? Or am I reading the code wrong? ::: modules/libpref/Preferences.cpp:1070 (Diff revision 1) > - // execute a "safe" save by saving through a tempfile > + PreferenceRunnableSaver* saver = new PreferenceRunnableSaver(aFile); This feels like a memory leak? ::: modules/libpref/Preferences.cpp:1093 (Diff revision 1) > - pref = nullptr; > + gDirty = false; I'm not sure about this one. Why reset dirty here and then change it back in the runnable's Run() method, if things go wrong? ::: modules/libpref/Preferences.cpp:1097 (Diff revision 1) > + } else { The callback happens on the main thread, right? When would we ever be in a situation where gDirty is false?
https://reviewboard.mozilla.org/r/31079/#review32947 ::: modules/libpref/Preferences.h:384 (Diff revision 1) > + mozilla::Monitor mOMTSaveMonitor; Tracing through the code, it's starting to feel like we could benefit from a "state tracking" type of enum, rather than a boolean and the existence of the runnable. Same functionality, just more explicit on what the current state is, between not doing anything, and having sent the request and being in the middle of processing the request, etc. ::: modules/libpref/prefapi.h:195 (Diff revision 1) > +void PREF_MarkDirty(); I like this not only for new functionality, but for hiding the direct access of gDirty. To the point where we may want to remove gDirty from extern, leave it in Preferences (probably just as a member variable instead of a global) and have the callback take care of both setting the state and deciding what to do afterwards: Remove PREF_MarkDirty()after replacing it with if (gDirtyCallback) gDirtyCallback(); and make the callback check: void Preferences::PrefsDirtyCallback() { if (sPreferences && gHashTable && gDirty) { sPreferences->SavePrefsToDefaultFileOMT(); } } Just random thoughts :)
Flags: needinfo?(milan)
See Also: → 848461
Milan, I haven't had a chance to come back to this for a while. I know you've looked at this more recently than I - do you have cycles / interest in taking this over?
Flags: needinfo?(milan)
I probably do have time, and have made some patches on top of yours, but I'm not fully convinced that we actually write to disk that often (today.) Given that the original report was from 2012, perhaps things changed :), I'll take a look at telemetry.
Hmm. Where is that telemetry we mention in comment 0? Gijs, do you know if we still collect it, and if so, what it is?
Flags: needinfo?(milan) → needinfo?(gijskruitbosch+bugs)
(In reply to Milan Sreckovic [:milan] from comment #41) > Hmm. Where is that telemetry we mention in comment 0? Gijs, do you know if > we still collect it, and if so, what it is? I have no idea... Georg, do you know?
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(gfritzsche)
(In reply to Milan Sreckovic [:milan] from comment #40) > I probably do have time, and have made some patches on top of yours, but I'm > not fully convinced that we actually write to disk that often (today.) > Given that the original report was from 2012, perhaps things changed :), > I'll take a look at telemetry. FWIW, I think we might not write that often, but doing it sync and OMT might still be problematic. Getting a profile with a large prefs.js file and trying to change some things in about:preferences if disk writes are slow should trigger this. Of course, that almost never happens on our fast dev machines. :-(
(In reply to :Gijs from comment #43) > (In reply to Milan Sreckovic [:milan] from comment #40) > > I probably do have time, and have made some patches on top of yours, but I'm > > not fully convinced that we actually write to disk that often (today.) > > Given that the original report was from 2012, perhaps things changed :), > > I'll take a look at telemetry. > > FWIW, I think we might not write that often, but doing it sync and OMT might > still be problematic. Err, I meant *on* main thread, of course...
(In reply to :Gijs from comment #42) > (In reply to Milan Sreckovic [:milan] from comment #41) > > Hmm. Where is that telemetry we mention in comment 0? Gijs, do you know if > > we still collect it, and if so, what it is? > > I have no idea... Georg, do you know? This looks like chrome hangs? Chris, do you have a better idea on what tooling is good for this now or who would know?
Flags: needinfo?(gfritzsche) → needinfo?(chutten)
A stack like that and a vague mention of "telemetry reports" means it was probably from chromeHangs or threadHangs, yes. As for what would be good tooling in this year of 2017... probably still hangs from chromeHangs/threadHangs. A notebook could plot a histogram of hang lengths for stacks including "mozilla::Preferences" to see their distribution. From there we could determine whether multi-second hangs in mozilla::Preferences calls are a common thing or uncommon thing.
Flags: needinfo?(chutten)
(In reply to Chris H-C :chutten from comment #46) > A stack like that and a vague mention of "telemetry reports" means it was > probably from chromeHangs or threadHangs, yes. > > As for what would be good tooling in this year of 2017... probably still > hangs from chromeHangs/threadHangs. A notebook could plot a histogram of > hang lengths for stacks including "mozilla::Preferences" to see their > distribution. From there we could determine whether multi-second hangs in > mozilla::Preferences calls are a common thing or uncommon thing. Is this something you or someone else on the data team could take a stab at doing? My experience with spark so far has not been great... I'll point out though that in the current state, telemetry's (hah!) own requirements mean we *always* end up writing to the prefs on shutdown, even with these patches (once they're unbitrotted etc.) because we store session length data in the prefs and so those are guaranteed to change (I think telemetry even has a shutdown observer of sorts for this). The patch might still be a win because it does the IO off-main-thread, and it helps for various other cases where we currently force a pref save (and/or enables us to stop doing that). But that's going to be harder to measure than the shutdown hang stuff. I'm also going to unassign this as realistically I'm not working on this at the moment. Hopefully that might spur someone else on to take this on. (Like Milan, maybe? :-) )
Assignee: gijskruitbosch+bugs → nobody
Status: ASSIGNED → NEW
Flags: needinfo?(milan)
Flags: needinfo?(chutten)
I'd be happy to take the bug (I have some stuff on top of what Gijs started with), but I would really want to see a measurement that we expect to get better, and then make sure it gets better. Otherwise, it will be difficult to get a change like this landed - it complicates the code, and if there is no benefit we can show, it's a hard sell. Chris, how do we measure what's mentioned above? Is it already in place, and if not, what needs to be put in place?
There's nothing terribly high level that is already in place. A few minutes' scrambling at EOD yesterday got me this far: https://gist.github.com/chutten/0e91b0cc2cfe4cacc244a0db95a3810d Not sure if this is what we're expecting to see or not. Of relevance might be the return of the BHR dashboard that :dthayer is working on (bug 1344003). Even if the dashboard's too coarse (only shows the top N hangs, may not include Preferences), the analysis might be helpful.
Flags: needinfo?(chutten)
(In reply to :Gijs from comment #47) > I'll point out though that in the current state, telemetry's (hah!) own > requirements mean we *always* end up writing to the prefs on shutdown, even > with these patches (once they're unbitrotted etc.) because we store session > length data in the prefs and so those are guaranteed to change (I think > telemetry even has a shutdown observer of sorts for this). Can you file a Telemetry bug with the details on this? Thanks.
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Georg Fritzsche [:gfritzsche] from comment #50) > (In reply to :Gijs from comment #47) > > I'll point out though that in the current state, telemetry's (hah!) own > > requirements mean we *always* end up writing to the prefs on shutdown, even > > with these patches (once they're unbitrotted etc.) because we store session > > length data in the prefs and so those are guaranteed to change (I think > > telemetry even has a shutdown observer of sorts for this). > > Can you file a Telemetry bug with the details on this? Thanks. bug 1346223.
Depends on: 1346223
Flags: needinfo?(gijskruitbosch+bugs)
Writing telemetry data to disk is showing up as long running hangs on the parent process main thread for 8 seconds or longer (search for Preferences::WritePrefFile in https://people-mozilla.org/~mlayzell/bhr/20170429/all.html). I think we should try to fix this for QF if possible.
Whiteboard: [Snappy:P2] → [Snappy:P2][qf:p1]
Bug 1361262 has an example code path that should help with reproducing a write to the prefs database after a successful restart from a startup crash. Milan mentioned to me in person that he had some difficulty previously reproducing these writes, perhaps that bug would be helpful here.
It's more that I didn't see a lot of calls to write the prefs, as this bug was often mentioning those as well. They don't seem to be that common - however, we're now talking about "one" possibly being a problem, perhaps because of the size, so the frequency of saves isn't that important, it's that they happen on the main thread in the first place.
Yeah I'm going to pick this up and review. Every time I look it my eyes glaze over and I need to just block out some hours to be sure I get it right.
(In reply to Milan Sreckovic [:milan] from comment #54) > It's more that I didn't see a lot of calls to write the prefs, as this bug > was often mentioning those as well. They don't seem to be that common - > however, we're now talking about "one" possibly being a problem, perhaps > because of the size, so the frequency of saves isn't that important, it's > that they happen on the main thread in the first place. Oh, I see, a lot of *calls*. Sorry I misunderstood what you said yesterday. :-) IINM the only entry point to this is nsIPrefService::SavePrefFile(). Looking at where those function is called from C++ these days, there is a call when we shut down, one where the computer suspends, one from the idle daily service, one from the Places Database::ForceCrashAndReplaceDatabase function (sounds scary!), and one from DriverCrashGuard::FlushPreferences() (not sure what that is used for) besides bug 1361262. But sadly there are also JS consumers: <https://searchfox.org/mozilla-central/search?q=symbol:_ZN14nsIPrefService12SavePrefFileEP7nsIFile%2C%23savePrefFile&redirect=false> The BHR data shows that some of the cases where we are being hurt by this are caused by JS consumers of this code, but it's hard to exactly say from which consumer. I tried a bit, and here is one example: https://people-mozilla.org/~mlayzell/bhr/20170429/987.html#1 comes from <https://searchfox.org/mozilla-central/rev/abe68d5dad139e376d5521ca1d4b7892e1e7f1ba/toolkit/mozapps/extensions/internal/XPIProvider.jsm#4394>; the pseudostack is: Startup::XRE_Main gre/modules/addons/XPIProvider.jsm:2940 gre/modules/addons/XPIProvider.jsm:4937 Looking through https://people-mozilla.org/~mlayzell/bhr/20170429/all.html it's clear that a lot of these problematic cases are coming from JS consumers but it's hard to find these in https://people-mozilla.org/~mlayzell/bhr/20170429.html. I filed bug 1361842 in order to add a pseudo-stack entry to help make this more easily visible on the BHR data...
See Also: → 1361842
(In reply to Benjamin Smedberg [:bsmedberg] from comment #55) > Yeah I'm going to pick this up and review. Every time I look it my eyes > glaze over and I need to just block out some hours to be sure I get it right. The patches are now more than a year old though, so they've probably bitrotted. Milan already did some pref service tidying up in bug 1287215, and more must have happened to make get*Pref take optional default args. It would probably make sense for me or Milan or *someone* to unbitrot the patches prior to review... Milan, do you have cycles? I'm pretty swamped with Photon stuff these days...
Whiteboard: [Snappy:P2][qf:p1] → [Snappy:P2][qf:p1][bhr]
I'll see try to at least un-bitrot the original patches before the weekend.
Attachment #8708517 - Attachment is obsolete: true
Attachment #8708517 - Flags: review?(benjamin)
Attachment #8708518 - Attachment is obsolete: true
Flags: needinfo?(milan)
Attachment #8708518 - Flags: review?(benjamin)
Attachment #8708519 - Attachment is patch: false
Attachment #8708519 - Attachment is obsolete: true
Assignee: nobody → milan
Attachment #8703278 - Attachment is obsolete: true
The write method keeps checking a non-protected variable for a possible "interrupt", rather than having a cancellable runnable. Perhaps that would be a better approach. Other than the interrupt part, the Write() method is copied from the old WritePrefFile. There was a number of places that used XRE_IsContentProcess to catch error conditions, where we really needed !XRE_IsParentProcess, so those changes ended up here as well.
Attachment #8864983 - Flags: feedback?
This went in a different direction that Gijs' original patches, so blame me for any bad things, and credit to Gijs for all the good things. Runnables to send the write work to a separate thread, and one to send the error condition back to main thread, in which case we (if necessary) do the main thread write. Off main thread writes do not make an attempt to consolidate, unless they are processed out of order. The earlier request will not write if a later one already did. There is otherwise no time delay suggesting we should wait for more requests. Main thread writes try to stop the off main thread ones, and also block any OMT requests that may be in the queue, but are not being processed yet. Open questions (that I know of): **** Is the thread we're using, NS_STREAMTRANSPORTSERVICE_CONTRACTID, the right one to use?
Attachment #8864985 - Flags: feedback?
I'm not actually sure we would want something like this. The idea is that we'd be able to do a telemetry experiment if this was a preference. If we were to keep this functionality, I'm also unsure if this is the right way to do it.
Attachment #8864988 - Flags: feedback?
This patch needs individual domain experts. I'm not clear which ones could be async.
Attachment #8864991 - Flags: feedback?
Some things coming up: * Will relax the lock held during the main thread write. Some questions to answer: * Do we want off main thread requests to interrupt earlier, non-processed ones? I'd probably rather not bother. * Do we want a crash report note in case we crashed with outstanding off main thread requests in the queue? * Do we want more telemetry on... something around this functionality?
Attachment #8864988 - Attachment is patch: true
Attachment #8864991 - Attachment is patch: true
There are two non-public methods that take a boolean to decide if they should work on the main thread, or off the main thread. I cannot decide if "true" should mean off main thread, or if "true" should mean on main thread. Right now, it means off. Neither sounds quite right. May go to the trouble of making an enum, just to avoid making the decision as to what true means. Any suggestions are welcome.
Comment on attachment 8864983 [details] [diff] [review] Part 1. Move the code for writing into a separate class. Add SavePrefFileMainThread to C++ only. Review of attachment 8864983 [details] [diff] [review]: ----------------------------------------------------------------- ::: modules/libpref/Preferences.cpp @@ +309,5 @@ > + } > + > + // Last chance to abort... > + if (aInterrupt && *aInterrupt) { > + NS_WARNING("OMT pref save interrupted 7"); Clearly, I was assuming that just abandoning an out stream without doing a Finish() call would mean nothing is ever done, whatever was "written" disappears and no changes are made to the actual file in question. That strikes me as a very dangerous assumption, so whoever looks at this should look very carefully.
(In reply to Milan Sreckovic [:milan] from comment #65) > ... > > Clearly, I was assuming that just abandoning an out stream without doing a > Finish() call would mean nothing is ever done, whatever was "written" > disappears and no changes are made to the actual file in question. > > That strikes me as a very dangerous assumption, so whoever looks at this > should look very carefully. Apparently, my assumption was correct. Not calling Finish() means nothing gets done on those streams. Exactly what I needed.
Attachment #8865023 - Attachment is patch: true
Comment on attachment 8865540 [details] Part 0. Design decisions Most of the module specific review is captured in this "design description". Benjamin, reviewing that will probably free you from detailed code reviewed until "the end", as it describes what I'm trying to do. Of course, you can feel free to review the code as well, but I thought this may be easier.
Attachment #8865540 - Flags: review?(benjamin)
Attachment #8865023 - Attachment is obsolete: true
Note: If we are trying to get rid of sync save on shutdown, we have to make sure we handle any outstanding request we may have when it comes to shut down.
Comment on attachment 8865540 [details] Part 0. Design decisions >1. New Preferences::SavePrefFileMainThread() method: > * signature otherwise matches SavePrefFile(); > * accessible from C++ only; > * calling this method ensures (a blocking/sync) main thread preference file save. Reading below, but I'm hoping we either don't need this or it's a temporary porting mechanism? >2. Preferences::SavePrefFile(), accessible from JS as savePrefFile(): > * behind a preference preferences.allow.omt-write does async preference file save on the NS_STREAMTRANSPORTSERVICE_CONTRACTID thread. I don't know a lot about this thread? Is it ok to block this thread on disk I/O? >3. Some common stuff: > * Preferences maintain a dirty flag; this is cleared on successful sync save or immediately on async save request (see below how we handle an async save failure); > * Each save request (sync and async) gets a unique ID; > * A global variable holds the ID of the last successful save; > * A mutex used for synchronization of some operations (more details below) > >4. Async saves details: > * Async saves collect all the data on the main thread, the create a runnable that eventually does the file write, and "succeed" immediately; > * It is assumed that the file pointer remains valid during the lifetime of an async save requst; File pointer? > * Async saves hold the mutex lock for the duration of the file save; This sounds wrong. What is protected by the mutex lock? > * A request for async save to "interrupt" (abandon all changes and return with success) is made by setting a global shared variable (not behind the lock); What is the purpose of making this interruptible? I don't understand what the interruptions are for. > * Async save requests are not allowed during shutdown - they are converted to sync saves. As a long-term plan I don't think this is right. We should instead have a shutdown blocker on async save. This would avoid blocking other shutdown activities on disk I/O. As a short-term plan if it makes the patch easier to write, sure. Other things I think we should make clear: * all saves are only triggered if user prefs are dirty * all saves continue to operate using a safe output stream/atomic move. * now that we have this async system, and user pref change should trigger saving prefs "soon"; I'd suggest 500ms just so that batches of prefs that get set near the same time don't trigger a bunch of separate async writes. My recommendation would be to get rid of the sync save case in the code entirely, because it would significantly reduce the complexity of the system. If necessary we could have a promiseSavePrefFile method which returns a promise. >A. There is no telemetry for items related to the success or interruption of pref file saves. Telemetry for failure is moderately valuable, at least worth a followup bug. Telemetry for interruption doesn't seem worth it. >B. There is no note in the crash report about "this happened during the preference file save". Probably not important.
Attachment #8865540 - Flags: review?(benjamin)
Yeah, I prefer to go gradually into a big change. For example, I would rather land and use the ability to have some saves async, then wait until everything is done. I think we can start seeing the results sooner, and perhaps decide that not all is the highest priority. In particular, I wanted to keep the ability to do a sync save. The goal of making all of them async is a good one, and we should strive for it, but the full async solution gets complicated. In other words, I think we should enhance the system to allow async saves, not redesign it from scratch for only async saves, at least not as a blocking first step. All saves are only triggered if the user prefs are dirty (although we currently have a path where that is not the case, and I would probably want to have that change in a separate bug.) Also, explicitly and because I don't think we need it, I don't have the "wait around for a while, and combine multiple saves into a single one". I suppose it could be added, but I wanted to separate the "reduce the number of disk writes" from the "reduce the blocking of the main thread". I think we can do the former in a separate bug, and we would benefit from the later because the main thread is blocked for less time (note that collecting all the data is still done on the main thread, and that can take non-trivial amount of time - that's a separate optimization that can be considered, with or without the OMT saves.) (In reply to Benjamin Smedberg [:bsmedberg] from comment #80) ... > > * calling this method ensures (a blocking/sync) main thread preference file save. > > Reading below, but I'm hoping we either don't need this or it's a temporary > porting mechanism? > Perhaps. If it turns out that we don't need a sync save, then we get rid of it. If that's only after a follow up bug, then we do it at that point. > >2. Preferences::SavePrefFile(), accessible from JS as savePrefFile(): > > * behind a preference preferences.allow.omt-write does async preference file save on the NS_STREAMTRANSPORTSERVICE_CONTRACTID thread. > > I don't know a lot about this thread? Is it ok to block this thread on disk > I/O? I have no clue, I was hoping you'd know :) I'll try to find out. > > * It is assumed that the file pointer remains valid during the lifetime of an async save requst; > > File pointer? nsIFile* is what we use for saving. I don't know what's backing it up, but I see the current code making assumptions that it's valid and there to be used when necessary. If that's not the case, any direction would be appreciated. > > > * Async saves hold the mutex lock for the duration of the file save; > > This sounds wrong. What is protected by the mutex lock? This is only needed if we're allowing sync saves. We want to make sure that async and sync saves don't happen at the same time. And sync save needs to wait for an async save to finish before it can take a turn. Now, that sounds like a long time, as you'd have main thread possibly blocked on the whole process, which is where the interrupt below comes into the picture... > > * A request for async save to "interrupt" (abandon all changes and return with success) is made by setting a global shared variable (not behind the lock); > > What is the purpose of making this interruptible? I don't understand what > the interruptions are for. If a sync save shows up, we don't need any active async saves to actually complete - the file is about to get overwritten with fresh data anyway. So, we want to stop the async save at a safe point during the process. > > * Async save requests are not allowed during shutdown - they are converted to sync saves. > > As a long-term plan I don't think this is right. We should instead have a > shutdown blocker on async save. This would avoid blocking other shutdown > activities on disk I/O. > > As a short-term plan if it makes the patch easier to write, sure. Yes, makes sense. It would be part of getting rid of sync saves. > My recommendation would be to get rid of the sync save case in the code > entirely, because it would significantly reduce the complexity of the > system. If necessary we could have a promiseSavePrefFile method which > returns a promise. Not sure getting rid of sync reduces complexity, but there is less code in the Preference.cpp for sure. In particular, readUserPrefs() would need to block on a successful write; if nowhere else, we have tests that explicitly test the save+read expecting the correct data. I'll open a few follow up bugs that separate some of the items mentioned here. We can then decide if we want to land/ship when all or handled, or a subset.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #80) > ... > > >2. Preferences::SavePrefFile(), accessible from JS as savePrefFile(): > > * behind a preference preferences.allow.omt-write does async preference file save on the NS_STREAMTRANSPORTSERVICE_CONTRACTID thread. > > I don't know a lot about this thread? Is it ok to block this thread on disk > I/O? Gijs, I got this from your patch. Did you investigate if the thread above was the appropriate one for pref file save/be blocked on disk i/o, or was it a placeholder in the patch? Just don't want to repeat the investigation if one was already made. Thanks!
Flags: needinfo?(gijskruitbosch+bugs)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #80) > ... > My recommendation would be to get rid of the sync save case in the code > entirely, because it would significantly reduce the complexity of the > system. If necessary we could have a promiseSavePrefFile method which > returns a promise. Forgot to mention - sync saves are only available through C++ - JS always gets async saves. The side effect is that readUserPrefs call may get you an out of date file, unless we want it to potentially block the main thread longer.
My recommendation would be, to reduce complexity: * the actual implementation of all saves should be async. That should probably be on the general-purpose threadpool, not the socket thread. * to support blocking saves at least temporarily, kick off an async save and either using sync dispatch or a monitor.wait. That reduces the complexity inherent in any interruptions and cross-thread synchronization of writes; we know that only one thread is ever writing to the pref file, and you merely have to synchronize on epochs. Other notes: I believe that nsIFile is not a thread-safe object and so you'd need to use string paths.
See bug 1364496 comment 1 and bug 1364496 comment 2 for some examples of what we'd need to handle before we can make all saves async. That will require different expertise from (probably) different people.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #84) > My recommendation would be, to reduce complexity: ... I think this just went out of my ability and time to work on - I don't have enough Gecko coding experience. I'm going to bow out, but if there are simple things where I can help, let me know.
Assignee: milan → nobody
(In reply to Milan Sreckovic [:milan] from comment #82) > (In reply to Benjamin Smedberg [:bsmedberg] from comment #80) > > ... > > > > >2. Preferences::SavePrefFile(), accessible from JS as savePrefFile(): > > > * behind a preference preferences.allow.omt-write does async preference file save on the NS_STREAMTRANSPORTSERVICE_CONTRACTID thread. > > > > I don't know a lot about this thread? Is it ok to block this thread on disk > > I/O? > > Gijs, I got this from your patch. Did you investigate if the thread above > was the appropriate one for pref file save/be blocked on disk i/o, or was it > a placeholder in the patch? Just don't want to repeat the investigation if > one was already made. Well, it's now 1.5 years ago... I think I copied it from elsewhere. From a quick look, it's used by nsAsyncStreamCopier, NetUtils, service workers, HTML fileSystem stuff, nsStreamUtils, etc., for file IO. So I *think* it's a reasonable first choice, unless we think this stuff should get its own thread... (In reply to Milan Sreckovic [:milan] from comment #86) > (In reply to Benjamin Smedberg [:bsmedberg] from comment #84) > > My recommendation would be, to reduce complexity: > ... > > I think this just went out of my ability and time to work on - I don't have > enough Gecko coding experience. I'm going to bow out, but if there are > simple things where I can help, let me know. OK, that's 2 of us now. Ehsan, can someone from :qf or photon perf pick this up? I think we all agree this is important, but we're struggling to find people to work on it and drive this over the finishline.
Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(ehsan)
To be clear, I can do the "optional async save, with sync save still available, and then the authors of sync calls have to go and decide if they can go async". Those are the patches on this bug. And a callback on preference write (in bug 1361262) type of thing to deal with at least the sync pref save during startup. It's the "make all saves async, and put a delay before doing any saves" that's throwing me for a loop, especially the part where I then have to figure out what all the sync callers are now supposed to do. Things like bug 1364496 comment 1 and bug 1364496 comment 2, for example. Don't have the expertise for that.
On a related note, I also prefer a gradual and safe approach, so that's definitely coloring my opinion on this. I would rather put in the ability to async save, and change one place that we've identified as important, and then do the rest in the follow up bugs.
So the only parts that I feel have to happen here to have a safe and understandable codepath are doing all the actual writes on the thread. So even if that means a sync dispatch to the thread which blocks in some cases, that at least seriously reduces the complexity of synchronizing I/O on multiple threads.
Talked offline over some details, should be able to do this.
Assignee: nobody → milan
Flags: needinfo?(ehsan)
The "always write on the same thread" simplifies some things. The only worry I have right now - when we do need to perform a sync save (likely, this are few and far between cases, if any, tracked in bug 1364496), the main thread would be blocked on the thread performing writes - not just on the pref writes themselves, but anything that may be executed in that thread.
There are some add-ons that use readUserPref method - that will block if there are pending writes.
Ugh, that API needs to die. What would happen if we just removed it now?
Attached file Part 0. Design decisions (obsolete) —
Attachment #8865540 - Attachment is obsolete: true
Attachment #8865982 - Attachment is obsolete: true
Attachment #8865985 - Attachment is obsolete: true
Attachment #8865986 - Attachment is obsolete: true
Attachment #8866494 - Attachment is obsolete: true
Attachment #8866495 - Attachment is obsolete: true
Comment on attachment 8870870 [details] Bug 789945: Part 4. Explicitly make some pref save be blocking calls. https://reviewboard.mozilla.org/r/142440/#review146180 ::: modules/libpref/Preferences.cpp:916 (Diff revision 1) > } > > nsresult rv = NS_OK; > > if (!nsCRT::strcmp(aTopic, "profile-before-change")) { > - rv = SavePrefFile(nullptr); > + // There are no async saves outstanding, we have a shutdown blocker. I may have lied on this; I did run into a problem trying to create a shutdown blocker at this stage of the shutdown at one point, but that may have been some other problem. I will change this to SavePrefFileOffMainThread() and see if it works.
Comment on attachment 8870870 [details] Bug 789945: Part 4. Explicitly make some pref save be blocking calls. https://reviewboard.mozilla.org/r/142440/#review146522 ::: modules/libpref/Preferences.cpp:99 (Diff revision 2) > +static bool gInShutdownTooLateForChanges = false; > void > Preferences::DirtyCallback() > { > if (gHashTable && sPreferences && !sPreferences->mDirty) { > + MOZ_DIAGNOSTIC_ASSERT(!gInShutdownTooLateForChanges); This lights up try with crashes.
(In reply to Milan Sreckovic [:milan] from comment #102) > ... > > { > > if (gHashTable && sPreferences && !sPreferences->mDirty) { > > + MOZ_DIAGNOSTIC_ASSERT(!gInShutdownTooLateForChanges); > > This lights up try with crashes. For example, browser/modules/test/unit/social/test_social.js, using browser/modules/test/unit/social/head.js sets social.activeProviders in do_register_cleanup. That call shows up after profile-before-change.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #94) > Ugh, that API needs to die. What would happen if we just removed it now? Opened bug 1367813 for this.
Blocks: 1367813
Comment on attachment 8870868 [details] Bug 789945: Part 2. Off main thread functionality for the save. https://reviewboard.mozilla.org/r/142436/#review146588 ::: modules/libpref/test/unit/test_bug506224.js:14 (Diff revision 1) > var ps = Cc["@mozilla.org/preferences-service;1"] > .getService(Ci.nsIPrefService); > var prefs = ps.getDefaultBranch(null); > var userprefs = ps.getBranch(null); > > + // Make sure the preference writes are synchronous, as we are Since async only applies to default read/save, we do not need this change in the test - it will be sync anyway. ::: services/sync/tests/unit/test_prefs_store.js:35 (Diff revision 1) > // back to this prefs_test_prefs_store.js directly in the obj dir. This > // upsets things in confusing ways :) We avoid this by explicitly telling the > // pref service to use a file in our profile dir. > let prefFile = do_get_profile(); > prefFile.append("prefs.js"); > + Since sync applies only to default read/save, we likely do not need this.
The review comments above were meant for the setting of ...omt-write preference, not the code that follows it.
:snorp, how aggressively do you want to switch to off main thread pref saves? The part 4 patch above leaves two places with sync saves. Or, we can leave the preference for allowing async saves off by default on mobile. Thoughts?
Flags: needinfo?(snorp)
(In reply to Milan Sreckovic [:milan] from comment #108) > :snorp, how aggressively do you want to switch to off main thread pref > saves? The part 4 patch above leaves two places with sync saves. Or, we > can leave the preference for allowing async saves off by default on mobile. > Thoughts? OMT pref saves seem like they would be good on mobile, but I think we'd need a way to synchronously flush any pending saves. We get one opportunity to get into a good state when the application is backgrounded, and after that Android can just kill us without notice.
Flags: needinfo?(snorp)
Comment on attachment 8870867 [details] Bug 789945: Part 1. Extract preference write functionality into a separate method/class and add interrupt callback functionality. Add methods to explicitly ask for main or off main thread save (inactive.) https://reviewboard.mozilla.org/r/142434/#review147102 r+ with nits fixed ::: modules/libpref/Preferences.cpp:232 (Diff revision 2) > > +// Write the preference data to a file. The data is extracted > +// during the construction and owned by an instance of this class, and > +// we assume that the file remains valid throughout its lifetime. > +// > +class PreferencesWriter add `final` ::: modules/libpref/Preferences.cpp:336 (Diff revision 2) > +#endif > + return rv; > + } > + > +protected: > + nsCOMPtr<nsIFile> mFile; It doesn't matter yet because you don't actually have threading, but we need to be careful about the threading constraints here. nsIFile is IIRC an apartment-threaded object and so we can't destroy this class on a different thread than created it. Beware.
Attachment #8870867 - Flags: review?(benjamin) → review+
Comment on attachment 8870869 [details] Bug 789945: Part 3. Flush late in the shutdown, to give a last save a chance to complete and for the flush to not be blocking. " https://reviewboard.mozilla.org/r/142438/#review148026 ::: modules/libpref/Preferences.cpp:425 (Diff revision 2) > // If it's interrupted because of the min id, we consider the call > // a success, and if we fail during the write, we will send back the > // fail message to the main thread. > class PWRunnable : public PreferencesWriter, > - public Runnable > + public nsIRunnable, > + public nsIAsyncShutdownBlocker Is it better for each of these runnables to be a shutdown blocker, rather than have Preferences.cpp itself be the shutdown blocker? I'm interested in the tradeoffs/how you decidedon this. ::: modules/libpref/Preferences.cpp:446 (Diff revision 2) > // Make a copy of these so we can have them in runnable lambda > + nsCOMPtr<nsIAsyncShutdownBlocker> self(this); > nsresult rvCopy = rv; > int32_t idCopy = mId; > - SystemGroup::Dispatch("Preferences::WriterRunnable", > + SystemGroup::Dispatch( > + nsPrintfCString("Preferences::WriterRunnable-%d", mId).get(), Don't use nsPrintfCString here. Dispatch is only supposed to take statically allocated strings IIRC, because you don't control lifetime. ::: modules/libpref/Preferences.cpp:454 (Diff revision 2) > - PWHelper::SetCompleted(idCopy, rvCopy); > + PWHelper::SetCompleted(idCopy, rvCopy); > - })); > > + nsCOMPtr<nsIAsyncShutdownClient> sp = GetShutdownPhase(); > + if (sp) { > + sp->RemoveBlocker(self); I don't see anywhere which *adds* self to shutdown blockers. Did I miss something/is that in a different patch? ::: modules/libpref/Preferences.cpp:471 (Diff revision 2) > + mozilla::services::GetAsyncShutdown(); > + MOZ_RELEASE_ASSERT(svc); > + > + nsCOMPtr<nsIAsyncShutdownClient> sp; > + nsresult rv = svc->GetProfileBeforeChange(getter_AddRefs(sp)); > + if (!sp) { This deserves a comment at least. I'm guessing this is required because of last-ditch pref writes that happen during profile-before-change? ::: modules/libpref/Preferences.cpp:500 (Diff revision 2) > +} > + > +NS_IMETHODIMP > +PWRunnable::GetState(nsIPropertyBag**) > +{ > + return NS_OK; You need an *aResult = nullptr here or you risk undefined behavior.
Attachment #8870869 - Flags: review?(benjamin) → review-
Comment on attachment 8870870 [details] Bug 789945: Part 4. Explicitly make some pref save be blocking calls. https://reviewboard.mozilla.org/r/142440/#review148036 ::: modules/libpref/Preferences.cpp:924 (Diff revision 4) > } else if (!strcmp(aTopic, "load-extension-defaults")) { > pref_LoadPrefsInDirList(NS_EXT_PREFS_DEFAULTS_DIR_LIST); > } else if (!nsCRT::strcmp(aTopic, "reload-default-prefs")) { > // Reload the default prefs from file. > + MOZ_RELEASE_ASSERT(NS_IsMainThread()); > + if (AllowOffMainThreadSave()) { I don't understand why this block is here. Reloading default prefs shouldn't affect user prefs.
Attachment #8870870 - Flags: review?(benjamin) → review-
Comment on attachment 8870868 [details] Bug 789945: Part 2. Off main thread functionality for the save. talked this over with Milan on IRC. There is a thread-racy variable here which going to cause TSan fits and also might be an actual race error on loose-data-model hardware. I think we're going to instead of using counters use a single static Atomic<char*[]> sPendingData setting new data to post looks like this: UniquePtr<char*[]> newData = ...construct new pref list; UniquePtr<char*[]> oldData = sPendingData.exchange(newData); the writing thread pulls like this: UniquePtr<char*[]> dataToWrite = sPendingData.exchange(nullptr); And on any thread, you can check whether there are pending writes by doing bool writesPending = !sPendingData.compareExchange(nullptr, nullptr);
Attachment #8870868 - Flags: review?(benjamin)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #114) > ... > > + nsCOMPtr<nsIFile> mFile; > > It doesn't matter yet because you don't actually have threading, but we need > to be careful about the threading constraints here. nsIFile is IIRC an > apartment-threaded object and so we can't destroy this class on a different > thread than created it. > > Beware. Right now, we don't expect the writer to be holding the last reference to the file - we're only using the async write with mCurrentFile owned by Preferences, and we don't allow change to mCurrentFile with outstanding write requests. So, really, a "safe" thing to do would be to just pass and use nsFile* instead of reference counting. The obvious options: 1. Use a pointer. If something goes wrong and mCurrentFile disappears, we have a use after free. Not nice. 2. Use the code as is. If something goes wrong and mCurrentFile disappears, we will end up destroying on the wrong thread, which is assert in debug and possible memory leak on release? 3. Send the file back with the result message. That should be safe even if mCurrentFile disappears, but is perhaps an overkill? #1 is out, perhaps #3 isn't too much of an overkill.
(In reply to Benjamin Smedberg [:bsmedberg] from comment #115) > > class PWRunnable : public PreferencesWriter, > > - public Runnable > > + public nsIRunnable, > > + public nsIAsyncShutdownBlocker > > Is it better for each of these runnables to be a shutdown blocker, rather > than have Preferences.cpp itself be the shutdown blocker? > > I'm interested in the tradeoffs/how you decidedon this. Mostly to avoid tracking how many requests are outstanding before we remove the blocker. So, if I have multiple runnables "out there", and each one is a blocker, that tracking is automatic. If I just have a single blocker, it needs to keep count. I haven't considered any negative implications of multiple blockers, though I don't anticipate having multiple runnables active (but it is possible.)
Attached file Part 0. Design decisions (obsolete) —
Attachment #8870866 - Attachment is obsolete: true
Attachment #8870867 - Attachment is obsolete: true
Attachment #8870868 - Attachment is obsolete: true
Attachment #8870869 - Attachment is obsolete: true
Attachment #8870870 - Attachment is obsolete: true
(In reply to Milan Sreckovic [:milan] from comment #60) > ... > > **** Is the thread we're using, NS_STREAMTRANSPORTSERVICE_CONTRACTID, the > right one to use? Just had Honza confirm this is the right one to use.
I'm not necessarily happy with the state of shutdown, but it seems there is no good state of shutdown (see bug 1364498), at least the way things currently are. 1. We currently do the "last preferences save" when Preferences::Observe gets profile-before-change. 2. We're allowed to change preferences until the end of profile-before-change, so if there are any that get modified after the observer call, but before the end of all observer calls, those won't be saved. bsmedberg suggestion of making an explicit Preferences services call in nsXREDirProvider::DoShutdown() instead of relying on the observer would take care of this. 3. profile-before-change is where we want shutdown blockers to be done. profile-before-change-telemetry, which comes after that, is where we sometimes modify preference toolkit.telemetry.cachedClientID. Perhaps it's OK if we don't save that - we never did before (see #1.) 4. Shutdown blockers need to be taken care of inside of profile-before-change call, currently before the Preferences get that call. Again, bsmedbergs suggestion would take care of that, but note that it would have to be before profile-before-change, not after. So, we have to choose between #2 and #4. Where I'm at right now: I) Have an explicit call in nsXREDirProvider::DoShutdown() to prepare Preferences for shutdown. This will do an async save, and put in process a "dismantly OMT pref saves and shutdown blocker" mechanism in Preferences. If there are any further saves, they will be on main thread. II) Try to find which preferences are getting set within profile-before-change callbacks, and see if they could be moved to profile-change-teardown, as a follow up to this bug. III) Do a main thread sync preference save when processing profile-before-change in the Preferences. This will retain today's behaviour in which preferences get saved. I will find out how many of those we actually have. IV) Keep not saving preferences that get modified *after* profile-before-change Preferences callback. That will definitely be things that are done within profile-before-change-telemetry. It wouldn't be difficult to make sure all the preferences get saved, with a callback after profile-before-change-telemetry, but that would guarantee a main thread save when those get modified. Should be done in a follow up bug, as that changes behaviour we have today.
Attached file Part 0. Design decisions (obsolete) —
Attachment #8873921 - Attachment is obsolete: true
Comment on attachment 8870868 [details] Bug 789945: Part 2. Off main thread functionality for the save. https://reviewboard.mozilla.org/r/142436/#review152496 r=me with nits fixed ::: modules/libpref/Preferences.cpp:860 (Diff revision 6) > { > - return false; > + // Put in a preference that allows us to disable > + // off main thread preference file save. > + if (sAllowOMTPrefWrite < 0) { > + bool value = false; > + Preferences::GetBool("preferences.allow.omt-write", &value); Use the new with-default version of GetBool: `bool value = Preferences::GetBool("preferences.allow.omt-write", false);` ::: modules/libpref/Preferences.cpp:863 (Diff revision 6) > + if (sAllowOMTPrefWrite < 0) { > + bool value = false; > + Preferences::GetBool("preferences.allow.omt-write", &value); > + sAllowOMTPrefWrite = value ? 1 : 0; > + } > + return sAllowOMTPrefWrite; This is an implicit int->bool cast. I'd suggest making it explicit with `return !!sAllowOMTPrefWrite;` ::: modules/libpref/Preferences.cpp:1218 (Diff revision 6) > + // There were no previous requests, dispatch one since > + // sPendingWriteData has the up to date information. > + nsCOMPtr<nsIEventTarget> target = > + do_GetService(NS_STREAMTRANSPORTSERVICE_CONTRACTID, &rv); > + if (NS_SUCCEEDED(rv)) { > + bool omt = aOnWhichThread == SaveMethod::OffMainThread; is `OffMainThread` the right description of this flag now? Isn't this more about "save synchronously" versus "save asynchronously"? If so it might be worth updating the SaveMethod enum.
Attachment #8870868 - Flags: review?(benjamin) → review+
Comment on attachment 8870870 [details] Bug 789945: Part 4. Explicitly make some pref save be blocking calls. https://reviewboard.mozilla.org/r/142440/#review152510
Attachment #8870870 - Flags: review?(benjamin) → review+
Comment on attachment 8870869 [details] Bug 789945: Part 3. Flush late in the shutdown, to give a last save a chance to complete and for the flush to not be blocking. " https://reviewboard.mozilla.org/r/142438/#review152512 Reading through this, I think we don't need ~any of the complexity here. Our only goal is to make sure that a pref write is finished before we go away. The only thing we need to do that is a single Flush() call sometime after profile-before-change is finished. You can actually move it as late as after profile-before-change-telemetry. So I'd recommend adding a Preferences::Shutdown() method, having that Flush(). If you like, you can add a flag so that any pref changes that happen after Shutdown() would cause a warning (later maybe even an assertion). I really don't want to change the behavior of sAllowOMTPrefWrite after we've set it up: we'd almost certainly need a flush in order to do thread synchronization, and it just gets messy again.
Attachment #8870869 - Flags: review?(benjamin) → review-
Attachment #8875805 - Attachment is obsolete: true
Comment on attachment 8870870 [details] Bug 789945: Part 4. Explicitly make some pref save be blocking calls. https://reviewboard.mozilla.org/r/142440/#review153540
Comment on attachment 8870869 [details] Bug 789945: Part 3. Flush late in the shutdown, to give a last save a chance to complete and for the flush to not be blocking. " https://reviewboard.mozilla.org/r/142438/#review153550 woo, ship it
Attachment #8870869 - Flags: review?(benjamin) → review+
Pushed by msreckovic@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/523a719a0930 Part 1. Extract preference write functionality into a separate method/class and add interrupt callback functionality. Add methods to explicitly ask for main or off main thread save (inactive.) r=bsmedberg https://hg.mozilla.org/integration/autoland/rev/c5161ba62453 Part 2. Off main thread functionality for the save. r=bsmedberg https://hg.mozilla.org/integration/autoland/rev/4a067c50723a Part 3. Flush late in the shutdown, to give a last save a chance to complete and for the flush to not be blocking. r=bsmedberg" https://hg.mozilla.org/integration/autoland/rev/1363bdd29ce7 Part 4. Explicitly make some pref save be blocking calls. r=bsmedberg
So after changing some pref from about:config, how much time should I wait to make sure that the preference has been written completely to the disk?
Flags: needinfo?(milan)
Depends on: 1373250
(In reply to Mayank Bansal from comment #152) > So after changing some pref from about:config, how much time should I wait > to make sure that the preference has been written completely to the disk? There is no delay on queuing the work, so whatever it takes for the writing thread to pick the work up and write it. There is no batching, so two about:config changes will likely cause two writes because of the time between two pref changes in the UI.
Flags: needinfo?(milan)
This bug ended up only covering the async save part, not the "automatically trigger saves on pref changes", so modifying the summary.
Summary: Do automatic periodic async, off-mainthread writes for the prefs file when prefs change → Save preferences asynchronously
Blocks: 981818
Depends on: 1386263
This was disabled on Beta56 for causing bug 1386263. It remains enabled on Nightly.
Depends on: 1384454
Let's see if bug 1388464 takes care of this.
Performance Impact: --- → P1
Whiteboard: [Snappy:P2][qf:p1][bhr] → [Snappy:P2][bhr]
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Creator:
Created:
Updated:
Size: