Closed Bug 1401236 Opened 8 years ago Closed 8 years ago

Sync hangs main process for multiple seconds, apparently after visiting links

Categories

(Firefox :: Sync, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
firefox57 --- affected

People

(Reporter: jryans, Assigned: lina)

References

Details

Attachments

(1 file)

I can't seem to trigger this issue reliably, so that's not the best... In any case, I noticed in recently Nightlies that the main process would hang for multiple seconds occasionally. Here's a profile of the issue happening: https://perfht.ml/2hfamef The key lines in the stack during hang appear to be: addChangedID (resource://services-sync/engines.js:145) js::RunScript js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) <name omitted> js::ForwardingProxyHandler::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) const js::CrossCompartmentWrapper::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) const js::Proxy::call(JSContext*, JS::Handle<JSObject*>, JS::CallArgs const&) js::proxy_Call(JSContext*, unsigned int, JS::Value*) js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) Interpret(JSContext*, js::RunState&) js::RunScript(JSContext*, js::RunState&) onVisit js::RunScript js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct) <name omitted> JS_CallFunctionValue(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>) nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, XPTMethodDescriptor const*, nsXPTCMiniVariant*) PrepareAndDispatch SharedStub nsNavHistory::NotifyOnVisit(nsIURI*, long long, long long, long long, int, nsTSubstring<char> const&, bool, unsigned int, unsigned int, nsTSubstring<char16_t> const&) mozilla::places::(anonymous namespace)::NotifyVisitObservers::Run()
It's probably the history tracker reading persisted changed IDs from disk. We use `JSONFile#load` to read the file (https://searchfox.org/mozilla-central/rev/1c13d5cf85f904afb8976c02a80daa252b893fca/toolkit/modules/JSONFile.jsm#180)...which is async, except I see we don't check if `this.dataReady` is true *before* we read. That might be the issue here.
Assignee: nobody → kit
Status: NEW → ASSIGNED
Comment on attachment 8909891 [details] Bug 1401236 - Don't reread already loaded data in `JSONFile#load`. https://reviewboard.mozilla.org/r/181382/#review186624 This makes me nervous for 2 reasons: 1) It seems like it might be the case that some consumers *intend* to re-read data from disk at some point. 2) I don't see the JSON file thing in the profile? Instead I see us spending a lot of time in nsNavHistory::NotifyOnVisit and its callbacks. It seems activity stream especially is trying to call getScreenshotForURL and spending a lot of time doing that. Maybe I'm misreading the profile? What makes you think this is the culprit?
Attachment #8909891 - Flags: review?(gijskruitbosch+bugs)
(In reply to :Gijs from comment #3) > 1) It seems like it might be the case that some consumers *intend* to re-read data from disk at some point. Maybe, though it's not clear to me why they would, since we check `this.dataReady` and bail immediately after `read`. At best, we've done a lot of busy work to read the contents into memory again, only to discard them because the file is already loaded. Sync calls `load` and spins a nested event loop unconditionally (https://searchfox.org/mozilla-central/rev/1c13d5cf85f904afb8976c02a80daa252b893fca/services/sync/modules/engines.js#100), so we'll reread the tracker file every time you visit a page. I can move the `dataReady` check into Sync, but thought it made more sense to do here. > Maybe I'm misreading the profile? What makes you think this is the culprit? It's likely not the main culprit, but I noticed `get changedIDs` showing up repeatedly in the profile. We can easily avoid that work, especially for large tracker files.
(In reply to Kit Cambridge (he/him) [:kitcambridge] (UTC-7) from comment #4) > (In reply to :Gijs from comment #3) > > 1) It seems like it might be the case that some consumers *intend* to re-read > data from disk at some point. > > Maybe, though it's not clear to me why they would, since we check > `this.dataReady` and bail immediately after `read`. At best, we've done a > lot of busy work to read the contents into memory again, only to discard > them because the file is already loaded. > > Sync calls `load` and spins a nested event loop unconditionally > (https://searchfox.org/mozilla-central/rev/ > 1c13d5cf85f904afb8976c02a80daa252b893fca/services/sync/modules/engines. > js#100), so we'll reread the tracker file every time you visit a page. I can > move the `dataReady` check into Sync, but thought it made more sense to do > here. Fair point! > > Maybe I'm misreading the profile? What makes you think this is the culprit? > > It's likely not the main culprit, but I noticed `get changedIDs` showing up > repeatedly in the profile. We can easily avoid that work, especially for > large tracker files. OK, that makes sense. We might need to keep the bug open to do further investigation?
Keywords: leave-open
Comment on attachment 8909891 [details] Bug 1401236 - Don't reread already loaded data in `JSONFile#load`. https://reviewboard.mozilla.org/r/181382/#review186642 Kit's right, I was wrong, let's do this. :-)
Attachment #8909891 - Flags: review+
Pushed by kcambridge@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/c0decd9037fe Don't reread already loaded data in `JSONFile#load`. r=Gijs
We were chatting about this on IRC, and while I agree the JSONFile patch is correct, I don't see how it explains jank. I suspect Gijs may be correct that ActivityStream is the actual culprit here (but is attributed to Sync due to us spinning event loops, which we are working hard to kill). :jryans, any chance you could try disabling AS to see if that theory may actually be correct?
Flags: needinfo?(jryans)
(In reply to Mark Hammond [:markh] from comment #8) > We were chatting about this on IRC, and while I agree the JSONFile patch is > correct, I don't see how it explains jank. I suspect Gijs may be correct > that ActivityStream is the actual culprit here (but is attributed to Sync > due to us spinning event loops, which we are working hard to kill). :jryans, > any chance you could try disabling AS to see if that theory may actually be > correct? I mean, that and/or get another profile with a build that has this fix, which won't have sync in the stacks anymore. :-)
(In reply to :Gijs from comment #9) > I mean, that and/or get another profile with a build that has this fix, > which won't have sync in the stacks anymore. :-) I think Sync may still appear (although it seems likely it will no longer be JSONFile.load) - sync has .promiseSpinningly which always spins an event loop and is probably implicated. We'll be "done" asyncifying sync when promiseSpinningly no longer exists :)
(In reply to Mark Hammond [:markh] from comment #11) > sync has .promiseSpinningly which always spins an event > loop hrm - that's possibly not actually true, depending on how these promises resolve and the semantics of nsIThreadManager - but I'm sure we can still expect sync to pop up in odd stack traces after this patch.
(In reply to Mark Hammond [:markh] from comment #8) > We were chatting about this on IRC, and while I agree the JSONFile patch is > correct, I don't see how it explains jank. I suspect Gijs may be correct > that ActivityStream is the actual culprit here (but is attributed to Sync > due to us spinning event loops, which we are working hard to kill). :jryans, > any chance you could try disabling AS to see if that theory may actually be > correct? D'oh, I even knew about the issues with Sync event loops... Guess I should have expanded the profile further. :/ In an attempt to test this, I have not yet updated to a Nightly with the fix here. Instead, I disabled "browser.newtabpage.activity-stream.enabled" in about:config. It looks like AS listens for the pref change, so I did not restart Firefox. (The AS content is now gone from about:newtab.) I'll leave it this way for a bit and see if the issue happens again. Keeping ni? for now while I test.
With AS disabled today, I did not notice this bug anymore. Should I morph this into an AS bug somehow?
Flags: needinfo?(jryans)
See Also: → 1401817
(In reply to J. Ryan Stinnett [:jryans] (use ni?) from comment #14) > With AS disabled today, I did not notice this bug anymore. > > Should I morph this into an AS bug somehow? Given this bug has landed a patch, I think another bug is best - I just opened bug 1401817. The patch here is now in central, so I'll close this.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Keywords: leave-open
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: