Closed Bug 1462135 Opened Last year Closed Last year

30 second hang in PU_getCharsetForURI/</< | nsAnnotationService::GetPageAnnotation

Categories

(Toolkit :: Places, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla63
Tracking Status
firefox62 --- wontfix
firefox63 --- fixed

People

(Reporter: mstange, Assigned: standard8)

References

Details

(Keywords: perf, Whiteboard: [fxperf:p3])

Attachments

(1 file)

My browser was frozen for half a minute in nsAnnotationService::GetPageAnnotation just now. I have a profile but I don't want to share it because it contains URLs that might allow people to steal my Gmail session.

This hang started when I opened a new tab and typed some letters into the URL bar.

Here's the full stack, copied out of the profile:

> ___psynch_mutexwait
> _pthread_mutex_lock_slow
> vdbeUnbind
> bindText
> int mozilla::storage::(anonymous namespace)::variantToSQLiteT<mozilla::storage::(anonymous namespace)::BindingColumnData>(mozilla::storage::(anonymous namespace)::BindingColumnData, nsIVariant*)
> mozilla::storage::BindingParams::bind(sqlite3_stmt*)
> mozilla::storage::Statement::ExecuteStep(bool*)
> Statement::ExecuteStep
> nsAnnotationService::StartGetAnnotation(nsIURI*, long long, nsTSubstring<char> const&, nsCOMPtr<mozIStorageStatement>&)
> nsAnnotationService::GetPageAnnotation(nsIURI*, nsTSubstring<char> const&, nsIVariant**)
> NS_InvokeByIndex
> XPCWrappedNative::CallMethod(XPCCallContext&, XPCWrappedNative::CallMode)
> XPC_WN_CallMethod(JSContext*, unsigned int, JS::Value*)
> js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct)
> Interpret(JSContext*, js::RunState&)
> js::RunScript(JSContext*, js::RunState&)
> PU_getCharsetForURI/</< resource://gre/modules/PlacesUtils.jsm
> js::RunScript
> js::InternalCallOrConstruct(JSContext*, JS::CallArgs const&, js::MaybeConstruct)
> <name omitted>
> JS_CallFunctionValue(JSContext*, JS::Handle<JSObject*>, JS::Handle<JS::Value>, JS::HandleValueArray const&, JS::MutableHandle<JS::Value>)
> nsXPCWrappedJSClass::CallMethod(nsXPCWrappedJS*, unsigned short, nsXPTMethodInfo const*, nsXPTCMiniVariant*)
> PrepareAndDispatch
> SharedStub
> nsThread::ProcessNextEvent(bool, bool*)
> NS_ProcessPendingEvents(nsIThread*, unsigned int)
> nsBaseAppShell::NativeEventCallback()
> nsAppShell::ProcessGeckoEvents(void*)
> nsAppShell::ProcessGeckoEvents
> ___CFRUNLOOP_IS_CALLING_OUT_TO_A_SOURCE0_PERFORM_FUNCTION__
> ___CFRunLoopDoSources0
> ___CFRunLoopRun
> _CFRunLoopRunSpecific
> RunCurrentEventLoopInMode
> ReceiveNextEventCommon
> _BlockUntilNextEventMatchingListInModeWithFilter
> __DPSNextEvent
> -[NSApplication(NSEvent) _nextEventMatchingEventMask:untilDate:inMode:dequeue:]
> -[GeckoNSApplication nextEventMatchingMask:untilDate:inMode:dequeue:]
> -[NSApplication run]
> nsAppShell::Run()
> nsAppStartup::Run()
> XREMain::XRE_mainRun()
> XREMain::XRE_main
> XRE_main(int, char**, mozilla::BootstrapConfig const&)
> main
> start
> (root)
Keywords: perf
OS: Unspecified → All
Hardware: Unspecified → All
Summary: 30 second hang in PU_getCharsetForURI/</< nsAnnotationService::GetPageAnnotation → 30 second hang in PU_getCharsetForURI/</< | nsAnnotationService::GetPageAnnotation
`getPageAnnotation` is synchronous and runs SQL on the main thread, and gets called via `BrowserUtils.parseUrlAndPostData` from unified complete. :-/ OOC, do you see any statements mentioning `moz_annos` in about:telemetry#slow-sql-tab?
(In reply to Kit Cambridge (they/them) [:kitcambridge] from comment #1)
> OOC, do you see any statements mentioning `moz_annos` in
> about:telemetry#slow-sql-tab?

No I don't, the only statement in the "Slow SQL Statements on Helper Threads" list that took longer than a second is this one:
> UPDATE moz_places SET frecency = ROUND(frecency * :decay_rate) WHERE frecency > 0 /* places.sqlite */
And it took 521896ms, i.e. over 8 minutes.
I'm not familiar enough with places to know if we use annotations for anything besides bookmarks, ie if fixing bug 1460577 will make this go away or not.

Anyway, 30s is clearly ridiculous...

Maybe Marco or Kit have more ideas about how we could improve this. It looks like we already avoid running this query if there's no replacement character in the post data or URL query. I guess we could also get a bool as to whether the URI is even bookmarked, but you're still going to hit this for bookmark keyword cases, and without a better API I don't really know how to make this better. I'm assuming it's not possible to just add an async/OMT sql query for this information instead?


Otherwise, one thing that we should probably consider doing is memo-izing the result somehow. Off-hand it looks like we would re-fetch the charset when the user keeps typing (and we presumably hit this again in the actual load via the callsite in browser.js ).
Flags: needinfo?(mak77)
Flags: needinfo?(kit)
See Also: → 1460577
Whiteboard: [fxperf]
(In reply to :Gijs (he/him) from comment #3)
> Anyway, 30s is clearly ridiculous...

main-thread IO is unpredictable, could be 1us like 1 minute, it depends on the OS... That's why it's horrible.

(In reply to :Gijs (he/him) from comment #3)
> ie if fixing bug 1460577 will make this go away
> or not.

No, this is a page annotation, we still want to move those to async, but it's a bit more complex.

(In reply to Kit Cambridge (they/them) [:kitcambridge] from comment #1)
> `getPageAnnotation` is synchronous and runs SQL on the main thread, and gets
> called via `BrowserUtils.parseUrlAndPostData` from unified complete. :-/

Yes, this fact makes the severity a bit higher because it's during address bar completion.
On the other side it happens when the user typed a bookmark keyword and most users don't have those.

There are various solutions, my preferred one is converting page annotations to something else, that can be async. I don't think we have any page anno that deserves to be synchronous. Introducing a temporary async API may also be ok.
My current idea is to have an additional column in moz_places containing all annotations for each page in some format, and storing/retriving that asynchronously.
A simpler and low hanging fruit approach would probably just add a new async API to retrieve page annos. It would probably be trivial to remove the existing getPageAnnotation APIs and provide an async one.
Flags: needinfo?(mak77)
Priority: -- → P2
Regardless, annotations were our target after async history (done) and async bookmarks (just missing tags), so it's on our radar.
Flags: needinfo?(kit)
See Also: 1460577
As other similar cases of jankiness in these days, I suspect bug 1461753 has reduced the likelihood for this to appear, even if it's still a valid bug since we should stop abusing main-thread.
Depends on: 1461753
Depends on: 1468972
Whiteboard: [fxperf] → [fxperf:p3]
No longer blocks: removeAnnotations
I'm working on replacing getCharsetFromURI with a version that will avoid the main thread sync IO, hence taking.
Assignee: nobody → standard8
Blocks: 1468972
Status: NEW → ASSIGNED
No longer depends on: 1468972
Priority: P2 → P1
This also avoids us doing main thread sync io.

MozReview-Commit-ID: KR0p7eeu1sj
Comment on attachment 8991831 [details]
Bug 1462135 - Replace PlacesUtils.getCharsetForURI with PU.history.fetch with an option to fetch annotations.

Marco Bonardo [::mak] has approved the revision.

https://phabricator.services.mozilla.com/D2117
Attachment #8991831 - Flags: review+
Pushed by mbanner@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/da4ff0f7bff5
Replace PlacesUtils.getCharsetForURI with PU.history.fetch with an option to fetch annotations. r=mak
https://hg.mozilla.org/mozilla-central/rev/da4ff0f7bff5
Status: ASSIGNED → RESOLVED
Closed: Last year
Resolution: --- → FIXED
Target Milestone: --- → mozilla63
Is this something which should ride the trains?
Flags: needinfo?(standard8)
note that any main-thread IO can cause 30 seconds (or more) hangs, even a single access. This is just one of the many cases, it's not something that will solve jank by itself.
(In reply to Ryan VanderMeulen [:RyanVM] from comment #12)
> Is this something which should ride the trains?

I think it should ride the trains. As Marco said it is only part of jank reducing efforts. Also as per comment 6, the likelihood of this occurring has been reduced by bug 1461753.

Although the patch isn't too scary, it is really part of a set of improvements where we're removing synchronously access annotations, and I can't quite remember where this one comes, but I suspect it is in the middle.
Flags: needinfo?(standard8)
You need to log in before you can comment on or make changes to this bug.