Open Bug 1834981 Opened 11 months ago Updated 11 months ago

Hang on tab creation after an unspecified amount of time - SQLite Mutex Issue

Categories

(Toolkit :: Places, defect, P3)

Firefox 113
defect

UNCONFIRMED

People

(Reporter: nini, Unassigned)

References

(Depends on 1 open bug)

Details

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:109.0) Gecko/20100101 Firefox/113.0

Steps to reproduce:

Go to any website and wait 10 to 20 minutes (this is crucial, since tabs open normally right after browser start). Then try to open a new tab with the mouse or with Ctrl+T. Firefox hangs for a while and the browser becomes unresponsive (frozen).

113.0.2 (64-bit) Windows.

Actual results:

The browser hangs/freezes for several minutes.

https://share.firefox.dev/426IYTO

Apparently the problem lies in SQLite mutexes.

Expected results:

The browser should have opened a new tab in a timely manner.

Performance Impact: --- → ?

The flame graph shows that the main thread is spending most of its time blocked in sqlite3_mutex_enter() on a database-level mutex, reached from a call to vdbeUnbind(). The call is being made from nsNavBookmarks::FetchItemInfo via an entry point of nsINavHistoryService, so I'm moving this to Places. (I understand this was filed in the current location because of discussion in https://chat.mozilla.org/#/room/#perf:mozilla.org where :hiro suggested filing it under Core::Internationalization, but the underlying problem seems to be in Places.)

Presumably something long-running is happening on Places' async execution thread. It may be necessary to gather another profile with the thread pattern "mozStorage" included, which would include at least our explicit async execution threads. I'm not sure if Places also would use other threads or not.
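To illustrate the kind of contention described above, here is a minimal sketch in Python. It is a toy model under stated assumptions, not Places or SQLite code: `db_mutex` stands in for the database-level mutex, `helper_thread_work` for a long-running statement on an async execution thread, and `main_thread_bind` for the main-thread call that ends up waiting. All of these names are hypothetical.

```python
import threading
import time

# Hypothetical stand-in for SQLite's database-level mutex (not real Places code).
db_mutex = threading.Lock()
helper_has_lock = threading.Event()


def helper_thread_work(hold_seconds: float) -> None:
    """Simulate a long-running statement on the async execution thread."""
    with db_mutex:
        helper_has_lock.set()      # signal that the mutex is now held
        time.sleep(hold_seconds)   # the "slow query"


def main_thread_bind() -> float:
    """Simulate a main-thread call needing the same mutex; return seconds blocked."""
    start = time.monotonic()
    with db_mutex:
        pass  # the actual work is trivial; the cost is the wait
    return time.monotonic() - start


def simulate(hold_seconds: float = 0.2) -> float:
    """Run one helper/main interaction and report how long the main thread waited."""
    helper_has_lock.clear()
    helper = threading.Thread(target=helper_thread_work, args=(hold_seconds,))
    helper.start()
    helper_has_lock.wait()         # make sure the helper holds the mutex first
    blocked = main_thread_bind()
    helper.join()
    return blocked


if __name__ == "__main__":
    print(f"main thread blocked for {simulate():.3f}s")
```

In this model the main thread's wait scales with however long the helper holds the lock, which is why a profile that includes the helper thread would be needed to see what was actually slow.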

Component: Internationalization → Places
Product: Core → Toolkit

Not too sure where this bug originated from, but I stopped experiencing it out of the blue. No updates were made, so it's very strange.

The severity field is not set for this bug.
:mak, could you have a look please?

Flags: needinfo?(mak)

I can only confirm that this is a Places view updating on the main thread, while something else is executing some long-running task on the helper thread. Without all the threads we can't tell what was running on the helper thread, so this may end up being resolved as incomplete without a complete profile.

That said, this is unfortunately a known issue due to running queries on 2 different threads, for which we have a multi-year effort ongoing to remove all the main-thread usage, in bug 975979. There are a few remaining parts to fix: frontend views and tags. The former is currently being worked on, but will still require some time.
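As a rough sketch of the general idea behind moving database access off the main thread (an illustration only, not how Places implements it), the Python model below routes every query through a single worker. The main thread only submits work and receives a future, so it never waits on a lock held by a slow query. The names `db_worker`, `slow_query`, and `fetch_item_info` are hypothetical.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Tuple

# Hypothetical: a single worker owns all database access.
db_worker = ThreadPoolExecutor(max_workers=1)


def slow_query(seconds: float) -> str:
    """Stand-in for a long-running statement."""
    time.sleep(seconds)
    return "done"


def fetch_item_info(item_id: int) -> dict:
    """Stand-in for a cheap lookup that a view would request."""
    return {"id": item_id, "title": f"bookmark {item_id}"}


def submit_from_main_thread(seconds: float = 0.2) -> Tuple[Future, Future, float]:
    """Queue a slow query, then a lookup; return both futures and the submit cost."""
    slow = db_worker.submit(slow_query, seconds)
    start = time.monotonic()
    info = db_worker.submit(fetch_item_info, 42)  # queued behind the slow query
    submit_cost = time.monotonic() - start        # time the caller actually spent
    return slow, info, submit_cost


if __name__ == "__main__":
    slow, info, cost = submit_from_main_thread()
    print(f"submit took {cost:.4f}s; result: {info.result()}")
```

The lookup still completes only after the slow query finishes, but the submitting thread stays free in the meantime, which is the property that matters for UI responsiveness.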

If you are able in the future to get a complete perf profile with all the threads, maybe we could find what was being particularly slow. IIRC there is a "Record all the threads" option in the profiler.

Severity: -- → S3
Depends on: OMTPlaces
Flags: needinfo?(mak)
Priority: -- → P3

Removing this from perf triage as it's a permahang, which is a correctness issue and not a perf issue.

Performance Impact: ? → ---