Regular 20+ second hangs on shutdown with beachball, etc




9 years ago
9 years ago


(Reporter: bz, Assigned: mak)


Mac OS X
Dependency tree / graph
Bug Flags:
blocking1.9.2 -
wanted1.9.2 +

Firefox Tracking Flags

(Not tracked)



(2 obsolete attachments)

Most of the time when I shut down an m-c build that uses my "testing" profile, I get 20+ second hangs with the beachball spinning.  CPU usage seems to be minimal. This is particularly a problem if I happen to be compiling or linking or copying files or anything like that at the time, but happens even on an idle system.

Profiles suggest disk i/o and ExpireHistoryParanoid.

Anything I can provide here that would help with debugging this?  It's being a serious problem for little things like debugging....
Flags: blocking1.9.2?

Comment 1

9 years ago
Are you clearing history on shutdown? Anything special?
I can't understand how can paranoid expiration take so much.
How big is places.sqlite?

Comment 2

9 years ago
could you also check DB integrity with PRAGMA integrity_check; please?
> Are you clearing history on shutdown? Anything special?

Don't think so.

> I can't understand how can paranoid expiration take so much.

It doesn't always; only about 2/3 of the time.  Sometimes it's instant.

> How big is places.sqlite?

26599424 bytes at the moment.

> could you also check DB integrity with PRAGMA integrity_check; please?

How do I go about doing that?

Comment 4

9 years ago
(In reply to comment #3)

> > could you also check DB integrity with PRAGMA integrity_check; please?
> How do I go about doing that?

$ sqlite3 /path/to/profile/places.sqlite

> PRAGMA integrity_check;
sqlite3 ~/Library/Application\ Support/Firefox/Profiles/ankrljc6.testing/places.sqlite
SQLite version
Enter ".help" for instructions
Enter SQL statements terminated with a ";"
sqlite> PRAGMA integrity_check;
Are you getting this on mozilla-1.9.2 as well, bz? I don't think we can block release on this, but obviously we want a fix.
Flags: wanted1.9.2+
Flags: blocking1.9.2?
Flags: blocking1.9.2-
I can try 1.9.2, but I generally don't use those builds... Is it safe to try those builds on a profile I usually use m-c on?
Do you run mochitests in this profile by chance?
Mochitests, no (since those create their own profile).

I do run reftests in it.
Sorry - I was really asking about any set of tests that cycle through a bunch of pages that end up causing a bunch of visits.  You end up adding way more visits than our normal expiration code can get rid of (expire six pages every two minutes).  As a result, we have a lot to expire on shutdown.
But this is happening even if I didn't run reftest during the current run....

Comment 12

9 years ago
well, i'm taking a look at what we are doing, i don't think we can rewrite all expiration for 3.6, but maybe we could make it better with some simple changes to the constants.

Comment 13

9 years ago
assigning to me for small investigation and first look.
The scope is not to rewrite expiration, but to find if we can identify some cause, and provide a first experimental change for 3.6 that we can then inherit in 3.7 expiration rewrite.

As a first step i'm thinking about a really simple auto-adapting algo that can guess if user is an heavy or casual history user, so that we can expire more things during idle runs rather than during shutdown.
Assignee: nobody → mak77

Comment 14

9 years ago
Created attachment 400202 [details] [diff] [review]
wip v0.1

this removes a bunch of useless stuff at shutdown trying to distribute it better.
I still have to hammer ExpireOrphans though, if i should be hosnest i'm not sure we still need it, we could just consider that db cleanup and move it into PlacesDBUtils. Btw, i'll look into it next.

Comment 15

9 years ago
as a thought, i think partially this is due to the fact we now expire some of the visits in DBFlush, but as i said when we implemented this, it won't bother expiring orphan pages. Previous implementation was expiring pages with visits if needed.
We partially do this on idle, but, but we do this heavily on exit, about 5 times more. if the user is not going idle, it could expire up to 500 pages at shutdown.

Comment 16

9 years ago
Created attachment 400615 [details] [diff] [review]
wip 0.2

This also tries to expire some orphan asynchronously, but requires some deep thought about how it is hooking in the actual system (esp. notifications).

I pushed this to the tryserver to get some number.

Boris, in the meantime, if you would like to try this and see if anything changes for your use-case, would be awesome.
Attachment #400202 - Attachment is obsolete: true

Comment 17

9 years ago
this brought some reduction in Tshutdown, even if i would have expected more, for the medium dirty profile the gain is something more than 500ms, for the max dirty profile about 2s, with some fluctuation.
The max dirty profile TShutdown is still around 8s :\
On the other side i've not appreciated any Tp4 movement, and that's positive.

Comment 18

9 years ago
on the other side, ExpireHistoryParanoid, as pure query time, does not go over 250ms here on a quite big db, so looks like the vast majority of time is really spent in disk I/O to flush the journal to the db. For the max dirty profile iirc we are talking about a 200MB database. Btw i'll still do an experimental push to the tryserver along with bug 478718 that simplifies other shutdown paths.

Comment 19

9 years ago
the 2 patches together gave a further imrpovement by 200ms-2s, still a bit high (5-6s), but at this point i'm starting fearing this is just I/O binded.

Comment 20

9 years ago
Comment on attachment 400615 [details] [diff] [review]
wip 0.2

i'll move the patch to a separate bug about reducing expiration work at shutdown, since i have doubts this patch can fix Boris case, it could maybe make it better, but not fix it.
I cannot reproduce it with his db, nor in opt or in debug mode, so really sounds like just bad I/O, for sure needs better profiling.
Attachment #400615 - Attachment is obsolete: true


9 years ago
Depends on: 516940


9 years ago
Depends on: 478718

Comment 21

9 years ago
Boris, can you please tell if you see any improvement with latest trunk (after changeset 123e48c8edf4)
Will do.  I think we should mark this wfm for now, and if I see it again I'll reopen.
Last Resolved: 9 years ago
Resolution: --- → WORKSFORME
You need to log in before you can comment on or make changes to this bug.