Move last Places sync to xpcom-shutdown

RESOLVED FIXED in mozilla1.9.3a1

Status

()

Toolkit
Places
RESOLVED FIXED
9 years ago
8 years ago

People

(Reporter: mak, Assigned: mak)

Tracking

Trunk
mozilla1.9.3a1
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 10 obsolete attachments)

(Assignee)

Description

9 years ago
to be sure everything is correctly synced down, we should try to move the last sync to xpcom-shutdown.
(Assignee)

Comment 1

9 years ago
experimenting on this. it would simplify xpcshell testing a lot, plus the fact actually any xpcshell test using places out of places test folders is leaking/asserting.
Assignee: nobody → mak77
Status: NEW → ASSIGNED
(Assignee)

Updated

9 years ago
Depends on: 508394
(Assignee)

Comment 2

9 years ago
so, this is blocked by 2 issues:
- we leak as soon as any browser test uses a places service, or better, as soon as we notify something. that will cause PlacesDBFlush to initialize, and that's probably what we leak.
- bug 508394. A thread that is serving favicons has already shutdown here, for some reason, and the used async statement cannot be finalized. this does not allow us to close the db connection.
(Assignee)

Updated

9 years ago
Depends on: 507199
(Assignee)

Updated

9 years ago
No longer depends on: 507199
(Assignee)

Comment 3

9 years ago
Created attachment 396562 [details] [diff] [review]
wip v0.1

Comment 4

9 years ago
(In reply to comment #2)
> so, this is blocked by 2 issues:
> - we leak as soon as any browser test uses a places service, or better, as soon
> as we notify something. that will cause PlacesDBFlush to initialize, and that's
> probably what we leak.

Based on my investigation in bug 511735, this is indeed the case.  Specifically, nsPlacesDBFlush caches storage statements, and only finalizes them at xpcom-shutdown.
(Assignee)

Comment 5

9 years ago
i solved the toolkit leaks probably, it was due to the fact PlacesDBFlush was being released too late, i'm removing the observer categories from the category manager and that feels like working.

I'm trying to remove any Places backend dependancy on quit-application, since it is never called in xpcshell tests, and cleanup should be guaranteed by xpcom-shutdown. moving work down the line to xpcom-shutdown does not show any problem so far.

the remaining failing tests are:
- toolkit: test_moz-anno_favicon_mime_type.js (due to bug 508394)
- browser: most tests are leaking something quite similar to nsBrowserGlue
(Assignee)

Comment 6

9 years ago
solved the favicon issue, was due to the fact that test was not notifying anything, so nsPlacesDBFlush was not initing and we were trying to finalize the statement at ~FaviconService, most likely after xpcom-shutdown-threads.
(Assignee)

Comment 7

9 years ago
Created attachment 399148 [details] [diff] [review]
wip v0.2

Needs some clean-up but is mostly working.

the only real Places test failure is toolkit test_404630.js, this is still firing a strange assertion due to async favicons stuff:

WARNING: cannot post event if not initialized: file z:/mozilla-central/netwerk/protocol/http/src/nsHttpConnectionMgr.cpp, line 167
WARNING: nsExceptionService ignoring thread destruction after shutdown: file z:/mozilla-central/xpcom/base/nsExceptionService.cpp, line 194
###!!! ASSERTION: what happened to xpcom-shutdown?: 'mHostFiltersArray.Length() == 0 && mFilters == nsnull && mPACMan == nsnull', file z:/mozilla-central/netwerk/base/src/nsProtocolProxyService.cpp, line 319
necko!nsProtocolProxyService::`scalar deleting destructor'+0x000000000000000F
necko!nsProtocolProxyService::Release+0x00000000000000D4 (z:\mozilla-central\netwerk\base\src\nsprotocolproxyservice.cpp, line 290)
xpcom_core!nsCOMPtr_base::assign_assuming_AddRef+0x0000000000000059 (z:\mozilla-central\obj-i686-pc-mingw32\browser\dist\include\nscomptr.h, line 457)
xpcom_core!nsCOMPtr_base::assign_with_AddRef+0x000000000000002C (z:\mozilla-central\obj-i686-pc-mingw32\browser\xpcom\build\nscomptr.cpp, line 90)
xpcom_core!nsCOMPtr<nsISupports>::operator=+0x0000000000000012 (z:\mozilla-central\obj-i686-pc-mingw32\browser\dist\include\nscomptr.h, line 976)
xpcom_core!FreeServiceContractIDEntryEnumerate+0x000000000000002F (z:\mozilla-central\xpcom\components\nscomponentmanager.cpp, line 1736)
xpcom_core!PL_DHashTableEnumerate+0x00000000000000F8 (z:\mozilla-central\obj-i686-pc-mingw32\browser\xpcom\build\pldhash.c, line 754)
xpcom_core!nsComponentManagerImpl::FreeServices+0x000000000000005E (z:\mozilla-central\xpcom\components\nscomponentmanager.cpp, line 1748)
xpcom_core!mozilla::ShutdownXPCOM+0x000000000000026C (z:\mozilla-central\xpcom\build\nsxpcominit.cpp, line 831)
xpcom_core!NS_ShutdownXPCOM_P+0x000000000000000C (z:\mozilla-central\xpcom\build\nsxpcominit.cpp, line 744)
xpcom!NS_ShutdownXPCOM+0x000000000000000D (z:\mozilla-central\xpcom\stub\nsxpcomstub.cpp, line 179)
xpcshell!main+0x0000000000000CED (z:\mozilla-central\js\src\xpconnect\shell\xpcshell.cpp, line 1795)
xpcshell!__tmainCRTStartup+0x00000000000001A6 (f:\rtm\vctools\crt_bld\self_x86\crt\src\crtexe.c, line 586)
xpcshell!mainCRTStartup+0x000000000000000D (f:\rtm\vctools\crt_bld\self_x86\crt\src\crtexe.c, line 403)
Attachment #396562 - Attachment is obsolete: true
(Assignee)

Comment 8

9 years ago
the nsProtocolProxyService xpcom-shutdown oberver is added in nsProtocolProxyService::Init, sounds like the test is so short that init happens just after xpcom-shutdowm, and the destructor runs just after that.
(Assignee)

Comment 9

9 years ago
breaking the mainthread did help, just added a 0 timeout to the test. These situations are hard to be seen in real cases.
(Assignee)

Comment 10

9 years ago
Created attachment 399207 [details] [diff] [review]
patch v1.0

I've pushed this to the tryserver, Shawn, after vacuum review, if you want to give a first-review of this, would be awesome.
Attachment #399148 - Attachment is obsolete: true
Attachment #399207 - Flags: review?(sdwilsh)
(Assignee)

Comment 11

9 years ago
hm oh well actually i did a wrong edit to test_404630.js, i forgot to make it pending so i practically disabled it, sigh.
(Assignee)

Comment 12

9 years ago
tryservers did signal a crash in
test_database_sync_after_quit_application.js | application crashed (minidump found), actually trying to get the stack hoping tryservers have this feature.

and another one timed out in
test_database_sync_after_quit_application_with_removeAllPages.js

it is pretty awesome i did not hit any of these 2 locally, but i'm used to that at least.
Comment on attachment 399207 [details] [diff] [review]
patch v1.0

For more detailed review comments, please see http://reviews.visophyte.org/r/399207/

on file: browser/components/places/tests/unit/test_browserGlue_prefs.js line 232
>   dump("\nTEST " + (++testIndex) + ": " + test.description + "\n");

just change this to print


on file: browser/components/places/tests/unit/test_browserGlue_smartBookmarks.js line 189
> /*
>   dump("Found Children for folder with itemId " + aFolderItemId + ":\n");
>   for (let i = 0; i < cc; i++) {
>     let node = rootNode.getChild(i);
>     dump(i + ") " + node.title + "\n");
>   }
> */

should remove this in your final patch


on file: browser/components/places/tests/unit/test_browserGlue_smartBookmarks.js line 215
>   dump("\nTEST " + (++testIndex) + ": " + test.description + "\n");

print


on file: toolkit/components/places/public/nsPIPlacesDatabase.idl line 49
> [scriptable, uuid(8e6d4f8a-4b8e-4026-9fca-517c4494ddb7)]

We need to modify this uuid because of changes.


on file: toolkit/components/places/src/nsNavHistory.h line 437
>   nsresult FinalizeInternalStatements();

nit: NS_HIDDEN_(nsresult)


on file: toolkit/components/places/src/nsNavHistory.h line 444
>   nsresult CommitPendingChanges();

nit: NS_HIDDEN_(nsresult)


on file: toolkit/components/places/src/nsNavHistory.cpp line 5574
>     // need it for a final flush.  It could also be useful for tests.

drop the "It could be useful..." comment


on file: toolkit/components/places/src/nsPlacesDBFlush.js line 484
>     { category: "bookmark-observers", entry: "nsPlacesDBFlush"},
>     { category: "history-observers", entry: "nsPlacesDBFlush" },

I would prefer if we formatted this like this:
{ category: "bookmark-observers",
  entry: "nsPlacesDBFlush" },
...


on file: toolkit/components/places/src/nsPlacesMacros.h line 45
>     ENUMERATE_WEAKARRAY(array, type, method)                                   \
>     const nsCOMArray<type> &entries = cache.GetEntries();                      \
>     for (PRInt32 idx = 0; idx < entries.Count(); ++idx)                        \
>         entries[idx]->method;                                                  \

Can you explain this change please?  Why do we want to enumerate observers and
then categories?


on file: toolkit/components/places/tests/sync/test_database_sync_after_quit_application.js line 90
>   os.notifyObservers(null, "xpcom-shutdown", null);

This test's filename should be changed too!  This applies to the other tests
as well!


on file: toolkit/components/places/tests/unit/test_404630.js line 62
>   do_timeout(0, run);

explain this please (in commment in code)
Attachment #399207 - Flags: review?(sdwilsh)
(Assignee)

Comment 14

9 years ago
(In reply to comment #13)

> on file: toolkit/components/places/src/nsPlacesMacros.h line 45
> >     ENUMERATE_WEAKARRAY(array, type, method)                                   \
> >     const nsCOMArray<type> &entries = cache.GetEntries();                      \
> >     for (PRInt32 idx = 0; idx < entries.Count(); ++idx)                        \
> >         entries[idx]->method;                                                  \
> 
> Can you explain this change please?  Why do we want to enumerate observers and
> then categories?

I'm just trying to reduce orangeness, this is an alternative path to bug 511965, even if i think won't change much, but could be worth trying.

I solved the problem with favicons on shutdown discarding lazy messages. nsProtocolProxyService shutdowns at xpcom-shutdown, we can't rely on it, so i'm just dropping favicons lazy messages if we are shutting down.
(Assignee)

Comment 15

9 years ago
Created attachment 400742 [details] [diff] [review]
patch v1.1
Attachment #399207 - Attachment is obsolete: true
(Assignee)

Comment 16

9 years ago
PS: this builds on top of patch in bug 515952.
(Assignee)

Comment 17

9 years ago
Created attachment 402094 [details] [diff] [review]
patch v1.2

update, builds on top of Bug 516940 and some other patch expected to land before.
Attachment #400742 - Attachment is obsolete: true
(Assignee)

Comment 18

9 years ago
Created attachment 403845 [details] [diff] [review]
patch v1.3

my patches queue is getting better, unbitrot and cleanup.

i still have one issue to solve, the two sync/test_database_sync_after_shutdownXXX tests are crashing or timing out without apparent reasons, i suppose the fact i have to notify xpcom-shutdown manually is somehow involved, but i can't reproduce locally, so i have to use tryservers till i figure out what's up... sigh.
Attachment #402094 - Attachment is obsolete: true
(Assignee)

Comment 19

9 years ago
Created attachment 405268 [details] [diff] [review]
patch v1.4

one last detail to check, but Tryserver finally gave me a green!
Attachment #403845 - Attachment is obsolete: true
(Assignee)

Updated

9 years ago
Whiteboard: [builds on top of bug 516940]
(Assignee)

Comment 20

9 years ago
Created attachment 405499 [details] [diff] [review]
patch v1.5

this should be fine
Attachment #405268 - Attachment is obsolete: true
Attachment #405499 - Flags: review?(sdwilsh)
(Assignee)

Comment 21

9 years ago
test_database_sync_after_shutdown_with_removeAllPages is failing just on windows, not sure about the reason, on my vm i had a threading issues that caused an additive flush to be sent (onItemAdded called twice instead of once).
Apart that this is in good shape for a review i can keep investigate that.
(Assignee)

Updated

9 years ago
Whiteboard: [builds on top of bug 516940]
Comment on attachment 405499 [details] [diff] [review]
patch v1.5

http://reviews.visophyte.org/r/405499/

on file: toolkit/components/places/src/nsNavHistory.h line 589
>   void CommitLazyMessages(PRBool aIsShutdown = PR_FALSE);

NS_HIDDEN_(void) since you are here please


on file: toolkit/components/places/src/nsNavHistory.cpp line 5913
> void
> nsNavHistory::CommitLazyMessages(PRBool aIsShutdown)

likewise, NS_HIDDEN_(void)_ here too.


r=sdwilsh
Attachment #405499 - Flags: review?(sdwilsh) → review+
(Assignee)

Comment 23

9 years ago
Created attachment 405899 [details] [diff] [review]
patch v1.6

Sounds like Windows did not love my category manager registration change, it is probably due to persistence (bug 364864?).
Btw, just removing the new category entry "entry" value fixes all issues, so my problem was most likely due to an old catman registered sync component that was serving duplicate flushes (so i was getting 2 flushes for onItemAdded). Odd. And this was just on Windows (both my vm and the tryserver), while Linux and OS X were behaving correctly.
I just got rid of the "entry" description going back to the old registration and let xpcomUtils cleanup the entry on shutdown, while i use the new syncStopped to ensure we don't try to sync after shutdown.
Attachment #405499 - Attachment is obsolete: true
(Assignee)

Comment 24

9 years ago
Created attachment 405916 [details] [diff] [review]
patch v1.7

better, this still removes the category observer. Notice i also remove the category in history service, just to be sure we don't leak if somebody registers as an observer and forgets to unregister. Previous version was leaking, this fixes the leak. Tested on win and lin.
And should be final version to push.
Attachment #405899 - Attachment is obsolete: true
(Assignee)

Comment 25

9 years ago
Created attachment 406199 [details] [diff] [review]
patch v1.8

fixed test orange with a polling strategy. me 1 tryserver 0
Attachment #405916 - Attachment is obsolete: true
(Assignee)

Comment 26

9 years ago
crossing fingers
http://hg.mozilla.org/mozilla-central/rev/123e48c8edf4
Status: ASSIGNED → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla1.9.3a1
(Assignee)

Updated

9 years ago
Blocks: 515952
Given that we're deliberately playing with shutdown timings, this may not come as a surprise.

Nevertheless, this change is showing a 8.5% Ts Shutdown regression on Vista, but big wins on dirty profile shutdowns on Mac and Linux so far. Is this a trade-off we're making deliberately, or is the elevated Vista TShutdown cause for alarm?
(Assignee)

Comment 28

9 years ago
it's a bit unclear, previous changes did show a large improvement on Ts shutdown dirty profiles on Vista so this 8.5% on Vista comes a bit surprising.
There are no changes that could hit Vista more than other OSes.
The vista data is noisy - this might take a couple more cycles to shake out before we call it an honest to goodness regression.
(Assignee)

Comment 30

9 years ago
i have to revert a change to the fact i delete categories and sdwilsh pointed out components are cached and that could hurt, not sure if it will change anything. Btw will push it soon.

I also need more data abour what Ts Shutdown is exactly measuring, from when to when.
(Assignee)

Comment 31

9 years ago
pushed additional changeset to avoid removing categories on shutdown, as per irc conversation with sdwilsh.
http://hg.mozilla.org/mozilla-central/rev/9c466d6f82f1

Now i see a 61% regression on XP MAX dirty, it is still largely below the value before bug 516940 and this, but there's lot of strange noise around these tests.
I don't actually understand this win.  We aren't doing any less work on shutdown, are we?  We just moved the work we did in quit-application to xpcom-shutdown.
(Assignee)

Comment 33

9 years ago
yes, apart discarding lazy favicons messages (async channels opened after xpcom-shutdown cleanup) most of the stuff is just moving things later.

per irc discussion i will also push a change to make deleteCategoryEntry not persistent and file a bug to check why we leak if we don't remove the entry manually (the category cache should already clean that up on xpcom-shutdown).
(Assignee)

Comment 34

9 years ago
better checking https://developer.mozilla.org/en/XPCOM_Interface_Reference/nsICategoryManager/deleteCategory

looks like deleteCategory is safe, since it won't persist the removal across sessions

DeleteCategoryEntry instead is not safe, but thanks to bug 364864 it did not hurt us.

since nsICategoryManager is frozen and won't change in future, we could add back the deleteCategory calls in historyService and get rid of deleteCategoryEntry calls in DBFlush.

So 2 options:
- remove DeleteCategoryEntry calls and add back DeleteCategory
- make DeleteCategoryEntry calls get a false aPersist param
(Assignee)

Comment 35

9 years ago
even if following code, DeleteCategory calls mTable.clear(), that removes entries from the hash table, instead DeleteCategoryEntry with aPersist=false, if the entry is a persistent one it sets leaf->nonpValue = nsnull; (nonpersistentvalue), if it's not a persistent entry it removes it from the hash...

So looks like documentation could be wrong, but this code is a bit messed up.
(In reply to comment #34)
> - remove DeleteCategoryEntry calls and add back DeleteCategory
I would prefer this.

Reopening to track that this isn't done.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 37

9 years ago
actually the leak is due to the fact for some reason in tests we can end up notifying PlacesDBFlush for the first time on xpcom-shutdown or later (i suppose during).

nsCategoryCache empties its local cache at xpcom-shutdown, if the category manager sends to the categoryCache an addEntry notification after this cleanup, it will be added to the local cache and never removed, causing the leak.

I'm filing a bug against xpcom, since the code allows this possibility, but probably we also have to follow the stack and see what is adding a category manager entry after xpcom-shutdown.
(Assignee)

Comment 38

9 years ago
filed Bug 522353
Depends on: 522353
(Assignee)

Comment 39

9 years ago
Created attachment 406321 [details] [diff] [review]
be safe

i think while i investigate bug 522353, we should take this.
I prefer this rather than add back deleteCategory because these calls are really specific about requests, the third param specifies we don't want to persist the change.
DeleteCategory as well won't persist the change, but that is only documented on mdc, the idl does not say anything, and the code is following a different path from deleteCategoryEntry(persist=false). So i don't trust too much into that method.

there is no rush due to bug 364864, but i'd really prefer to put this in before next nightly, just to be safe.
Attachment #406321 - Flags: review?(sdwilsh)

Updated

9 years ago
Attachment #406321 - Flags: review?(sdwilsh) → review+
Comment on attachment 406321 [details] [diff] [review]
be safe

I'd really rather us just back this out until the issues are fixed, but this is fine as a stopgap.
(Assignee)

Comment 41

9 years ago
well i did some deeper analysis in bug 522353, and the reason for the leak is quite clear. So backout won't give back anything really useful, will probably just make harder to try understanding Ts Shutdown roller-coaster behavior.
(Assignee)

Comment 42

9 years ago
pushed.
http://hg.mozilla.org/mozilla-central/rev/1a780ff55536

Now, i'll wait tomorrow to check bug 522353 with Biesi.
(Assignee)

Comment 43

8 years ago
i pushed bug 522353 and removed the workaround in nsPlacesDBFlush
http://hg.mozilla.org/mozilla-central/rev/5b631f59dda3
Status: REOPENED → RESOLVED
Last Resolved: 9 years ago8 years ago
Resolution: --- → FIXED
(Assignee)

Updated

8 years ago
Depends on: 529821
You need to log in before you can comment on or make changes to this bug.