Last Comment Bug 758530 - Sync does not initialize automatically anymore
: Sync does not initialize automatically anymore
Status: VERIFIED FIXED
[qa+]
: regression
Product: SeaMonkey
Classification: Client Software
Component: Sync UI (show other bugs)
: Trunk
: All All
: -- critical (vote)
: seamonkey2.13
Assigned To: Jens Hatlak (:InvisibleSmiley)
:
Mentors:
Depends on: 725408 757320
Blocks: SM2.11b1
  Show dependency treegraph
 
Reported: 2012-05-25 01:25 PDT by Jens Hatlak (:InvisibleSmiley)
Modified: 2012-07-12 08:13 PDT (History)
11 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
-
verified
fixed
unaffected
+
verified
+
verified


Attachments
patch [Checkin: Comments 25, 28, 29] (2.19 KB, patch)
2012-06-04 12:56 PDT, Jens Hatlak (:InvisibleSmiley)
benjamin: review+
akeybl: approval‑mozilla‑aurora+
akeybl: approval‑mozilla‑beta+
Details | Diff | Splinter Review

Description Jens Hatlak (:InvisibleSmiley) 2012-05-25 01:25:27 PDT
If you have a profile with Sync set up and use it with current trunk or Aurora, Sync does not init anymore (i.e. SM won't start to sync some seconds after startup, like it used to - best seen with the Sync button added to some toolbar: it should start to spin).

Error Console output when clicking the Sync button or using Tools/Sync Now:

Error: TypeError: this._log is undefined
Source File: resource:///modules/services-sync/policies.js
Line: 639

Note that the above (which is in shared back-end code) is only a symptom: Sync is not initialized and so is the _log variable.

Firefox is not affected, so this must have been broken by a change on our side (c-c code).

Last known good: 04-17-00-03-26
First known bad: 04-17-18-08-55

Within that time frame I see two candidates:
* 3fdc5f308109 (bug 743692, affecting nsSuiteGlue.js)
* 74679c8cb51d and 5c06bd39f172 (bug 723138, setting MOZ_APP_MAXVERSION)

I'd bet on the first one but included the other for completeness (I'm not an expert in the field there). CC'ing people accordingly.

Of course it's possible that the root cause is something completely different. :-/

For proper testing (esp. regression window determination like I did), use a profile with Sync set up and start it with a SM version that works (e.g. 2.10b2) once (and let it sync) before doing any test with a possibly broken build.
Comment 1 Gregory Szorc [:gps] 2012-05-25 01:47:14 PDT
Did SM import services/common? A lot of Sync code has been refactored out of services/sync and into services/common. I would think that if you are missing this that you would get an error when importing services-sync modules, however. Just `hg log services/common` from m-c to identify the changesets.
Comment 2 Serge Gautherie (:sgautherie) 2012-05-25 02:18:28 PDT
(In reply to Gregory Szorc [:gps] from comment #1)
> Did SM import services/common?

Yes it does: bug 744702.
Comment 3 Jens Hatlak (:InvisibleSmiley) 2012-05-25 02:43:38 PDT
(In reply to Gregory Szorc [:gps] from comment #1)
> A lot of Sync code has been refactored out of
> services/sync and into services/common.

Like Serge said, we had already ported that change before this issue appeared. It was also my first guess that the regression was due to refactorings or changes on m-c, but the regression window suggests that it must have been a c-c change that introduced it. Also I didn't find any suspicious m-c landings in the time frame.
Comment 4 neil@parkwaycc.co.uk 2012-05-25 02:47:13 PDT
Sync initialises when service.js is first imported (one way of achieving this is to import main.js and access Weave.Service). However I don't know why we aren't doing this automatically any more.
Comment 5 Philip Chee 2012-05-25 05:32:56 PDT
> * 3fdc5f308109 (bug 743692, affecting nsSuiteGlue.js)
If you back this out and rebuild SM does Sync start to work again?

Also:
Bug 758582 Port |Bug 602876 - Implement network client for credentials exchange via J-PAKE| to SeaMonkey
Bug 758578 Port |Bug 609421 - Combine various JSMs| to SeaMonkey
Bug 758575 Port |Bug 636402 - Simplify Sync.js to avoid creating new objects| to SeaMonkey
Bug 758574 Port |Bug 662178 - Simplify timed callbacks| to SeaMonkey
Bug 758573 Port |Bug 669547 - Fix AsyncResource (or provide an alternative) to have better callback semantics| to SeaMonkey
Bug 758571 Port |Bug 730989 - Refactor identity and authentication management| to SeaMonkey
Bug 758567 Port |Bug 706545 - Implement a sync engine for apps exposed by navigator.mozApps| to SeaMonkey
Bug 758564 Port |Bug 562431 - Rewrite WeaveCrypto as a .jsm| to SeaMonkey
Bug 758554 Port |Bug 727210 - Implement token server client for Sync| to SeaMonkey
Bug 758552 Port |Bug 745396 - Refactor utils functions into {Common,Crypto}Utils| to SeaMonkey
Comment 6 Serge Gautherie (:sgautherie) 2012-05-25 06:57:05 PDT
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #0)

> Within that time frame I see two candidates:
> * 3fdc5f308109 (bug 743692, affecting nsSuiteGlue.js)
> * 74679c8cb51d and 5c06bd39f172 (bug 723138, setting MOZ_APP_MAXVERSION)

At first glance, they don't seem related to Sync at all to me :-|

> Of course it's possible that the root cause is something completely
> different. :-/
> 
> For proper testing (esp. regression window determination like I did), use a
> profile with Sync set up and start it with a SM version that works (e.g.
> 2.10b2) once (and let it sync) before doing any test with a possibly broken
> build.

The other option is to try to debug where/why Sync fails to init.
Comment 7 Jens Hatlak (:InvisibleSmiley) 2012-05-31 12:52:15 PDT
Small update (no real progress yet):

* Changeset 3fdc5f308109 (bug 743692, nsSuiteGlue.js) is not so easy to revert since at least changeset d04fc59c787f (same bug) landed on top of it. Didn't go that far yet.

* Reverting bug 723138 (setting MOZ_APP_MAXVERSION) didn't seem to make a change, but maybe I missed some step like recreating configure (otherwise did a clean build, though).

* Opening the Go menu once initializes Sync. I guess this supports Neil's claim from comment 4 that accessing Weave.Service triggers the Sync init (Weave.Service is accessed in toggleTabsFromOtherComputers(), which is attached to the Go menu init through functions updateGoMenu() and FillHistoryMenu()). Firefox has similar code in similar locations (compared MXR search results for "Weave.Service" below /suite/ vs. /browser/). I just wonder how the main Sync init (independent from any popupshowing handlers) happens for Firefox then. I guess there are multiple ways to init Sync. Gregory, any idea?

I want to remind everyone that this bug is also on Aurora, which will become Beta next week. Some time after that we'll prepare 2.11b1. That's when Sync will break for all our beta testers (which includes me BTW) - unless we manage to fix this bug until then!

If we don't see daylight I might just add "Weave.Service;" to utilityOverlay.js. ;-)
Comment 8 Serge Gautherie (:sgautherie) 2012-05-31 14:08:08 PDT
(In reply to Serge Gautherie (:sgautherie) from comment #6)
> The other option is to try to debug where/why Sync fails to init.

Or how it used to successfully init before the regression...
Comment 9 Serge Gautherie (:sgautherie) 2012-05-31 14:15:43 PDT
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #7)
> * Changeset 3fdc5f308109 (bug 743692, nsSuiteGlue.js) is not so easy to
> revert since at least changeset d04fc59c787f (same bug) landed on top of it.
> Didn't go that far yet.

That's really what you should check.
Or you could apply that patch on top of a non-regressed build...
Comment 10 Serge Gautherie (:sgautherie) 2012-05-31 14:23:15 PDT
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #0)

> If you have a profile with Sync set up and use it with current trunk or
> Aurora, Sync does not init anymore (i.e. SM won't start to sync some seconds
> after startup, like it used to - best seen with the Sync button added to
> some toolbar: it should start to spin).

Does anyone else use (SM) Sync? Looking for a confirmation, fwiw...

> Last known good: 04-17-00-03-26
> First known bad: 04-17-18-08-55

Could you post the exact builds+revs and related hg log timeframes?
Comment 11 H. Hofer 2012-06-01 02:11:16 PDT
(In reply to Serge Gautherie (:sgautherie) from comment #10)
> Does anyone else use (SM) Sync? Looking for a confirmation, fwiw...

Yep! And I can confirm this, too (SM Win Nightly).

> Could you post the exact builds+revs and related hg log timeframes?

Unfortunately not. I wasn't home when this Problem started showing, so
I don't have the builds available.

Regards!
Comment 12 Serge Gautherie (:sgautherie) 2012-06-01 03:35:22 PDT
(In reply to H. Hofer from comment #11)
> Yep! And I can confirm this, too (SM Win Nightly).

(Thanks.)

> Unfortunately not. I wasn't home when this Problem started showing, so
> I don't have the builds available.

Yeah, I was asking Jens...
Fyi, older nightlies are available on ftp.
Comment 13 Sven Grull 2012-06-01 08:38:30 PDT
(In reply to Serge Gautherie (:sgautherie) from comment #10)

> Does anyone else use (SM) Sync? Looking for a confirmation, fwiw...

I see this on Win XP trunk builds too.
Comment 14 Jens Hatlak (:InvisibleSmiley) 2012-06-01 13:39:39 PDT
So, still no clue, but some more facts:

* Did a local backout of c-c changesets d04fc59c787f and 3fdc5f308109 (bug 743692, affecting nsSuiteGlue.js). No change, so I guess we can rule this one out (phew!).

Went back to the builds I had tried. The good news is that the builds I identified are definitely correct (just had a small typo there: last known good is 04-17-00-30-26, not 04-17-00-03-26). Unfortunately we /still/ don't have the c-c changeset in about:buildconfig but only the m-c one. Here goes (all tested with the same profile, Console2 add-on installed with settings: Errors+Warnings, JS+XML, Chrome+Resource; Options: Report Strict Warnings, Report All JS Exceptions):

Good:
Build ID: 20120417003026
Build Machine: sea-win32-02
Built from http://hg.mozilla.org/mozilla-central/rev/c61e7c3a232a
Console2 contains seemingly unrelated entries, Sync button spins after ~15sec

Bad 1:
Build ID: 20120417180855
Build Machine: sea-w2k3-par-01
Built from: http://hg.mozilla.org/mozilla-central/rev/93dfd98900ad
Console2 is completely empty no matter what you do - broken build or PEBKAC?

Bad 2:
Build ID: 20120418003018
Build Machine: sea-w2k3-par-03
Built from: http://hg.mozilla.org/mozilla-central/rev/35e13f42ee8a
Console2 contains seemingly unrelated entries, plus this._log error if Sync button clicked (doesn't spin)

Now if we look at the dates of m-c changesets c61e7c3a232a and 93dfd98900ad we end up with a regression window of 2012-04-16 13:00:00 to 2012-04-17 18:00:00 (I think that's PDT - actually whatever http://hg.mozilla.org/comm-central/pushloghtml likes to have for From = startdate / To = enddate). Assuming the same for c-c we get:

http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2012-04-16+13%3A00%3A00&enddate=2012-04-17+18%3A00%3A00
http://hg.mozilla.org/comm-central/pushloghtml?startdate=2012-04-16+13%3A00%3A00&enddate=2012-04-17+18%3A00%3A00

(for changeset 35e13f42ee8a, insert 2012-04-18 19:00:00)

Couldn't find anything new/suspicious landing on c-c in the above time frames so maybe we have to take a closer look at m-c changes. Unfortunately that's too much for me now, feel free to jump in!

Callek: The different build machines do not suggest a possible issue I hope? Just trying to make sure we don't miss anything (like releng changes happening in that time frame).

So much for today...
Comment 15 Justin Wood (:Callek) (Away until Aug 29) 2012-06-01 14:16:59 PDT
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #14)
> So, still no clue, but some more facts:
> 
> * Did a local backout of c-c changesets d04fc59c787f and 3fdc5f308109 (bug
> 743692, affecting nsSuiteGlue.js). No change, so I guess we can rule this
> one out (phew!).
> 
> Went back to the builds I had tried. The good news is that the builds I
> identified are definitely correct (just had a small typo there: last known
> good is 04-17-00-30-26, not 04-17-00-03-26). Unfortunately we /still/ don't
> have the c-c changeset in about:buildconfig but only the m-c one. Here goes
> (all tested with the same profile, Console2 add-on installed with settings:
> Errors+Warnings, JS+XML, Chrome+Resource; Options: Report Strict Warnings,
> Report All JS Exceptions):
> 
> Good:
> Build ID: 20120417003026
> Build Machine: sea-win32-02
> Built from http://hg.mozilla.org/mozilla-central/rev/c61e7c3a232a
> Console2 contains seemingly unrelated entries, Sync button spins after ~15sec

http://ftp.mozilla.org/pub/mozilla.org/seamonkey/nightly/2012/04/2012-04-17-00-30-26-comm-central-trunk/seamonkey-2.11a1.en-US.win32.txt

20120417003026
http://hg.mozilla.org/mozilla-central/rev/c61e7c3a232a
http://hg.mozilla.org/comm-central/rev/6396ccf7405a

> Bad 1:
> Build ID: 20120417180855
> Build Machine: sea-w2k3-par-01
> Built from: http://hg.mozilla.org/mozilla-central/rev/93dfd98900ad
> Console2 is completely empty no matter what you do - broken build or PEBKAC?

http://ftp.mozilla.org/pub/mozilla.org/seamonkey/nightly/2012/04/2012-04-17-18-08-55-comm-central-trunk/seamonkey-2.11a1.en-US.win32.txt

20120417180855
http://hg.mozilla.org/mozilla-central/rev/93dfd98900ad
http://hg.mozilla.org/comm-central/rev/5e80e777ae2d

> Bad 2:
> Build ID: 20120418003018
> Build Machine: sea-w2k3-par-03
> Built from: http://hg.mozilla.org/mozilla-central/rev/35e13f42ee8a
> Console2 contains seemingly unrelated entries, plus this._log error if Sync
> button clicked (doesn't spin)

http://ftp.mozilla.org/pub/mozilla.org/seamonkey/nightly/2012/04/2012-04-18-00-30-18-comm-central-trunk/seamonkey-2.11a1.en-US.win32.txt

20120418003018
http://hg.mozilla.org/mozilla-central/rev/35e13f42ee8a
http://hg.mozilla.org/comm-central/rev/56ea15c0e0a2

> http://hg.mozilla.org/mozilla-central/pushloghtml?startdate=2012-04-
> 16+13%3A00%3A00&enddate=2012-04-17+18%3A00%3A00
> http://hg.mozilla.org/comm-central/pushloghtml?startdate=2012-04-
> 16+13%3A00%3A00&enddate=2012-04-17+18%3A00%3A00

So range is:
http://hg.mozilla.org/comm-central/pushloghtml?fromchange=6396ccf7405a&tochange=5e80e777ae2d

m-c range is much larger:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c61e7c3a232a&tochange=93dfd98900ad

> Callek: The different build machines do not suggest a possible issue I hope?
> Just trying to make sure we don't miss anything (like releng changes
> happening in that time frame).

I would _very_ much doubt that being an issue here.
Comment 16 Jens Hatlak (:InvisibleSmiley) 2012-06-01 23:07:29 PDT
Hah! I think I found it: http://hg.mozilla.org/mozilla-central/rev/ef55c163a23a (author Myk Melez, reviewer Benjamin Smedberg)

Now look at this: http://hg.mozilla.org/mozilla-central/diff/ef55c163a23a/services/sync/SyncComponents.manifest

This is gross. Didn't it appear to anyone that services/ is similar to toolkit/? I mean, hard-coding app IDs in shared code? Hello?

I wonder if there's any way to override this from c-c land at all.
Comment 17 Justin Wood (:Callek) (Away until Aug 29) 2012-06-01 23:58:02 PDT
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #16)
> Hah! I think I found it:
> http://hg.mozilla.org/mozilla-central/rev/ef55c163a23a (author Myk Melez,
> reviewer Benjamin Smedberg)
> 
> Now look at this:
> http://hg.mozilla.org/mozilla-central/diff/ef55c163a23a/services/sync/
> SyncComponents.manifest
> 
> This is gross. Didn't it appear to anyone that services/ is similar to
> toolkit/? I mean, hard-coding app IDs in shared code? Hello?
> 
> I wonder if there's any way to override this from c-c land at all.

Honestly I bet they would accept a patch to add us (and maybe TB if they want) to

http://hg.mozilla.org/mozilla-central/rev/ef55c163a23a#l15.2 (SyncComponents.manifest) though I agree hardcoding appID is suboptimal here.

I think trying to override for c-c itself is just a disaster waiting to happen though.
Comment 18 Justin Wood (:Callek) (Away until Aug 29) 2012-06-01 23:59:19 PDT
Without looking in at it, carefully, this should be tracking+ on any branch its broken on, and _block_ a beta release on any branch its broken on
Comment 19 Serge Gautherie (:sgautherie) 2012-06-02 04:03:09 PDT
(In reply to Jens Hatlak (:InvisibleSmiley) from comment #16)
> hard-coding app IDs in shared code? Hello?

Indeed: can't expect to list all existing/future Gecko based applications that want to use this service :-[

Ftr,
http://mxr.mozilla.org/mozilla-central/search?string=application%3D&case=on&find=\.manifest%24
This is the only case of such issue.
Comment 20 Jason Smith [:jsmith] 2012-06-02 05:43:28 PDT
Myk - Any ideas? The above comments imply that changes to SyncComponents.manifest broke the Sync UI in SeaMonkey.

http://hg.mozilla.org/mozilla-central/diff/ef55c163a23a/services/sync/SyncComponents.manifest
Comment 21 Benjamin Smedberg [:bsmedberg] 2012-06-04 12:36:50 PDT
That sync code is built as part of each application: it is not part of "the platform". This means that you should just patch this code to emit the application modifier for your app.

This hack can probably be removed when bug 755724 lands, although there are still some issues there with whether/how we share this code between the desktop and metro sub-apps.
Comment 22 Jens Hatlak (:InvisibleSmiley) 2012-06-04 12:56:56 PDT
Created attachment 629895 [details] [diff] [review]
patch [Checkin: Comments 25, 28, 29]

I don't really like this, but since Benjamin pointed at a bug which will hopefully solve this eventually, I'm going for the simple approach that will fix us until then.

Please note that the approval requests are for the Mozilla 14 and 15 branches, i.e. after the ongoing uplift.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 725408
User impact if declined: SM Sync broken starting with SM 2.11 (Gecko 14-based)
Testing completed (on m-c, etc.): verified locally
Risk to taking this patch (and alternatives if risky): low
String or UUID changes made by this patch: none
Comment 23 Alex Keybl [:akeybl] 2012-06-04 15:40:04 PDT
Leaving the approval nom while we wait for an r+
Comment 24 Jens Hatlak (:InvisibleSmiley) 2012-06-05 13:28:41 PDT
Comment on attachment 629895 [details] [diff] [review]
patch [Checkin: Comments 25, 28, 29]

https://hg.mozilla.org/integration/mozilla-inbound/rev/ae2b223b5e7c
Comment 25 Jens Hatlak (:InvisibleSmiley) 2012-06-06 01:32:49 PDT
Comment on attachment 629895 [details] [diff] [review]
patch [Checkin: Comments 25, 28, 29]

http://hg.mozilla.org/mozilla-central/rev/ae2b223b5e7c
Comment 26 Alex Keybl [:akeybl] 2012-06-06 15:36:23 PDT
We agreed to wait for this to land on a nightly update, unless somebody can give me high confidence that this has no chance to regress Firefox.
Comment 27 Alex Keybl [:akeybl] 2012-06-06 15:38:23 PDT
Comment on attachment 629895 [details] [diff] [review]
patch [Checkin: Comments 25, 28, 29]

Callek suggests no risk to Firefox, so approving for all branches.
Comment 28 Justin Wood (:Callek) (Away until Aug 29) 2012-06-06 21:17:42 PDT
http://hg.mozilla.org/releases/mozilla-beta/rev/0f5f1db9d848

I just got my m-a repo corrupted :/ and I'm overtired, so my ability to watch m-a now is nil...

Jens if you get around to it can you please push there
Comment 29 Jens Hatlak (:InvisibleSmiley) 2012-06-07 00:01:21 PDT
Comment on attachment 629895 [details] [diff] [review]
patch [Checkin: Comments 25, 28, 29]

http://hg.mozilla.org/releases/mozilla-aurora/rev/143bc927d524
Comment 30 Jens Hatlak (:InvisibleSmiley) 2012-06-08 01:51:38 PDT
Verified fixed using latest Linux nightly (no new Windows nightlies due to bug 761635).
Comment 31 Anthony Hughes (:ashughes) [GFX][QA][Mentor] 2012-06-22 14:17:02 PDT
Jens, would you be able to verify this on the builds indicated by the tracking flags as well?
Comment 32 Jens Hatlak (:InvisibleSmiley) 2012-06-24 02:47:10 PDT
Verified using SM 2.11b3 (should be fixed since 2.11b1 actually) and SM 2.11a2 (20120623013002).
Comment 33 Ioana (away) 2012-07-12 08:13:50 PDT
Verified as fixed on Firefox 14.0 beta 12 (20120710123126) on Windows XP, Ubuntu 12.04 and Mac OS X 10.8.

Note You need to log in before you can comment on or make changes to this bug.