lightning regularly makes Thunderbird hang after startup on Mac OS X

RESOLVED FIXED in Lightning 0.1

Status

Calendar
Lightning Only
--
critical
RESOLVED FIXED
12 years ago
12 years ago

People

(Reporter: dmose, Assigned: dmose)

Tracking

Trunk
Lightning 0.1
PowerPC
Mac OS X

Details

Attachments

(1 attachment, 1 obsolete attachment)

(Assignee)

Description

12 years ago
About 50% of the time on Mac, Thunderbird with Lightning hangs after startup.  Specifically, it pops up an nsIAuthPrompt window for me to authenticate to the HTTPS server where my calendars are kept, then it spins using a bunch of CPU (probably parsing the calendar), and then it just hangs without using much CPU.  The main Thunderbird window never appears.

At first, I thought this was because of my weird choice of build tools (gcc4, darwinports instead of fink).  However, it also happens just as frequently on branch builds of Lightning made with the standard tools on Tinderbox (running on top of Thunderbird 1.5).

When this happens while running a debug build, if the number of ++DOMWINDOW == messages is at 3, chances are fairly good that it won't hang.  If that number is at 4, it seems to (almost?) always hang.

Of course, as I'm filing this bug, I'm not able to reproduce it in a debug build, but next time I do, I'll post a stack.

My current speculation is that this may be related to the fact that the nsIAuthPrompt in question is unparented and shows up before the main window, perhaps before something is properly initialized.
(Assignee)

Comment 1

12 years ago
The stack:

(gdb) where full
#0  0xffff8278 in ___spin_lock () at /System/Library/Frameworks/System.framework/PrivateHeaders/ppc/cpu_capabilities.h:179
No locals.
#1  0x90003818 in szone_malloc ()
No symbol table info available.
#2  0x9000374c in malloc_zone_malloc ()
No symbol table info available.
#3  0x903a32c0 in shape_offset ()
No symbol table info available.
#4  0x9170a9cc in OffsetRgn ()
No symbol table info available.
#5  0x931bbcd4 in WindowToGlobalRegion ()
No symbol table info available.
#6  0x931a6538 in GetWindowRegion ()
No symbol table info available.
#7  0x193f4a38 in nsWindow::Update (this=0x2ec62eb0) at ../../../../gfx/src/mac/nsCarbonHelpers.h:53
        portSetter = {
  mPortChanged = 135 '\207', 
  mOldPort = 0x77696e64, 
  mOldDevice = 0x1
}
        originSetter = {
  mSavePortRect = {
    top = -30843, 
    left = -27757, 
    bottom = -15797, 
    right = -15979
  }
}
        origin = {
  v = -30843, 
  h = -27757
}
        reentrant = 1
#8  0x193d4ff0 in nsMacWindow::WindowEventHandler (inHandlerChain=0xbffff220, inEvent=0x13bf6b0, userData=0x2ec62eb0) at ../../../../widget/src/mac/nsMacWindow.cpp:700
        what = 50331704
        retVal = 50331704
        myWind = 0x2ec63fe0
#9  0x9318cff4 in DispatchEventToHandlers ()
No symbol table info available.
#10 0x9318c74c in SendEventToEventTargetInternal ()
No symbol table info available.
#11 0x9318c5c8 in SendEventToEventTargetWithOptions ()
No symbol table info available.
#12 0x93260160 in HandleWindowEvent ()
No symbol table info available.
#13 0x93193890 in ToolboxEventDispatcherHandler ()
No symbol table info available.
#14 0x9318d244 in DispatchEventToHandlers ()
No symbol table info available.
#15 0x9318c74c in SendEventToEventTargetInternal ()
No symbol table info available.
#16 0x931934ec in SendEventToEventTarget ()
No symbol table info available.
#17 0x931d4564 in ToolboxEventDispatcher ()
No symbol table info available.
#18 0x932735b8 in TryEventDispatcher ()
No symbol table info available.
#19 0x9327320c in GetOrPeekEvent ()
No symbol table info available.
#20 0x93272f48 in GetNextEventMatchingMask ()
No symbol table info available.
#21 0x93272df0 in WNEInternal ()
No symbol table info available.
#22 0x93272d50 in WaitNextEvent ()
No symbol table info available.
#23 0x00422a78 in nsMacCommandLine::Initialize (this=0x456e9c, argc=@0x456e98, argv=@0x456e94) at ../../../toolkit/xre/nsCommandLineServiceMac.cpp:158
        anEvent = {
  what = 0, 
  message = 0, 
  when = 4922663, 
  where = {
    v = 813, 
    h = 1253
  }, 
  modifiers = 128
}
#24 0x00422b54 in InitializeMacCommandLine (argc=@0x1, argv=@0x3e8) at ../../../toolkit/xre/nsCommandLineServiceMac.cpp:407
No locals.
#25 0x00413138 in XRE_main (argc=0, argv=0x1312950, aAppData=0x44b230) at ../../../toolkit/xre/nsAppRunner.cpp:2288
        obsService = {
  mRawPtr = 0x13c0d30
}
        appStartup = {
  mRawPtr = 0x13b9230
}
        workingDir = {
  mRawPtr = 0x2ec62890
}
        chromeObserver = {
  mRawPtr = 0x139f9a8
}
        cmdLine = {
  mRawPtr = 0x1a55b610
}
        noEMRestart = 0x3000038 "ÿÿÿÿ"
        xpcom = {
  mServiceManager = 0x1312324
}
        rv = 0
        dirProvider = {
  <nsIDirectoryServiceProvider2> = {
    <nsIDirectoryServiceProvider> = {
      <nsISupports> = {
        _vptr$nsISupports = 0x451478
      }, <No data fields>}, <No data fields>}, 
  <nsIProfileStartup> = {
    <nsISupports> = {
      _vptr$nsISupports = 0x45149c
    }, <No data fields>}, 
  members of nsXREDirProvider: 
  mAppProvider = {
    mRawPtr = 0x0
  }, 
  mGREDir = {
    mRawPtr = 0x1311eb0
  }, 
  mXULAppDir = {
    mRawPtr = 0x0
  }, 
  mProfileDir = {
    mRawPtr = 0x1312740
  }, 
  mProfileLocalDir = {
    mRawPtr = 0x1312810
  }, 
  mProfileNotified = 1
}
        nativeApp = {
  mRawPtr = 0x130f300
}
        canRun = 1
        registryFile = {
  mRawPtr = 0x130f1b0
}
        profileLock = {
  mRawPtr = 0x13122f0
}
        startOffline = 0
        profD = {
  mRawPtr = 0x1312740
}
        profLD = {
  mRawPtr = 0x1312810
}
        upgraded = 0
        version = {
  <nsFixedCString> = {
    <nsCString> = {
      <nsCSubstring> = {
        <nsACString_internal> = {
          mVTable = 0x20e9078, 
          mData = 0xbffffa04 "1.6a1_0000000000/1.9a1_0000000000", 
          mLength = 33, 
          mFlags = 65553
        }, <No data fields>}, <No data fields>}, 
    members of nsFixedCString: 
    mFixedCapacity = 63, 
    mFixedBuf = 0xbffffa04 "1.6a1_0000000000/1.9a1_0000000000"
  }, 
  members of nsCAutoString: 
  mStorage = "1.6a1_0000000000/1.9a1_0000000000\000\000À¿ÿûÜ\000\000\000\020\000\000\000\003¿ÿûà¿ÿûð\000\000,¤¿ÿúÐ"
}
        versionOK = 0
        needsRestart = 0
        appInitiatedRestart = 4502064
#26 0x00002cb4 in main (argc=50331704, argv=0x0) at ../../../mail/app/nsMailApp.cpp:62
No locals.
(gdb)

It's not clear to me how the calendar code could have started executing while in the command-line handler, unless it's possible for the "profile-after-change" notification to fire from the command-line handler.      
Status: NEW → ASSIGNED
(Assignee)

Comment 2

12 years ago
Created attachment 210391 [details] [diff] [review]
make the alarm service start later, v1

This makes the alarm service be explicitly invoked by the GUI in its onload handler, rather than registering itself automagically using category entries.  I haven't had any instances of the hang since writing the patch.  Even if this patch doesn't actually fix the hang (though I think it probably does), we want to do this anyway so that we don't try to pop up authentication dialogs before a parent window for them can exist.
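
The ordering this change aims for can be sketched in plain JavaScript, outside of XPCOM. This is only an illustration under stated assumptions: `makeAlarmService`, `makeWindow`, `addLoadListener` and `fireLoad` are hypothetical stand-ins, not names from the patch or from any Mozilla API. The point is that the service starts only when the window's load handler asks it to, so nothing that might prompt for authentication runs before a window exists.

```javascript
// Hypothetical stand-ins; none of these names come from the patch.
function makeAlarmService(log) {
    return {
        mStarted: false,
        startup: function () {
            if (this.mStarted) {
                return; // a second call is a no-op
            }
            log.push("alarm service started");
            this.mStarted = true;
        }
    };
}

// Models a top-level window that runs its load listeners once it exists.
function makeWindow(log) {
    var listeners = [];
    return {
        addLoadListener: function (fn) {
            listeners.push(fn);
        },
        fireLoad: function () {
            log.push("main window loaded");
            listeners.forEach(function (fn) {
                fn();
            });
        }
    };
}

var log = [];
var alarmService = makeAlarmService(log);
var mainWindow = makeWindow(log);

// The GUI, not an automatic registration, decides when the service starts.
mainWindow.addLoadListener(function () {
    alarmService.startup();
});

// Before the load event the service has not started.
var startedBeforeLoad = alarmService.mStarted;

mainWindow.fireLoad();
```

In this sketch the log always records the window load before the service start, which is the ordering the description above is after.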
Attachment #210391 - Flags: first-review?(jminta)

Comment 3

12 years ago
Comment on attachment 210391 [details] [diff] [review]
make the alarm service start later, v1

-        this.mStarted = true;
+
+        observerSvc = Components.classes["@mozilla.org/observer-service;1"]
+                      .getService(Components.interfaces.nsIObserverService);
+
+        observerSvc.addObserver(this, "profile-after-change", false);
+        observerSvc.addObserver(this, "xpcom-shutdown", false);
 
         this.calendarManager = Components.classes["@mozilla.org/calendar/manager;1"].getService(Components.interfaces.calICalendarManager);
         var calendarManager = this.calendarManager;
         calendarManager.addObserver(this.calendarManagerObserver);
 
         var calendars = calendarManager.getCalendars({});
         for each(var calendar in calendars) {
             this.observeCalendar(calendar);
@@ -250,16 +250,18 @@ calAlarmService.prototype = {
             alarmService: this,
             notify: function(timer) {
                 this.alarmService.findAlarms();
             }
         };
 
         this.mUpdateTimer = newTimerWithCallback(timerCallback, kHoursBetweenUpdates * 3600000, true);
 
+        this.mStarted = true;
+

Why are we moving this.mStarted later?  This seems to be setting up a (low-chance) race that could result in double-observers being added.  That is, if two async code areas each manually call startup() close in time to each other.  Unless there's a good reason for moving it, please put it back.

+        observerSvc = Components.classes["@mozilla.org/observer-service;1"]
+                      .getService(Components.interfaces.nsIObserverService);
That looks like a js-strict warning (in both places).

This is going to screw calendar.xpi users, so please coordinate with mostafah before landing this.  He may want to post a note about the new bug before he puts together another version.

Nice work, especially killing the leak! r=jminta with the above changes.
Attachment #210391 - Flags: first-review?(jminta) → first-review+
(Assignee)

Comment 4

12 years ago
(In reply to comment #3)
>
> Why are we moving this.mStarted later?

So that if something later in the startup throws an exception, mStarted won't be set without the startup having completed.
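
A minimal sketch of that reasoning in plain JavaScript, assuming a hypothetical service whose startup is a list of steps (`makeService` and the step names are illustrative, not taken from the patch): because the flag is set last, a step that throws leaves `mStarted` false, and a later call can try again.

```javascript
// Hypothetical service; step names are illustrative, not from the patch.
function makeService(steps) {
    return {
        mStarted: false,
        completed: [],
        startup: function () {
            if (this.mStarted) {
                return; // already fully started
            }
            this.completed = [];
            for (var i = 0; i < steps.length; i++) {
                steps[i](); // may throw
                this.completed.push(i);
            }
            // Reached only if every step above succeeded.
            this.mStarted = true;
        }
    };
}

var failOnce = true;
var service = makeService([
    function registerObservers() {},
    function loadCalendars() {
        if (failOnce) {
            failOnce = false;
            throw new Error("calendar manager not ready");
        }
    },
    function createTimer() {}
]);

var firstAttemptThrew = false;
try {
    service.startup();
} catch (e) {
    firstAttemptThrew = true;
}
var startedAfterFailure = service.mStarted; // flag was never reached

service.startup(); // retry runs all steps again
var startedAfterRetry = service.mStarted;
```

Note that in this sketch a retry re-runs the earlier steps too, so any step that registers observers would need to be safe to repeat or be undone on failure.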

> This seems to be setting up a (low-chance) race that could result in
> double-observers being added.  That is, if two async code areas each manually
> call startup() close in time to each other.  Unless there's a good reason for
> moving it, please put it back.

The calendar code itself all runs on the UI thread, so there's no way this could happen unless startup() somehow re-enters itself, which would be a bug.

> +        observerSvc = Components.classes["@mozilla.org/observer-service;1"]
> +                      .getService(Components.interfaces.nsIObserverService);
> That looks like a js-strict warning (in both places).

Good catch; fixed.

> This is going to screw calendar.xpi users, so please coordinate with mostafah
> before landing this.  He may want to post a note about the new bug before he
> puts together another version.

Will do.
(Assignee)

Comment 5

12 years ago
Created attachment 210526 [details] [diff] [review]
patch, v2

strict warnings fixed.
Attachment #210391 - Attachment is obsolete: true
(Assignee)

Comment 6

12 years ago
As far as calendar.xpi changes go, the right way to deal with this there is probably to do what mvl suggested in channel yesterday: overlay the toplevel firefox/thunderbird <window> with scriptage that explicitly starts up the alarm service there.
(Assignee)

Comment 7

12 years ago
Comment on attachment 210526 [details] [diff] [review]
patch, v2

Carrying forward review.
Attachment #210526 - Flags: first-review+
(Assignee)

Comment 8

12 years ago
Bug 325660 filed for the calendar.xpi changes.
Status: ASSIGNED → RESOLVED
Last Resolved: 12 years ago
Resolution: --- → FIXED