Bug 1677202 Comment 7 Edit History
Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.
My analysis, based on the local log, is as follows.

When ~nsMsgDatabase() is called, it also calls RemoveFromCache() to remove the nsMsgDatabase being deleted from m_dbCache. Unfortunately, XPCOM shutdown proceeds asynchronously, so RemoveFromCache() sometimes cannot be called for the databases deleted near the end of shutdown. That leaves pointers in m_dbCache to nsMsgDatabase objects that have already been released, and this causes the reference to released heap in ~nsMsgDBService(). (I think deletion of nsMsgDatabase should always come before deletion of nsMsgDBService. I am not sure whether this order is strictly followed.)

I tried to see whether I could somehow use the old pointer to nsMsgDBService to call RemoveFromCache() even after the XPCOM service itself has shut down. The timing dependence kicked in: sometimes it works, but sometimes we again get a reference to already-released heap, this time inside RemoveFromCache() itself.

So my tentative final solution is as follows. Count the number of times that RemoveFromCache() cannot be called because of early XPCOM shutdown. Then, in ~nsMsgDBService(), I check not only m_dbCache.Length() but also whether it is smaller than or equal to the number of failures to call RemoveFromCache() because the service was no longer available. We accept as inevitable the leftover pointers in m_dbCache that ~nsMsgDatabase() could not remove. If m_dbCache.Length() <= the number of failures to invoke RemoveFromCache(...), ~nsMsgDBService() exits early and we don't print the "db left open" message.

So far so good locally: no more "db left open" message for the three tests that produced the sanitizer heap-use-after-free report, and no more heap-use-after-free (although there *ARE* leaks). We won't crash, which is more important here.

I am going to run the whole mochitest suite locally and then, if it succeeds, submit a tryserver run.
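
To make the counting idea concrete, here is a minimal, self-contained sketch of it. It is only a model under stated assumptions, not the actual patch: DbService, Database, SimulateShutdown and the serviceAvailable flag are hypothetical stand-ins for nsMsgDBService, nsMsgDatabase and the "XPCOM service still reachable" condition, and std::vector stands in for m_dbCache. The stale cache entries are only counted, never dereferenced.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical stand-in for nsMsgDBService; not the real Thunderbird class.
class DbService {
 public:
  void AddToCache(const void* db) { cache_.push_back(db); }

  void RemoveFromCache(const void* db) {
    cache_.erase(std::remove(cache_.begin(), cache_.end(), db), cache_.end());
  }

  // Called by a database destructor that could not reach the service.
  static void NoteFailedRemoval() { ++failedRemovals_; }
  static void ResetFailedRemovals() { failedRemovals_ = 0; }
  static std::size_t FailedRemovals() { return failedRemovals_; }

  std::size_t CacheLength() const { return cache_.size(); }

  // The check proposed for the service destructor: report "db left open"
  // only if the cache holds more entries than the removals known to have
  // failed. Stale entries are compared by count and never dereferenced.
  bool ShouldReportDbLeftOpen() const {
    return cache_.size() > failedRemovals_;
  }

 private:
  std::vector<const void*> cache_;
  static std::size_t failedRemovals_;
};

std::size_t DbService::failedRemovals_ = 0;

// Hypothetical stand-in for nsMsgDatabase.
class Database {
 public:
  Database(DbService* service, const bool* serviceAvailable)
      : service_(service), serviceAvailable_(serviceAvailable) {
    service_->AddToCache(this);
  }

  ~Database() {
    if (*serviceAvailable_) {
      service_->RemoveFromCache(this);  // normal path
    } else {
      DbService::NoteFailedRemoval();   // service gone: only count the miss
    }
  }

 private:
  DbService* service_;
  const bool* serviceAvailable_;
};

struct ShutdownResult {
  std::size_t cacheLength;
  std::size_t failedRemovals;
  bool reportLeftOpen;
};

// Closes `closedBefore` databases while the service is reachable and
// `closedAfter` once it is not; `stillOpen` databases remain open when the
// check runs, modelling a database that really was left open.
ShutdownResult SimulateShutdown(int closedBefore, int closedAfter,
                                int stillOpen) {
  DbService::ResetFailedRemovals();
  DbService service;
  bool serviceAvailable = true;
  std::vector<std::unique_ptr<Database>> early, late, open;
  for (int i = 0; i < closedBefore; ++i)
    early.push_back(std::make_unique<Database>(&service, &serviceAvailable));
  for (int i = 0; i < closedAfter; ++i)
    late.push_back(std::make_unique<Database>(&service, &serviceAvailable));
  for (int i = 0; i < stillOpen; ++i)
    open.push_back(std::make_unique<Database>(&service, &serviceAvailable));

  early.clear();             // removed from the cache normally
  serviceAvailable = false;  // model: shutdown has made the service unreachable
  late.clear();              // destructors can only record the failure

  return {service.CacheLength(), DbService::FailedRemovals(),
          service.ShouldReportDbLeftOpen()};
}
```

With two databases closed before shutdown and three after, the model leaves three stale entries and three recorded failures, so the message is suppressed. Adding one database that is still open when the check runs pushes the cache length above the failure count, so the message would still be printed for a database that really was left open.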