Closed
Bug 845187
Opened 11 years ago
Closed 11 years ago
JavaScript interpreter encounters an uninitialized memory from XPCOM interface (thunderbird)
Categories
(MailNews Core :: Database, defect)
MailNews Core
Database
Tracking
(Not tracked)
RESOLVED
FIXED
Thunderbird 23.0
People
(Reporter: ishikawa, Assigned: ishikawa)
References
(Blocks 1 open bug)
Details
Attachments
(4 files, 1 obsolete file)
During the test run of thunderbird using "make mozmill" test, I have encountered uninitialized memory usage errors T where a service is called via XPCOM interface. It looks to me that JavaScript interpreter tries to import a value, created *OUTSIDE* the interpreter itself, into the internal data structure of JavaScript interpreter at runtime. But unfortunately, despite the data tag telling the interpreter to handle it as double, the passed value is actually uninitialized garbage. I wonder how I can debug this further. It seems to occur at the interfade of XPCOM -> JavaScript. Background: I noticed uninitialized memory usage errors when I ran TB (compiled source from comm-central under 32-bit Debian GNU/Linux). I ran the test "make mozmill" by running TB under valgrind/memcheck and encounted the problem(s) reported here. Since I could not load all the symbol files using --read-var-info due to the limited 32 bits memory space, and obtained detailed enough information about the variable or memory buffer, etc. where the uninitialized memory is located, I inserted valgrind probe into VALGRIND_CHECK_MEM_IS_DEFINED() (should return 0 if memory is valid and defined) suspicious areas so that I can abort() under "make mozmill" test framework if valgrind determines that an uninitialized value is used. This makes it easier for me to trace exactly what values are used, etc. "make mozmill" test framework waits for five minutes before running the next target, and in that 5 minutes, I can use remote gdb to attach to the aborted process image and look at stack and the like. Ok, this is the background. After many debug sessions, one problem still persist and is for real while one other mysterious case noticed earlier disappeared during investigation. Maybe the latest source changes in the last couple of weeks solved it (or GCC miscompilation is now avoided due to subtle code changes due to the insertion of valgrind probes. If so, this is a problem. I will investitage this further and report later on about this issue.) This disappearing case(s) kept me from reporting the problem here until today. Problem: It turns out that Java interpreter tries to convert non-initialized 8 byte area into a double in NativeData2JS routine. The problem is noticed in the following routine DOUBLE_TO_JSVAL(). This is an inline function in mozilla/js/public/Value.h static MOZ_ALWAYS_INLINE jsval DOUBLE_TO_JSVAL(double d) { /* * This is a manually inlined version of: * d = JS_CANONICALIZE_NAN(d); * return IMPL_TO_JSVAL(DOUBLE_TO_JSVAL_IMPL(d)); * because GCC from XCode 3.1.4 miscompiles the above code. */ jsval_layout l; *=> MOZ_ASSERT(VALGRIND_CHECK_MEM_IS_DEFINED(&d, sizeof(d)) == 0); if (MOZ_UNLIKELY(d != d)) l.asBits = 0x7FF8000000000000LL; else l.asDouble = d; return IMPL_TO_JSVAL(l); } Note my insertion of MOZ_ASSERT() and VALGRIND_CHECK_MEM_IS_DEFINED() macros in the quoted code above. Basically the problem is the passed value d is uninitialized. [Now that I think about it, can "d != d" check have something to do with the uninitialized value issue? The way code is written I suspect it must be related to some sort of NaN handling. Hmm...] The original comment in the source file makes me wonder if there is another GCC miscompilation issue, but I think this time the problem occurs earlier than the call to this function itself and seems to be related to an uninitialized memory area that is passed by XPCOM interface (and the data probably can be traced to the remote service that sits behind the XPCOM interface mechanism.) The above routine is called in XPCConvert::NatvieDATA2JS() [ mozilla/js/xpconnect/src/XPCConvert.cpp ] JSBool XPCConvert::NativeData2JS(XPCLazyCallContext& lccx, jsval* d, const void* s, const nsXPTType& type, const nsID* iid, nsresult* pErr) { NS_PRECONDITION(s, "bad param"); NS_PRECONDITION(d, "bad param"); JSContext* cx = lccx.GetJSContext(); // Allow wrong compartment or unset ScopeForNewObject when the caller knows // the value is primitive (viz., XPCNativeMember::GetConstantValue). NS_ABORT_IF_FALSE(type.IsArithmetic() || js::IsObjectInContextCompartment(lccx.GetScopeForNewJSObjects(), cx), "bad scope for new JSObjects"); if (pErr) *pErr = NS_ERROR_XPC_BAD_CONVERT_NATIVE; switch (type.TagPart()) { case nsXPTType::T_I8 : *d = INT_TO_JSVAL(int32_t(*((int8_t*)s))); break; case nsXPTType::T_I16 : *d = INT_TO_JSVAL(int32_t(*((int16_t*)s))); break; case nsXPTType::T_I32 : *d = INT_TO_JSVAL(*((int32_t*)s)); break; Here=> case nsXPTType::T_I64 : *d = DOUBLE_TO_JSVAL(double(*((int64_t*)s))); break; case nsXPTType::T_U8 : *d = INT_TO_JSVAL(int32_t(*((uint8_t*)s))); break; case nsXPTType::T_U16 : *d = INT_TO_JSVAL(int32_t(*((uint16_t*)s))); break; case nsXPTType::T_U32 : *d = UINT_TO_JSVAL(*((uint32_t*)s)); break; case nsXPTType::T_U64 : *d = DOUBLE_TO_JSVAL(double(*((uint64_t*)s))); break; case nsXPTType::T_FLOAT : *d = DOUBLE_TO_JSVAL(*((float*)s)); break; case nsXPTType::T_DOUBLE: *d = DOUBLE_TO_JSVAL(*((double*)s)); break; case nsXPTType::T_BOOL : { bool b = *((bool*)s); When the problem is noticed that (void*) |s| points to an uninitialized malloc'ed area consisting of 8 octets. I upload the ellided gdb session record that allowed me to figure out what was happening as attachment. Up a few call chain, I see the call to a function, CallMethodHelper::GatherAndConvertResults() in mozilla/js/xpconnect/src/XPCWrappedNative.cpp and the file has a comment near the beginning /* Wrapper object for reflecting native xpcom objects into JavaScript. */ So I think the problem is in the interface between XPCOM and JavaScript. Or that the routine on the other side of XPCOM forgets to return a valid value and instead leaves uninitialized memory in the data passed to the XPCOM interface. Ugh, really hard bug to debug. In my log, I see three such occurrences of the uninitialized usage, and all these problems originate from GatherAndConvertResults(). I only checked the first abort() using gdb so far. But I am farily sure that the conversion routine is the culprit in all cases from the previous session logs. (Except for the mysterious disappearing case I earlier mentioned.) In my use of valgrind, I instruct valgrind to fill newly malloc'ed area with 0xA5: the options passed valgrind were "--malloc-fill=0xA5 " "--free-fill=0xC3 " (Note 0xA5 = 0245. This value is printed when accessing non-initialized octet.) # This is the compilation option in my MOZCONFIG which is used for # compling TB. (There are other options, but basically this is it.) # I also enabled debug. So TB is debug build version. ac_add_options --enable-valgrind ac_add_options --enable-optimize="-g -O -freorder-blocks" ac_add_options --disable-jemalloc I set the following before running "make mozmill". I found the tip in mozilla web page for running valgrind/memcheck for mozilla software. # G_SLICE=always-malloc export G_SLICE Version of comm-central I am using: /home/ishikawa/TB-NEW/TB-3HG/new-src hg identify db8918e79907+ tip /home/ishikawa/TB-NEW/TB-3HG/new-src/mozilla hg identify 3b0b1092f9b4+ tip Here is the back trace and values of a few variables for debugging from within gdb. If anyone has a good idea how to debug this further BEYOND this XPCOM interface, I will appreciate it very much. I want to know WHAT JS file, what line, etc., the JavaScript interpreter is calling, etc. Maybe this time, we need to learn WHAT SERVICE is invoked using XPCOM interface? In a follow-up to an earlier news post, it was suggested that calling DumpJSStack() may give us a clue. (mozilla.dev.platform <69WdnZv7fKpGqbbMnZ2dnUVZ_uOdnZ2d@mozilla.org>) As for service, I think the particular instance which is investigated by the gdb session must be related to ==27358== Uninitialised value was created by a heap allocation ==27358== at 0x40273B8: malloc (vg_replace_malloc.c:270) ==27358== by 0x404BE13: moz_xmalloc (mozalloc.cpp:54) ==27358== by 0xA4563AF: nsNewsDatabaseConstructor(nsISupports*, nsID const&, void**) (mozalloc.h:200) ==27358== by 0xAD86523: mozilla::GenericFactory::CreateInstance(nsISupports*, nsID const&, void**) (GenericFactory.cpp:16) ==27358== by 0xAD7ADC9: nsCreateInstanceByContractID::operator()(nsID const&, void**) const (nsComponentManagerUtils.cpp:178) ==27358== by 0xA461165: nsCOMPtr<nsIMsgDatabase>::assign_from_helper(nsCOMPtr_helper const&, nsID const&) (nsCOMPtr.h:1226) ==27358== by 0xA6DF17F: nsMsgDBService::CreateNewDB(nsIMsgFolder*, nsIMsgDatabase**) (nsCOMPtr.h:605) I suspect somewhere in the nsMsgDBService, a double is returned, but where? I will investigate the use of DumpJSStack() and report the progress later, but if there are other tips in debugging XPCOM -> JavaScript interface, I will appreciate it very much. I would like my mailer to be rock solid. Seeing that it generates uninitialized memory usage warnings is not very reassuring, is it? TIA
Assignee | ||
Comment 1•11 years ago
|
||
Hi, Thanks to Neil's suggestion off-line, I call DumpJSStack() in CallMethodHelper::GatherAndConvertResults() ( success! ) when the uninitialized double data is noticed and abort there and checked the value of mMethodInfo and mCallee. (I learned from Neil that it should be revealing to look at the mMethodInfo member of CallMethodHelper, which is a pointer to a structure with a char *name member, and the mCallee member which is the pointer to the object on which the method is invoked. And yes it reveals some key issues.) ======================== DumpJSStack() Ouptut: ======================== 0 anonymous() ["resource:///modules/msgDBCacheManager.js":122] this = [object Object] 1 dbCache_CheckTimerCallback() ["resource:///modules/msgDBCacheManager.js":55] this = function dbCache_CheckTimerCallback() { msgDBCacheManager.checkCachedDBs(); } msgDBCacheManager.checkCachedDBs() does not seem to return a valid value, it ends with the following piece of code without returning a value. But this is not a big issue here, I think. for each (let [, db] in Iterator(dbs)) { if (mailSession.IsFolderOpenInWindow(db.folder)) continue; db.folder.msgDatabase = null; if (--dbsToClose == 0) break; } ------------------------ From the value of mMethodInfo, it semes that lastUseTime is the method that was invoked to return double (?) value? (gdb) print *mMethodInfo $3 = {<XPTMethodDescriptor> = {name = 0x67011a8 "lastUseTime", params = 0x67011b8, result = {flags = 128 '\200', type = {prefix = { flags = 6 '\006'}, argnum = 0 '\000', argnum2 = 0 '\000', type = { iface = 0, additional_type = 0}}}, flags = 128 '\200', num_args = 1 '\001'}, <No data fields>} *mCallee is this long structure that contains the called method information. Note the value of m_lastUseTime; -6510615555426900571 This is 5A5A5A5A5A5A5A5A (or -5A5A5A5A5A5B in 8 byte integer) meaning that it is NOT initialized! [I am learning the mechanism of XPCOM -> JavaScript right now, but I think I finally spot the problem. It seems "m_variableName" would be returned as "variableName" to the caller of XPCOM service, I think, by the implementation routine in this particular case, presumably written in C++.] So now we need to figure out where the value is NOT set or should have been set but it was not done so. Great progress, isn't it? mCallee value: (gdb) print *mCallee $6 = (nsNewsDatabase) { <nsMsgDatabase> = {<nsIMsgDatabase> = {<nsIDBChangeAnnouncer> = {<nsISupports> = {_vptr.nsISupports = 0xc73f9c8}, <No data fields>}, <No data fields>}, mRefCnt = {mValue = 4}, _mOwningThread = {mThread = 0x4313660}, static m_dbCache = 0x1560aaa0, static gLastUseTime = 1362151028540008, m_lastUseTime = -6510615555426900571, m_collationKeyGenerator = { mRawPtr = 0x0}, m_mimeConverter = {mRawPtr = 0x0}, m_retentionSettings = {mRawPtr = 0x0}, m_downloadSettings = { mRawPtr = 0x0}, m_folder = {mRawPtr = 0xd1ac7ac}, m_dbFolderInfo = 0xe0d6cc8, m_nextPseudoMsgKey = 4294967280, m_mdbEnv = 0x145666d4, m_mdbStore = 0xe357494, m_mdbAllMsgHeadersTable = 0x17289124, m_mdbAllThreadsTable = 0x1449d024, m_thumb = {mRawPtr = 0x0}, m_create = true, m_leaveInvalidDB = true, m_dbName = {<nsACString_internal> = { mData = 0x1456f8a8 "/TB-NEW/TB-3HG/objdir-tb3/mozilla/_tests/mozmill/mozmillprofile/News/localhost.msf", mLength = 82, mFlags = 5}, <No data fields>}, m_newSet = {<nsTArray_Impl<unsigned int, nsTArrayInfallibleAllocator>> = {<nsTArray_base<nsTArrayInfallibleAllocator>> = { mHdr = 0xc80477c}, <nsTArray_SafeElementAtHelper<unsigned int, nsTArray_Impl<unsigned int, nsTArrayInfallibleAllocator> >> = {<No data fields>}, <No data fields>}, <No data fields>}, m_mdbTokensInitialized = true, m_ChangeListeners = {<nsAutoTObserverArray<nsCOMPtr<nsIDBChangeListener>, 0u>> = {<nsTObserverArray_base> = {mIterators = 0x0}, mArray = {<nsAutoArrayBase<nsTArray<nsCOMPtr<nsIDBChangeListener> >, 0u>> = {<nsTArray<nsCOMPtr<nsIDBChangeListener> >> = {<nsTArray_Impl<nsCOMPtr<nsIDBChangeListener>, nsTArrayInfallibleAllocator>> = {<nsTArray_base<nsTArrayInfallibleAllocator>> = { mHdr = 0xe1f10c0}, <nsTArray_SafeElementAtHelper<nsCOMPtr<nsIDBChangeListener>, nsTArray_Impl<nsCOMPtr<nsIDBChangeListener>, nsTArrayInfallibleAllocator> >> = {<nsTArray_SafeElementAtSmartPtrHelper<nsIDBChangeListener, nsTArray_Impl<nsCOMPtr<nsIDBChangeListener>, nsTArrayInfallibleAllocator> >> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, { mAutoBuf = "\000\000\000\000\000\000\000\200", mAlign = { elem = 0 '\000'}}}, <No data fields>}}, <No data fields>}, m_hdrRowScopeToken = 128, m_threadRowScopeToken = 154, m_hdrTableKindToken = 151, m_threadTableKindToken = 152, m_allThreadsTableKindToken = 153, m_subjectColumnToken = 129, m_senderColumnToken = 130, m_messageIdColumnToken = 131, m_referencesColumnToken = 132, m_recipientsColumnToken = 133, m_dateColumnToken = 134, m_messageSizeColumnToken = 135, m_flagsColumnToken = 136, m_priorityColumnToken = 137, m_labelColumnToken = 138, m_statusOffsetColumnToken = 139, m_numLinesColumnToken = 140, m_ccListColumnToken = 141, m_bccListColumnToken = 142, m_threadFlagsColumnToken = 145, m_threadIdColumnToken = 144, m_threadChildrenColumnToken = 147, m_threadUnreadChildrenColumnToken = 148, m_messageThreadIdColumnToken = 143, m_threadSubjectColumnToken = 149, m_messageCharSetColumnToken = 150, m_threadParentColumnToken = 155, m_threadRootKeyColumnToken = 156, m_threadNewestMsgDateColumnToken = 146, m_offlineMsgOffsetColumnToken = 157, m_offlineMessageSizeColumnToken = 158, m_HeaderParser = 0x0, m_threads = {<nsTArray_Impl<nsMsgThread*, nsTArrayInfallibleAllocator>> = {<nsTArray_base<nsTArrayInfallibleAllocator>> = { mHdr = 0xc80477c}, <nsTArray_SafeElementAtHelper<nsMsgThread*, nsTArray_Impl<nsMsgThread*, nsTArrayInfallibleAllocator> >> = {<No data fields>}, <No data fields>}, <No data fields>}, m_headersInUse = 0x0, static gMsgDBHashTableOps = {allocTable = 0xad80a5c <PL_DHashAllocTable(PLDHashTable*, uint32_t)>, freeTable = 0xad80a7f <PL_DHashFreeTable(PLDHashTable*, void*)>, hashKey = 0xa6d7bc8 <nsMsgDatabase::HashKey(PLDHashTable*, void const*)>, matchEntry = 0xa6d7bd0 <nsMsgDatabase::MatchEntry(PLDHashTable*, PLDHashEntryHdr const*, void const*)>, moveEntry = 0xa6d7be2 <nsMsgDatabase::MoveEntry(PLDHashTable*, PLDHashEntryHdr const*, PLDHashEntryHdr*)>, clearEntry = 0xa6d7bfe <nsMsgDatabase::ClearEntry(PLDHashTable*, PLDHashEntryHdr*)>, finalize = 0xad80a29 <PL_DHashFinalizeStub(PLDHashTable*)>, initEntry = 0}, m_cachedHeaders = 0x0, m_bCacheHeaders = true, m_cachedThreadId = 4294967295, m_cachedThread = {mRawPtr = 0x0}, mMdbFactory = {mRawPtr = 0x160decac}, static gRefHashTableOps = { allocTable = 0xad80a5c <PL_DHashAllocTable(PLDHashTable*, uint32_t)>, freeTable = 0xad80a7f <PL_DHashFreeTable(PLDHashTable*, void*)>, hashKey = 0xad81948 <PL_DHashStringKey(PLDHashTable*, void const*)>, matchEntry = 0xad80e7b <PL_DHashMatchStringKey(PLDHashTable*, PLDHashEntryHdr const*, void const*)>, moveEntry = 0xad80e47 <PL_DHashMoveEntryStub(PLDHashTable*, PLDHashEntryHdr const*, PLDHashEntryHdr*)>, clearEntry = 0xad80eda <PL_DHashFreeStringKey(PLDHashTable*, PLDHashEntryHdr*)>, finalize = 0xad80a29 <PL_DHashFinalizeStub(PLDHashTable*)>, initEntry = 0}, m_msgReferences = 0x0, m_enumerators = {<nsTArray_Impl<nsMsgDBEnumerator*, nsTArrayInfallibleAllocator>> = {<nsTArray_base<nsTArrayInfallibleAllocator>> = { mHdr = 0xc80477c}, <nsTArray_SafeElementAtHelper<nsMsgDBEnumerator*, nsTArray_Impl<nsMsgDBEnumerator*, nsTArrayInfallibleAllocator> >> = {<No data fields>}, <No data fields>}, <No data fields>}, m_cacheSize = 512}, <nsINewsDatabase> = {<nsISupports> = { _vptr.nsISupports = 0xc73fc14}, <No data fields>}, m_readSet = 0x0} I am attaching the full session log in the attachment.
Assignee | ||
Comment 2•11 years ago
|
||
Additional Info. I am afraid that I missed the important information from the valgrind client check done by VALGRIND_CHECK_MEM_IS_DEFINED(). It prints the valgrind stack trace as well as the information about where the uninitialized value was created. In this case, it seems ==18740== Address 0xbe9e10d0 is on thread 1's stack ==18740== Uninitialised value was created by a heap allocation ==18740== at 0x40273B8: malloc (vg_replace_malloc.c:270) ==18740== by 0x404BE13: moz_xmalloc (mozalloc.cpp:54) ==18740== by 0xA4563BF: nsNewsDatabaseConstructor(nsISupports*, nsID const&, void**) (mozalloc.h:200) ==18740== by 0xAD86533: mozilla::GenericFactory::CreateInstance(nsISupports*, nsID const&, void**) (GenericFactory.cpp:16) ==18740== by 0xADE87FF: nsComponentManagerImpl::CreateInstanceByContractID(char const*, nsISupports*, nsID const&, void**) (nsComponentManager.cpp:1034) ==18740== by 0xAD7AD72: CallCreateInstance(char const*, nsISupports*, nsID const&, void**) (nsComponentManagerUtils.cpp:138) ==18740== by 0xAD7ADD9: nsCreateInstanceByContractID::operator()(nsID const&, void**) const (nsComponentManagerUtils.cpp:178) ==18740== by 0xA461175: nsCOMPtr<nsIMsgDatabase>::assign_from_helper(nsCOMPtr_helper const&, nsID const&) (nsCOMPtr.h:1226) ==18740== by 0xA6DF18F: nsMsgDBService::CreateNewDB(nsIMsgFolder*, nsIMsgDatabase**) (nsCOMPtr.h:605) ==18740== by 0xA889E3C: nsMsgNewsFolder::GetDatabase() (nsNewsFolder.cpp:351) [ ... ] Full excerpt is attached. TIA
Comment 3•11 years ago
|
||
The cause of this appears to be straightforward. nsMsgDatabase has an m_lastUseTime member which does not appear to get initialised. So if someone calls GetLastUseTime before SetLastUseTime or GetMsgHdrForKey then they will get an uninitialised read. The fix would appear to be to initialise m_lastUseTime, but I'm not sure whether to initialise it to 0 or to gLastUseTime. Also, thanks to chiaki for finding this bug.
Component: Untriaged → Database
OS: Linux → All
Product: Thunderbird → MailNews Core
Hardware: x86 → All
Assignee | ||
Comment 4•11 years ago
|
||
Hi, I have been running "make mozmill" test with the attached patch. I have figoured out that the m_lastUseTime is not set when a new DB is created and thus the problem described by Neil. The patch seems to have cured the particular problem. The test is still running (sinceI am running TB under valgrind, it is slow.) There were three such un-initialised cases noticed during the previous test run. One down, two more to go. So if the remaining two would not show up in the log, I think, we can claim we fixed it. I would like to thank Neil again for the helpful tips for calling DumpJSStack() in strategic place. TIA
Assignee | ||
Comment 5•11 years ago
|
||
Hi, (In reply to ISHIKAWA, chiaki from comment #4) > Created attachment 720200 [details] [diff] [review] > One down, two more to go. > So if the remaining two would not show up in the log, I think, we can claim > we fixed it. The particular problem about uninitialized value usage is gone. I have finished running "make mozmill" of TB under valgrind/memcheck. (I have noticed that another bug is now recognized in check() routine, which probably was masked by this earlier occurrence of the uninitialised value error. I will file the bugzilla entries for them later). So who should be asked to do the review for this patch? TIA
Assignee | ||
Comment 6•11 years ago
|
||
Comment on attachment 720200 [details] [diff] [review] A patch to call RemeberLastUseTime in CreateNewDB I am setting David as the reviewer since the particular position seems to be last modified by him. The patch fixed the particular issue of undefined value accessed during "make mozmill" run of comm-central TB under valgrind/memgrind. TIA
Attachment #720200 -
Flags: review?(mozilla)
Assignee | ||
Updated•11 years ago
|
Assignee: nobody → ishikawa
Assignee | ||
Comment 7•11 years ago
|
||
I have updated the attached patch in proper git format. TIA
Attachment #720200 -
Attachment is obsolete: true
Attachment #720200 -
Flags: review?(mozilla)
Attachment #739010 -
Flags: review?(mozilla)
Comment 8•11 years ago
|
||
Comment on attachment 739010 [details] [diff] [review] A patch to call RemeberLastUseTime in CreateNewDB in proper git form looks good
Attachment #739010 -
Flags: review?(mozilla) → review+
Updated•11 years ago
|
Keywords: checkin-needed
Comment 9•11 years ago
|
||
https://hg.mozilla.org/comm-central/rev/1692e1675694
Status: NEW → RESOLVED
Closed: 11 years ago
Keywords: checkin-needed
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 23.0
Assignee | ||
Comment 10•11 years ago
|
||
Thank you!
You need to log in
before you can comment on or make changes to this bug.
Description
•