C-C TB mochitest+valgrind uncovered uninitialized memory access.: |m_headerstartpos| is not always initialized in nsParseMailMessageState::FinalizeHeaders()
Categories
(MailNews Core :: Backend, defect)
Tracking
(thunderbird_esr68+ fixed, thunderbird73 fixed)
People
(Reporter: ishikawa, Assigned: ishikawa)
References
(Blocks 1 open bug)
Details
(Keywords: sec-moderate, valgrind)
Attachments
(6 files, 3 obsolete files)
8.24 KB,
patch
|
Details | Diff | Splinter Review | |
381.76 KB,
text/plain
|
Details | |
10.52 KB,
patch
|
Details | Diff | Splinter Review | |
415.49 KB,
text/plain
|
Details | |
10.81 KB,
patch
|
Details | Diff | Splinter Review | |
3.83 KB,
patch
|
KaiE
:
review+
mkmelin
:
approval-comm-beta+
mkmelin
:
approval-comm-esr68+
|
Details | Diff | Splinter Review |
The subject says it all
|m_headerstartpos| is not always initialized in nsParseMailMessageState::FinalizeHeaders()
The unitinitialized reference happens at
https://searchfox.org/comm-central/source/mailnews/local/src/nsParseMailbox.cpp#1223
I found this by running xpcshell-tests under valgrind.
If m_headerstartpos is not properly initialized, literally anything goes.
Header parsing, etc. get all screwed up.
The symptom of strange display of message in a message pane that shows only blank or in the middle of header, etc. could be at least partially explained by this bug. (Maybe not all the cases.)
I detected the issue using valgrind, and then began dumping m_headerstartpos after I set
export MALLOC_CHECK_=3
export MALLOC_PERTURB_=0x5A
in my environment. GNU libc malloc fills the uninitialized malloc'ed area with 0xA5.
Thus it is rather easy to find it in an otherwise normal dump.
cf. From man mallopt.
M_PERTURB (since glibc 2.4)
If this parameter is set to a nonzero value, then bytes of allocated memory (other than alloca‐
tions via calloc(3)) are initialized to the complement of the value in the least significant byte
of value, and when allocated memory is released using free(3), the freed bytes are set to the
least significant byte of value. This can be useful for detecting errors where programs incor‐
rectly rely on allocated memory being initialized to zero, or reuse values in memory that has al‐
ready been freed.
...
MALLOC_PERTURB_
Controls the same parameter as mallopt() M_PERTURB.
The tests that trigger and not trigger the symptoms in C-C portion of xpcshell-tests are as follows.
This is the output from egrep "(nrun-xpcshell-original.sh|a5a5a5a5)" log1155-xpcshell.txt
log1155-xpcshell.txt being the log output.
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/chat/components/src/test
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/chat/modules/test
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/chat/protocols/irc/test
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/chat/protocols/skype/test
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/chat/protocols/xmpp/test
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/common/saxparser/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/common/test/xpcshell
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/ldap/xpcom/tests/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mail/base/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mail/components/enterprisepolicies/tests/xpcshell
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mail/components/extensions/test/xpcshell
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mail/components/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/addrbook/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/base/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/compose/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/db/gloda/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/db/msgdb/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/extensions/bayesian-spam-filter/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/extensions/mdn/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/extensions/newsblog/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/imap/test/unit
1:14.92 pid:109013 (debug) VALGRIND: m_headerstartpos = 0xa5a5a5a5a5a5a5a5, mozstatus->value =0x0x55788019a046,m_headers.GetBuffer()=0x557880199fe0, m_envelope_pos=0x0
2:17.25 pid:110314 (debug) VALGRIND: m_headerstartpos = 0xa5a5a5a5a5a5a5a5, mozstatus->value =0x0x557082978326,m_headers.GetBuffer()=0x5570829782c0, m_envelope_pos=0x0
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/import/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/intl/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/jsaccount/test/unit
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/local/test/unit
0:09.37 pid:111723 (debug) VALGRIND: m_headerstartpos = 0xa5a5a5a5a5a5a5a5, mozstatus->value =0x0x5645cf9c8621,m_headers.GetBuffer()=0x5645cf9c85b0, m_envelope_pos=0x0
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/mime/jsmime/test
+ bash -vx /home/ishikawa/bin/nrun-xpcshell-original.sh --sequential --verbose comm/mailnews/mime/test/unit
I inserted the dump of m_headerstartpos using the patch in the attached file.
I have not fully analyzed WHEN/HOW the uninitialized access happens. But I suspect this has something to do with malformed or missing headers.
I don't want to see some script kiddies sending out such messages en masse to disrupt TB users, thus setting the security flag.
Once we can be sure this is of minor issue, then we can unflag it.
TIA
Assignee | ||
Comment 1•5 years ago
|
||
Oh, BTW, I notice that the hash calculation touches uninitialized memory region. The last hunk in the attached diff is for investigating this.
I knew that for a long time. But I let it pass since it is not unusual for some hash functions to
calculate the value using 4 or 8 octets at a time, and thus if we have interesting data which has a length of not multiple of 4, we may end up
accessing the last few uninitialized octets of a word: this happens in zip/unzip operation during |make mozmill| test
and I have suppressed the warning from valgrind after investigation revealed that it is a non-issue.
However, with this uninitialized memory access issue at hand, I investigated a bit more, and I now realized that uninitialized access during hash calcluation MAY end up generating message-ID for a message which does not have it to be attached a non-unique message ID.: that is, the message ID varies each time it is generated because of random data in uninitiailized memory. (Correct me if I am wrong here.)
This may happen when you import a message in an individual file and the message lacks a message ID.
If you include it in multiple folders, the same message will have a different message ID in different folders. So this may not be optimal.
But the practical impact of having the message with different message ID is not that great if I am not mistaken again.
HOWEVER, during investigation, I have noticed a glaring uninitialized memory creation and access in the code which I will report later. I am overwhelmed at what is going on and not sure where the problem begins, even.
Assignee | ||
Comment 2•5 years ago
•
|
||
I should probably mention this in the separate bug for the hash calculation (but the problem seems to be entangled somehow)., but
the dump of hash area shows this:
1:19.22 pid:109069 {debug} Dump:tmp_buf: totalsize = 1024
1:19.22 pid:109069 0x00: f r o m : y @ y . y 00 NL t o :
1:19.22 pid:109069 0x10: x @ x . x 00 NL s u b j e c t :
1:19.22 pid:109069 0x20: d o w n l o a d o n d e m
1:19.22 pid:109069 0x30: a n d p r o b l e m 00 NL d a t
1:19.22 pid:109069 0x40: e : F r i , 0 1 J a n 2
1:19.22 pid:109069 0x50: 0 1 0 1 2 : 0 0 - 0 5 0 0 00
1:19.22 pid:109069 0x60: NL c o n t e n t - t y p e : m
1:19.22 pid:109069 0x70: u l t i p a r t / m i x e d ;
1:19.22 pid:109069 0x80: b o u n d a r y = " - - - - P a
1:19.22 pid:109069 0x90: r t 0 + " 00 " 00 NL a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xA0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xB0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xC0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xD0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xE0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xF0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x00: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x10: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x20: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x30: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x40: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x50: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x60: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x70: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x80: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x90: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xA0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xB0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xC0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xD0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xE0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0xF0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x00: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x10: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.22 pid:109069 0x20: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x30: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x40: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x50: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x60: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x70: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x80: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x90: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xA0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xB0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xC0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xD0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xE0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xF0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x00: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x10: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x20: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x30: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x40: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x50: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x60: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x70: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x80: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0x90: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xA0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xB0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xC0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xD0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xE0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
1:19.23 pid:109069 0xF0: a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5 a5
That is the buffer sent for header-related processing has totallength (= GetSize()) of 1024 and filled with uninitialized value and the
hash is calculated over this area. Something is horribly wrong...
Assignee | ||
Comment 3•5 years ago
•
|
||
Re the tests that cause the uninitialized access of |m_headerstartpos| is as follows.
I am afraid that I ran the test partially before I ran the full test and so the same error was reported twice.
(Note in the following like '||' actually is | followed by \ and then |. I don't know what hides the \ . Is this part of HTML spec?)
Edited excerpt from egrep "(nrun-xpcshell-original.sh|a5a5a5a5|| Starting)" log1155-xpcshell.txt
1:14.90 INFO comm/mailnews/imap/test/unit/test_offlinePlayback.js | Starting doOfflineOps
1:14.92 pid:109013 (debug) VALGRIND: m_headerstartpos = 0xa5a5a5a5a5a5a5a5, mozstatus->value =0x0x55788019a046,m_headers.GetBuffer()=0x557880199fe0, m_envelope_pos=0x0
2:17.23 INFO comm/mailnews/imap/test/unit/test_offlinePlayback.js | Starting doOfflineOps
2:17.25 pid:110314 (debug) VALGRIND: m_headerstartpos = 0xa5a5a5a5a5a5a5a5, mozstatus->value =0x0x557082978326,m_headers.GetBuffer()=0x5570829782c0, m_envelope_pos=0x0
0:07.12 INFO comm/mailnews/local/test/unit/test_movemailDownload.js | Starting streamMessages
0:09.37 pid:111723 (debug) VALGRIND: m_headerstartpos = 0xa5a5a5a5a5a5a5a5, mozstatus->value =0x0x5645cf9c8621,m_headers.GetBuffer()=0x5645cf9c85b0, m_envelope_pos=0x0
Updated•5 years ago
|
Comment 4•5 years ago
|
||
I have spent 20 minutes trying to understand this code, but it's tricky.
However, it seems that most variables are initialized to zero in nsParseMailMessageState::Clear(), but m_headerstartpos isn't.
Can we add m_headerstartpos=0; and see if you can still reproduce the issue?
Assignee | ||
Comment 5•5 years ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #4)
I have spent 20 minutes trying to understand this code, but it's tricky.
However, it seems that most variables are initialized to zero in nsParseMailMessageState::Clear(), but m_headerstartpos isn't.
Can we add m_headerstartpos=0; and see if you can still reproduce the issue?
Yes, the code is tricky. I tried understand when/why the part where the uninitialized m_headerstartpos was referenced, but have not figured out yet.
Let me try the bandage of setting "m_headerstartpos=0" and how it goes.
Assignee | ||
Comment 6•5 years ago
|
||
uninitialized m_headerstartpos is gone by the initialization and it does not seem to have ill-effect on other xpcshell-tests for now.
So I will watch out over mochitest in the future.
HOWEVER, the issue I found regarding the strange access to uninitialized buffer area mentioned in comment 2 is still there.
So we need more investigation. Before I file a bug on that I found this uninitialized buffer reference happens in many tests.
I am attaching the output of the following command (log1157-xpcshell.txt is a local log filename).
egrep "({debug} Dump:tmp_buf: totalsize = |nrun-xpcshell-original.sh|Failed: 1| a5|a5a5a5a5|\| Starting)" log1157-xpcshell.txt
(Again, the double | | before |Starting| is actuall |, , then followed by another |.,
Basically, when TB hits the code path to print |totalsize =| in my added dump statement, the code tries to reference a buffer whose last part is filled with uninitialized octet, and if I am not mistaken, the hash is calculated on these uninitialized areas instead of the properly initialized data.
I am not sure if the length passed to the routine is bogus or the buffer initialization code (that fills buffer with the header data) is bogus.
I have not had the time to look at the failing tests, but there must be a pattern to the data which triggered the uninitialized buffer access problem.
I wish the code here is simplified. There seems to be a code that tries to add ">" if a mail text starts a line with "From", but the code tries to be clever and obscures the entire logic.: I don't mind a simple memcpy() to shift the line image to create a place to insert the ">" at the beginning
so that we have a simpler easy-to-understand code.: Clever and hard-to-understand code is written once, but future developers and debuggers need to understand whatever tricky stuff THOUSAND times. TB community today cannot afford that kind of clever code any more (!).
Assignee | ||
Comment 7•5 years ago
|
||
The output mentioned in comment 6.
I have slightly changed the routine to dump the buffer.
I will upload the changed patch after this message.
Assignee | ||
Comment 8•5 years ago
|
||
I think instead of passing GetSize() of buffer area that stores the header data to md5->Update routine to calculate the MD5 and end up with a missing message ID, we should probably keep track of the ending position (NUL code at the end of the last header line when the headers are parsed and stored.: NEWLINE, CR LF seem to be replaced with NUL in the code) and pass the length from the m_headerstartpos to that position.
However, as the forgotten initializaton of m_headestartpos shows there seem to a can of worms around here.
Assignee | ||
Comment 9•5 years ago
|
||
I will file a separate bug VERY SOON NOW. If someone feels like doing that, I am fine.
Assignee | ||
Updated•5 years ago
|
Assignee | ||
Comment 10•5 years ago
|
||
I filed bug 1609607 for the uninitialized buffer access.
Just to be complete, an addition here.
The output mentioned in comment 6 is a bit misleading about the files that show the symptom of uninitialized buffer access.
The following command identifies the files better.
egrep "(TEST_START|{debug} Dump:tmp_buf: totalsize = |nrun-xpcshell-original.sh|Failed: 1| a5|a5a5a5a5|\| Starting)" log1157-xpcshell.txt
Assignee | ||
Comment 11•5 years ago
|
||
This is the output mentioned in comment 10.
One thing I noticed is that there are two cases where the totalsize is 3072 ( = 1024 x 3) and
have uninitialized area at the end.
Comment 12•5 years ago
|
||
Kai: can you take a stab at a security severity rating for this bug from a Tbird POV
Updated•5 years ago
|
Assignee | ||
Comment 13•5 years ago
|
||
Hi, I am testing the local patch for this bug and bug bug 1609607.
I think I will bundle the patch for this bug in the patch for bug 1609607 since they seem to be interwinded in a sutble way
and should be applied together at once to avoid uninitialized memory access.
BTW, uninitialized case of m_headerstartpos seemed to happen only when m_envelope_pos was equal to 0.
So this suggests that the mail (header + mainbody text) start at the beginning of a file.
Setting m_headerstartpos = 0 might have been in OK in that sense.
(However, I am still a bit uneasy.: I am not sure if this takes care of a possible "From - " line at the beginning, etc.)
Assignee | ||
Comment 14•5 years ago
|
||
Here is the patch.
I intentionally kept the comment line very generic so as not to alert that this is related the sec-low bugs.
The submission to try server is
https://treeherder.mozilla.org/#/jobs?repo=try-comm-central&revision=a6155fa767844fad63ded1d0a57e33ee746b7f03
I don't see any new problems here.
(I accidentally put another patch, but it is a no brainer and should not cause any issue with the problem mentioned here.)
Comment 15•5 years ago
|
||
Assignee | ||
Comment 16•5 years ago
|
||
I will run the clang-format locally and try to make the debug print out consistent.
Comment 17•5 years ago
|
||
There's also debugging code in the patches in bug 1242030, mostly like fprintf(stderr, "(debug) ...
. Personally I'd add as little debug as possible, and I'm wondering what the #ifdef USEVALGRIND
is about. As far as I can see, that's not a flag the environment supports, so there is no point adding that code.
Assignee | ||
Comment 18•5 years ago
•
|
||
(In reply to Jorg K (GMT+1) (PTO to 26th Jan 2020, sporadically reading bugmail) from comment #17)
There's also debugging code in the patches in bug 1242030, mostly like
fprintf(stderr, "(debug) ...
. Personally I'd add as little debug as possible, and I'm wondering what the#ifdef USEVALGRIND
is about. As far as I can see, that's not a flag the environment supports, so there is no point adding that code.
Good catch.
ifdef USEVALGRIND is a cruft left over when I used valgrind-supplied library routines to check the "definedness" of a memory location.
I no longer use it in this patch. So I am creating a new local patch in which I took it out and also changed one other |ifdef USEVALGRIND| to |ifdef DEBUG|.
The patched source tree was checked by |../mach clang-flow -p ...|.
As for bug 1242030, I have tried a bit locally to reduce the open-coded debug printf statement by creating a few macros and functions so as not to clutter up the original source code superficially, but the problem is that there are SIMPLY TWO FEW CHECKS for low-level I/O failures,
I need to add the check and somehow transmits the failure status to the calling routine (and to the debugging console or human debugger) in some way or the other.: This latter dump is very important for me to analyze OSX and Windows failure because I am developing patches under linux only. Any I/O related failures that I can't reproduce under linux need to be analyzed by dumping detailed context on tryserver runs.
Anyway, I will try a bit more in that angle.
Thank you again for your comments.
Comment 19•5 years ago
|
||
Hello Chiaki, thanks a lot for your very helpful detective work and preparation. However, I'd prefer to drop most of your patch, and use a fix that is as minimal as possible. I think most of such debugging output should usually be limited to local testing, or to times of active development, but should be avoided for production code.
I've also done some testing. I've temporarily used a local boolean variable, to track whether m_headerstartpos has already been initialized, or not, at the time we reach the delta value calculation.
I found an existing test in which m_headerstartpos remains uninitialized:
comm/mailnews/local/test/unit/test_nsIMsgParseMailMsgState.js
Then I've set m_headerstartpos=0, then I looked at the calculation of the delta value. It correctly calculated the beginning of the X-Mozilla-Status value. I conclude that the calculation code is correct, and we simply have to init m_headerstartpos=0.
Looking at the code that uses the delta value, it seems like past developers had already noticed some problems with that value, given their (hack) attempt to avoid very large values.
I suggest that we add an assertion, so we can notice in debug builds and tests if the delta has an unexpected big value.
I'm renaming the delta variable slightly, to ensure its purpose is more obvious.
Only if mozStatus is non-null the current code calculates the delta value, however, if the delta value is smaller then 0xffff, it's always used to call m_newMsgHdr->SetStatusOffset(delta). That seems wrong, I suggest to only set the value if mozStatus is non-null.
Also, it's unclear why the code calls SetStatusOffset(), and then immediately reads the value back using GetStatusOffset(). I don't know if there's a desired side effect triggered by that, but I'm guessing this was done simply for the following assertion check. I've added a TODO comment to research this detail, and potentially remove those lines. But for now, I want to focus on the bug, and those lines cannot hurt, even if they are unnecessary.
Regarding the change from GetSize() to GetBufferPos(), it seems you're right, but let's move this change to bug 1609607.
Updated•5 years ago
|
Comment 20•5 years ago
|
||
Comment 21•5 years ago
|
||
Comment 22•5 years ago
|
||
fix typo
Assignee | ||
Comment 23•5 years ago
|
||
(In reply to Kai Engert (:KaiE:) from comment #19)
Hello Chiaki, thanks a lot for your very helpful detective work and preparation. However, I'd prefer to drop most of your patch, and use a fix that is as minimal as possible. I think most of such debugging output should usually be limited to local testing, or to times of active development, but should be avoided for production code.
Thank you for the new patch.
I will take note.
Regarding the change from GetSize() to GetBufferPos(), it seems you're right, but let's move this change to bug 1609607.
Actually, I meant to put the patch to bug 1609607 and not this one. My mistake.
(In reply to Kai Engert (:KaiE:) from comment #21)
Comment on attachment 9122915 [details] [diff] [review]
1608539-v2.patchChiaki, is my assumption correct, that this patch is sufficient to avoid the
valgrind warning?
I will check the valgrind run with this patch.
Magnus, this patch doesn't change the calculation. The primary fix is to
provide a variable initialization. In addition, it renames a variable, it
adds an assertion, it changes the "if statement" for using the delta value,
and adds a comment.
Comment 24•5 years ago
|
||
Assignee | ||
Comment 25•5 years ago
|
||
(In reply to Jorg K (GMT+1) (no longer working on Thunderbird) from comment #24)
Comment on attachment 9122925 [details] [diff] [review]
...
Did clang-format indent it this way? I could be wrong but I think it would
have just indented that by four spaces.
I noticed that too.
I am testing the attached reformatted patch on local PC.
The whole valgrind run of xpcshell-test takes about 13 hours on my PC and it is into 6 hours now. So I can report the result either tonight or tomorrow morning.
(I kept the indentation of Kai's comment. clang-format is too eager to fill comment.
It should have a simple escape such as the paragraph starting with "leading space //\n" and
ending with "leading space//\n" is untouched. |// clang-format off|, and |// clang-format on| are rather ugly. But I digress. )
BTW, I have submitted bug 1611567, another uninitialized memory access. I wonder if people can take a look at it.
(There is yet another one from my xpcshell-test. I have yet to determine if that is a false positive or not. Probably positive one.)
TIA
Assignee | ||
Comment 26•5 years ago
|
||
Good news. With Kai's patch (reformatted), there was no uninitialized memory access due to the bug here, it seems.
I could execute more tests by enlarging timeout values for some more tests by inserting
requesttimeoutfactor = 25
or similar or larger values in relevant xpcshell.ini .
But that resulted in longer test elapsed time obviously since I need to run tests sequentially so as not to intermix the logs from different tests.
From my log: it took almost 13 hours.
ENDING_SEC=$(date +"%s")
++ date +%s
+ ENDING_SEC=1579944658
echo "It took " $((ENDING_SEC - STARTING_SEC)) " seconds."
+ echo 'It took ' 46823 ' seconds.'
It took 46823 seconds.
It would be nice to have a valgrind run under linux x64 from time to time.: maybe twice a month on the tryserver.
Updated•5 years ago
|
Comment 27•5 years ago
|
||
Assignee | ||
Comment 28•5 years ago
|
||
(In reply to Jorg K (GMT+1) (no longer working on Thunderbird) from comment #27)
Comment on attachment 9123029 [details] [diff] [review]
Reformatted Kai's patch.Review of attachment 9123029 [details] [diff] [review]:
::: mailnews/local/src/nsParseMailbox.cpp
@@ +1318,3 @@if (!m_IgnoreXMozillaStatus) { // imap doesn't care about // X-MozillaStatus
// clang-format off
Please don't use clang-format off for comments!
It's bad enough we use it here and there for code, but there is no need for
comments. Just let is reformat the comment, and tweak it later if it needs
further tweaking.
I can certainly revert the change re comment part.
However, I am a bit puzzled from the general point of workflow.
The way we use |../mach clang-format -p comm/directories.../... | for formatting patch is
after we modify source files (or apply a patch to them) then run the clang-format and then
create the patch. Correct?
But this means that
(1) if we run clang-format and come up with a change/patch and commit it, then
(2) tweak the comment part manually later as you suggest
and commit the change to the comment part later,
(3) the next person who touches the file and runs clang-format for her/his patch
will mess up the comment part modified manually in the step (2) above, does't s/he?
In a big file after an extensive change, comments may be messed up from the viewpoint of
clang-format.
And to be frank, someone doing the gratuitous formatting as scout-like cleanup,
when s/he notices the incorrect formatting of a previous patch as s/he creates a new patch
for further modification, may not care about letting clang-format to
fill the previous patch's comment paragraphs and mess up the intention of previous
manual tweak.
This is not an ideal work flow IMHO.
Just a thought, but I find this an obnoxious problem.
Comment 29•5 years ago
|
||
Chiaki, thank you very much for confirming the patch passes valgrind.
This updated patch reformats all of the code, including the comment.
No functional change, carrying forward r=mkmelin.
Comment 30•5 years ago
|
||
Comment 31•5 years ago
|
||
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Updated•5 years ago
|
Comment 32•5 years ago
|
||
Thunderbird 73.0b2:
https://hg.mozilla.org/releases/comm-beta/rev/432f80bf179a1c0d7d5996ed879397cfa263918d
Updated•5 years ago
|
Comment 34•5 years ago
|
||
uplift |
Updated•5 years ago
|
Updated•5 years ago
|
Updated•4 years ago
|
Description
•