Closed Bug 854888 Opened 11 years ago Closed 11 years ago

Many instances of two WARNING lines from "make mozmill" run of DEBUG BUILD of TB (comm-central) : NS_ENSURE_SUCCESS(rv, false) failed with result 0x8000FFFF: file ./mozilla/content/base/src/nsContentUtils.cpp, NS_ENSURE_TRUE(pusher.Push(aBoundElement)) fa

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla23

People

(Reporter: ishikawa, Assigned: ishikawa)

Details

Attachments

(3 files, 1 obsolete file)

I am checking the messages produced by "make mozmill" test
of DEBUG BUILD version of Thunderbird (comm-central).

I try to gather various statistics from the log to figure out where I
should investigate next to make Thunderbird a more robust mail client
which I use daily for work.

(Unfortunately, for the last few months, TB has become rather prone to
random crashes. For example, in the last 10 days, it crashed three
times. This is not acceptable for a reliable mail client.  That is why
I am attempting to find the most visible smoking gun to eliminate such
a bug one by one.)

Anyway, lately, I noticed that a couple of warning lines dominate
the log captured on the tty console where "make mozmill" is invoked.

The attachment is the WARNING message lines that are sorted in the
descending order of frequency of appearances, produced by running the
following filter to the log ($1 being the captured log from the tty
console window where "make mozmill" started.)

grep  "^WARNING" $1 | grep -v "sort operation has occurred for the SQL statement" | sort | uniq -c | sort -n -r


I excerpt the top five lines.

   6496 WARNING: NS_ENSURE_TRUE(pusher.Push(aBoundElement)) failed: file /COMM-CENTRAL/comm-central/mozilla/content/xbl/src/nsXBLProtoImplMethod.cpp, line 308
   6495 WARNING: NS_ENSURE_SUCCESS(rv, false) failed with result 0x8000FFFF: file /COMM-CENTRAL/comm-central/mozilla/content/base/src/nsContentUtils.cpp, line 3043
    670 WARNING: NS_ENSURE_TRUE(NS_SUCCEEDED(rv) && subjPrincipal) failed: file /COMM-CENTRAL/comm-central/mozilla/docshell/base/nsDocShell.cpp, line 8394
    592 WARNING: NS_ENSURE_TRUE(ps) failed: file /COMM-CENTRAL/comm-central/mozilla/editor/libeditor/html/nsHTMLEditor.cpp, line 2962
    577 WARNING: Subdocument container has no frame: file /COMM-CENTRAL/comm-central/mozilla/layout/base/nsDocumentViewer.cpp, line 2388


Note the first two lines. Note the big counts. (The numbers are close
to x 10 times the third one.)  
These come almost always in pairs.  (The only time one of them appears
independently of others is there was a timeout of some testing and so
very unusual situation.)  I regard them to happen almost always in
pairs, the warning from nsContentUtils.cpp comes immediately before
the warning from nsXBLProtoImplMethod.cpp like the following.

++DOMWINDOW == 30 (0xbac9fe0) [serial = 30] [outer = 0xa066f60]
WARNING: NS_ENSURE_SUCCESS(rv, false) failed with result 0x8000FFFF: file /COMM-CENTRAL/comm-central/mozilla/content/base/src/nsContentUtils.cpp, line 3043
WARNING: NS_ENSURE_TRUE(pusher.Push(aBoundElement)) failed: file /COMM-CENTRAL/comm-central/mozilla/content/xbl/src/nsXBLProtoImplMethod.cpp, line 308
++DOCSHELL 0xba8b1c8 == 13 [id = 13]

I am surprised that so many WARNING lines are produced during the
test. The while log file is 8.48 MiB (after removing CR). I am afraid
that the above two lines occupy close to 2MiB among it (!)  (Say, one
line is 150 bytes, and we have about 13000 lines, 150 * 13000 =
1,950,000.)

I think either 

 - the WARNING messages are produced for not so severe conditions that
   warrants warning in reality,

 - that the WARNING on only one of them would be enough (and still
   the warning is produced too many times IMHO),

 - that the two warnings always appear in pairs, there are some
   unoptimized error handling path (I mean if we know there is this
   type of error which would cause another error almost in pairs
   always, we can skip the latter operation in advance),

 - and there are very long runs of these WARNING pairs (more than 40+
   in one instance without any other messages!), that there
   are some strange error processing paths indeed. 
 
To me this suggest that the processing related to these warnings are
either flawed outright, not efficient, and there are real-world
conditions that can be handled better (pure guess).  (Or that the warnings
are produced when they should not be because, for example, they are
produced during initialization phase or something where they should be
suppressed? I see long runs of warnings when "make mozmill" invoked
Thunderbird, and before doing much else. for example.

I wonder someone in the know who is familiar with the processing about
these two WARNING messages can check whether there is some oversight
in the code paths that produces such voluminous warning lines.

TIA

If someone is interested, 
I can possibly upload gzipped log file (which is about 500KB)
$ gzip -c /FF-NEW/log60.txt | wc
   1985   11942  524808
Chiaki, have you reported any of the crashes as bugs and submitted the crash IDs?
(In reply to :aceman from comment #1)
> Chiaki, have you reported any of the crashes as bugs and submitted the crash
> IDs?

I thought the crashreporter submit the information.
But do we need to additionally file a bug about this to obtain an ID?
Hmm... What happens if we don't file a bug on each crashreport?
The crashreporter submits the crash data. However to work on it, we need to have a bug. It is best if the crash reporter can do it so he can describe what he was doing when it happened. If he does not do it, Wayne may come around and create a bug from the crash report, but he does not have any context.
(In reply to :aceman from comment #3)
> The crashreporter submits the crash data. However to work on it, we need to
> have a bug. It is best if the crash reporter can do it so he can describe
> what he was doing when it happened. If he does not do it, Wayne may come
> around and create a bug from the crash report, but he does not have any
> context.

(In reply to Wayne Mery (:wsmwk) from comment #4)
> https://support.mozillamessaging.com/en-US/kb/mozilla-crash-
> reporter#w_viewing-crash-reports


Thank you. The two of the three crashes in last two weeks are visible from
troubleshooting page. So I suspect that I need to create a bugzilla entry
with more info to them. Maybe Monday. [That computer is at the office.]

The other crash, unfortunately, is nowhere in the troubleshooting page.
I suspect that it is one of those situation where the infrastructure for
crashreport is no longer there and thus the crash info is lost :-(

TIA
I checked and we can disable the printing of warning from 
nsCxPusher::Push() in 
/content/base/src/nsContentUtils.cpp
and still get enough warnings (see for example, the attachment in
https://bugzilla.mozilla.org/attachment.cgi?id=731268
There I have produced a "make mozmill" run of thunderbird (comm-central) by
using this patch, and the warning lines are reduced and we still get
the warning about Push failure.

The session log shrunk from 8.53 MiB to 7.46MiB.
Attachment #731433 - Flags: review?(ehsan)
But how do you know this change is right? Maybe the warning should really appear as something is wrong. Also, if this gets accepted, put a comment above the code to say that we really do not want the warning. Otherwise somebody will come around and convert it back to NS_ENSURE as it is shorter :)
Well, the way, I see it, this warning ALWAYS appeared as a pair with the
remaining WARNING line output in
WARNING: NS_ENSURE_TRUE(pusher.Push(aBoundElement)) failed: file /COMM-CENTRAL/comm-central/mozilla/content/xbl/src/nsXBLProtoImplMethod.cpp, line 308
and, so I don't think we will lose anything if we leave at least one of them.
Either the other one or this one.
Right putting the comment may be a good idea.

Probably, the call of Push in nsXBLProtoImplMethod.cpp may need better
check before calling this, but I have no idea.

TIA
Comment on attachment 731433 [details] [diff] [review]
Reduce the most frequent pair of  WARNING lines by half (for Push failures)

Review of attachment 731433 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not the right reviewer here, try smaug?
Attachment #731433 - Flags: review?(ehsan) → review?(bugs)
Comment on attachment 731433 [details] [diff] [review]
Reduce the most frequent pair of  WARNING lines by half (for Push failures)

Hmm, well, ok, maybe we can finally do this. The warning has been super
useful for me, but perhaps I can add some DEBUG_smaug code there.

Coding style is
  if(!NS_SUCCEEDED(rv)) {
    return false;
  }
Attachment #731433 - Flags: review?(bugs) → review+
And just use NS_FAILED, not !NS_SUCCEEDED
smaug, maybe you could tell us what the warning means. Does TB do something wrong when we see it a lot?
I've seen the warning happening with Firefox too. Something in XBL tries to push
JSContext to stack at wrong point.

The cases when the warning is useful is when debugging some event listener handling related problems.
I have updated the patch to include the comment and #if/#else/#if to see
if this is what was meant.

> Something in XBL tries to push JSContext to stack at wrong point.

I think something is happening in TB code too.
Can you see any pattern in a short TB session TB log, posted to
https://bug854172.bugzilla.mozilla.org/attachment.cgi?id=733736

Around the middle part, you see about a two dozen lines of
WARNING: NS_ENSURE_TRUE(pusher.Push(aBoundElement)) failed: file /COMM-CENTRAL/comm-central/mozilla/content/xbl/src/nsXBLProtoImplMethod.cpp, line 308  

Without the patch here, each of these lines are preceded by the extra debug output line.

I think these push failures occur near the end of the TB session : note that
there are finishing sequences not much down the log.

But I am not familiar enough with the TB and interpreter system to figure out
what is going on. 

For a USER of TB, this is a little beyond my head...

TIA
Attachment #731433 - Attachment is obsolete: true
Attachment #733839 - Flags: review?
Attachment #733839 - Flags: review? → review?(bugs)
Comment on attachment 733839 [details] [diff] [review]
Reduce the most frequent pair of WARNING lines by half (for Push failures)  Take-2

s/2012/2013/
And I'll remove the comment
Attachment #733839 - Flags: review?(bugs) → review+
s/ralse/false/ ;)
(In reply to Olli Pettay [:smaug] from comment #17)
> Created attachment 734377 [details] [diff] [review]
> patch

Thank you. I added the checkin-needed keyword since it was already reviewed.
(not sure if should obsolete my last patch, though...)
Keywords: checkin-needed
Probably mark it obsolete so that it is obvious which patch to check in.
Component: Untriaged → General
OS: Linux → All
Product: Thunderbird → Core
Hardware: x86 → All
I'm pushing the patch
https://hg.mozilla.org/mozilla-central/rev/a37f106cf4ec
Assignee: nobody → ishikawa
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla23
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: