Closed Bug 1014350 Opened 10 years ago Closed 10 years ago

TEST-UNEXPECTED-FAIL | ../../../../mailnews/resources/logHelper.js | Error console says [stackFrame uncaught exception: 2147500037] - See following stack: in several files

Categories

(MailNews Core :: Testing Infrastructure, defect)

defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED
Thunderbird 32.0

People

(Reporter: jcranmer, Assigned: rkent)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 3 obsolete files)

Bug 997440 is almost certainly the cause here, although I haven't bisected to be sure.

It looks like the failure is due to how the asyncTestUtils.js code is written: when I took a failing tests and bisected it to find out what caused the errors to be reported, async_run_tests([]) was sufficient to cause failures.

I am guessing that many (most? all?) of these failures are due to StopIterations being thrown, a result of using old-style iterators. We have roughly three options:

1. Switch to new iterators.
2. Tell the log helper to ignore these error messages.
3. Rope in xpconnect people to see if they can not report this error.
(In reply to Joshua Cranmer [:jcranmer] from comment #0)
> 1. Switch to new iterators.

You almost certainly want to do this, or catch the exceptions, or something. There are similar fixes that landed with the bug.

> 3. Rope in xpconnect people to see if they can not report this error.

This is a pretty intractable thing to do at the moment.
I tried a try push to see if a simple fix works, but no luck (https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=4568be91c349)

I'll post the Mozilla patch here that seems to fix my local build.
Oops, that was my try patch, here is just the mozilla patch.
Attachment #8427430 - Attachment is obsolete: true
And if that simple fix is a no-go
We are faced with:
Not so simple when the situation is TinderboxPrint: xpcshell: 2066/103 CRASH
103 individual test fixes seems beyond the pale.
What I have not figured out yet is why Thunderbird tests fail but not Mozilla.
(In reply to Kent James (:rkent) from comment #16)
> What I have not figured out yet is why Thunderbird tests fail but not
> Mozilla.

The easy answer is that most of our tests indirectly pull in a helper which rates any error being reported to the error console as a test failure. Most of mozilla's tests don't.
Does anyone know how it was decided that any error thrown by an xpconnect component would log to the error console? Was that a carefully considered question, or was it just the side effect of some bug? I don't see any discussion of this issue directly in bug 997440 that is the presumed place this was added.

Returning a failure from a component can be a legitimate method of communication between methods, as with directory providers that throw an error if they do not process a particular directory type. That's how the C++ directory providers work. Now though the .js providers must instead return null (like in http://mxr.mozilla.org/comm-central/source/mozilla/b2g/components/DirectoryProvider.js#125) but that actually is a different response, forcing the users to check both error codes and null returns. I thought that the goal was to allow js components to do everything C++ could do, but this is a backwards step.

In any case the patch I suggested seems correct to me. Should we go ahead and try to convince the mozilla powers-that-me to accept it? I've tried to imagine workarounds and they all seem like kludges.

Also, it would be good if someone could point out what was wrong with my try run where I tried to apply a mozilla patch. It would be good to do a try run to confirm that my fix works.
(In reply to Kent James (:rkent) from comment #18)

> Also, it would be good if someone could point out what was wrong with my try
> run where I tried to apply a mozilla patch. It would be good to do a try run
> to confirm that my fix works.

--hgtool1 option should be also removed?
I believe asuth originally added the error handling as part of gloda tests (though I could be wrong). This was to ensure that any unforeseen errors got caught, and not just hidden away in the stack without causing any test failures.

I think that's a very good thing to do.

Your patch actually breaks standard xpcom expectations - that we should throw an error, rather that just return null. Although in this case, it just so happens that both are checked for in the one instance where I know this is called from, so it gets away with it.

However, I would also query that if javascript is throwing a valid error result, which is handled in c++, then that shouldn't be outputting to the console, and this would seem like a core bug.
Yes, logged errors being failures as likely added during gloda or subsequent mozmill test efforts.  There's a lot of comments in logHelper.js if the rationale is of interest.

I'm missing most of the context, but in ye olden times, this situation seems like it would be the result of setting the "dom.report_all_js_exceptions" preference to true.  I understand massive changes in how JS exceptions passed to/through XPConnect have occurred not to mention massive overhauls in wrappers, and a simple DXR search on comm-central suggests no one is setting this, so I suspect that's not a helpful observation, but :standard8 cc'ed me and I want to at least seem like I wanted to be helpful :)
(In reply to Kent James (:rkent) from comment #18)
> Does anyone know how it was decided that any error thrown by an xpconnect
> component would log to the error console? Was that a carefully considered
> question, or was it just the side effect of some bug? I don't see any
> discussion of this issue directly in bug 997440 that is the presumed place
> this was added.

The original purpose was to log JS errors that would otherwise be missed: JS isn't statically compiled, so a lot of "basic" errors manifest as something like a SyntaxError or a TypeError thrown at runtime. For some reason, in bug 997440, the rule that errors would not be reported if there was another script on the stack was changed. Perhaps because it was too hard to continue those scripts, but I hope bholley can answer this question better.

(In reply to Mark Banner (:standard8) from comment #20)
> I believe asuth originally added the error handling as part of gloda tests
> (though I could be wrong). This was to ensure that any unforeseen errors got
> caught, and not just hidden away in the stack without causing any test
> failures.
> 
> I think that's a very good thing to do.

I'd love to see core xpcshell make reports to the error console be a test failure, since there is fairly often enough reports to the error console that break a large fraction of comm-central tests that get ignored on mozilla-central due to very few tests checking this.
Flags: needinfo?(bobbyholley)
From the try run https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=e2f7c3d29cb3 the vast majority of the xpcshell failures are coming from the throw in head.js which was fixed in the patch.

I've looked at a few others:

1) test_compactFailure.js is purposely throwing errors. We would need to add some sort method to prevent those from being reported.

2) test_attachment_size.js is reporting directory provider errors, caused by adding mailTestUtils.registerUMimTypProvider  But I cannot figure out which provider is causing this. The js stack is empty when you break on the error, so it is REALLY hard to figure out which js caused this. Does anyone know how to get any debugger information of xpconnect items that would help pinpoint the js file that is being called?
So...  The old setup was to not report errors if there was another script on the stack on the same JSContext.  And then XPConnect had some hackery to look for existing JSContexts on the stack.

It might make sense to partially restore the old behavior by examining the full runtime stack, not just the stack of the JSContext.  But we have to be careful about things like set-aside frame chains.

Also, note that the old xpconnect behavior leads to unhandled exceptions being swallowed altogether in many cases, which is also not desirable, so it's sort of a lose-lose situation.  :(
A possible compromise I raised on IRC is this: don't report exceptions if they are the Cr.* error codes. That should fix the head.js issue and may fix the other scenarios here.
I'm pretty opposed to doing any extra hacking on the XPCWrappedJS error reporting machinery until we fix bug 981187. The setup is utterly broken right now, and perturbing it to fix one thing always causes other things to break. So I only want to perturb it with patches that move us away from the existing guessing-game and towards a setup that actually makes sense.

Long story short - file any wishlist items as bugs depending on bug 981187, and hack around it at the harness level in the mean time. :-(
Flags: needinfo?(bobbyholley)
I think it'd be reasonable to have logHelper do a regexp and ignore the Cr.*-looking errors as :jcranmer supposes in comment 25.

Ex: modify http://dxr.mozilla.org/comm-central/source/mailnews/test/resources/logHelper.js#59 to be:

      // meh, let's just use mark_failure for now.
      // and let's avoid feedback loops (happens in mozmill)
      if ((aMessage instanceof Components.interfaces.nsIScriptError) &&
        (!/stackFrame uncaught exception: \d+/.test(aMessage.errorMessage)) && // NEW LINE
        (!aMessage.errorMessage.contains("Error console says"))) {
        mark_failure(["Error console says", aMessage]);
      }


The intent was to catch things we'd miss, not require error reporting purity.
Uh, although if the "["/"]" stuff is part of the actual exception message, I'd include them in the regexp to avoid accidentally skipping other exceptions that somehow start with a number.  If it's not, some other additional end constraint would be advisable.
See https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=4bbe6ca5857a for a run simply disabling the failure-in-console-error. I did not incorporate the suggestions from comment 27 (yet)

There seem to be some new mozilla errors occurring - which is why this is urgent to get done.
Attached patch 1014350a.patch (obsolete) — Splinter Review
Detect and accept specific XPCOM errors
Attachment #8427431 - Attachment is obsolete: true
Attached patch 1014350a.patchSplinter Review
The only way forward at this point is to work around some of the errors that we are seeing. See try run at https://tbpl.mozilla.org/?tree=Thunderbird-Try&rev=4742bbf0bc46
Attachment #8428264 - Attachment is obsolete: true
Attachment #8428305 - Flags: review?(standard8)
Attachment #8428305 - Flags: review?(Pidgeot18)
Comment on attachment 8428305 [details] [diff] [review]
1014350a.patch

Review of attachment 8428305 [details] [diff] [review]:
-----------------------------------------------------------------

Try run looks good to me, but I have some nits.

::: mailnews/test/resources/logHelper.js
@@ +19,5 @@
>  
>  var _logHelperInterestedListeners = false;
>  
>  /**
> + * let test code extend the list of allowed XPCOM errors

nit: proper capitalization and punctuation, please.

@@ +67,5 @@
>          (!aMessage.errorMessage.contains("Error console says")))
> +        {
> +          // Unfortunately changes to mozilla-central are throwing lots
> +          // of console errors during testing, so disable (we hope temporarily)
> +          // failing on XPCOM console errors (See bug 1014350)

nit: lowercase 'see' here, and add a period.

@@ +88,5 @@
> +              if (XPCOMresult)
> +                do_print("Ignoring XPCOM error: " + message);
> +              return;
> +            }
> +            else do_print("Found XPCOM error: " + message);

Nit: put the do_print on a separate line here, please.
Attachment #8428305 - Flags: review?(Pidgeot18) → review+
Pushed https://hg.mozilla.org/comm-central/rev/56b1b0815ef0
Status: NEW → RESOLVED
Closed: 10 years ago
OS: Linux → All
Hardware: x86_64 → All
Resolution: --- → FIXED
Target Milestone: --- → Thunderbird 32.0
Attachment #8428305 - Flags: review?(standard8)
Assignee: nobody → kent
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: