Closed Bug 964318 Opened 8 years ago Closed 8 years ago

Global console listener tries to handle all console messages and causes a disconnect of the application when loading web pages with lots of errors and warnings

Categories

(Testing Graveyard :: Mozmill, defect)

defect
Not set
critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: mario.garbi, Assigned: andrei)

References

Details

(Keywords: regression, Whiteboard: [mozmill-2.0.4+])

Attachments

(1 file, 5 obsolete files)

Attached file Minimized.js (obsolete) —
With the latest mozmill from github (master branch) we seem to get stuck loading the remote webpage until we hit the timeout after which mozmill is unable to close the client for whatever reasons. There are no modal windows involved here, unlike in bug 962514.

I have tried using other remote pages both existent and nonexistent but we only seem to get this freeze when using a nonexistent page of AMO. Loading an existent AMO page takes much longer than expected (5-10 sec) but doesn't cause a failure.

I have attached a minimized case that reproduces the failure 10/10 times on Linux with latest mozmill 2.0.3 from github.
I am still investigating why this happens and will provide more info as I find it.
On master we do not have Mozmill 2.0.3. It is 2.1-dev. But anyway, we will have to check if this should block 2.0.4 given that it seems to be a regression from the 2.0.3 release.
Blocks: 960495
Keywords: regression
Summary: [2.0.3] Loading remote pages with redirrect takes too long and causes client freeze → Loading remote pages with redirrect takes too long and causes client freeze
Whiteboard: [mozmill-2.0.4?]
So far we've only been able to reproduce this when opening addons.mozilla.org and only with the 2.1-dev. This is also reproducible by opening the Firefox client with mozmill -b and pasting the URL in the url bar directly, therefore not using controller.open. This might be caused by a preference that we set when we open the client with mozmill but I will continue the investigation and come with more accurate results.
Indeed I see a freeze of 10-15 secs while opening http://addons.mozilla.org from mozmill with --manual. During this time Firefox is completely unresponsive. Something is blocking the main thread.

This does not happen with a normal Firefox session.

This is the regressor in mozmill:
https://github.com/mozilla/mozmill/commit/d9100a5dd0bcc756543f510b3d0bd251904f2897

More specifically this made the error reporting work.

However, this freeze that we are seeing is because we are trying to match _every_ exception.
This includes all sorts of `JavaScript Warning` which refer to the actual content of the page.
This shows JS and CSS warnings.

For every such `warning` we have a detailed description that includes the whole minimised CSS file for the page.

I've dumped all messages. We have around 600 warnings, each once includes the whole CSS of the addons.mozilla.org page. And we try to match() these based on our regular expressions.

In total these amount to 17MB of data.
This is blocking the thread.
This is indeed blocking our 2.0.4 release. Andrei, could you have a look at it? As you say we should not catch all of them but only care about errors and exceptions on the chrome side. Most likely it is so intensive because we are creating the Regex for each message now. Can you move this up the chain so both regexes are getting defined only once? I would think this should help.
Blocks: 956426
Whiteboard: [mozmill-2.0.4?] → [mozmill-2.0.4+]
Sure.

As you said, we should only get the chrome errors and not concern ourselves at all with CSS warnings.

I'm not sure that creating the regexes only once will help. The problem is they try to parse 17MB of data (out of which 16.8MB is really out of scope for us)
Assignee: nobody → andrei.eftimie
Status: NEW → ASSIGNED
ATM we are using Services.console to check errors:
https://developer.mozilla.org/en-US/docs/XPCOM_Interface_Reference/nsIConsoleService

Which is deprecated:
> The Error Console is deprecated in Firefox, and is now only made available if you set the devtools.
> errorconsole.enabled preference to true. Use the Web Console instead, for web content, or the 
> Browser Console for chrome content.

Since its split into 2 distinct parts, one for web and the other for chrome messages, it might be worth it to use the newer Browser Console. That way we should exclude all irrelevant Web messages and should make our lives easier.

I'm checking this avenue now.
Well, the Browser console actually logs EVERYTHING: chrome + content (from all tabs) and there doesn't seem to be any easy way of turning off content logging. So this might not help us after all.

There have been avenues from separating messages between chrome and content in the old logger, most notable: bug 593540. But none has finished/shipped.

There is bug 860705 to remove content messages from the Browser console, but no work has been done there.

=====

Disappointed that we can't filter the messages properly.
I'll work on optimising what we have now.
Some data points:

While loading http://addons.mozilla.org

We try to match 124 js warnings, each takes about 190ms, which brings the total to about 23 seconds. None of these match because they are only js warnings (and we don't care about them).

Lets bring this number down.
Severity: normal → critical
OS: Linux → All
Hardware: x86 → All
Summary: Loading remote pages with redirrect takes too long and causes client freeze → Global console listener tries to handle all console messages and causes a disconnect of the application when loading web pages with lots of errors and warnings
String operations are way faster then regular expressions.

To fix this issue we filter out all content messages we don't care about before hitting the regex match().

I've seen both `[JavaScript Warning ...]` and `[JavaScript Error ...]`
This filters both out.

Instead of 190ms each step now takes about 2ms.
Both the testcase and GreeLarry.js are passing.
Attachment #8367258 - Flags: review?(hskupin)
Comment on attachment 8367258 [details] [diff] [review]
0001-Bug-964318-Discard-unwanted-messages-from-the-consol.patch

Review of attachment 8367258 [details] [diff] [review]:
-----------------------------------------------------------------

This would drop any kind of Javascript error and warning. Even for chrome and specifically for mozmill and jsbridge. An example is:

[JavaScript Error: "missing ; before statement" {file: "resource://mozmill/stdlib/utils.js" line: 15 column: 5 source: "cost Cu = Components.utils;
"}]

So when I move out the declaration of errorRegEx and make it a constant outside of the ConsoleObserver the page load for AMO works wonderful and no further hang is visible. I'm not sure why you aren't seeing this. Can you please show me what you had when you did this test?

Also we need a Mutt test for this specific case.
Attachment #8367258 - Flags: review?(hskupin) → review-
Updated patch.

Thanks Henrik for the input.
This fixes the problem without breaking our error reporting.

This is still missing a mutt test.
Attachment #8367258 - Attachment is obsolete: true
Attachment #8367298 - Flags: review?(hskupin)
Comment on attachment 8367298 [details] [diff] [review]
0001-Bug-964318-Discard-unwanted-messages-from-the-consol.patch

Review of attachment 8367298 [details] [diff] [review]:
-----------------------------------------------------------------

::: mozmill/mozmill/extension/components/handlers.js
@@ +64,5 @@
>    observe: function (aSubject, aTopic, aData) {
>      var msg = aSubject.message;
>  
> +    if (msg.contains('chrome') || msg.contains('resource') ||
> +        msg.contains('mozmill') || msg.contains('jsbridge')) {

It's better to start with mozmill and jsbridge. Also keep in mind that we only want to raise a critical framework issue for errors or exceptions. Further you should check again how those checks have to be combined. You cannot do 'or' on all of them.
Attachment #8367298 - Flags: review?(hskupin) → review-
> It's better to start with mozmill and jsbridge.
Sure np.

> Also keep in mind that we
> only want to raise a critical framework issue for errors or exceptions.
Great.
What does this have to do with this pre-filtering? We still have both regular expressions down the flow.

> Further you should check again how those checks have to be combined. You
> cannot do 'or' on all of them.
If we don't have any of these 4 strings in the message, we do not pass it further to the regexp matching. I don't understand what combination you are thinking of...
(In reply to Andrei Eftimie from comment #14)
> > Also keep in mind that we
> > only want to raise a critical framework issue for errors or exceptions.
> Great.
> What does this have to do with this pre-filtering? We still have both
> regular expressions down the flow.

I don't see a reason why we have to filter more than once on the same conditions. 

> > Further you should check again how those checks have to be combined. You
> > cannot do 'or' on all of them.
> If we don't have any of these 4 strings in the message, we do not pass it
> further to the regexp matching. I don't understand what combination you are
> thinking of...

No, that's not what you are testing here. If the chrome check above passes the remaining conditions are obsolete and will not play into this whole condition anymore. So if any other add-on or even Firefox itself puts an entry into the console which contains 'chrome' the if block will be entered.
(In reply to Henrik Skupin (:whimboo) from comment #15)
> (In reply to Andrei Eftimie from comment #14)
> > > Also keep in mind that we
> > > only want to raise a critical framework issue for errors or exceptions.
> > Great.
> > What does this have to do with this pre-filtering? We still have both
> > regular expressions down the flow.
> 
> I don't see a reason why we have to filter more than once on the same
> conditions. 
> 
> > > Further you should check again how those checks have to be combined. You
> > > cannot do 'or' on all of them.
> > If we don't have any of these 4 strings in the message, we do not pass it
> > further to the regexp matching. I don't understand what combination you are
> > thinking of...
> 
> No, that's not what you are testing here. If the chrome check above passes
> the remaining conditions are obsolete and will not play into this whole
> condition anymore. So if any other add-on or even Firefox itself puts an
> entry into the console which contains 'chrome' the if block will be entered.

Yes, so why not keep the more structured regexp test _when_ its needed.
We filter out most (maybe all) unwanted messages. with a simple and fast method. Doesn't have to be perfect. 

The more expensive method gets called fewer times, but ensures that the messages comply to what we look for.

I don't see why we should try cramming the regex into an overly complicated if-branch maze.

Or maybe I misunderstand what you want here.
Attached patch 3.patch (obsolete) — Splinter Review
This might not be polished as it should.
Remote test now based on the testcase from this bug.
We have a python test that asserts that both tests from the JS file pass.
The 2nd one should not pass when we timeout.

I haven't had time to test this thoroughly.
Attachment #8367298 - Attachment is obsolete: true
Attachment #8367427 - Flags: review?(hskupin)
Attached patch 3.patch (obsolete) — Splinter Review
Uploaded the old patch by mistake.
This is the good one.
Attachment #8367427 - Attachment is obsolete: true
Attachment #8367427 - Flags: review?(hskupin)
Attachment #8367429 - Flags: review?(hskupin)
Comment on attachment 8367429 [details] [diff] [review]
3.patch

Review of attachment 8367429 [details] [diff] [review]:
-----------------------------------------------------------------

Andrei, I hope you don't mind when I update the patch so we can get it landed today.

::: mozmill/mozmill/extension/components/handlers.js
@@ +64,5 @@
>    observe: function (aSubject, aTopic, aData) {
>      var msg = aSubject.message;
>  
> +    if (msg.contains('mozmill') || msg.contains('jsbridge') ||
> +        msg.contains('chrome') || msg.contains('resource')) {

I think we should return early here as nesting the following code deeper.

Also if we would exclude everything which is not related to mozmill and jsbridge, we wouldn't have to test for this again in the regex. But yes, I agree that we can leave the remaining parts in the regex.

::: mutt/mutt/tests/python/test_logged_messages.py
@@ +7,5 @@
> +
> +import mozmill
> +
> +class TestFrameworkFailure(unittest.TestCase):
> +    def test_logged_messages(self):

This test should be part of the test_console_errors.py file.
Attachment #8367429 - Flags: review?(hskupin) → review-
Attached patch Patch (updated)Splinter Review
This is an update to Andrei's patch which works the same way, and keeps most of formerly existent code around. I think it is totally enough to check for mozmill and jsbridge only first. There is no need for chrome and resource.
Attachment #8366021 - Attachment is obsolete: true
Attachment #8367429 - Attachment is obsolete: true
Attachment #8367574 - Flags: review?(dave.hunt)
Comment on attachment 8367574 [details] [diff] [review]
Patch (updated)

Review of attachment 8367574 [details] [diff] [review]:
-----------------------------------------------------------------

Works fine.
Attachment #8367574 - Flags: feedback+
Comment on attachment 8367574 [details] [diff] [review]
Patch (updated)

Review of attachment 8367574 [details] [diff] [review]:
-----------------------------------------------------------------

Flipping over to Andrew given that Dave is not around.
Attachment #8367574 - Flags: review?(dave.hunt) → review?(ahalberstadt)
Comment on attachment 8367574 [details] [diff] [review]
Patch (updated)

Review of attachment 8367574 [details] [diff] [review]:
-----------------------------------------------------------------

Looks fine to me, though lacking a bit of context.
Attachment #8367574 - Flags: review?(ahalberstadt) → review+
Product: Testing → Testing Graveyard
You need to log in before you can comment on or make changes to this bug.