Closed Bug 594145 Opened 14 years ago Closed 14 years ago

Intermittent failures in browser_HUDServiceTestsAll.js | Four children in output - Got 5, expected 4 and html page is logged - Didn't expect -1, but got it and javascript is logged - Didn't expect -1, but got it and log() is logged

Categories

(DevTools :: General, defect)

defect
Not set
normal

Tracking

(blocking2.0 betaN+)

RESOLVED FIXED
Tracking Status
blocking2.0 --- betaN+

People

(Reporter: pcwalton, Assigned: msucan)

References

Details

(Keywords: intermittent-failure)

Attachments

(1 file, 3 obsolete files)

Occasionally, in the basic network logging test in the Web Console, the loading of the HTML page itself fails to trigger a log message, while the resources do trigger log messages. This is causing random test failures, so I will comment it out for the time being.
This looks like the following:

INFO | runtests.py | Running tests: end.
mochitest-browser-chrome failed:
TEST-UNEXPECTED-FAIL |
chrome://mochikit/content/browser/toolkit/components/console/hudservice/tests/browser/browser_webconsole_basic_net_lo
gging.js | Four children in output - Got 2, expected 5
TEST-UNEXPECTED-FAIL |
chrome://mochikit/content/browser/toolkit/components/console/hudservice/tests/browser/browser_webconsole_basic_net_lo
gging.js | html page is logged - Didn't expect -1, but got it
TEST-UNEXPECTED-FAIL |
chrome://mochikit/content/browser/toolkit/components/console/hudservice/tests/browser/browser_webconsole_basic_net_lo
gging.js | Test timed out
make: *** [mochitest-browser-chrome] Error 1
Baking a temporary fix for this on try as changeset 028c7eccd0d1.
Depends on: 581069
Blocks: devtools4b7
Summary: Random test failure in Web Console: basic network logging fails to pick up the HTML page → In the basic network logging test of the Web Console, the HTML page often isn't picked up
Whiteboard: randomorange
Removing "randomorange" from the whiteboard as this hasn't been seen in mozilla-central, only in the split-up test in bug 581069.
Adding randomorange back; this has been seen in the wild.
Assignee: nobody → pwalton
Status: NEW → ASSIGNED
Whiteboard: randomorange
The problem here is that our observer picks up the network event, but it finds no console to deliver it to. (To deliver the message, it uses getHudIdByWindow().)
(In reply to comment #5)
> The problem here is that our observer picks up the network event, but it finds
> no console to deliver it to. (To deliver the message, it uses
> getHudIdByWindow().)

So it is possible that we still have network events that cannot be traced back to the window via the Channel given to us. If this is the case - and the window we try to use here is null, or the console UI is null and the network traffic is not an image, we may be able to discern the console UI properly via the loadGroup. My css error trapping patch in bug 567165 is reintroducing loadGroup storage to the HUDService.

If it is an image that does this we may be in trouble, as all of the image logging code is not perfect. Again, we need platform support going forward to make this kind of discovery rock solid.
Blocks: 438871
Whiteboard: randomorange → [orange]
For [orange] to work, the filename of the failing test has to be in the bug summary (plus enough to tell that the bug is the same as the fails in the log), so if you want bug 596255 to be a duplicate of this, you pretty much need to change the summary to the summary of that.
Summary: In the basic network logging test of the Web Console, the HTML page often isn't picked up → Intermittent failures in browser_HUDServiceTestsAll.js | Four children in output - Got 5, expected 4 and html page is logged - Didn't expect -1, but got it and javascript is logged - Didn't expect -1, but got it and log() is logged
(In reply to comment #8)
> For [orange] to work, the filename of the failing test has to be in the bug
> summary (plus enough to tell that the bug is the same as the fails in the log),
> so if you want bug 596255 to be a duplicate of this, you pretty much need to
> change the summary to the summary of that.

Done, thanks.
Attached patch Proposed patch. (obsolete) — Splinter Review
So the problem here is that the HUD Service's window registry is only updated when its observer receives the "content-document-global-created" event. That event may or may not be sent after the network observer detects the HTML page. The fix is to make the network observer's helper method, getHudIdByWindow(), query the DOM instead of the window registry to find the Web Console.

The proposed patch implements this fix. Feedback requested.
Attachment #475373 - Flags: feedback?(ddahl)
Followup note: The wiki [1] states that, for, "content-document-global-created":

> Sent immediately after a web content document window has been set up, but before any script code has been executed. This lets extensions inject API into content windows as needed. The data is a string indicating the URL of the page that will be loaded in the document window.

So it seems reasonable that this event could be fired after the HTML page is loaded. The only guarantee seems to be that it's fired before any scripts are executed, but that may be too late to catch the loading of content.
Forgot to mention: Once the global console object lands, this patch may obviate the need for the window registry entirely. The only reason I kept it around is that it still needs to exist to clean up after console objects attached to iframes. If the global console component can handle this case without having to defer to the HUD Service, then the window registry can be removed.
(In reply to comment #12)
> Forgot to mention: Once the global console object lands, this patch may obviate
> the need for the window registry entirely.

That is the plan I think, except we will just use the outer Window's ID instead of the window's URI or weakref - and the consoles will know exactly what window they are part of via the window's ID.
Comment on attachment 475373 [details] [diff] [review]
Proposed patch.

I do not mind if functions return string or object, but some reviewers may not dig it. Looks good.
Attachment #475373 - Flags: feedback?(ddahl) → feedback+
Attachment #475373 - Flags: review?(dietrich)
\(In reply to comment #11)
\
> So it seems reasonable that this event could be fired after the HTML page is
> loaded. The only guarantee seems to be that it's fired before any scripts are
> executed, but that may be too late to catch the loading of content.

Except that the observer that is started is aware of all http connections, as it is started by the HUDService before any consoles are active -  and is a global singleton.

When we cannot find the correct web console to log the message to, we may have to hodl those "orphaned messages" in a cache to be procesed later. (may be onload or ondocumentready)
Comment on attachment 475373 [details] [diff] [review]
Proposed patch.

r=me. though, is it valid for getHudIdByWindow to be called w/ a non-content window? should we throw in those cases?
Attachment #475373 - Flags: review?(dietrich) → review+
Version: unspecified → Trunk
(In reply to comment #17)
> Comment on attachment 475373 [details] [diff] [review]
> Proposed patch.
> 
> r=me. though, is it valid for getHudIdByWindow to be called w/ a non-content
> window? should we throw in those cases?

Sometimes, chrome performs network operations and triggers the network observer (examples are the favicon and the view source window). In those cases, getHudIdByWindow will return null to tell our observer to bail out.
Attachment #475373 - Flags: approval2.0?
Nominating for blocking status, on the theory that we need our trees green to ship Firefox 4. Additionally, this affects the reliability of the Web Console by making it common for network requests to be dropped when the page is reloaded.
blocking2.0: --- → ?
Comment on attachment 475373 [details] [diff] [review]
Proposed patch.

>diff --git a/toolkit/components/console/hudservice/HUDService.jsm b/toolkit/components/console/hudservice/HUDService.jsm

>   getHudIdByWindow: function HS_getHudIdByWindow(aContentWindow)
>   {

>+    let webNavigation = aContentWindow.QueryInterface(Ci.nsIInterfaceRequestor).
>+                        getInterface(Ci.nsIWebNavigation);
>+    let docShellTreeItem = webNavigation.QueryInterface(Ci.nsIDocShellTreeItem);
>+    let rootTreeItem = docShellTreeItem.rootTreeItem;
>+    if (!rootTreeItem) {

Doesn't look to me like this can actually happen... Seems like you actually want to check for (docShellTreeItem == rootTreeItem)...

... but I've actually been encouraging the use of chromeEventHandler for this kind of stuff. You could do:

let chromeEventHandler = aContentWindow.QueryInterface(Ci.nsIInterfaceRequestor)
                                       .getInterface(Ci.nsIWebNavigation)
                                       .QueryInterface(Ci.nsIDocShell)
                                       .chromeEventHandler;
if (!chromeEventHandler) {
  // chrome window
}

let xulWindow = chromeEventHandler.ownerDocument.defaultView;
// check window type

>+    let browser = gBrowser.getBrowserForDocument(contentDocument);

This can then just be:

let browser = chromeEventHandler;
Attachment #475373 - Flags: approval2.0? → feedback-
I think that the suggested code in comment 20 may also make the unwrapping unecessary, but I'm not sure.
blocking2.0: ? → betaN+
Attached patch Proposed patch, version 2. (obsolete) — Splinter Review
New patch addresses Gavin's comments. The incantation to acquire the notification box is much shorter now.
Attachment #475731 - Flags: review?(gavin.sharp)
Comment on attachment 475731 [details] [diff] [review]
Proposed patch, version 2.

Actually, I just realized we can make use of the chromeWin.getNotificationBox() pseudo-API that's used for other notifications:

let xulWindow = chromeEventHandler.ownerDocument.defaultView;
if (!xulWindow.getNotificationBox)
  return null; // window doesn't implement the getNotificationBox pseudo-API

let notificationBox = chromeWindow.getNotificationBox(aContentWindow.top);
Comment on attachment 475731 [details] [diff] [review]
Proposed patch, version 2.

r=me if you use the suggestion from comment 23, and use node.id rather than node.getAttribute("id").
Attachment #475731 - Flags: review?(gavin.sharp) → review+
Blocks: devtools4b8
No longer blocks: devtools4b7
Attached patch Proposed patch, version 3. (obsolete) — Splinter Review
New version addresses reviewer comments.
Attachment #475373 - Attachment is obsolete: true
Attachment #475731 - Attachment is obsolete: true
Comment on attachment 480778 [details] [diff] [review]
Proposed patch, version 3.

>diff --git a/toolkit/components/console/hudservice/HUDService.jsm b/toolkit/components/console/hudservice/HUDService.jsm

>+    if (!xulWindow.getNotificationBox) {
>+      // The window isn't a content window.
>+      return null;

My comment in comment 23 is more accurate - this check doesn't tell you that aContentWindow isn't a content window, it just tells you that it's chrome parent doesn't support this particular API (e.g., it's a view-source window).
This patch doesn't seem to fix the issue. Sigh.
New patch uses wrappedJSObject and fixes the comment as suggested by gavin.
Attachment #480778 - Attachment is obsolete: true
This test was tweaked in the split tests bug landing (bug 581069) - I disabled 2 checks. I did try to simplify the DOM query to just get ".hud-msg-node" nodes, but that did not help too much.
Reassigning to Mihai to see if this is still relevant.
Assignee: pwalton → mihai.sucan
I looked through the old mochitest code, and the new one, and also read the proposed patches, and the entire discussion in this bug. Thoughts:

What Patrick found is interesting, and seems to be reasonable and correct. The httpObserver does observe network activity *before* any Web Consoles are open.

I believe the problem with the intermittent failures was caused by a more "simpler" issue: the code expected that *exactly* four network requests will occur. However, that was not always true, because the browser also performed the request to load the favicon at times.

The real issue was with the test itself: it should only check for the network requests it knows about. That's what the test does today, and it runs fine without intermittent failures.

The latest patch here is fine, except it's no longer relevant for that specific mochitest code.

The only relevancy for this patch is ... if we want to remove the windowRegistry. Do we want that? It helps with mapping windows to hudIds faster than without.
(In reply to comment #31)
> The only relevancy for this patch is ... if we want to remove the
> windowRegistry. Do we want that? It helps with mapping windows to hudIds faster
> than without.

Having the window registry is fine. I suppose this bug is fixed then?
Yes.
Status: ASSIGNED → RESOLVED
Closed: 14 years ago
Resolution: --- → FIXED
Whiteboard: [orange]
Product: Firefox → DevTools
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: