Last Comment Bug 674738 - Geolocation XPCShell test really slow on Windows 7 build slave
: Geolocation XPCShell test really slow on Windows 7 build slave
Status: RESOLVED FIXED
[buildfaster:p1] [inbound]
:
Product: Core
Classification: Components
Component: Geolocation (show other bugs)
: unspecified
: x86_64 Linux
: -- normal (vote)
: mozilla9
Assigned To: Doug Turner (:dougt)
:
Mentors:
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-07-27 15:33 PDT by William Lachance (:wlach)
Modified: 2011-08-25 04:38 PDT (History)
5 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
verbose output of geolocation test with extra debugging logs (2.02 KB, text/plain)
2011-07-27 15:35 PDT, William Lachance (:wlach)
no flags Details
Throwaway patch to add extra logging to geolocation test (1.19 KB, patch)
2011-07-27 15:37 PDT, William Lachance (:wlach)
no flags Details | Diff | Review
Add mock provider to geolocation xpcshell test. (3.45 KB, patch)
2011-08-05 13:09 PDT, Josh Matthews [:jdm]
no flags Details | Diff | Review
rebased w/ new response structure (3.11 KB, patch)
2011-08-24 11:36 PDT, Doug Turner (:dougt)
josh: review+
Details | Diff | Review

Description William Lachance (:wlach) 2011-07-27 15:33:28 PDT
So I was investigating https://bugzilla.mozilla.org/show_bug.cgi?id=617503 and found that the geolocation test (dom/tests/unit/test_geolocation_provider.js) in particular was really slow on win32.

Doing some further digging and adding log messages everywhere, it seems as if the test is finishing, but not exiting on win32. There's one single call to "do_test_pending()" at the top of the test, then a call to "do_test_finished()" when the geolocation provider gets a shut down event. Both are called in the case of win32 (see log), and yet the test keeps on spinning anyway. Most strange.
Comment 1 William Lachance (:wlach) 2011-07-27 15:35:45 PDT
Created attachment 548940 [details]
verbose output of geolocation test with extra debugging logs
Comment 2 William Lachance (:wlach) 2011-07-27 15:37:09 PDT
Created attachment 548941 [details] [diff] [review]
Throwaway patch to add extra logging to geolocation test
Comment 3 Josh Matthews [:jdm] 2011-07-27 15:42:05 PDT
It would be worthwhile breaking in a debugger when it's spinning at the end to see who's doing the spinning.
Comment 4 Josh Matthews [:jdm] 2011-07-27 15:42:37 PDT
Assuming that it's a nested event loop, that is.
Comment 5 William Lachance (:wlach) 2011-07-27 15:44:22 PDT
To add a bit more detail, it's after this logging messages that the test just sits there spinning:

TEST-INFO | (xpcshell/head.js) | test -1 finished

That is, there are these two debugging messages which come when the test completes:

TEST-PASS | (xpcshell/head.js) | 4 (+ 0) check(s) passed

TEST-INFO | (xpcshell/head.js) | 0 check(s) todo
Comment 6 Josh Matthews [:jdm] 2011-07-27 16:04:57 PDT
test -1 finished looks like a do_test_finish call occurred before a do_test_pending call.
Comment 7 William Lachance (:wlach) 2011-07-28 14:20:58 PDT
Ok, that test -1 finished was my fault. :) Please just disregard that, it's a red herring.

I finally determined the real cause of the issue: the geolocation provider wants to make a request to a google.com URI in the course of determining location. The build slaves know how to resolve google.com, but cannot connect to it. This problem should be trivially reproducible on *any* machine by adding these two lines to the main body of the test:

    var prefService = Cc["@mozilla.org/preferences-service;1"].getService(Ci.nsIPrefBranch).QueryInterface(Ci.nsIPrefService);
    prefService.setCharPref('geo.wifi.uri', 'https://10.5.32.4');

(10.5.32.4 should be an address which does not exist on your local network)

So that's the cause of the problem. It seems that there are two underlying issues:

1. The test is assuming that a connection to google.com can be made, but this isn't a valid assumption. Talked to Clint a bit about this, and we think the right workaround would be to create a fake local geolocation server that always returns the same location for the purposes of this test.

2. Even if we can't resolve the address, that's no excuse for the test to keep on spinning indefinitely. I dug a bit into our event handling, and it appears what's happening is that the test is hanging for several minutes after the timeout error event is processed (i.e. no further events in the test can occur). It doesn't seem like this issue is causing the actual browser to hang, but it still bears investigating.

Throwing this over to the geolocation guys: what do you think is the best way of proceeding with this stuff?
Comment 8 William Lachance (:wlach) 2011-07-28 14:23:02 PDT
Hey Doug, I understand that you're the guy who wrote this test and much of the underlying code. Any thoughts here?
Comment 9 Josh Matthews [:jdm] 2011-07-28 14:26:35 PDT
I suspect we can just use the same code we do in the geolocation mochitests:

http://mxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/geolocation/geolocation_common.js

We'll need to figure out the best way to share it, of course.
Comment 10 Josh Matthews [:jdm] 2011-07-28 14:49:15 PDT
There's automatic geo device shutdown code already in place, so it's confusing that it's not kicking in. geo.timeout is the pref that controls this, and it defaults to checking for any existing callbacks every 6s. In this case, that means checking for the presence of any watchPosition callbacks that haven't been cleared yet. This suggests to me that the clearWatch timer in the test isn't working correctly, but I can't imagine why.
Comment 11 William Lachance (:wlach) 2011-07-28 14:53:59 PDT
I'm pretty sure the function to clear the watch *is* being called, actually (at least that's what my logs seem to indicate). The problem seems to be that the test just hangs several seconds after the watch is cleared for some unknown reason.
Comment 12 William Lachance (:wlach) 2011-08-04 08:42:17 PDT
Talked this over with Josh and he agreed to do the following:

1. Rework the test so as not to need to connect to google.com
2. Look into why the timeout isn't kicking in
Comment 13 Josh Matthews [:jdm] 2011-08-05 13:09:41 PDT
Created attachment 551134 [details] [diff] [review]
Add mock provider to geolocation xpcshell test.
Comment 14 Doug Turner (:dougt) 2011-08-24 11:36:50 PDT
Created attachment 555469 [details] [diff] [review]
rebased w/ new response structure
Comment 16 Marco Bonardo [::mak] 2011-08-25 04:38:30 PDT
http://hg.mozilla.org/mozilla-central/rev/9311205748c3

Note You need to log in before you can comment on or make changes to this bug.