Closed Bug 674738 Opened 8 years ago Closed 8 years ago

Geolocation XPCShell test really slow on Windows 7 build slave

Categories

(Core :: DOM: Geolocation, defect)

x86_64
Linux
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla9

People

(Reporter: wlach, Assigned: dougt)

Details

(Whiteboard: [buildfaster:p1] [inbound])

Attachments

(3 files, 1 obsolete file)

So I was investigating https://bugzilla.mozilla.org/show_bug.cgi?id=617503 and found that the geolocation test (dom/tests/unit/test_geolocation_provider.js) in particular was really slow on win32.

Doing some further digging and adding log messages everywhere, it seems as if the test is finishing, but not exiting on win32. There's one single call to "do_test_pending()" at the top of the test, then a call to "do_test_finished()" when the geolocation provider gets a shut down event. Both are called in the case of win32 (see log), and yet the test keeps on spinning anyway. Most strange.
It would be worthwhile breaking in a debugger when it's spinning at the end to see who's doing the spinning.
Assuming that it's a nested event loop, that is.
To add a bit more detail, it's after this logging messages that the test just sits there spinning:

TEST-INFO | (xpcshell/head.js) | test -1 finished

That is, there are these two debugging messages which come when the test completes:

TEST-PASS | (xpcshell/head.js) | 4 (+ 0) check(s) passed

TEST-INFO | (xpcshell/head.js) | 0 check(s) todo
test -1 finished looks like a do_test_finish call occurred before a do_test_pending call.
Ok, that test -1 finished was my fault. :) Please just disregard that, it's a red herring.

I finally determined the real cause of the issue: the geolocation provider wants to make a request to a google.com URI in the course of determining location. The build slaves know how to resolve google.com, but cannot connect to it. This problem should be trivially reproducible on *any* machine by adding these two lines to the main body of the test:

    var prefService = Cc["@mozilla.org/preferences-service;1"].getService(Ci.nsIPrefBranch).QueryInterface(Ci.nsIPrefService);
    prefService.setCharPref('geo.wifi.uri', 'https://10.5.32.4');

(10.5.32.4 should be an address which does not exist on your local network)

So that's the cause of the problem. It seems that there are two underlying issues:

1. The test is assuming that a connection to google.com can be made, but this isn't a valid assumption. Talked to Clint a bit about this, and we think the right workaround would be to create a fake local geolocation server that always returns the same location for the purposes of this test.

2. Even if we can't resolve the address, that's no excuse for the test to keep on spinning indefinitely. I dug a bit into our event handling, and it appears what's happening is that the test is hanging for several minutes after the timeout error event is processed (i.e. no further events in the test can occur). It doesn't seem like this issue is causing the actual browser to hang, but it still bears investigating.

Throwing this over to the geolocation guys: what do you think is the best way of proceeding with this stuff?
Hey Doug, I understand that you're the guy who wrote this test and much of the underlying code. Any thoughts here?
I suspect we can just use the same code we do in the geolocation mochitests:

http://mxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/geolocation/geolocation_common.js

We'll need to figure out the best way to share it, of course.
There's automatic geo device shutdown code already in place, so it's confusing that it's not kicking in. geo.timeout is the pref that controls this, and it defaults to checking for any existing callbacks every 6s. In this case, that means checking for the presence of any watchPosition callbacks that haven't been cleared yet. This suggests to me that the clearWatch timer in the test isn't working correctly, but I can't imagine why.
I'm pretty sure the function to clear the watch *is* being called, actually (at least that's what my logs seem to indicate). The problem seems to be that the test just hangs several seconds after the watch is cleared for some unknown reason.
Assignee: nobody → josh
Talked this over with Josh and he agreed to do the following:

1. Rework the test so as not to need to connect to google.com
2. Look into why the timeout isn't kicking in
Attachment #551134 - Flags: review?(doug.turner)
Whiteboard: [buildfaster:p1]
Assignee: josh → doug.turner
Attachment #551134 - Attachment is obsolete: true
Attachment #551134 - Flags: review?(doug.turner)
Attachment #555469 - Flags: review?(josh)
Attachment #555469 - Flags: review?(josh) → review+
http://hg.mozilla.org/integration/mozilla-inbound/rev/9311205748c3
Whiteboard: [buildfaster:p1] → [buildfaster:p1] [inbound]
http://hg.mozilla.org/mozilla-central/rev/9311205748c3
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla9
You need to log in before you can comment on or make changes to this bug.