Geolocation XPCShell test really slow on Windows 7 build slave

RESOLVED FIXED in mozilla9

Status

()

Core
Geolocation
RESOLVED FIXED
6 years ago
6 years ago

People

(Reporter: wlach, Assigned: dougt)

Tracking

unspecified
mozilla9
x86_64
Linux
Points:
---

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [buildfaster:p1] [inbound])

Attachments

(3 attachments, 1 obsolete attachment)

So I was investigating https://bugzilla.mozilla.org/show_bug.cgi?id=617503 and found that the geolocation test (dom/tests/unit/test_geolocation_provider.js) in particular was really slow on win32.

Doing some further digging and adding log messages everywhere, it seems as if the test is finishing, but not exiting on win32. There's one single call to "do_test_pending()" at the top of the test, then a call to "do_test_finished()" when the geolocation provider gets a shut down event. Both are called in the case of win32 (see log), and yet the test keeps on spinning anyway. Most strange.
Created attachment 548940 [details]
verbose output of geolocation test with extra debugging logs
Created attachment 548941 [details] [diff] [review]
Throwaway patch to add extra logging to geolocation test

Comment 3

6 years ago
It would be worthwhile breaking in a debugger when it's spinning at the end to see who's doing the spinning.

Comment 4

6 years ago
Assuming that it's a nested event loop, that is.
To add a bit more detail, it's after this logging messages that the test just sits there spinning:

TEST-INFO | (xpcshell/head.js) | test -1 finished

That is, there are these two debugging messages which come when the test completes:

TEST-PASS | (xpcshell/head.js) | 4 (+ 0) check(s) passed

TEST-INFO | (xpcshell/head.js) | 0 check(s) todo

Comment 6

6 years ago
test -1 finished looks like a do_test_finish call occurred before a do_test_pending call.
Ok, that test -1 finished was my fault. :) Please just disregard that, it's a red herring.

I finally determined the real cause of the issue: the geolocation provider wants to make a request to a google.com URI in the course of determining location. The build slaves know how to resolve google.com, but cannot connect to it. This problem should be trivially reproducible on *any* machine by adding these two lines to the main body of the test:

    var prefService = Cc["@mozilla.org/preferences-service;1"].getService(Ci.nsIPrefBranch).QueryInterface(Ci.nsIPrefService);
    prefService.setCharPref('geo.wifi.uri', 'https://10.5.32.4');

(10.5.32.4 should be an address which does not exist on your local network)

So that's the cause of the problem. It seems that there are two underlying issues:

1. The test is assuming that a connection to google.com can be made, but this isn't a valid assumption. Talked to Clint a bit about this, and we think the right workaround would be to create a fake local geolocation server that always returns the same location for the purposes of this test.

2. Even if we can't resolve the address, that's no excuse for the test to keep on spinning indefinitely. I dug a bit into our event handling, and it appears what's happening is that the test is hanging for several minutes after the timeout error event is processed (i.e. no further events in the test can occur). It doesn't seem like this issue is causing the actual browser to hang, but it still bears investigating.

Throwing this over to the geolocation guys: what do you think is the best way of proceeding with this stuff?
Hey Doug, I understand that you're the guy who wrote this test and much of the underlying code. Any thoughts here?

Comment 9

6 years ago
I suspect we can just use the same code we do in the geolocation mochitests:

http://mxr.mozilla.org/mozilla-central/source/dom/tests/mochitest/geolocation/geolocation_common.js

We'll need to figure out the best way to share it, of course.
There's automatic geo device shutdown code already in place, so it's confusing that it's not kicking in. geo.timeout is the pref that controls this, and it defaults to checking for any existing callbacks every 6s. In this case, that means checking for the presence of any watchPosition callbacks that haven't been cleared yet. This suggests to me that the clearWatch timer in the test isn't working correctly, but I can't imagine why.
I'm pretty sure the function to clear the watch *is* being called, actually (at least that's what my logs seem to indicate). The problem seems to be that the test just hangs several seconds after the watch is cleared for some unknown reason.
Assignee: nobody → josh
Talked this over with Josh and he agreed to do the following:

1. Rework the test so as not to need to connect to google.com
2. Look into why the timeout isn't kicking in
Created attachment 551134 [details] [diff] [review]
Add mock provider to geolocation xpcshell test.
Attachment #551134 - Flags: review?(doug.turner)
Whiteboard: [buildfaster:p1]
(Assignee)

Comment 14

6 years ago
Created attachment 555469 [details] [diff] [review]
rebased w/ new response structure
Assignee: josh → doug.turner
Attachment #551134 - Attachment is obsolete: true
Attachment #551134 - Flags: review?(doug.turner)
Attachment #555469 - Flags: review?(josh)

Updated

6 years ago
Attachment #555469 - Flags: review?(josh) → review+
http://hg.mozilla.org/integration/mozilla-inbound/rev/9311205748c3
Whiteboard: [buildfaster:p1] → [buildfaster:p1] [inbound]
http://hg.mozilla.org/mozilla-central/rev/9311205748c3
Status: NEW → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla9
You need to log in before you can comment on or make changes to this bug.