Closed Bug 1159963 Opened 9 years ago Closed 8 years ago

Intermittent test_multiple_register.html,test_multiple_register_during_service_activation.html | could not register for push notification | Some test failed with error TypeError: result is null or Some test failed with error null

Categories

(Core :: DOM: Push Subscriptions, defect)

40 Branch
defect
Not set
normal

Tracking

()

RESOLVED FIXED
Tracking Status
firefox40 --- affected

People

(Reporter: KWierso, Unassigned)

References

Details

(Keywords: intermittent-failure)

10:09:21 INFO - 1 INFO TEST-START | dom/push/test/test_has_permissions.html
10:09:22 INFO - -*- Push.js: Push Constructor
10:09:22 INFO - MEMORY STAT vsize after test: 3374018560
10:09:22 INFO - MEMORY STAT vsizeMaxContiguous not supported in this build configuration.
10:09:22 INFO - MEMORY STAT residentFast after test: 261914624
10:09:22 INFO - MEMORY STAT heapAllocated after test: 91060000
10:09:22 INFO - 2 INFO TEST-OK | dom/push/test/test_has_permissions.html | took 1217ms
10:09:22 INFO - 3 INFO TEST-START | dom/push/test/test_multiple_register.html
10:09:22 INFO - 4 INFO TEST-PASS | dom/push/test/test_multiple_register.html | getEndpoint should return null when app not subscribed.
10:09:22 INFO - 5 INFO TEST-UNEXPECTED-FAIL | dom/push/test/test_multiple_register.html | could not register for push notification - expected PASS
10:09:22 INFO - 6 INFO TEST-UNEXPECTED-FAIL | dom/push/test/test_multiple_register.html | Some test failed with error TypeError: result is null - expected PASS
10:09:22 INFO - MEMORY STAT vsize after test: 3380506624
10:09:22 INFO - MEMORY STAT residentFast after test: 265748480
10:09:22 INFO - MEMORY STAT heapAllocated after test: 92321456
10:09:22 INFO - 7 INFO TEST-OK | dom/push/test/test_multiple_register.html | took 114ms
10:09:23 INFO - 8 INFO TEST-START | dom/push/test/test_permissions.html
I think we found that this was a server issues.  We upgraded our push server, and it causes some failures in the test infra.
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WORKSFORME
Status: RESOLVED → REOPENED
Resolution: WORKSFORME → ---
It seems to on the server side, or at push service is not able to connect to the server:

22:02:43 INFO - -*- PushServiceWebSocket.jsm: beginWSSetup()
22:02:43 INFO - -*- PushServiceWebSocket.jsm: serverURL: wss://push.services.mozilla.com/
22:02:50 INFO - -*- PushServiceWebSocket.jsm: wsOnStop()
22:02:50 INFO - -*- PushServiceWebSocket.jsm: Socket error 2152398861
Summary: Intermittent test_multiple_register.html | could not register for push notification | Some test failed with error TypeError: result is null → Intermittent test_multiple_register.html,test_multiple_register_during_service_activation.html | could not register for push notification | Some test failed with error TypeError: result is null or Some test failed with error null
The log says: |PushServiceWebSocket.jsm: Socket error 2152398861| which is NS_ERROR_CONNECTION_REFUSED.
Blocks: 1206969
I have looked at  some logs

log:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15438234
error:
 22:35:04     INFO -  2 INFO TEST-UNEXPECTED-FAIL | dom/push/test/test_data.html | Some test failed with error AbortError
 22:35:04     INFO -      runTest/<@dom/push/test/test_data.html:172:7
 22:35:04     INFO -      promise callback*runTest@dom/push/test/test_data.html:164:5
 22:35:04     INFO -      Async*@dom/push/test/test_data.html:176:1


function that is failing is:
  function start() {
    return navigator.serviceWorker.register("worker.js" + "?" + (Math.random()), {scope: "."})
    .then(swr => { registration = swr; return swr; });
  }
So I see 3 problems:

1) server not responding:

log:
https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=5012664
https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=4993209
test_data.html times out while waiting for the second notification.

in log
https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=5071747
test_data.html times out while waiting for the last notification.

in log:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15431511
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15297329
test_data.html successfully register for push notification (there is a answer from the push server) but never got any notification.

in log:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-beta&job_id=569089
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-release&job_id=104676
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-beta&job_id=556675
connection to the server timed out.



2) problem in comment 55:

log:
https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=5013888
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-aurora&job_id=1265089
is the same as in comment 55.

3) timeout waiting for unregister even though we are not waiting for the server to answer. 

in log:
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15318152
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15289700
https://treeherder.mozilla.org/logviewer.html#?repo=b2g-inbound&job_id=2961964
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15237030
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-central&job_id=2365763
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15173234
https://treeherder.mozilla.org/logviewer.html#?repo=fx-team&job_id=4973996
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15228804
https://treeherder.mozilla.org/logviewer.html#?repo=mozilla-inbound&job_id=15192214
times out waiting for unregister to finish.



for problems 2 and 3 i would suggest to turn on some more logging to see what is happening
test_data.html is often timing out while waiting for the second message. If it is server random failure I would expect more randomness. We should turn on logging for couple of days to get some more info

Some of test_data.html failures are in bug 1206424
I have have failure with log turned on for test_data.html

The log:

this is the end of successful notification -  PushDB is succesfully updated:

-*- PushDB.jsm: update: Update successful for key ID 21a53390-2c0c-45f3-a296-5da7d70c4271"
-*- PushService.jsm: notifyApp() http://mochi.test:8888/tests/dom/push/test/
-*- PushServiceWebSocket.jsm: sendAck()
-*- PushServiceWebSocket.jsm: enqueue
-*- PushServiceWebSocket.jsm: Sending message: {\"messageType\":\"ack\",\"updates\":[{\"channelID\":\"21a53390-2c0c-45f3-a296-5da7d70c4271\"}]}


This is the next notification:

-*- PushServiceWebSocket.jsm: wsOnMessageAvailable() {\"version\": \"gAAAAABWIMIQkeDeuoJnjQWJIaf-IgPBGUEn8z5g6bm1hqVVNzKOEU1JxKutwpX7xo0F51mkSeCSA_xv4TmwZgIMrznBuC368zbBV1OSoWp5--Km6MHQmaV9N0-FevhcWX-IrEDu3ES1E58p9GabXmq3eiLDYDKWE5pdUU3fdIq5vS5jHEwog7E=\", \"channelID\": \"21a53390-2c0c-45f3-a296-5da7d70c4271\", \"messageType\": \"notification\"}
-*- PushServiceWebSocket.jsm: Reseting _retryFailCount and _pingIntervalRetryTimes
-*- PushService.jsm: Stopped existing alarm 7
AlarmService: remove(7, undefined)
AlarmService: _removeAlarmFromDb()
AlarmService: add(Fri Oct 16 2015 02:54:00 GMT-0700 (Pacific Standard Time))
-*- PushServiceWebSocket.jsm: handleNotificationReply()
-*- PushService.jsm: receivedPushMessage()
-*- PushDB.jsm: getByKeyID()
-*- PushDB.jsm: Fetch successful [object Object]
AlarmService: Callback after removing alarm from database.
AlarmService: Current alarm: null
AlarmService: Alarm queue: []
-*- PushDB.jsm: update: Ignoring invalid update for key ID 21a53390-2c0c-45f3-a296-5da7d70c4271

and the PushDB update fails.

This can be the reason for all failures in comment 56 (but not for the failures from comment 55)

I am push on try a patch with some more logging in the part that is failing, but I can not get it to fail...
(In reply to Dragana Damjanovic [:dragana] from comment #62)
> https://treeherder.mozilla.org/#/jobs?repo=try&revision=03b49d72a98e

In the linux debug build - the PushService part is correctly finished, even we have received unregister response from the push server (in WebSocket version we do not wait for that, when the record is deleted from db we return true)

So unregister of an existing subscription succeeded, and then the test tries to unregistr again which should return false.
This is the log for the last unregister:

"-*- PushService.jsm: receiveMessage(): Push:Unregister"}
"-*- PushService.jsm: unregister() {\"scope\":\"http://mochi.test:8888/tests/dom/push/test/\",\"requestID\":\"{fba214ad-a114-4c82-803b-16d6235557a4}\",\"originAttributes\":\"\"}"}
"-*- PushService.jsm: _unregister()"}
 "-*- PushDB.jsm: getByIdentifiers() { http://mochi.test:8888/tests/dom/push/test/, \"\" }"}

The last line is the db query that should return no record and than the function will return false as expected.
We do not have any more logging in PushService.jsm so it can be that:
1) db request never return (I think it is not that). There is no db error in the log
2) response between PushSevice and the test got lost :(
Seeing this 20-30% of the time on OSX e10s as well.
Fixed by bug 1244816.
Depends on: 1244816
Status: REOPENED → RESOLVED
Closed: 9 years ago8 years ago
Resolution: --- → FIXED
You need to log in before you can comment on or make changes to this bug.