Closed
Bug 1404332
Opened 7 years ago
Closed 7 years ago
[Shield] Pioneer Enrollment - failure code: 0x80004005 on show pop-up notification
Categories
(Shield :: Shield Study, defect)
Shield
Shield Study
Tracking
(Not tracked)
VERIFIED
FIXED
People
(Reporter: aflorinescu, Assigned: osmose)
References
Details
[Prerequisites:] Ensure the following preferences are set : extensions.pioneer-enrollment-study.updateTimerInterval = 1 extensions.pioneer-enrollment-study.firstPromptDelay = 1 extensions.pioneer-enrollment-study.secondPromptDelay = 180000 extensions.pioneer-enrollment-study.studyEndDelay = 180000 extensions.pioneer-enrollment-study.studyEnrolledEndDelay = 180000 extensions.pioneer-enrollment-study.enrollmentState = ‘’ (empty string) [Steps:] 1. Open FF and set the prerequisites. 2. Open browser console. 3. Install add-on https://net-mozaws-prod-addons-signxpi-output.s3.amazonaws.com/pioneer-enrollment-study-2.0.1.xpi. [Actual Result:] When the pop-up notification is displayed, the browser console states: [Exception... "Component returned failure code: 0x80004005 (NS_ERROR_FAILURE) [nsIURI.hostPort]" nsresult: "0x80004005 (NS_ERROR_FAILURE)" location: "JS frame :: resource://gre/modules/PopupNotifications.jsm :: PopupNotifications_refreshPanel/< :: line 816" data: no] [Note1:] There is no visible failure in the expected workflow. [Note2:] There are two additional errors in the console, but not sure if there are also pioneer related: a. Object { error: "connectionClosed", message: "'getTab' active request packet to '…" } b. NotFoundError: Node was not found notification.xml:219 in view-source:chrome://global/content/bindings/notification.xml
Assignee | ||
Comment 1•7 years ago
|
||
After talking with Gijs, the PopupNotifications error does not appear to be related to Pioneer (PopunderNotifications has to do with doorhanger notifications, which we don't use at all). I also cannot replicate on MacOS or Windows 10. The getTab error also looks unrelated, and I cannot replicate. However, the "Node was not found" error does look like it's our fault, and I've filed https://github.com/mozilla/pioneer-enrollment-study/pull/1 to deal with it.
Assignee | ||
Updated•7 years ago
|
Assignee: nobody → mkelly
Comment 2•7 years ago
|
||
Commits pushed to master at https://github.com/mozilla/pioneer-enrollment-study https://github.com/mozilla/pioneer-enrollment-study/commit/6472143c35fc5d515357f084b7c22c43d46ba984 Bug 1404332: Do not throw if failing to clean up notifications. https://github.com/mozilla/pioneer-enrollment-study/commit/09a3934341dae127c166cf3024d476524039c5cc Merge pull request #1 from Osmose/node-not-found Bug 1404332: Do not throw if failing to clean up notifications.
Assignee | ||
Updated•7 years ago
|
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Reporter | ||
Comment 3•7 years ago
|
||
Comment 2 fix resolves the node issue. I cannot spot the node error anymore. Mike, any idea what the other two console errors might be? (resolve this and log separate issues for those?) Also related to browser console, I'm not sure if this is actually related to the Pioneer study, but I do get a bunch of telemetry errors - I wonder if these are expected: https://pastebin.mozilla.org/9069025
Status: RESOLVED → VERIFIED
Flags: needinfo?(mkelly)
Assignee | ||
Comment 4•7 years ago
|
||
(In reply to Adrian Florinescu [:AdrianSV] from comment #3) > Comment 2 fix resolves the node issue. I cannot spot the node error anymore. > Mike, any idea what the other two console errors might be? (resolve this and > log separate issues for those?) Yeah, I think logging them under Firefox::General with STR is our best bet. The popup notification one in particular could probably use a look from someone more familiar with that code. > Also related to browser console, I'm not sure if this is actually related to > the Pioneer study, but I do get a bunch of telemetry errors - I wonder if > these are expected: https://pastebin.mozilla.org/9069025 gfritzsche: chutten: We're getting the errors linked above when sending the persisted custom ping data from a Shield study, but I can't trace them to anything in particular that we're doing. What do you think about that error log?
Flags: needinfo?(mkelly) → needinfo?(chutten)
Comment 5•7 years ago
|
||
`TelemetryStopwatch: key "TELEMETRY_PENDING_LOAD_MS" was already initialized` means that TelemetryStopwatch.start() was called twice without an interleaving TelemetryStopwatch.stop or .cancel for the name "TELEMETRY_PENDING_LOAD_MS This isn't a good sign as this is a TELEMETRY_* probe which measures internal work. If the internals are misbehaving, something is likely wrong. `TelemetrySend::sendPersistedPings - failed to send ping b684490f-806e-4b7f-a249-003aac449b99: SyntaxError: The URI is malformed.` as you can imagine this isn't a particularly nice error to receive either. This means the URI we're trying to send pings to is malformed. When is this happening? During tests?
Flags: needinfo?(chutten)
Reporter | ||
Comment 6•7 years ago
|
||
(In reply to Chris H-C :chutten from comment #5) > `TelemetryStopwatch: key "TELEMETRY_PENDING_LOAD_MS" was already > initialized` means that TelemetryStopwatch.start() was called twice without > an interleaving TelemetryStopwatch.stop or .cancel for the name > "TELEMETRY_PENDING_LOAD_MS > > This isn't a good sign as this is a TELEMETRY_* probe which measures > internal work. If the internals are misbehaving, something is likely wrong. > > `TelemetrySend::sendPersistedPings - failed to send ping > b684490f-806e-4b7f-a249-003aac449b99: SyntaxError: The URI is malformed.` as > you can imagine this isn't a particularly nice error to receive either. > > This means the URI we're trying to send pings to is malformed. > > When is this happening? During tests? Yes, but as far as I can imagine, the tests are as relevant as the real thing, since we are using production environment. (production normandy with Shield opt-out study which delivers the add-on from https://bugzilla.mozilla.org/show_bug.cgi?id=1382761#c10). I could try to see if I get the same telemetry result having the add-on installed manually (I don't remember seeing these errors during the first tests w/o the Shield Study).
Flags: needinfo?(adrian.florinescu)
Assignee | ||
Comment 7•7 years ago
|
||
(In reply to Chris H-C :chutten from comment #5) > When is this happening? During tests? I'm able to replicate this by following the steps from bug 1382761 comment 11 on a brand new profile in Firefox 56 on release. Those steps install a Shield study add-on, and a few minutes after the study is installed and shows its prompts, the errors show up in the console, presumably when Telemetry gets around to sending the custom pings that the study generates. The add-on sends custom telemetry pings here: https://github.com/mozilla/pioneer-enrollment-study/blob/master/extension/StudyUtils.jsm#L901 Are there any particular usage patterns of submitExternalPing that might cause this? I can provide more context about how we send these pings if there's specific things to look out for.
Comment 8•7 years ago
|
||
tests have sometimes been known to not clean up after themselves, which is why I've gotten used to ignoring TelemetryStopwatch warnings in tests. To debug this we'd benefit from trace logging (Services.prefs.setCharPref("toolkit.telemetry.log.level", "Trace");) That ought to give us the information we'd need to get this sorted. I'm especially interested in the server of the _doPing call. The paths ought to be straightforward, so it's only the server portion that remains unknown.
Assignee | ||
Comment 9•7 years ago
|
||
Here's a gist of my logs with tracing enabled: https://gist.github.com/Osmose/20f955fdbb26cdd12641db4ab5b9e9c6 Note that I'm only able to replicate the stopwatch errors, and not the invalid URL errors. I'll leave it to Adrian to provide a trace log for that.
Reporter | ||
Comment 10•7 years ago
|
||
I can see the malformed URI lines for Ubuntu and Mac alike. Adding the browser console with Trace enabled: https://pastebin.mozilla.org/9069143 The steps to reproduce are: 1. Open FF with a new profile. 2. Set extensions.shield-recipe-client.user_id pref to the string "pioneer-quick-timer-2". 3. Set toolkit.telemetry.log.level - Trace 4. Set "extensions.shield-recipe-client.dev_mode" pref to true 5. Restart FF. 6. Open Browser console. The Pioneer notification is shown but not interacted with. Might this be the part we are missing?
Flags: needinfo?(adrian.florinescu)
Comment 11•7 years ago
|
||
Well, the "URL is malformed" part is easy to explain: 1507013861888 Toolkit.Telemetry TRACE TelemetrySend::_doPing - server: undefined, persisted: true, id: 89e1cc3f-a761-4343-bf90-ad37ac87cda7 We're trying to send a ping before TelemetrySend::setup (which is when the server host is grabbed from the prefs). Now, this should just be a temporary failure. We'll store that ping and resend in a scheduled interval. In fact, it should (and is) caught much earlier than _doPing: 1507013857014 Toolkit.Telemetry TRACE TelemetrySend::submitPing - can't send ping now, persisting to disk - canSendNow: false However, accepting the data submission policy seems to kick off triggerSendingPings (via _userNotified) which doesn't bother to check if it can actually send before trying. My guess is that this triggers an unlikely error flow that breaks our net of async functions allowing crazy things like re-entrant loadPendingPings: 1507013861855 Toolkit.Telemetry TRACE TelemetryStorage::loadPendingPing - id: 89e1cc3f-a761-4343-bf90-ad37ac87cda7 1507013861856 Toolkit.Telemetry TRACE TelemetryStorage::loadPendingPing - id: 2d8c1919-b9bb-408e-a058-8c0513108225 Which in turn causes the TelemetryStopwatch traces. tl;dr - It seems as though Telemetry doesn't handle (and doesn't test) the user accepting the data submission policy before TelemetrySend::setup, which causes a lot of noise in the logs. I'm not sure if this eventually results in the error as originally reported, so I'll file a bug specific to the Telemetry parts of this.
You need to log in
before you can comment on or make changes to this bug.
Description
•