Closed Bug 1277925 Opened 8 years ago Closed 7 years ago

Users stuck on beta build 44.0b1 (20151217102820)

Categories

(Release Engineering :: Release Requests, defect)

Type: defect
Priority: Not set
Severity: critical

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: Harald, Unassigned)

References

(Depends on 1 open bug)

Details

Beta build 20151217102820, probably 44.0b1, still generates 1 million usage hours per day, with a UT main crash rate of over 10. The numbers have been stable since the end of 2015, which implies that users have no way of updating and probably don't know that they are stuck on an old build.

Usage Hours: https://sql.telemetry.mozilla.org/queries/469/source#772
Crash Rate: https://sql.telemetry.mozilla.org/queries/469/source#773
ni? list via :lizzard to investigate
Flags: needinfo?(mhowell)
Flags: needinfo?(lhenry)
Flags: needinfo?(benjamin)
FWIW, this is the SHA-1 watershed release. It is also the build our website downloads point to for Windows versions that cannot be identified as supporting SHA-2.

See Bug 1234277.
Harald, can you confirm whether these users are largely on WinXP? If so, this is expected.
Flags: needinfo?(mhowell)
Flags: needinfo?(lhenry)
Flags: needinfo?(hkirschner)
Flags: needinfo?(benjamin)
All the current users are on Windows NT 6.1, aka Windows 7: https://sql.telemetry.mozilla.org/queries/469#774

Did we consider getting those users on an ESR release? Having users stuck on an old Beta does not help testing, skews our data and potentially impacts user sentiment.
Flags: needinfo?(hkirschner) → needinfo?(benjamin)
The "watershed" is just that: a version of Firefox that users have to pass through when we can't confirm that their operating system supports the SHA-2 signing format.

Once 44.0b1 is installed, if their OS supports the features we need (win7 should fit this bill, as we have understood it) the in-app updater should download the latest version and update those users to that version.

Do we have data on what Win7 Service pack these users have?
Can release QA test and make sure that users are still being issued updates from this build? If that's the case, we should hand this to the install/update team to diagnose.
Flags: needinfo?(benjamin) → needinfo?(sledru)
Sure. Florin, Andrei, could you help? Thanks.
Flags: needinfo?(sledru)
Flags: needinfo?(florin.mezei)
Flags: needinfo?(andrei.vaida)
Hello, I tested on Windows 7 x32 and x64 with Firefox 44 beta 1 build 2 and 44 beta 9, and I did not receive an update to a newer build. I am wondering whether there is a time limit before the update is triggered, since I waited for more than 4 hours without receiving one. I did not force the update in any way, either from About Firefox or via prefs. (I noticed that Nightly and DevEdition receive an update within seconds of opening an old build.)
Flags: needinfo?(florin.mezei)
Flags: needinfo?(benjamin)
Flags: needinfo?(andrei.vaida)
Bogdan, could you set app.update.log to true, open the about dialog, and copy / paste into this bug the lines in the browser console that start with *** AUS ? Thanks!
Flags: needinfo?(bogdan.maris)
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #9)
> Bogdan, could you set app.update.log to true, open the about dialog, and
> copy / paste into this bug the lines in the browser console that start with
> *** AUS ? Thanks!

I posted the console output in pastebin: http://pastebin.com/v0sxYtBe
I tested using Firefox 44beta1 on Windows 7 x86 Professional Version 6.1 (Build 7601: Service Pack 1).
Flags: needinfo?(bogdan.maris) → needinfo?(robert.strong.bugs)
According to the log it downloaded
http://download.cdn.mozilla.net/pub/firefox/candidates/47.0-candidates/build3/update/win32/en-US/firefox-47.0.complete.mar

and the update was successfully staged
AUS:SVC UpdateManager:refreshUpdateStatus - Notifying observers that the update was staged. state: applied-service, status: applied

The update was also in the downloading state when you started
AUS:SVC Downloader:_selectPatch - found existing patch with state: downloading
AUS:SVC Downloader:_selectPatch - resuming download

It appears that all is well with updating on your system. Do you have any indication that the update didn't succeed?
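For anyone reading long console dumps like the one above, here is a minimal Python sketch of how the update-service lines and their reported state could be pulled out of saved console text. The function names and the regular expression are illustrative assumptions, not part of Firefox; the line format is taken only from the excerpts quoted in this bug, with or without the leading `*** ` marker.

```python
import re

# Assumed format, based on the excerpts quoted in this bug:
#   "... state: applied-service, status: applied"
#   "... found existing patch with state: downloading"
STATE_RE = re.compile(r"state:\s*([\w-]+)(?:,\s*status:\s*([\w-]+))?")


def aus_lines(console_text):
    """Return the console lines that come from the update service (AUS)."""
    selected = []
    for raw in console_text.splitlines():
        line = raw.strip()
        # Accept both "*** AUS:SVC ..." and the bare "AUS:SVC ..." form.
        if line.startswith("*** AUS") or line.startswith("AUS:"):
            selected.append(line)
    return selected


def extract_state(line):
    """Return (state, status) found in a log line, or None if there is none.

    status is None when the line reports only a state.
    """
    match = STATE_RE.search(line)
    if match is None:
        return None
    return (match.group(1), match.group(2))
```

Running `extract_state` over the output of `aus_lines` gives a quick view of whether an update reached a staged state or was still downloading, which is the distinction this comment relies on.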
Flags: needinfo?(robert.strong.bugs) → needinfo?(bogdan.maris)
As I said in comment 8, I did not force an update using the About dialog. I just started the browser and left it alone, then waited for an update. While waiting I did see a few http://download.cdn.mozilla.net/pub/firefox/candidates/47.0-candidates/build3/update/win32/en-US/firefox-47.0.complete.mar lines in the browser console, but I saw no notification about an update and no update after restart during my session (~4 hours).
Also I stated above that I am not sure if there is a time set via prefs when the update will be applied or not.
Flags: needinfo?(bogdan.maris)
Thanks. It appears that your system is updating properly and won't be of help here.
Bogdan, after quitting and relaunching Firefox did it actually update correctly?

This is mysterious to me, but I'll start by creating a dataset of the recent update telemetry for users on this build to see if there are common patterns.
Flags: needinfo?(benjamin) → needinfo?(bogdan.maris)
If I *don't* open the About dialog, Firefox does not prompt me with an update; even if I restart or exit Firefox, I still get 44 beta 1.
If I *do* open the About dialog and wait for the update to download, after a restart or exit I do get Firefox 47.0.
Flags: needinfo?(bogdan.maris)
That is strange. The log you gave rstrong shows us staging a 47beta update. This should be applied at next startup without any prompting (no prompting is normal and expected). So if we're staging the update but not applying it that's a problem.

rstrong, could you have a look please?
Assignee: nobody → robert.strong.bugs
Flags: needinfo?(robert.strong.bugs)
Bogdan, could you reproduce the case you described with the app.update.log pref set to true and after a few hours copy / paste the values that start with *** AUS into this bug? It might be possible to see if something is wrong with a few of the following "http://download.cdn.mozilla.net/pub/firefox/candidates/47.0-candidates/build3/update/win32/en-US/firefox-47.0.complete.mar lines in browser console" but please just leave it open in case an error is reported at a later time.
Flags: needinfo?(robert.strong.bugs) → needinfo?(bogdan.maris)
Well, I retested again and now I do get the update from 44beta1 to 48beta1 after exit and relaunch. Here is the requested console output (http://pastebin.com/4DsDxN90).
Flags: needinfo?(bogdan.maris)
This build shows no change in usage hours. Can we explore other avenues for debugging this?
Flags: needinfo?(robert.strong.bugs)
Matthew and Stephen, any ideas as to how to research this? Perhaps telemetry?
Flags: needinfo?(spohl.mozilla.bugs)
Flags: needinfo?(robert.strong.bugs)
Flags: needinfo?(mhowell)
Seems like a good chance this is the same issue as bug 1284915; 43.0.1 is also a SHA-2 watershed. But I have no idea yet what that issue might be; telemetry on release 43.0.1 has not shown me anything to implicate that specific version and, as with this bug, there's no OS version correlation either. Still working on it.
Flags: needinfo?(mhowell)
I get the impression that the affected users are either all running Windows 7 with KB3123479 applied, or all running it without KB3123479 applied[1].

I don't believe we can detect whether or not KB3123479 is applied via telemetry.

Bogdan, could you:
1. Install Windows 7.
2. Install Beta build 20151217102820 (probably 44.0b1 per comment 0).
3. Apply all Windows updates.
4. Ensure that KB3123479 was installed.
5. Set app.update.log to true in prefs.js in your browser profile.
6. Run Firefox.
7. Open Browser Console.
8. Wait for update check to occur.
9. Copy/paste everything with *** AUS in this bug or pastebin.

Please repeat the above, but uninstall KB3123479 before opening Firefox and letting it check for updates.

Thanks!

[1] https://support.microsoft.com/en-us/kb/3123479
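Step 5 can be scripted when the test has to be repeated on clean profiles. The following is a minimal Python sketch under stated assumptions: the helper name is hypothetical, and it writes the pref to `user.js` rather than `prefs.js`, because Firefox rewrites `prefs.js` itself while `user.js` is read at startup. The profile path must be supplied by the tester, and Firefox should not be running while the file is edited.

```python
from pathlib import Path

# The pref line uses the standard user_pref() syntax of Firefox pref files.
PREF_LINE = 'user_pref("app.update.log", true);'


def enable_update_logging(profile_dir):
    """Ensure app.update.log is set to true in <profile_dir>/user.js.

    Returns the path of the user.js file. The line is appended only if
    it is not already present, so the call is safe to repeat.
    """
    user_js = Path(profile_dir) / "user.js"
    existing = user_js.read_text(encoding="utf-8") if user_js.exists() else ""
    if PREF_LINE not in existing:
        with user_js.open("a", encoding="utf-8") as handle:
            # Keep the new pref on its own line if the file lacks a final newline.
            if existing and not existing.endswith("\n"):
                handle.write("\n")
            handle.write(PREF_LINE + "\n")
    return user_js
```

Calling `enable_update_logging` on a freshly created profile directory before the first launch covers step 5 for both the "with KB3123479" and "without KB3123479" runs.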
Flags: needinfo?(spohl.mozilla.bugs) → needinfo?(bogdan.maris)
(In reply to Stephen A Pohl [:spohl] from comment #22)
> I get the impression that the affected users are either all running Windows
> 7 with KB3123479 applied, or not applied[1].
> 
> I don't believe we can detect whether or not KB3123479 is applied via
> telemetry.
> 
> Bogdan, could you:
> 1. Install Windows 7.
> 2. Install Beta build 20151217102820 (probably 44.0b1 per comment 0).
> 3. Apply all Windows updates.
> 4. Ensure that KB3123479 was installed.
> 5. Set app.update.log to true in prefs.js in your browser profile.
> 6. Run Firefox.
> 7. Open Browser Console.
> 8. Wait for update check to occur.
> 9. Copy/paste everything with *** AUS in this bug or pastebin.
> 
> Please repeat the above, but uninstall KB3123479 before opening Firefox and
> letting it check for updates.
> 
> Thanks!
> 
> [1] https://support.microsoft.com/en-us/kb/3123479

Here are the results:
- With KB3123479 installed, Firefox 44 beta 1 will update to 48 beta 6 (when I tested that was the latest) in a few hours. 
-- Pastebin link: http://pastebin.com/qikM8WK5
- Without KB3123479 installed, Firefox 44 beta 1 updated to 48 beta 1 in a few hours and did not update further during that session.
-- Pastebin link: http://pastebin.com/ta2ntkT8

Let me know if there is anything I can help with.
Flags: needinfo?(bogdan.maris) → needinfo?(spohl.mozilla.bugs)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #23)
> Here are the results:
> - With KB3123479 installed, Firefox 44 beta 1 will update to 48 beta 6 (when
> I tested that was the latest) in a few hours. 
> -- Pastebin link: http://pastebin.com/qikM8WK5
> - Without KB3123479 installed, Firefox 44 beta 1 updated to 48 beta 1 in a
> few hours and did not update further during that session.
> -- Pastebin link: http://pastebin.com/ta2ntkT8
> 
> Let me know if there is anything I can help with.

It looks like we're making progress, but there are two things that don't make sense. Is it possible that the two pastebin links got mixed up? The second one actually shows a successful download and apply of the update, while the first one shows a failure. Could you please clarify which one belongs to which scenario?

Also, the first pastebin link shows that a download was already in progress when the test ran. Could you please retest both scenarios with a new/clean profile? Thanks!
Flags: needinfo?(spohl.mozilla.bugs) → needinfo?(bogdan.maris)
Also, the first pastebin doesn't include the aus5 link, which we need in order to verify the actual size declared in the update XML.
You will probably have to copy/paste the console output several times for each scenario. Otherwise, since you need to restart Firefox to apply updates, the previous console output will be lost. If you could create two pastebins (one for each scenario), where you append all the console output just before you restart Firefox, that would be really helpful.
If the first pastebin[1] in comment 23 is representative of the issue here, we should be seeing an excessive number of occurrences of error code 14 for UPDATE_DOWNLOAD_CODE_COMPLETE in Beta 44. We're currently unable to verify this via our telemetry dashboard v4[2] due to bug 1286580.

[1] http://pastebin.com/qikM8WK5
[2] https://mzl.la/29vGCs9
Depends on: 1286580
(In reply to Stephen A Pohl [:spohl] from comment #26)
> You will probably have to copy/paste the console output several times for
> each scenario. Otherwise, since you need to restart Firefox to apply
> updates, the previous console output will be lost. If you could create two
> pastebins (one for each scenario), where you append all the console output
> just before you restart Firefox, that would be really helpful.

Finally I managed to do that \o/

I edited the bins from comment 23 with final results, and also commented inline:
- With KB3123479 installed:
-- Pastebin link: http://pastebin.com/qikM8WK5
- Without KB3123479 installed:
-- Pastebin link: http://pastebin.com/ta2ntkT8

Let me know if there is something I can help with further on.
Flags: needinfo?(bogdan.maris) → needinfo?(spohl.mozilla.bugs)
Both of these pastebins now show successful updates, so we have not been able to show that KB3123479 has anything to do with this issue yet.

I am a bit concerned about the fact that we seemingly don't run any background update checks the first time after an update has been applied and I filed bug 1287851 for it.

We've released 48.0b7 recently and we fixed an issue along the way that tried to incorrectly update users to 48.0b6 directly, even though they should have been updated to 48.0b1 (a watershed build) before being updated to 48.0b6. I've rerun the reports in comment 0 and comment 4 and it looks like since the beginning of July, the situation is finally improving. Not sure why the crash rate was skyrocketing recently though. Harald, can you confirm? Or is this "improvement" only due to the fact that data is still coming in and we might not have an accurate picture for the month of July yet?
Flags: needinfo?(spohl.mozilla.bugs) → needinfo?(hkirschner)
> Harald, can you confirm?

Looks like the numbers are going down \o/. I'll keep an eye on it.

> Not sure why the crash rate was skyrocketing recently though.

The crash spike in the end is just data stabilizing with pings still missing.
Flags: needinfo?(hkirschner)
Is there anything left here?
It seems the trend downwards didn't turn out to be real. The usage hours reported are pretty constant. Was the expectation that this number should have been 0 by now?
Flags: needinfo?(spohl.mozilla.bugs)
Agreed. I'm now very certain that we're dealing with the equivalent of bug 1284915 here (see comment 21). Matt has been looking into this, but unfortunately, I don't believe we have a lead yet.
Flags: needinfo?(spohl.mozilla.bugs)
Maybe bug 1284484 will improve the situation: as of this morning (PT) we are pointing XP beta downloads to 49.0b8, see https://bugzilla.mozilla.org/show_bug.cgi?id=1284484#c17

Release is still the same (43.0.1), but the plan is to serve 49.0 to XP users when it's released (Sep 13, IIRC).
Per comment 4, this affects Windows 7 users, not XP.
Ah, I see... It shouldn't affect this bug in this case. At least https://github.com/mozilla-services/go-bouncer/blob/master/handlers.go#L29-L30 tells me that "NT 6.1" shouldn't be affected.
Summary: Users stuck on beta build 20151217102820 → Users stuck on beta build 44.0b1 (20151217102820)
IIUC, we don't know why Firefox doesn't update itself, right?
I see a lot of crashes on "js::jit::EnterBaselineMethod" on 44.0b1 (20151217102820).
In the last 30 days, 1926 to be exact. 22 people left a "valid" email address when submitting their crash reports.
Could we contact the people who are stuck on this build and try to debug it that way?
Robert, is this still the case? Maybe we should generate multiple partials (due to the watersheds) to try to update these users?
Flags: needinfo?(robert.strong.bugs)
As with all updates partials would help. I think that getting partials for release users using versions prior to 47.0.2 would have a bigger impact overall though.
Flags: needinfo?(robert.strong.bugs)
This is still a problem: https://sql.telemetry.mozilla.org/queries/2327.
During the past week, the number of users on 44b was around half the users of 51b.
David, can you help with that?
Flags: needinfo?(ddurst)
Well, we have partials for 43.0.1 (bug 1309130) and 47.0.2 (bug 1319905). Do we have partials for things in between?

We know that between watersheds and general slowness to update, we are looking for ways to encourage users to upgrade (whether they are not updating or not able to update is harder to divine). Partials for all the things probably wouldn't hurt, but I defer to rstrong and rail, as I thought that we do this as a rule.
Flags: needinfo?(robert.strong.bugs)
Flags: needinfo?(rail)
Flags: needinfo?(ddurst)
I've just filed bug 1334220 for issues downloading updates to the next watershed after 44, which is 48.0b2. That could be a cause of this bug.
Flags: needinfo?(robert.strong.bugs)
It definitely looks like this bug is due to bug 1334220.

I filed bug 1334419 in the hope of preventing this from happening in the future.
Flags: needinfo?(rail)
Assignee: robert.strong.bugs → nobody
Bug 1334220 is now resolved. Hopefully the numbers should start improving very soon.
Note: these clients will need to first update to 48 beta and then to 52 beta and releng has only created complete updates for this so it will take some time.
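The multi-step path described in the note above can be illustrated with a minimal Python sketch. This is not how the update server is implemented; the function name and the `(major, beta)` tuple representation are assumptions made for the example, and the beta numbers in the usage note are placeholders.

```python
def update_path(current, latest, watersheds):
    """Return the versions a client must install, in order, to reach latest.

    Versions are (major, beta) tuples, e.g. (44, 1) for 44.0b1.
    A client has to stop at every watershed that lies strictly between
    its current version and the latest one before it can move on.
    """
    if current >= latest:
        return []  # already up to date, nothing to install
    hops = [w for w in sorted(watersheds) if current < w < latest]
    return hops + [latest]
```

For a client on (44, 1), with watersheds at (44, 1) and (48, 1) and a latest build of (52, 1), the function returns `[(48, 1), (52, 1)]`: one hop to the 48 beta watershed and a second to the 52 beta. Because each hop here is a complete update rather than a partial, every step means a full download, which is why the numbers are expected to take some time to move.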
I'll rerun the query on Telemetry in a week or two.
Flags: needinfo?(mcastelluccio)
Beta 44 DAU is falling off a cliff in a very satisfactory fashion: https://sql.telemetry.mozilla.org/queries/2428/source#4497
Flags: needinfo?(mcastelluccio)
(In reply to Chris H-C :chutten from comment #48)
> Beta 44 DAU is falling off a cliff in a very satisfactory fashion:
> https://sql.telemetry.mozilla.org/queries/2428/source#4497

\o/ \o/ \o/ \o/ \o/ \o/ \o/ \o/ 

I owe :mhowell a drink for findings in Bug 1334220. :)
Depends on: 1335732
Great news, many thanks to all!
I reported bug 1335732 to prevent this from happening in the future.
There are also a couple of other old beta versions that stand out in crash stats data and might have users stranded on them. I've filed bug 1335736 for this.
See Also: → 1337148
I'm going to kill this bug with fire. Thanks all!
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Blocks: 1349235