Closed Bug 1305787 Opened 8 years ago Closed 8 years ago

Verify that the changes to the update url are actually applied in the 47.0.2 release

Categories

(Toolkit :: Application Update, defect, P2)

Tracking

()

RESOLVED FIXED

People

(Reporter: lizzard, Assigned: robert.strong.bugs)

References

(Blocks 1 open bug)

Details

(Whiteboard: [platform-rel-Forcepoint])

User Story

Deploy a hotfix to Firefox 40-47 which:

* Acts on Windows only

* Detects the presence or absence of WebSense by checking the following paths:

** %WINDIR%\System32\qipcap.dll
** %WINDIR%\System32\qipcap64.dll
** %WINDIR%\sysnative\qipcap.dll
** %WINDIR%\sysnative\qipcap64.dll


If WebSense is not present, add "(nowebsense)" directly after "%OS_VERSION%" to the default value of the app.update.url pref

* New value will be https://aus5.mozilla.org/update/6/%PRODUCT%/%VERSION%/%BUILD_ID%/%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%(nowebsense)/%SYSTEM_CAPABILITIES%/%DISTRIBUTION%/%DISTRIBUTION_VERSION%/update.xml

If WebSense is present, add "(websense)"
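
A minimal sketch of the rewrite described in this user story, in Python for illustration only. The shipped hotfix is a Firefox add-on and its code is not shown in this bug, so the names `websense_paths`, `websense_present` and `tag_update_url` are hypothetical. `DEFAULT_URL` is inferred by removing the tag from the new value given above, and the file-existence check is passed in as a parameter so the logic can be exercised on any platform.

```python
import ntpath

# Inferred default: the "new value" from the user story without the tag.
DEFAULT_URL = (
    "https://aus5.mozilla.org/update/6/%PRODUCT%/%VERSION%/%BUILD_ID%/"
    "%BUILD_TARGET%/%LOCALE%/%CHANNEL%/%OS_VERSION%/%SYSTEM_CAPABILITIES%/"
    "%DISTRIBUTION%/%DISTRIBUTION_VERSION%/update.xml"
)


def websense_paths(windir):
    """Return the four DLL paths the user story says to check."""
    return [
        ntpath.join(windir, subdir, dll)
        for subdir in ("System32", "sysnative")
        for dll in ("qipcap.dll", "qipcap64.dll")
    ]


def websense_present(windir, exists):
    """True if any of the four paths exists; `exists` is a path -> bool callable."""
    return any(exists(path) for path in websense_paths(windir))


def tag_update_url(url, present):
    """Insert the status tag directly after %OS_VERSION% (no-op if already tagged)."""
    if "websense)" in url:
        return url
    tag = "(websense)" if present else "(nowebsense)"
    return url.replace("%OS_VERSION%", "%OS_VERSION%" + tag, 1)
```

On a real Windows machine the predicate could be `os.path.exists` and `windir` the value of `%WINDIR%`; whether the actual hotfix guards against double-tagging is not stated in this bug.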

Attachments

(2 files)

We have an unexpectedly high number of users on 47, 48, and 49 who may or may not have websense installed. We'd like to figure out why. They may not have received the hotfix for 47 or 48, or there may be some other reason why we weren't able to detect websense.   
 

+++ This bug was initially created as a clone of Bug #1298404 +++
Where is the majority of those users? (how is the distribution between 47, 48 and 49)

For 49 the hotfix uninstalls itself, so that's expected: no user, after having reached 49, will continue to report their Websense status.

Can we ask someone from AMO to check how many installs were there for this hotfix, and compare with the userbase?
The latest version of hotfix installed is reported by the telemetry environment: http://searchfox.org/mozilla-central/source/toolkit/components/telemetry/TelemetryEnvironment.jsm#1068

Someone from data analysis could look at that to see if it's a case of users not having received the hotfix, or having received and it failed or got uninstalled.
One issue that QA flagged during testing is that the hotfix add-on doesn't propagate to users whose update setting in Firefox is "Check for updates, but let me choose whether to install them" or "Never check for updates", so those users would remain stranded on 47.0.1 in the current setup.
https://bugzilla.mozilla.org/show_bug.cgi?id=1298404#c41
FWIW, this is the pref "extensions.hotfix.lastVersion". What should be looked at is the proportion of users who have that pref set to "v20160826.01" versus those who don't (Windows users only, broken down by version number).
chutten, ali, could that data analysis come from one or both of you? I'll find you on irc.
Flags: needinfo?(chutten)
Flags: needinfo?(aalmossawi)
So... what's the precise ask here? If you're asking for proportions of a given user population who has a given string value (20160826.01) for environment.build.hotfix_version, then a decent place to start would be dumping the following into sql.tmo:

SELECT
	build[1].version AS version,
	build[1].hotfix_version = '20160826.01' AS has_it,
	COUNT(*)
FROM
	longitudinal
WHERE
	normalized_channel = 'release'
GROUP BY 1, 2
ORDER BY 3 DESC

And then comparing. For instance, 48.0.2 users in longitudinal have it about 82% of the time. (and definitely don't have it 11% of the time, and report 'null' for it the rest of the time)
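
For the "comparing" step, the query's (version, has_it, count) rows can be turned into per-version proportions with a few lines of Python. This is only a sketch: `hotfix_share` is a hypothetical helper, and the sample rows are made-up placeholders rather than real telemetry. `has_it` is `None` where the hotfix field was null.

```python
from collections import defaultdict


def hotfix_share(rows):
    """Map version -> {has_it: fraction of that version's rows}."""
    totals = defaultdict(int)
    by_state = defaultdict(lambda: defaultdict(int))
    for version, has_it, count in rows:
        totals[version] += count
        by_state[version][has_it] += count
    return {
        version: {state: n / totals[version] for state, n in states.items()}
        for version, states in by_state.items()
    }


# Placeholder counts only, not actual longitudinal data.
rows = [
    ("48.0.2", True, 820),
    ("48.0.2", False, 110),
    ("48.0.2", None, 70),
    ("47.0.1", True, 25),
    ("47.0.1", False, 75),
]
shares = hotfix_share(rows)
```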
Flags: needinfo?(chutten)
I think it would be useful to check in telemetry whether the hotfix is userDisabled. From a quick glance at my Nightly Health Report I'm not sure if we report disabled extensions. I only see activeAddons.
I modified the query from :chutten slightly, here: https://sql.telemetry.mozilla.org/queries/1308/source

And what I got is the following (probably needs validation)

------- 47.0 + 47.0.1 --------
Users with the hotfix    = 25%
Users without the hotfix = 75%


--- 48.0 + 48.0.1 + 48.0.2 ---
Users with the hotfix    = 76%
Users without the hotfix = 24%


Strange difference, and I don't know how to interpret it.

This data is from 2016-09-18
Blocks: websense
Felipe, I think the difference there is that our update rules allow 48.* "unknown" instances to update to 49.0.1. It made sense to hold the unknown status users on 47.0.1 there, because they don't hit the crash at all. But from 48, either they're crashing sometimes or they aren't, and if they are, updating to 49 seems unlikely to make the crash worse/more frequent.   So it makes sense that we have a higher percentage on 47 who don't have the hotfix.
Flags: needinfo?(aalmossawi)
Priority: -- → P2
At this point, we've tried shipping a hotfix and a system add-on, but we still can see many installs which haven't gotten either fix, are still on 47.0.1, with Websense status unknown. Our next step is to ship 47.0.2 with the websense-detecting code.  I'm not sure who would write that. Matt? Felipe? 

We also probably need to create a relbranch and work with that so we have somewhere to land this and build from. Catlee, can you help find an owner to help with this extra releng work?
Flags: needinfo?(mhowell)
Flags: needinfo?(felipc)
Flags: needinfo?(catlee)
With the system add-on, the same code can be shipped built-in. So nothing new needs to be written, it's just a matter of landing the patch from bug 1306081 in the right tree/branch.
Flags: needinfo?(felipc)
Is it possible to ship this as a special MAR rather than a whole new release? Basically ship the system addon via the regular firefox update mechanism?
Flags: needinfo?(catlee)
No, that mechanism can only work with new releases. And we'd have to bump the version number to avoid tremendous confusion on all sides, so we really are talking about 47.0.2. But we'd have a partial MAR as usual of course, so it's not like 47.0.1 users would be redownloading the whole browser.
Flags: needinfo?(mhowell)
It sounds like this will be difficult (and uncertain) enough for releng that we may not want to do it. The last time we tried to ship out of the usual channels was for 38.0.5: https://wiki.mozilla.org/Firefox/Channels/Postmortem/38.0.5

Rail is investigating what it would take to do this in bug 1309894.
I have a machine nearby which doesn't seem to get the Websense detection code. Please tell me if there is anything I can help with on that install.
Flags: needinfo?(felipc)
Hi! Can you toggle the following prefs to true?

extensions.logging.enabled
devtools.chrome.enabled

Then, on the browser console, clear any existing output and run the following:

Components.utils.import("resource://gre/modules/AddonManager.jsm");
AddonManagerPrivate.backgroundUpdateCheck();

And watch the output for anything important or concerning. If everything goes well, there should be a message like "Installing new system add-on set". The new set will only appear and be enabled after a restart. You can look in about:support before and after to compare and see if "Websense Helper" got installed.

This will trigger an update check and that will let us clarify if the check is not being triggered, or if there's an error getting it downloaded or installed.
Flags: needinfo?(felipc)
Flags: needinfo?(xidorn+moz)
So I tried the steps in comment 18, and the attachment is the log file from the browser console.

After running that, it seems the update url does get "(nowebsense)" attached, and Firefox starts trying to upgrade itself then.

However, it seems even after this, there is no "Websense Helper" listed in about:support, and I think the code actually adding that tag is the Firefox Hotfix which I believe I didn't see before I ran that command.
Flags: needinfo?(xidorn+moz)
Xidorn, two questions:

1 - Do you have a non-default setting for updates? In Preferences > Advanced > Updates

2 - Did you restart Firefox after running that command? The new system add-on set requires Firefox to be restarted to update
Flags: needinfo?(xidorn+moz)
(In reply to :Felipe Gomes (needinfo me!) from comment #21)
> 1 - Do you have a non-default setting for updates? In Preferences > Advanced
> > Updates

It doesn't seem so. It has the recommended option selected.

> 2 - Did you restart Firefox after running that command? The new system
> add-on set requires Firefox to be restarted to update

No. IIRC the url pref was updated without restart. And when I opened About dialog, it started downloading immediately.
Flags: needinfo?(xidorn+moz)
Whiteboard: [platform-rel-Forcepoint]
Matt, it sounds like releng is getting close to being able to ship a 47.0.2 release from a relbranch. Can you create a patch to detect websense that we could land, once we know where to land it?  Thanks.
Flags: needinfo?(mhowell)
Just noticed comment 12 says we already have the patch. Great.
Andrei, if we were ready to do this build in the next day or two, would your team be able to do some testing on 47.0.2?  We can plan to give you a couple of days (it doesn't have to happen overnight).
Flags: needinfo?(mhowell) → needinfo?(andrei.vaida)
(In reply to Liz Henry (:lizzard) (needinfo? me) from comment #24)
> Just noticed comment 12 says we already have the patch. Great.
> Andrei, if we were ready to do this build in the next day or two, would your
> team be able to do some testing on 47.0.2?  We can plan to give you a couple
> of days (it doesn't have to happen overnight).

Hey Liz, we're already outlining a plan for testing this on 47.0.2 and I think we can easily accommodate it if we'll have more than a day to sign it off.

Have there been any developments on this matter? When will 47.0.2 gtb?
Flags: needinfo?(andrei.vaida)
Our test plan for this specific patch is here: https://public.etherpad-mozilla.org/p/1305787. Feel free to provide feedback or other suggestions/scenarios.
47.0.2 has been released.

The checks were added to 47.0.2 itself rather than being pushed out separately, and all users from the previous watershed of 43.0.1 through 47.0.1 now have the update to 47.0.2 advertised to them.

Closing this out as wontfix since the push of the checks to 47.0.2 happened in another bug, and the issue with system add-ons not making it to a large number of users is filed as bug 1307568.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Have we verified that the new update pings (from 47.0.2) actually have the websense status reported correctly, and that the trend is in the right direction?
Flags: needinfo?(robert.strong.bugs)
This bug didn't land any code. I can look for the bug that landed the code where the things you asked for should occur if you would like? Also, do you want me to take on those things you are asking for or should it be the person that implemented it?
Flags: needinfo?(robert.strong.bugs) → needinfo?(benjamin)
This bug isn't about code: it's about not having the right update pings. So I don't think we should resolve this bug until we've verified that we are now actually getting the right update pings. I don't know whether that's something you should do or Liz (or somebody else along the way). But I don't feel comfortable closing this until we know it's actually done.
Flags: needinfo?(benjamin)
OK... I'll take care of it.
Assignee: nobody → robert.strong.bugs
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
Summary: High number of users whose Websense status is still unknown → Verify that the changes to the update url are actually applied in the 47.0.2 release
Status: REOPENED → ASSIGNED
bhearsum, I understand you did an analysis of the update pings at some point. Since I don't have access to the logs and you presumably have prior work could you do the analysis again for 47.0.2 clients?
Flags: needinfo?(bhearsum)
Perhaps rail as well.

Note: I'll take a look at telemetry to see if 47.0.2 clients are and are not being offered updates since if they are / are not this will also be an indicator.
I also received a couple of emails from felipe (the person that wrote the system add-on) regarding this... will take over from here.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #33)
> Perhaps rail as well.
> 
> Note: I'll take a look at telemetry to see if 47.0.2 clients are and are not
> being offered updates since if they are / are not this will also be an
> indicator.

I'm going to redirect to rail and other release folks on this. I did initial analysis because I was the only one with log access, but others have access now (and have much more context on this overall).
Flags: needinfo?(bhearsum) → needinfo?(rail)
Created a sample of version movement for the last day.

47.0.2 Total: 33742

Matches in MainSummary: 11447

Distribution of clients last version in MainSummary that have reported 47.0.2 in telemetry

Version | Count |
--------+-------+
 46.0.1 |     1 |
 47.0   |     1 |
 47.0.1 |     9 |
 47.0.2 |  8113 |
 48.0   |     3 |
 48.0.1 |     1 |
 48.0.2 |    30 |
 49.0.1 |     6 |
 49.0.2 |  3283 |

The other versions may have an issue with timing.

This is just to show the general trend of movement over a very short time period since that is all I have to work with at this time. The majority of clients aren't in the MainSummary parquet yet and this data should be more reasonable in the near future.
Limiting the date range to 20161102

47.0.2 Total: 2019

Matches in MainSummary: 873

Distribution of clients last version in MainSummary that have reported 47.0.2 in telemetry

Version | Count |
--------+-------+
 46.0.1 |     1 |
 47.0   |     1 |
 47.0.1 |     3 |
 47.0.2 |   425 |
 48.0   |     1 |
 48.0.2 |     3 |
 49.0.2 |   439 |

At least to me there doesn't appear to be anything to be concerned about. I'll check again next week.
I analyzed (a fancy word for grepping) the 2016-11-04 balrog logs.

Total update pings from 47.0.2: 1,189,556
Update pings with "(websense)": 21,591
Update pings with "(nowebsense)": 1,068,173
Update pings without "websense": 99,792


Most of the pings without websense info (92,157) are modified by Avast (avast=1 in the query). 
This leaves us with 7,635 out of 1,189,556 update pings without websense information.
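
The grep-style counting behind these numbers could be reproduced roughly as follows. This is a sketch under assumptions, not the commands that were actually run: `classify_ping` and `tally` are hypothetical names, the two sample requests are the ones quoted in this comment, and the `?avast=1` form is a guess, since the thread only says Avast pings carry avast=1 in the query.

```python
from collections import Counter
from urllib.parse import unquote

BASE = ("https://aus5.mozilla.org:443/update/6/Firefox/47.0.2/"
        "20161031133903/WINNT_x86-msvc-x64/")
UNTAGGED = BASE + ("pt-PT/release/Windows_NT%2010.0.0.0%20(x64)/"
                   "SSE3/default/default/update.xml")
TAGGED = BASE + ("en-US/release/Windows_NT%2010.0.0.0%20(x64)(nowebsense)/"
                 "SSE3/default/default/update.xml")
# Hypothetical request shape for an Avast-modified ping.
AVAST = UNTAGGED + "?avast=1"


def classify_ping(request):
    """Bucket one update request by the websense tag it carries, if any."""
    text = unquote(request)  # tolerate percent-encoded parentheses
    if "(nowebsense)" in text:
        return "nowebsense"
    if "(websense)" in text:
        return "websense"
    if "avast=1" in text:
        return "no-info-avast"
    return "no-info"


def tally(requests):
    """Count buckets for requests coming from 47.0.2 clients only."""
    return Counter(
        classify_ping(r) for r in requests if "/Firefox/47.0.2/" in r
    )
```

Calling `tally` on an iterable of log lines gives the four counts in one pass; it does not deduplicate by IP address or client.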


Example URL without websense info:
"https://aus5.mozilla.org:443/update/6/Firefox/47.0.2/20161031133903/WINNT_x86-msvc-x64/pt-PT/release/Windows_NT%2010.0.0.0%20(x64)/SSE3/default/default/update.xml HTTP/1.1" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0"

Example URL with websense info:
"https://aus5.mozilla.org:443/update/6/Firefox/47.0.2/20161031133903/WINNT_x86-msvc-x64/en-US/release/Windows_NT%2010.0.0.0%20(x64)(nowebsense)/SSE3/default/default/update.xml HTTP/1.1" "Mozilla/5.0 (Windows NT 10.0; WOW64; rv:47.0) Gecko/20100101 Firefox/47.0"
Flags: needinfo?(rail)
Rail, thank you very much!

The systems that didn't pick up the change and weren't sending the Avast update ping comprise 0.64% of all systems.
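
The 0.64% figure follows directly from the counts in Rail's comment. A quick check in Python, using only those numbers (strictly speaking these are update pings, not distinct systems):

```python
total = 1_189_556          # update pings from 47.0.2
websense = 21_591
nowebsense = 1_068_173
no_info = 99_792
avast = 92_157             # no-info pings carrying avast=1

# The three buckets account for every ping.
assert websense + nowebsense + no_info == total

remaining = no_info - avast
share = remaining / total * 100
print(f"{remaining} pings, {share:.2f}% of the total")
# prints "7635 pings, 0.64% of the total"
```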

Note: some subset of these are likely due to the app.update.url.override which I am planning on removing.

Liz and Benjamin, are you both ok with closing this bug now?
Flags: needinfo?(lhenry)
Flags: needinfo?(benjamin)
As an aside, a similar analysis was done for the system add-on that was pushed to users with the following results.

On Oct 2, 2016 there were 93,399,345 updates in total. The websense related break down looks like this:

* with "(websense)" in the update ping, total 41,270 (0.04%)
  48.x:   17406
  47.x:   17406
  other:   2624

* with "(nowebsense)" in the update ping, total 7,333,708 (7.9%)
  47.x:  3,360,795
  48.x:  2,834,095
  other: 1,138,818

* no information about websense, total 86,024,367 (92.1%)
  47.x:    9,136,373
  48.x:    3,121,208
  other: 73,766,786
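
The percentages above can be rechecked from the totals with a short Python sketch. It also compares each per-version breakdown with its stated total; with the figures as transcribed in this comment, the "(websense)" rows add up to 37,436 rather than 41,270, so only the totals are used for the shares.

```python
total = 93_399_345
groups = {
    "websense": (41_270, [17_406, 17_406, 2_624]),
    "nowebsense": (7_333_708, [3_360_795, 2_834_095, 1_138_818]),
    "no info": (86_024_367, [9_136_373, 3_121_208, 73_766_786]),
}

# The three group totals account for every update ping.
assert sum(stated for stated, _ in groups.values()) == total

# Share of all pings per group, as a percentage.
shares = {name: stated / total * 100 for name, (stated, _) in groups.items()}

# Groups whose per-version rows do not add up to the stated total.
mismatch = {
    name: stated - sum(parts)
    for name, (stated, parts) in groups.items()
    if sum(parts) != stated
}

for name, pct in shares.items():
    print(f"{name}: {pct:.2f}%")
```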

It would be a good thing to get some attention on bug 1307568 and other bugs related to system add-ons not being successfully deployed to clients.
Rail, can I get a sample ping from an Avast client. I will likely need to manage our relationship with Avast again and this info might be handy.
Flags: needinfo?(rail)
Rail, I'm not asking you to do this but did you exclude duplicate IP Addresses? I ask because iirc Avast checks for updates to all of the products it tracks much more often than we do as well as checks when Firefox isn't running. I'm just trying to figure out why their number is so high.
Flags: needinfo?(rail)
Nope, I just grepped regardless of the IP addresses.
Flags: needinfo?(rail)
Thanks again and that makes sense for the large numbers of avast pings.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #37)
> Limiting the date range to 20161102
> 
> 47.0.2 Total: 2019
> 
> Matches in MainSummary: 873
> 
> Distribution of clients last version in MainSummary that have reported
> 47.0.2 in telemetry
> 
> Version | Count |
> --------+-------+
>  46.0.1 |     1 |
>  47.0   |     1 |
>  47.0.1 |     3 |
>  47.0.2 |   425 |
>  48.0   |     1 |
>  48.0.2 |     3 |
>  49.0.2 |   439 |
> 
> At least to me there doesn't appear to be anything to be concerned about.
> I'll check again next week.

Same restriction of 20161102

47.0.2 Total: 2019

Matches in MainSummary: 952

Version | Count |
--------+-------+
 46.0.1 |     3 |
 47.0   |     1 |
 47.0.1 |     3 |
 47.0.2 |   365 |
 48.0.2 |     5 |
 49.0.2 |   575 |

I'll increase the date range from just 20161102 to 20161102 - 20161104 and do another comparison of movement over days similar to the comparison above but over a larger range so those that don't boot their computer and run Firefox that often have a greater chance of being included in the results.
Just realized that the 47.0.2 Total isn't for unique client IDs, and the number of unique client IDs matched to the MainSummary is around 99.4%, so the 47.0.2 Total is extremely close to the Matches in MainSummary number. I'll take that into account for the larger range.
Clients that reported 47.0.2

Date Range: 20161102 - 20161104

47.0.2 Total: 21897

Matches in MainSummary: 21772 (99.43% of all pings)

Version | Count | Percent |
--------+-------+---------+
 39.0   |     2 |   0.01% |
 40.0.3 |     1 |   0.00% |
 41.0.2 |     1 |   0.00% |
 42.0   |     1 |   0.00% |
 43.0.1 |    11 |   0.05% |
 43.0.4 |     1 |   0.00% |
 44.0   |     1 |   0.00% |
 44.0.2 |     3 |   0.01% |
 45.0   |     2 |   0.01% |
 45.0.1 |     3 |   0.01% |
 45.0.2 |     2 |   0.01% |
 46.0.1 |    10 |   0.05% |
 47.0   |     7 |   0.03% |
 47.0.1 |    66 |   0.30% |
 47.0.2 | 14327 |  65.80% |
 48.0   |     1 |   0.00% |
 48.0.2 |    61 |   0.28% |
 49.0   |     2 |   0.01% |
 49.0.1 |    11 |   0.05% |
 49.0.2 |  7259 |  33.34% |
--------+-------+---------+
        | 21772 | 100.00% |

I'll check again how these clients are doing sometime next week so they have a chance to boot their computers, run Firefox, send a telemetry ping, etc.
That satisfies me: I think maybe we should just publish updates to the avast users and risk them being broken by websense? But Liz should review this also.
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #49)
> That satisfies me: I think maybe we should just publish updates to the avast
> users and risk them being broken by websense? But Liz should review this
> also.

AFAIK Balrog cannot match based on the query string (avast is passed as part of the query string).
Oh, btw, we are not blocking those users from updates, we only block the users with "(websense)" in the ping.
We're still blocking users whose websense status is unknown. We don't do anything in particular relating to Avast. 

Crash-stats ADI data looks good, and I'll keep an eye on it this week. You can see when we shipped 47.0.2 and that even through the weekends, users are updating to 47.0.2 and beyond -- the ADI for 49.0.2 increased significantly. 

https://crash-stats.mozilla.com/crashes-per-day/?p=Firefox&v=47.0.1&v=47.0.2&v=48.0.2&v=49.0.2&hang_type=any&os=Windows&os=Mac+OS+X&os=Linux&date_start=2016-10-19&date_end=2016-11-07&submit=Generate
Flags: needinfo?(lhenry)
(In reply to Benjamin Smedberg [:bsmedberg] from comment #49)
> That satisfies me: I think maybe we should just publish updates to the avast
> users and risk them being broken by websense? But Liz should review this
> also.
Avast has been very good about implementing the changes we implement such as this. I'd prefer to at least contact them and give them the opportunity to implement the check we added, if for no other reason than to continue our relationship in good standing.
(In reply to Robert Strong [:rstrong] (use needinfo to contact me) from comment #48)
> Clients that reported 47.0.2
> 
> Date Range: 20161102 - 20161104
> 
> 47.0.2 Total: 21897
> 
> Matches in MainSummary: 21772 (99.43% of all pings)
> 
> Version | Count | Percent |
> --------+-------+---------+
>  39.0   |     2 |   0.01% |
>  40.0.3 |     1 |   0.00% |
>  41.0.2 |     1 |   0.00% |
>  42.0   |     1 |   0.00% |
>  43.0.1 |    11 |   0.05% |
>  43.0.4 |     1 |   0.00% |
>  44.0   |     1 |   0.00% |
>  44.0.2 |     3 |   0.01% |
>  45.0   |     2 |   0.01% |
>  45.0.1 |     3 |   0.01% |
>  45.0.2 |     2 |   0.01% |
>  46.0.1 |    10 |   0.05% |
>  47.0   |     7 |   0.03% |
>  47.0.1 |    66 |   0.30% |
>  47.0.2 | 14327 |  65.80% |
>  48.0   |     1 |   0.00% |
>  48.0.2 |    61 |   0.28% |
>  49.0   |     2 |   0.01% |
>  49.0.1 |    11 |   0.05% |
>  49.0.2 |  7259 |  33.34% |
> --------+-------+---------+
>         | 21772 | 100.00% |
> 
> I'll check again how these clients are doing sometime next week so they have
> a chance to boot their computers, run Firefox, send a telemetry ping, etc.

Clients that reported 47.0.2

From all pings
Date Range: 20161102 - 20161104

From MainSummary
Date Range: 20161102 - 20161108 (updated every night)

47.0.2 Total: 21897

Matches in MainSummary: 21831 (99.70% of all pings)

Version | Count | Percent |
--------+-------+---------+
 39.0   |     3 |   0.01% |
 39.0.3 |     1 |   0.00% |
 40.0.3 |     1 |   0.00% |
 41.0   |     1 |   0.00% |
 41.0.2 |     1 |   0.00% |
 43.0.1 |    52 |   0.24% |
 43.0.3 |     1 |   0.00% |
 43.0.4 |     3 |   0.01% |
 44.0   |     1 |   0.00% |
 44.0.2 |     8 |   0.04% |
 45.0   |     7 |   0.03% |
 45.0.1 |     6 |   0.03% |
 45.0.2 |     7 |   0.03% |
 46.0   |     2 |   0.01% |
 46.0.1 |    28 |   0.13% |
 47.0   |    17 |   0.08% |
 47.0.1 |   108 |   0.49% |
 47.0.2 |  9307 |  42.63% |
 48.0   |     1 |   0.00% |
 48.0.1 |     1 |   0.00% |
 48.0.2 |    92 |   0.42% |
 49.0   |     2 |   0.01% |
 49.0.1 |    16 |   0.07% |
 49.0.2 | 12163 |  55.71% |
 50.0   |     2 |   0.01% |
--------+-------+---------+
        | 21831 | 100.00% |

The movement of clients from 47.0.2 to 49.0.2 is easily seen over the 4 days from the initial query on Saturday to today's query on Wednesday.
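
To put numbers on that movement, the shares of the two dominant versions can be recomputed from the counts in the two tables above. A small Python sketch, with `share` as a hypothetical helper:

```python
def share(count, total):
    """Percentage of matched clients, rounded like the tables above."""
    return round(100 * count / total, 2)


# Counts copied from the two tables (MainSummary matches per query).
first = {"total": 21_772, "47.0.2": 14_327, "49.0.2": 7_259}
second = {"total": 21_831, "47.0.2": 9_307, "49.0.2": 12_163}

for version in ("47.0.2", "49.0.2"):
    before = share(first[version], first["total"])
    after = share(second[version], second["total"])
    print(f"{version}: {before:.2f}% -> {after:.2f}% ({after - before:+.2f} points)")
```

The 47.0.2 share drops by roughly 23 percentage points between the two queries while the 49.0.2 share rises by about the same amount, which matches the trend described.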

I'll contact Avast in the very near future.

Closing per comment #49 and comment #52

If there are other questions please file a new bug.
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
I contacted Avast and they will be adding the websense detection and the detection added in bug 1311515