Closed Bug 736373 Opened 8 years ago Closed 4 years ago

Limit or remove OS information in User-Agent

Categories

(Core :: Networking: HTTP, enhancement)

enhancement
Not set

Tracking

()

RESOLVED WONTFIX

People

(Reporter: jayhenn, Unassigned)

References

(Blocks 1 open bug)

Details

(Keywords: privacy)

As documented by Peter Eckersley (link below), the User Agent header can be used as an aid in uniquely identifying an individual or computer (Peter estimates that on average about 1/1500 people have the same agent as you).
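
For a rough sense of scale, the 1-in-1,500 figure can be converted into bits of identifying information. This is a minimal sketch only; the helper name `surprisal_bits` is made up for illustration and does not come from the linked article.

```python
import math


def surprisal_bits(share: float) -> float:
    """Bits of identifying information revealed by a value that a
    fraction `share` of all visitors have in common."""
    if not 0.0 < share <= 1.0:
        raise ValueError("share must be in (0, 1]")
    return -math.log2(share)


# About one visitor in 1,500 sends the same User-Agent (estimate quoted above).
ua_bits = surprisal_bits(1 / 1500)
print(f"{ua_bits:.2f} bits")  # about 10.55 bits
```

A value shared by everyone reveals 0 bits, so anything that makes User-Agent strings more uniform pushes this number down.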

One aspect that helps make the User Agent unique is the OS component, which contains not only the type of OS (Linux, Windows, Mac) but also the version being run.

Some examples are:
    (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3)
    (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.3)
    (X11; U; Linux i686; en-US; rv:1.9.0.14) Gecko/2009090216 Ubuntu/9.04 (jaunty)

I propose that an option be added to restrict or eliminate OS information included in the User Agent header. By reducing entropy contained in the header, user privacy can be enhanced!

UI options could include a checkbox on the Privacy tab entitled "Send system information to websites I visit" or perhaps a cascading set of options with one called "Send Operating System type to sites I visit" and a conditional one called "Send Operating System version to sites".

The only downside I can think of would be software download sites which customize download pages depending on the user's OS. These sites could still potentially use javascript. An interesting corollary to this request would be to similarly restrict OS information via javascript.

Respectfully Yours,
Jay

User Agent tracking: https://www.eff.org/deeplinks/2010/01/tracking-by-user-agent
Panopticlick: https://panopticlick.eff.org/
Severity: normal → enhancement
Component: General → Networking: HTTP
Product: Firefox → Core
QA Contact: general → networking.http
Version: unspecified → Trunk
What impact would this have on naive users visiting Web sites that sniff the UA string?
(In reply to David E. Ross from comment #1)
> What impact would this have on naive users visiting Web sites that sniff the
> UA string?

Software download sites are likely the only ones that would be important enough to consider comparing to the user privacy aspect.

As a quick experiment, I tried adjusting a Windows-based User Agent string by removing the OS revision, replacing "(Windows NT ...;" with just "(Windows;", and browsing to some of the more popular multiplatform software download sites (mozilla.org/firefox, openoffice.org and sf.net/projects/scribus).

All continued to offer the Windows download. It even worked when I removed the "Windows;" from the agent as well, indicating that at least some sites are using javascript for OS detection.
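
The string edit used in this experiment could be sketched roughly as follows. This is only an illustration in Python: the function name `strip_os_version` and the Windows sample string are assumptions, and the patterns cover only the Windows and Mac token shapes quoted in this bug.

```python
import re


def strip_os_version(user_agent: str) -> str:
    """Drop the OS version from Windows and Mac tokens, keeping the OS type."""
    # "Windows NT 6.1" -> "Windows"
    ua = re.sub(r"Windows NT \d+(?:\.\d+)*", "Windows", user_agent)
    # "Intel Mac OS X 10.5" -> "Intel Mac OS X"
    ua = re.sub(r"(Mac OS X) \d+(?:[._]\d+)*", r"\1", ua)
    return ua


# Hypothetical Windows string, for illustration only.
sample = "Mozilla/5.0 (Windows NT 6.1; rv:13.0) Gecko/20120408 Firefox/13.0"
print(strip_os_version(sample))
# Mozilla/5.0 (Windows; rv:13.0) Gecko/20120408 Firefox/13.0
```

The `rv:` and `Firefox/` versions are left alone, since only the OS token is in scope here.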
Keywords: privacy
(In reply to jayhenn from comment #0)
> One aspect that helps make the User Agent unique is the OS component, which
> contains not only the type of OS (Linux, Windows, Mac) but also the version
> being run.
> 
> Some examples are:
>     (Windows; U; Windows NT 5.1; en-US; rv:1.9.1.3)
>     (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.3)
>     (X11; U; Linux i686; en-US; rv:1.9.0.14) Gecko/2009090216 Ubuntu/9.04 (jaunty)
 
Currently Firefox outputs this:
      (X11; Linux x86_64; rv:13.0) Gecko/20120408 Firefox/13.0a2
Is that amount of data really too much?

How do you consider Vary: User-Agent and caching will work with something like:
      (X11; rv:13.0) Gecko/20120408 Firefox/13.0

See https://www.varnish-cache.org/docs/trunk/tutorial/vary.html#pitfall-vary-user-agent
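
To make the caching question concrete, here is a toy model of how a cache that honors Vary: User-Agent might key its entries. It is a sketch under simplifying assumptions, not how Varnish or any other real cache is implemented; `cache_key` and `variants` are hypothetical helpers, and the i686 string is an invented variation of the one above.

```python
def cache_key(url: str, request_headers: dict[str, str], vary: str) -> tuple:
    """Build a cache key from the URL plus every request header named in Vary."""
    names = [n.strip().lower() for n in vary.split(",") if n.strip()]
    lowered = {k.lower(): v for k, v in request_headers.items()}
    return (url,) + tuple((n, lowered.get(n, "")) for n in names)


def variants(user_agents: list[str]) -> int:
    """Count how many separate copies of one page the toy cache would store."""
    return len({
        cache_key("/index.html", {"User-Agent": ua}, "User-Agent")
        for ua in user_agents
    })


full = [
    "Mozilla/5.0 (X11; Linux x86_64; rv:13.0) Gecko/20120408 Firefox/13.0",
    "Mozilla/5.0 (X11; Linux i686; rv:13.0) Gecko/20120408 Firefox/13.0",
]
reduced = ["Mozilla/5.0 (X11; rv:13.0) Gecko/20120408 Firefox/13.0"] * 2

print(variants(full), variants(reduced))  # 2 1
```

In this model a coarser string produces fewer cache variants, not more; whether real deployments behave this way is a separate question.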
When someone presents a question in a news.mozilla.org newsgroup (e.g., mozilla.support.thunderbird) and I think I have an answer, one of the first things I do is check the UA string in the message source. If the UA string indicates the use of Windows (which is also the operating system I use), then I describe the answer in terms of what that user should do in Windows. With a Mac or Linux UA string, I might not be able to provide an answer. Without any UA string, I won't know how to answer.

Then, of course, there is the issue of how bugzilla.mozilla.org will indicate the OS of a new bug report so that developers might know if the bug applies in general or to a specific OS.  

As for comment #2:  How do you judge the importance of Web sites that sniff for the UA string?  A large number of financial services Web sites sniff, not only for the browser type but also the version.  Are they not important?
Yeah, the OS is too useful to remove from here, and doing so would probably break compatibility with some sites. There are sites out there with horrible UA sniffing that check the OS along with the browser version to see what they "support", even if there's nothing OS-dependent involved. There is the argument that getting rid of this would get rid of the sniffing surface for idiocies like that, but unless all vendors do it at once I don't think it would have enough effect.

Limiting the info given about the OS, however, sounds like a very promising idea to me. Sites that do need to know the OS, for downloads for example, generally don't need to know what OS version. Dropping that from Windows and Mac (to match what's already been done at some point for Linux) might be a good idea. Keeping the architecture (32/64-bit or x86/PPC for Mac) in there would probably get enough use to be worthwhile, though.

That being said, removal of an OS version from the UA string has been WONTFIXed in bug 414057 already. Because that's already been decided I'm going to mark this one as WONTFIX as well.

Removal of the chip architecture for Mac is bug 728582, by the way.
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago
Resolution: --- → WONTFIX
Is there a newsgroup thread where this and bug 414057 have been decided? If not, we should have that conversation on dev planning or platform, and apply that decision.
(In reply to Tom Lowenthal [:StrangeCharm] from comment #6)
> Is there a newsgroup thread where this and bug 414057 have been decided? If
> not, we should have that conversation on dev planning or platform, and apply
> that decision.

The discussion appears to have all been in bug 414057, and some of it got strangely ugly. It was rejected on super review and with a final WONTFIX from Dan Witte. The prime argument for keeping the OS version is to allow sites to provide compatible program downloads. Some version of something could require Windows 7 instead of XP or Vista, for example. Personally, I'd be in favor of removing the OS versions from Mac and Windows, as I don't think it's really worth it, but I don't think it's a big enough deal to care about. (it's not up to me anyway) This bug is WONTFIX just because of the other bug which would necessarily be a part of any form of this one.
(In reply to John Drinkwater (:beta) from comment #3)
> How do you consider Vary: User-Agent and caching will work with something
> like:
>       (X11; rv:13.0) Gecko/20120408 Firefox/13.0
> 
> See https://www.varnish-cache.org/docs/trunk/tutorial/vary.html#pitfall-vary-user-agent

Thanks for your comment, John.

Examining the 3 platform-specific software download sites listed above (firefox, openoffice and sourceforge), I observed no "Vary: User-Agent" HTTP header. Are you aware if this is used much in the wild?

The impact upon caches (such as varnish) would be no more than the impact of removing the OS or version information.

(In reply to David E. Ross from comment #4)
> [...]
> Without any UA string, I won't know how to answer.  

This bug does not propose to eliminate the UA string completely - just make certain portions optional. News.mozilla.org could easily have a guideline that recommended disabling the "Send OS version to sites" option.

> Then, of course, there is the issue of how bugzilla.mozilla.org will
> indicate the OS of a new bug report so that developers might know if the bug
> applies in general or to a specific OS.  

Bugzilla could use javascript as it does for other things.

> As for comment #2:  How do you judge the importance of Web sites that sniff
> for the UA string?  A large number of financial services Web sites sniff,
> not only for the browser type but also the version.  Are they not important?

That's the beauty of having it be a user preference: users could determine how much they want to tell the sites they access. The idea here is to give users a choice: there's nothing in the standards (like RFC 2616) that requires User Agents to give away so much information about a user's platform.

To answer your question, personally I think the privacy benefits that would be gained by users vastly outweigh the curiosity of financial institutions as to which versions of the OS their customers use. Financial institutions would still know the browsers their customers used.
(In reply to Dave Garrett from comment #5)
> [...]
> That being said, removal of an OS version from the UA string has been
> WONTFIXed in bug 414057 already. Because that's already been decided I'm
> going to mark this one as WONTFIX as well.

I respectfully disagree with this WONTFIX logic and am reverting the status in hopes that more action will be taken. Bug 414057 had some key differences, as it:
1) Would have removed the OS version in all cases, whereas this one adds a UI element to make it optional
2) Was MacOS specific.
Status: RESOLVED → UNCONFIRMED
Resolution: WONTFIX → ---
(In reply to jayhenn from comment #9)
> Bug 414057 had some key differences, as it:
> 1) Would have removed the OS version in all cases, whereas this one adds a
> UI element to make it optional

To this, I guess I should reply to one part in comment 0:

(In reply to jayhenn from comment #0)
> I propose that an option be added to restrict or eliminate OS information
> included in the User Agent header. By reducing entropy contained in the
> header, user privacy can be enhanced!

This is logically wrong, at least in the context of bug 572650. (and the same reason I argued that the do-not-track flag makes you easier to track)

The vast majority of users keep defaults, and even if many were to opt-in to removing some or all OS information from their UA string, those people would be differing from the norm. In the population of all UA strings, this option increases the variance amongst users and thus increases the fingerprintability of these users via this method.
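
The argument can be put in numbers with a very simple model. This is a sketch only: the 30% OS share and the 1% opt-in rate are invented for illustration, and `opt_in_effect` is a hypothetical helper.

```python
import math


def opt_in_effect(os_share: float, opt_in_rate: float) -> tuple[float, float]:
    """Bits revealed by the UA before and after a user opts in.

    before: the user sends the full string, shared by `os_share` of all users.
    after:  the user sends the reduced string, shared only by the
            `opt_in_rate` of users who also turned the option on.
    """
    before = -math.log2(os_share)
    after = -math.log2(opt_in_rate)
    return before, after


# Assumed numbers: 30% of users share this OS token, 1% enable the option.
before, after = opt_in_effect(os_share=0.30, opt_in_rate=0.01)
print(f"{before:.2f} bits before, {after:.2f} bits after")
# 1.74 bits before, 6.64 bits after
```

Under these assumptions the opted-in user stands out more, not less. If the reduced string were the default for everyone (an opt-in rate of 1.0), the same model gives 0 bits.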

Hiding this information by option would only provide a small privacy benefit, and possibly a security benefit by obscuring the fact that certain users are on an obsolete and insecure OS (ex: Windows XP in the future), but even this is arguably not that useful.

If different people were to have varying UA strings because of an option like this, it would probably cause compatibility problems with sites that expect it. It'd really be best to either remove it or not.

> 2) Was MacOS specific.

If this were to be done it should be done equally for all OSes.
(In reply to Dave Garrett from comment #10)
> The vast majority of users keep defaults, and even if many were to opt-in to
> removing some or all OS information from their UA string, those people would
> be differing from the norm. In the population of all UA strings, this option
> increases the variance amongst users and thus increases the
> fingerprintability of these users via this method.
> [...]

I would support making the restriction of OS information a default. If a site really needs to figure it out, then they could either ask the user or have a sane default. Ultimately, if functionality is really that important then it shouldn't depend on the unspecified (like the User Agent string).

Also, from your reference to bug 414057, I noticed in comment 6 that several other MacOS browsers do not include OS version information (presumably they function decently):
> No version of OS X to see:
> Opera: Opera/9.50 (Macintosh; Intel Mac OS X; U; en)
> iCab: iCab/4.0 (Macintosh; U; Intel Mac OS X)
> Safari 3.04: Mozilla/5.0 (Macintosh; U; Intel Mac OS X; en-us)
> AppleWebKit/523.10.6 (KHTML, like Gecko) Version/3.0.4 Safari/523.10.6
> 
> IE Mac: Mozilla/4.0 (compatible; MSIE 5.23; Mac_PowerPC)
> Omniweb: Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-US) AppleWebKit/522+
> (KHTML, like Gecko, Safari/522) OmniWeb/v614.0.94258
I concur that it makes sense to remove OS information entirely. OS info materially increases entropy for fingerprinting, but provides only a very small benefit to a few sites which could readily replace this functionality with user-interaction.
Re comment #12:  In other words, owners of Web sites distributing software tailored to different operating systems will all have to redesign their sites.  I really do not think that will happen.  Instead, those sites will direct users to install different browsers.
I'd really suggest not worrying about fingerprintability in the UA anymore. All the other meaningful issues in the UA and rest of the header have been mostly dealt with, namely addons' junk, obsolete/redundant tokens, language, and the build ID (eventually to be removed fully). We're basically done here, at least when it comes to junk in the header, beautification aside.

If any users actually want an option to remove it, and to actually increase their fingerprintability in the process, they can use one of the addons out there that let them customize their UA all they want. For a small fingerprintability reduction in exchange for breaking a few things I suggest spoofing yourself to be latest stable Firefox on Windows XP or 7 32-bit.

If people insist on more arguing that's not going to go anywhere, then let me save you the trouble. Here's what the debate boils down to:

1) Kill it; it's bad for privacy and fingerprinting and UA sniffing
2) Not that bad; it can be fantastically useful for support and a few other sites
3) Then get it through the user when you actually need it
4) You'd think that were easy, but no, some users have problems with simple things
5) Maybe just restrict it a little bit?
6) Nah, that little bit has some important uses too
7) Ok, it's probably not worth it

If the people who actually are in charge of making such a decision are still against this as they were in the prior bug, then arguing here is just wasted effort. If one of those people would like to ditch the OS version, I think it would be a good idea to file a new concise bug on just that.

Quoting Dan Witte from bug 414057 comment 56:
"This provides useful info, costs five bytes, and is not a significant source of entropy."

Can we go elsewhere and argue about something more important please? ;)
Status: UNCONFIRMED → RESOLVED
Closed: 8 years ago → 8 years ago
Resolution: --- → WONTFIX
I disagree with your assessment. OS in the UA provides "functionality" which is unexpected for most users, and therefore contravenes our #1 privacy principle (https://mozilla.org/privacy)

According to the close timestamp on bug 414057, that conversation happened 20 months ago. Since then, plenty has changed, and new research (like https://research.microsoft.com/pubs/80964/sigcomm09.pdf) has come to light. I think that this is the sort of conversation that should be occurring on message boards, not in a bug.
Status: RESOLVED → REOPENED
Ever confirmed: true
Resolution: WONTFIX → ---
(In reply to Tom Lowenthal [:StrangeCharm] from comment #15)
> I disagree with your assessment. OS in the UA provides "functionality" which
> is unexpected for most users, and therefore contravenes our #1 privacy
> principle (https://mozilla.org/privacy)

We are currently here:
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120505 Firefox/14.0a2

The notion that we should not include information that is not directly and measurably beneficial to the user, or that could cause privacy issues, is absolutely correct, and we must follow it to the letter.
We therefore should, nay must, move to:
User-Agent: Firefox

/s
(In reply to John Drinkwater (:beta) from comment #16)
> We therefore should, nay must, move to:
> User-Agent: Firefox

Sadly, right now, a significant fraction of sites sniff for "Mozilla/5.0" "rv:xx.x)" and "Gecko/xxxxxxxx", so including those has a measurable benefit to users. If we could deal with the fallout, I would wholeheartedly support a UA of "Firefox". Perhaps sites could use media-queries and test for features before using them? If HTML's doctype is now "HTML", the sky is the limit.
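
As an illustration of the kind of check being described, a legacy sniffing rule might look something like the sketch below. The regular expression and the name `legacy_sniff_accepts` are hypothetical and not taken from any real site.

```python
import re

# Hypothetical legacy rule: requires the "Mozilla/5.0" prefix, an "rv:xx.x)"
# token closing the parenthesised section, and an eight-digit "Gecko/" build date.
LEGACY_GECKO = re.compile(r"^Mozilla/5\.0 \(.*rv:\d+(?:\.\d+)*\) Gecko/\d{8}")


def legacy_sniff_accepts(user_agent: str) -> bool:
    """Return True if the toy sniffing rule would treat this UA as Gecko."""
    return LEGACY_GECKO.search(user_agent) is not None


current = "Mozilla/5.0 (X11; Linux x86_64; rv:14.0) Gecko/20120505 Firefox/14.0a2"
no_os = "Mozilla/5.0 (X11; rv:13.0) Gecko/20120408 Firefox/13.0"

print(legacy_sniff_accepts(current))    # True
print(legacy_sniff_accepts(no_os))      # True
print(legacy_sniff_accepts("Firefox"))  # False
```

A rule of this shape would survive dropping the OS token but would reject a bare "Firefox" string. Rules that also inspect the OS token would behave differently.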

Apart from the legacy problem of sites not knowing how to deal with it, what's the technical argument against a UA string of "Firefox"?
Do not be so ready to dismiss the impact on legacy sites.  Instead, have concern for the costs -- both effort and time -- to reprogram such sites.  Since many of those sites are for financial institutions that are required to test their sites, those costs are not negligible.  I suspect that some financial institutions would prefer to block users with Gecko-based browsers instead of reprogramming for a Gecko change that does not benefit them.
I think after the recent PRISM revelations and #torsploit, we should certainly consider implementing this again.
(In reply to David Dahl :ddahl from comment #19)
> I think after the recent PRISM revelations and #torsploit, we should
> certainly consider implementing this again.

The TOR Browser exploit targeted the specific version of the Firefox ESR code base they used, which appears to have been out of date. (possibly with the exploit already fixed in a later version, but don't quote me on that) All target users were using that 3rd party build and as a result the specifics of the UA were fairly meaningless in this situation.

As to future targeting of known vulnerable old versions, yeah, limiting that info would help, as has been said before. That's been the one minor good thing in favor of limiting it. However, just having a decent update system and policy which no longer tolerates users accidentally or defiantly running old insecure versions would be a lot better.

All that being said, this bug is about the OS info, not version. The TOR exploit targeted Windows, which you can just statistically assume, not that it really matters. There's nothing stopping anyone from just trying the different OS' exploits one at a time in order if they have them. Checking the UA for it is just a matter of convenience.

At this point, I think the security/privacy benefits of this particular bug are marginal at best. The real win if the OS info were to be axed would be a drastic reduction in UA sniffing breakage risk for non-Windows OSes. Reducing its precision might be nice but would probably not be that useful and just break stuff as mentioned in older comments.

This bug is old and messy. If you want to reopen the debate over what should be in the UA, please do so, but I would *highly* recommend filing a new bug with a precise suggestion and closing this one or duping it somewhere useful. (ditto for any other bugs in this area that are still open)
(In reply to Dave Garrett from comment #20)
> (In reply to David Dahl :ddahl from comment #19)
> > I think after the recent PRISM revelations and #torsploit, we should
> > certainly consider implementing this again.
> 
> The TOR Browser exploit targeted the specific version of the Firefox ESR
> code base they used, which appears to have been out of date. (possibly with
> the exploit already fixed in a later version, but don't quote me on that)
> All target users were using that 3rd party build and as a result the
> specifics of the UA were fairly meaningless in this situation.
> 

No doubt. These events merely reminded me of this bug and that it is important.

> This bug is old and messy. If you want to reopen the debate over what should
> be in the UA, please do so, but I would *highly* recommend filing a new bug
> with a precise suggestion and closing this one or duping it somewhere
> useful. (ditto for any other bugs in this area that are still open)

Agreed. I will think about a concrete proposal before filing a bug.
See Also: → 1114475
Another reason in favor of this bug is security. Exploit toolkits are known to use user agent strings to find their targets. Removing OS information would make their lives harder, and Firefox users should profit from that. I know that outdated Firefox versions shouldn't be a concern to Mozilla, but they exist and we can't do much about that. If there is a way to improve security for those users (at least a bit), shouldn't it be done?
No more changes are anticipated to the UA string format, as far as I understand.
Status: REOPENED → RESOLVED
Closed: 8 years ago → 4 years ago
Resolution: --- → WONTFIX
Duplicate of this bug: 1495488
I totally disagree with WONTFIX here.
Most websites will still be able to detect the OS by using JavaScript (navigator.platform property).
Please don't include the OS version in the User-Agent string when ResistFingerprint is enabled.
Today Mozilla claims that Firefox blocks trackers. This is simply untrue. It's trivial to track by fingerprint even with JS disabled.

You can try https://amiunique.org/ and see that user agent is a big part of the problem.

For sure, trackers will then use mouse-motion or keyboard typing patterns (every user types on a keyboard differently), but at least it will make tracking more costly, and this can also be fixed on the OS side.