Closed Bug 71569 Opened 24 years ago Closed 9 years ago

[META] User-Agent string tracking bug

Categories

(Core :: Networking: HTTP, defect, P5)

RESOLVED FIXED
Future

People

(Reporter: dbaron, Assigned: dbaron)

Details

(Keywords: meta)

This is a tracking bug for issues related to the User-Agent string,
which is sent via HTTP and is also accessible from JS through the
|navigator| object.

The current UA-string spec is at
http://mozilla.org/build/revised-user-agent-strings.html

The current UA-string code is in
http://lxr.mozilla.org/mozilla/ident?i=InitUserAgentComponents

Navigator object code lives in
(old) http://lxr.mozilla.org/mozilla/source/dom/public/idl/base/Navigator.idl
(new) http://lxr.mozilla.org/mozilla/source/dom/public/idl/base/nsIDOMNavigator.idl
http://lxr.mozilla.org/mozilla/ident?i=NavigatorImpl
Status: NEW → ASSIGNED
Keywords: meta
Priority: -- → P5
Target Milestone: --- → Future
For quick-ref info: UI to override U-A string in bug 46029; UI to edit all prefs
in bug 17199.
When the user agent spec was originally hammered out in the .netlib group, the 
PrereleaseVersion consisted of a unique(ish) searchable token followed by the 
actual version. Without such a token it can be quite difficult to pull the 
version out of the middle of the comment string, given that the optional 
(localization and CPU) bits may be suppressed in the future. Ekrock suggested 
"prv:" (pre-release version) as it'd be pretty nearly unique.

I don't think we'll ever be able to change Mozilla/5.0 to anything else. We 
can't bump it back to match the actual Mozilla version (conflicts with old 
Navigators) and we know that at least some sites are looking for 5.0 
specifically and might be thrown by a more specific version. We could leave a 
permanent "5.0" just as IE 5 remains "Mozilla/4.0 (compatible" with the real 
version later in the string.

Given that, a searchable marker token is even more important. I suggest a 
slight change from ekrock's suggestion, "mrv:" for Mozilla release version. It 
doesn't really matter what it is though, as long as it's something unique 
that's highly unlikely to appear elsewhere in the user agent; the mnemonic 
value of something like mrv: is better than something completely random.
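
To illustrate why a marker token makes extraction easy, here is a minimal Python sketch. The "mrv:" token is only the proposal from this comment, and the sample user-agent string and the function name are hypothetical, made up for illustration.

```python
import re

# Hypothetical user-agent string; "mrv:" is the marker proposed in this
# comment, not something any shipped browser is known to send.
SAMPLE_UA = "Mozilla/5.0 (X11; U; Linux i686; en-US; mrv:0.9.1) Gecko/20010607"

# The marker lets us find the version without knowing which optional
# fields (locale, CPU, ...) are present or suppressed around it.
MARKER_RE = re.compile(r"\bmrv:([0-9A-Za-z.+]+)")


def release_version(ua):
    """Return the version that follows the marker token, or None if absent."""
    match = MARKER_RE.search(ua)
    return match.group(1) if match else None
```

Because the search keys on the token rather than on position, dropping or reordering the other comment fields should not affect the result.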

The Gecko comment section was removed long ago, but it might actually have 
important value for branded versions like Netscape. When Netscape shipped 6.0 
the user agent contained "Gecko/20001108". That was a lie, though: it was not 
based on a November Gecko, it was based on Gecko/20000922 -- 11/08 was merely 
the build date. The 6.01 user agent was even more wrong.

We're not going to stop people like Netscape from wanting to embed their build 
date in the user-agent string, but we should push for the Gecko date to be the 
real Gecko date when we go on a branch, with the build date elsewhere. We 
could raise a stink and make Netscape put that in their own Vendor comment, but 
in fact mozilla.org has a similar problem since technically they ship milestone 
builds off a short branch. The mozilla.org branches are so small that no one is 
likely to care, but it is technically the wrong Gecko date in the user agent, 
and if we come up with a generic mechanism that mozilla.org uses (the Gecko 
comment?) then it'll be that much easier to get commercial vendors to do the 
right thing.

Of course this completely conflicts with the desire to keep the user agent 
short.
Rather than a Gecko comment to indicate branch-ness it could also be done 
fairly easily as a subversion. Netscape 6.0 branched from 20000922 and shipped 
on 20001108 so the Gecko version could be 20000922.47 (build date 47 days after 
the branch), or perhaps to make the math easier 20000922.186 (922+186 = 1108, 
the real build date).
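
The arithmetic behind the two suffixes can be checked with a short Python sketch. The function name is hypothetical, and the second scheme is treated here as a plain subtraction of month/day numbers, which is an assumption about what the comment intends and only makes sense within one calendar year.

```python
from datetime import date


def branch_suffixes(branch, build):
    """Return the two candidate Gecko version strings for a branch build.

    First: branch date plus the number of days since the branch.
    Second: branch date plus the MMDD difference, so that adding the
    suffix to the branch MMDD gives the build MMDD (same year only).
    """
    days = (build - branch).days
    mmdd_branch = branch.month * 100 + branch.day
    mmdd_build = build.month * 100 + build.day
    stamp = branch.strftime("%Y%m%d")
    return f"{stamp}.{days}", f"{stamp}.{mmdd_build - mmdd_branch}"


# Netscape 6.0: branched 2000-09-22, built 2000-11-08.
by_days, by_mmdd = branch_suffixes(date(2000, 9, 22), date(2000, 11, 8))
```

For these dates the two results are 20000922.47 and 20000922.186, matching the figures in the comment.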

If people are going to want to do math with the version as a whole to determine 
which version is larger then we could also consider zero-padded variants of the 
above. Three digits (enough for almost three years) might be sufficient in the 
first case, but the second case would really require five, because I can easily 
imagine a stable branch lasting more than a year, which is all that four digits 
would give us. Visually the first two examples would turn into 20000922.047 and 
20000922.00186 -- a bit bulky, especially considering how short most milestone 
branches are and how rare long-lived branches are.
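
A small Python sketch of why the padding matters when the whole version is compared as one number. The helper names are hypothetical; the example assumes two builds from the same branch under the first (days-since-branch) scheme, taken 47 and 186 days in.

```python
from decimal import Decimal


def padded(stamp, suffix, width):
    """Join a Gecko date stamp and a zero-padded numeric suffix."""
    return f"{stamp}.{suffix:0{width}d}"


def is_older(a, b):
    """Compare two version strings as exact decimal numbers."""
    return Decimal(a) < Decimal(b)


# Unpadded, day 47 reads as .47 and day 186 as .186, so the earlier
# build looks like the newer one when compared numerically.
unpadded_ok = is_older("20000922.47", "20000922.186")

# Padded to a fixed width, the numeric order matches the build order.
padded_ok = is_older(padded("20000922", 47, 3), padded("20000922", 186, 3))
```

Here unpadded_ok is False and padded_ok is True, which is the ordering problem the fixed width is meant to avoid.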
This is a tracking bug.
(What I meant by my previous comment was that if you actually want me (or
someone else :-) to do something about these issues, they would be better
off in a separate bug, since I wasn't planning to look at this bug except
when I wanted to find / track user-agent issues.)
Depends on: 150351
Adding the following bugs to this tracking bug.

Bug 46029 - [RFE] debug-only GUI for multiple User-Agent prefs like in Opera
Bug 65764 - New mozilla user-agent string
Bug 80658 - [RFE] Site specific User-Agent
Bug 92716 - Need way to completely disable user-agent header
Bug 102042 - Separate the useragent used by http from the real one seen by plug-ins
Bug 115773 - navigator.appVersion doesn't change when spoofing useragent
Bug 129038 - mozilla has "U;" in useragent when PSM is not installed

Apologies if these do not belong, or if I added them incorrectly.
Depends on: 168778
Adding [META] to summary.
Summary: User-Agent string tracking bug → [META] User-Agent string tracking bug
Depends on: 83376
QA Contact: tever → networking.http
The EFF has a nice site that shows how unique a user-agent string is:
http://panopticlick.eff.org/

Please take privacy seriously.
http://www.theregister.co.uk/2010/05/17/browser_fingerprint/

Ran test on http://panopticlick.eff.org/ to find out my browser leaves a unique
fingerprint in 900,000+ samples they have received so far.
(In reply to comment #9)
> http://www.theregister.co.uk/2010/05/17/browser_fingerprint/
> 
> Ran test on http://panopticlick.eff.org/ to find out my browser leaves a unique
> fingerprint in 900,000+ samples they have received so far.

Yes, but what does that fingerprint say about you? I took the same test, and it said that I'm using the SeaMonkey 2.1a1pre nightly for Linux dated 2010-05-10 (like one browser in 511,477), that I have a certain combination of HTTP_ACCEPT values, a certain combination of installed plugins and a certain combination of installed fonts (each of the three separately like exactly one browser in 1,022,954) and that the rest of my identifying data isn't particularly newsworthy (between one in 1.22 for cookies, which I had set at "allow for session", and one in 37.53 for screen size and color depth). So what? My user-agent string, even though pretty uncommon, is not even the most "unique" thing about me (and in normal circumstances it changes every day, except that at the moment I'm waiting for the new addons manager UI to stabilize); and I'm not sure how useful it is to broadcast my installed fonts and plugins all over the web, but I have other things to worry about.
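
The "one in N" figures above can be put on a common scale by converting them to bits of identifying information, i.e. log2(N). A minimal Python sketch follows; the function name is hypothetical, and nothing here assumes the individual measurements are independent, so the values are not meant to be added up.

```python
import math


def bits_of_identifying_info(one_in_n):
    """Convert a "one browser in N" frequency to bits of information."""
    return math.log2(one_in_n)


# Figures quoted in the comment above.
ua_bits = bits_of_identifying_info(511_477)      # user-agent string
fonts_bits = bits_of_identifying_info(1_022_954)  # fonts (also plugins, HTTP_ACCEPT)
cookie_bits = bits_of_identifying_info(1.22)     # cookie setting
screen_bits = bits_of_identifying_info(37.53)    # screen size and color depth
```

On this scale the user-agent string comes to roughly 19 bits, exactly one bit less than each of the three one-in-1,022,954 measurements, while the cookie setting contributes well under one bit.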
P.S. I took it again with the standard version of Konqueror which comes with openSUSE 11.2, and that site hadn't even seen its user-agent string yet, not to mention the rest. I guess if you really want to look anonymous, get the second-latest Micro$oft OS (Vista, at the moment), don't set any user preferences, and in that case you'll probably look "just anyone".
(In reply to comment #11)
> P.S. I took it again with the standard version of Konqueror which comes with
> openSUSE 11.2, and that site hadn't even seen its user-agent string yet, not to
> mention the rest. I guess if you really want to look anonymous, get the
> second-latest Micro$oft OS (Vista, at the moment), don't set any user
> preferences, and in that case you'll probably look "just anyone".

But if you do any Security updates to the second-latest Micro$oft OS, you will have a unique fingerprint. 

Therefore even if your ISP changes your IP address frequently, a site will still be able to fingerprint you (even after deleting cookies etc.).

Their paper: http://panopticlick.eff.org/browser-uniqueness.pdf

states:

"The obvious solution to this problem would be to make the version numbers
less precise. Why report Java 1.6.0_17 rather than just Java 1.6, or DivX Web
Player 1.4.0.233 rather than just DivX Web Player 1.4?"
Edit: Typos/Grammar

(In reply to comment #11)
> P.S. I took it again with the standard version of Konqueror which comes with
> openSUSE 11.2, and that site hadn't even seen its user-agent string yet, not to
> mention the rest. I guess if you really want to look anonymous, get the
> second-latest Micro$oft OS (Vista, at the moment), don't set any user
> preferences, and in that case you'll probably look "just anyone".

But if you do any Security updates to your second-latest Micro$oft OS, you will
get a unique fingerprint. 

Even if your ISP changes your IP address frequently, a site will
still be able to fingerprint you (even after deleting cookies etc.).

Their paper: http://panopticlick.eff.org/browser-uniqueness.pdf

states:

"The obvious solution to this problem would be to make the version numbers
less precise. Why report Java 1.6.0_17 rather than just Java 1.6, or DivX Web
Player 1.4.0.233 rather than just DivX Web Player 1.4?"
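
As a rough illustration of what the paper suggests, a version string can be truncated to its leading components before being reported. This is only a sketch: the function name and the choice to split on ".", "_" and space are assumptions, not anything the paper or any browser specifies.

```python
import re


def coarsen_version(version, keep=2):
    """Keep only the first `keep` components of a version string.

    Components are split on ".", "_" or a space (assumed separators),
    so an update suffix such as "_17" is dropped along with the rest.
    """
    parts = re.split(r"[._ ]", version)
    return ".".join(parts[:keep])


divx = coarsen_version("1.4.0.233")
java = coarsen_version("1.6.0_17")
```

With the default of two components, the DivX example reduces to 1.4 and the Java example to 1.6, the less precise forms the quote asks for.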
Depends on: 567679
Depends on: 577994
No longer depends on: 577994
Depends on: 625238
No longer depends on: 55366
Status: ASSIGNED → RESOLVED
Closed: 9 years ago
Resolution: --- → FIXED
Depends on: 1520419