Closed Bug 761873 Opened 8 years ago Closed 7 years ago

Need a way to identify metrics/marketshare for b2g phones using Fennec UA string

Categories

(Firefox OS Graveyard :: General, defect, P2)

ARM
Linux

Tracking

(blocking-kilimanjaro:+, blocking-basecamp:+)

RESOLVED FIXED

People

(Reporter: tchung, Unassigned)

References

Details

(Whiteboard: fixed by bug 777710)

The UA string from Fennec will also be used for B2G.

 Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0

However, this will pose a problem if we want to parse out metrics for market share and other data on B2G phones. This is a tracking bug to figure out how we can add settings and instrumentation, hopefully without touching the UA string.
I propose we make a small tweak like

 Mozilla/5.0 (Android like(b2g); Mobile; rv:12.0) Gecko/12.0 Firefox/12.0

as long as it passes the same tests that the original fennec-android string did for site compat.
(In reply to Tony Chung [:tchung] from comment #0)
> The UA string from fennec will be also used for b2g.

Is that decision taken?

I am not convinced of the rightness of B2G saying it's Android.

(In reply to Chris Jones [:cjones] [:warhammer] from comment #1)
> I propose we make a small tweak like
> 
>  Mozilla/5.0 (Android like(b2g); Mobile; rv:12.0) Gecko/12.0 Firefox/12.0

We'd need to check the various standards; I suspect nested brackets might be something to avoid.

Are we sure "B2G" is the string we want to bake into a UA, rather than some future name we will acquire when the rebranding is done?

Gerv
(In reply to Gervase Markham [:gerv] from comment #2)
> (In reply to Tony Chung [:tchung] from comment #0)
> > The UA string from fennec will be also used for b2g.
> 
> Is that decision taken?
> 

It's not written in stone, but forceful arguments will need to be made to the contrary, IMHO :).

> I am not convinced of the rightness of B2G saying it's Android.
> 

Depending on the definition of "rightness", I could either agree or disagree.  But forceful arguments concerning this should be made more publicly, probably on dev-planning.

> (In reply to Chris Jones [:cjones] [:warhammer] from comment #1)
> > I propose we make a small tweak like
> > 
> >  Mozilla/5.0 (Android like(b2g); Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
> 
> We'd need to check the various standards; I suspect nested brackets might be
> something to avoid.
> 

Yep, definitely.

> Are we sure "B2G" is the string we want to bake into a UA, rather than some
> future name we will acquire when the rebranding is done?
> 

Nope, not sure.  "Gonk" might be more appropriate in this context, since b2g technically doesn't denote any particular OS below gecko.
(In reply to Chris Jones [:cjones] [:warhammer] from comment #3)
> It's not written in stone, but forceful arguments will need to be made to
> the contrary, IMHO :).

Don't get me wrong; I'm entirely in favour of almost every aspect of that - it's just the platform denotation that I am querying.

> > I am not convinced of the rightness of B2G saying it's Android.
> 
> Depending on the definition of "rightness", I could either agree or
> disagree.  But forceful arguments concerning this should be made more
> publicly, probably on dev-planning.

OK. I will open up that discussion.

Is it reasonable to scope it by saying that the two options are:

Mozilla/5.0 (Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
and
Mozilla/5.0 (Gonk, like Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
(or some other way of shoehorning the word Android in)

?

> > Are we sure "B2G" is the string we want to bake into a UA, rather than some
> > future name we will acquire when the rebranding is done?
> 
> Nope, not sure.  "Gonk" might be more appropriate in this context, since b2g
> technically doesn't denote any particular OS below gecko.

Right. If this is an OS designator, then "Gonk" is the name we've given the OS we've built.

Gerv
I think any proposal that elides "Android" will need to be very carefully argued wrt the metrics data used to determine the new fennec UA.

I'm personally happy with "Gonk, like Android" or something to that effect.
Do we know the impact if we were to remove "Android" from the UA?
Metrics team has data we can use to quantify, yeah.
Before we start a discussion, we need some data. John Jensen is the man for that. John: can you please do some more of your mobile UA compatibility testing for us? Let me know if you want a dedicated bug filed (and which product/component to use). The strings are:

Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
(current Android, for control)

Mozilla/5.0 (Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
(Gonk version)

Mozilla/5.0 (Gonk, like Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
(Spoofing version)

Thanks,

Gerv
> John: can you please do some more of your mobile UA compatibility
> testing for us? 

Yes, I can do that. I assume the output is similar to what I produced for bug 588909 and bug 690287. I'll produce something that looks like this:
https://docs.google.com/spreadsheet/ccc?key=0AushOZLFQoR0dGQ0Ry1HYmZGUEg5dXJDYUstS2dwcWc

Look for it in the next day or so.
Hi all, here are some numbers.

Methodology:

For each of the top 18,000 sites on the web, download the home page using each of five different user agents. The fetcher uses the Python requests library, which automatically follows any HTTP 302 redirects.

Then, for each possible pair of UAs for the HTML from each site, a difflib statistic was computed -- 0 means HTML delivered was completely different, while 1 means it was the same.

I then averaged the data across all sites to come up with this:

The UAs were:

UA1: Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
UA2: Mozilla/5.0 (Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
UA3: Mozilla/5.0 (Gonk, like Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
IPHONE: Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.32
ANDROID: Mozilla/5.0 (Linux; U; Android 4.0; es-es; Tuna Build/IFK77E) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.67
    
Mean difflib scores were:

0.98803282838 ('UA1', 'UA3')
0.97271203903 ('ANDROID', 'IPHONE')
0.971233293632 ('ANDROID', 'UA1')
0.970770961442 ('ANDROID', 'UA3')
0.956681343144 ('IPHONE', 'UA1')
0.956539898413 ('IPHONE', 'UA3')
0.898223443277 ('UA2', 'UA3')
0.896391374783 ('UA1', 'UA2')
0.884978598564 ('ANDROID', 'UA2')
0.875707224963 ('IPHONE', 'UA2')

This means that, for example, the content delivered to UA1 was most similar to that delivered to UA3. The content delivered to UA2 was most dissimilar from that delivered to the iPhone or Android.

Given the above, it appears that from a compatibility perspective having the word "Android" in the UA helps most.

I am happy to provide the raw data or code used to generate it, just ask.

John
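For anyone wanting to reproduce the comparison, a minimal Python sketch of the pairwise scoring might look like this. It is not the code used for the numbers above: it assumes the "difflib statistic" was `difflib.SequenceMatcher.ratio()`, the function names are hypothetical, and the sample pages stand in for HTML that the real run fetched per UA (for example with `requests.get(url, headers={"User-Agent": ua})`).

```python
import difflib
from itertools import combinations
from statistics import mean


def pair_scores(pages_by_ua):
    """Return {(ua_a, ua_b): ratio} for one site's pages, keyed by UA label."""
    scores = {}
    for a, b in combinations(sorted(pages_by_ua), 2):
        # Assumption: ratio() is the statistic; 1.0 means identical HTML.
        matcher = difflib.SequenceMatcher(None, pages_by_ua[a], pages_by_ua[b])
        scores[(a, b)] = matcher.ratio()
    return scores


def mean_scores(sites):
    """Average each pair's ratio across sites (a list of pages_by_ua dicts)."""
    collected = {}
    for pages_by_ua in sites:
        for pair, score in pair_scores(pages_by_ua).items():
            collected.setdefault(pair, []).append(score)
    return {pair: mean(values) for pair, values in collected.items()}


# Hypothetical stand-in data: one site that sniffs for "Android", one that does not.
sites = [
    {"UA1": "<p>mobile</p>", "UA2": "<p>desktop site</p>", "UA3": "<p>mobile</p>"},
    {"UA1": "<p>same</p>", "UA2": "<p>same</p>", "UA3": "<p>same</p>"},
]
result = mean_scores(sites)
for pair, score in sorted(result.items(), key=lambda item: -item[1]):
    print(round(score, 3), pair)
```

As noted for the real data, a ratio of 1 only says the two UAs received the same markup; it says nothing about whether that markup works well in Gecko.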
Just to throw this out there, as long as we're shoehorning in the name of another platform (Android), should we use iPhone instead? (As in, "Gonk like iPhone".) Does that improve compat at all? Is that any less "right"?
> should we use iPhone instead? (As in, "Gonk like iPhone".)

I'll put that into the mix and run it again.
An important thing to remember about John's results is that a score of 1 for any diff is not always a good thing. 1 does not mean "perfect compatibility", it means "the same HTML". If we get exactly the same as the Android browser or the iPhone browser, and that includes Webkit-specific CSS or iPhone-specific styling, that is actually a worse user experience than getting something that is different to those browsers. A site which properly supports Gecko may well send different HTML to Gecko, Android and iPhone.

John: any reason you used a Spanish Android UA (es-es) rather than en-US? That might skew the results a bit, because the default for many sites will be English. That "Tuna Build" stuff also looks a bit weird, although it seems like different phones stuff all sorts of rubbish into that field, so maybe it doesn't matter.

What is clear from the results is that sending UA 2 gives HTML which is more different than UA1 or UA3 from both Android and iPhone, suggesting that we are probably getting desktop sites. Some sites detect mobile browsers by sniffing for "Android" or "iPhone", rather than "Mobile". This breaks browsers other than ours too, of course.

The question is, do we fight against this in the name of enabling the Mozilla value of browser (and OS) diversity and choice, or do we give in?

On the question of spoofing bits of the iPhone UA, we had that discussion when we decided the Fennec UA (on which the B2G UA is closely based). There are a number of reasons it's a bad idea, including the possibility of breaking existing working sites where devs have worked to make sure we are supported. If we were going to change that, we'd need better evidence of a significant improvement in compatibility than difflib results, with all the issues they have (see above).

Gerv
Gerv makes some excellent points above.

(In reply to Gervase Markham [:gerv] from comment #13)
> On the question of spoofing bits of the iPhone UA, we had that discussion
> when we decided the Fennec UA (on which the B2G UA is closely based). There
> are a number of reasons it's a bad idea, including the possibility of
> breaking existing working sites where devs have worked to make sure we are
> supported. If we were going to change that, we'd need better evidence of a
> significant improvement in compatibility than difflib results, with all the
> issues they have (see above).

I agree that we shouldn't spoof the iPhone UA. However, it's not clear to me how "like Android" is better than "like iPhone" or some other option that may provide better Web compatibility. Although Gonk is built on Linux and may use some Android drivers I don't know that it is really "like Android" more than "like iPhone".
I assert without evidence but based on a hunch that there is UA-sniffing code out there which looks for "iPhone" and then sends webkit-specific CSS to generate an iPhone-like UI. I also assert without evidence that this problem is less strong for Android, because I understand the Android webkit has fewer extensions, and they have less strong platform UI guidelines. Also, as the iPhone was first to market, many sites may offer "iPhone" and "other"; we want "other".

Of course, these assertions are hard to back up without careful non-automated testing and debugging of sites, so we know what the problem is in each case, if any, and what options the site provides.

I also think there is compat value in staying as close to the Fennec UA as possible - and it, of course, currently features "Android", although we will be trying to persuade people not to rely on that, for exactly the reasons which led to this bug.

Gerv
I agree with your points and think your assertions may in fact be right. My point is that we don't at this point have evidence to support one decision over another. (I'm not advocating for using iPhone over Android, just that we consider the options.) John has provided some evidence that we can use with the caveats that you mentioned earlier. 

Given our time frame, I think it is likely that, to some degree, we will have to base this decision on our collective hunches. The best we can do is to support our decision with as much evidence and experience as we can.
(In reply to Tony Chung [:tchung] from comment #0)
> The UA string from fennec will be also used for b2g.  
> 
>  Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
> 
> however, this will pose a problem if we wanted to parse out metric numbers
> for marketshare and other data on b2g phones.   this is a tracking bug to
> figure out how can we add settings and instrumentation, without hopefully
> touching the UA string.

Having the same UA string for both would actually be an awesome feature in two ways:
1) There is no UA sniffing-related way for Web authors to make a mistake that results in them supporting only one of B2G and Firefox for Android. (If you give authors an opportunity for a mistake, Murphy's Law will take care of mistakes getting made.)
2) There are fewer different UA strings, so UA strings become less unique and there is less fingerprinting surface.

For the purpose of Mozilla gathering statistics (as opposed to a third party gathering statistics), surely we could expose more information than the UA string in the update and block list pings, just as Windows builds include service pack information in the update ping but don't include it in the UA string. So as far as Mozilla's own statistics gathering goes, I think it's not absolutely necessary to have different third-party-visible UA strings for B2G and Firefox for Android.

(In reply to Gervase Markham [:gerv] from comment #2)
> I am not convinced of the rightness of B2G saying it's Android.

What does it matter? IE isn't Netscape but says "Mozilla" and Safari isn't using Gecko but says "Gecko". Silk doesn't run on Mac OS X but claims to do so (in the server-assisted config, IIRC). The UA string isn't about truth. It's about what works with sites while still being different enough to shout "Look at me! I have market share, too!" for the statistics.

I think the strongest argument for not saying "Android" at all in the UA string for B2G would be avoiding sites showing prompts saying "Please move off the Web and install our Android app instead!"

(In reply to Gervase Markham [:gerv] from comment #8)
> Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
> (current Android, for control)
> 
> Mozilla/5.0 (Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
> (Gonk version)
> 
> Mozilla/5.0 (Gonk, like Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
> (Spoofing version)

It would probably be worthwhile to test if the position of the "Android" token matters. That is, whether
Mozilla/5.0 (Android; Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
is more successful than the three options above.

(In reply to Lawrence Mandel [:lmandel] from comment #16)
> I agree with your points and think your assertions may in fact be right. My
> point is that we don't at this point have evidence to support one decision
> over another. (I'm not advocating for using iPhone over Android, just that
> we consider the options.)

Since Firefox for Android and B2G have the same Gecko engine and, therefore, should have the same Web compatibility characteristics, I think it would make no sense for one of them to have an iPhone-like UA string but the other not to. As for changing them both to have an iPhone-like UA string, I think it would be harmful for evangelism efforts for us to keep changing the UA string for Firefox for Android again and again. (Also, the discussion already happened a while ago.)
(In reply to Henri Sivonen (:hsivonen) from comment #17)
> I think the strongest argument for not saying "Android" at all in the UA
> string for B2G would be avoiding sites showing prompts saying "Please move
> off the Web and install our Android app instead!"

Blimey. I hadn't thought of that one. I keep encountering those sorts of prompts on Android, and they are irritating enough even when I _could_ install the app! And telling people not to sniff for "Android" when trying specifically to detect Android will be a difficult sell - their (perhaps reasonable) response would be "well, stop claiming to be Android, then!".

John Jensen: can you look at the distribution of the difflib scores for the Gonk-only UA (UA 2). Is it the case that a chunk are near-identical and a chunk are quite different (perhaps indicating a desktop site)? Or do the difflib values range evenly? What does the curve look like?

Gerv
> 1 does not mean "perfect compatibility", it means "the same HTML". 
> If we get exactly the same as the Android browser or the iPhone browser, 
> and that includes Webkit-specific CSS or iPhone-specific styling, that is
> actually a worse user experience than getting something that is different to
> those browsers. A site which properly supports Gecko may well send different
> HTML to Gecko, Android and iPhone.

Well, yes, but Gecko has 0.0X% market share, and given that Android and iPhone both use WebKit, it is much more likely that they would receive similar markup and that Fennec, as an "unknown" mobile browser, would be sent the desktop or, worse, the WAP version. What we are testing here is not the level of Gecko compatibility (that is a separate issue), but whether or not Fennec is getting mobile-ish markup at all.

> John: any reason you used a Spanish Android UA (es-es) rather than en-US? 

Sorry, that was an accidental typo when I entered this into the bug. Actual string was/is "en-us" -- I have doublechecked.

> John Jensen: can you look at the distribution of the difflib scores for the
> Gonk-only UA (UA 2). Is it the case that a chunk are near-identical and a
> chunk are quite different (perhaps indicating a desktop site)? Or do the
> difflib values range evenly? What does the curve look like?

It's difficult to put pretty charts in Bugzilla but here's a summary based
on skewness and kurtosis measures. I've also added the iPhone Mobile Safari UA as well:
Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.32

Mean = Mean value of all the pairs
CVar = Coefficient of Variation of all the pairs -- for our purposes, lower is better
Skewness = How skewed the distribution is -- probably want a highly negative value, ie nearly all values at the right side of the distribution
Kurtosis = How "pointed" the distribution is -- probably want a highly positive value

You can see that the tests with UA2 have less negative skewness values and higher CVar -- i.e. there are more sites that present significantly different markup than that presented to UA1, UA3 and UA4.

Mean   CVar   Skewness  Kurtosis  Pair                    Num
0.986  0.104  -8.598    74.596    ('UA1', 'UA3')          17061
0.971  0.145  -5.621    31.390    ('ANDROID', 'IPHONE')   17050
0.965  0.163  -5.031    24.580    ('ANDROID', 'UA1')      17060
0.965  0.159  -5.021    24.879    ('UA3', 'UA4')          17056
0.964  0.165  -4.971    23.958    ('ANDROID', 'UA3')      17060
0.963  0.165  -4.940    23.895    ('IPHONE', 'UA4')       17047
0.962  0.165  -4.816    22.693    ('UA1', 'UA4')          17060
0.953  0.185  -4.260    17.322    ('ANDROID', 'UA4')      17053
0.950  0.194  -4.050    15.416    ('IPHONE', 'UA3')       17062
0.949  0.197  -4.020    15.117    ('IPHONE', 'UA1')       17071
0.889  0.296  -2.250    3.546     ('UA2', 'UA3')          17076
0.889  0.295  -2.249    3.547     ('UA1', 'UA2')          17077
0.883  0.304  -2.164    3.145     ('UA2', 'UA4')          17070
0.882  0.299  -2.124    3.044     ('ANDROID', 'UA2')      17072
0.872  0.313  -2.000    2.504     ('IPHONE', 'UA2')       17077
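As a rough illustration of how the four summary columns relate to a list of per-site difflib scores, here is a small Python sketch. The exact estimators behind the table are not stated in this bug, so this assumes population (biased) moments and excess kurtosis; `summarize` and the sample scores are hypothetical.

```python
from math import sqrt


def summarize(scores):
    """Mean, coefficient of variation, skewness and excess kurtosis of scores."""
    n = len(scores)
    mean = sum(scores) / n
    # Population central moments (an assumption; sample-corrected
    # estimators would give slightly different values).
    m2 = sum((x - mean) ** 2 for x in scores) / n
    m3 = sum((x - mean) ** 3 for x in scores) / n
    m4 = sum((x - mean) ** 4 for x in scores) / n
    return {
        "mean": mean,
        "cvar": sqrt(m2) / mean,          # standard deviation relative to the mean
        "skewness": m3 / m2 ** 1.5,       # negative: long tail towards low scores
        "kurtosis": m4 / m2 ** 2 - 3.0,   # excess kurtosis; 0 for a normal curve
    }


# Hypothetical scores: most sites serve near-identical HTML, a few differ a lot.
mostly_same = [1.0] * 8 + [0.2] * 2
stats = summarize(mostly_same)
symmetric = summarize([1.0, 2.0, 3.0])
print(stats)
```

A distribution where nearly every site scores close to 1 and a handful score low gives a negative skewness, which is the shape described as desirable above.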
I've started a discussion in mozilla.dev.planning about this.

Gerv
blocking-basecamp: --- → +
blocking-kilimanjaro: --- → +
Duplicate of this bug: 726062
bump. The discussion in dev-planning seems to have settled down. We need to make a decision. Is there any other information that is needed before we make the UA decision for B2G?
The discussion has indeed wound down. I suggest the following points are true:

- There's no way other than anecdotal to know how prevalent the problem of "install our app!" 
  popups is. That makes it hard to determine the relative impact of the two problems.

- The evangelism sell is much harder for "I know it says Android, but it's not, and you 
  should handle it specially" than "here's our new thing".

- The open web is a better place choice/innovation-wise if we put effort into getting people 
  to detect "Mobile" rather than in getting people to make a special exception to their 
  Android handling to deal with B2G.

- We already have, and are building, a kick-ass evangelism team. That's not an argument 
  either way, just an observation :-)

I'd also say it's likely that people will keep getting the "Android-but-not-Android" thing wrong as they build new sites, whereas as we evangelise "detect Mobile", that's easier for developers to remember and books and tutorials to pick up.

So I think the balance of evidence is against including the word "Android".

We also need to consider another development. We now know that the final branding of B2G itself will be "Firefox OS". However, that's a brand for the whole system. If we make that the OS indicator, then what happens if people build B2G phones not in collaboration with us (which we want them to do; viva open source!)? If we use FirefoxOS as the OS indicator, their phones may then have "Firefox" in the UA even though they don't have Firefox/<version>. Or something. I can see all sorts of confusion that might be caused by having Firefox in both positions, or just one, or the other.

So I think that if we do have an OS indicator, it should be "Gonk", rather than either "B2G" or "Firefox OS".

The last question raised in the discussion in dev.planning is whether we should have an OS indicator at all. Given that we are branding the entire thing as Firefox OS, and so later we may ship something as "Firefox OS" that is not built on Gonk (if we decide to change the underlying technology), then we would want to avoid having to change the User Agent string at that time, risking unnecessary incompatibility with already-working sites and code.

"The web is the platform". For that and the above reasons, I say that we should eliminate the underlying OS indicator.

All of the above means that my view is that we should go for:
Mozilla/5.0 (Mobile; rv:12.0) Gecko/12.0 Firefox/12.0

This is the same as the Fennec UA except that the OS token is not present.

Gerv
(In reply to Gervase Markham [:gerv] from comment #23)
> All of the above means that my view is that we should go for:
> Mozilla/5.0 (Mobile; rv:12.0) Gecko/12.0 Firefox/12.0

This means that if I want to detect, analyze, or gather stats on what share of people access my app on Firefox OS (codenamed "B2G"), I need to detect the magic absence of a platform and hope that this will always be the sign of that, right?
The web is the platform, not Firefox OS. (In hindsight, perhaps putting "OS" in the name is a bad move from that perspective.) So why does it matter where the users are coming from? Platform _capabilities_ should be (and are) detectable using feature detection. Why does the name of the platform matter?

I guess it's possible that one potential downside of this choice is that it makes OS marketshare graphs harder work to create (or, alternatively, less accurate). But perhaps they'll just have a "Whatever; who cares" segment of the pie chart :-) But it does mean that B2G on other OS bases, or other Gecko clients, can just say to a site "hey, it's the web - what more do you need?" without us having another big evangelism job.

Gerv
(In reply to Gervase Markham [:gerv] from comment #25)
> The web is the platform, not Firefox OS. (In hindsight, perhaps putting "OS"
> in the name is a bad move from that perspective.) So why does it matter
> where the users are coming from?

It only matters when I want to analyze who the users of my web app or website are, and I want to see how many of them come using which OS or browser - which is actually the original reason UA strings were created. Switching behavior based on those strings was, is and always will be at best unintended use, at worst just wrong or even malicious-to-the-web usage of them.

Also, as this introduces a new pattern of how our UA strings look (only two comments), all UA string detectors that look at the whole string need to be adjusted for that case.

That said, if we had a platform identifier there, I disagree that it should be "Gonk", as nobody should ever care what the underlying layer is that B2G runs on; whether that's a Mer core or the current Gonk stuff or whatever is irrelevant. What's relevant is that it's the OS that runs under the code name of "B2G" or the brand of "Firefox OS". What might be relevant to websites offering installers, etc. is the fact that there's no other kind of application they can offer to be installed there than web apps.
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #26)
> Also, as this introduces a new pattern of how our UA strings look (only two
> comments), all UA string detectors that look at the whole string need to be
> adjusted for that case.

Not so; the number of semi-colon-separated parts in the comment is already variable, not standardized at 3. For example, some OSes have two OS-related parts there.

> That said, if we had a platform identifier there, I disagree that it should
> be "Gonk", 

What would you use?

> as nobody should ever care what the underlying layer is that B2G
> runs on, if that's a Mer core or the current Gonk stuff or whatever is
> irrelevant. What's relevant is that it's the OS that runs under the code
> name of "B2G" or the brand of "Firefox OS". What might be relevant to
> websites offering installers, etc. is the fact that there's no other kind of
> application they can offer to be installed there than web apps.

Right. Hence no OS identifier, just the web. All other platforms or OSes which only offer web app support, no matter what their underlying technology, can do the same thing - no OS identifier.

Gerv
(In reply to Gervase Markham [:gerv] from comment #27)
> (In reply to Robert Kaiser (:kairo@mozilla.com) from comment #26)
> > Also, as this introduces a new pattern of how our UA strings look (only two
> > comments), all UA string detectors that look at the whole string need to be
> > adjusted for that case.
> 
> Not so; the number of semi-colon-separated parts in the comment is already
> variable, not standardized at 3. For example, some OSes have two OS-related
> parts there.

Yes, those analyzers already needed adjustments for other changes we did there - those were in UA strings that are seen pretty fast as they are very common on the web, though. The B2G one will go unnoticed for a while for a number of those, I guess (not for mine, I'm planning to adjust it once we have an implementation landing here).

> > That said, if we had a platform identifier there, I disagree that it should
> > be "Gonk", 
> 
> What would you use?

I personally would use "B2G" there for consistency.

In evangelizing, I'd tell people to sniff for "Mobi" if they want to target mobile interfaces (as we are already doing) and only look for the OS (e.g. "Android") where they are specifically targeting e.g. downloads for that one mobile OS only but not other mobile devices. In our UA strings, any OS our code runs on can appear there, be it Android, Maemo/MeeGo/Jolla/Mer/whatever, B2G or something else. I think that's also already what our people might be doing right now.
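To make that advice concrete, here is a minimal server-side sketch in Python. The function names are hypothetical and the desktop UA is only an illustrative value; the point is that the layout decision keys on "Mobi" while the OS token is consulted only for OS-specific offers.

```python
def is_mobile(ua):
    """Layout decision: key on "Mobi", not on any OS name."""
    return "Mobi" in ua


def may_offer_android_app(ua):
    """OS-specific decision: only look for the OS when targeting that OS."""
    return "Android" in ua


FENNEC = "Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0"
NO_OS = "Mozilla/5.0 (Mobile; rv:12.0) Gecko/12.0 Firefox/12.0"
# Illustrative desktop UA, not taken from this bug.
DESKTOP = "Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20100101 Firefox/12.0"

for ua in (FENNEC, NO_OS, DESKTOP):
    print(is_mobile(ua), may_offer_android_app(ua), ua)
```

With this split, a UA that carries no OS token still gets the mobile layout and is never offered an Android download.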
(In reply to Robert Kaiser (:kairo@mozilla.com) from comment #26)
> It only matters when I want to analyze what my users on my web app or
> website are, and I want to see how many of those come using which OS or
> browser

The rendering engine combined with the absence of an OS token should be informative enough.

> Also, as this introduces a new pattern of how our UA strings look (only two
> comments), all UA string detectors that look at the whole string need to be
> adjusted for that case.

It's not really a new pattern. The number of comment tokens generally isn't stable and the OS token isn't currently at a fixed position. Most "parsers" just do substring matching.

> That said, if we had a platform identifier there, I disagree that it should
> be "Gonk", as nobody should ever care what the underlying layer is that B2G
> runs on, if that's a Mer core or the current Gonk stuff or whatever is
> irrelevant.

An umbrella term would be as irrelevant as a concrete system identifier, except for those interested in tracking market share. Presumably those people will always prefer as much detail as they can get.
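A small Python sketch of why a variable number of comment parts need not break a detector: split the first parenthesised comment on semicolons and test for membership rather than position. The helper name is hypothetical, and real detectors vary; this only illustrates the position-independent approach.

```python
import re


def comment_tokens(ua):
    """Return the semicolon-separated parts of the first (...) comment."""
    match = re.search(r"\(([^)]*)\)", ua)
    if match is None:
        return []
    return [part.strip() for part in match.group(1).split(";") if part.strip()]


with_os = comment_tokens(
    "Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0")
without_os = comment_tokens(
    "Mozilla/5.0 (Mobile; rv:12.0) Gecko/12.0 Firefox/12.0")

# Membership works whether "Mobile" is the first or the second part.
print(with_os, "Mobile" in with_os)
print(without_os, "Mobile" in without_os)
```

A detector that instead assumes exactly three parts, with the OS first, is the kind that would need adjusting.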
Note - Before going forward with this UA, I think we should analyze and test this. I know from the mobile evangelism team's perspective we are definitely looking to understand the state of the world for evangelism issues of sites on FF for Android vs. this new UA.
jsmith: what sort of testing are you hoping to get done, and on which of the choices we are facing? 

The problem we have is that we can test for the "upside" of including the string "Android" (increased difflib scores) but we can't test for the downside - "Install our App!" popups. So it's very hard to use metrics to make that particular decision.

If you mean include "Gonk" (or "B2G") vs. no OS identifier at all, then sure, we could do difflib testing on that. But I don't expect to see much difference.

Gerv
FWIW I am surprised that so much of this discussion is circling around the issue of "install this app!" popups. I know most of us find this annoying, but I am not sure most users do. From a user experience point of view it often makes a lot of sense to install a native app for a frequently-visited site, and thus to be prompted to install one when visiting the site via a browser.

More importantly, I think an argument can be made that these popups will fade away as a) mobile web-based platforms become more performant and available and b) these popups become less effective.

As for the difflib testing, I can easily duplicate my difflib smoketests using "Gonk", "B2G" and "" (nothing) OS identifiers. I will put that on my list for this week.
(In reply to John Jensen from comment #32)
> FWIW I am surprised that so much of this discussion is circling around the
> issue of "install this app!" popups. I know most of us find this annoying,
> but I am not sure most users do. From a user experience point of view it
> often makes a lot of sense to install a native app for a frequently-visited
> site, and thus to be prompted to install one when visiting the site via a
> browser.

The prompts are for native Android apps. Firefox OS won't support these.
(In reply to John Jensen from comment #32)
> From a user experience point of view it
> often makes a lot of sense to install a native app for a frequently-visited
> site, and thus to be prompted to install one when visiting the site via a
> browser.

Yes, but it doesn't make any sense to be prompted to install an Android app when you are on Firefox OS, which doesn't support Android apps. That's the problem.

> More importantly, I think an argument can be made that these popups will
> fade away as a) mobile web-based platforms become more performant and
> available and b) these popups become less effective.

It would be nice if that were so, but when we are launching the first such platform, it's not where we are now.

> As for the difflib testing, I can easily duplicate my difflib smoketests
> using "Gonk", "B2G" and "" (nothing) OS identifiers. I will put that on my
> list for this week.

Great :-) When using "nothing", make sure you don't have a stray semicolon preceding the "Mobile" token.

Gerv
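One way to avoid the stray semicolon when generating the test strings is to join only the non-empty comment parts. A minimal Python sketch, with a hypothetical `build_ua` helper and the 12.0 version used throughout this bug:

```python
def build_ua(os_token="", version="12.0"):
    """Build a Fennec-style UA; an empty os_token drops the OS part entirely."""
    parts = [os_token, "Mobile", f"rv:{version}"]
    # Joining only non-empty parts means no leading "; " when os_token is "".
    comment = "; ".join(part for part in parts if part)
    return f"Mozilla/5.0 ({comment}) Gecko/{version} Firefox/{version}"


for token in ("Gonk", "B2G", ""):
    print(build_ua(token))
```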
Just a note -- this data collection is running now -- 649,000 different combinations so it will take 1-2 days.
Here are the results with the additional UAs (UA5, UA6, and UA7). None of the three did as well as the UAs with "Android" or "iPhone" in the string. The columns in the table below are (loosely) defined in comment 19 above.

'UA1':'Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0'
'UA2':'Mozilla/5.0 (Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0'
'UA3':'Mozilla/5.0 (Gonk, like Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0'
'UA4':'Mozilla/5.0 (Gonk, like iPhone; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0'
'UA5':'Mozilla/5.0 (Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0'
'UA6':'Mozilla/5.0 (B2G; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0'
'UA7':'Mozilla/5.0 (Mobile; rv:12.0) Gecko/12.0 Firefox/12.0'
'IPHONE':'Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.32'
'ANDROID':'Mozilla/5.0 (Linux; U; Android 4.0; en-us; Tuna Build/IFK77E) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.67'

Mean, CVar, Skewness, Kurtosis, Pair, Num
0.963 0.168 -5.005 24.246 ('ANDROID', 'IPHONE') 17019
0.960 0.178 -4.716 21.294 ('ANDROID', 'UA1') 17011
0.959 0.179 -4.641 20.729 ('IPHONE', 'UA4') 16987
0.958 0.183 -4.542 19.643 ('ANDROID', 'UA3') 17009
0.947 0.203 -3.947 14.500 ('IPHONE', 'UA3') 16993
0.947 0.202 -3.981 14.750 ('IPHONE', 'UA1') 17010
0.947 0.202 -3.977 14.813 ('ANDROID', 'UA4') 17009
0.873 0.316 -2.035 2.649 ('ANDROID', 'UA7') 17034
0.872 0.317 -2.027 2.610 ('ANDROID', 'UA6') 17028
0.872 0.317 -2.023 2.597 ('ANDROID', 'UA5') 17034
0.872 0.317 -2.022 2.592 ('ANDROID', 'UA2') 17039
0.867 0.325 -1.948 2.277 ('IPHONE', 'UA7') 17007
0.866 0.325 -1.943 2.262 ('IPHONE', 'UA5') 17015
0.866 0.325 -1.943 2.261 ('IPHONE', 'UA2') 17017
0.866 0.325 -1.942 2.256 ('IPHONE', 'UA6') 17007
John: is UA 2 identical to UA 5? If so, how are their results a tiny bit different?

The difflib results divide the UAs into 2 groups. There are those which spoof another OS (1, 3 and 4) and those which don't (2, 5, 6 and 7). The variance within each group is not great, apart from showing us the (unsurprising :-) result that pretending to be an iPhone makes you a little bit more likely to get the same content as the iPhone, and the analogous result for pretending to be Android.

Of the 2nd group, leaving out the OS entirely ranks a tiny bit better than the other three, but probably not statistically significantly. Still, it's useful to know that this option does not change the difflib analysis. It's also useful to know (although perhaps obvious) that if we invent a new OS name, it doesn't matter what it is. (Although this result might not hold if we decide to insert "Firefox OS" in that slot.)

However, the results still don't give us any further data on the questions:

- Do we always want the same content as Android or iPhone?
- What content do we get served when we get the same stuff, and does it appear broken?
- What content do we get served when we get different stuff, and does it appear broken?
- How common is the problem you get when spoofing Android of saying "please install our Android app"?

We are still attempting to trade off two problems, one of which we can't measure well. So I think the analysis in comment 23 still applies.

Gerv
(In reply to Gervase Markham [:gerv] from comment #37)
> John: is UA 2 identical to UA 5? If so, how are their results a tiny bit
> different?

Yes. The results differ only slightly, no doubt because a) the HTML returned
changes in minor ways between requests (e.g. URLs to advertisements) and b) the
crawler probably had trouble retrieving data from some sites. You can see that
the counts for each tested pair (the last column of each row) are slightly
different:

Mean, CVar, Skewness, Kurtosis, Pair, Num
0.872 0.317 -2.023 2.597 ('ANDROID', 'UA5') 17034
0.872 0.317 -2.022 2.592 ('ANDROID', 'UA2') 17039
0.866 0.325 -1.943 2.262 ('IPHONE', 'UA5') 17015
0.866 0.325 -1.943 2.261 ('IPHONE', 'UA2') 17017

> (Although this result might not hold if
> we decide to insert "Firefox OS" in that slot.)

That is a good point...I wonder if sites sniff for "Firefox" instead
of "Gecko" sometimes?

> However, the results still don't give us any further data on the questions:
> 
> - Do we always want the same content as Android or iPhone?

On the whole, given the choice between desktop, WAP, or Android/iPhone
content, we probably want the latter. But I agree that is not always
the case.

> - What content do we get served when we get the same stuff, and does it
> appear broken?
> - What content do we get served when we get different stuff, and does it
> appear broken?

These questions are being partly addressed by the efforts of the QA team
who are looking at Fennec compatibility issues from a UA sniffing, CSS,
and DOM perspective. See http://arewecompatibleyet.com for example.

> - How common is the problem you get when spoofing Android of saying "please
> install our Android app"?

How generally is this done? An alert() box? I very rarely see them in my 
own mobile browsing. I can change the crawler to look for this sort of thing
I suppose.

John

> 
> We are still attempting to trade off two problems, one of which we can't
> measure well. So I think the analysis in comment 23 still applies.
> 
> Gerv
>> (Although this result might not hold if
>> we decide to insert "Firefox OS" in that slot.)
>
> That is a good point...I wonder if sites sniff for "Firefox" instead
> of "Gecko" sometimes?

Ouch! That sounds a lot like bug 334967 and we have been advising sites to *not* do that.  "Firefox OS" is a bit different than "Firefox" but the communication might be tricky.
(In reply to chris hofmann from comment #39)
> >> (Although this result might not hold if
> >> we decide to insert "Firefox OS" in that slot.)
> >
> > That is a good point...I wonder if sites sniff for "Firefox" instead
> > of "Gecko" sometimes?
> 
> Ouch! That sounds a lot like bug 334967 and we have been advising sites to
> *not* do that.

This is off-topic for this bug. All of the proposed strings include the Firefox token.
(In reply to John Jensen from comment #38)
> On the whole, given the choice between desktop, WAP, or Android/iPhone
> content, we probably want the latter. But I agree that is not always
> the case.

Getting desktop content (which typically isn't functionally broken in Firefox even if the layout is not ideal for a small screen) is nowhere near as bad, IMO, as getting WAP content, WebKit-specific content that's broken in Gecko, or content that says to B2G users "please install our Android/iPhone app".

Did the UA string group without "Android" or "iPhone" generally get Firefox-compatible desktop content or WAP content?
Henri: how could one determine that? Look for the WAP MIME type on the returned content?

It's not easy to tell whether one gets mobile or desktop content, but it might be possible to detect WAP stuff.
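
One way such detection might be attempted is a doctype check. This is a rough sketch, assuming that pages built for one of the "mobile" XHTML profiles (or WML) declare the matching public identifier. Pages that do not declare one go undetected, so the result is only ever a "not desktop" guess. The function name is hypothetical:

```python
import re

# Public identifiers of the XHTML Mobile Profile, WML and XHTML Basic doctypes.
MOBILE_DOCTYPE = re.compile(
    r"<!DOCTYPE[^>]*-//(WAPFORUM//DTD (XHTML Mobile|WML)|W3C//DTD XHTML Basic)",
    re.IGNORECASE,
)


def looks_not_desktop(html):
    """Guess 'not desktop' from a doctype near the start of the document."""
    return MOBILE_DOCTYPE.search(html[:1024]) is not None
```

A plain `<!DOCTYPE html>` page returns False here even if it is a mobile page, which is the main limit of this approach.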

Gerv
(In reply to Gervase Markham [:gerv] from comment #42)
> Henri: how could one determine that? Look for the WAP MIME type on the
> returned content?

I was thinking of eyeballing through screenshots.

> It's not easy to tell whether one gets mobile or desktop content, but it
> might be possible to detect WAP stuff.

"WAP" stuff hasn't been original WAP for a long time but instead dumbed down text/html, so you can't really tell from the MIME type. It might be possible to make a "not desktop" guess for pages that use a doctype for one of the "mobile" XHTML profiles, though, but it's unclear how much that would help, since not all "not desktop" pages use those.
(In reply to John Jensen from comment #38)
> > (Although this result might not hold if
> > we decide to insert "Firefox OS" in that slot.)
> 
> That is a good point...I wonder if sites sniff for "Firefox" instead
> of "Gecko" sometimes?

They do; but all proposed UAs say Firefox, and I don't think anyone's making an argument for a change there.
 
> > - How common is the problem you get when spoofing Android of saying "please
> > install our Android app"?
> 
> How generally is this done? An alert() box? I very rarely see them in my 
> own mobile browsing. I can change the crawler to look for this sort of thing
> I suppose.

I've seen it done via a <div> overlay. Alert boxes are so 1990s ;-) Seriously, I'm pretty sure you won't be able to find a heuristic which detects these. It's all just content. You can take that as a challenge if you like ;-)

Gerv
Putting Android in the UA string has also caused bug 777633, which is another example of a class of bugs (serving Android intents) which saying "Android" can cause.

I have made a decision, but this bug has morphed; I have opened bug 777710 to track the change of UA string.

Gerv
Sounds good. Should we close this bug since the decision is made and the problem originally in this bug is solved? We can track the implementation work in bug 777710.
Gerv, cjones, can you comment if we can resolve this bug now that bug 777710 landed?  or is there more work specifically for b2g builds to pick up?
tchung: you filed it; are your concerns addressed? :-)

Gerv
Priority: -- → P2
Andreas - This bug is about determining what to use for the B2G UA. Seems like a P1 to me.
Renom if you think we can't ship a v1 without this.
blocking-basecamp: + → ---
(In reply to Andreas Gal :gal from comment #50)
> Renom if you think we can't ship a v1 without this.

I'd generally suggest that if you disagree with a "blocking-basecamp" flag, you set the basecamp flag back to ? for re-triage. Don't unnom it, as we'll lose track of bugs. If we lose that nomination flag, we're bound to thrash in confusion over what we originally decided, and some of those decisions were very recent.

Different teams also make use of the priority flag in different ways, so I'd find another way to flag "soft-blockers."

In the context of this bug - making a decision on a user agent for a phone is critical, as our evangelism group needs to know exactly what we need to advertise to tier 1 partners. I need more context why you don't think this blocks.
blocking-basecamp: --- → ?
b2g can now be identified as it doesn't claim to be Android anymore.
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Whiteboard: fixed by bug 777710
blocking-basecamp: ? → +
[Adding this based on a discussion in the b2g mailing list.]

> the real questions are around (for each option) how much
> bustage
> you get, how severe that bustage is, 

Here's another take on "bustage", using the same dataset as I used to prepare a broader analysis of the impact of various proposed UAs in ticket https://bugzilla.mozilla.org/show_bug.cgi?id=761873 . (I'll also post this to the ticket for completeness' and posterity's sake).

The different User Agent strings tested against the top 17,000 global websites were:

UA1: Mozilla/5.0 (Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
UA2: Mozilla/5.0 (Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
UA3: Mozilla/5.0 (Gonk, like Android; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
UA4: Mozilla/5.0 (Gonk, like iPhone; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
UA5: Mozilla/5.0 (Gonk; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
UA6: Mozilla/5.0 (B2G; Mobile; rv:12.0) Gecko/12.0 Firefox/12.0
UA7: Mozilla/5.0 (Mobile; rv:12.0) Gecko/12.0 Firefox/12.0

(yes, UA2 == UA5; there were a number of iterations of this and I accidentally performed a run with two identical UAs).

The User Agent string that ended up being selected is/was UA7.

Each of these was tested against the "reference" UA strings -- those provided by the stock Android and iOS browsers:

IPHONE: Mozilla/5.0 (iPhone; CPU iPhone OS 5_0 like Mac OS X) AppleWebKit/534.46 (KHTML, like Gecko) Version/5.1 Mobile/9A334 Safari/7534.48.32
ANDROID: Mozilla/5.0 (Linux; U; Android 4.0; en-us; Tuna Build/IFK77E) AppleWebKit/534.30 (KHTML, like Gecko) Version/4.0 Mobile Safari/534.67

To assess similarity, I used Python's difflib.SequenceMatcher.quick_ratio() to compare the content a site sent to a crawler that requested the home page with each of these UAs. For example, a site that sent exactly the same content to a crawler using UA1 as it did to a crawler using the Android browser UA would have a score of 1.00 for that pair. If the two responses had nothing in common, the score would be 0.00, and so on.
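
As a small illustration of that scoring (not the crawler's actual code; the page bodies and the `similarity` helper are made up), note that quick_ratio() is documented as a fast upper bound on ratio():

```python
from difflib import SequenceMatcher


def similarity(body_a, body_b):
    """Score two response bodies between 0.0 and 1.0 using quick_ratio()."""
    return SequenceMatcher(None, body_a, body_b).quick_ratio()


page_for_android = "<html><body><a href='/m'>Mobile site</a></body></html>"
page_for_ua1 = "<html><body><a href='/m'>Mobile site</a></body></html>"
unrelated_body = "0123456789"

print(similarity(page_for_android, page_for_ua1))    # identical bodies: 1.0
print(similarity(page_for_android, unrelated_body))  # no shared characters: 0.0
```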

For the purposes of this discussion, I defined "bustage" as "getting a score substantially lower than 1.00 when comparing content sent to a proposed UA and that sent to the iPhone or Android UAs". I chose 0.75 as a conservative cutoff figure, after looking at a number of screenshots and looking at the data. The median scores for "busted" sites were usually much lower (in the range of 0.1 to 0.4).
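
In code, the counting rule amounts to something like the sketch below. The domain names and scores are invented for illustration, and `count_busted` is a hypothetical helper rather than the script actually used:

```python
CUTOFF = 0.75  # the cutoff figure chosen above


def count_busted(scores, cutoff=CUTOFF):
    """Count sites whose similarity score falls below the cutoff."""
    return sum(1 for score in scores.values() if score < cutoff)


# Made-up scores for one (proposed UA, reference UA) pair, keyed by domain.
example_scores = {
    "a.example": 1.00,
    "b.example": 0.97,
    "c.example": 0.31,
    "d.example": 0.75,  # exactly at the cutoff, so not counted
}
print(count_busted(example_scores))  # 1
```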

Here are the results. Columns are:

UA: Proposed UA string
Ref: "Reference" UA (IPHONE or ANDROID)
Num: Number of "busted" sites (ie score less than 0.75)

UA  Ref     Num
UA1 ANDROID 812
UA1 IPHONE  1099
UA1 AVERAGE 955.5

UA2 ANDROID 3055
UA2 IPHONE  3188
UA2 AVERAGE 3121.5

UA3 ANDROID 874
UA3 IPHONE  1119
UA3 AVERAGE 996.5

UA4 ANDROID 1103
UA4 IPHONE  825
UA4 AVERAGE 964.0

UA5 ANDROID 3052
UA5 IPHONE  3188
UA5 AVERAGE 3120.0

UA6 ANDROID 3040
UA6 IPHONE  3183
UA6 AVERAGE 3111.5

UA7 ANDROID 3031
UA7 IPHONE  3172
UA7 AVERAGE 3101.5

From the above, the UAs that performed the best according to this methodology ("breaking" the fewest number of domains) were UA1, UA3, and UA4. The others "broke" approximately 3x more sites.
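
For anyone who wants to check the "approximately 3x" figure, here is a short sketch that recomputes the averages from the counts in the table. Only the grouping into two lists is my own arrangement:

```python
# (ANDROID, IPHONE) "busted" counts copied from the table above.
busted = {
    "UA1": (812, 1099),
    "UA2": (3055, 3188),
    "UA3": (874, 1119),
    "UA4": (1103, 825),
    "UA5": (3052, 3188),
    "UA6": (3040, 3183),
    "UA7": (3031, 3172),
}

averages = {ua: sum(pair) / 2 for ua, pair in busted.items()}

spoofing = [averages[ua] for ua in ("UA1", "UA3", "UA4")]
non_spoofing = [averages[ua] for ua in ("UA2", "UA5", "UA6", "UA7")]

ratio = (sum(non_spoofing) / len(non_spoofing)) / (sum(spoofing) / len(spoofing))
print(round(ratio, 1))  # 3.2
```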
(In reply to John Jensen from comment #53)
> From the above, the UAs that performed the best according to this
> methodology ("breaking" the fewest number of domains) were UA1, UA3, and
> UA4. The others "broke" approximately 3x more sites.

I think we heard that message already in the comments you made earlier. And still, this doesn't and can't take into account any "download our Android app" and "here's this special iPhone app for you" buttons, or redirects to the app, like YouTube is using, or any other content specifically tailored to which phone OS you are using. We explicitly do not want such content on Firefox OS.