Closed Bug 782453 Opened 12 years ago Closed 12 years ago

Add site-specific User Agent infrastructure and use it to fix AOL Mail

Categories

(Core :: Networking: HTTP, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla18
Tracking Status
firefox17 --- fixed

People

(Reporter: dao, Assigned: dao)

References

Details

(Keywords: dev-doc-needed)

Attachments

(1 file, 6 obsolete files)

Attached patch patch (obsolete) — Splinter Review
For maximum flexibility, I think we should do this on a per application basis. Mobile already did something similar for youtube (removed in bug 778561, since youtube has been fixed).

I tried to contact AOL, pointing to bug 778408, but got no response so far. When they react, we can remove the workaround (while keeping the infrastructure for other sites with problems). I don't think we gain much by waiting till the last minute before applying such workarounds.
Attachment #651562 - Flags: superreview?(gerv)
Comment on attachment 651562 [details] [diff] [review]
patch

Firstly, I do think there's value in fixing stuff late on, as it allows us to put more pressure on the site if the webmasters can see it doesn't work. But I also agree that every system should have at least one test case, and a site which has been really non-responsive is a good choice.

Secondly, I think we should instead use a pref-based system, e.g.:

general.useragent.override.com.aol = "<insert UA here>"

My Gecko-fu is weak and old, but we used to have a PrefBranch system which would do this sort of thing. The idea would be that you split the UA up into parts, and then if the first part matches a branch, look for first+second, and so on. If you hit a leaf, use that string as the UA. This allows us to put in place and remove overrides for arbitrary DNS scopes without code changes (so we could do it e.g. in an update to the quick-patching addon). 

For bonus points, allow the value to be a s/// regexp which you run on the current UA; so you could either do:

general.useragent.override.com.aol = "Mozilla/5.0 (X11; Linux i686; rv:17.0) Gecko/17.0 Firefox/17.0"

or

general.useragent.override.com.aol = "s!Gecko/[\d\.]+!Gecko/20100101!"

Gerv
Attachment #651562 - Flags: superreview?(gerv) → superreview-
Attached patch patch v2 (obsolete) — Splinter Review
Attachment #651562 - Attachment is obsolete: true
Attachment #651635 - Flags: superreview?(gerv)
Component: General → Networking: HTTP
Product: Firefox → Core
(In reply to Gervase Markham [:gerv] from comment #1)
> Firstly, I do think there's value in fixing stuff late on, as it allows us
> to put more pressure on the site if the webmasters can see it doesn't work.

There's little value because applying the override doesn't harm the web at large. Doing it late is risky as it reduces manual testing time; a site may have remaining issues that become visible after applying the override.
Attached patch patch v2 (obsolete) — Splinter Review
I removed the constraint that a domain must contain a dot.
Attachment #651635 - Attachment is obsolete: true
Attachment #651635 - Flags: superreview?(gerv)
Attachment #651637 - Flags: superreview?(gerv)
Comment on attachment 651637 [details] [diff] [review]
patch v2

This seems much more like it - thank you :-)

I'm not the right guy to be giving this code proper code review, of course.

I picked "!" as the separator without too much thought. Ideally, we'd use "/", but it appears far too often in UA strings. We need something which doesn't, given that we are just using "split". "!" is a common alternative choice. Does anyone have any comments on my choice?

Nit: I'd expect there to be a final "!", to match normal regexp definition syntax, at least in Perl. Then:
  let [search, replace] = val.substring(1).split("!", 2);

Does channel.URI.host return the hostname in Punycode form, for IDNs? I think we need to use a call which does.

gData needs to have a more descriptive name.

How about some tests? :-)

Another thought: we'll need some lightweight policy about when we deploy this mechanism, and for what sites. I don't want the list to grow without bound.

Gerv
Attachment #651637 - Flags: superreview?(gerv) → superreview+
(In reply to Gervase Markham [:gerv] from comment #5)
> I picked "!" as the separator without too much thought. Ideally, we'd use
> "/", but it appears far too often in UA strings. We need something which
> doesn't, given that we are just using "split". "!" is a common alternative
> choice. Does anyone have any comments on my choice?

http://www.useragentstring.com/pages/All/ lists "!" a couple of times but "#" only once. "§" would be another option. We could also use multiple characters.

> Nit: I'd expect there to be a final "!", to match normal regexp definition
> syntax, at least in Perl. Then:
>   let [search, replace] = val.substring(1).split("!", 2);

It's not normal regexp definition syntax, e.g. you can't escape "!". I basically just need a separator. It might even make sense to even drop the leading delimiter.

> Does channel.URI.host return the hostname in Punycode form, for IDNs? I
> think we need to use a call which does.

URI.asciiHost should do the trick.
§ seems a bit esoteric. I'm happy to stick with !, given that the UAs which use it are utterly irrelevant, but would also accept #.

The question about which delimiters to have is one of those "uncanny valley" questions - if it's not precisely regexp syntax, do you make it as close as possible so people will understand it, or do you make it different to reduce the risk of people hitting the gaps in your coverage? Given the lack of likelihood of people wanting to use or escape !, I'd go for the former, and use !from!to!. 

Gerv
Any delimiter but / is probably too exotic for it to help most people recognize the first part as a regular expression...
(In reply to Dão Gottwald [:dao] from comment #6)
> http://www.useragentstring.com/pages/All/ lists "!" a couple of times but
> "#" only once. "§" would be another option. We could also use multiple
> characters.

"|" is also unused.
But | is a character commonly used in regexps for alternation. If we use it as the separator, it can't be used for that.

Gerv
True. Another thing that come to my mind with regards to leading and trailing delimiters is that trailing flags are also not supported.
Yep, good point. OK, let's go for a single separator, of "!".

Gerv
Hang on, that would make "presence of a ! anywhere" as the regexp marker, differentiating it from a straight UA. Which is a bit obscure. Aargh.

This is serious bikeshedding :-)

Let's stick with what we have now! Ideally, it would detect a stray trailing ! and ignore it, but I think that'll happen anyway. Or you could still switch to:
  let [search, replace] = val.substring(1).split("!", 2);

Gerv
(In reply to Gervase Markham [:gerv] from comment #13)
> Hang on, that would make "presence of a ! anywhere" as the regexp marker,
> differentiating it from a straight UA. Which is a bit obscure. Aargh.

Think of it as a separator between two pref value parts rather than an explicit regexp marker. (! is a rather unusual separator, though, so I'm leaning towards #.)
Attached patch patch v3 (obsolete) — Splinter Review
Not sure how to approach tests here...
Attachment #651637 - Attachment is obsolete: true
I was thinking of a simple modification to the C++ code that builds the UA string and a global pref for turning off all site-specific hacks (the Safari way). That sort of thing should be enough for desktop (considering that it is for Chrome and Safari), but *maybe* we need a more generic system for mobile if the problem is wider on mobile. Does this patch work for B2G?

(In reply to Gervase Markham [:gerv] from comment #1)
> (so we could do it e.g. in an update to the quick-patching addon). 

How do hotfix add-ons interact with prefs? Do they write prefs or add more fallback values? What ensures that the pref gets cleaned up when no longer needed?
(In reply to Henri Sivonen (:hsivonen) from comment #16)
> I was thinking of a simple modification to the C++ code that builds the UA
> string

That's not necessarily simple. nsHttpHandler.cpp would need to know about the host...

> That sort of thing should be enough for desktop (considering that it
> is for Chrome and Safari), but *maybe* we need a more generic system for
> mobile if the problem is wider on mobile. Does this patch work for B2G?

b2g/chrome/content/shell.js should be able to invoke the module just like nsBrowserGlue.js is doing in my patch. Prefs would be set in b2g/app/b2g.js.

> (In reply to Gervase Markham [:gerv] from comment #1)
> > (so we could do it e.g. in an update to the quick-patching addon). 
> 
> How do hotfix add-ons interact with prefs? Do they write prefs or add more
> fallback values? What ensures that the pref gets cleaned up when no longer
> needed?

I believe hotfix add-ons could write user values and manually clear them when being uninstalled or add default prefs that would automatically disappear with the add-on. Anyway, I'm pretty sure there are ways to handle this, as modifying prefs was a prime use case for hotfix add-ons; this question is neither specific to this bug nor new.
Attachment #651807 - Flags: review?(bzbarsky)
Henri is right that we need a global disable switch for these changes. QA people looking to see if sites are fixed will certainly want to set this. And we may actually want to set it permanently in Nightly and Aurora (but that's a decision for another time).

general.useragent.overrides_enabled = true (default)

Gerv
Comment on attachment 651807 [details] [diff] [review]
patch v3

can you make a silly offer of proof that shows this is trivially cheap for hosts that are not aol.com?
The overhead will depend on the number of overrides due to the need to match the hosts.
Right. If we start needing a large number of these rules (and I sincerely hope we will not) then we should switch to a hash table rather than string matching. But this is the simple solution if the number of rules is small.

Gerv
what's the overhead of the patch as proposed (with just the aol entry) for something that doesn't match?
These three function calls:

+  let channel = aSubject.QueryInterface(Ci.nsIHttpChannel);
+  let host = channel.URI.asciiHost;
+  let hostLength = host.length;

plus a string equality check and, in most cases, a lastIndexOf per entry.

Actually, as of Firefox 17, we could use string.endsWith() (bug 772733): 

+    if (host.endsWith(domain) &&
+        (hostLength == domainLength ||
+         host[hostLength - domainLength - 1] == ".")) {
+      channel.setRequestHeader("User-Agent", gOverrides[domain], false);
+      break;
+    }

That should be even faster. Micro-optimization :-)

Gerv
Attachment #651807 - Flags: review?(bzbarsky)
Attached patch patch v4 (obsolete) — Splinter Review
- using String.endsWith now
- global pref added
Attachment #651807 - Attachment is obsolete: true
Attachment #652133 - Flags: review?(bzbarsky)
Comment on attachment 652133 [details] [diff] [review]
patch v4

>+function HTTP_on_modify_request(aSubject, aTopic, aData) {
>+  let channel = aSubject.QueryInterface(Ci.nsIHttpChannel);
>+  let host = channel.URI.asciiHost;
>+
>+  for (let domain in gOverrides) {
>+    if (host == domain ||
>+        host.endsWith("." + domain)) {

I could move "." + domain outside of the loop to reduce the overhead further when there's larger number of overrides...
I was actually more interested in the impact of adding uses of modify-request where that list is often empty. (?) The micro-optimization of string matching isn't really interesting to me.
I don't see why the mere existence of the observer would be particularly interesting here. There's the usual XPCOM overhead, as in many other places where we run code.

Note that Fennec has been observing http-on-modify-request for some time now.
bz: polite ping on this review? This mechanism is somewhat in demand...

Gerv
Comment on attachment 652133 [details] [diff] [review]
patch v4

r=me on the neckoish bits; I'm not sure I'm competent to review the sBrowserGlue.js stuff, though it looks fine at first glance.
Attachment #652133 - Flags: review?(bzbarsky) → review+
Attachment #652133 - Flags: review?(ehsan)
Comment on attachment 652133 [details] [diff] [review]
patch v4

Review of attachment 652133 [details] [diff] [review]:
-----------------------------------------------------------------

r=me with the below fixed.  But we also need to test this.  One simple testing strategy is to set the site_specific_overrides pref to true in a test, add a pref for an override on example.com, and then load a page from that domain to make sure that the UA has indeed changed.  The test should test both the regex and non-regex case.

::: browser/app/profile/firefox.js
@@ +221,5 @@
>  #else
>  pref("general.autoScroll", true);
>  #endif
>  
> +pref("general.useragent.override.aol.com", "Gecko/[^ ]*#Gecko/20100101");

Please add a comment saying what this does and why it's needed.

::: modules/libpref/src/init/all.js
@@ +22,5 @@
>  pref("keyword.URL", "https://www.google.com/search?ie=UTF-8&oe=utf-8&q=");
>  pref("keyword.enabled", false);
>  pref("general.useragent.locale", "chrome://global/locale/intl.properties");
>  pref("general.useragent.compatMode.firefox", false);
> +pref("general.useragent.site_specific_overrides", true);

I really think that this magic is worth a bit of documentation.  I can't think of a better place for that documentation to live than here.  :/

I'm fine if you prefer the documentation to live on MDN...

::: netwerk/protocol/http/UserAgentOverrides.jsm
@@ +22,5 @@
> +    gPrefBranch.addObserver("", buildOverrides, false);
> +
> +    Services.prefs.addObserver(PREF_OVERRIDES_ENABLED, buildOverrides, false);
> +
> +    Services.obs.addObserver(HTTP_on_modify_request, "http-on-modify-request", false);

Hmm, I'm assuming that hopefully the site_specific_overrides pref would be false most of the time!  :-)  Assuming that, I think we need to restructure this code a bit to avoid adding an http-on-modify-request observer unless that pref is set to true.
Attachment #652133 - Flags: review?(ehsan) → review+
> Hmm, I'm assuming that hopefully the site_specific_overrides pref would be
> false most of the time!  :-)  Assuming that, I think we need to restructure
> this code a bit to avoid adding an http-on-modify-request observer unless
> that pref is set to true.

The pref only exists for testing purposes. If we want no overrides at all, we can just avoid loading the module.
(In reply to comment #31)
> > Hmm, I'm assuming that hopefully the site_specific_overrides pref would be
> > false most of the time!  :-)  Assuming that, I think we need to restructure
> > this code a bit to avoid adding an http-on-modify-request observer unless
> > that pref is set to true.
> 
> The pref only exists for testing purposes. If we want no overrides at all, we
> can just avoid loading the module.

That sounds fine to me.
> ::: modules/libpref/src/init/all.js
> @@ +22,5 @@
> >  pref("keyword.URL", "https://www.google.com/search?ie=UTF-8&oe=utf-8&q=");
> >  pref("keyword.enabled", false);
> >  pref("general.useragent.locale", "chrome://global/locale/intl.properties");
> >  pref("general.useragent.compatMode.firefox", false);
> > +pref("general.useragent.site_specific_overrides", true);
> 
> I really think that this magic is worth a bit of documentation.  I can't
> think of a better place for that documentation to live than here.  :/
> 
> I'm fine if you prefer the documentation to live on MDN...

I don't think all.js is the right place for that documentation, as that pref exists for testing only and the actual overrides would be set elsewhere, e.g. firefox.js.
Keywords: dev-doc-needed
Attachment #652133 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/418955e7c3a9
Status: ASSIGNED → RESOLVED
Closed: 12 years ago
Flags: in-testsuite?
Resolution: --- → FIXED
Target Milestone: --- → mozilla18
Comment on attachment 656347 [details] [diff] [review]
patch v4 with some comments added

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 588909
User impact if declined: bug 778408
Testing completed (on m-c, etc.): manual on m-c
Risk to taking this patch (and alternatives if risky): low, no fallout expected
String or UUID changes made by this patch: none
Attachment #656347 - Flags: approval-mozilla-aurora?
Backed out for (weird) Windows xpcshell failures:
https://hg.mozilla.org/mozilla-central/rev/c72b90b1c8d2
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Attachment #656347 - Flags: approval-mozilla-aurora?
Attached patch patch v5Splinter Review
(In reply to Ed Morley [:edmorley] from comment #37)
> Backed out for (weird) Windows xpcshell failures:

I still don't know what's wrong with those Places xpcshell tests, nor do I understand why they fail on 32-bit Win7 debug only, but protecting UserAgentOverrides against init being called twice or uninit being called before init seems to avoid the orange: https://tbpl.mozilla.org/?tree=Try&rev=3337b15fc775
Attachment #656347 - Attachment is obsolete: true
https://hg.mozilla.org/mozilla-central/rev/19a91d0fd50b
Status: REOPENED → RESOLVED
Closed: 12 years ago12 years ago
Resolution: --- → FIXED
Comment on attachment 656817 [details] [diff] [review]
patch v5

[Approval Request Comment]
Bug caused by (feature/regressing bug #): bug 588909
User impact if declined: bug 778408
Testing completed (on m-c, etc.): manual on m-c
Risk to taking this patch (and alternatives if risky): low, no fallout expected
String or UUID changes made by this patch: none
Attachment #656817 - Flags: approval-mozilla-aurora?
Blocks: 787889
Blocks: 787054
No longer blocks: 787889
Attachment #656817 - Flags: approval-mozilla-aurora? → approval-mozilla-aurora+
Blocks: 788422
Blocks: 797363
Flags: in-testsuite?
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: