Home screen fetches url from history

RESOLVED INVALID

Status


Firefox for iOS
Home screen
RESOLVED INVALID
2 years ago
2 years ago

People

(Reporter: contact, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

2 years ago
User Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:48.0) Gecko/20100101 Firefox/48.0
Build ID: 20160601004014

Steps to reproduce:

Hi.
My Firefox on iOS 9.3.2 (v4.0 11) is synchronized with my desktop Firefox (Developer Edition 48.0a2), and when I open a new tab in Firefox for iOS, the home screen shows the most visited websites of both desktop and mobile Firefox.

While this screen is displayed, the URLs of the most visited websites are fetched in the background (I saw this by monitoring my Apache web server logs).

I'm a web developer and I was working on a web page (say http://domain.com/script/mail.php) that sends an email to my inbox, and I noticed that even when I didn't refresh the page in my desktop Firefox (Firefox Developer Edition, Windows 10), I received emails while browsing different websites in my mobile Firefox...

After some tests, I found that the mail (generated by the web page I was working on) was sent whenever I opened a new tab in Firefox for iOS.

I had trouble identifying where the call to my web page (http://domain.com/script/mail.php) came from, because in my web server log Firefox for iOS has a strange user agent:
"Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.1.46 (KHTML, like Gecko) Safari/601.1.46" Why is it Intel Mac OS X, I'm on an apple device...


Actual results:

My web page was called just because I opened a new tab showing most visited websites. 


Expected results:

A new tab should never fetch the URLs of the displayed most visited websites, as this can lead to unwanted code execution.
(In reply to contact from comment #0)
> I had trouble to identify from where the call to my webpage
> (http://domain.com/script/mail.php) was done because in my web server log,
> the ios firefox has a strange user agent :
> "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_1) AppleWebKit/601.1.46
> (KHTML, like Gecko) Safari/601.1.46" Why is it Intel Mac OS X, I'm on an

That is not the Firefox UA. We identify as [1].

[1] https://developer.mozilla.org/en-US/docs/Web/HTTP/Gecko_user_agent_string_reference#Firefox_for_iOS

Visit http://www.whoishostingthis.com/tools/user-agent/ in Firefox

I'll let others speak for the crux of this report.
(Reporter)

Comment 2

2 years ago
I know that it's not the Firefox UA, but that line in my log comes from the Firefox app when opening a new tab, so there is something weird...

Updated

2 years ago
tracking-fxios: --- → ?
(Reporter)

Comment 3

2 years ago
No response ?
Hey contact,

Thanks for filing this - it's a great find. Currently, whenever we load the 'Top Sites' panel, we send out a web request to any site that will appear on the panel and doesn't have a favicon. To do this, we make a request to the actual page and parse out the favicon links from the HEAD tag. It looks like in your case this request is triggering the mailer you're testing. We really shouldn't be making a request at all, especially just for the favicon. Instead of requesting the favicon, we could show a default icon.
(Reporter)

Comment 5

2 years ago
Hi,

Thank you for your answer. I think it would indeed be better not to make that request because it can lead to unexpected behavior: my mailer script sent emails to about 1000 customers 8 times because I opened 8 new tabs, not good :(
Maybe you could just call the domain and not include the rest of the URL. In my case it could have called http://domain.com/ (where there is a favicon header tag) and not http://domain.com/script/mail.php (where no HTML is output, as it is just a PHP script).
I hope it will be fixed soon because I really enjoy sharing my history between all my browsers :) For now I have just disabled history sync.

Best regards
Bastien
(Reporter)

Comment 6

2 years ago
Or maybe you can just try to fetch http://domain.com/favicon.ico (I think most of the time favicons are in the root folder)
To confirm: this is the top sites panel hitting your page when a) you've visited it recently/frequently on another device, but b) not on this one. 

The top sites panel tries to display a nice high quality favicon for each of the sites it is offering you (a list of recently/frequently visited sites in your history). If you haven't been to the page on this device, then it downloads the HTML for that page, and looks for something that looks like a favicon in the HTML. Once it has that, it downloads that with a GET request.

It downloads the page, and not the site, because some domains use different favicons for different pages on the same domain: google.com/maps vs. google.com.

We should probably do a better job at caching the images we eventually download so we don't have to download the HTML each and every time; however, there are many instances of favicons being dynamically generated and/or changing frequently (e.g. social networks that display profile pics in favicons).

The UA string you see is deliberately the same as the desktop user agent. We tried using the mobile UA string, but found that quite a few sites were browser sniffing, and sending us HTML pages which pointed to very low quality images because we were outside their list of recognized user-agents. Unfortunately these sites were large enough and numerous enough we could not ignore. We should do a better job at convincing them to a) not use browser sniffing b) recognize us as an iOS browser and c) not use browser sniffing.

> Or maybe you can just try to fetch http://domain.com/favicon.ico (I think most of the time favicon are in the root folder)
We use favicon.ico as a fallback of last resort: .ico is an old Windows format which has a maximum resolution of 32x32. This looks pretty horrible on today's high res screens. Unfortunately every other standard and convention on favicons requires us to download and parse the HTML.
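
To make the lookup order described above concrete, here is a minimal Python sketch. It is not the Firefox for iOS implementation; the names `IconLinkParser` and `find_icon_urls` are hypothetical, and matching any `rel` token that contains "icon" is an assumption made for illustration. It only shows the general idea: parse the page's HTML for icon links, and fall back to /favicon.ico at the root when none are found.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class IconLinkParser(HTMLParser):
    """Collects href values of <link> tags whose rel mentions an icon."""

    def __init__(self):
        super().__init__()
        self.icon_hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attr_map = dict(attrs)
        rel_tokens = (attr_map.get("rel") or "").lower().split()
        href = attr_map.get("href")
        # Assumption: any rel token containing "icon" counts, e.g.
        # "icon", "shortcut icon", "apple-touch-icon".
        if href and any("icon" in token for token in rel_tokens):
            self.icon_hrefs.append(href)


def find_icon_urls(page_url, html):
    """Return absolute icon URLs for a page, or the root favicon.ico fallback."""
    parser = IconLinkParser()
    parser.feed(html)
    urls = [urljoin(page_url, href) for href in parser.icon_hrefs]
    if not urls:
        # Fallback of last resort: favicon.ico at the root of the domain.
        urls.append(urljoin(page_url, "/favicon.ico"))
    return urls
```

Note that the HTML has to be downloaded before `find_icon_urls` can run, which is exactly the GET request to the page that showed up in the reporter's server log.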

> I think it would effectively be better to not make that request because it can lead to unexpected behavior,
> my mailer script sended 8 times emails to about 1000 customers as i opened 8 new tabs, not good :(
That's a heck of a good example of unexpected behavior. I would not expect that from a GET request. 

 - What happens when Googlebot crawls your site? (web crawlers use GET requests; the well behaved ones, I expect, you are preventing using http://domain.com/robots.txt ). 
 - Do any of your users bookmark that page?

If you want to use requests that cause those sorts of side-effects you should probably consider PUT, POST or DELETE requests instead. GET and HEAD requests are explicitly supposed to be 'safe' for clients to retrieve at will. https://www.w3.org/Protocols/rfc2616/rfc2616-sec9.html#sec9.1.1 
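
As a rough illustration of that advice, here is a small Python sketch of a request dispatcher that treats GET and HEAD as safe and only runs the side effect on POST. The function name `handle_mailer_request` and the `send_mail` callback are hypothetical, and the reporter's real script is PHP, so this is only a sketch of the principle, not a drop-in fix.

```python
# Methods that clients may issue at will, per the RFC 2616 section linked above.
SAFE_METHODS = {"GET", "HEAD"}


def handle_mailer_request(method, send_mail):
    """Return (status_code, message); call send_mail() only for POST.

    `send_mail` is a hypothetical zero-argument callable standing in
    for the real mailer.
    """
    method = method.upper()
    if method in SAFE_METHODS:
        # Safe methods must not trigger the mailer, whoever sends them:
        # a browser, a crawler, or a favicon fetcher.
        return 200, "Mailer endpoint. Send a POST request to run it."
    if method == "POST":
        send_mail()
        return 200, "Mail batch sent."
    return 405, "Method not allowed."
```

With a handler shaped like this, a background GET from any user agent gets a harmless response, and the mailer only runs when something deliberately issues a POST.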

You can work around this: visit the page in Firefox on your device, so the device has that page and favicon in its local history and image cache. However, this will only help you if you're the only user of this page. Other users may've bookmarked this page and returned to it in any other browser, desktop or mobile. Web crawlers may try to spider the page, even if there is no link connection to the wider web.

We are also looking to allow some customization of the new tab experience, including avoiding the top site panel altogether.

btw: I am envious of your use of domain.com. :)
Status: UNCONFIRMED → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → INVALID
(Reporter)

Comment 8

2 years ago
Hello,

Thank you for this precise response. I understand better the problem of favicon.

My page is never indexed by bots because it's not a URL that bots are aware of. In fact it's just a PHP script launched by a cron job (running on the command line), but as I'm a web developer I can also launch the script in my browser (it's only accessible from my IP address, for testing purposes). While testing, I refreshed the page a large number of times, so the URL became a "top visited website", leading to the problem I described before when opening new tabs on my mobile. I have modified the script so it can no longer be launched in a browser in a production environment.

For the workaround of visiting the page on my mobile, not sure it would work because my page return nothing but a blank page, so no favicon there...visit the page in Firefox on your device, so the device has that page and favicon in its local history and image cache. However, this will only help you if you're the only user of this page. Other users may've bookmarked this page and returned to it in any other browser, desktop or mobile. Web crawlers may try to spider the page, even if there is no link connection to the wider web.

I think that a nice feature would be to have the choice of displaying this list or not (my teammate who has an Android phone has this feature, if I'm right).

Just a question: wouldn't it be possible to save the favicon URL alongside the visited URL in the synchronized history when pages are visited? That way you would only have to fetch this URL instead of parsing headers.

Concerning domain.com, I would love to own it, but the real domain name I use is a bit less attractive :)

Best regards and good luck finding a solution
(Reporter)

Comment 9

2 years ago
Sorry, there is a part of your response after mine in that paragraph :

For the workaround of visiting the page on my mobile, not sure it would work because my page return nothing but a blank page, so no favicon there...

Regards
The real solution here is to stop your site sending emails for GET methods. This will fix the problem for all user-agents.

> but as I'am a web developper I have also the possibility to launch the script in my browser
It sounds like you and cron are the only intended users of this script. I'm a bit puzzled why this is attached to a web server at all. 

If you're only using the browser as a convenient way to test your script, I would consider simplifying and launching it using /usr/bin/php from the command line.

Attaching it to a webserver might be a valid thing to do as part of a wider webhook strategy, but triggering side effects (destructive or not) with a GET method is not the correct way to use HTTP.

If you can't fix your site for _all user agents_, and you don't want to be launching this site from top sites, you can remove it by long pressing on the domain.com icon, then clicking on the close X.

If you do want to launch domain.com from top sites then putting a favicon.ico in the root of your domain should stop this from happening too many times.

There are bugs with the top sites panel, but this is not one of them. Good luck.
(Reporter)

Comment 11

2 years ago
Thank you. I know that I can use the command line. It would take too long to explain why I have to use a GET request and not /usr/bin/php on this website, and I think that it's off topic here...

By the way, there are numerous cases that require launching a cron job with a GET request (shared hosting etc.), so I think I'm not the only one doing that.

I'm not focusing on this particular website or on a user agent. From my point of view, silently fetching a URL just for a favicon is a general problem, and Stephan Leroux seems to share this idea... You have access to the favicon when the page is visited in a browser, so why make an extra request later? Store it the first time and retrieve it on the other synchronized browsers...

The list of top visited sites was far too long to remove each element manually, and will removing a top visited website prevent it from appearing again if I visit it often in my desktop browser?

Concerning your remark about adding a favicon.ico: that's already the case. I have a favicon.ico at the root of my domain and the background request was made anyway each time I opened a new tab...
tracking-fxios: ? → ---