Closed Bug 891654 Opened 11 years ago Closed 11 years ago

Validator can't find hosted manifest that works fine in every other way

Categories

(Marketplace Graveyard :: Validation, defect)

Platform: x86 macOS
Type: defect
Priority: Not set
Severity: major

Tracking

(Not tracked)

RESOLVED WONTFIX
2013-07-11

People

(Reporter: Harald, Unassigned)

Details

Steps:
 - Validate http://www.segundamano.es/firefox_manifest.webapp
 - Error: No manifest was found at that URL. Check the address and try again

Expected result:
 - Validation OK

The manifest returns the correct MIME type and status code, as verified via curl with Firefox OS and Firefox for Android user agents.

This blocks submission for a carrier launch partner.
Without a UA string, the server returns an empty response.

curl http://www.segundamano.es/firefox_manifest.webapp
curl: (52) Empty reply from server
It also blocks urllib:
% curl --user-agent 'Python-urllib/2.6' http://www.segundamano.es/firefox_manifest.webapp
curl: (52) Empty reply from server
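
For reference, a minimal Python 2 sketch of the same request with stock urllib2; the URL is the one from this bug, everything else here is just for illustration:

  import httplib
  import urllib2

  url = "http://www.segundamano.es/firefox_manifest.webapp"
  try:
      resp = urllib2.urlopen(url, timeout=10)
      print resp.getcode(), resp.info().gettype()
  except (urllib2.URLError, httplib.HTTPException) as e:
      # The server drops the connection for the default Python-urllib
      # User-Agent, so the request fails before any manifest is read.
      print "request failed:", repr(e)
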
Target Milestone: --- → 2013-07-11
Bug 888085 would fix it so we send a custom user-agent. Let's do that.
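
For the record, a rough sketch of what that could look like with urllib2; the UA string below is just a placeholder, not whatever bug 888085 ends up choosing:

  import urllib2

  req = urllib2.Request(
      "http://www.segundamano.es/firefox_manifest.webapp",
      # Placeholder UA for illustration only.
      headers={"User-Agent": "MozillaMarketplaceValidator/0.1 (placeholder)"},
  )
  resp = urllib2.urlopen(req, timeout=10)
  print resp.getcode(), resp.info().gettype()
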
I'm inclined to WONTFIX. If the server is explicitly blocking specific UAs, how are we supposed to get around that? If we changed the UA to FXOS's UA and the server started blocking that one too, should we change it to a different UA? Should we use googlebot's UA? At what point do we draw the line?

If this were something that zamboni/the validator were doing incorrectly I'd say that we should fix it, but really any application with default settings would run into this same problem.
Sending no user agent seems like an unexpected request, and I am not surprised that it triggers unexpected results. Is there a motivation behind that, like suppressing UA detection?

If this is an expected limitation in the Validator, it might help to call it out when the response isn't what was expected. Maybe developers could expand more information to see the request headers that were sent and the response headers that were received.
(In reply to Harald Kirschner :digitarald from comment #5)
> If this is an expected limitation in the Validator, it might help to call
> it out when the response isn't what was expected. Maybe developers could
> expand more information to see the request headers that were sent and the
> response headers that were received.

We're very limited in the information we can return to the user about what failed in HTTP requests, for security reasons. We've had to cut back the amount of data in the past and have an open bug to cut it back even more. At one point we gave you very detailed information about the result of bad requests, but unfortunately that's no longer an option.
Is there a bug that explains the security/privacy concerns?

The partner disabled those requests to block spiders that don't provide a UA. Does Marketplace make the request from a fixed IP so the partner can unblock it?
Flags: needinfo?(mattbasta)
Principles of the open web aside (we should never encourage partners to filter requests just because they don't include identifying information), no, there is no guarantee that the validator will be run from a particular IP.
Flags: needinfo?(mattbasta)
OK, if the IP can change, we shouldn't let them filter by IP.

What is the decision on bug 888085? I would really prefer that we faithfully request resources with the right UA.

Is there a bug that explains the security/privacy concerns?
Bug 878368 explains why we can't give more information about request failures. PM me if you need to get access to it.

> I would really prefer that we faithfully request resources with the right UA.

Make no mistake, we are sending a user agent; it's the default one provided by Python's urllib2 module. It should look something like:

  Python-urllib/2.7
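
You can confirm what urllib2 advertises by default with a quick check (a sketch, assuming the same Python version the validator runs under):

  import urllib2

  # The default opener carries a "User-agent: Python-urllib/<version>" header.
  opener = urllib2.build_opener()
  print dict(opener.addheaders).get("User-agent")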

The requests we're sending are a.) not at all malformed and b.) not in any way incorrect (we are indeed requesting the manifest from Python's urllib2 module). I don't think we should be special casing our code for Firefox OS.

If another platform adds WebAPI support, will its users not be able to install apps because the remote server is filtering out their user agent? How many times should we request the manifest with different UAs before we determine that the remote server isn't giving priority or preference to one UA over another?

And what about Android? What if the remote server returns a manifest for a Firefox OS UA, but for a Firefox for Android UA it has an Apache redirect in place that sends all Android traffic to a "mobile site" that isn't a manifest at all?

At the end of the day, it's a wash. But in any case I don't believe specifying a UA is going to do anything more than gloss over the one-off cases where developers have poorly-conceived spider (or otherwise) blocking mechanisms installed on their servers. This is why robots.txt exists.
% curl 'http://www.segundamano.es'
curl: (52) Empty reply from server

Repeating the remarks above, it's pretty unacceptable and anti-the-way-of-the-web to be checking against some magic whitelist of UAs.

To move forward: is there a way the partner can request their IT team/hosting provider of segundamano.es to not block on empty/unknown User-Agents?
I am requesting that, but I wanted to make sure I'm not putting additional work on a partner while bug 888085 hasn't been discussed yet.

Can somebody cc me on bug 878368, please?
(In reply to Christopher Van Wiemeersch [:cvan] from comment #11)
> To move forward: is there a way the partner can request their IT
> team/hosting provider of segundamano.es to not block on empty/unknown
> User-Agents?

This is the right thing to do. Python-urllib is a standard library on the web, and having us conform to other companies' arbitrary lists of blocked UAs feels like a maintenance headache for no purpose.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → WONTFIX