Closed Bug 816306 Opened 7 years ago Closed 7 years ago

[Settings] Check for updates gets stuck on "Checking for updates..." because 404

Categories

(Firefox OS Graveyard :: General, defect, P1)

x86_64
Linux
defect

Tracking

(blocking-basecamp:+, firefox19 wontfix, firefox20 wontfix, firefox21 fixed, b2g18 fixed)

VERIFIED FIXED
B2G C4 (2jan on)
blocking-basecamp +
Tracking Status
firefox19 --- wontfix
firefox20 --- wontfix
firefox21 --- fixed
b2g18 --- fixed

People

(Reporter: dholbert, Assigned: marshall)

Details

(4 keywords)

Attachments

(4 files, 4 obsolete files)

I just updated to the stable dogfood build that went up this morning, Build ID 20121127132302.

Using that build, I just did "check for updates".

That added the "Checking for updates..." text below the button, like usual, but now that text never goes away. (It doesn't tell me whether there's an update available or not.)

Additionally, my logcat output contains:
{
11-28 15:22:16.049 E/GeckoConsole(  703): [JavaScript Error: "syntax error" {file: "http://update.boot2gecko.org/stable/unagi1/18.0/20121127132302/update.xml?build_id=20121127132302&version=18.0&dogfood_id=14162&force=1" line: 1 column: 50 source: "<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">"}]
11-28 15:22:16.049 I/Gecko   (  703): *** AUS:SVC Checker:onProgress - 329/329
11-28 15:22:16.049 E/GeckoConsole(  703): AUS:SVC Checker:onProgress - 329/329
11-28 15:22:16.059 I/Gecko   (  703): *** AUS:SVC Checker:onLoad - request completed downloading document
11-28 15:22:16.059 E/GeckoConsole(  703): AUS:SVC Checker:onLoad - request completed downloading document
11-28 15:22:16.059 I/Gecko   (  703): *** AUS:SVC Checker:updates get - unexpected node name!
11-28 15:22:16.059 E/GeckoConsole(  703): AUS:SVC Checker:updates get - unexpected node name!
11-28 15:22:16.059 I/Gecko   (  703): *** AUS:SVC Checker:onLoad - there was a problem checking for updates. Exception: Error: Unexpected node name, expected: updates, got: parsererror
11-28 15:22:16.059 E/GeckoConsole(  703): AUS:SVC Checker:onLoad - there was a problem checking for updates. Exception: Error: Unexpected node name, expected: updates, got: parsererror
11-28 15:22:16.059 I/Gecko   (  703): *** AUS:SVC Checker:onLoad - request.status: 404
11-28 15:22:16.059 E/GeckoConsole(  703): AUS:SVC Checker:onLoad - request.status: 404
11-28 15:22:16.059 I/Gecko   (  703): *** AUS:SVC getStatusTextFromCode - transfer error: Update XML file not found (404), code: 404
11-28 15:22:16.059 E/GeckoConsole(  703): AUS:SVC getStatusTextFromCode - transfer error: Update XML file not found (404), code: 404
11-28 15:22:16.069 I/Gecko   (  703): *** AUS:SVC UpdateService:onError - error during background update: Update XML file not found (404)
11-28 15:22:16.069 E/GeckoConsole(  703): AUS:SVC UpdateService:onError - error during background update: Update XML file not found (404)
11-28 15:22:17.629 E/GeckoConsole(  703): [JavaScript Error: "[Exception... "'Illegal value' when calling method: [nsIPromptFactory::getPrompt]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0"  data: no]"]
11-28 15:22:17.629 E/GeckoConsole(  703): [JavaScript Error: "[Exception... "'Illegal value' when calling method: [nsIPromptFactory::getPrompt]"  nsresult: "0x80070057 (NS_ERROR_ILLEGAL_VALUE)"  location: "native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0"  data: no]"]
}
Keywords: regression
Here's the logcat output, starting from when I hit "Check for updates".
blocking-basecamp: ? → +
Keywords: qawanted
Not sure who can help on checking this Gecko log, request SC first, he is the most relevant person I know.
Flags: needinfo?(schien)
Priority: -- → P2
Mass Modify: All un-milestoned, unresolved blocking-basecamp+ bugs are being moved into the C3 milestone. Note that the target milestone does not mean that these bugs can't be resolved prior to 12/10, rather C2 bugs should be prioritized ahead of C3 bugs.
Target Milestone: --- → B2G C3 (12dec-1jan)
Sorry for the late response, I think :marshall_law can help answering this question.

:marshall_law, looks like we get 404 while retrieving update.xml, which results the exception on nsIPromptFactory.getPrompt(). Do you know more details about this exception?
Flags: needinfo?(schien) → needinfo?(marshall)
Is this a Gaia bug that Settings app doesn't handle the error response, or a Gecko bug that DOM API doesn't return the error response?

We need an assignee and clarification here.
Flags: needinfo?(schien)
Flags: needinfo?(ehung)
The 404 exception is expected when a new update hasn't been posted yet (this error is meant to be swallowed in our current setup).
Flags: needinfo?(marshall)
FWIW, this is currently WORKSFORME -- "checking for updates..." is replaced with "This is already the latest version of Firefox OS" after a few seconds.

That might be because the update URL is _not_ currently a 404, though.  (In logcat, it looks like we successfully download update.xml, and then it says "AUS:SVC UpdateService:selectUpdate - skipping update because the update's application version is less than the current application version" and then "UpdatePrompt: Setting gecko.updateStatus: already-latest-version").

So, we may still suffer from the symptoms described in comment 0 when the update URL *is* a 404 (e.g. from network issues or server downtime).  If so, we should still be sure to handle that gracefully.
I took a look of the code logic in Settings app, Gaia should be notified if one of the following cases happened:
 * - no-updates
 * - already-latest-version
 * - check-complete
 * - retry-when-online
As my understand, the update URL is specified in Gecko, so Gaia can do nothing but represents the fact either *no update* or *update failed*. I guess Gecko needs to handle 404 URL case, and report correct error in this case.

Assign to Steve for testing and tracking this issue.
Assignee: nobody → schung
Flags: needinfo?(ehung)
Assignee: schung → arthur.chen
I used build 20121213110234 and test two cases. Corresponding logs are as the following:

Network (WiFi) turning off:
E/GeckoConsole( 802): AUS:SVC getStatusTextFromCode - transfer error: Update server not found (check your internet connection), code: 2152398878

Network (WiFi) turning on but failing to connect to the internet:
E/GeckoConsole( 802): AUS:SVC getStatusTextFromCode - transfer error: Network is offline (go online), code: 2152398864

In the first case Gecko triggers a property change of "gecko.updateStatus" with value "retry-when-online". In the second one nothing happens, thus Gaia is unable to handle the error. 

Marshall, Any suggestion on this?
Flags: needinfo?(marshall)
P1 because update related.
Priority: P2 → P1
Summary: [Settings] Check for updates gets stuck on "Checking for updates..." in 20121127132302 build, w/ "AUS:SVC Checker:updates get - unexpected node name!" and "request.status: 404" → [Settings] Check for updates gets stuck on "Checking for updates..." because 404
IMO, a 404 should reply with the "no-updates" status, since that is what the HTTP response code actually means. Does anyone object to me fixing platform this way?

(In reply to Arthur Chen from comment #9)
> I used build 20121213110234 and test two cases. Corresponding logs are as
> the following:
> 
> Network (WiFi) turning off:
> E/GeckoConsole( 802): AUS:SVC getStatusTextFromCode - transfer error: Update
> server not found (check your internet connection), code: 2152398878
> 
> Network (WiFi) turning on but failing to connect to the internet:
> E/GeckoConsole( 802): AUS:SVC getStatusTextFromCode - transfer error:
> Network is offline (go online), code: 2152398864
> 
> In the first case Gecko triggers a property change of "gecko.updateStatus"
> with value "retry-when-online". In the second one nothing happens, thus Gaia
> is unable to handle the error. 
> 
> Marshall, Any suggestion on this?

Hrm, that's a surprising result. How quickly are you turning off / on the wifi? After wifi is off, is your device still using cellular data?

The "retry-when-online" status should only happen when the network is offline. AFAIK the network online/offline status doesn't actually check for an external "internet" connection -- this is just a simple check of (wifi_connected || cell_data_connected).
Flags: needinfo?(marshall)
It needs modifications from the Gecko side.  Assign to nobody to let others check it.
Assignee: arthur.chen → nobody
Component: Gaia::Settings → General
Are you able to take this, Marshall?
Assignee: nobody → marshall
Attached patch report network errors - v1 (obsolete) — Splinter Review
New tests are a WIP and coming soon
Attachment #695857 - Flags: review?(netzen)
Flags: needinfo?(schien)
This issue is NOT observed on Unagi build 20121231070201.
Keywords: qawanted
Attached patch report network errors - v2 (obsolete) — Splinter Review
added tests for the new update testing frontend
Attachment #695857 - Attachment is obsolete: true
Attachment #695857 - Flags: review?(netzen)
Attachment #697084 - Flags: review?(netzen)
Target Milestone: B2G C3 (12dec-1jan) → B2G C4 (2jan on)
Comment on attachment 697084 [details] [diff] [review]
report network errors - v2

Review of attachment 697084 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good overall, I think just minor changes are needed or how the errorCode is filled.

::: b2g/components/UpdatePrompt.js
@@ +78,5 @@
> +        this._updatePrompt.setUpdateStatus("retry-when-online");
> +        break;
> +      default:
> +        this._updatePrompt.setUpdateStatus("check-error-" + errorCode);
> +        break;

This file should be reviewed by someone else

::: toolkit/mozapps/update/nsUpdateService.js
@@ +1960,5 @@
>    },
>  
>    onError: function AUS_onError(request, update) {
>      LOG("UpdateService:onError - error during background update: " +
> +        update.errorCode + " - " + update.statusText);

nit: "error code: " + update.errorCode + ", status text: "+ update.statusText

@@ +3026,5 @@
>        var request = event.target;
>        var status = this._getChannelStatus(request);
>        LOG("Checker:onLoad - request.status: " + status);
>        var update = new Update(null);
> +      update.errorCode = status;

I think this will just set the error code to the HTTP error code, I'd prefer to set this to an error cod defined up top like for example CERT_ATTR_CHECK_FAILED_HAS_UPDATE.  Especially since some HTTP status codes could conflict with the previous values.

@@ +3058,5 @@
>      if (status == Cr.NS_ERROR_OFFLINE) {
>        // We use a separate constant here because nsIUpdate.errorCode is signed
>        update.errorCode = NETWORK_ERROR_OFFLINE;
> +    } else {
> +      update.errorCode = status;

ditto
Attachment #697084 - Flags: review?(netzen)
give cheker http error codes an offset
Attachment #697084 - Attachment is obsolete: true
Attachment #699246 - Flags: review?(netzen)
Comment on attachment 699246 [details] [diff] [review]
part 1: report network errors - v3

Review of attachment 699246 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/mozapps/update/nsUpdateService.js
@@ +156,5 @@
>  const BACKGROUNDCHECK_MULTIPLE_FAILURES = 110;
>  const NETWORK_ERROR_OFFLINE             = 111;
>  
> +// Error codes should be < 1000. Errors above 1000 represent http status codes
> +const HTTP_ERROR_OFFSET                 = 1000;

Please also add a comment here to not use codes 1100-1599 because they are used internally for /nsUpdateService.js
/toolkit/mozapps/update/common/errors.h

@@ +3060,2 @@
>      var request = event.target;
>      var status = this._getChannelStatus(request);

If I understand correctly this._getChannelStatus can return either an HTTP status code or an nsresult status.  Yuck, but not your code.  I'm not sure what you're be using the errorCode for internally but it may be possible for _getChannelStatus to return a value which is within the range of HTTP status codes.  I'll leave it up to you if you want to try to fix that now by changing _getChannelStatus.  If so I think that function should be split into 2.
Attachment #699246 - Flags: review?(netzen) → review+
Comment on attachment 699246 [details] [diff] [review]
part 1: report network errors - v3

Review of attachment 699246 [details] [diff] [review]:
-----------------------------------------------------------------

actually on second thought if the function does return both status codes and nsresults (which I think it does) then I think it's best to fix it now.
Attachment #699246 - Flags: review+
Comment on attachment 699247 [details] [diff] [review]
part 2: report http errors to gaia + tests - v1

Review of attachment 699247 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/mozapps/update/test/marionette/update_test_status.js
@@ +1,2 @@
> +/* Any copyright is dedicated to the Public Domain.
> + * http://creativecommons.org/publicdomain/zero/1.0/ */

is this really the license we use for tests?
Attachment #699247 - Flags: review?(fabrice) → review+
(In reply to Fabrice Desré [:fabrice] from comment #22)
> > +/* Any copyright is dedicated to the Public Domain.
> > + * http://creativecommons.org/publicdomain/zero/1.0/ */
> 
> is this really the license we use for tests?

Yes, per bottom of https://www.mozilla.org/MPL/headers/
> actually on second thought if the function does return both status codes and
> nsresults (which I think it does) then I think it's best to fix it now.

I'm not sure I fully understand how you want to fix this function -- should I just move the http range / offset logic into _getChannelStatus?

Splitting the function up seems like it might cause more headaches than it's worth, since Firefox error strings are localized based on both HTTP and nsresult error codes:

https://mxr.mozilla.org/mozilla-central/source/toolkit/locales/en-US/chrome/mozapps/update/updates.properties#100
Inside _getChannelStatus, I think that:
 var status = 0;
    try {
      status = request.status; <<---- returns http status code
    }
    catch (e) {
    }

    if (status == 0)
      status = request.channel.QueryInterface(Ci.nsIRequest).status; <<---- returns nsresult
    return status;

Can you verify that is the case? I think it is.
What I'm suggesting is to make a _getChannelHTTPStatus and a _getChannelNSResult function which does each part respectively and delete _getChannelStatus.
You'd replace the existing calls of _getChannelStatus with calls to _getChannelHTTPStatus, but if that fails then you'd call _getChannelNSResult.  If the call to _getChannelHTTPStatus succeeded then you'd set the errorCode at that time.

I'm not sure I fully understand what you mean about the localization please expand if you think it's still a problem with my clarification.
Comment on attachment 699246 [details] [diff] [review]
part 1: report network errors - v3

Review of attachment 699246 [details] [diff] [review]:
-----------------------------------------------------------------

Seems this won't be needed because nsresults will never contain a code in the HTTP status code range.  So this is r+.
Attachment #699246 - Flags: review+
Attachment #699835 - Flags: review?(felash)
Keywords: late-l10n
Attachment #699835 - Flags: review?(stas)
Attachment #699835 - Flags: review?(stas) → review+
Comment on attachment 699835 [details] [diff] [review]
part 3: gaia check error message - v1

Review of attachment 699835 [details] [diff] [review]:
-----------------------------------------------------------------

::: apps/settings/js/settings.js
@@ +519,5 @@
>         * use
>         * http://mxr.mozilla.org/mozilla-central/ident?i=setUpdateStatus&tree=mozilla-central&filter=&strict=1
>         * to check if this is still current
>         */
> +      if (value && value.indexOf('check-error-') == 0) {

nit: I'd prefer to use === everywhere

@@ +521,5 @@
>         * to check if this is still current
>         */
> +      if (value && value.indexOf('check-error-') == 0) {
> +        // Use a catch all for check errors for now
> +        systemStatus.textContent = _('check-error', null, 'check-error');

you just need _('check-error'); the 3rd parameter is for fallback

@@ +527,2 @@
>          systemStatus.textContent = _(value, null, value);
>        }

I'd prefer something like :

if (value !== 'check-complete') {
  systemStatus.textContent = _(value) || _('check-error');
  console.info('Error during system update, code is', value);
}

So that we always have a generic catch-all error, but still output the real error code to the console.
Attachment #699835 - Flags: review?(felash)
Attachment #699247 - Attachment is obsolete: true
Attachment #700444 - Flags: review?(felash)
Comment on attachment 700444 [details] [diff] [review]
part 3: gaia check error message - v2

Review of attachment 700444 [details] [diff] [review]:
-----------------------------------------------------------------

ok, the settings test are broken right now, but that's not because of your patch. Will file a bug for this.

let's land this now, it begins to bite me ;)

::: apps/settings/js/settings.js
@@ +520,5 @@
>         * http://mxr.mozilla.org/mozilla-central/ident?i=setUpdateStatus&tree=mozilla-central&filter=&strict=1
>         * to check if this is still current
>         */
>        if (value !== 'check-complete') {
> +        systemStatus.textContent = _(value, null, _('check-error'));

clever :-)

(probably less efficient but that's ok here)
Attachment #700444 - Flags: review?(felash) → review+
Attachment #699247 - Attachment is obsolete: false
Attachment #700444 - Attachment description: part 2: report http errors to gaia + tests - v2 → part 3: gaia check error message - v2
Attachment #699835 - Attachment is obsolete: true
posting the patch in my pull request here for posterity
Attachment #700444 - Attachment is obsolete: true
Verified fixed in 2013-01-24-14-08-59 pvt nightly b2g18 build
Status: RESOLVED → VERIFIED
ok so the fallback message didn't work at all after all.
Filed Bug 862785
You need to log in before you can comment on or make changes to this bug.