Closed Bug 1120668 Opened 9 years ago Closed 6 years ago

OTA updates fail, with "AUS:SVC UpdateService:onError - error during background update. error code: -2141577348, status text: Update XML file malformed (200)", if the date on the device is wrong by a month or two

Categories

(Firefox OS Graveyard :: General, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: dholbert, Unassigned)

References

()

Details

Attachments

(1 file)

13.17 KB, application/vnd.tcpdump.pcap
Details
I've got a Flame device that's currently got a 2.2 Nightly on it, from mid-december (Build ID 20141215040201).

I'm trying to get it to download an OTA update, but it's failing during the update-check. Here's what I see in logcat, when I force an update check in Settings:
{
01-26 18:46:00.535 I/Gecko   (  204): UpdatePrompt: Forcing update check
01-26 18:46:00.535 I/Gecko   (  204): *** AUS:SVC Checker: checkForUpdates, force: true
01-26 18:46:00.595 I/Gecko   (  204): *** AUS:SVC Checker:getUpdateURL - update URL: https://aus4.mozilla.org/update/3/B2G/37.0a1/20141215040201/flame/en-US/nightly/Boot2Gecko%202.2.0.0-prerelease%20(SDK%2019)/default/default/update.xml?force=1
01-26 18:46:00.595 I/Gecko   (  204): *** AUS:SVC Checker:checkForUpdates - sending request to: https://aus4.mozilla.org/update/3/B2G/37.0a1/20141215040201/flame/en-US/nightly/Boot2Gecko%202.2.0.0-prerelease%20(SDK%2019)/default/default/update.xml?force=1
01-26 18:46:00.595 I/GeckoDump(  204): XXX FIXME : Got a mozContentEvent: force-update-check
01-26 18:46:00.625 D/wpa_supplicant(  969): RX ctrl_iface - hexdump(len=11): 53 49 47 4e 41 4c 5f 50 4f 4c 4c
01-26 18:46:00.625 D/wpa_supplicant(  969): wlan0: Control interface command 'SIGNAL_POLL'
01-26 18:46:00.625 D/wpa_supplicant(  969): nl80211: survey data missing!
01-26 18:46:00.805 I/Gecko   (  204): *** AUS:SVC Checker:onError - request.status: 2153389948
01-26 18:46:00.805 I/Gecko   (  204): *** AUS:SVC getStatusTextFromCode - transfer error: Update XML file malformed (200), default code: 200
01-26 18:46:00.805 I/Gecko   (  204): UpdatePrompt: Setting gecko.updateStatus: check-error-2153389948
01-26 18:46:00.815 I/Gecko   (  204): *** AUS:SVC UpdateService:onError - error during background update. error code: -2141577348, status text: Update XML file malformed (200)
}

It looks like it's trying to download this URL, and it's rejecting the result as invalid XML:
https://aus4.mozilla.org/update/3/B2G/37.0a1/20141215040201/flame/en-US/nightly/Boot2Gecko%202.2.0.0-prerelease%20(SDK%2019)/default/default/update.xml?force=1

However, if I load that in my browser on Desktop firefox, it certainly looks like valid XML. And I verified that I can download it directly from my device, using "curl -k" in adb shell -- that gives me this output:
{
<?xml version="1.0"?>
<updates>
    <update type="minor" displayVersion="37.0a1" appVersion="37.0a1" platformVersion="37.0a1" buildID="20150112010228">
        <patch type="complete" URL="http://ftp.mozilla.org/pub/mozilla.org/b2g/nightly/2015/01/2015-01-12-01-02-28-mozilla-central-flame-kk/b2g-flame-gecko-update.mar" hashFunction="sha512" hashValue="e19ebb2f4049d43f599734f39b40f6b46d2190d65a1d7643df45a39117bc9339b9afa461ebf1dd1a659ecea12cc4c1765c30e94cbc07ac6f9568a3e7e419168b" size="72231948"/>
    </update>
</updates>
}

So, I'm unclear why I'm getting an error when trying to update.  CC'ing marshall_law since he's helped me with update issues in the past; Marshall, please CC others who might be able to help here, and let me know if there's any other information that you need or any other diagnostics I should try.
Are you sure it's not your Flame's network: a captive portal, something like that ?
I'm pretty sure, given that "curl" works just fine from the device itself (in 'adb shell').

(I'm connected to the openwireless.org network from the Mozilla MV office, which IIRC has no captive portal.)
just remembered -- I also tried loading blizzard.com in the browser, and it loaded fine. So, doesn't seem to be a limited-WiFi-network/captiveportal sort of issue.
(And just for thoroughness' sake, I've now connected the device to the "Mozilla" office WiFi network, and I still get the same error.  The device's browser app still can connect to the web just fine.)
Error code 2153389948 seems to be relevant.

2153389948 = 0x805A1f7c which seems to originate ERROR_MODULE_SECURITY.

I haven't been able to figure out the exact code which corresponds to 0x1f7c.

I've found mention of error 2153390069 = 0x805A1ff5 which is close, and means server certificate expired.
I hit this again today, after turning this Flame back on again for the first time since comment 4.

I noticed that my device's date & time were set incorrectly (though it wasn't extremely far off -- it was set to 6pm on 2/12/2015, less than a month ago.)

After I manually corrected the date, my next update-check *succeeded*!!  So, I think the incorrect date may have been partially to blame. Note also that in comment 0, the date was off there too -- I posted that comment on 1/13, but the logcat output shows 1/26.

But, I don't immediately see how a "small" (<1 month-off) date issue like this could be causing cert errors, because it looks like the cert used by aus4.mozilla.org is valid from 09/22/2013 to 09/27/2016. Maybe there's an additional assertion that gets passed along with the cert & the update, which is only valid for a shorter period of time, perhaps?
Summary: Unable to download OTA updates for my Flame w/ 2014.12.15 2.2 Nightly, with "*** AUS:SVC UpdateService:onError - error during background update. error code: -2141577348, status text: Update XML file malformed (200)" → Unable to download OTA updates for my Flame w/ 2014.12.15 2.2 Nightly, with "*** AUS:SVC UpdateService:onError - error during background update. error code: -2141577348, status text: Update XML file malformed (200)" (w/ wrong date on device)
I can reliably trigger this bug by setting the device's date to be 1 month in the past.

We should be robust & able to download updates, in the face of such small misconfigurations -- let's refocus this bug to be about that, now that the issue has been identified.

(Perhaps it's acceptable to reject updates if the device's date is misconfigured to be 1970, or 2020 -- or anytime outside of aus4.mozilla.org's valid cert-window. But ~1 month of error is pretty small, and should cause permanent-failure-to-update.)
Summary: Unable to download OTA updates for my Flame w/ 2014.12.15 2.2 Nightly, with "*** AUS:SVC UpdateService:onError - error during background update. error code: -2141577348, status text: Update XML file malformed (200)" (w/ wrong date on device) → OTA updates fail, with "AUS:SVC UpdateService:onError - error during background update. error code: -2141577348, status text: Update XML file malformed (200)", if the date on the device is wrong by a month or two
(sorry: *shouldn't* cause permanent-failure-to-update.)

Here's what this error looks like right now, in logcat, with an up-to-date Flame with its date set 1 month in the past:
{
02-04 15:13:09.341 I/Gecko   (  205): UpdatePrompt: Forcing update check
02-04 15:13:09.341 I/Gecko   (  205): *** AUS:SVC Checker: checkForUpdates, force: true
02-04 15:13:09.411 I/Gecko   (  205): *** AUS:SVC Checker:getUpdateURL - update URL: https://aus4.mozilla.org/update/3/B2G/39.0a1/20150304010324/flame/en-US/nightly/Boot2Gecko%203.0.0.0-prerelease%20(SDK%2019)/default/default/update.xml?force=1
02-04 15:13:09.411 I/Gecko   (  205): *** AUS:SVC Checker:checkForUpdates - sending request to: https://aus4.mozilla.org/update/3/B2G/39.0a1/20150304010324/flame/en-US/nightly/Boot2Gecko%203.0.0.0-prerelease%20(SDK%2019)/default/default/update.xml?force=1
02-04 15:13:09.411 I/GeckoDump(  205): XXX FIXME : Got a mozContentEvent: force-update-check
02-04 15:13:09.521 E/QCALOG  (  329): [MessageQ] ProcessNewMessage: [XTWiFi-PE] unknown deliver target [OS-Agent]
02-04 15:13:09.591 I/Gecko   (  205): *** AUS:SVC Checker:onError - request.status: 2153389949
02-04 15:13:09.591 I/Gecko   (  205): *** AUS:SVC getStatusTextFromCode - transfer error: Update XML file malformed (200), default code: 200
02-04 15:13:09.591 I/Gecko   (  205): UpdatePrompt: Setting gecko.updateStatus: check-error-2153389949
02-04 15:13:09.601 I/Gecko   (  205): *** AUS:SVC UpdateService:onError - error during background update. error code: -2141577347, status text: Update XML file malformed (200)
}

(Note that the "request.status" & "error code" values are slightly different now -- they're off by 1 from the values in comment 0. I'm guessing that's not too significant.)
Identifier info for my build:
 Build Identifier: 20150304010324
 Update channel: Nightly
 Git Commit Info: 2015-03-03 21:58:43 3fc0ac30

needinfo=dhylands to see if he can reproduce, and to see if any progress can be made now that we better-know what's going on (I think).
Flags: needinfo?(dhylands)
OS: Linux → Gonk (Firefox OS)
Hardware: x86_64 → ARM
Since you have a phone that seems to reproduce, perhaps I can get you to grab some files off of the phone?

IIRC, there should a directory created on the phone /data/local/updates that may have subdirectories.

Can you pull the contents of the /data/local/updates directory tree?

Something like: 
> adb pull /data/local/updates 

should grab all of the files. If you have an sdcard, then there may also be a /sdcard/updates directory tree to grab as well.
Flags: needinfo?(dhylands)
I can, though have you tried reproducing since I figured out that the date seems to be the issue, per comment 7?

(Just manually set your device's date to be 1 month in the past, and then do a manual update check, and look at your logcat for error output like comment 8)

If you can reproduce with your device's date tweaked (and I'm hopeful that that's all that's required), then it'll save us both time to skip past any debugging-over-bugzilla stuff. :)
Flags: needinfo?(dhylands)
Attached file tcpdump capture
I changed the https: in the URL to http: in the URL and was then able to do a tcpdump capture.

The tcpdump seems to be showing a 301 Moved Permanently
Flags: needinfo?(dhylands)
Hmm. Maybe that's saying that the http: address has been moved permanently to https:

The place in the code where its sending the request is here:
https://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateService.js?from=nsUpdateService.js&case=true#3726-3755

And then the onError callback gets called:
https://dxr.mozilla.org/mozilla-central/source/toolkit/mozapps/update/nsUpdateService.js?from=nsUpdateService.js&case=true#3886-3910

But I'm not quite sure how to get at the content that's being retrieved.
If I set my linux desktop's date to 1 month in the past, and then try to visit the rejected URL from comment 0 ( https://aus4.mozilla.org/update/3/B2G/37.0a1/20141215040201/flame/en-US/nightly/Boot2Gecko%202.2.0.0-prerelease%20(SDK%2019)/default/default/update.xml?force=1 ), I get:

> Secure Connection Failed
>
> An error occurred during a connection to aus4.mozilla.org.
> The OCSP response is not yet valid (contains a date in the future).
> (Error code: sec_error_ocsp_future_response)

That's presumably the same issue that B2G is hitting when checking this update server, with an incorrect date on the device.
I guess it would be good to figure out a way to get a similar error string logged rather than just the error code.
I can actually hit this OCSP error (on my desktop) if I configure my system to be just 1 week in the past, actually. That's a pretty small amount of error, and it's believable that users could end up with phones that are off by that amount, by one route or another. Right now we're effectively stranding those users & refusing to give them system-updates, with no easily-discoverable explanation.

I'm not sure what the best way to fix this is -- some ideas:
 a) Ignore date issues in OCSP assertions, when checking for updates.
    (or semi-equivalently: don't send OCSP assertions from this server)
 b) Greatly increase the valid-date window in our OCSP asertions from the update server (if that's something you can even do)
 c) Prompt the user that their device's date may be off, when we detect this sort of error.
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 6 years ago
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: