Closed Bug 1181372 Opened 6 years ago Closed 6 years ago

[System][OTA] OTA fails during uncompression, appears to run out of space in /system folder

Categories

(Firefox OS Graveyard :: GonkIntegration, defect)

ARM
Gonk (Firefox OS)
defect
Not set
normal

Tracking

(blocking-b2g:2.5+, b2g-master verified)

VERIFIED FIXED
FxOS-S4 (07Aug)
blocking-b2g 2.5+
Tracking Status
b2g-master --- verified

People

(Reporter: Marty, Unassigned)

Details

(Keywords: regression, smoketest, Whiteboard: [2.5-Daily-Testing])

Attachments

(1 file)

Attached file logcat_ota.txt
Description:
Flame updates will download and appear to uncompress before finally failing and displaying an error pop-up at the bottom of the screen.

It is possible that the device is running out of space in the /system folder. Using the following command, I was able to capture the free /system space as uncompression failed.

|while true; do adb shell df /system; sleep 0.5; done;|
Filesystem               Size     Used     Free   Blksize
/system                354.2M   351.3M     2.9M   4096
Filesystem               Size     Used     Free   Blksize
/system                354.2M   351.9M     2.3M   4096
Filesystem               Size     Used     Free   Blksize
/system                354.2M   353.2M   944.0K   4096
Filesystem               Size     Used     Free   Blksize
/system                354.2M   353.6M   584.0K   4096
Filesystem               Size     Used     Free   Blksize
/system                354.2M   263.4M    90.8M   4096
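
The samples above can be reduced to one free-space figure per reading, which makes the low point easier to spot. This is a minimal sketch, assuming the df output format shown above (Free in the fourth column, with a K/M/G suffix read as a binary multiple). The helper names to_kib, free_kib and min_free_kib are illustrative and not part of adb or df.

```shell
#!/bin/sh
# Convert a size such as "2.9M" or "944.0K" to whole KiB (truncated).
# Assumption: K/M/G are binary multiples; a bare number is treated as bytes.
to_kib() {
    printf '%s\n' "$1" | awk '{
        n = $1 + 0                  # numeric prefix, e.g. 2.9 from "2.9M"
        u = substr($1, length($1))  # last character is the unit suffix
        if (u == "G")      n *= 1024 * 1024
        else if (u == "M") n *= 1024
        else if (u != "K") n /= 1024
        printf "%d\n", n
    }'
}

# Read df output on stdin and print Free (column 4) in KiB for each /system row.
free_kib() {
    awk '$1 == "/system" { print $4 }' | while read -r size; do
        to_kib "$size"
    done
}

# Smallest free value seen across a series of df samples on stdin.
min_free_kib() {
    free_kib | sort -n | head -n 1
}
```

With a device attached, piping the polling loop's output into min_free_kib would report the lowest free space reached during the update attempt.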

From IRC discussion with nhirata_, it is also possible that read-only is getting applied too early, also causing the update to fail (seen in attached logcat).

<nhirata_> at the same time what it does is unzip the mar file and then replace all the files.
<nhirata_> so it basically needs the space of the download + the same amount of space to unzip.
<nhirata_> so we either slightly bloated some more...
<nhirata_> or it's the read-only being a little too quick

Repro Steps:
1) Update a Flame to 20150706010204
2) Connect to a WiFi or Data network
3) In Developer settings, change the OTA channel to 'nightlytest'
4) Check for updates and then begin installing the update.

Actual:
The update will download, but will fail during uncompression.

Expected:
The update uncompresses and is applied properly.

Environmental Variables:
Device: Flame 2.5 (Full Flash)
Build ID: 20150706010204
Gaia: dc6c18c0dea7af3c40bfff86c530fd877d899dc4
Gecko: 136c41fca853
Gonk: a4f6f31d1fe213ac935ca8ede7d05e47324101a4
Version: 42.0a1 (2.5)
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:42.0) Gecko/42.0 Firefox/42.0

Repro frequency: 10/10
See attached: logcat
QA Whiteboard: [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
[Blocking Requested - why for this release]:
Regression that fails smoke tests.

Requesting a window to see if we can determine the source of bloat.
blocking-b2g: --- → 2.5?
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(pbylenga)
QA Contact: jmercado
Getting a regression window is complicated for this issue. Bug 1153976 was occurring around this time and prevents me from getting a smaller window. I stayed with Nightly Central user builds because other builds, such as the pvt central engineering builds, had wildly different dates and are not representative of what I expect a user would see (going from an engineering build to a user build through an OTA update). One thing to note is that our builds and update files seem to have grown significantly in the last couple of months, and we've seen other problems, such as bug 1181188.

For example, the builds currently on the update channels differ by 12.4 MB despite being less than 2 months apart:
nightly channel - build from 5/21 - 76.08 MB
nightlytest channel - current build - 88.48 MB

I've included the results of using the "adb shell df /system" command after flashing to each of these builds.  Of note is that both gaia and gecko grew during this window and I could not update from EITHER build after performing swaps because of this growth.  The Gecko portion grew the most but I am including both pushlogs because of this.

Central Regression Window:

Last Working 
Environmental Variables:
Device: Flame 2.5
BuildID: 20150410010202
Gaia: e768af6558957ddb0f6a9ce579ea41c3e3d0b203
Gecko: fec90cbfbaad
Gonk: b83fc73de7b64594cd74b33e498bf08332b5d87b
Version: 40.0a1 (2.5) 
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

Filesystem               Size     Used     Free   Blksize
/system                354.2M   231.6M   122.6M   4096

First Broken 
Environmental Variables:
Device: Flame 2.5
BuildID: 20150414072436
Gaia: c8cb0c0ebb8dd1f5c0c9037e38f8e4b237beb77b
Gecko: 388f5861dc7d
Gonk: b83fc73de7b64594cd74b33e498bf08332b5d87b
Version: 40.0a1 (2.5) 
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:40.0) Gecko/40.0 Firefox/40.0

Filesystem               Size     Used     Free   Blksize
/system                354.2M   239.4M   114.7M   4096

Last Working gaia / First Broken gecko - Issue DOES occur
Gaia: e768af6558957ddb0f6a9ce579ea41c3e3d0b203
Gecko: 388f5861dc7d

Filesystem               Size     Used     Free   Blksize
/system                354.2M   236.7M   117.5M   4096

First Broken gaia / Last Working gecko - Issue DOES occur
Gaia: c8cb0c0ebb8dd1f5c0c9037e38f8e4b237beb77b
Gecko: fec90cbfbaad

Filesystem               Size     Used     Free   Blksize
/system                354.2M   231.6M   122.6M   4096
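
The growth in /system usage across the window can be worked out from the Used column of the df readings above. A small sketch, assuming the values are taken as printed (in M); used_delta is an illustrative helper name.

```shell
#!/bin/sh
# Difference between two df "Used" values, printed to one decimal place.
used_delta() {
    awk -v a="$1" -v b="$2" 'BEGIN { printf "%.1f\n", b - a }'
}

used_delta 231.6 239.4   # Last Working -> First Broken: prints 7.8
used_delta 231.6 236.7   # Last Working -> Last Working gaia / First Broken gecko: prints 5.1
```

On these figures, the Gecko swap alone accounts for 5.1M of the 7.8M increase.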

Gaia Pushlog: https://github.com/mozilla-b2g/gaia/compare/e768af6558957ddb0f6a9ce579ea41c3e3d0b203...c8cb0c0ebb8dd1f5c0c9037e38f8e4b237beb77b

Gecko Pushlog: http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=fec90cbfbaad&tochange=388f5861dc7d
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
NI for Naoki to take a look.
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(pbylenga) → needinfo?(nhirata.bugzilla)
Basically, I did my initial analysis. I'm not 100% sure where to go from here. We need a dev to look into this. Gregor, do you know who deals with updates? I don't think it's a partition size issue any more. I could be mistaken...
Flags: needinfo?(nhirata.bugzilla) → needinfo?(anygregor)
Let's see if Alex has some time, or Dave once he is back.
Flags: needinfo?(anygregor)
Spoke with Alexandre. We didn't change the partition size:
https://github.com/mozilla-b2g/device-flame/blob/master/BoardConfig.mk#L9
See the proper size listed in: https://bugzilla.mozilla.org/show_bug.cgi?id=1085230#c12

When I repacked the camera, I used the partition size mentioned in the code not the size in the bug.
Alexandre is writing a bug regarding the partition size in the code; I'll have to go home and fix the build for the partition size and then resubmit to T2M. I will try to take care of this Sunday.
Flags: needinfo?(nhirata.bugzilla)
Internal test build : https://drive.google.com/a/mozilla.com/file/d/0B_0LdM1CVycIZEdOZ2ZFTzlFc0k/view?usp=sharing

Once this passes internal QA, we'll pass this to T2M to host again.

The difference is that the system partition was enlarged from 390 MB to 419 MB.
Is this still an issue?
blocking-b2g: 2.5? → 2.5+
http://cds.w5v8t3u9.hwcdn.net/v18D_nightly_v4.zip

MDN will be updated soon. Chris Mills has been informed.

SHA512(./v18D_nightly_v4.zip)= 9105e29fd39da1ae487b01da4431a803d619d31482147b4383002b8a10268905fd444b108a438395a78d289cfe4e8fba10c3fb6b0d187f3535f027bf90c2391a
build id : 20150527010201
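
Before flashing, the downloaded archive can be checked against the SHA512 value above. A minimal sketch, assuming sha512sum from GNU coreutils is installed (openssl dgst -sha512 or shasum -a 512 should give the same digest on other systems); verify_sha512 is a hypothetical helper name.

```shell
#!/bin/sh
# Compare a file's SHA-512 digest with an expected hex string.
# Returns 0 on match, 1 on mismatch, 2 if the file cannot be read.
verify_sha512() {
    file=$1
    # Accept upper- or lower-case hex in the expected value.
    expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
    [ -r "$file" ] || { echo "cannot read $file" >&2; return 2; }
    # sha512sum prints "<digest>  <name>"; keep only the digest.
    actual=$(sha512sum "$file" | cut -d ' ' -f 1)
    if [ "$actual" = "$expected" ]; then
        echo "OK: $file"
    else
        echo "MISMATCH: $file" >&2
        return 1
    fi
}
```

It would be called as verify_sha512 ./v18D_nightly_v4.zip followed by the digest quoted above.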

Everyone will need to update with this version. Data will be lost when doing a full flash.
The important parts to flash are the system.img and the partition.img.

If you are already on 18D, then it's possible to get by with:
fastboot flash boot boot.img
fastboot flash system system.img
fastboot flash partition gpt_both0_big.bin

At the same time I take no responsibility for the settings and weird conflicts you may get from having an old profile if you only flash the 3 parts.
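
The three fastboot commands above can be wrapped so that a failed step stops the sequence instead of carrying on to the next partition. This is only a sketch, assuming fastboot is on the PATH, the device is already in fastboot mode, and the image files are in the current directory. The function name flash_minimal and the DRY_RUN variable are illustrative.

```shell
#!/bin/sh
# Flash boot, system and the partition table in the order listed above.
# With DRY_RUN=1 the commands are only printed, not executed.
flash_minimal() {
    for pair in boot:boot.img system:system.img partition:gpt_both0_big.bin; do
        part=${pair%%:*}    # text before the colon: partition name
        image=${pair#*:}    # text after the colon: image file
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "fastboot flash $part $image"
        else
            [ -f "$image" ] || { echo "missing $image" >&2; return 1; }
            fastboot flash "$part" "$image" || return 1
        fi
    done
}
```

Running it once with DRY_RUN=1 shows exactly what would be flashed before anything is written to the device.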
Status: NEW → RESOLVED
Closed: 6 years ago
Flags: needinfo?(nhirata.bugzilla)
Resolution: --- → FIXED
I flashed the above build only. My data was lost of course (I expected this) but it worked fine. The Region, Date and Time settings in the FTU didn't show up though, so I had to set those afterwards.
So I backed up my profile, flashed v4, shallow flashed the nightly build from 2015-07-07 (because that's what I had flashed some weeks ago, and the profile backup/restore tool wouldn't let me restore a profile to a device with an earlier build than the one that was backed up), and finally restored my profile.

All seems well except my marketplace app is missing from my home screen. This only happens after I restore the profile, so that seems to be the cause. Any chance there's a way to get it back without having to give up my profile?
NVM my previous comment 13, it's easy to go to https://marketplace.firefox.com/app/marketplace/. I must have missed that before somehow.
Target Milestone: --- → FxOS-S4 (07Aug)
This issue is verified fixed on the latest Flame 2.5 builds.
The user is able to successfully run an OTA from build 20150729030209 to build 20150730030209

Environmental Variables:
Device: Flame 2.5
Build ID: 20150730030209
Gaia: bf8565e0c3ad216ccb3f109c17f8a2eb2c42f6b8
Gecko: 62469b20ec84
Gonk: 41d3e221039d1c4486fc13ff26793a7a39226423
Version: 42.0a1 (2.5)
Firmware Version: v18D
User Agent: Mozilla/5.0 (Mobile; rv:42.0) Gecko/42.0 Firefox/42.0
Status: RESOLVED → VERIFIED
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(pbylenga)
I can verify it's fixed; I did an OTA from yesterday's build to today's one (20150729030209 to 20150731030207).
Fixed by releasing a new base image with a new partition table (the system partition is now bigger).
Component: Gaia::System → General
Component: General → GonkIntegration
Duplicate of this bug: 1185831