[System][OTA] Unable to perform an OTA from a Shallow Flash

RESOLVED WONTFIX

Status

defect
RESOLVED WONTFIX
5 years ago
2 years ago

People

(Reporter: Marty, Unassigned)

Tracking

({regression})

unspecified
ARM
Gonk (Firefox OS)

Firefox Tracking Flags

(b2g-v2.1 affected, b2g-v2.2 affected)

Details

Attachments

(3 attachments)

Posted file log-OTA.txt
Description:
When trying to perform an OTA update from 2.1 to 2.1, the device downloads the update, uncompresses it for a while, then shows an error stating "There was an error while downloading the updates."

Note: The update downloaded and installed successfully when using a Full Flash.
Additionally, this occurs specifically on v188-1.  On v188, the OTA downloaded and installed successfully on a Shallow Flash.
   
Repro Steps:
1) Base a device to v188-1
2) Update a Flame device to BuildID: 20141103001220 (Shallow Flash)
3) Sign in to a WiFi network, and then search for updates in Settings
4) Download and install an update.
  
Actual:
The update downloads and uncompresses, but does not install, resulting in an error message.
  
Expected: 
The update downloads, uncompresses, and installs properly.
  
Environmental Variables:
Device: Flame 2.1
BuildID: 20141103001220
Gaia: 027a7de0c95320cea0579bfd1a4ceef3e9038f34
Gecko: ffecb2be228b
Version: 34.0 (2.1)
Firmware: V188-1
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0
  
Repro frequency: 7/7
Link to failed test case: https://moztrap.mozilla.org/manage/case/2313/
See attached: logcat

-------------------------------------------

This issue also occurs on Flame 2.2.
The update downloads and uncompresses, but does not install, resulting in an error message.

Environmental Variables:
Device: Flame 2.2 Master
BuildID: 20141103040202
Gaia: bc168c17474dabbcceaa349e9bc7c95654435aec
Gecko: 5999e92e89ff
Version: 36.0a1 (2.2 Master)
Firmware: V188-1
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0
Attaching a text doc of 'df' output comparing the storage usage on the phone immediately after basing the phone to v188 and v188-1 respectively.
===========================

'/system' usage comparison:
- v188-1
	267.3M
- v188
	251.3M
---------------
Diff of: 16.0M
Flags: needinfo?(pbylenga)
Looks to be a regression on the vendor size; we're getting a write error, which usually means the partition is out of space.
Component: Gaia::System → Vendcom
Flags: needinfo?(pbylenga)
Keywords: smoketest
You should try |while true; do adb shell df /system; sleep 0.5; done;| while performing the OTA. Chances are free space reaches 0 before the OTA can be fully unpacked.
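For anyone who wants timestamps and only the free-space column rather than raw df output, here is a rough Python sketch of the same polling idea. It assumes df prints rows in the "Size Used Free Blksize" layout with K/M/G suffixes, as seen elsewhere in this bug; `parse_df_row` and `watch_free_space` are hypothetical helper names, and `adb` must be on PATH with a device attached for the polling part.

```python
import re
import subprocess
import time

# Assumed row layout (as pasted in this bug):
#   /system                354.2M   267.3M    86.8M   4096
DF_ROW = re.compile(
    r"^(\S+)\s+([\d.]+)([KMG])\s+([\d.]+)([KMG])\s+([\d.]+)([KMG])\s+(\d+)\s*$"
)
UNIT_TO_MB = {"K": 1 / 1024, "M": 1.0, "G": 1024.0}


def parse_df_row(line):
    """Return (mount, size_mb, used_mb, free_mb), or None for non-data lines."""
    m = DF_ROW.match(line.strip())
    if m is None:
        return None
    size, used, free = (
        float(m.group(i)) * UNIT_TO_MB[m.group(i + 1)] for i in (2, 4, 6)
    )
    return m.group(1), size, used, free


def watch_free_space(mount="/system", interval=0.5):
    """Poll `adb shell df <mount>` and print the free space until interrupted."""
    while True:
        out = subprocess.run(
            ["adb", "shell", "df", mount], capture_output=True, text=True
        ).stdout
        for line in out.splitlines():
            row = parse_df_row(line)
            if row and row[0] == mount:
                print(f"{time.strftime('%H:%M:%S')} free={row[3]:.1f}M")
        time.sleep(interval)
```

Calling `watch_free_space()` right before starting the download should show whether free space hits 0 during the 'uncompressing' step. Rows without a unit suffix are not handled by this sketch.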
Flags: needinfo?(pbylenga)
Flags: needinfo?(mshuman)
Posted file OTA-Storage.txt
Attaching the output from that command here.
Looks like you are right: the free storage reaches 0 during the uncompressing step.  I see the free space steadily decrease as soon as the phone displays 'uncompressing', and the OTA fails right when it reaches 0.
Flags: needinfo?(pbylenga)
Flags: needinfo?(mshuman)
Then it's a base system image regression ...
Flags: needinfo?(asa)
Investigating the difference between v188 and v188-1 we got it down to /system/lib/modules/ath6kl-3.5/ adding 12M and /system/lib/modules/wlan.ko adding 4M.

CC'ing mwu.
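
As a quick sanity check, the two additions named above can be compared against the /system growth reported in the earlier 'df' comparison. This is only arithmetic on the rounded figures quoted in this bug; the variable names are illustrative.

```python
# /system usage in MB, from the 'df' comparison posted earlier in this bug.
system_used = {"v188-1": 267.3, "v188": 251.3}

# Additions identified in v188-1, in MB (rounded figures from this comment).
added = {
    "/system/lib/modules/ath6kl-3.5/": 12.0,
    "/system/lib/modules/wlan.ko": 4.0,
}

growth = round(system_used["v188-1"] - system_used["v188"], 1)
explained = sum(added.values())
unexplained = round(growth - explained, 1)

print(f"growth={growth}M explained={explained}M unexplained={unexplained}M")
```

With these numbers the two paths account for the whole difference between the base images, within rounding.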
We deleted just /system/lib/modules/ath6kl-3.5/ and it wasn't enough to do an OTA.

However, we also removed the /system/media/bootanimation.zip (10M) and then we successfully could do an OTA after shallow flashing on top of v188-1.
ni? Wesly and viralwang to recommend whether T2M can help scale down the v188-1 userdebug images so we have space to apply updates. Can we trim some of the areas mentioned in comment 7?

Alexandre to comment on whether these areas are okay to remove for OTA updates, and on any other areas that could be removed without affecting the user.
Flags: needinfo?(wehuang)
Flags: needinfo?(vwang)
Flags: needinfo?(lissyx+mozillians)
(In reply to Peter Bylenga [:PBylenga] from comment #6)
> Investigating the difference between v188 and v188-1 we got it down to
> /system/lib/modules/ath6kl-3.5/ adding 12M and /system/lib/modules/wlan.ko

Bravo, you've just broken your WiFi :).

You can wipe ath6kl modules, as far as I can tell those are for a chip not used in Flame. But wlan.ko is a symlink to the actual driver, and this is used to load the driver.

> adding 4M.
> 
> CC'ing mwu.

(In reply to Peter Bylenga [:PBylenga] from comment #7)
> 
> However, we also removed the /system/media/bootanimation.zip (10M) and then
> we successfully could do an OTA after shallow flashing on top of v188-1.

Which means you will lose the Firefox moving boot animation during Gecko startup.

This is not the proper way to get some space. When we do our own builds, we have much more space available. T2M should make sure they produce images that we can use for OTA.

FYI, a FOTA package for Gecko+Gaia, with VARIANT=userdebug, all keyboards and all locales enabled on my Nexus S now reaches 102MB.
Flags: needinfo?(lissyx+mozillians)
(In reply to Alexandre LISSY :gerard-majax from comment #9)
> (In reply to Peter Bylenga [:PBylenga] from comment #6)
> > Investigating the difference between v188 and v188-1 we got it down to
> > /system/lib/modules/ath6kl-3.5/ adding 12M and /system/lib/modules/wlan.ko
> 
> Bravo, you've just broken your WiFi :).

I didn't delete wlan.ko; that's why we ended up having to remove bootanimation.zip to make up for it.  I thought about replacing wlan.ko with the one from v188 but didn't think it would still be compatible.

> You can wipe ath6kl modules, as far as I can tell those are for a chip not
> used in Flame. But wlan.ko is a symlink to the actual driver, and this is
> used to load the driver.
> 
> > adding 4M.
> > 
> > CC'ing mwu.
> 
> (In reply to Peter Bylenga [:PBylenga] from comment #7)
> > 
> > However, we also removed the /system/media/bootanimation.zip (10M) and then
> > we successfully could do an OTA after shallow flashing on top of v188-1.
> 
> Which means you will lose the Firefox moving boot animation during Gecko
> startup.
> 
> This is not the proper way to get some space. When we do our own builds, we
> have much more space available. T2M should make sure they produce images
> that we can use for OTA.
> 
> FYI, a FOTA package for Gecko+Gaia, with VARIANT=userdebug, all keyboards
> and all locales enabled on my Nexus S now reaches 102MB.

If this isn't a good test environment, we'll wait for a T2M update before moving smoke testing to v188-1.
QA Whiteboard: [QAnalyst-Triage+]
Is there any special reason that we must have OTA based on "v188-1", which is userdebug SW? My understanding is that when Flame was set up, it was decided to focus on the user build (the one to be released on MDN), so we moved some features from userdebug to it in order to facilitate all the development work, and then never touched the userdebug SW. It was only recently that a special request came from TPE QA (to have ADB on by default), so we got v188-1 as userdebug SW only for that request.

My idea is that if this is also an issue in the user build, then we follow up with T2M for a correction, but we do not touch or change any difference between the user and userdebug SW, so that we keep the focus on the SW to be released and worked on (that's the user variant).

What do you think?
Flags: needinfo?(wehuang)
From comment 1, we can see the free space of /system is about 86.8MB. We need more space to get OTA installed properly.
======
V188-1
======
Filesystem               Size     Used     Free   Blksize
/system                354.2M   267.3M    86.8M   4096

Wesly, can we discuss with T2M whether it is possible to increase the total size of the /system partition? Otherwise similar issues will never end.
Flags: needinfo?(vwang) → needinfo?(wehuang)
Hi KaiZhen, since v188-1 is userdebug SW that is only intended to be used for QA testing (as mentioned in comment #11), I would suggest not focusing on it. Or do you see this space issue also happening in v188, which is user SW?
Flags: needinfo?(wehuang)
(In reply to Wesly Huang from comment #13)
> Hi KaiZhen, since v188-1 is userdebug SW which is only tend to be used by
> QA's test (like mentioned in comment#11) so I would suggest not to focus on
> it. Or do you see this space issue is also happen in v188, which is user SW?

On the v188 user build, the /system free space is sufficient for the current OTA package size. But if we want to feed it a bigger OTA package, there is a chance of hitting this error, as the free space is close to the boundary.
This issue also applies to OTA updates with v188 (so I can't update).

/system                354.2M   267.9M    86.2M   4096
Following up with the above comment:
We are unable to perform OTA updates with base v188 on 2.2 KK Master builds on flame devices.

Repro Steps:
0) Flash Base Image v188
1) Shallow Flash Flame Device to 20141114040205
2) Enable Wi-Fi.
3) Search for 'Daily Updates'.
4) Begin downloading the system update.
5) After download completes observe 'Uncompressing' state in Status bar.
6) Observe Phone UI while phone attempts 'Uncompressing'

Actual: 'There was an error while downloading the updates' appears. Phone does not perform system update. A re-base was required before another shallow flash could be performed.

Expected: Phone polls user after 'Uncompressing' completes asking if they would like to restart phone now. Phone performs update and is shown on new build after reset finishes.

Repro Rate: 6/6

***************************************
Environmental Variables:
----------------------------------------------

Device: Flame 2.2
BuildID: 20141114040205
Gaia: 1e300eac2e56d98ad51d414766d031db7d33221f
Gecko: bbb68df450c2
Gonk: 
Version: 36.0a1 (2.2)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:36.0) Gecko/36.0 Firefox/36.0

OTA Download Size: 69.60 MB
======================================

Issue DOES NOT REPRO on 2.1 build flame devices.
Results: Phone polls user after 'Uncompressing' completes asking if they would like to restart phone now. Phone performs update and is shown on new build after reset finishes.

Device: Flame 2.1
BuildID: 20141114001204
Gaia: af6533781356acc62b0f40c9e040aa5b47d3b709
Gecko: 551326425826
Gonk: 
Version: 34.0 (2.1)
Firmware: V188
User Agent: Mozilla/5.0 (Mobile; rv:34.0) Gecko/34.0 Firefox/34.0

OTA Download Size: 68.06 MB
----------------------------------------------
***************************************
Flashing 20141117040203 using the Flash_GG script, we are provided with the following System storage:
Filesystem               Size     Used     Free   Blksize
/system                354.2M   267.9M    86.2M   4096
While uncompressing the OTA package, the Free memory decreases all the way to 0, and the OTA errors out.

Flashing 20141117040203 using the B2G flash tool, we are provided with this System storage:
Filesystem               Size     Used     Free   Blksize
/system                354.2M   249.4M   104.8M   4096
The OTA package is able to successfully finish uncompressing, leaving the user with the following System storage:
Filesystem               Size     Used     Free   Blksize
/system                354.2M   337.0M    17.2M   4096
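
To make the comparison above explicit, here is a small worked calculation in Python using only the df figures quoted in this comment. It assumes the space consumed while uncompressing is the same whichever flash method was used, which has not been verified; `ota_footprint` and `fits` are illustrative names.

```python
# df figures (MB) quoted in this comment for build 20141117040203.
FREE_AFTER_FLASH = {"Flash_GG script": 86.2, "B2G flash tool": 104.8}
USED_BEFORE_OTA = 249.4  # B2G flash tool, before uncompressing
USED_AFTER_OTA = 337.0   # B2G flash tool, after uncompressing

# Space consumed on /system while uncompressing, inferred from the one
# run that completed successfully.
ota_footprint = round(USED_AFTER_OTA - USED_BEFORE_OTA, 1)


def fits(free_mb, needed_mb=ota_footprint):
    """True if the free space covers the inferred OTA footprint."""
    return free_mb >= needed_mb


for method, free in FREE_AFTER_FLASH.items():
    print(f"{method}: free={free}M needed={ota_footprint}M fits={fits(free)}")
```

Under that assumption the uncompressed package needs about 87.6M, so the 86.2M left by the Flash_GG script falls short by roughly 1.4M, while the 104.8M left by the B2G flash tool is enough.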
QA Whiteboard: [QAnalyst-Triage+] → [QAnalyst-Triage?]
Flags: needinfo?(pbylenga)
Duplicate of this bug: 1101294
After further inspection of the storage within the system directory of the phone (after performing the 2 flash methods mentioned in comment 17), I've observed:
- Flash_Gg.sh leaves 18.7 MB of data within the system directory under "media" => (/system/media)
- Removing that data leaves us at 249.0M used on the phone.
- The OTA performed correctly once the ./media data was removed from the phone.

I'll edit our script and ensure that the change causes no other unintended consequences. I'm curious now whether the Base Image is still maintaining old media data for its ringtones, alarms and other assets, which the flash_pvt.py script circumvents by pulling these assets from a separate directory.
QA Whiteboard: [QAnalyst-Triage?] → [QAnalyst-Triage+]
Flags: needinfo?(pbylenga)
FYI I've also run into this yesterday. Meh :-(
FYI, for some devices, you have to reboot the device in order for the disk space to be accurate...
I'm very sorry for my question, but since I'm using Windows I won't be able to flash base image v188 and the latest nightly 2.2 build again.

I have a Flame device with base image v188 and 2.2 with the latest but one OTA update. If I download today's OTA update, will I have this issue?

Many thanks!
> root@flame:/ # df /system
> Filesystem               Size     Used     Free   Blksize
> /system                412.5M   226.0M   186.5M   4096
(In reply to Naoki Hirata :nhirata (please use needinfo instead of cc) from comment #24)
> dup of bug 1085230?

This has not been made a duplicate yet. Should it be made one to cut down on the number of similar bugs to explore?
As part of Bug 1051146 we will be introducing large (about 21M) language/acoustic models for speech recognition. We are wondering how this will affect OTA and what we can do to minimize this problem.

Note: In the future we may not include such models but may have speech recognition done, as it is on Android and iOS, by sending the audio to servers, having the speech-to-text done on the servers, and having the text result returned to the device. But, in KISS tradition, we have a simple initial implementation that has as few moving parts as possible.
See Also: → 1181372
Is this bug fixed now?
Flags: needinfo?(asa)
Firefox OS is not being worked on
Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WONTFIX