app uninstallation strands profiles, causing other apps to stop running

RESOLVED FIXED in Firefox 30

Status

()

defect
P1
critical
RESOLVED FIXED
5 years ago
5 years ago

People

(Reporter: myk, Assigned: myk)

Tracking

31 Branch
Firefox 32
ARM
Android
Points:
---

Firefox Tracking Flags

(firefox30+ fixed, firefox31+ fixed, firefox32 fixed)

Details

(Whiteboard: [WebRuntime])

Attachments

(1 attachment)

Assignee

Description

5 years ago
On Nightly, all the apps I previously installed, including Twitter, NYTimes, and HuffPost, started failing to run this morning, hanging at the launch screen.

Apps I install using today's build, like 2048, Drag the Dots, and Mozzle, on the other hand, run as expected.  Logs don't show anything obvious.
Can reproduce on nightly. 

Logs:
D/GeckoScreenOrientation(31006): updating to new orientation PORTRAIT_PRIMARY
D/GeckoSharedPrefs(31006): Already in background thread, migrating directly
D/GeckoSharedPrefs(31006): Current version = 1, prefs version = 1
D/GeckoLocales(31006): Calling setContextGetter: org.mozilla.fennec.WebApps$WebApp14@42a69fe0
D/GeckoSessInfo(31006): Building SessionInformation from prefs: 1396530176580, 0, false, false
D/GeckoProfile(31006): Found profile dir.
I/GeckoApp(31006): Creating HealthRecorder.
D/GeckoApp(31006): OS locale is en_GB, app locale is null
I/GeckoWebappImpl(31006): Trying to launch: {"url":"https:\/\/marketplace.firefox.com\/app\/a6fb966c-a1cc-4372-a5a5-17751b72d375\/manifest.webapp?feature_profile=755aa9fe6e68.47.4"}
D/GeckoSessInfo(31006): Recording start of session: 1396530199256
W/GeckoGLController(31006): GLController::serverSurfaceChanged(1080, 1701)
I/GeckoAxis(31006): Prefs: 0.85,0.97,10.0,0.1,0.04,0.3,0.5
W/GeckoGLController(31006): GLController::updateCompositor with mCompositorCreated=false
D/GeckoHardwareUtils(31006): System memory: 1855MB.
D/GeckoLoader(31006): Gecko environment env0: null
W/GeckoGLController(31006): done GLController::updateCompositor
E/GeckoLibLoad(31006): Load sqlite start
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libnss3.so: Warning: unhandled flags #8 not handled
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libnss3.so: Warning: relocation to NULL @0x0018fc24
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libnss3.so: Warning: relocation to NULL @0x0018ffdc for symbol "__cxa_begin_cleanup"
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libnss3.so: Warning: relocation to NULL @0x0018ffe0 for symbol "__cxa_type_match"
E/GeckoLibLoad(31006): Load sqlite done
E/GeckoLibLoad(31006): Load nss start
E/GeckoLibLoad(31006): Load nss done
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libxul.so: Warning: unhandled flags #8 not handled
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libmozalloc.so: Warning: unhandled flags #8 not handled
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libmozalloc.so: Warning: relocation to NULL @0x00003f68
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libmozalloc.so: Warning: relocation to NULL @0x00003fdc for symbol "__cxa_begin_cleanup"
E/GeckoLinker(31006): /data/app/org.mozilla.fennec-1.apk!/assets/libmozalloc.so: Warning: relocation to NULL @0x00003fe0 for symbol "__cxa_type_match"
D/GeckoMemoryMonitor(24609): onTrimMemory() notification received with level 20
I/GeckoAnnounce(24609): fennec :: AnnounceBrSvc :: Registering AnnouncementsService.
I/GeckoHealth(24609): fennec :: HealthReportBroadcastService :: Registering HealthReportUploadService.
I/GeckoAnnounce(24609): fennec :: BackgroundService :: Setting inexact repeating alarm for interval 43200000
I/GeckoHealth(24609): fennec :: BackgroundService :: Setting inexact repeating alarm for interval 3600000
E/GeckoLibLoad(31006): Loaded libs in 120ms total, 80ms(100ms) user, 20ms(20ms) system, 0(0) faults
W/GeckoThread(31006): zerdatime 98420026 - runGecko
I/GeckoThread(31006): RunGecko - args =  -P webapp14
D/GeckoAppShell(31006): GeckoLoader.nativeRun /data/app/org.mozilla.fennec-1.apk -greomni /data/app/org.mozilla.fennec-1.apk  -P webapp14 -url https://marketplace.firefox.com/app/a6fb966c-a1cc-4372-a5a5-17751b72d375/manifest.webapp?feature_profile=755aa9fe6e68.47.4 -width 1080 -height 1776
E/Profiler(31006): BEGIN mozilla_sampler_init
E/Profiler(31006): BPUnw: [1 total] thread_register_for_profiling(me=0x7714bce8, stacktop=0x77aae972)
E/Profiler(31006): SPS:
E/Profiler(31006): SPS: Unwind mode       = combined
E/Profiler(31006): SPS: Sampling interval = 0 ms (zero means "platform default")
E/Profiler(31006): SPS: Entry store size  = 0 (zero means "platform default")
E/Profiler(31006): SPS: UnwindStackScan   = 0 (max dubious frames per unwind).
E/Profiler(31006): SPS: Use env var MOZ_PROFILER_MODE=help for further information.
E/Profiler(31006): SPS:
E/Profiler(31006): Registering start signal
E/Profiler(31006): BPUnw: [2 total] thread_register_for_profiling(me=0x7531db50, stacktop=0x81924cbf)
I/Gecko:DumpUtils(31006): Fifo watcher disabled via pref.
E/GeckoConsole(31006): Could not read chrome manifest 'file:///data/data/org.mozilla.fennec/chrome.manifest'.
E/Profiler(31006): BPUnw: [3 total] thread_register_for_profiling(me=0x770c0dd8, stacktop=0x854fdcbf)
E/Profiler(31006): BPUnw: [4 total] thread_register_for_profiling(me=0x770c1020, stacktop=0x855fbcbf)



Also while trying to reproduce, a crash of fennec:

D/GeckoLocales(30391): Calling setContextGetter: org.mozilla.fennec.WebApps$WebApp14@42a62d48
D/GeckoSessInfo(30391): Building SessionInformation from prefs: 1396530144559, 0, true, false
D/GeckoAppShell(30391): Notification client already set
W/GeckoNotificationManager(30391): NotificationHelper.init() called twice!
I/GeckoApp(30391): Creating HealthRecorder.
D/GeckoApp(30391): OS locale is en_GB, app locale is null
I/GeckoWebappImpl(30391): Trying to launch: {"url":"https:\/\/marketplace.firefox.com\/app\/a6fb966c-a1cc-4372-a5a5-17751b72d375\/manifest.webapp?feature_profile=755aa9fe6e68.47.4"}
W/GeckoBatteryManager(30391): Already started!
D/GeckoSessInfo(30391): Recording start of session: 1396530176580
W/GeckoGLController(30391): GLController::serverSurfaceChanged(1080, 1701)
D/GeckoLayerClient(30391): Screen-size changed to (1080,1776)
D/GeckoLayerClient(30391): Window-size changed to (1080,1701)
W/GeckoGLController(30391): GLController::updateCompositor with mCompositorCreated=false
D/GeckoMemoryMonitor(24609): onTrimMemory() notification received with level 20
I/GeckoAnnounce(24609): fennec :: AnnounceBrSvc :: Registering AnnouncementsService.
I/GeckoAnnounce(24609): fennec :: BackgroundService :: Setting inexact repeating alarm for interval 43200000
I/GeckoHealth(24609): fennec :: HealthReportBroadcastService :: Registering HealthReportUploadService.
I/GeckoHealth(24609): fennec :: BackgroundService :: Setting inexact repeating alarm for interval 3600000
F/libc    (30391): Fatal signal 11 (SIGSEGV) at 0x00000000 (code=1), thread 30416 (Gecko)
I/DEBUG   (  179): pid: 30391, tid: 30416, name: Gecko  >>> org.mozilla.fennec:org.mozilla.fennec.Webapp14 <<<
I'm finding it very difficult to reproduce using apks I've installed today.
Assignee

Comment 3

5 years ago
(In reply to James Hugman [:jhugman] [@jhugman] from comment #2)
> I'm finding it very difficult to reproduce using apks I've installed today.

Right!  Per the summary and description, this only happens with apps that were already installed.  It doesn't happen with apps that you newly install.
Assignee

Comment 4

5 years ago
I backed up the data for my Nightly app and have been examining it for clues to the problem.  Unfortunately so far I haven't found anything.  Both a working and a broken app have the expected entries in the webapps.xml file that stores SharedPreferences, and they both have the expected entries in webapps.json.  They also both have profile directories, and while there are some differences between the profile directories, they seem insignificant.
Severity: blocker → critical
Assignee

Updated

5 years ago
Assignee: nobody → myk
Assignee

Comment 5

5 years ago
It looks like the problem here is a stranded profile due to profiles.ini corruption.

profiles.ini includes a section for each profile, and the sections are named ProfileN, where N is an integer number.  nsToolkitService expects N to increment contiguously, starting from zero, so it stops parsing profiles.ini as soon as it reaches a value of N for which there isn't a section.

When this problem occurs, profiles.ini contains discontiguous ProfileN sections.  For example, one of the test installations in which I'm experiencing this problem has these values for N: 0, 1, 6, 13, 14, and 15.  If an app's ProfileN section is in a discontiguous range, that profile is stranded, and the app can't be started.

There are two ways I know that an app's profile can be removed, UninstallListener.doUninstall and EventListener.uninstallWebapp, and they both call GeckoProfile.removeProfile, which renames sections to maintain contiguity, so I have no idea how the sections can become discontiguous.

Nor do I know what to do to fix the problem after it has occurred.  The values of N are arbitrary and bear no relationship to webapp slot numbers (nor the numbers in webapp profile names, which are the slot numbers), which are 0, 12, 13, 14, and 27 for the N values 1, 13, 14, 15, and 6 in the aforementioned test installation (the sixth profile, with the N value 0, is the "default" browser profile).

So I could implement an INI parser that checks profiles.ini for this problem and fixes it if it's present.  But would we want to check that every time a webapp process is started?

In any case, the next step is to figure out steps to reproduce the problem, for which the obvious culprit is one or the other of the uninstall paths, although it's possible that something entirely different is responsible for corrupting profiles.ini.
Assignee

Comment 6

5 years ago
This turns out to be a bug in INIParser.renameSection, which changes its hash key for the section but doesn't actually change the INISection name, which is what actually gets written back to the profiles.ini file.

So the loop in GeckoProfile.remove that renames sections with indices greater than the one being removed ends up failing to rename the sections, which leaves a "hole" in the sequence of section indices.

And that strands all profiles identified by those sections, since nsToolkitProfileService stops reading sections from the file as soon as it reaches an index for which there isn't a section:

https://github.com/mozilla/gecko-dev/blob/89db1286416bdae14da02c19929f18d69c547c70/toolkit/profile/nsToolkitProfileService.cpp#L424-L430

The fix is for INIParser.renameSection to change both the hash key and the INISection name, so the change is reflected in the profiles.ini file when the hash of INISection objects is serialized back to disk.

This patch also breaks out of the enumeration loop once the right section has been identified, so GeckoProfile.remove doesn't continue enumerating sections uselessly (and possibly removing additional sections, if mutating a HashTable during its enumeration can cause the loop to revisit keys).

Finally, it makes GeckoProfile.remove call parser.write() after removing a section whose name doesn't begin with "Profile", which is presumably what the code intended to do.  (Although it's bizarre that it won't do that if the name begins with "Profile" and continues with non-digits, which means that it'll happily remove a section named "Invalid" but will report an error on a section named "ProfileValid"!)
Attachment #8425804 - Flags: review?(mark.finkle)
Assignee

Updated

5 years ago
Summary: previously-installed apps stop running → app uninstallation strands profiles, causing other apps to stop running
Assignee

Comment 7

5 years ago
In addition to fixing the bug that causes the profiles.ini corruption, we might want to do one or two additional things:

1. Ensure the profile has a valid entry in profiles.ini before starting Gecko with that profile.

Currently:

  * WebappImpl.getDefaultProfileName returns |"webapp" + getIndex()| (getIndex() is the slot number);
  * GeckoApp.onCreate passes that value to GeckoProfile.get, which returns a GeckoProfile object
    with that name;
  * GeckoThread.addCustomProfileArg uses it for the -P flag it passes to GeckoAppShell.runGecko,
    retrieving it via GeckoAppShell.getGeckoInterface().getProfile().getName().

At no point does any of this code check to see if the name is valid, i.e. if it's possible to find a profile with that name in profiles.ini.  If we added such a check, then we'd be able to report the problem (which would've made finding this bug so much easier!) or (even better) create a new profile with that name so the app will still run instead of hanging or dying.

Arguably, nsToolkitProfileService or some other code in Gecko should also be throwing an error instead of silently failing.  But it seems worth doing it on the Java side of the app as well (unless it's too expensive in the startup path).

2. Fix discontiguous sections in profiles.ini.

This one's a harder sell, as such sections should never occur.  And presumably the only ones in existence are those created by this bug.  But perhaps it's worth doing a background thread check for such corruption and fixing it if present.  I dunno.
Assignee

Updated

5 years ago
Comment on attachment 8425804 [details] [diff] [review]
actually rename sections when removing profile

Thanks for the detailed explanation.
Attachment #8425804 - Flags: review?(mark.finkle) → review+
(In reply to Myk Melez [:myk] [@mykmelez] from comment #7)
> In addition to fixing the bug that causes the profiles.ini corruption, we
> might want to do one or two additional things:
> 
> 1. Ensure the profile has a valid entry in profiles.ini before starting
> Gecko with that profile.

> 2. Fix discontiguous sections in profiles.ini.

Since you went through the analysis, could you file bugs for these? I think we could find someone to at least work on #1. Your idea for a background cleanup in #2 is a good idea too.
Assignee

Comment 10

5 years ago
(In reply to Mark Finkle (:mfinkle) from comment #9)
> Since you went through the analysis, could you file bugs for these? I think
> we could find someone to at least work on #1. Your idea for a background
> cleanup in #2 is a good idea too.

Ok, filed as bug 1014018 (ensure profile has valid entry) and bug 1014027 (check for and correct discontiguous sections).
https://hg.mozilla.org/mozilla-central/rev/b919c7e64d39
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → Firefox 32
Assignee

Comment 13

5 years ago
Comment on attachment 8425804 [details] [diff] [review]
actually rename sections when removing profile

[Approval Request Comment]
Bug caused by (feature/regressing bug #): 
  Bug 740586.

User impact if declined: 
  When a user uninstalls an app, all apps they installed after that one
  will stop working.

Testing completed (on m-c, etc.): 
  I tested locally, and this landed on Central today.

Risk to taking this patch (and alternatives if risky): 
  The patch is a low-risk, minimal fix.

String or IDL/UUID changes made by this patch:
  None.
Attachment #8425804 - Flags: approval-mozilla-beta?
Attachment #8425804 - Flags: approval-mozilla-aurora?
Tracking this, though it's been a regression in product since Firefox 16, because I know we are putting more work and attention into marketplace apps and this would be a very unfortunate user experience.  Given the low risk, will take uplift.
Attachment #8425804 - Flags: approval-mozilla-beta?
Attachment #8425804 - Flags: approval-mozilla-beta+
Attachment #8425804 - Flags: approval-mozilla-aurora?
Attachment #8425804 - Flags: approval-mozilla-aurora+
You need to log in before you can comment on or make changes to this bug.