Crash in java.lang.IllegalArgumentException: at org.mozilla.gecko.GeckoProfile.<init>(GeckoProfile.java)

VERIFIED FIXED in Firefox 64

Status

()

P1
critical
VERIFIED FIXED
6 months ago
3 months ago

People

(Reporter: gsvelto, Assigned: petru)

Tracking

({crash})

unspecified
Firefox 66
Unspecified
Android
crash
Points:
---

Firefox Tracking Flags

(firefox64blocking verified, firefox65blocking verified, firefox66+ verified)

Details

(crash signature)

Attachments

(1 attachment)

(Reporter)

Description

6 months ago
This bug was filed from the Socorro interface and is
report bp-4803cf0a-564b-45a6-be07-f1a370181007.
=============================================================

Java stack trace:

java.lang.IllegalArgumentException
	at org.mozilla.gecko.GeckoProfile.&lt;init&gt;(GeckoProfile.java:315)
	at org.mozilla.gecko.GeckoProfile.get(GeckoProfile.java:236)
	at org.mozilla.gecko.GeckoProfile.get(GeckoProfile.java:184)
	at org.mozilla.gecko.db.PerProfileDatabases.getDatabasePathForProfile(PerProfileDatabases.java:52)
	at org.mozilla.gecko.db.PerProfileDatabases.getDatabaseHelperForProfile(PerProfileDatabases.java:75)
	at org.mozilla.gecko.db.AbstractPerProfileDatabaseProvider.getWritableDatabase(AbstractPerProfileDatabaseProvider.java:61)
	at org.mozilla.gecko.db.AbstractTransactionalProvider.delete(AbstractTransactionalProvider.java:175)
	at android.content.ContentProvider$Transport.delete(ContentProvider.java:341)
	at android.content.ContentResolver.delete(ContentResolver.java:1684)
	at org.mozilla.gecko.fxa.receivers.FxAccountDeletedService.clearRemoteDevicesList(FxAccountDeletedService.java:187)
	at org.mozilla.gecko.fxa.receivers.FxAccountDeletedService.onHandleWork(FxAccountDeletedService.java:74)
	at android.support.v4.app.JobIntentService$CommandProcessor.doInBackground(JobIntentService.java:391)
	at android.support.v4.app.JobIntentService$CommandProcessor.doInBackground(JobIntentService.java:382)
	at android.os.AsyncTask$2.call(AsyncTask.java:333)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
	at java.lang.Thread.run(Thread.java:764)
We must have uplifted something for Fennec beta 15 that made this crash start showing up in 64.
Bug 1504780 maybe?

The exception message is "Custom profile must have a directory".
Flags: needinfo?(vlad.baicu)
status-firefox64: --- → affected
status-firefox65: --- → affected
tracking-firefox64: --- → +
I'm going to treat this as blocking for fennec 64 until we at least investigate what's going on.  Should we just backout 1504780, or would that also be bad?

Susheel fyi.
tracking-firefox64: + → blocking
Flags: needinfo?(sdaswani)
Priority: -- → P1
Petru, any ideas here?
Flags: needinfo?(petru.lingurar)
This indeed looks like a regression. Removing my NI since Liz already pinged Petru.
Flags: needinfo?(vlad.baicu)

Comment 6

4 months ago
Removing my NI but Petru should work on this ASAP.
Flags: needinfo?(sdaswani)
(Assignee)

Updated

4 months ago
Assignee: nobody → petru.lingurar
Status: NEW → ASSIGNED
Flags: needinfo?(petru.lingurar)
(Assignee)

Comment 7

4 months ago
The patch from bug 1504780 refers to the CrashHandler (app started, ran for a while, an exception occured, CrashHandler started to allow reporting about the exception)
while this crash mainly happens when the app and Gecko starts. 
As such I don't think that patch, on it's own, could have caused this.

There are though many new crashes after the 20181128100103 Nightly and also in the 64.9b15 with this Java Stack Trace
> java.lang.IllegalArgumentException: Custom profile must have a directory
>	at org.mozilla.gecko.GeckoProfile.<init>(GeckoProfile.java:317)
>	at org.mozilla.gecko.GeckoProfile.get(GeckoProfile.java:238)
>	at org.mozilla.gecko.GeckoProfile.get(GeckoProfile.java:198)
>	at org.mozilla.gecko.GeckoProfile.initFromArgs(GeckoProfile.java:162)
>	at org.mozilla.gecko.GeckoThread.getProfile(GeckoThread.java:347)
>	at org.mozilla.gecko.GeckoThread.getMainProcessArgs(GeckoThread.java:299)
>	at org.mozilla.gecko.GeckoThread.run(GeckoThread.java:447)
so many new crashes happening at app startup, 0 - 10 seconds.
(Assignee)

Updated

4 months ago
See Also: → bug 1259653

Comment 8

4 months ago
:davidb is aware of this and will start looking.
Flags: needinfo?(dbolter)
(Assignee)

Comment 9

4 months ago
Going by the comment here - https://bugzilla.mozilla.org/show_bug.cgi?id=1259653#c1
> Custom profiles are an edge use case (must be passed in via Intent arguments)

So I think we want to crash (we crash on purpose [1]) and so the recent spike in crashes related to the above stack trace should not be a release blocker. But would wait for somebody's opinion from the Gecko team.

[1] https://dxr.mozilla.org/mozilla-central/rev/c2593a3058afdfeaac5c990e18794ee8257afe99/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#317
One of the URLs that crashed is about:accounts?action=manage, and there is one crash comment that it happened when the person was disconnecting from Sync.
This was previously [@ java.lang.IllegalArgumentException: Custom profile must have a directory at android.database.DatabaseUtils.readExceptionFromParcel(DatabaseUtils.java)] - bug 1399973

There was a crash-stats change that redacts some information from the Java stack trace. bug 1496599 Any crash that appears in early October should be considered a suspect.
Crash Signature: [@ java.lang.IllegalArgumentException: at org.mozilla.gecko.GeckoProfile.<init>(GeckoProfile.java)] → [@ java.lang.IllegalArgumentException: at org.mozilla.gecko.GeckoProfile.<init>(GeckoProfile.java)] [@ java.lang.IllegalArgumentException: Custom profile must have a directory at android.database.DatabaseUtils.readExceptionFromParcel(DatabaseUtils.java)]
Duplicate of this bug: 1399973
(Assignee)

Comment 14

4 months ago
The above patch will log a warning instead of throwing when we have an empty profile name (don't know why this means a custom profile) with a null directory (custom profile must have a directory)
In this case we will fallback to use the currently active / default Gecko profile.
James any chance you can review?
Flags: needinfo?(snorp)
Is using the default profile safe when we have guest browsing and family browsing? Could this cause information leakage?
I think we likely just want to catch this exception in `FxAccountDeletedService` instead of modifying `GeckoProfile`. We could retry the failed `delete` call at a later time or just not care at all. I'm not sure which is more appropriate for FxA. Grisha?
Flags: needinfo?(snorp) → needinfo?(gkruglov)
Flags: needinfo?(dbolter)
(Assignee)

Comment 18

3 months ago
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #17)
> I think we likely just want to catch this exception in
> `FxAccountDeletedService` instead of modifying `GeckoProfile`. We could
> retry the failed `delete` call at a later time or just not care at all. I'm
> not sure which is more appropriate for FxA. Grisha?

While there are a few crashes because of `FxAccountDeletedService`, the recent spike in beta refers to crashes actually happening when Gecko starts - https://crash-stats.mozilla.com/report/index/08e408ea-0054-4c7b-a009-3fb410181210 (1 second uptime)
See Also: → bug 1511727

Comment 19

3 months ago
CC'ing Janet in case Grisha is not the right person to look at this.
Flags: needinfo?(jdragojevic)

Comment 20

3 months ago
removing NI from self, Grisha is the right person.
Grisha is on PTO, I've asked :nalexander to take a peek too, but I think it would not hurt to get the FxA team involved too.
Flags: needinfo?(nalexander)
Who would be the right person to implement one of the suggestions that James made in https://bugzilla.mozilla.org/show_bug.cgi?id=1497148#c17 ?
(In reply to James Willcox (:snorp) (jwillcox@mozilla.com) from comment #17)
> I think we likely just want to catch this exception in
> `FxAccountDeletedService` instead of modifying `GeckoProfile`. We could
> retry the failed `delete` call at a later time or just not care at all. I'm
> not sure which is more appropriate for FxA. Grisha?

My initial thought is two-fold:

1) yes, this particular exception should be handled by `FxAccountDeletedService`.  The exception is clearly bustage in how Gecko is exposing the error but the edge case should still be handled at the edge.

2) we've really changed the expectations for how Gecko and the ContentProviders interact.  The real issue is that something that's _purely background_, i.e., reading and writing ContentProviders, is now starting Gecko.  That's generally not what we have expected since Fennec was first built and probably violates subtle assumptions.

Now, Sync 1.5 on Android _does_ have some light profile management built in, but it's _always_ the default case -- i.e., I think there's a &profile="default" parameter hanging around everywhere.  So Gecko shouldn't be regressing that.

I'll take an NI to investigate how this startup is getting triggered later today.
Flags: needinfo?(gkruglov)
Flags: needinfo?(jdragojevic)
Comment hidden (typo)
(In reply to Nick Alexander :nalexander [he/him] from comment #24)
> (In reply to Petru-Mugurel Lingurar[:petru] from comment #7)
> > The patch from bug 1504780 refers to the CrashHandler (app started, ran for
> > a while, an exception occured, CrashHandler started to allow reporting about
> > the exception)
> > while this crash mainly happens when the app and Gecko starts. 
> > As such I don't think that patch, on it's own, could have caused this.
> > 
> > There are though many new crashes after the 20181128100103 Nightly and also
> > in the 64.9b15 with this Java Stack Trace
> > > java.lang.IllegalArgumentException: Custom profile must have a directory
> > >	at org.mozilla.gecko.GeckoProfile.<init>(GeckoProfile.java:317)
> > >	at org.mozilla.gecko.GeckoProfile.get(GeckoProfile.java:238)
> > >	at org.mozilla.gecko.GeckoProfile.get(GeckoProfile.java:198)
> > >	at org.mozilla.gecko.GeckoProfile.initFromArgs(GeckoProfile.java:162)
> > >	at org.mozilla.gecko.GeckoThread.getProfile(GeckoThread.java:347)
> > >	at org.mozilla.gecko.GeckoThread.getMainProcessArgs(GeckoThread.java:299)
> > >	at org.mozilla.gecko.GeckoThread.run(GeckoThread.java:447)
> > so many new crashes happening at app startup, 0 - 10 seconds.
> 
> This is helpful: it means we're confident the offending code is at
> https://hg.mozilla.org/releases/mozilla-beta/file/cfb1a45b322e/mobile/
> android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#l315.

Gah!  https://hg.mozilla.org/releases/mozilla-beta/file/cfb1a45b322e/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#l317 (not #l315).
Capturing some thoughts along the way:

At https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#216 we have

        // Null name? | Null dir? | Returned profile
        // ------------------------------------------
        //     Yes    |    Yes    | Active profile or default profile.
        //     No     |    Yes    | Profile with specified name at default dir.
        //     Yes    |    No     | Custom (anonymous) profile with specified dir.
        //     No     |    No     | Profile with specified name at specified dir.

        if (profileName == null && profileDir == null) {
            // If no profile info was passed in, look for the active profile or a default profile.
            final GeckoProfile profile = GeckoThread.getActiveProfile();
            if (profile != null) {
                return profile;
            }

            return GeckoProfile.initFromArgs(context, sIntentArgs);
        } else if (profileName == null) {
            // If only profile dir was passed in, use custom (anonymous) profile.
            profileName = CUSTOM_PROFILE;
        }

We crash later (in https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#238) at https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#317:

        if (profileName == null) {
            throw new IllegalArgumentException("Unable to create GeckoProfile for empty profile name.");
        } else if (CUSTOM_PROFILE.equals(profileName) && profileDir == null) {
            throw new IllegalArgumentException("Custom profile must have a directory");
        }

That means that we have profileName != null and that profileDir == null, which in turn means that we had to have profileName = CUSTOM_PROFILE and profileDir == null in the original call to GeckoProfile.get at https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#198.

Now, that function is:

    @RobocopTarget
    public static GeckoProfile get(Context context, String profileName, String profilePath) {
        File dir = null;
        if (!TextUtils.isEmpty(profilePath)) {
            dir = new File(profilePath);
            if (!dir.exists() || !dir.isDirectory()) {
                Log.w(LOGTAG, "requested profile directory missing: " + profilePath);
            }
        }
        return get(context, profileName, dir);
    }

Since profileDir _is_ null later on, that means that the given profilePath is the empty string "".  So profileName = "" and profilePath = "" at GeckoProfile.initFromArgs at https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoProfile.java#162.

Now in that function we have:

        String profileName = null;
        String profilePath = null;

        if (args != null && args.contains("-P")) {
            final Pattern p = Pattern.compile("(?:-P\\s*)(\\w*)(\\s*)");
            final Matcher m = p.matcher(args);
            if (m.find()) {
                profileName = m.group(1);
            }
        }

        if (args != null && args.contains("-profile")) {
            final Pattern p = Pattern.compile("(?:-profile\\s*)(\\S*)(\\s*)");
            final Matcher m = p.matcher(args);
            if (m.find()) {
                profilePath =  m.group(1);
            }
        }

        if (profileName == null && profilePath == null) {
            // Get the default profile for the Activity.
            return getDefaultProfile(context);
        }

        return GeckoProfile.get(context, profileName, profilePath);

So _either_ we have bad "-P" _and_ "-profile" arguments (and it's probably another bug that those accept _empty_ strings for profile names and paths, respectively) or we're failing to get a sensible default profile from `getDefaultProfile`.  But we know that we _don't_ enter that path because we crash at the final line (#162)!

So we really do have bad "-P" and "-profile" arguments.  I need to eat lunch but I'll try to understand how that can be; in the interim, perhaps snorp Just Knows.
The relevant call is:

            mInitInfo.profile = GeckoProfile.initFromArgs(context,
                    mInitInfo.extras.getString(EXTRA_ARGS, null));

at https://searchfox.org/mozilla-central/source/mobile/android/geckoview/src/main/java/org/mozilla/gecko/GeckoThread.java#347.  So that means we're somehow getting a bad string from `EXTRA_ARGS`.  Let's hunt that down.

The following usage seems relevant: it's passing through "args" from some other place.  

https://searchfox.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/BrowserApp.java#624

Otherwise, it's just the "mainline" flow:

https://searchfox.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/GeckoService.java#213

That leads into GeckoService.getIntentForAction(), which seems like it would only give the (name, path) = ("", "") combination if things are badly wrong _for a running profile_.  Otherwise, it's the custom tab service at

https://searchfox.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/customtabs/GeckoCustomTabsService.java#55

which again looks like it has to have a valid profile, or the push service at

https://searchfox.org/mozilla-central/source/mobile/android/base/java/org/mozilla/gecko/push/PushService.java#258

The latter is "playing back" data saved to disk... and it null-checks but _does not_ empty-string check.
OK, I have to think about something else for a while.  AFAICT, there's nothing tying any of these crashes to the FxAccountDeletedService in particular; it's all flowing through mainline code paths (that the FxA deleted service may trigger).

I think that the fix landed in Bug 1504780 probably didn't go far enough.  That fix tells us that we have "interestingly" formed profiles.ini files on devices in the wild.  The fix ignores those with missing Path= settings but doesn't really do the same thing in the like 7 or 10 other places where we parse INI files.  (I can't make this up.).  My guess is that we are seeing bad data on disk (somehow, someway) and we're playing whack-a-mole trying to do reasonable things with it.

If my understanding is correct, we're also racing Gecko startup against GeckoView (Fennec) startup because both are trying to read _and write_ profiles.ini.  That would be what I would expect to cause this type of corruption, especially because Bug 1504780 indicates issues with startup crashes -- exactly when you'd see races of this problem (and potentially see corruption due to racers).

Now, what to do.  I think #c17 isn't helpful: it's not the case that these crashes appear to be from one edge of the edge-case constellation.  That makes Petru's patch more attractive, but I'll wait for snorp to weigh in before reviewing.  I think the longer term fix is to consolidate yet more of this profile handling, which somehow is sprinkled between GV, Fennec, _and_ the CrashReporterActivity (and _Gecko_ native code itself).  'cuz other than actually bad arguments coming out of the INI parsing, I don't see how we can get the situations we're seeing.
snorp: thoughts on #c28?
Flags: needinfo?(snorp)
OK, I didn't realize we had crashes from "normal" startups as well. I concur, working around interesting profiles.ini may be the way to go.

GeckoView and Fennec should not be fighting/racing to do any work -- they both use GeckoRuntime and GeckoThread, etc.
Flags: needinfo?(snorp)

Comment 31

3 months ago
Pushed by rvandermeulen@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/c0b651621eb8
Log warning for custom profile with null dir and avoid throwing exception. r=geckoview-reviewers,snorp
status-firefox66: --- → affected
tracking-firefox65: --- → blocking
tracking-firefox66: --- → +
Comment on attachment 9029933 [details]
Bug 1497148 - Log warning for custom profile with null dir and avoid throwing exception.

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: None

User impact if declined: Continued startup crashes.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: No

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): The patch isn't trivial but it's also not large.  We're seeing bad data in the wild that we can't really explain; this is unlikely to change behaviour away from that bad data and with luck will let us work around the bad data.

String changes made/needed: none
Attachment #9029933 - Flags: approval-mozilla-beta?
Comment on attachment 9029933 [details]
Bug 1497148 - Log warning for custom profile with null dir and avoid throwing exception.

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: None

User impact if declined: Continued startup crashes.

Is this code covered by automated tests?: No

Has the fix been verified in Nightly?: No

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: None

Risk to taking this patch: Medium

Why is the change risky/not risky? (and alternatives if risky): See beta request.

String changes made/needed: none
Attachment #9029933 - Flags: approval-mozilla-release?

Comment 34

3 months ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/c0b651621eb8
Status: ASSIGNED → RESOLVED
Last Resolved: 3 months ago
status-firefox66: affected → fixed
Resolution: --- → FIXED
Target Milestone: --- → Firefox 66
Comment on attachment 9029933 [details]
Bug 1497148 - Log warning for custom profile with null dir and avoid throwing exception.

[Triage Comment]
Approved for 65.0b4. Fingers crossed.
Attachment #9029933 - Flags: approval-mozilla-beta? → approval-mozilla-beta+

Comment 36

3 months ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-beta/rev/7fe309a250d8
status-firefox65: affected → fixed
Comment on attachment 9029933 [details]
Bug 1497148 - Log warning for custom profile with null dir and avoid throwing exception.

Not seeing anything cropping up in 65.0b4 in the last 18 hours, so let's try this on release.
Attachment #9029933 - Flags: approval-mozilla-release? → approval-mozilla-release+

Comment 38

3 months ago
bugherderuplift
https://hg.mozilla.org/releases/mozilla-release/rev/3864bee6b6ea
status-firefox64: affected → fixed
Status: RESOLVED → VERIFIED
status-firefox64: fixed → verified
status-firefox65: fixed → verified
status-firefox66: fixed → verified
You need to log in before you can comment on or make changes to this bug.