Closed Bug 1079584 Opened 10 years ago Closed 7 years ago

Persist delayed search events on disk

Categories

(Firefox for Android Graveyard :: Search Activity, defect, P1)

All
Android
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: rnewman, Assigned: vivek, Mentored)

References

Details

(Whiteboard: [lang=java])

Attachments

(1 file, 2 obsolete files)

Follow-on from Bug 1042951. Read that for context.
tracking-fennec: --- → ?
tracking-fennec: ? → 35+
I'd like to take this up if nobody's working on it. I've got two other bugs in review which should get wrapped up pretty soon.
Assignee: nobody → swaroop.rao
Status: NEW → ASSIGNED
Constraints:

* Search events must not be double-counted.
* Search events should eventually be counted, except in exceptional situations.
* We probably need to pickle the event timestamp along with the event.
* Don't regress startup.
So, the way I understood it from reviewing bug #1042951, I need to implement a Service that will periodically wake up and flush a ConcurrentLinkedQueue to a SharedPreferences repository. And, while doing that, I should ensure that the constraints in comment #2 are satisfied. Can I assume that the task of populating the ConcurrentLinkedQueue correctly has already been taken care of?

Also, I will need some clarity about the term "pickle the event timestamp" as well as some knowledge about the impact of these changes on startup. I can get on IRC later tonight (10/16 evening Indian time) to discuss this, if necessary.
(In reply to swaroop.rao from comment #3)
> So, the way I understood it from reviewing bug #1042951, I need to implement
> a Service that will periodically wake up and flush a ConcurrentLinkedQueue
> to a SharedPreferences repository. And, while doing that, I should ensure
> that the constraints in comment #2 are satisfied. Can I assume that the task
> of populating the ConcurrentLinkedQueue correctly has already been taken
> care of?

This might or might not be the best solution.

What we have now is a stopgap: we track events in memory, and apply them when the browser starts. It's very simple and expedient.

We could instead write them straight to a file, or to SharedPreferences. The difficulty there is coordinating when to read, when to clear, how to control access to the file or prefs, so that we don't impact performance or lose data due to races or otherwise.

The ideal solution is for the search activity to write directly into the FHR content provider -- no need to flush on startup, and it's the eventual persistent store. That was the solution I initially proposed for Bug 1042951.

The obstacle to that is simply that it doesn't know the current environment hash. The solution involves persisting the current environment hash when BrowserHealthRecorder computes it so that other activities can write changes out-of-band.


> Also, I will need some clarity about the term "pickle the event timestamp"

Searches occur at a moment in time. Right now we assume that they happen on the same UTC day as they're added to the final FHR database. That's obviously not the case -- do a search, don't open Firefox for a day or two.

We need to record *when* the search occurred so that the search event lands on the correct day.


> as well as some knowledge about the impact of these changes on startup.

Basically: don't do any writes or reads during startup, or during an activity transition.
Rich (hope everyone calls you that and not Richard!), I'm almost ready to start working on this. The two bugs I was working on earlier are very close to being closed out, so I can focus on this in earnest. Where can I start looking to understand more about FHR, BHR, etc.? Should I look in mobile/android/base/health for the existing code?
(In reply to swaroop.rao from comment #5)
> Should I look in
> mobile/android/base/health for the existing code?

Reading mobile/android/base/health/BrowserHealthRecorder.java is a great starting point. Pay particular attention to "Environment" stuff, and the in-memory queue we use for searches.
This bug is tracking as a high-priority Firefox 35 bug which is currently on mozilla-beta. Is this bug truly a P1 and is it being worked on?
Swaroop, are you still working on this?
Flags: needinfo?(swaroop.rao)
I can take this bug if swaroop is not interested.
I'm going to unassign due to radio silence.  vivek, if you want to start work in this area, go for it.
Assignee: swaroop.rao → nobody
Attached patch 1079584.patch (obsolete) — Splinter Review
Patch with following implementation:
* environment hash stored in shared preference
* search events stored through BHR content provider
* search events date pickled and en-queued in case environment is not valid
* en-queued search events stored when environment becomes valid.

Manually tested as per the comment https://bugzilla.mozilla.org/show_bug.cgi?id=1042951#c30

The tests were also repeated after changing system time to verify pickled timestamps.
Assignee: nobody → vivekb.balakrishnan
Flags: needinfo?(swaroop.rao)
Attachment #8541353 - Flags: review?(rnewman)
Thanks, Vivek!

Just so you know, we're all on Christmas break right now; I might get to this sooner, but worst case you might not hear from me until Dec 29 or 30.
Comment on attachment 8541353 [details] [diff] [review]
1079584.patch

Review of attachment 8541353 [details] [diff] [review]:
-----------------------------------------------------------------

Concerns:

* I don't know what state the FHR content provider interface is in. If it works, great.
* Profiles. See inline comment.
* It should never be the case that the search activity records a search, doesn't have an environment, then records another and finds that now it does, without BHR being initialized in the mean time -- only BHR creates environments. So what's the value in moving the in-memory event queue out of BHR and into the search activity? In BHR it can do the write as soon as the environment is created, which is ideal.
* I would really like to see tests for this, which will require a little refactoring to move the insertion logic into somewhere testable.

::: mobile/android/base/health/BrowserHealthRecorder.java
@@ +73,5 @@
>      // This is raised from Java. We include the location in the message.
>      public static final String EVENT_SEARCH = "Search:Event";
>  
> +    // Preference used to store environment hash.
> +    public static final String PREF_ENVIRONMENT_FILE = "FHREnvironmentHash";

You should use existing prefs here. File handles are valuable resources. See GeckoSharedPrefs.

Ideally this would use the per-profile shared preferences handle -- because environments are -- but the search activity isn't profile-scoped :(

That exposes an issue, which is that health report storage is scoped to a profile, and in theory an environment can exist in one per-profile DB and not in another.

So here we really do care which profile owns this environment.

My recommendation, then:

* Use per-app prefs.
* Write a key to those app-scoped prefs like `fhr.env.$profile` -- so you'd use `fhr.env.default` for the default profile.
* In the search activity, use the existing approach for finding the profile to use (I think it just assumes 'default'), then look up the appropriate key.

@@ +315,5 @@
> +        // This will be read in SearchActivity to update search events.
> +        SharedPreferences settings = GeckoAppShell.getContext().getSharedPreferences(PREF_ENVIRONMENT_FILE, Context.MODE_PRIVATE);
> +        SharedPreferences.Editor editor = settings.edit();
> +        editor.putInt(PREF_ENVIRONMENT_KEY, this.env);
> +        editor.commit();

Use chaining and apply:

  settings.edit()
          .putInt(...)
          .apply();

::: mobile/android/search/java/org/mozilla/search/SearchActivity.java
@@ +455,5 @@
> +            Log.w(LOGTAG, "Search event not stored as environment is not valid.");
> +            return;
> +        }
> +
> +        // Generate content uri of the form "content://org.mozilla.gecko.health/events/env/measurement/v/field"

I'm not at all confident that this works. How are you verifying this?
Attachment #8541353 - Flags: review?(rnewman) → feedback+
Attached patch 1079584.patch (obsolete) — Splinter Review
Patch with suggested changes:
* env hash generated for profile are stored in app-wide gecko shared preference
* default profile is used to flush search event thru content providers in case if a valid env hash is available.
* Search event timestamp are pickled.
Attachment #8541353 - Attachment is obsolete: true
Attachment #8543121 - Flags: review?(rnewman)
Comment on attachment 8543121 [details] [diff] [review]
1079584.patch

Review of attachment 8543121 [details] [diff] [review]:
-----------------------------------------------------------------

::: mobile/android/base/health/BrowserHealthRecorder.java
@@ +315,5 @@
> +        // Store the environment hash in an app-wide shared preference.
> +        // Key for the preference is generated from the current profile.
> +        final Context context = GeckoAppShell.getContext();
> +        final String profileName = GeckoProfile.get(context).getName();
> +        GeckoSharedPrefs.forApp(context)

The constructor is already given the prefs instance that we need:

    public BrowserHealthRecorder(final Context context,
                                 final SharedPreferences appPrefs,
                                 final String profilePath,

so use that instead of digging it up here.

GeckoApp at this point also already has the GeckoProfile instance itself, so let's just pass that through instead of the profilePath, and then call `profile.getDir().getAbsolutePath()` when needed.

That means here you can just call `profile.getName()`.

Be prepared for this to potentially be null.

@@ +644,5 @@
>              if (EVENT_KEYWORD_SEARCH.equals(event)) {
>                  // A search via the URL bar. Since we eliminate all other search possibilities
>                  // (e.g. bookmarks keyword, search suggestion) when we initially process the
>                  // search URL, this is considered a default search.
> +                recordSearch(message.getString("identifier"), "bartext", System.currentTimeMillis());

Let's provide an overload for recordSearch that defaults the timestamp so that callers don't need to care about timestamps wherever possible.

::: mobile/android/search/java/org/mozilla/search/SearchActivity.java
@@ +239,5 @@
> +    public void onResume() {
> +        super.onResume();
> +
> +        // Read the environment hash for default profile from app-wide preference.
> +        final String key = BrowserHealthRecorder.PREF_ENVIRONMENT_KEY_PREFIX + GeckoProfile.DEFAULT_PROFILE;

SearchEngineManager does this:

        final File pluginsDir = GeckoProfile.get(context).getFile("searchplugins");

so it appears to default to the current profile. That might be correct, but it's worth making sure that Margaret knows about the tradeoffs here. I'll flag her for an additional quick review.

@@ +272,3 @@
>          }
>  
>          startSearch(query);

I kinda think this should come before the recording -- the CP stuff isn't nearly as cheap as pushing into an in-memory list.

@@ +442,5 @@
> +     * Store the search event in browser health report database for default profile.
> +     */
> +    private void storeSearchEvent() {
> +        final String engineID = (engine.getIdentifier() == null) ? "other" : engine.getIdentifier();
> +        final String defaultProfilePath = GeckoProfile.get(this, GeckoProfile.DEFAULT_PROFILE)

Let's factor out the profile stuff into a single method.
Attachment #8543121 - Flags: review?(rnewman)
Attachment #8543121 - Flags: review?(margaret.leibovic)
Attachment #8543121 - Flags: feedback+
tracking-fennec: 35+ → +
(In reply to Richard Newman [:rnewman] from comment #16)

> ::: mobile/android/search/java/org/mozilla/search/SearchActivity.java
> @@ +239,5 @@
> > +    public void onResume() {
> > +        super.onResume();
> > +
> > +        // Read the environment hash for default profile from app-wide preference.
> > +        final String key = BrowserHealthRecorder.PREF_ENVIRONMENT_KEY_PREFIX + GeckoProfile.DEFAULT_PROFILE;
> 
> SearchEngineManager does this:
> 
>         final File pluginsDir =
> GeckoProfile.get(context).getFile("searchplugins");
> 
> so it appears to default to the current profile. That might be correct, but
> it's worth making sure that Margaret knows about the tradeoffs here. I'll
> flag her for an additional quick review.

Yes, we decided to use the default profile for the search activity, since there isn't really another clear option, given that we do need to find *some* default search engine. So I think it's safe to associate any search activity events with the default profile in FHR.

I'll take a look at the patch to see if there are any issues that jump out at me.
(In reply to :Margaret Leibovic from comment #17)
> (In reply to Richard Newman [:rnewman] from comment #16)
> 
> > ::: mobile/android/search/java/org/mozilla/search/SearchActivity.java
> > @@ +239,5 @@
> > > +    public void onResume() {
> > > +        super.onResume();
> > > +
> > > +        // Read the environment hash for default profile from app-wide preference.
> > > +        final String key = BrowserHealthRecorder.PREF_ENVIRONMENT_KEY_PREFIX + GeckoProfile.DEFAULT_PROFILE;
> > 
> > SearchEngineManager does this:
> > 
> >         final File pluginsDir =
> > GeckoProfile.get(context).getFile("searchplugins");
> > 
> > so it appears to default to the current profile. That might be correct, but
> > it's worth making sure that Margaret knows about the tradeoffs here. I'll
> > flag her for an additional quick review.
> 
> Yes, we decided to use the default profile for the search activity, since
> there isn't really another clear option, given that we do need to find
> *some* default search engine. So I think it's safe to associate any search
> activity events with the default profile in FHR.

Er, I'm sorry, I was confused there. rnewman is correct in pointing out that we get the current profile from the search activity context, but looking at the implementation in GeckoProfile, I believe that will always return the default profile. Perhaps it would be worth being more explicit about always using the default profile in the search activity.

I don't think we did a good job thinking though all the ways the search activity can interact with profiles, since we're using a per-app pref to keep track of the default engine in the search activity, but that is a per-profile pref on the gecko side.
Comment on attachment 8543121 [details] [diff] [review]
1079584.patch

Review of attachment 8543121 [details] [diff] [review]:
-----------------------------------------------------------------

Seems reasonable to me, I trust rnewman's review comments :)

::: mobile/android/search/java/org/mozilla/search/SearchActivity.java
@@ +272,3 @@
>          }
>  
>          startSearch(query);

I agree with rnewman here, onSearch will be called on the main thread, and we don't want to block on any content provider interactions (perhaps that insert should even be pushed off onto a background thread).

@@ +461,5 @@
> +                .build();
> +
> +        final ContentValues cv = new ContentValues();
> +        cv.put("value", engineID);
> +        queryHandler.startInsert(1, null, healthReportUri, cv);

Is there a reason this logic can't go in a static method in BrowserHealthRecorder? It seems to me like SearchActivity shouldn't be responsible for such health-report-specific storage logic.
Attachment #8543121 - Flags: review?(margaret.leibovic) → feedback+
Attached patch 1079584.patchSplinter Review
Patch with review comments addressed.
As suggested moved content provider insert operation to BHR
Attachment #8543121 - Attachment is obsolete: true
Attachment #8548587 - Flags: review?(rnewman)
Attachment #8548587 - Flags: feedback?(margaret.leibovic)
Comment on attachment 8548587 [details] [diff] [review]
1079584.patch

Review of attachment 8548587 [details] [diff] [review]:
-----------------------------------------------------------------

This is looking good to me. I'll leave the final review to rnewman.

::: mobile/android/base/background/healthreport/HealthReportProvider.java
@@ +19,5 @@
>  import android.content.ContentValues;
>  import android.content.UriMatcher;
>  import android.database.Cursor;
>  import android.net.Uri;
> +import android.util.Log;

Nit: Looks like an unused import ended up in here.

::: mobile/android/base/health/BrowserHealthRecorder.java
@@ +954,5 @@
> +    /**
> +     * Store the search event immediately through {@link HealthReportProvider} iff environmentHash is valid.
> +     * otherwise, search events are stored in memory until environmentHash is computed
> +     */
> +    public static void mayBeStoreSearchEvents(final AsyncQueryHandler queryHandler,

Nit: Maybe is a single word, so update this to 'maybeStoreSearchEvents'.

::: mobile/android/search/java/org/mozilla/search/SearchActivity.java
@@ +237,5 @@
> +    public void onResume() {
> +        super.onResume();
> +
> +        // Read the environment hash for default profile from app-wide preference.
> +        final String key = BrowserHealthRecorder.PREF_ENVIRONMENT_KEY_PREFIX + getDefaultGeckoProfile().getName();

Should we have a helper function in BrowserHealthRecorder to get the environment hash, rather than leaving SearchActivity in charge of the details to compute the correct key?

@@ +267,2 @@
>          try {
> +            BrowserHealthRecorder.mayBeStoreSearchEvents(queryHandler, engine.getIdentifier(), profilePath, environmentHash, "activity");

It looks like you were refactoring other logic to pass around GeckoProfile instances instead of the path. Any reason to not do that here as well?
Attachment #8548587 - Flags: feedback?(margaret.leibovic) → feedback+
Comment on attachment 8548587 [details] [diff] [review]
1079584.patch

Review of attachment 8548587 [details] [diff] [review]:
-----------------------------------------------------------------

I concur with Margaret's comments.



I'm still slightly worried about using the CP directly, because it's not well tested.

I'd consider using the HealthReportDatabaseStorage pattern that we already have, rather than using the CP directly. For example, to write a search:


/**
 * Call from a background thread.
 */
public static void recordSearch(final Context context, final int env, final int day, final String profilePath, final String location, final String engine) {
    final ContentProviderClient cpc = EnvironmentBuilder.getContentProviderClient(context);
    try {
        final HealthReportDatabaseStorage storage = EnvironmentBuilder.getStorage(cpc, profilePath);
        if (storage == null) {
            // Abort.
            return;
        }

        // These aren't necessary every time, so this is slightly inefficient. We could add a simple method to storage to avoid this.
        storage.beginInitialization();
        storage.finishInitialiation();
        final int searchField = storage.getField(MEASUREMENT_NAME_SEARCH_COUNTS,
                                                 MEASUREMENT_VERSION_SEARCH_COUNTS,
                                                 location)
                                       .getID();
        storage.recordDailyDiscrete(env, day, searchField, engine);
    } finally {
        cpc.release();
    }
}

This handily eliminates generateContentUriForSearchEvents, the use of AsyncQueryHandler, etc.

You might even consider breaking this out to make a simple SearchRecorder:

public class SearchRecorder {
    private final ContentProviderClient client;
    private final HealthReportDatabaseStorage storage;
    public SearchRecorder(final Context context, final String profilePath) {
        this.client = EnvironmentBuilder.getContentProviderClient(context);
        final HealthReportDatabaseStorage s = EnvironmentBuilder.getStorage(cpc, profilePath);
        if (s == null) {
            this.client.release();
            throw new IllegalArgumentException("Couldn't get storage.");
        }

        this.storage = s;
    }

    public void init() {
        this.storage.beginInitialization();
        this.storage.finishInitialization();
    }

    public void release() {
        this.client.release();
    }

    public void recordSearch(final int env, final int day, final String location, final String engine) {
        final int searchField = storage.getField(MEASUREMENT_NAME_SEARCH_COUNTS,
                                                 MEASUREMENT_VERSION_SEARCH_COUNTS,
                                                 location)
                                       .getID();
        storage.recordDailyDiscrete(env, day, searchField, engine);
    }
}

Then you can create a SearchRecorder in the SearchActivity, and hold on to it. The above isn't fully fleshed out, but I think you see the idea.

Thoughts?

::: mobile/android/base/health/BrowserHealthRecorder.java
@@ +102,5 @@
>      private ContentProviderClient client;
>      private volatile HealthReportDatabaseStorage storage;
>      private final ConfigurationProvider configProvider;
>      private final SharedPreferences prefs;
> +    private final GeckoProfile profile;

I don't think we need this. If you must, store the profile name, or the computed environment pref key -- this seems to be the only use for the profile after construction.

@@ +164,5 @@
>  
>          // Note that the PIC is not necessarily fully initialized at this point:
>          // we haven't set the app locale. This must be done before an environment
>          // is recorded.
> +        this.profileCache = new ProfileInformationCache(profile.getDir().getAbsolutePath());

Let's grab the path once, at the top of the constructor. getDir() is far from free!

@@ +316,5 @@
>              return -1;
>          }
> +
> +        this.env = EnvironmentBuilder.registerCurrentEnvironment(this.storage, this.profileCache, this.configProvider);
> +        // Store the environment hash in an app-wide shared preference.

Nit: newline before comment.

@@ +952,5 @@
>  
> +
> +    /**
> +     * Store the search event immediately through {@link HealthReportProvider} iff environmentHash is valid.
> +     * otherwise, search events are stored in memory until environmentHash is computed

Nit: capitalize "Otherwise", add closing period to sentence.

@@ +991,5 @@
> +    private static void storeSearchEvents(final AsyncQueryHandler queryHandler, final Uri healthReportUri, final String engineID) {
> +        final String engine = (engineID == null) ? "other" : engineID;
> +        final ContentValues cv = new ContentValues();
> +        cv.put("value", engine);
> +        queryHandler.startInsert(1, null, healthReportUri, cv);

I imagine you want to use a unique token here, rather than '1'.
Attachment #8548587 - Flags: review?(rnewman) → feedback+
This bug is old and no more progress for more than almost 2 years, so i'm removing the tracking flag now.
feel free to renominate. Thank you !
tracking-fennec: + → ---
This is an ex-Search Activity now (bug 1221344).
Status: ASSIGNED → RESOLVED
Closed: 7 years ago
Resolution: --- → WONTFIX
Product: Firefox for Android → Firefox for Android Graveyard
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: