Open Bug 1640054 Opened 4 years ago Updated 4 years ago

Store important SERP URL params in the search service

Categories

(Firefox :: Search, task, P3)

People

(Reporter: adw, Unassigned)

Bug 1398416 adds form history to the urlbar, and as part of that, we exclude SERP results from browser history that duplicate form history results.

For example, if you search Google for "foo" from the urlbar, you end up with the SERP https://www.google.com/search?client=firefox-b-1-d&q=foo in your browser history. Then if you type "foo" in the urlbar, you'll see a "foo" form history result, but you won't see a result for the SERP since it effectively duplicates the form history result. In contrast, before bug 1398416 (or if form history is disabled), you'd see only the SERP result.

The way we determine whether a SERP "duplicates" a form history result is by generating a submission URL from the form history search terms and comparing it to the SERP URL. The question is how to compare URL params. Right now we're conservative and all URL params in the two URLs must be the same except for client. Ideally we'd like to ignore "unimportant" params so that we can be smarter about deduping. Unimportant params include page numbers at least, but we need to be careful about respecting params like Google's tbm=isch, which indicates an image search as opposed to a general search.
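To make the comparison concrete, here is a minimal sketch in Python rather than the actual urlbar code. The function names and the `ignored` set are hypothetical, and `start` is used only as a stand-in for a paging param; it is not a vetted list of what is safe to ignore.

```python
from urllib.parse import urlsplit, parse_qsl

# Hypothetical default: today only "client" is allowed to differ.
IGNORED_PARAMS = frozenset({"client"})


def significant_params(url, ignored=IGNORED_PARAMS):
    """Return the URL's query params as a sorted list, minus ignored names."""
    pairs = parse_qsl(urlsplit(url).query, keep_blank_values=True)
    return sorted((name, value) for name, value in pairs if name not in ignored)


def serp_duplicates_form_history(serp_url, submission_url, ignored=IGNORED_PARAMS):
    """True if the SERP URL matches the submission URL generated from the
    form history terms, once ignored params are dropped from both."""
    serp, submission = urlsplit(serp_url), urlsplit(submission_url)
    same_page = (serp.scheme, serp.netloc, serp.path) == (
        submission.scheme,
        submission.netloc,
        submission.path,
    )
    return same_page and significant_params(serp_url, ignored) == significant_params(
        submission_url, ignored
    )
```

With only `client` ignored, a SERP URL carrying an extra `tbm=isch` or a paging param does not count as a duplicate. Passing a larger `ignored` set (for example adding `start`) dedupes later result pages while still keeping the image search distinct, which is the behavior this bug is asking for.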

Mark, right now I'm interested not in which params are "important" but in how and where we might keep track of them. Is remote settings the right place? Is there some per-engine remote metadata we can hook into? We'd probably need to be able to adjust them at any time in case they change. Any thoughts on how we can get started here?

Mark, please see comment 0.

Flags: needinfo?(standard8)
Summary: Store "important" SERP URL params in the search service → Store important SERP URL params in the search service

I think we have a few potential routes, depending on how you want to access the data:

  1. Encoding this information in the telemetry information we use to report SERP results (which needs to move to remote settings at some stage). https://searchfox.org/mozilla-central/rev/ea7f70dac1c5fd18400f6d2a92679777d4b21492/browser/components/search/SearchTelemetry.jsm#63

This wouldn't be associated with search engine objects, but I don't know if that would be a problem or not. It is primarily associated with just the search telemetry, but I guess we could make it more generic.

  2. Save the information in the search engine configuration / WebExtensions.

This would associate the data directly with search engine objects. I would be reluctant to add this to the configuration as that is already quite complex. The WebExtension might be an option, if we added a privileged set of parameters.

  3. A separate remote settings collection.

This could also be an option, though I'm getting wary about how many collections we're adding.

Flags: needinfo?(standard8)

I talked with mconnor, and we may want to support non-installed engines at some point, so option 1 or 3 seems like the way to go. What do you think about making a new remote settings collection that will start off with this info, and then we can expand it to also include the search telemetry info (from option 1) at some point? I'm not very familiar with remote settings yet, so I'm not sure whether that's possible.

(In reply to Drew Willcoxon :adw from comment #3)

> I talked with mconnor, and we may want to support non-installed engines at some point, so option 1 or 3 seems like the way to go. What do you think about making a new remote settings collection that will start off with this info, and then we can expand it to also include the search telemetry info (from option 1) at some point? I'm not very familiar with remote settings yet, so I'm not sure whether that's possible.

That would be fine with me. The remote settings data is a JSON structure, and remote settings itself doesn't really care about the structure of the data. So we just need to take care with old clients: as long as you're adding new fields, rather than removing or changing existing ones, old clients will be fine.
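As a rough illustration of the "only add fields" rule, here is a sketch in Python. The record shape and the field names (`engine`, `importantParams`, `telemetry`) are made up for this example and are not an existing remote settings schema.

```python
import json

# Hypothetical first version of a record: one entry per engine.
RECORD_V1 = json.loads("""
{"engine": "google", "importantParams": ["q", "tbm"]}
""")

# Hypothetical later version: a new field is added, nothing is removed or changed.
RECORD_V2 = json.loads("""
{"engine": "google",
 "importantParams": ["q", "tbm"],
 "telemetry": {"followOnParams": ["oq"]}}
""")


def read_important_params(record):
    """Old-client reader: uses only the field it knows about and ignores
    any fields added by later versions of the record."""
    return set(record.get("importantParams", []))
```

A client written against the first shape reads the same set of params from both records, so adding the telemetry info later would not break it. Renaming or removing `importantParams` would.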

In case it is useful, this bug I just did for the override list has an overview of how to set it all up: https://bugzilla.mozilla.org/showdependencytree.cgi?id=1635220&hide_resolved=0

See Also: → 1647889
See Also: → 1667897