Redundant Windows and Linux platform-name values in Super Search

RESOLVED INVALID

Status

Severity: major
Opened: 5 years ago
Last resolved: 2 years ago

People

(Reporter: stephend, Unassigned)

Attachments

(1 attachment)

(Reporter)

Description

5 years ago
Created attachment 8349724 [details]
redundant platform names.png

STR:

1. Load https://crash-stats.allizom.org/search/?product=FennecAndroid&platform=Android&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform
2. Click on the textfield for "platform" "has terms" _______

Actual:

Windows
Windows
Windows NT
Mac OS X
Linux
Linux

The first two "Windows" options seem (are?) redundant, as do the last two "Linux" options.

Comment 1

5 years ago
I don't think this is a Super Search problem, but a data problem. What this shows is simply that our processed crashes contain unsanitized fields. Here I suspect that those fields have leading or trailing whitespace.
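
One way to check the whitespace hypothesis is to group the raw platform values by their stripped form and look for groups with more than one spelling. This is only a sketch: the `raw_platforms` list is made up for illustration (the stray spaces are an assumption, not values taken from processed crashes), and `group_by_normalized` is a hypothetical helper, not part of Socorro.

```python
from collections import defaultdict


def group_by_normalized(values):
    """Return stripped values that have more than one distinct raw spelling."""
    groups = defaultdict(list)
    for raw in values:
        # str.strip() removes leading and trailing whitespace only.
        groups[raw.strip()].append(raw)
    return {name: raws for name, raws in groups.items() if len(set(raws)) > 1}


# Hypothetical raw values; the extra spaces are assumed for the example.
raw_platforms = ["Windows", "Windows ", "Windows NT", "Mac OS X", "Linux", " Linux"]

suspects = group_by_normalized(raw_platforms)
for name, raws in suspects.items():
    # repr() makes the otherwise invisible whitespace visible.
    print(name, [repr(r) for r in raws])
```

If a check like this returns nothing for the real data, the duplicates are byte-identical and whitespace is not the cause.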

Do we really want to resolve this?
(Reporter)

Comment 2

5 years ago
(In reply to Adrian Gaudebert [:adrian] from comment #1)
> I don't think this is a super search problem, but a data problem. What this
> shows is simply that our processed crashes contain not sanitized fields.
> Here I suspect that those fields have leading or trailing white spaces. 
> 
> Do we really want to resolve this?

I believe we do; redundant information will be confusing to anyone trying to use this (use-case: "Which Linux do I choose?  Do they have the same values?"  etc.) -- at the least, we should try to fix the data issue, no?

Comment 3

5 years ago
Searching for product contains Firefox and Platform=?, both Linux platforms return the same number of results, 3173. The Windows results differ slightly: 602098 and 602127.

I suspect that counts for large result sets are probabilistic. I used the raw query functionality to run the same query three times, and then another three times with a sort field added. Erik Rose recommended the sort field as a way to try to force ES to return an exact count, but we got a different total every time. The variations are small (<1%).
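
For scale, the gap between the two Windows totals can be compared with that run-to-run variation. The sketch below uses only the two counts quoted above; `relative_difference` is a hypothetical helper.

```python
def relative_difference(a, b):
    """Absolute difference between two counts as a fraction of the larger one."""
    return abs(a - b) / max(a, b)


# Totals reported for the two "Windows" options.
windows_counts = (602098, 602127)

diff = abs(windows_counts[0] - windows_counts[1])
rel = relative_difference(*windows_counts)

print(f"absolute difference: {diff}")
print(f"relative difference: {rel:.4%}")
```

The two totals differ by 29 results, which is under 0.01% and well inside the <1% variation seen when repeating a single query.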

Once you have Linux selected in the platform box, the second "Linux" tag is no longer an option, which implies they are the same. Inspecting the HTML shows no difference in the string content. With the switch to raw ES, there is no difference in the generated query, either.

I don't know where or why we're getting two of the same, but I don't think it would change current behavior to dedupe the form field.
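
A minimal sketch of deduping the option list before it reaches the form field, keeping the original order. `dedupe_options` is a hypothetical helper rather than existing Socorro code, and stripping whitespace first is an assumption that also covers the data-problem theory from comment 1.

```python
def dedupe_options(options):
    """Drop repeated option labels, keeping the first occurrence of each."""
    seen = set()
    unique = []
    for option in options:
        key = option.strip()  # treat "Linux" and "Linux " as the same label
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# The option list as reported in the bug description.
options = ["Windows", "Windows", "Windows NT", "Mac OS X", "Linux", "Linux"]
deduped = dedupe_options(options)
print(deduped)
```

Since both duplicates appear to produce the same query, collapsing them in the form should not change what a search returns.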

Comment 4

2 years ago
Not happening any more.
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → INVALID
(Assignee)

Updated

2 years ago
Product: Socorro → Socorro Graveyard