Closed Bug 951899 Opened 11 years ago Closed 8 years ago

Redundant Windows and Linux platform-name values in Super Search

Categories

(Socorro Graveyard :: Middleware, defect)

defect
Not set
major

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: stephend, Unassigned)

Details

Attachments

(1 file)

STR:

1. Load https://crash-stats.allizom.org/search/?product=FennecAndroid&platform=Android&_facets=signature&_columns=date&_columns=signature&_columns=product&_columns=version&_columns=build_id&_columns=platform
2. Click on the textfield for "platform" "has terms" _______

Actual:

Windows
Windows
Windows NT
Mac OS X
Linux
Linux

The first two "Windows" options seem (are?) redundant, as do the two "Linux" options.
I don't think this is a Super Search problem, but a data problem. What this shows is simply that our processed crashes contain unsanitized fields. Here I suspect that those fields have leading or trailing whitespace.

Do we really want to resolve this?
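
One way to test the whitespace hypothesis above is to group the raw platform values by their stripped form and print them with `repr()`, which makes leading or trailing spaces visible. This is only a sketch: `find_whitespace_variants` and the sample values are hypothetical and are not taken from Socorro or from the actual crash data.

```python
def find_whitespace_variants(values):
    """Group raw values by stripped form; keep groups with more than one raw spelling."""
    groups = {}
    for raw in values:
        groups.setdefault(raw.strip(), set()).add(raw)
    return {clean: sorted(raws) for clean, raws in groups.items() if len(raws) > 1}


# Hypothetical sample: what the facet values might look like if the
# duplicates really were caused by stray whitespace.
sample = ["Windows", "Windows ", "Windows NT", "Mac OS X", "Linux", " Linux"]
variants = find_whitespace_variants(sample)

for clean, raws in variants.items():
    # repr() shows the quotes, so stray spaces become visible.
    print(clean, [repr(raw) for raw in raws])
```

If the real values produce an empty result here, whitespace is not the cause and the duplication comes from somewhere else.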
(In reply to Adrian Gaudebert [:adrian] from comment #1)
> I don't think this is a Super Search problem, but a data problem. What this
> shows is simply that our processed crashes contain unsanitized fields.
> Here I suspect that those fields have leading or trailing whitespace.
> 
> Do we really want to resolve this?

I believe we do. Redundant entries will confuse anyone trying to use this (use case: "Which Linux do I choose? Do they return the same values?"). At the least, we should try to fix the data issue, no?
Searching with product contains Firefox and each platform value in turn, both Linux entries return the same number of results, 3173. The two Windows entries differ slightly: 602098 and 602127.

I suspect that totals for large result sets are approximate rather than exact. I used the raw query functionality to run the same query three times, and then another three times with a sort field added. Erik Rose recommended the sort field as a way to try to force ES to return an exact count, but we got a different total every time. The variations are small (<1%).
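
For scale, the two Windows totals quoted above can be compared directly. The helper name `relative_difference_percent` is made up for this illustration; only the two counts come from the thread.

```python
def relative_difference_percent(a, b):
    """Absolute difference between two counts, as a percentage of the smaller one."""
    return abs(a - b) / min(a, b) * 100


windows_counts = (602098, 602127)  # totals reported for the two "Windows" entries
difference = abs(windows_counts[0] - windows_counts[1])
spread = relative_difference_percent(*windows_counts)

print(f"difference: {difference} results, about {spread:.4f}% of the total")
```

A gap this small is well inside the under-1% run-to-run variation described above, so the differing Windows totals do not by themselves show that the two entries hold different data.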

Once you have Linux selected in the platform box, the second "Linux" tag is no longer offered as an option, which implies they are the same value. Inspecting the HTML shows no difference in the string content. Switching to the raw ES view shows no difference in the generated query, either.

I don't know where or why we're getting two of the same, but I don't think it would change current behavior to dedupe the form field.
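
Deduplicating the option list could look roughly like the sketch below. It is not Socorro's form code: `dedupe_options` is a hypothetical helper, and stripping whitespace before comparing is an assumption borrowed from the earlier hypothesis in this bug.

```python
def dedupe_options(options):
    """Return options with duplicates removed, keeping first-seen order."""
    seen = set()
    unique = []
    for option in options:
        key = option.strip()  # assumption: whitespace-only variants count as duplicates
        if key not in seen:
            seen.add(key)
            unique.append(key)
    return unique


# The list from the "Actual" section of this report.
reported = ["Windows", "Windows", "Windows NT", "Mac OS X", "Linux", "Linux"]
print(dedupe_options(reported))
```

Keeping first-seen order means the remaining entries stay in the order users already see, so the form changes only by losing the repeated tags.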
Not happening any more.
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → INVALID
Product: Socorro → Socorro Graveyard