Closed Bug 1379162 Opened 7 years ago Closed 4 years ago

Expose layout.css.servo.enabled from the TelemetryEnvironment field via SuperSearch

Categories

(Socorro :: General, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: marco, Unassigned)

References

Details

It might also be worth adding the value to one of the tabs on the report page.
Note that this requires to change the `TelemetryEnvironment` field to be actual JSON instead of a string as it is right now. Probably something we want to do in the processors.
Pardon my fly-by ignorance but instead of parsing the TelemetryEnvironment string in the processor, is it not something that can be appended to the raw crash? Kinda like we did with DOMIPCEnabled. Then the processor can parse this and turn it into a true boolean in the processed crash.
(In reply to Peter Bengtsson [:peterbe] from comment #2) > Pardon my fly-by ignorance but instead of parsing the TelemetryEnvironment > string in the processor, is it not something that can be appended to the raw > crash? Kinda like we did with DOMIPCEnabled. Then the processor can parse > this and turn it into a true boolean in the processed crash. Would that also solve the issue dbaron described in bug 1376359 comment 18? Would be great to kill two birds with one stone.
Making it easier to figure out if crashes are stylo-related is really important as we widen our testing audience and gauge stability, which is happening this week. We want a column on crash stats, so that we (a) don't need special permissions and (b) can pick the stylo crashes out of a list without following each link individually. If I understand correctly, my original patch in bug 1376359 would have done that, but bsmedberg suggested I do the telemetry thing in bug 1376359 comment 9. Is it realistic to get the ergonomics we want on crash-stats within the next day or two? If not, should I just land my original patch from bug 1376359?
Flags: needinfo?(peterbe)
Flags: needinfo?(benjamin)
Flags: needinfo?(adrian)
Hey Bobby, Benjamin -- The team is short staffed this week. We can review and ship this quickly but it will be disruptive. If we can, I'd prefer to prepare the patch and ship it early next week. I recognize this is pressing and relatively straightforward to add as a one-off. I'm also concerned that stylo is one of several features being shipped preffed off and would welcome suggestions for how we identify and represent the whole class of features shipping this way. It may be that we scope this bug to an immediate fix for stylo and work out a more comprehensive view in a second bug that eventually replaces this near term work.
Flags: needinfo?(peterbe)
Flags: needinfo?(adrian)
Hi Chris, Thanks for the quick response. I'll try to provide some context around the ask and then hopefully we can figure out a path forward. There's a lot of schedule pressure around stylo. The hope is that we can get things solid enough this week to begin pref experiments for nightly users next week. That would give two weeks before 56 branches to beta, which is probably the bare minimum time we'd need to make meaningful adjustments in response to what we see. Adjustments before 56 are key, because we plan to do stylo pref experiments on 56 beta, since the 57 beta cycle is likely too late to change much. So we really need to gain confidence that stylo is ready for unsuspecting Nightly users by the end of this week. We've been tracking rendering regressions and feel pretty good about those, but it was pointed out in the meeting today that we may not have a good window into crashes due to the issues discussed in this bug. So we need to solve those ASAP, so that we can both identify and fix major crashes before next week. The two issues identified with the current crash reporter setup are: (1) Only people with relatively sensitive permissions can see if stylo was enabled for a given crash report. (2) We don't have a column, which means we have to click through each individual crash report to see if stylo was enabled, which makes it difficult to scan large numbers of crashes. I'm no expert in this area so I don't know the best way to fix these issues. If landing my original patch from bug 1376359 would get us there, I'm totally happy to go land it. Otherwise, we may need to do a one-off on the server side. It was also suggested during the meeting that we should coordinate more closely with Marcia, who already watches crash reports. She presumably has more permissions and crash-stats-fu than the rest of us, so if she has the bandwidth to identify and track our crashes, I'm happy to let her dictate the required ergonomics changes (which may be nothing at all). NI Chris and Lonnen to comment on the above and determine what we should do here.
Flags: needinfo?(mozillamarcia.knous)
Flags: needinfo?(chris.lonnen)
Understood. Thank you for the explanation. I sent mail to the set of people who could do this work, asking for a plan and a timeline tomorrow morning. Expect an update here. I'm expecting we will produce a 'good enough' solution for right now with a more thoughtful and comprehensive solution to follow on a less pressured timeline.
Flags: needinfo?(chris.lonnen)
As far as I have noticed, the word ``servo`` only appears in the Telemetry Environment if it is enabled. So, here's a hack to be able to search for stylo-enabled crashes: https://crash-stats.mozilla.com/search/?telemetry_environment=~servo&product=Firefox There are several better solutions going forward: 1. improve how we store the TelemetryEnvironment field to make it easier to expose things such as 'layout.css.servo.enabled', and then expose that as a regular field in Super Search. - requires some code changes to Socorro, might take a bit of time - does not require any change to Firefox 2. make the client add that value to the crash report's annotations, and then expose that in Super Search - requires some code changes to Firefox - requires a configuration change to crash-stats
See Also: → 1379662, 1379665
The hack Adrian links to in Comment 8 is good enough for me right now. I think Nick has also communicated to the stability list regarding how to file these crashes, since they are looking at crashes daily as well. As long as the information flows to that list as well I think we are good.
Flags: needinfo?(mozillamarcia.knous)
Great, thanks everyone! If Marcia has what she needs for the short term then Chris' team has some breathing room to get the right fix landed.
(In reply to Adrian Gaudebert [:adrian] from comment #8) > As far as I have noticed, the word ``servo`` only appears in the Telemetry > Environment if it is enabled. So, here's a hack to be able to search for > stylo-enabled crashes: > https://crash-stats.mozilla.com/search/ > ?telemetry_environment=~servo&product=Firefox > > There are several better solutions going forward: > > 1. improve how we store the TelemetryEnvironment field to make it easier to > expose things such as 'layout.css.servo.enabled', and then expose that as a > regular field in Super Search. > - requires some code changes to Socorro, might take a bit of time > - does not require any change to Firefox > 2. make the client add that value to the crash report's annotations, and > then expose that in Super Search > - requires some code changes to Firefox > - requires a configuration change to crash-stats This is what the patches (now obsolete attachments) in https://bugzilla.mozilla.org/show_bug.cgi?id=1376359 are about. There is third option, unless I'm reading all this wrong, and that is to write a piece of processor code that extracts it from the TelemetryEnvironment JSON payload and stores it as a "first class field" in the processed crash. (Plus adding it as a whitelisted Super Search field of course) Considering that it's already in the TelemetryEnvironment (i.e. process_crash['servo_enabled'] = json.loads(raw_crash['TelemetryEnvironment'].get('layout.css.servo.enabled')) it would be easy. - requires no code changes to Firefox - requires no code changes to how SuperSearch works - requires code changes in crash-stats processor - requires configuration change to crash-stats It would also be possible (and relatively easy) to add this to the JSON Schema too so it can be queried in Redash etc.
peterbe, let's follow up separately about better environment searching: we'd like to make it more automatic in the future rather than having to invent a new field for each new feature, so that this can scale to hundreds of options with little intervention.
Flags: needinfo?(benjamin)
With Adrian's "servo" query in comment 8, the Stylo team can find crash reports that have Stylo enabled. This is adequate for our needs. Should we just close this bug as WORKSFORME?
(In reply to Chris Peterson [:cpeterson] from comment #13) > With Adrian's "servo" query in comment 8, the Stylo team can find crash > reports that have Stylo enabled. This is adequate for our needs. Should we > just close this bug as WORKSFORME? I think we should still keep it open because we might be unlucky and the word "servo" might appear for something else in the massive TelemetryEnvironment string. I'm eager to explore the the point Benjamin points out in https://bugzilla.mozilla.org/show_bug.cgi?id=1379162#c12 The fact that we should better be able to search within the JSON in a more accurate way. It needs some thoughts on our end though.
(In reply to Peter Bengtsson [:peterbe] from comment #14) > I think we should still keep it open because we might be unlucky and the > word "servo" might appear for something else in the massive > TelemetryEnvironment string. Searching for the complete Stylo pref name ("layout.css.servo.enabled") instead of just the substring "servo" also works: https://crash-stats.mozilla.com/search/?telemetry_environment=layout.css.servo.enabled

This is from 4 years ago. Is it still important or can we close it out?

Flags: needinfo?(mcastelluccio)

I'd say this is no longer important.

Status: NEW → RESOLVED
Closed: 4 years ago
Flags: needinfo?(mcastelluccio)
Resolution: --- → WONTFIX
You need to log in before you can comment on or make changes to this bug.