Closed Bug 1553455 Opened 2 years ago Closed 1 year ago

Untrusted modules dashboard creation

Categories

(Data Science :: Dashboard, task, P3)

x86_64
Unspecified
task

Tracking

(Not tracked)

RESOLVED INVALID

People

(Reporter: RT, Unassigned, NeedInfo)

References

Details

Brief description of the request:
Telemetry was shipped to help us better understand the impact that specific modules have on user experience. We want to make this telemetry accessible to most people at Mozilla to help them understand and troubleshoot DLL issues and make decisions on blocklist additions.

Dashboard requirements:

  • Provide a stack that relates to a DLL name
  • All untrusted modules, sorted by load time with a filter on "share of users impacted above X"
  • All untrusted modules, sorted by crash frequency with a filter on "share of users impacted above X"
  • All untrusted modules, sorted by share of users impacted
  • All untrusted modules causing the bootstrap process to crash, sorted by frequency of occurrence
  • Floor frame: Given a particular floor frame, what is the frequency of its use as an injection mechanism?
  • Top DLLs that correlate with engagement metrics drop (search count, session time, URIs browsed ...) with a filter on "share of users impacted above X"
  • Top DLLs that correlate with retention drop (3 week retention?) with a filter on "share of users impacted above X"
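As a rough illustration of the "sorted by share of users impacted" requirement and the "share of users impacted above X" filter, here is a minimal sketch. It is not taken from any existing ETL job: the input shape (pairs of client ID and module name), the function name, and the case-insensitive treatment of DLL names are all assumptions made for the example.

```python
from collections import defaultdict


def modules_by_user_share(loads, total_clients, min_share=0.0):
    """Rank modules by the share of clients that loaded them.

    loads: iterable of (client_id, module_name) pairs (hypothetical shape,
           not the actual untrusted_modules ping schema).
    total_clients: size of the population the share is measured against.
    min_share: keep only modules whose share is strictly above this value.
    """
    if total_clients <= 0:
        raise ValueError("total_clients must be positive")

    # Collect the distinct clients seen for each module. Names are
    # lowercased on the assumption that DLL names are case-insensitive.
    clients_per_module = defaultdict(set)
    for client_id, module_name in loads:
        clients_per_module[module_name.lower()].add(client_id)

    ranked = []
    for module_name, clients in clients_per_module.items():
        share = len(clients) / total_clients
        if share > min_share:
            ranked.append((module_name, share))

    # Highest share first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda item: (-item[1], item[0]))
    return ranked


# Made-up sample data: four clients in total, three of which loaded foo.dll.
sample_loads = [
    ("c1", "foo.dll"),
    ("c2", "foo.dll"),
    ("c2", "bar.dll"),
    ("c3", "Foo.dll"),
    ("c1", "foo.dll"),
]
result = modules_by_user_share(sample_loads, total_clients=4, min_share=0.3)
```

The same filter could be applied before sorting by load time or crash frequency; only the sort key would change.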

Link to any assets:
Previous work done: https://docs.google.com/document/d/1r8PSHrEFeVc7S7kzrDnRq28_MRUreP9_cePyKDj8i8A/edit

Is there a specific data scientist you would like or someone who has helped to triage this request:
No

Hi Romain, what timeline/urgency do you have for this?

Flags: needinfo?(rtestard)
Priority: -- → P3

Hi Emily, having this by end of June would be great - this should help inform our next priorities regarding DLL blocking with the bootstrap process shipping in 68.

Flags: needinfo?(rtestard)

Hi Emily, any news on this please?

Hi Romain, sorry for the delay. At the moment this isn't prioritized over other work on the Data Science team. Adding in @rmiller, as a lot of this might be something that can be solved with MDV2 (though I'm not sure of the timeline).

Flags: needinfo?(rmiller)

I'm not sure if MDV2 will help... it's a significant but incremental improvement over the existing measurement dashboard on the telemetry.mozilla.org site. In any case, it (MDV2) probably won't be ready for use until early Q4.

Flags: needinfo?(rmiller)

Hi Emily, given rmiller's comment, can you please help clarify whether this is something the data team can support us with in Q3? This is necessary work for the bootstrap process improvement work planned in Q3 and is a direct dependency on our OKRs for the bootstrap process.

Flags: needinfo?(ethompson)

Hey Romain, right now we're slammed with other priorities. Adding @mreid and @dzieleski in case we can grab anyone to help with the first few items on your list.

Flags: needinfo?(mreid)
Flags: needinfo?(ethompson)
Flags: needinfo?(dzielaski)

My team doesn't have capacity to help tackle this at least for the next month or so.

Flags: needinfo?(mreid)

Some notes after conversation with Mark:

  • it should be possible to fulfill all requirements except the 1st and 6th without symbolication/signature generation
  • the untrusted_modules ping is enabled only in nightly, which might not be enough for meaningful engagement analysis

Thanks Arkadiusz and sorry for the delay (just back from parental leave).
Are there reasons for not having the untrusted_modules ping enabled on beta or release? Otherwise I can open a bug to handle this.

Flags: needinfo?(akomarzewski)

(In reply to Romain Testard [:RT] from comment #10)

Are there reasons for not having the untrusted_modules ping enabled on beta or release? Otherwise I can open a bug to handle this.

I think we wanted to see if data captured with this ping would prove to be useful; I'm not aware of any other reasons (ni'ing Aaron for context).

Flags: needinfo?(akomarzewski) → needinfo?(aklotz)

Oh, this data is extremely useful. The closer to the release channel we get, the more useful it is, in fact!

My understanding was that there was concern about the computational load of symbolicating the call stacks in the ping given the volume of data that would be coming in from beta and release. Does that ring a bell?

Flags: needinfo?(aklotz) → needinfo?(akomarzewski)

You're right, symbol server load was a concern too. We have worked around this by symbolicating only the pings from nightly in the ETL job - this is no longer a blocker for submissions from beta and release.

As I mentioned in comment 9, the 1st and 6th requirements need symbolication (the others can be derived from raw pings). We'll need to plan and develop a more robust symbolication job in order to analyze the full population. I'll find out what we need there.

Flags: needinfo?(akomarzewski)

Great, thanks for the clarifications.
I have now raised bug 1577217 to deal with enablement on beta and release.

Depends on: 1577217
Status: NEW → RESOLVED
Closed: 1 year ago
Resolution: --- → INVALID