Closed Bug 1249658 Opened 8 years ago Closed 7 years ago

Add child histograms to the longitudinal dataset.

Categories

(Cloud Services Graveyard :: Metrics: Pipeline, defect, P1)

defect

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: rvitillo, Assigned: harter)

References

Details

We can't just add an array of child payloads to the longitudinal dataset or queries are going to be extremely slow. Plus, we probably don't want to consider each child payload individually (per session fragment) during an analysis so it might make sense to aggregate all child payloads into a single payload (Sam, can you confirm this with metrics?).

Tentatively, we could create a new column with a "_CHILD" suffix for each child metric. That would end up nearly doubling the number of columns of the dataset though which will likely blow up the Parquet buffers during the creation of the dataset.
Flags: needinfo?(spenrose)
I do not understand what child payloads represent. I do not believe metrics has used them for analysis, but will ask. It would be great to have a definitive statement of their purpose. Do we know who to ask?
Flags: needinfo?(spenrose)
(In reply to Sam Penrose from comment #1)
> I do not understand what child payloads represent. I do not believe metrics
> has used them for analysis, but will ask. It would be great to have a
> definitive statement of their purpose. Do we know who to ask?

They represent the telemetry payloads of the content processes in e10s.
Priority: -- → P3
Blocks: 1251580
Blocks: 1255755
No longer blocks: 1251580
Child histograms are now be aggregated in the client and sent in a place that isn't <payload-root>.childPayloads[0..N].{keyedH|h}istograms but is, instead, (taking a cue from bug 1281795) in <payload-root>.processes.content.{keyedH|h}istograms. This should make it considerably easier to add child histograms to the longitudinal dataset. That said, it isn't entirely clear to me if anyone is using histograms that aren't related to search in the longitudinal dataset at all.
Summary: Add child payloads to longitudinal dataset. → Add child histograms to the longitudinal dataset.
Assignee: nobody → rharter
Priority: P3 → P2
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #3)

> That said, it isn't entirely clear to me if anyone is using histograms that
> aren't related to search in the longitudinal dataset at all.

How can we verify these histograms are useful? Has someone requested this data?
Flags: needinfo?(rvitillo)
(In reply to Ryan Harter [:harter] from comment #4)

> > That said, it isn't entirely clear to me if anyone is using histograms that
> > aren't related to search in the longitudinal dataset at all.
> 
> How can we verify these histograms are useful? Has someone requested this
> data?

The data could be useful to analyze e10s performance.
Flags: needinfo?(rvitillo)
Priority: P2 → P1
I think (In reply to Roberto Agostino Vitillo (:rvitillo) from comment #5)
> (In reply to Ryan Harter [:harter] from comment #4)
> > How can we verify these histograms are useful? Has someone requested this
> > data?
> 
> The data could be useful to analyze e10s performance.

I think with the Quantum program rolling out more data-driven, we'll also see more needs for child/content data across these data sets.
Do we want to use the "_child" suffix for child process histograms? Now that opt-in histograms are no longer included (bug 1313151), the number of additional columns should be manageable.
Flags: needinfo?(rvitillo)
(In reply to Ryan Harter [:harter] from comment #7)
> Do we want to use the "_child" suffix for child process histograms? Now that
> opt-in histograms are no longer included (bug 1313151), the number of
> additional columns should be manageable.

What about using "_content" as suffix?
Flags: needinfo?(rvitillo)
Would these cover the GPU/plugin/addon/etc processes as well? i.e. not necessarily just content processes.
It would be great if the suffix could vary depending on the process type.
(In reply to Roberto Agostino Vitillo (:rvitillo) from comment #10)
> It would be great if the suffix could vary depending on the process type.

I don't see the process types in the ping. Would we have to add more information to the ping? If so, let's stick with the "_child" suffix for now. SG?
Flags: needinfo?(rvitillo)
(In reply to Ryan Harter [:harter] from comment #11)
> (In reply to Roberto Agostino Vitillo (:rvitillo) from comment #10)
> > It would be great if the suffix could vary depending on the process type.
> 
> I don't see the process types in the ping. Would we have to add more
> information to the ping? If so, let's stick with the "_child" suffix for
> now. SG?

Ryan, the different processes types are listed under the "processes" [1] section. Is that what you were looking for?

Each process object contains the data related to that process.

[1] - https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/telemetry/data/main-ping.html#processes
(In reply to Alessio Placitelli [:Dexter] from comment #12)
> (In reply to Ryan Harter [:harter] from comment #11)
> > (In reply to Roberto Agostino Vitillo (:rvitillo) from comment #10)
> > > It would be great if the suffix could vary depending on the process type.
> > 
> > I don't see the process types in the ping. Would we have to add more
> > information to the ping? If so, let's stick with the "_child" suffix for
> > now. SG?

See below.

> Ryan, the different processes types are listed under the "processes" [1]
> section. Is that what you were looking for?

Yes.

> Each process object contains the data related to that process.
> 
> [1] -
> https://gecko.readthedocs.io/en/latest/toolkit/components/telemetry/
> telemetry/data/main-ping.html#processes
Flags: needinfo?(rvitillo)
I think I'm confused - here's an explanation of how I understand the problem.

There's a request to separate the data resulting from content processes and GPU/plugin/addons. I interpret this to mean we need three types of histogram keys:

 * {histogram} - histograms already included in the longitudinal set
 * {histogram}_content - for content processes
 * {histogram}_child or _other - for GPU/plugins/addons

Are the histograms from both of these sources stored under "payloads.processes.content"? If so, I don't see any way to split the data. Inspecting a sample of pings, I see the following structure:

* payloads.processes
  * content
    * histograms: Dict of histograms
    * keyedHistograms
  * parent
    * scalars
    * keyedScalars
Flags: needinfo?(rvitillo)
(In reply to Ryan Harter [:harter] from comment #14)
> Are the histograms from both of these sources stored under
> "payloads.processes.content"? If so, I don't see any way to split the data.
> Inspecting a sample of pings, I see the following structure:
> 
> * payloads.processes
>   * content
>     * histograms: Dict of histograms
>     * keyedHistograms
>   * parent
>     * scalars
>     * keyedScalars

The histograms for the parent process still need to be moved to "processes.parent", they are in the root of the ping payload [1].

> "payload" : {
>   "histograms" : {
>     ... parent process ...
>   },
>   "keyedHistograms" : {
>     ... parent process ...
>   },
>   "processes" : {
>     "parent" : {
>       ... parent histograms are in payload/histograms as of now. 
>       }
>     },
>     "content" : {
>       "histograms" : {
>         ... content histograms
>       }
>     },
>     "gpu" : {
>       "histograms" : {
>         ... content histograms
>       }
>     },
> },

The histograms for each child process are stored within the related entry under processes. That means that in processes.gpu you will find an histograms section with histograms for that process.

[1] - http://searchfox.org/mozilla-central/rev/4f5f9f3222f35fa4635ea96a0c963c130854ef19/toolkit/components/telemetry/TelemetrySession.jsm#1295
Flags: needinfo?(rvitillo)
Thanks, Alessio. That cleared up my confusion.

I have a candidate pull request here:
https://github.com/mozilla/telemetry-batch-view/pull/155
Also, for reference, Bug 1325653 is somewhat relevant to this solution.
Depends on: 1328230
Closed with PR 155:
https://github.com/mozilla/telemetry-batch-view/pull/155
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Product: Cloud Services → Cloud Services Graveyard
You need to log in before you can comment on or make changes to this bug.