Closed Bug 1754062 Opened 3 years ago Closed 3 years ago

Firefox Desktop labeled_counter "power.cpu_time_per_process_type_ms" ending up in additional_properties for some reason

Categories

(Data Platform and Tools :: General, defect)

Tracking

(Not tracked)

VERIFIED FIXED

People

(Reporter: chutten, Unassigned)

Details

A labeled_counter metric of name power.cpu_time_per_process_type_ms has a column in firefox_desktop.metrics, but the values are empty arrays.

If you take a peek in additional_properties, you'll find its data there.

This is odd. The metric was added in 1747138. The metric didn't bounce on landing. It should be in nearly every "metrics" ping from builds 20220204092958 and up (nightly, and soon (now?) beta).

Why is the data in additional_properties?

Do you have further questions here, :chutten? Or does this look good to you now? The stable table data for 2022-02-07 should have this field populated. Feel free to reopen if you see any issues.

probe-scraper only runs on weekdays (the day field of the schedule is 1-5; see https://workflow.telemetry.mozilla.org/tree?dag_id=probe_scraper) to avoid picking up transient probe changes that have sometimes appeared and then disappeared over a weekend. This probably explains the delay before the metric showed up in the schema.
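
As a rough illustration of the weekday-only schedule, the sketch below computes the next day a job with a day field of 1-5 would run. The function names are hypothetical and this is not probe-scraper's actual scheduling code; it only models "Monday through Friday".

```python
from datetime import date, timedelta

# Day field 1-5, read here as Monday..Friday (matches isoweekday()).
WEEKDAY_FIELD = range(1, 6)


def runs_on(day: date) -> bool:
    """Return True if a weekday-only job would run on this date."""
    return day.isoweekday() in WEEKDAY_FIELD


def next_run_after(day: date) -> date:
    """Return the first scheduled date strictly after `day`."""
    candidate = day + timedelta(days=1)
    while not runs_on(candidate):
        candidate += timedelta(days=1)
    return candidate


# A change that misses the Friday 2022-02-04 run waits until Monday.
missed_run = date(2022, 2, 4)
following_run = next_run_after(missed_run)
```

Under this model a change that just misses Friday's run is not picked up until Monday, which is a three-day gap rather than the usual one.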

Status: NEW → RESOLVED
Closed: 3 years ago
Resolution: --- → FIXED

I do have one further question. I'm still trying to understand the order of operations.

  1. Schema Deploy on Feb 4 misses the landed patch (completed at 01:28 UTC)
  2. Patch lands late Feb 3 (22:51 EST, so... nearly 4AM Feb 4 UTC)
  3. Patch is picked up in the Feb 4 morning Nightly (20220204092958)
  4. Users start picking up Nightly updates on Feb 4
    • Data begins flowing
    • Data ends up in additional_properties
  5. Feb 5 and Feb 6 are weekend days. No changes.
  6. Schema Deploy on Feb 7 picks up the landed patch. New column added.
    • Data flowing into _live tables starts using the new column to store data
    • Schemas for (among other things) the stable tables update to have the new column
  7. Query against the stable tables on Feb 7 finds a column with the correct name, but all the data's in additional_properties
  8. Feb 8's copy-dedupe of the _live tables to stable finds data in the new column
    • Data is present in the new column in stable tables

So... it's because the patch missed Friday's schema deploy that there was no column in the _live tables for the data to flow into over the weekend... meaning that when the data reached the stable tables, they got stuck with additional_properties despite the Monday schema deploy giving them the correct column?

The surprise was "there's a column with the correct name, but there's no data in it", but I guess that's unavoidable unless we process additional_properties into new columns when moving from _live to stable.
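
A toy model of the behaviour described above, assuming that fields without a matching column at ingestion time are serialized into additional_properties as JSON. The function `route_fields`, the `client_id` field, and the label values are hypothetical; this is not the pipeline's actual code.

```python
import json

METRIC = "power.cpu_time_per_process_type_ms"


def route_fields(payload: dict, known_columns: set) -> dict:
    """Place known fields in columns and everything else in additional_properties."""
    row = {name: value for name, value in payload.items() if name in known_columns}
    extras = {name: value for name, value in payload.items() if name not in known_columns}
    # Assumption: leftovers are kept as a JSON string, or None when there are none.
    row["additional_properties"] = json.dumps(extras, sort_keys=True) if extras else None
    return row


# Hypothetical ping contents; the labels and counts are made up.
ping = {"client_id": "hypothetical", METRIC: {"parent": 120, "web": 45}}

# Ingested before the schema deploy: no column exists for the metric yet.
row_before_deploy = route_fields(ping, {"client_id"})

# Ingested after the schema deploy: the metric has its own column.
row_after_deploy = route_fields(ping, {"client_id", METRIC})
```

Rows written before the deploy keep their data in additional_properties even after the column appears, which is why a query can find the column and still see no values in it.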

(In reply to Chris H-C :chutten from comment #4)

> The surprise was "there's a column with the correct name, but there's no data in it", but I guess that's unavoidable unless we process additional_properties into new columns when moving from _live to stable.

There is inevitably some gap, yes.

To clarify your timeline a little bit, schema updates happen via Terraform, and we update live and stable table schemas at the same time. But the stable table only gets populated once per day.

So, you'll see the new column reflected in the stable table immediately after the schema deploy, but you won't see any rows with that field populated until after ~02:00 UTC, when the nightly query that populates the new partition of the stable table finishes running.
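
The timing can be sketched as follows, assuming the partition for a given day is filled by the nightly query that finishes around 02:00 UTC on the following day. The function name and the Feb 7 deploy time are hypothetical; only the ~02:00 UTC figure comes from the comment above.

```python
from datetime import datetime, timedelta, timezone

# "~02:00 UTC" per the comment above; treated as exact for illustration.
NIGHTLY_FINISH_HOUR_UTC = 2


def earliest_populated_in_stable(schema_deploy: datetime) -> datetime:
    """Estimate when stable-table rows with the new column first become queryable.

    Assumption: rows ingested on the deploy day land in that day's partition,
    which the nightly query fills early on the next UTC day.
    """
    deploy_utc = schema_deploy.astimezone(timezone.utc)
    next_day = (deploy_utc + timedelta(days=1)).date()
    return datetime(
        next_day.year, next_day.month, next_day.day,
        NIGHTLY_FINISH_HOUR_UTC, tzinfo=timezone.utc,
    )


# Hypothetical deploy time on Monday, Feb 7 (the real time is not stated).
deploy = datetime(2022, 2, 7, 1, 30, tzinfo=timezone.utc)
visible_at = earliest_populated_in_stable(deploy)
```

With a Feb 7 deploy this gives Feb 8 at about 02:00 UTC, matching step 8 of the timeline.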

That's wonderful, thank you. No further questions.

Status: RESOLVED → VERIFIED
Component: Datasets: General → General