Bug 1578661 Comment 21 Edit History


I couldn't catch Jan-Erik in the office today, so here are a few questions: the approach I should take for the counters depends on how the tool is used, which I don't know.

Basically, there are four ways to approach this as I see it.

### 1 - Use another checked-in file in the repository

This is the simplest (kinda). We have a few files with lists of CSS properties, like the devtools property database, or the `property_database.js` that we use for CSS tests.

#### Pros

 * Keeps the setup we have, where `probe-scraper` pulls files from `mozilla-central`.

#### Cons

 * These are JS files, so we'd need to process them somehow. `probe-scraper` is Python, so that's a bit annoying.
 * These lists can miss some properties (we intentionally ignore some properties in the devtools database, like `-moz-osx-font-smoothing`, which we may want to count). That's fixable, though not great. For `property_database.js`, I _think_ we shouldn't be missing any content-accessible property, and if we were, that'd be something to fix, as the fuzzers use that file.
 * We don't share the parsing code with Gecko, so that can get out of sync. I expect it to be a tiny loop along the lines of the one Gecko uses, so it may or may not be a huge deal:

```py
import collections
import runpy

def from_ServoCSSPropList(filename, strict_type_checks):
    histograms = collections.OrderedDict()
    # ServoCSSPropList.py exposes the property list as a "data" global.
    properties = runpy.run_path(filename)["data"]
    for prop in properties:
        add_css_property_counters(histograms, prop.name)
    return histograms
```
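If we went with one of the JS files instead, a regex-based extraction might be enough to avoid pulling a JS parser into `probe-scraper`. A rough sketch — the quoted-key pattern is an assumption about the file's shape, not the real format, and it would need tightening against the actual file:

```python
import re

def extract_property_names(js_source):
    # Very rough: pull anything that looks like a quoted CSS property name
    # followed by a colon, e.g. "margin-top": {...}. Not a real JS parser.
    return sorted(set(re.findall(r'"(-?[a-z][a-z-]*)"\s*:', js_source)))

sample = '"margin-top": {}, "-moz-osx-font-smoothing": {},'
# extract_property_names(sample)
#   → ['-moz-osx-font-smoothing', 'margin-top']
```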

#### Constraints

We can only do this if probe-scraper does _not_ depend on the order of the histograms (which I think is the case).

### 2 - Like above, but generate yet another (probably Python) file and lint for its existence / up-to-date-ness

This is, effectively, checking in `ServoCSSProperties.py` (or a simplified / trimmed-down version of it), and linting / testing it in automation to ensure that it doesn't go out of sync.
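The lint itself could just regenerate the file in memory and compare against the checked-in copy. A sketch — the `generate` callable here is a stand-in for whatever build code actually produces the file, not an existing function:

```python
import pathlib

def check_in_sync(checked_in_path, generate):
    # Regenerate the property list and compare against the checked-in file.
    # A mismatch means someone added a CSS property without rerunning the
    # generator, so the lint fails and gets the patch backed out / fixed.
    current = pathlib.Path(checked_in_path).read_text()
    expected = generate()
    if current != expected:
        raise AssertionError(
            "ServoCSSProperties.py is out of date; rerun the generator")
```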

#### Pros

 * Same as above, minus all the cons.

#### Cons

 * Slightly gross.
 * Another thing that can get people backed out when they add a new CSS property (though unifying it with the generation of the devtools DB should be no worse than today, so not a very strong con).

#### Constraints

None, I think.

### 3 - Upload `ServoCSSProperties.py` as an artifact to taskcluster, make probe-scraper use it

This one would make the properties file an artifact in taskcluster, like the logs. This is actually pretty easy to do.
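For probe-scraper's side, fetching it would mostly be URL construction against the taskcluster index. A sketch — the index namespace, artifact path, and host below are plausible guesses from memory, not verified values:

```python
def artifact_url(revision,
                 namespace_tpl=("gecko.v2.mozilla-central.revision."
                                "{rev}.firefox.linux64-opt"),
                 artifact="public/build/ServoCSSProperties.py",
                 host="https://firefox-ci-tc.services.mozilla.com"):
    # Build a taskcluster index URL for a per-revision artifact.
    # probe-scraper would fetch this with its existing HTTP machinery,
    # much like it fetches files from hg.mozilla.org today.
    namespace = namespace_tpl.format(rev=revision)
    return "{}/api/index/v1/task/{}/artifacts/{}".format(
        host, namespace, artifact)
```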

#### Pros

 * Same as above, minus all the cons.
 * Puts us into a better spot to rely on build-generated stuff (see below).

#### Cons

 * probe-scraper needs to look at taskcluster URLs in addition to hg.mozilla.org ones (this is easy, just pointing it out).
 * needs a bit of taskcluster wrangling, but I'm familiar with that.

#### Constraints

This limits the effective revisions that `probe-scraper` can scrape to the ones we have builds for. I don't know whether that's an issue or not. Looking at `probe_scraper/scrapers/buildhub.py` it seems we use buildhub, and as such we may already be relying on that. So probably not?

### 4 - As above, but the artifact is all the histogram data that we generate at build-time and probe-scraper reconstructs

This basically makes probe-scraper simpler by effectively just pulling the data it wants from taskcluster rather than duplicating all the parsing and merging and what not for every revision.
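Consuming that artifact would be close to trivial on probe-scraper's end. A sketch — the artifact shape here (a JSON object mapping probe name to its build-time-merged definition) is purely illustrative, not an existing format:

```python
import json

def load_histogram_artifact(raw_json):
    # Hypothetical: the build uploads one JSON blob per revision with the
    # already-merged probe definitions. probe-scraper then only attaches
    # revision/channel metadata instead of re-running all the parsing and
    # merging itself for every revision.
    data = json.loads(raw_json)
    for name, defn in data.items():
        if "kind" not in defn:
            raise ValueError("probe %s missing 'kind'" % name)
    return data
```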

#### Pros

 * Same as above.
 * Makes probe-scraper much less likely to get out of sync with Gecko.

#### Cons

 * Basically same as above.
 * May make probe-scraper harder to extend?
 * May need to keep current probe-scraper code to keep data for old Firefox versions (though you can also pre-compile that data and put it somewhere on GitHub or what not).

----

Thoughts? Other ideas I may have missed? I'm leaning towards (2), (3), or maybe (3) + (4) as follow-up work if you deem it worth it, depending on whether the potential issues above are real or merely theoretical, and whether the extra work of doing (3) / (4) is worth it, in your opinion.
