Closed Bug 968923 Opened 11 years ago Closed 8 years ago

Implement some equivalent of Chrome's use counters (on top of telemetry?)

Categories

(Core :: General, defect)

defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla43

People

(Reporter: bzbarsky, Assigned: froydnj)

References

(Depends on 1 open bug, Blocks 5 open bugs)

Details

(Keywords: dev-doc-needed)

Attachments

(13 files, 18 obsolete files)

114.69 KB, patch
Details | Diff | Splinter Review
4.48 KB, patch
smaug
: review+
Details | Diff | Splinter Review
21.25 KB, patch
gfritzsche
: review+
mshal
: review+
heycam
: review+
Details | Diff | Splinter Review
7.71 KB, patch
seth
: feedback+
Details | Diff | Splinter Review
2.79 KB, patch
gfritzsche
: review+
Details | Diff | Splinter Review
13.58 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
12.63 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
22.21 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
5.19 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
18.93 KB, patch
froydnj
: review+
Details | Diff | Splinter Review
2.19 KB, patch
bzbarsky
: review+
Details | Diff | Splinter Review
19.01 KB, patch
smaug
: review+
Details | Diff | Splinter Review
1.94 KB, patch
heycam
: review+
Details | Diff | Splinter Review
If I understand their use counter setup correctly, it allows them to answer the following question:

  What fraction of pages that our users loaded encountered condition X?

This fundamentally requires keeping track of two values: total number of pages loaded by all users and number of pages that encountered condition X.

The question is how to implement that on our end.

On the DOM side it's pretty simple to just have an enum and be able to call into a document with a value of that enum, have the first call do something and the rest no-op.  We have a similar setup for WarnOnceAbout already.

What should the "something" be?  The goal here is that a developer should be able to add a value to the enum, wait for it to ride the trains, then load some web page where he's told what fraction of documents had that enum value triggered...  Does telemetry let us do that in a sane way?
One other note.  A common request for this sort of data is to have some idea of how many distinct _sites_ are involved, or which sites they are.  The former would be a bit harder than the above proposal while the latter is a privacy no-go.
(In reply to Boris Zbarsky [:bz] from comment #1)
> One other note.  A common request for this sort of data is to have some idea
> of how many distinct _sites_ are involved, or which sites they are.  The
> former would be a bit harder than the above proposal while the latter is a
> privacy no-go.

I'm still not clear as to how to measures 'number of pages loaded' and how to tell if a feature is used by multiple times per 'page' or not. 

Nobody I spoke to has a good understanding as to how to track 'pages'. Telemetry is just a vehicle to carry payloads. If you can capture a value locally in a way that makes sense to you, we can send it out.

So my request is: please implement dom/network needed to measure 'x per page' so we know what we are asked to hook up to telemetry.
> I'm still not clear as to how to measures 'number of pages loaded'

We can measure the number of non-chrome inner window objects created, say.  This part is simple.

The hard part is figuring out when to reset the counter and when to report it and such.

> and how to tell if a feature is used by multiple times per 'page'

See comment 0.  It covers this part.

> Telemetry is just a vehicle to carry payloads. 

Sort of.  It's also a UI for viewing them, etc.

So what sorts of payloads can telemetry send and allow me to view?  Can I do something where I have a bunch of counters, every time we go to send data we send the current values and zero them out, and on the server end we just total up the values of all the counters and have an easy way to see those totals for a particular release?
In bug 968572 I have a patch to do some measurements of certain kinds of CSS property usage where we store that information on the nsPresContext, only report it at nsPresContext destruction time, and only if the nsPresContext is for a document that is non-chrome and not about:* or chrome:*.  (Perhaps that last check is redundant with "the document is non-chrome".)

That counts things like background-image documents, iframes, etc.  It wouldn't be hard to modify to store this information only for the top level document.

I agree with Boris that it would be great if we could get this information per page rather than per loaded document, but that doesn't seem feasible at the moment.
I'm fine with per loaded document.  I'd just like us to have something developers can use turnkey without the pain I went through in bug 909656 trying to get _some_ sort of useful data....
Assignee: nobody → cam
Status: NEW → ASSIGNED
(In reply to Boris Zbarsky [:bz] from comment #3)
> 
> So what sorts of payloads can telemetry send and allow me to view?  Can I do
> something where I have a bunch of counters, every time we go to send data we
> send the current values and zero them out, and on the server end we just
> total up the values of all the counters and have an easy way to see those
> totals for a particular release?

Yes. just have an api for getAndResetCounter() that you call to assign to a simpleMeasure in TelemetryPing.js.
Hmm.  What does the current simpleMeasure setup do?  I would have thought it resets counters already, if it does counters.  

Are there existing simpleMeasure providers in C++ we could crib?  The stuff in Telemetry.h does histograms, and we don't really need or want a histogram here....

And how do we get the server to just add up the numbers instead of producing histograms?
In my WIP patch I'm defining a histogram per use counter of kind "boolean", where I call Telemetry::Accumulate each time a Document is destroyed with true if script ever called a particular DOM method in that document, or false if it didn't.

Am I right in thinking that each time I call Telemetry::Accumulate(), one of two values (for the "true" nad "false" cases) will get incremented locally on the histogram object, and then when the daily telemetry ping is performed, those two numbers will be sent to the server, and then those two values are reset back to zero?  And then when you view the data on telemetry.mozilla.org, this shows the total number of false/true values that were sent to the server, and not anything like "number of false/true values per user"?

I'm assuming this is how it worked with the XMLHTTPREQUEST_ASYNC_OR_SYNC histogram.
FWIW, I have had little luck getting actual _numbers_ out of a two-value histogram on the telemetry server.  Approximate percentages, yes, but approximate to like the nearest %.

Better server UI is part of what we need for this bug.  :(
If I go to http://telemetry.mozilla.org/#nightly/29/XMLHTTPREQUEST_ASYNC_OR_SYNC and click "download as csv" I can get exact numbers, but it is difficult to use that date range selector on the graph to a get a particular range of interest.
For this, I don't think we need graph ranges.  Just having the raw counts per-release is great.  The ideal UI for these use counters, I suspect, is a single page that just lists them all and lets you see the percentages of pages using them, on a per-release basis.
Oh, and the point is we can grab the CSV, via XHR or server-side, and munge it to produce that UI, I would think.
Also, if we're going to make use of the "didn't use" number, then we need to restrict this to exclude various XBL bits, XHR result documents, etc, etc...  Doable; just have to be a bit careful.
Attached patch WIP (v0.1) (obsolete) — Splinter Review
This is what I've got so far.  You can only put [UseCounter] on a regular operation (static or not) for the moment.  The generated histogram JSON file is objdir/dom/bindings/UseCounterHistograms.json and is created from Codegen.py.

I'm not really sure about the dom/bindings/Makefile.in change -- is there a different way I should write "dom/bindings/UseCounterHistograms.json is generated by Codegen.py running"?

The telemetry calls in ~nsDocument are #if-ed out at the moment; instead it prints the use counter booleans to the console.
You'll probably want gps to look over the build bits.
I'm glad you CC'd a build peer early in the design of this feature!

Telemetry and their histograms are one part of the build system I wish were designed better. I believe there are bugs on this already. But in summary:

1) Generating a massive .h/.cpp with all the histograms/probes means that any time you change a histogram, you've invalidated every object file including said .h file. In other words, you are effectively introducing a clobber build. This is similar to how anytime we add a new AC_SUBST/AC_DEFINE to configure, mozilla-config.h changes and forces a rebuild of everything. I believe a solution with multiple, domain-specific .h/.cpp files for logically grouped entries would be a better design since it would cull dependency hell. If you are familiar with Protocol Buffers, they employ a similar solution with each .proto file deriving to a .h/.cpp pair.

2) Having all the histograms/probes defined in a single file is so pre-moz.build. We now have the ability to aggregate whole world build config info as part of config.status. We should consider separate JSON files for defining counters/histograms, each living in the source tree next to the component it measures. We could even have this data in moz.build files themselves if we didn't want to use JSON. Not recommending it - just throwing it out there.

3) We should consider processing the JSON files / probe data as part of config.status. We've historically had dependency issues with changes to Telemetry that require clobbers. I'm not sure what the underlying problems are. Moving more things to config.status can form a nice band aid. But it can be annoying to wait for config.status to run. So, we need to be cognizant of the effect on developer workflows.
That all sounds fine, although not something I myself would want to have to try doing. :-)  I guess I just want to make sure that the changes I'm making in toolkit/components/telemetry/Makefile.in
, dom/bindings/Makefile.in and dom/bindings/mozwebidlcodegen/__init__.py are correct and aren't making builds any worse -- for example if touching an arbitrary .webidl file but not adding/removing [UseCounter] caused the whole world to rebuild that would be bad.
Attachment #8372751 - Flags: feedback?(gps)
(In reply to Boris Zbarsky [:bz] from comment #12)
> Oh, and the point is we can grab the CSV, via XHR or server-side, and munge
> it to produce that UI, I would think.
Yes, see telemetry.js bits, http://jonasfj.dk/blog/2014/01/custom-telemetry-dashboards/ it's quite easy to get raw data

Patches welcome re better UI :)

> Am I right in thinking that each time I call Telemetry::Accumulate(), one of two values (for the "true" nad "false" cases) will get incremented locally on the histogram object, and then when the daily telemetry ping is performed, those two numbers will be sent to the server, and then those two values are reset back to zero?

Sort of. Telemetry for the same session is designed to overwrite the previous submission. Each browser session starts with counters reset(eg exit to reset counters).
(In reply to Taras Glek (:taras) from comment #18)
> Sort of. Telemetry for the same session is designed to overwrite the
> previous submission. Each browser session starts with counters reset(eg exit
> to reset counters).

OK, so same net effect.  Great.  I think using Histograms here rather than writing up something custom with the simple measurements is going to be simpler.
So one question here is whether we are happy with measuring "per loaded content document (excluding things like XHR result documents)", or if we should aggregate all of the content documents loaded within a single window.  If I have a couple of <iframe>s in a page that use (or not) a counted feature, should we report true/false for each of those <iframe>s, plus the top level document, or just one true/false for the top level document that is a logical or of itself and the <iframe>s?
Can we get both?  ;)
Er yeah, that should be easy enough. :-)
Apart from documents which are in the docshell tree, and SVG documents loaded as images, are there any other documents we'd want to track?
Hrm.  It sort of depends on the feature.  For DOM APIs, we might care about XHR result documents...
Maybe we can account feature usage in XHR result documents and documents created with DOMImplementation.createDocument() to the document in the window that we created the XHR/Document from.
(In reply to Cameron McCormack (:heycam) from comment #25)
> Maybe we can account feature usage in XHR result documents and documents
> created with DOMImplementation.createDocument() to the document in the
> window that we created the XHR/Document from.

Turns out this will happen automatically, since the code in the bindings will grab the document from the global rather than the |this| value (which also is a document).
Attached patch WIP (v0.2) (obsolete) — Splinter Review
This patch now supports use counters for DOM methods, attributes and for CSS properties.  That's probably enough types of use counters to begin with.

We record both per-document usage and per-page document.  Per-page use counters represent whether a feature was used in the document itself, any of its children (like via <iframe>s) or resource documents (SVG-referenced things) or image documents that are SVG documents.

Generating the use counter histograms directly from Codegen.py turned out to be too much of a hassle in terms of getting the dependencies in the build system right.  Also there was no corresponding place from which to generate them for CSS properties.  So now there is a global use counter configuration file, dom/base/UseCounters.conf.  You still also need to add [UseCounter] to the corresponding interface member in the Web IDL file.  For properties, you make the CSS_PROP declaration take the CSS_PROPERTY_HAS_USE_COUNTER flag.

I considered using JSON as the format for UseCounters.conf, but I realised that JSON doesn't allow comments and I wanted to put some notes in that file to describe the format and remind people to add [UseCounter] in the right place.  So I've gone with a simple custom format.

gps, do you want to take a quick look at my Makefile.in changes to make sure they're sane?

bz, so far I've tested:

* SVG images in <img>, background-image, list-style-image, 'content'
* SVG resource documents, like |fill: url(otherdoc#something)|
* hierarchies of <iframe>s
* calling things on XHR/DOMParser-created documents

to make sure the right documents get the per-document/per-page use counters set, and that the Telemetry calls happen.  Any other interesting multiple document cases we need to make work?
Attachment #8372751 - Attachment is obsolete: true
Attachment #8372751 - Flags: feedback?(gps)
Attachment #8376953 - Flags: feedback?(gps)
Attachment #8376953 - Flags: feedback?(bzbarsky)
You can abuse Python's eval() or execsource() to convert a JSON file to a Python data structure. Then, add comments and trailing commas. It still quacks like JSON but it's really Python.

I'll try to do a full review shortly.
Yeah I suppose using a Python object for UseCounters.conf is another option.  I could change it if you really think it's preferable, but I think what I have now is fine for the moment.
Comment on attachment 8376953 [details] [diff] [review]
WIP (v0.2)

Review of attachment 8376953 [details] [diff] [review]:
-----------------------------------------------------------------

I'm not a huge fan of:

a) monolithic and centralized config files like UseCounters.conf
b) monolithic and centralized header files for derived data

On the surface "a" is a style nit. I just think things should be defined closer to their origin and sufficiently decoupled from the rest of the tree. i.e. if we remove feature X from toolkit/components/X I shouldn't need to update files in dom/base.

"b" is a performance concern. Because TelemetryHistogramEnums.h is included by a lot of translation units, a small change to that file results in massive recompilation. On my OS X build when that file changes, we recompile ~100 .o files (in unified mode) and that adds ~70s wall and ~7:50 CPU to builds.

"b" is the larger concern. More and more features and translation units will use Telemetry. This patch is proof :) Over time, the percentage of translation units including TelemetryHistogramEnums.h will approach 100%. Over time, the development cost to add/remove a histogram will approach the time for a clobber build. I don't believe this is something developers want to live with.

I argue it's easier to transition into multiple header files if the histograms are defined in separate config files (one .h per histogram file is a natural mapping). Therefore, I think pursuing separate config files today is worth doing, especially since it's marginal extra effort.

I would either define use counters inline in moz.build files:

   USE_COUNTERS['css_properties'] += ['fill']

Or have moz.build aggregate the set of files with use counters:

   USE_COUNTER_FILES += ['usecounters.conf']

We can then have the build backend emit a JSON (or other machine-readable) file describing what it sees and that file is consumed by gen-usecounters.py.

I'm not going to block r+ if you use a central config file. I do want people aware of the performance implications and the technical debt a unified config file / header entails. It'll be making incremental builds slower for everyone. I strongly encourage us to get off that path at our earliest convenience. Separate config files with the intention that they eventually result in multiple .h files is a good first step.

::: dom/base/Makefile.in
@@ +15,5 @@
>  endif
> +
> +usecounter_files = UseCounterList.h PropertyUseCounterMap.inc UseCounterHistograms.json
> +usecounter_conf_src = $(srcdir)/UseCounters.conf
> +usecounter_conf_dest = $(CURDIR)/UseCounters.conf

You'll want to add these generated files to GENERATED_FILES in moz.build.

You shouldn't need $(CURDIR) in there. All paths are relative from the objdir.

@@ +17,5 @@
> +usecounter_files = UseCounterList.h PropertyUseCounterMap.inc UseCounterHistograms.json
> +usecounter_conf_src = $(srcdir)/UseCounters.conf
> +usecounter_conf_dest = $(CURDIR)/UseCounters.conf
> +
> +$(usecounter_conf_dest) : $(usecounter_conf_src)

Nit: Cuddle colon to the left.

@@ +20,5 @@
> +
> +$(usecounter_conf_dest) : $(usecounter_conf_src)
> +	$(call py_action,preprocessor,$(DEFINES) $(ACDEFINES) $^ -o $@)
> +
> +$(usecounter_files) : $(srcdir)/gen-usecounters.py $(usecounter_conf_dest)

make will race here due to multiple outputs for one rule.

See https://www.gnu.org/software/automake/manual/html_node/Multiple-Outputs.html

(Yes, make is silly.)

::: dom/base/UseCounter.h
@@ +1,2 @@
> +#ifndef UseCounter_h_
> +#define UseCounter_h_

Need moar MPL.

::: dom/base/UseCounters.conf
@@ +16,5 @@
> +//
> +//   (d) one of three possible use counter declarations:
> +//
> +//         method <IDL interface name>.<IDL operation name>
> +//         attribute <IDL interface name>.<IDL attribute name>

It seems silly to me that we have to both put an attribute in a .webidl *and* update this file. Why can't we have the .webidl parser emit data of all discovered counters and have that output fed into gen-usecounters.py?

::: dom/base/gen-usecounters.py
@@ +1,2 @@
> +#!/usr/bin/env python
> +

Need moar MPL.

@@ +4,5 @@
> +import sys
> +
> +AUTOGENERATED_WARNING_COMMENT = "/* THIS FILE IS AUTOGENERATED BY gen-usercounters.py - DO NOT EDIT */"
> +
> +def read_conf(conf_filename):

Just pass in a file object. It's more versatile.

@@ +11,5 @@
> +    with open(conf_filename, 'r') as f:
> +        line_num = 0
> +        for line in f:
> +            line = line.rstrip('\n')
> +            line_num = line_num + 1

for line_num, line in enumerate(f):

@@ +12,5 @@
> +        line_num = 0
> +        for line in f:
> +            line = line.rstrip('\n')
> +            line_num = line_num + 1
> +            if line == '' or line.startswith('//'):

if not line or line.startswith('//')

@@ +17,5 @@
> +                # empty line or comment
> +                continue
> +            m = re.match(r'method ([A-Za-z0-9]+)\.([A-Za-z0-9]+)$', line)
> +            if m:
> +                (interface_name, method_name) = m.groups()

Nit: you don't need the parens.

@@ +20,5 @@
> +            if m:
> +                (interface_name, method_name) = m.groups()
> +                counters.append({ 'type': 'method',
> +                                  'interface_name': interface_name,
> +                                  'method_name': method_name })

Nit: cuddle braces

@@ +42,5 @@
> +
> +def generate_list(filename, counters):
> +    with open(filename, 'w') as f:
> +        def print_optional_macro_declare(name):
> +            print >>f, '''

Please use print() in all new Python so it is Python 3 compatible. (The print statement was removed in Python 3.)

That being said, people typically use f.write() or f.writeline().

@@ +63,5 @@
> +        print_optional_macro_declare('USE_COUNTER_DOM_METHOD')
> +        print_optional_macro_declare('USE_COUNTER_DOM_ATTRIBUTE')
> +        print_optional_macro_declare('USE_COUNTER_CSS_PROPERTY')
> +
> +        for counter in counters:

We strive for consistent and stable output from the build system. Any chance we could sort the counters? Although, I suppose if the order comes from the conf file, then that's as stable as we need.

@@ +102,5 @@
> +                items.append('''  "%s": {
> +    "expires_in_version": "never",
> +    "kind": "boolean",
> +    "description": "%s"
> +  }''' % (name, desc))

Python has a json module that will convert Python data structures to json and will handle Unicode properly. Please use it.

::: toolkit/components/telemetry/Makefile.in
@@ +22,5 @@
>  histoenums_TARGET := export
>  
>  include $(topsrcdir)/config/rules.mk
>  
> +histogram_files := $(srcdir)/Histograms.json $(DEPTH)/dom/base/UseCounterHistograms.json

This introduces a race condition in the build system between the processing of toolkit/components/telemetry/Makefile and dom/base/Makefile. You'll need to add a dependency to the bottom of /Makefile.in to prevent it.

@@ +39,5 @@
>  
> +$(histogram_enum_file): $(histogram_files) $(enum_python_deps)
> +	$(PYTHON) $(srcdir)/gen-histogram-enum.py $(histogram_files) > $@
> +$(histogram_data_file): $(histogram_files) $(data_python_deps)
> +	$(PYTHON) $(srcdir)/gen-histogram-data.py $(histogram_files) > $@

It's a best practice to have all our Python scripts that perform code generation to emit a .pp make file that includes extra dependencies. Since these .py files are self-contained and don't import other/shared modules, we should be fine listing dependencies in the Makefile.in. Be forewarned.

Also, I'm not sure why we have multiple scripts/actions for the same set of inputs. We should consolidate these scripts. Followup.
Attachment #8376953 - Flags: feedback?(gps) → feedback+
Blocks: 914360
(In reply to Gregory Szorc [:gps] from comment #31)
> a) monolithic and centralized config files like UseCounters.conf
> b) monolithic and centralized header files for derived data
> 
> On the surface "a" is a style nit. I just think things should be defined
> closer to their origin and sufficiently decoupled from the rest of the tree.
> i.e. if we remove feature X from toolkit/components/X I shouldn't need to
> update files in dom/base.

OK.  We could easily split the configuration file out into say dom/base/UseCounters.conf for all IDL attribute/operation use counters, layout/style/UseCounters.conf for CSS properties, and other files for new types of use counters that we come up with.

I feel like it is simpler for them to be defined in one place, though.

> "b" is a performance concern. Because TelemetryHistogramEnums.h is included
> by a lot of translation units, a small change to that file results in
> massive recompilation. On my OS X build when that file changes, we recompile
> ~100 .o files (in unified mode) and that adds ~70s wall and ~7:50 CPU to
> builds.
> 
> "b" is the larger concern. More and more features and translation units will
> use Telemetry. This patch is proof :) Over time, the percentage of
> translation units including TelemetryHistogramEnums.h will approach 100%.
> Over time, the development cost to add/remove a histogram will approach the
> time for a clobber build. I don't believe this is something developers want
> to live with.

I think adding and removing use counters will be infrequent enough that this doesn't really matter.

But: if we did want to solve this, we could make the enum a sized enum and then forward declare it.  In nearly all of the headers where it is included, we don't need to refer to specific values of the enum.

> I argue it's easier to transition into multiple header files if the
> histograms are defined in separate config files (one .h per histogram file
> is a natural mapping). Therefore, I think pursuing separate config files
> today is worth doing, especially since it's marginal extra effort.

How do we ensure that each histogram-enum-value-defining .h file uses unique values, unless we just have a master TelemetryHistogramEnum.h that #includes all the individually generated ones?  We're back to where we start then, though, where a change to any of the separate .h files will cause lots of things to rebuild.

> I would either define use counters inline in moz.build files:
> 
>    USE_COUNTERS['css_properties'] += ['fill']
>
> Or have moz.build aggregate the set of files with use counters:
> 
>    USE_COUNTER_FILES += ['usecounters.conf']
>
> We can then have the build backend emit a JSON (or other machine-readable)
> file describing what it sees and that file is consumed by gen-usecounters.py.
> 
> I'm not going to block r+ if you use a central config file. I do want people
> aware of the performance implications and the technical debt a unified
> config file / header entails. It'll be making incremental builds slower for
> everyone. I strongly encourage us to get off that path at our earliest
> convenience. Separate config files with the intention that they eventually
> result in multiple .h files is a good first step.

If we go with the forward-declared enum, there won't be any performance problems with a single config file / header.  Would that turn your "reluctant r+" into a "happy r+"?

> ::: dom/base/Makefile.in
> It seems silly to me that we have to both put an attribute in a .webidl
> *and* update this file. Why can't we have the .webidl parser emit data of
> all discovered counters and have that output fed into gen-usecounters.py?

That is what I had initially (well, generating a histogram json file for the use counters directly), but I found it difficult to get the build system rules right that would cause the Web IDL python generation scripts to generate the histogram json file which the histogram .h file generation rule could then depend on.  (Although below you talk about a race condition between toolkit/components/telemetry/Makefile and dom/base/Makefile which might have been the issue.)  It also resulted in a bunch of code added to the Web IDL python scripts, and the current UseCounter.conf-reading script is shorter.  I also realised that for other types of use counters, like properties, there wasn't a good place to declare them in existing files, and I wanted to have the declaration of use counters to be done in the same way regardless of what type they were.

> ::: dom/base/gen-usecounters.py
> @@ +63,5 @@
> > +        print_optional_macro_declare('USE_COUNTER_DOM_METHOD')
> > +        print_optional_macro_declare('USE_COUNTER_DOM_ATTRIBUTE')
> > +        print_optional_macro_declare('USE_COUNTER_CSS_PROPERTY')
> > +
> > +        for counter in counters:
> 
> We strive for consistent and stable output from the build system. Any chance
> we could sort the counters? Although, I suppose if the order comes from the
> conf file, then that's as stable as we need.

It does come from the conf file.  I don't think sorting gains us any stability.

> @@ +102,5 @@
> > +                items.append('''  "%s": {
> > +    "expires_in_version": "never",
> > +    "kind": "boolean",
> > +    "description": "%s"
> > +  }''' % (name, desc))
> 
> Python has a json module that will convert Python data structures to json
> and will handle Unicode properly. Please use it.
> 
> ::: toolkit/components/telemetry/Makefile.in
> @@ +22,5 @@
> >  histoenums_TARGET := export
> >  
> >  include $(topsrcdir)/config/rules.mk
> >  
> > +histogram_files := $(srcdir)/Histograms.json $(DEPTH)/dom/base/UseCounterHistograms.json
> 
> This introduces a race condition in the build system between the processing
> of toolkit/components/telemetry/Makefile and dom/base/Makefile. You'll need
> to add a dependency to the bottom of /Makefile.in to prevent it.

Could you explain how to do this please?
Flags: needinfo?(gps)
(In reply to Cameron McCormack (:heycam) from comment #32)
> (In reply to Gregory Szorc [:gps] from comment #31)
> > "b" is a performance concern. Because TelemetryHistogramEnums.h is included
> > by a lot of translation units, a small change to that file results in
> > massive recompilation. On my OS X build when that file changes, we recompile
> > ~100 .o files (in unified mode) and that adds ~70s wall and ~7:50 CPU to
> > builds.
> > 
> > "b" is the larger concern. More and more features and translation units will
> > use Telemetry. This patch is proof :) Over time, the percentage of
> > translation units including TelemetryHistogramEnums.h will approach 100%.
> > Over time, the development cost to add/remove a histogram will approach the
> > time for a clobber build. I don't believe this is something developers want
> > to live with.
> 
> I think adding and removing use counters will be infrequent enough that this
> doesn't really matter.

 $ hg log toolkit/components/telemetry/Histograms.json --template '{date(firstpushdate, "%Y-%m")}\n' | sort | uniq -c
   5 2012-08
  12 2012-09
  17 2012-10
  20 2012-11
  10 2012-12
  16 2013-01
  11 2013-02
  24 2013-03
  15 2013-04
   5 2013-05
  11 2013-06
  24 2013-07
  13 2013-08
  15 2013-09
  16 2013-10
  43 2013-11
  15 2013-12
  28 2014-01
  16 2014-02

0.5 to ~1.5 average changes per day is not infrequent. This is probably masked by other changes to widely-used headers. But that shouldn't be an excuse to contribute to the problem.
 
> But: if we did want to solve this, we could make the enum a sized enum and
> then forward declare it.  In nearly all of the headers where it is included,
> we don't need to refer to specific values of the enum.

Interesting idea.
 
> > I argue it's easier to transition into multiple header files if the
> > histograms are defined in separate config files (one .h per histogram file
> > is a natural mapping). Therefore, I think pursuing separate config files
> > today is worth doing, especially since it's marginal extra effort.
> 
> How do we ensure that each histogram-enum-value-defining .h file uses unique
> values, unless we just have a master TelemetryHistogramEnum.h that #includes
> all the individually generated ones?  We're back to where we start then,
> though, where a change to any of the separate .h files will cause lots of
> things to rebuild.

I'm not sure how histogram enums are submitted as part of Telemetry. Would it be possible to define an individual histogram not as a single integer but as a tuple of (namespace, enum)? How does Telemetry know that HistoX from Nightly is the same as HistoX from Aurora if HistoX is represented by "42?"

> > I'm not going to block r+ if you use a central config file. I do want people
> > aware of the performance implications and the technical debt a unified
> > config file / header entails. It'll be making incremental builds slower for
> > everyone. I strongly encourage us to get off that path at our earliest
> > convenience. Separate config files with the intention that they eventually
> > result in multiple .h files is a good first step.
> 
> If we go with the forward-declared enum, there won't be any performance
> problems with a single config file / header.  Would that turn your
> "reluctant r+" into a "happy r+"?

Yes.

I'd still like multiple files because I think it is architecturally better (and results in fewer merge conflicts). But performance is my major concern.

> > ::: dom/base/Makefile.in
> > It seems silly to me that we have to both put an attribute in a .webidl
> > *and* update this file. Why can't we have the .webidl parser emit data of
> > all discovered counters and have that output fed into gen-usecounters.py?
> 
> That is what I had initially (well, generating a histogram json file for the
> use counters directly), but I found it difficult to get the build system
> rules right that would cause the Web IDL python generation scripts to
> generate the histogram json file which the histogram .h file generation rule
> could then depend on.  (Although below you talk about a race condition
> between toolkit/components/telemetry/Makefile and dom/base/Makefile which
> might have been the issue.)  It also resulted in a bunch of code added to
> the Web IDL python scripts, and the current UseCounter.conf-reading script
> is shorter.  I also realised that for other types of use counters, like
> properties, there wasn't a good place to declare them in existing files, and
> I wanted to have the declaration of use counters to be done in the same way
> regardless of what type they were.

As long as we're using recursive make, toolkit/components/telemetry and dom/base will not directly contain cross-directory dependencies because that's how recursive make works. The way you encode dependencies is by directory traversal order in the root Makefile.in. See reply below.

You already have a separate mechanism for use counters from Telemetry Histograms. I don't think it's a huge deal to have a separate mechanism for WebIDL and CSS use counters. Think of it as putting in a little extra effort now to make the lives of WebIDL developers easier going forward. I can't say with certainty if that trade-off is worth it!
and will handle Unicode properly. Please use it.

> > 
> > ::: toolkit/components/telemetry/Makefile.in
> > @@ +22,5 @@
> > >  histoenums_TARGET := export
> > >  
> > >  include $(topsrcdir)/config/rules.mk
> > >  
> > > +histogram_files := $(srcdir)/Histograms.json $(DEPTH)/dom/base/UseCounterHistograms.json
> > 
> > This introduces a race condition in the build system between the processing
> > of toolkit/components/telemetry/Makefile and dom/base/Makefile. You'll need
> > to add a dependency to the bottom of /Makefile.in to prevent it.
> 
> Could you explain how to do this please?

The following will ensure that |make -C toolkit/components/telemetry export| occurs after |make -C dom/base export| and |make -C dom/bindings/export|:

ifdef MOZ_PSEUDO_DERECURSE
toolkit/components/telemetry/export: dom/base/export dom/bindings/export
endif
Flags: needinfo?(gps)
(In reply to Gregory Szorc [:gps] from comment #33)
> 0.5 to ~1.5 average changes per day is not infrequent. This is probably
> masked by other changes to widely-used headers. But that shouldn't be an
> excuse to contribute to the problem.

I was thinking of use counters rather than telemetry histograms more generally.  I don't think people will need to be adding/removing use counters to measure the usage of Web platform features as frequently as these changes to TelemetryHistograms.json.

> I'm not sure how histogram enums are submitted as part of Telemetry. Would
> it be possible to define an individual histogram not as a single integer but
> as a tuple of (namespace, enum)? How does Telemetry know that HistoX from
> Nightly is the same as HistoX from Aurora if HistoX is represented by "42?"

I have no idea really.

> > If we go with the forward-declared enum, there won't be any performance
> > problems with a single config file / header.  Would that turn your
> > "reluctant r+" into a "happy r+"?
> 
> Yes.

OK.  I've filed bug 977430 for that.  I'll get your feedback on that once the patch is up.

> You already have a separate mechanism for use counters from Telemetry
> Histograms. I don't think it's a huge deal to have a separate mechanism for
> WebIDL and CSS use counters. Think of it as putting in a little extra effort
> now to make the lives of WebIDL developers easier going forward. I can't say
> with certainty if that trade-off is worth it!

OK.  I'll go back to using the [UseCounter] extended attribute in .webidl files as the key to generating DOM use counters, and having a separate .conf file for CSS properties.

> > Could you explain how to do this please?
> 
> The following will ensure that |make -C toolkit/components/telemetry export|
> occurs after |make -C dom/base export| and |make -C dom/bindings/export|:
> 
> ifdef MOZ_PSEUDO_DERECURSE
> toolkit/components/telemetry/export: dom/base/export dom/bindings/export
> endif

Great, thanks!
I'm thinking some more.  If we generate the use counter enum and the telemetry histogram json files based on [UseCounter] in .webidl files, and given that that would happens as part of Codegen.py, I think any change to any .webidl file would cause the enum/json files to be regenerated.  Unless we add the ability to invoke Codegen.py more than once, so not just from mozwebidlcodegen/__init__.py's generate_build_files.  That would mean we have to parse all the .webidl files a second time though.

Regarding splitting the .conf files into separate directories, and whether that can help avoid an enum header that causes various files to be rebuilt whenever a use counter is added/removed, I don't think we can.  nsIDocument needs to know at compile time the total number of use counters N, and we need each use counter to have a unique value less than N.  So using an enum is really the only thing that makes sense.  Unless we want to use strings or something else to represent use counters IDs, and have a lookup table in nsIDocument.  But I don't think we want to pay that performance cost, since we want to keep the function that records a use counter as fast as possible.

So although bug 977430 will reduce the number of files that get recompiled when use counters change and hence when telemetry IDs change, by not including the enum file in headers, every cpp file that does telemetry calls will get rebuilt when a use counter gets added/removed.  If as you say, eventually very many files will have telemetry calls, then we'll be rebuilding a lot.  I don't see a good way around this, though.

Happy to hear your further thoughts.
Flags: needinfo?(gps)
I'm sorry for the lag here....

Still reading through the patch, but why are we adding the new document observer notification?  Do we have some code that does something useful with it?  If not, what are the planned uses?
Flags: needinfo?(cam)
(In reply to Boris Zbarsky [:bz] (reviews will be slow; ask someone else) from comment #36)
> Still reading through the patch, but why are we adding the new document
> observer notification?  Do we have some code that does something useful with
> it?  If not, what are the planned uses?

Yes, I'm using it in image/src/VectorImage.cpp to allow outer documents to listen for use counters that are incremented by SVG documents loaded as images.  nsDocument::NotifyUseCounterSet is where I'm notifying the observers from.
Flags: needinfo?(cam)
>Would it be possible to define an individual histogram not as a single integer but as a tuple
> of (namespace, enum)? How does Telemetry know that HistoX from Nightly is the same as HistoX
> from Aurora if HistoX is represented by "42?"

Jumping in late here: in the telemetry payload must not use opaque integer enum values but should use string keys. Currently there is no way to represent a single histogram with an extra "key" value, but that doesn't mean we couldn't build something like that if it were important. Note that we'd still want to declare probe expiration (and in the future, probe sampling rate) separately for each measurement.

I'm not sure I understand how the .webidl annotations fit into this bug. Not all of these measurements are generated through webIDL, right? Can we separate out the core "measure usage by document" and "automatically measure usage in webidl" bits to make this easier to understand? In particular (for example) it would have been nice to use this mechanism for bug 851917 instead of the hand-rolled system.
Comment on attachment 8376953 [details] [diff] [review]
WIP (v0.2)

> to allow outer documents to listen for use counters that are incremented by SVG documents
> loaded as images.

Ah, I see.  It seems like a lot of machinery to get this data, but I'm not seeing a better way of doing it so far.  :(

It might be good to break out this boilerplate into a separate patch.  Maybe two patches: one to add the document observer, and one to add all the imagelib bits.

I'd like to understand the difference between the SetDocumentUseCounter and SetPageUseCounter APIs.  Also, whether we need to keep the separate bitsets for "page counters" and "document counters".

SetPageUseCounter shouldn't go through a presshell.  You can get documents from a docshell directly via do_GetInterface.

MappedAttrParser probably doesn't need an mNodePrincipal member anymore; it can just get it off the element.

For codegen, I'm not sure CGPerSignatureCall is the right place to put this code.  It's convenient because it covers all the callers, but it also means a copy of the code for every overload... and you're having to change all the callers anyway to pass in the useCounterName.  So it might be better to do it in things like CGAbstractStaticBindingMethod and CGSpecializedGetter/Setter/Method.

>+        # XXX Why do I get two "URL" interface descriptors?

One's a worker descriptor.  Since we don't particularly support use counters in workers so far with this setup, you could just filter those out in your config.getDescriptors() call.

That said, getMembersWithUseCounters seems to be unused.

No need to add "UseCounter" to the trailing bit in IDLMethod, since you're already checking for it earlier.  Though I'm not sure what I think of "isSpecial" as a name...

The global enum thing is a bit annoying, but I have no better ideas.  :(
Attachment #8376953 - Flags: feedback?(bzbarsky) → feedback+
Oh, and the actual question you asked, about interesting document cases.  XBL documents are worth testing (e.g. for the CSS use counters, in binding sheets).  So are documents created via DOMImplementation, though I expect these to act like DOMParser.  

Another thing: chrome vs non-chrome.  The patch here only sends telemetry for the latter, but it might be good to do it for both, with different names so we can catch extension usage too?
(In reply to Cameron McCormack (:heycam) from comment #35)
> I'm thinking some more.  If we generate the use counter enum and the
> telemetry histogram json files based on [UseCounter] in .webidl files, and
> given that that would happens as part of Codegen.py, I think any change to
> any .webidl file would cause the enum/json files to be regenerated.  Unless
> we add the ability to invoke Codegen.py more than once, so not just from
> mozwebidlcodegen/__init__.py's generate_build_files.  That would mean we
> have to parse all the .webidl files a second time though.

We definitely do not want to parse the .webidl files multiple times in a build. We invested a lot of effort into making WebIDLs build the way they do now because of performance and build dependency reasons.

> Regarding splitting the .conf files into separate directories, and whether
> that can help avoid an enum header that causes various files to be rebuilt
> whenever a use counter is added/removed, I don't think we can.  nsIDocument
> needs to know at compile time the total number of use counters N, and we
> need each use counter to have a unique value less than N.  So using an enum
> is really the only thing that makes sense.  Unless we want to use strings or
> something else to represent use counters IDs, and have a lookup table in
> nsIDocument.  But I don't think we want to pay that performance cost, since
> we want to keep the function that records a use counter as fast as possible.
>
> So although bug 977430 will reduce the number of files that get recompiled
> when use counters change and hence when telemetry IDs change, by not
> including the enum file in headers, every cpp file that does telemetry calls
> will get rebuilt when a use counter gets added/removed.  If as you say,
> eventually very many files will have telemetry calls, then we'll be
> rebuilding a lot.  I don't see a good way around this, though.
> 
> Happy to hear your further thoughts.

Is there any way we can have multiple "namespaces" for use counters and histograms? By having a single shared namespace/array/list of integer enumerations, we've painted ourselves into this corner of any-changes-to-the-global-list-require-recompilations-to-the-world.

What I'd like to see is separate enumerations for each domain/namespace. Each component will only know about the enumerations for its subset of the world. There would be a single component (a registrar if you will) that knows how to map each tuple of (namespace, local enumeration) to the master integer list (if such a unified list even exists).

For example, instead of:

   void Accumulate(ID id, uint32_t sample);

we have

   void Accumulate(ID ns, ID, id, uint32_t sample);

Each histogram is annotated with its "namespace." When enumerations are generated, we group all histograms by their namespace. Say we have the "dom" and "http" namespaces. We'd write out multiple .h files: TelemetryEnumsDom.h and TelemetryEnumsHttp.h. .cpp files would only include the .h containing the enumerations they needed. When a histogram is recorded, we'd do something like:

  Accumulate(TELEMETRY_NS_HTTP, TELEMETRY_HTTP_PAGE_CACHE_READ_TIME, value);

When telemetry values are aggregated, the internal data structure is a double hashmap instead of a single (namespace is outer, enum is inner). Full knowledge of all enumerations for all namespaces would only need to live inside the Telemetry aggregator.

I imagine converting all existing Telemetry to use namespaces would be a big effort. I think we could implement use counters initially as a one-off and over time convert existing Telemetry code to use the new, namespace-aware APIs, slowly improving include hell in the process.

Does this idea have legs?
(In reply to Gregory Szorc [:gps] from comment #41)
> For example, instead of:
> 
>    void Accumulate(ID id, uint32_t sample);
> 
> we have
> 
>    void Accumulate(ID ns, ID, id, uint32_t sample);

Doing a lookup based on integer (namespace, id) pairs is probably acceptable in terms of performance, compared to using strings.

The signature would need to be something like:

  void Accumulate(Telemetry::Namespace ns, uint32_t id, uint32_t sample);

where Telemetry::Namespace is as enum representing the domains/namespaces.  The id argument can't be an enum any more, assuming they have overlapping values across different namespaces.  So we lose some type safety there.

Maybe we could have separate Accumulate functions per namespace.

  void AccumulateHTTP(Telemetry::HTTPID id, uint32_t sample);
  void AccumulateDOM(Telemtry::DOMID id, uint32_t sample);

That'd let us keep the type safety.

So the net result would be that if we add/remove a namespace, everything that uses telemetry gets rebuilt.  If you add/remove an individual ID within a namespace, everything within that namespace that uses telemetry.

Still, all of this improvement seems orthogonal to the use counter stuff I really want to implement in this bug.

Anyway, I am not a Telemetry peer.  (Oh, looks like there's no Telemetry module.)  Vladan do you have a view on this?
Flags: needinfo?(vdjeric)
(Or Taras?)
(In reply to Cameron McCormack (:heycam) from comment #42)
> Still, all of this improvement seems orthogonal to the use counter stuff I
> really want to implement in this bug.

Sorry about that. This trap has been in place for months. You're just the unlucky person to step into it first.

If use counters is super high importance, feel free to call stop energy on me. I'm just hoping we can get a scalable solution for new implementations without too much additional effort.
Flags: needinfo?(gps)
No longer blocks: 914360
See Also: → 914360
Keywords: dev-doc-needed
OS: Mac OS X → All
Hardware: x86 → All
Clearing old needinfo here. 

The Telemetry build improvements are unlikely to happen this quarter unless someone outside my team steps up. Converting all the existing Telemetry users is going to be a special pain.

Use-counters seem like a decent idea although I think they can be implemented using existing Telemetry semantics, e.g. to track number of pages containing a Flash object you can use a boolean histogram and accumulate a "true" value every time you load a page with Flash and "false" every time you load a page without Flash. For "number of times a condition X is met on a page", a linear or exponential histogram would suffice. I don't think histograms need to be "reset" as long as you accumulate a zero value when a page does not meet condition X.
The current Telemetry dashboard (and other Telemetry tools) would show the aggregate results without needing any changes.

Anyway, we can revive this discussion if there's still interest in adding new semantics to Telemetry
Flags: needinfo?(vdjeric)
I don't think rewriting the Telemetry world just to get this feature in is worth it.

There's ~150 files that #include "mozilla/Telemetry.h".  (For comparison, a crude "find . -name \*.cpp|wc -l" says we have 6000+ .cpp files in the tree, and a crude "find . -name 'Unified*.cpp|wc -l" in the objdir says we compile ~470 unified files.)  Most of those are C++ files, and eyeballing it, it looks like we might only rebuild ~30-40 files when Telemetry.h changes.  Only 7 of them are headers, according to emacs's grep:

./security/manager/boot/src/RootCertificateTelemetryUtils.h:10:#include "mozilla/Telemetry.h"
./intl/uconv/ucvja/nsJapaneseToUnicode.h:8:#include "mozilla/Telemetry.h"
./image/src/Decoder.h:16:#include "mozilla/Telemetry.h"
./netwerk/cache/nsCacheService.h:25:#include "mozilla/Telemetry.h"
./toolkit/components/places/Helpers.h:18:#include "mozilla/Telemetry.h"
./dom/storage/DOMStorageCache.h:16:#include "mozilla/Telemetry.h"
./accessible/base/Statistics.h:10:#include "mozilla/Telemetry.h"

None of these headers look like "hub" headers that would cause the world to get recompiled.  Virtually all of these include-Telemetry.h-in-headers could be eliminated with forward enum declarations (which it looks like all of our supported compilers support now?).

Our WebIDL implementation already requires rebuilding quite a bit when things change, our IPDL situation is somewhat similar, and there are other problems with inappropriate #includes elsewhere in the tree (I think touching files in gfx/ requires rebuilding things in netwerk/, for instance).  It's not worth holding up this feature just because we fear some unspecified amount of rebuilding when things change.

Cameron, are you able to push forward on this this quarter?  Or do you mind if I try to kick it forward a little bit?
Flags: needinfo?(cam)
Forward enums were indeed lacking when I wrote the patches and were the cause of most of the Telemetry.h #includes.

This is not in my Q1 deliverables so I can't say I'll get time to work on it soon.  Please feel free to move forward with these patches.

Though I spoke with Ehsan in Portland and he was hoping that I could rework these patches on top of FHR rather than Telemetry.
Flags: needinfo?(cam)
This updates Cameron's patch to compile with something resembling current
trunk.  I don't think his previous patch addressed review comments, and this
one doesn't either; I'm leaving that part for another day.
Attachment #8376953 - Attachment is obsolete: true
Can we please hook this up to nsIDocument::WarnOnceAbout. Those are all features that we're trying to deprecate, so are all highly relevant to know how commonly they are used.
(In reply to Jonas Sicking (:sicking) from comment #49)
> Can we please hook this up to nsIDocument::WarnOnceAbout. Those are all
> features that we're trying to deprecate, so are all highly relevant to know
> how commonly they are used.

For avoidance of doubt, this is just another way of saying, "please make sure we have use counters for the features that hit nsIDocument::WarnOnceAbout"?
Flags: needinfo?(jonas)
Sort of.

Though I think the simplest way to ensure that, both now and going forward, is to add code to WarnOnceAbout to make sure that that function automatically logs features. Otherwise we might in the future add something to WarnOnceAbout and forget to add a [UseCounter] for that feature. There's also code that calls into WarnOnceAbout which doesn't have corresponding WebIDL, such as addEventListener("DOMAttrModified", myHandler, false);
Flags: needinfo?(jonas)
Conveniently, WarnOnceAbout takes an enum identifying the particular feature, and they are all defined in nsDeprecatedOperationList.h and nsDocumentWarningList.h.  So it shouldn't be hard to include them in the UseCounter.h to define them automatically.
I went and split these patches up along lines that made sense to me.  Sadly,
the main patch (part 3) doesn't look easily separatable.
We'll need access to the element, and to its OwnerDoc, once we hook up
use counters.
(In reply to Boris Zbarsky [:bz] from comment #39)
> Comment on attachment 8376953 [details] [diff] [review]
> > to allow outer documents to listen for use counters that are incremented by SVG documents
> > loaded as images.
> 
> Ah, I see.  It seems like a lot of machinery to get this data, but I'm not
> seeing a better way of doing it so far.  :(
> 
> I'd like to understand the difference between the SetDocumentUseCounter and
> SetPageUseCounter APIs.  Also, whether we need to keep the separate bitsets
> for "page counters" and "document counters".

Cameron, can you comment on why you did this?  I don't know enough about how our documents work to understand the distinction myself.
Flags: needinfo?(cam)
I thought it might be useful to measure the usage of counted features in a whole tab (so, the top level document and any iframes / SVG image documents referenced by them; these are the "page counters") versus in a single document.
Flags: needinfo?(cam)
Forgive my ignorance, but if we didn't have the split between document/pages in this series, iframe / SVG image documents would get counted normally, correct?  So we're only splitting things up to try to differentiate between top-level documents (main pages of a site) and iframes (ads?), right?
Flags: needinfo?(cam)
Yes if we didn't have this split between document/page counters then they would be all document counters, and iframe/SVG image documents would have their own counters.

My thinking for the split was that if you had a top level HTML document that had, say, five external SVG images that it references, and we're use-counting an SVG feature that is used by all of those images, then it might be better to think of that as one "page" that uses the feature.  (One "page" would break if we removed that feature.)
Flags: needinfo?(cam)
Thanks for the explanation.

bz, since you were the one who originally asked about the difference, does Cameron's rationale seem reasonable?
Flags: needinfo?(bzbarsky)
That seems fine if we document it in the code, yes.
Flags: needinfo?(bzbarsky)
This patch adds UseCounter.conf for defining use counters along with
infrastructure to generate enums and telemetry identifiers from
UseCounter.conf.

This patch should get DOM review, too, but I'm not doing that quite yet, mostly
because the currently interesting bits are the build and telemetry bits, and
I'd like to get those squared away first.

I realized that it's probably better, rather than generating the intermediate
UseCounters.json file, to have all the Telemetry machinery know how to parse
UseCounters.conf and extract the necessary information from that.  I'm not
quite sure where to put that; can we throw utility code like that into
python/mozbuild/mozbuild/action/, or should it go someplace else?  (Doing this
would make it easier to port things to moz.build's GENERATED_FILES mechanism,
too.)
Attachment #8565956 - Attachment is obsolete: true
Attachment #8570542 - Flags: feedback?(mshal)
Attachment #8570542 - Flags: feedback?(gfritzsche)
We'll need access to the element, and to its OwnerDoc, once we hook up
use counters.
Attachment #8565957 - Attachment is obsolete: true
This patch adds everything that documents need to store use counters and
propagating use counters to telemetry.  It also adds bits for enabling
parent documents to know about use counters their children use.  The
image observation bits are so SVG documents loaded as images can also
pass use counters to their parent document.

Seth, how do the imagelib bits look to you?
Attachment #8565958 - Attachment is obsolete: true
Attachment #8570545 - Flags: feedback?(seth)
Last piece.  I looked into addressing bz's comment 39, but CGPerSignatureCall
really does look like the right place to put all this, since that's where all
the actual getter/setter/method bodies get generated:
CGSpecialized{Getter,Setter,Method} wind up going through CGPerSignatureCall.
Are these user counters per *document* or per *user* We can do per-user metrics on the telemetry server now using just histograms, so I think we'd only need this additional client side data for per-document metrics.
Comment on attachment 8570545 [details] [diff] [review]
part 3 - add infrastructure for reporting use counters through documents

Review of attachment 8570545 [details] [diff] [review]:
-----------------------------------------------------------------

This patch looks good in many respects but I think it's a lot more complex than is really necessary to add this functionality. Some of this complexity is going to add a significant boilerplate and maintenance burden down the line.

The most important changes I think are needed are:

(1) Get rid of ReportUseCounters() totally and update the telemetry eagerly.
(2) Remove imgINotificationObserver::NotifyDocumentUseCounterSet() and rethink that aspect of the design. I've proposed an alternative below.

I didn't look at every file because I think these issues need to be addressed before anything else.

::: dom/base/nsDocument.cpp
@@ +12961,5 @@
> +  NS_DOCUMENT_NOTIFY_OBSERVERS(DocumentUseCounterSet, (this, aUseCounter));
> +}
> +
> +void
> +nsDocument::ReportUseCounters()

A lot of complexity elsewhere in the patch stems from the fact that we don't just report the telemetry for use counters eagerly. We should just report it when we send the notifications. I don't think we gain much from waiting. (In particular, it's not obvious to me that there's any performance benefit to waiting.)

::: image/public/imgINotificationObserver.idl
@@ +55,5 @@
>  
>    [noscript] void notify(in imgIRequest aProxy, in long aType,
>                           [const] in nsIntRect aRect);
> +  [noscript] void notifyDocumentUseCounterSet(in imgIRequest aProxy,
> +                                              in long aUseCounter);

Please no.

This incurs a substantial boilerplate overhead everywhere that uses images, and it's not a good design for a stateless notification.

I'd suggest just adding a new notification type (e.g. DOCUMENT_USE_COUNTER_SET) and a new method to imgIContainer, with the following protocol:

(1) Listener::Notify() called with DOCUMENT_USE_COUNTER_SET as the type.
(2) Listener calls imgIContainer::PropagateUseCounters(nsIDocument*).
(3) PropagateUseCounters() applies the use counters from the SVG-as-image document to the passed-in document.

That avoids the necessity to add a new notification method to imgINotificationObserver. It also means that new documents can get previously fired document use counters by calling PropagateUseCounters() when they get SIZE_AVAILABLE, which is *crucial* as images are shared between documents.

::: image/src/SVGDocumentWrapper.cpp
@@ +263,5 @@
>        nsSVGEffects::RemoveAllRenderingObservers(svgElem);
>      }
>  
> +    // Tell the VectorImage to drop its remaining document observer
> +    mVectorImage->CancelUseCounterListener();

I'm not a fan of this. Especially since we need a weak pointer to the owning VectorImage just to support this call.

Do we know that this is actually needed?

Note that SVGDocumentWrapper probably shouldn't be a shutdown observer at all; now that I've noticed that, I'm going to try to rip it out.

::: image/src/VectorImage.cpp
@@ +1103,5 @@
>    }
>  }
>  
>  void
> +VectorImage::CancelInitialListeners()

Maybe call this something like CancelLoadListeners().

@@ +1185,5 @@
>  
> +void
> +VectorImage::OnDocumentUseCounterSet(int32_t aUseCounter)
> +{
> +  if (mProgressTracker) {

mProgressTracker is never null anymore. Feel free to MOZ_ASSERT if you want to be paranoid, but it does not need a runtime check.

::: layout/generic/nsBulletFrame.cpp
@@ +807,5 @@
> +                                           int32_t aUseCounter)
> +{
> +  nsIDocument* doc = PresContext()->Document();
> +  if (doc) {
> +    // doc->SetPageUseCounter(static_cast<UseCounter>(aUseCounter));

Should this be commented out?

I see that happened in nsImageFrame, too... It'd be good to audit all of the callers.
Attachment #8570545 - Flags: feedback?(seth) → feedback-
Cameron, you've heard most of my feedback on IRC already, but I'd be interested in anything you have to add as the original author of the patch. In particular, I'm interested in what you think about the alternative approach to notifications I proposed.
Flags: needinfo?(cam)
Comment on attachment 8570542 [details] [diff] [review]
part 1 - add infrastructure for defining use counters from UseCounters.conf

> I realized that it's probably better, rather than generating the intermediate
> UseCounters.json file, to have all the Telemetry machinery know how to parse
> UseCounters.conf and extract the necessary information from that.  I'm not
> quite sure where to put that; can we throw utility code like that into
> python/mozbuild/mozbuild/action/, or should it go someplace else?  (Doing this
> would make it easier to port things to moz.build's GENERATED_FILES mechanism,
> too.)

gps is much better suited to answer these kinds of architectural questions. Plus it looks like he already has a lot of context on this particular bug.

>+usecounter_files := UseCounterList.h PropertyUseCounterMap.inc UseCounterHistograms.json
>+$(usecounter_files): $(srcdir)/gen-usecounters.py $(usecounter_conf_dest)
>+	$(PYTHON) $(srcdir)/gen-usecounters.py $(usecounter_conf_dest)

Although he mentioned this earlier, I wanted to re-highlight the fact that listing multiple targets in make probably isn't going to do what you want.
Attachment #8570542 - Flags: feedback?(mshal) → feedback?(gps)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #67)
> Are these user counters per *document* or per *user* We can do per-user
> metrics on the telemetry server now using just histograms, so I think we'd
> only need this additional client side data for per-document metrics.

These are per-document (I'm not sure if "user counters" is a typo or a genuine misunderstanding, so I'm not quite sure how to respond to this).
Flags: needinfo?(benjamin)
Attachment #8565959 - Attachment is obsolete: true
Typo. Are per-document counters really valuable enough to spend this effort, now that we have per-user counters already?
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #72)
> Typo. Are per-document counters really valuable enough to spend this effort,
> now that we have per-user counters already?

I'm not sure I follow.  We don't have anything measuring the usage of individual DOM features on webpages at all, as far as I know.  So I don't know what the "per-user counters" we already have is supposed to refer to.  Is this the discussion in comment 57 and following, or something else?
Flags: needinfo?(benjamin)
I was going to suggest just using histograms. It seems that there are two bits of infrastructure here:

A. Adding magic to webidl to automatically measure certain things
B. Adding magic to documents so that we can count usage per-document, instead of globally.

I'm asking whether B is necessary.
Flags: needinfo?(benjamin)
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #74)
> I was going to suggest just using histograms. It seems that there are two
> bits of infrastructure here:
> 
> A. Adding magic to webidl to automatically measure certain things
> B. Adding magic to documents so that we can count usage per-document,
> instead of globally.
> 
> I'm asking whether B is necessary.

Ah, I think I see now, thanks for the explanation.

comment 60, I think, addresses this.  We could make all the counters global, but then we might see that lots of webpages are using one feature, when in reality, it's one (top-level) webpage with lots of (say) iframe ads.  That seems like a helpful distinction to have.  I don't really know if this distinction will prove to be true in practice, but if we collect a quarter or two of data and determine that there's no different between the page/document histograms, then we can turn one off.

(I think we still need bits in nsDocument &co to track the did-I-use-this-thing state properly, and adding the page/document split to that doesn't seem like a large amount of code.)
(In reply to Seth Fowler [:seth] from comment #68)
> This patch looks good in many respects but I think it's a lot more complex
> than is really necessary to add this functionality. Some of this complexity
> is going to add a significant boilerplate and maintenance burden down the
> line.
> 
> The most important changes I think are needed are:
> 
> (1) Get rid of ReportUseCounters() totally and update the telemetry eagerly.
> (2) Remove imgINotificationObserver::NotifyDocumentUseCounterSet() and
> rethink that aspect of the design. I've proposed an alternative below.

Thanks for the feedback.

> ::: image/public/imgINotificationObserver.idl
> @@ +55,5 @@
> >  
> >    [noscript] void notify(in imgIRequest aProxy, in long aType,
> >                           [const] in nsIntRect aRect);
> > +  [noscript] void notifyDocumentUseCounterSet(in imgIRequest aProxy,
> > +                                              in long aUseCounter);
> 
> Please no.
> 
> This incurs a substantial boilerplate overhead everywhere that uses images,
> and it's not a good design for a stateless notification.
> 
> I'd suggest just adding a new notification type (e.g.
> DOCUMENT_USE_COUNTER_SET) and a new method to imgIContainer, with the
> following protocol:
> 
> (1) Listener::Notify() called with DOCUMENT_USE_COUNTER_SET as the type.
> (2) Listener calls imgIContainer::PropagateUseCounters(nsIDocument*).
> (3) PropagateUseCounters() applies the use counters from the SVG-as-image
> document to the passed-in document.
> 
> That avoids the necessity to add a new notification method to
> imgINotificationObserver. It also means that new documents can get
> previously fired document use counters by calling PropagateUseCounters()
> when they get SIZE_AVAILABLE, which is *crucial* as images are shared
> between documents.

OK, so the new notification type is, I suppose, easy, but I'm less clear on the subsequent parts.  What is |Listener| in the above sequence--the generic name for a concrete class for imgINotificationObserver?  I guess PropagateUseCounters is the new method you're suggesting to add on imgIContainer?

I also don't understand the last sentence and why that would be so important.

> ::: image/src/SVGDocumentWrapper.cpp
> @@ +263,5 @@
> >        nsSVGEffects::RemoveAllRenderingObservers(svgElem);
> >      }
> >  
> > +    // Tell the VectorImage to drop its remaining document observer
> > +    mVectorImage->CancelUseCounterListener();
> 
> I'm not a fan of this. Especially since we need a weak pointer to the owning
> VectorImage just to support this call.
> 
> Do we know that this is actually needed?

I do not know, though...

> Note that SVGDocumentWrapper probably shouldn't be a shutdown observer at
> all; now that I've noticed that, I'm going to try to rip it out.

...yes, that *does* seem a bit odd.

ni? for my confusion in understanding the new notification protocol.
Flags: needinfo?(seth)
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #76)
> > I'd suggest just adding a new notification type (e.g.
> > DOCUMENT_USE_COUNTER_SET) and a new method to imgIContainer, with the
> > following protocol:
> > 
> > (1) Listener::Notify() called with DOCUMENT_USE_COUNTER_SET as the type.
> > (2) Listener calls imgIContainer::PropagateUseCounters(nsIDocument*).
> > (3) PropagateUseCounters() applies the use counters from the SVG-as-image
> > document to the passed-in document.
> > 
> > That avoids the necessity to add a new notification method to
> > imgINotificationObserver. It also means that new documents can get
> > previously fired document use counters by calling PropagateUseCounters()
> > when they get SIZE_AVAILABLE, which is *crucial* as images are shared
> > between documents.
> 
> OK, so the new notification type is, I suppose, easy, but I'm less clear on
> the subsequent parts.  What is |Listener| in the above sequence--the generic
> name for a concrete class for imgINotificationObserver?  I guess
> PropagateUseCounters is the new method you're suggesting to add on
> imgIContainer?

Sorry, I probably should've expanded on the above a bit more! I'll try to clarify a bit.

By "Listener" I did indeed just mean "some class that implements imgINotificationObserver".

What I proposed above is to move from a "push" model to a "pull with update notifications" model. Instead of attaching the use counter information to the notification, as the code I reviewed is doing, we add a new pull method (imgIContainer::PropagateUseCounters) that imgINotificationObserver instances can call when they get a DOCUMENT_USE_COUNTER_SET notification.

This has two advantages:

(1) Code that doesn't care about this can just ignore it.

(2) Because images are shared between documents and are cached, it won't be unusual at all that a listener starts listening to the image after all of the use counters are set. We always have to think about this when designing APIs in ImageLib. A purely stateless notification-based approach just won't work for this, because listeners that are late to the party won't get any of the use counter messages. My proposed design handles this situation by having listeners call PropagateUseCounters when they first start observing the image. That gets them caught up immediately.

However, looking at this again, I realize that there's an additional complication here: these are *counters*, and it's not obvious how to make PropagateUseCounters handle delta updates without storing some additional state about what listeners have already gotten, which we don't want to do if possible. So we need to tweak this a little bit.

I can see a couple of possibilities here, but maybe the most straightforward one is to get rid of the notifications entirely, and just call PropagateUseCounters in nsImageLoadingContent::UntrackImage. That way we'd propagate the use counters when the document got torn down, or when the <img> element's src attribute changed. We'd also need to call PropagateUseCounters somewhere in ImageLoader to catch CSS images.

Does that seem like a reasonable plan?
Flags: needinfo?(seth)
(In reply to Seth Fowler [:seth] from comment #77)
> Sorry, I probably should've expanded on the above a bit more! I'll try to
> clarify a bit.

No problem!  I'm just happy I didn't totally fail at reading comprehension. :)

> What I proposed above is to move from a "push" model to a "pull with update
> notifications" model. Instead of attaching the use counter information to
> the notification, as the code I reviewed is doing, we add a new pull method
> (imgIContainer::PropagateUseCounters) that imgINotificationObserver
> instances can call when they get a DOCUMENT_USE_COUNTER_SET notification.
> 
> This has two advantages:
> 
> (1) Code that doesn't care about this can just ignore it.
> 
> (2) Because images are shared between documents and are cached, it won't be
> unusual at all that a listener starts listening to the image after all of
> the use counters are set. We always have to think about this when designing
> APIs in ImageLib. A purely stateless notification-based approach just won't
> work for this, because listeners that are late to the party won't get any of
> the use counter messages. My proposed design handles this situation by
> having listeners call PropagateUseCounters when they first start observing
> the image. That gets them caught up immediately.
> 
> However, looking at this again, I realize that there's an additional
> complication here: these are *counters*, and it's not obvious how to make
> PropagateUseCounters handle delta updates without storing some additional
> state about what listeners have already gotten, which we don't want to do if
> possible. So we need to tweak this a little bit.

I'm assuming you mean "...additional state about what listeners have already gotten which counters...", correct?

It's worth noting that these are only really "counters" at the level of Telemetry; for the document/SVG/etc. level, they're just boolean flags, so it's probably feasible to bitwise-or things together for propagation.  We eventually have to notify Telemetry of the flags, of course, but the patch already does that at document destruction.

> I can see a couple of possibilities here, but maybe the most straightforward
> one is to get rid of the notifications entirely, and just call
> PropagateUseCounters in nsImageLoadingContent::UntrackImage. That way we'd
> propagate the use counters when the document got torn down, or when the
> <img> element's src attribute changed. We'd also need to call
> PropagateUseCounters somewhere in ImageLoader to catch CSS images.

I guess we'd do this in ImageLoader::RemoveImage?  (That looks to me like the obviuos place to do it, but i don't know that code at all...)

> Does that seem like a reasonable plan?

I'm not sure.  If we want to propagate things on a pull model, I think that addresses comment 37, but it also means that we need to have SVG images maintain a separate set of use counters that will get or'd into their parent at some point, correct?
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #78)
> I'm assuming you mean "...additional state about what listeners have already
> gotten which counters...", correct?

Yep.

> It's worth noting that these are only really "counters" at the level of
> Telemetry; for the document/SVG/etc. level, they're just boolean flags, so
> it's probably feasible to bitwise-or things together for propagation.  We
> eventually have to notify Telemetry of the flags, of course, but the patch
> already does that at document destruction.

That's good to know. If they're just flags, it makes handling delta updates much simpler. So we *could* use the notification approach, though if the approach that doesn't involve notifications works just as well we'd probably be better off using it for simplicity's sake.

It was one of my recommendations above that we not bother waiting for document destruction to send this stuff to Telemetry, BTW. I don't think that should be an issue no matter which approach we use.

> I guess we'd do this in ImageLoader::RemoveImage?  (That looks to me like
> the obviuos place to do it, but i don't know that code at all...)

That seems right to me too, though ImageLoader is probably the area of this code I am least familiar with.

> I'm not sure.  If we want to propagate things on a pull model, I think that
> addresses comment 37, but it also means that we need to have SVG images
> maintain a separate set of use counters that will get or'd into their parent
> at some point, correct?

Well, VectorImage already holds (indirectly, via SVGDocumentWrapper) the SVG document that it's displaying, so I think we can just pull those use counters directly from the document.
(In reply to Seth Fowler [:seth] from comment #69)
> Cameron, you've heard most of my feedback on IRC already, but I'd be
> interested in anything you have to add as the original author of the patch.
> In particular, I'm interested in what you think about the alternative
> approach to notifications I proposed.

The main thing I want to be sure of is that the change to notifying an outer document of use counters in the SVG image document eagerly does not cause us to lose information when there are multiple documents using a given SVG image, but I think your DOCUMENT_USE_COUNTER_SET notification proposal, where we just re-inform the outer documents whenever we get a new use counter set in the SVG image document, ensures this.  Overall this is simpler and I don't think there's going to be any performance problem, assuming that we only dispatch the DOCUMENT_USE_COUNTER_SET notification if an SVG image document use counter actually flips from false to true.
Flags: needinfo?(cam)
(In reply to Seth Fowler [:seth] from comment #77)
> I can see a couple of possibilities here, but maybe the most straightforward
> one is to get rid of the notifications entirely, and just call
> PropagateUseCounters in nsImageLoadingContent::UntrackImage. That way we'd
> propagate the use counters when the document got torn down, or when the
> <img> element's src attribute changed. We'd also need to call
> PropagateUseCounters somewhere in ImageLoader to catch CSS images.

I think this approach sounds even better, not needing a notification at all.
Comment on attachment 8570542 [details] [diff] [review]
part 1 - add infrastructure for defining use counters from UseCounters.conf

Review of attachment 8570542 [details] [diff] [review]:
-----------------------------------------------------------------

This looks sane.

Is it premature optimization to support use counters outside of dom/base/? I imagine others may want to utilize this feature. Having e.g. moz.build support for "this file defines my use counters" might be warranted. That might be scope bloat for the initial feature, however.
Attachment #8570542 - Flags: feedback?(gps) → feedback+
(In reply to Gregory Szorc [:gps] from comment #82)
> This looks sane.

Thanks for the review.

Can you address the question(s) in comment 63 about digesting the UseCounters.conf file directly from the Telemetry generation scripts, rather than creating a separate UseCounters.json file?  I think doing direct digestion would make the build system parts cleaner, but I don't know where I'd put the utility code for parsing the UseCounters.conf file...maybe the Telemetry scripts should just be smart enough to import it from dom/base/?  (Though that doesn't seem quite right.)
Flags: needinfo?(gps)
Comment on attachment 8570542 [details] [diff] [review]
part 1 - add infrastructure for defining use counters from UseCounters.conf

Review of attachment 8570542 [details] [diff] [review]:
-----------------------------------------------------------------

From what i gather, telemetry-wise (i.e. ignoring the code-gen etc.) this use counter implementation is just for avoid going through Histograms.json?
The integration into the histogram enum list seems a little awkward, but maybe i'm missing the goal there? What is that part, including the dummy enums etc., trying to do?
I would like to see this documented more clearly.

::: dom/base/UseCounter.h
@@ +10,5 @@
> +#define USE_COUNTER_DOM_ATTRIBUTE(interface_, name_) \
> +    eUseCounter_##interface_##_##name_,
> +#define USE_COUNTER_CSS_PROPERTY(name_, id_) \
> +    eUseCounter_property_##id_,
> +#include "mozilla/dom/UseCounterList.h"

I see that we are generating that file dynamically. Why can't we just directly generate the contents here instead of the macro magic?

::: dom/base/gen-usecounters.py
@@ +108,5 @@
> +        for counter in counters:
> +            def append_counter(name, desc):
> +                items[name] = { 'expires_in_version': 'never',
> +                                'kind' : 'boolean',
> +                                'description': desc }

So these come just down to boolean histograms.
Can we document that in this and other files? (It took me a while to find this part here)

::: toolkit/components/telemetry/gen-histogram-enum.py
@@ +18,5 @@
>  banner = """/* This file is auto-generated, see gen-histogram-enum.py.  */
>  """
>  
> +def print_enums_at_end_of_use_counters():
> +    print "  HistogramDUMMY2,"

What are HistogramDUMMY1/2 needed for? This is hard to understand, can you document this?

@@ +32,4 @@
>  
>      print banner
>      print "enum ID : uint32_t {"
> +    for histogram in histogram_tools.from_files(filenames):

Use counters are treated different here.
Can't we just e.g. return a tuple of (histograms,useCounters) and handle things much more straight-forward here?

@@ +32,5 @@
>  
>      print banner
>      print "enum ID : uint32_t {"
> +    for histogram in histogram_tools.from_files(filenames):
> +        if histogram.name().startswith("USE_COUNTER"):

"USE_COUNTER_"?
Attachment #8570542 - Flags: feedback?(gfritzsche)
Use counter submission normally happens at document destruction.  For
testing use counters, however, we need to have use counters updated in
telemetry at deterministic points.  Therefore, we provide a method on
nsIDOMWindowUtils that forces use counters out to telemetry so we can
examine them.

This is not totally perfect.  Specifically, it doesn't seem to be fully general
for cases like:

- SVG images as <img src="thing.svg">
- SVG images used in CSS stylesheets (background-image, list-style-image, etc.)

Ideas welcome on how to recursively flush use counters for all a document's
resources.
Before embarking on the notification rewrite that Seth suggested, I thought
it'd be a good idea to have tests to ensure that I didn't do anything too
wrong.  These are the results so far (still missing SVG resource docs, which
are a new thing to me, and possibly other interesting cases).

They don't really work, either (6 PASSes, 6 FAILs currently); you can see the
correct numbers being printed out if you enabled sDebugUseCounters in
nsDocument::ReportUseCounters, but as use counter reporting only happens at
document destruction, we are (AFAICT) at the mercy of the cycle collector/GC as
to when things actually show up in telemetry.

The basic idea, however, seems reasonable, so I'm going to keep trying this and
making it more robust unless folks have better ideas.
Here's an updated version of nsIDOMWindowUtils::FlushUseCounters.  This version
takes arbitrary nodes so we can flush out documents and <img> nodes; other
nodes might be added if need be.

None of this is necessary in a world where use counters flowing into telemetry
happens eagerly, though.  We do lazy submission of the counters into telemetry
because that lets one see a nice graph for an individual feature easily:
feature X has Y documents that use it and Z pages that don't.  If we had
server-side help for displaying the statistics, we could do a single histogram
with the number of documents and combine that with a counter showing how many
times feature X was activated, on a per-page/document basis.  I'm not sure how
feasible that'd be, but maybe it'd be easier than trying to get these tests all
straightened out...
Updated tests.  These almost all pass on my machine, except for some problems
with the background-image test; it looks like the exact time when the
background is painted is very important for getting the use counter submitted,
and that submission doesn't happen in time for the use counter to be propagated
up to the parent document so that the appropriate PAGE histogram is filled in.

Writing these tests makes me realize that a lot of the complication from the
patch comes from trying to ensure that we get these page/document counts
correct: we're trying to infer parentage of documents, which isn't normally
maintained, and is somewhat delicate to reconstruct.  Seth's idea to do things
with pull notifications might make this a little better.  I have a little more
confidence implementing the pull notifications now that there's tests involved.
Attachment #8578848 - Attachment is obsolete: true
Attachment #8578847 - Attachment is obsolete: true
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #83)
> Can you address the question(s) in comment 63 about digesting the
> UseCounters.conf file directly from the Telemetry generation scripts, rather
> than creating a separate UseCounters.json file?  I think doing direct
> digestion would make the build system parts cleaner, but I don't know where
> I'd put the utility code for parsing the UseCounters.conf file...maybe the
> Telemetry scripts should just be smart enough to import it from dom/base/? 
> (Though that doesn't seem quite right.)

It depends where things are getting parsed. If you hook this up to moz.build files and parse things as part of emitter.py or a backend, it should be part of python/mozbuild.

If it's an at-build thing, code in telemetry or dom/ is fine. dom/bindings/mozwebidlcodegen is an example of a Python package in a non-standard directory. We just added dom/bindings to build/virtualenv_packages.txt for that to "just work." I'm not a super huge fan of adding random directories to sys.path, but it does solve the problem.
Flags: needinfo?(gps)
I think this patch addresses most of the feedback from the previous cycle.
I've split things up so we're not preprocessing things into a histogram file
in $OBJDIR/dom/base/ anymore, which removes cross-directory dependencies in
the build system.  Instead, I've setup the histogram_tools.py bits to
understand how to parse multiple different filenames; this change makes it
easy to add support for parsing things like nsDeprecatedOperationList.h.

Many more things are a part of moz.build, rather than cluttering
dom/base/Makefile.in with a bunch of extra rules.

I think the only comment I didn't address was generating dom/base/UseCounter.h
directly, rather than munging things with the preprocessor.  I think it's
slightly easier to figure out where eUseCounter_* enums are coming from if they
live in .h files rather than .py files.  But this is really a matter for the
dom/ peer who reviews this, and currently I'm only seeking telemetry/build
review.
Attachment #8570542 - Attachment is obsolete: true
Attachment #8599363 - Flags: review?(mshal)
Attachment #8599363 - Flags: review?(gfritzsche)
Comment on attachment 8599363 [details] [diff] [review]
part 1 - add infrastructure for defining use counters from UseCounters.conf

>+INSTALL_TARGETS += usecounterlist
>+usecounterlist_FILES := UseCounterList.h
>+usecounterlist_DEST = $(DIST)/include/mozilla/dom
>+usecounterlist_TARGET := export

Boo, do we have a bug yet for being able to add a generated file to EXPORTS directly?

The rest of the build system bits look fine to me, though I didn't review the python scripts in detail.
Attachment #8599363 - Flags: review?(mshal) → review+
(In reply to Michael Shal [:mshal] from comment #91)
> Comment on attachment 8599363 [details] [diff] [review]
> part 1 - add infrastructure for defining use counters from UseCounters.conf
> 
> >+INSTALL_TARGETS += usecounterlist
> >+usecounterlist_FILES := UseCounterList.h
> >+usecounterlist_DEST = $(DIST)/include/mozilla/dom
> >+usecounterlist_TARGET := export
> 
> Boo, do we have a bug yet for being able to add a generated file to EXPORTS
> directly?

No, we should probably add one, so we can do things like:

EXPORTS += [
    '!UseCounterList.h',
]

and then moz.build can generate all the INSTALL_TARGETS stuff automagically.
Comment on attachment 8599363 [details] [diff] [review]
part 1 - add infrastructure for defining use counters from UseCounters.conf

Review of attachment 8599363 [details] [diff] [review]:
-----------------------------------------------------------------

The telemetry scripts look good with the below comments addressed.

::: toolkit/components/telemetry/gen-histogram-enum.py
@@ +28,5 @@
> +    # Note that histogram_tools.py guarantees that all of the USE_COUNTER_*
> +    # histograms are defined in a contiguous block.  We therefore assume
> +    # that there's only one group for which use_counter_group is true.
> +    for (use_counter_group, histograms) in itertools.groupby(histogram_tools.from_files(filenames),
> +                                                             lambda h: h.name().startswith("USE_COUNTER")):

That's getting pretty long. Can we just put the itertools.groupby() part in a preceding statement?
Also, |startswith("USE_COUNTER_")|?

@@ +31,5 @@
> +    for (use_counter_group, histograms) in itertools.groupby(histogram_tools.from_files(filenames),
> +                                                             lambda h: h.name().startswith("USE_COUNTER")):
> +        # The HistogramDUMMY* enum variables are used to make the computation
> +        # of Histogram{First,Last}UseCounter easier.  Otherwise, we'd have to
> +        # special case the first and last histogram in the group.

We could just look those two histogram names up via two filter/find statements before the loop, so this wouldn't make things very complicated.

::: toolkit/components/telemetry/histogram_tools.py
@@ +9,3 @@
>  import re
> +import sys
> +sys.path.append(os.path.join(buildconfig.topsrcdir, 'dom/base/'))

Comment that it's needed for the usecounters.
Attachment #8599363 - Flags: review?(gfritzsche) → review+
Comment on attachment 8599363 [details] [diff] [review]
part 1 - add infrastructure for defining use counters from UseCounters.conf

Cameron, bz suggested that you might be able to review the dom-y bits of the patch.

(I realize you wrote it, so if you don't feel comfortable reviewing it, could you find someone else to review, particularly someone who can address the XXX comment in gen-usecounters.py?  Thanks!)
Attachment #8599363 - Flags: review?(cam)
Comment on attachment 8599363 [details] [diff] [review]
part 1 - add infrastructure for defining use counters from UseCounters.conf

Review of attachment 8599363 [details] [diff] [review]:
-----------------------------------------------------------------

r=me on UseCounters.conf, UseCounter.h and gen-usecounters.py.

::: dom/base/UseCounters.conf
@@ +33,5 @@
> +// property usage, which we would also like to track.
> +
> +method SVGSVGElement.getElementById
> +attribute SVGSVGElement.currentScale
> +property fill

Does everything work even without any use counters defined?  I just added these as examples of things I might want to measure, but we might not want to land with them here.

::: dom/base/gen-usecounters.py
@@ +55,5 @@
> +    print(AUTOGENERATED_WARNING_COMMENT, file=f)
> +    print('''
> +enum {
> +  // XXX is this the right define?
> +  #define CSS_PROP_LIST_INCLUDE_LOGICAL

Yes, this will let us add a use counter for logical properties like margin-inline-start, which I think we want to be able to do.  But please test that use counters on logical properties actually work.

@@ +59,5 @@
> +  #define CSS_PROP_LIST_INCLUDE_LOGICAL
> +  #define CSS_PROP(name_, id_, method_, flags_, pref_, parsevariant_,     \\
> +                   kwtable_, stylestruct_, stylestructoffset_, animtype_) \\
> +    USE_COUNTER_FOR_CSS_PROPERTY_##id_ = eUseCounter_UNKNOWN,
> +  #include "nsCSSPropList.h"

(FWIW, in the future I think we want to be able to measure usage of shorthand properties (and possible aliases, too) but since they have already been dealt with by the time we're in nsCSSDataBlock::DoTransferFromBlock, we'll need to handle them separately.  So it's fine not to include shorthands/aliases here.)
Attachment #8599363 - Flags: review?(cam) → review+
Comment on attachment 8570544 [details] [diff] [review]
part 2 - change MappedAttrParser to store a nsSVGElement directly, instead of its nsIPrincipal

Might as well get this little bit landed.
Attachment #8570544 - Flags: review?(bugs)
Attachment #8570544 - Flags: review?(bugs) → review+
(In reply to Pulsebot from comment #97)
> https://hg.mozilla.org/integration/mozilla-inbound/rev/c729634e55d1
> https://hg.mozilla.org/integration/mozilla-inbound/rev/1a67af1cbf1b

Added review changes, especially the part about working with 0 use counters, and removed lines from UseCounters.conf.

I thought that it was a little weird to land a patch authored by heycam, reviewed by heycam, so I committed the patch authored by me with attribution in the commit line.
Keywords: leave-open
This patch gives ns{I,}Document the ability to track use counters and
also to send those to telemetry at document destruction time.  We push
things to telemetry lazily because we're can't definitively say whether
something has been used until the document is torn down.
Attachment #8570545 - Attachment is obsolete: true
This part implements what I think Seth was describing in his previous review of
these patches.  We now pull use counters from images into their owning/parent
documents when they finish decoding, rather than eagerly notifying the
owning/parent documents at each use counter.
Attachment #8620573 - Flags: feedback?(seth)
This patch teaches gen-histogram-enum.py how to correctly deal with the
situation where there are no use counters.  It also adds
mozilla::Telemetry::GetHistogramName for debugging purposes.
Attachment #8620576 - Flags: review?(gfritzsche)
This patch ensures that uses of CSS properties that have use counters
associated with them get recorded in the appropriate document.
Attachment #8620578 - Flags: review?(dholbert)
This patch is split out because it will probably need some tweaks before
landing.  We need some use counters to run tests with, and these happen to be
the ones Cameron had chosen before.  I doubt they're the best ones; ideas
welcome on what's unused enough that we could include them in builds without
slowing things down too much.

I guess if we had use counters for deprecated operations, we could go ahead and
test some of those.  I'd still like to have some for SVG documents, though.
Attachment #8580245 - Attachment is obsolete: true
Attachment #8580250 - Attachment is obsolete: true
Use counter submission normally happens at document destruction.  For
testing use counters, however, we need to have use counters updated in
telemetry at deterministic points.  Therefore, we provide a method on
nsIDOMWindowUtils that forces use counters out to telemetry so we can
examine them.
Finally, tests for a variety of purposes.  All these tests pass (modulo the
SVG-with-external-resource-paint-server, which dholbert thought might be busted
and/or underspec'd), which gives me some confidence that I've modified all the
right places.  Some tests are probably needed for CSS properties in regular
HTML documents as well (even though they're probably redundant with the SVG
tests), and maybe a test that sticks images on <input>.
Comment on attachment 8620578 [details] [diff] [review]
part 3d - record use counter information from the CSS parser

I should probably punt on this to a style system owner/peer.

Can't punt to heycam, since froyndj says this is largely code that heycam wrote (which has now been split out); so, punting to dbaron. :)
Attachment #8620578 - Flags: review?(dholbert) → review?(dbaron)
Comment on attachment 8620568 [details] [diff] [review]
part 3a - add core DOM use counter functionality

smaug semi-grudgingly consented to review this. ;)
Attachment #8620568 - Flags: review?(bugs)
Comment on attachment 8620568 [details] [diff] [review]
part 3a - add core DOM use counter functionality

>@@ -4630,16 +4632,18 @@ nsIDocument::SetContainer(nsDocShell* aContainer)
>     // check if same type root
>     nsCOMPtr<nsIDocShellTreeItem> sameTypeRoot;
>     aContainer->GetSameTypeRootTreeItem(getter_AddRefs(sameTypeRoot));
>     NS_ASSERTION(sameTypeRoot, "No document shell root tree item from document shell tree item!");
> 
>     if (sameTypeRoot == aContainer) {
>       static_cast<nsDocument*>(this)->SetIsTopLevelContentDocument(true);
>     }
>+
>+    static_cast<nsDocument*>(this)->SetIsContentDocument(true);
I would add a helper method which takes docshell from mScopeObject (do_QueryReferent to nsPIDOMWindow and get docshell from there) and then
check what the docshell type is. Then call the helper method in 
nsDocument::SetScriptHandlingObject and nsDocument::SetScriptGlobalObject.
(or make nsDocument::SetScriptGlobalObject to call SetScriptHandlingObject when needed and then add static_cast<nsDocument*>(this)->SetIsContentDocument(true); there)
That way also data documents would get the IsContainerDocument flag.


>+nsIDocument::GetTopLevelContentDocument()
>+{
>+  nsDocument* parent = static_cast<nsDocument*>(this);
>+
>+  do {
>+    if (parent->IsTopLevelContentDocument()) {
>+      break;
>+    }
>+
>+    nsIDocument* candidate = parent->GetParentDocument();
>+    if (!candidate) {
>+      break;
>+    }
>+    parent = static_cast<nsDocument*>(candidate);
>+  } while(true);
>+
>+  MOZ_ASSERT(parent);
>+
>+  return parent;
>+}
This is a bit odd method, since if called before destroying docshell tree, 
it returns actual top level content document, but if afterwards, it returns 'this'. Could it perhaps return null (as the name hints) if there is no
parent document?

Also, for data documents you probably want to get extantDoc from scope object (via nsPIDOMWindow) and iterate through the
parent chain.

>+static bool
>+MightBeAboutOrChromeScheme(nsIURI* aURI)
>+{
>+  MOZ_ASSERT(aURI);
>+  bool isAbout = true;
>+  bool isChrome = true;
>+  aURI->SchemeIs("about", &isAbout);
>+  aURI->SchemeIs("chrome", &isChrome);
>+  return isAbout || isChrome;
>+}
I think we should use the uri from principal, so that
if some feature is used in about:blank opened from a web page we get still proper reporting.

>+nsDocument::ReportUseCounters()
>+{
>+  static const bool sDebugUseCounters = false;
>+  if (mReportedUseCounters) {
>+    return;
>+  }
>+
>+  mReportedUseCounters = true;
>+
>+  if (Telemetry::HistogramUseCounterCount > 0 &&
>+      (IsContentDocument() || IsResourceDoc()) &&
>+      mOriginalURI && !MightBeAboutOrChromeScheme(mOriginalURI)) {
>+    if (sDebugUseCounters) {
>+      nsCString spec;
>+      mOriginalURI->GetSpec(spec);
For debugging you probably want both originalURI and the uri from principal


>+    // We keep separate counts for individual documents and pages to
>+    // more accurately track how many web pages might break if certain
>+    // features were removed.  Consider the case of a single HTML
>+    // document with several SVG images and/or iframes with
>+    // sub-documents of their own.  If we maintained a single set of use
>+    // counters and all the sub-documents use a particular feature, then
>+    // telemetry would indicate that we would be breaking N documents if
>+    // that feature were removed.  Whereas with a document/page split,
>+    // we can see that N documents would be affected, but only a single
>+    // web page would be affected.
This is rather new terminology to me. a web page isn't a document but a collection of documents.
That comment was a bit hard to parse. Perhaps explain that 
"page" means a top level content page or something like that.

>+  void SetDocumentUseCounter(mozilla::UseCounter aUseCounter) {
{ to its own line with methods. Same also elsewhere.


I wonder if svg-as-images documents could rely on principal uri to indicate whether they are content documents.
dholbert might have some ideas for that. (Would be just nice to mark all the "content documents" consistently with the flag).
Tough, I guess chrome might use svg-as-images from the web - so, not quite sure what behavior we want there.
Attachment #8620568 - Flags: review?(bugs) → review-
Comment on attachment 8620573 [details] [diff] [review]
part 3b - propagating use counters from SVG images into owning/parent documents

Review of attachment 8620573 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good. This should be a lot more robust than the previous approach.

You only requested feedback, but I don't think there's anything controversial here; you can consider this an r+ from me if you wish.
Attachment #8620573 - Flags: feedback?(seth) → feedback+
I was just reading through the old comments on this bug, and it's not clear to me whether we solved the problem of telemetry ID stability.  If a use counter is added somewhere in the middle of UseCounters.conf, or if one is removed, is it going to confuse the server that collects the telemetry?  Or does the server correlate the integer telemetry ID with the specific build of Firefox to know exactly which one it is?
Assignee: cam → nfroyd
(In reply to Cameron McCormack (:heycam) from comment #112)
> I was just reading through the old comments on this bug, and it's not clear
> to me whether we solved the problem of telemetry ID stability.  If a use
> counter is added somewhere in the middle of UseCounters.conf, or if one is
> removed, is it going to confuse the server that collects the telemetry?  Or
> does the server correlate the integer telemetry ID with the specific build
> of Firefox to know exactly which one it is?

The server deals exclusively with the string names, so the integer ordering doesn't matter.
Jonas was asking about this last night.  I have no particular preference
whether it lands with this bug or in a different bug, but landing it in this
bug might make writing tests easier...
Attachment #8620576 - Flags: review?(gfritzsche) → review+
Thanks for the review.  ni? for questions below.

(In reply to Olli Pettay [:smaug] from comment #110)
> Comment on attachment 8620568 [details] [diff] [review]
> part 3a - add core DOM use counter functionality
> 
> >@@ -4630,16 +4632,18 @@ nsIDocument::SetContainer(nsDocShell* aContainer)
> >     // check if same type root
> >     nsCOMPtr<nsIDocShellTreeItem> sameTypeRoot;
> >     aContainer->GetSameTypeRootTreeItem(getter_AddRefs(sameTypeRoot));
> >     NS_ASSERTION(sameTypeRoot, "No document shell root tree item from document shell tree item!");
> > 
> >     if (sameTypeRoot == aContainer) {
> >       static_cast<nsDocument*>(this)->SetIsTopLevelContentDocument(true);
> >     }
> >+
> >+    static_cast<nsDocument*>(this)->SetIsContentDocument(true);
> I would add a helper method which takes docshell from mScopeObject
> (do_QueryReferent to nsPIDOMWindow and get docshell from there) and then
> check what the docshell type is. Then call the helper method in 
> nsDocument::SetScriptHandlingObject and nsDocument::SetScriptGlobalObject.
> (or make nsDocument::SetScriptGlobalObject to call SetScriptHandlingObject
> when needed and then add
> static_cast<nsDocument*>(this)->SetIsContentDocument(true); there)
> That way also data documents would get the IsContainerDocument flag.

The basic idea here is to move the setting of content document-ness to someplace where data documents can also have that flag set?

I don't follow this explanation.  Am I extracting the helper method from SetContainer, or writing a helper method to check toplevel document-ness?  Or something else entirely?

> >+nsIDocument::GetTopLevelContentDocument()
> >+{
> >+  nsDocument* parent = static_cast<nsDocument*>(this);
> >+
> >+  do {
> >+    if (parent->IsTopLevelContentDocument()) {
> >+      break;
> >+    }
> >+
> >+    nsIDocument* candidate = parent->GetParentDocument();
> >+    if (!candidate) {
> >+      break;
> >+    }
> >+    parent = static_cast<nsDocument*>(candidate);
> >+  } while(true);
> >+
> >+  MOZ_ASSERT(parent);
> >+
> >+  return parent;
> >+}
> This is a bit odd method, since if called before destroying docshell tree, 
> it returns actual top level content document, but if afterwards, it returns
> 'this'. Could it perhaps return null (as the name hints) if there is no
> parent document?

That would probably be better.  I think I wrote this to not return null when I was still fumbling around with how to do SVG images correctly.

> Also, for data documents you probably want to get extantDoc from scope
> object (via nsPIDOMWindow) and iterate through the
> parent chain.

Rather than iterating through the parent chain of the data doc, I want to iterate through the parent chain of the extantDoc?

> >+static bool
> >+MightBeAboutOrChromeScheme(nsIURI* aURI)
> >+{
> >+  MOZ_ASSERT(aURI);
> >+  bool isAbout = true;
> >+  bool isChrome = true;
> >+  aURI->SchemeIs("about", &isAbout);
> >+  aURI->SchemeIs("chrome", &isChrome);
> >+  return isAbout || isChrome;
> >+}
> I think we should use the uri from principal, so that
> if some feature is used in about:blank opened from a web page we get still
> proper reporting.

I am studiously avoiding any questions related to about:blank handling and will just follow orders here. ;)

> >+nsDocument::ReportUseCounters()
> >+{
> >+  static const bool sDebugUseCounters = false;
> >+  if (mReportedUseCounters) {
> >+    return;
> >+  }
> >+
> >+  mReportedUseCounters = true;
> >+
> >+  if (Telemetry::HistogramUseCounterCount > 0 &&
> >+      (IsContentDocument() || IsResourceDoc()) &&
> >+      mOriginalURI && !MightBeAboutOrChromeScheme(mOriginalURI)) {
> >+    if (sDebugUseCounters) {
> >+      nsCString spec;
> >+      mOriginalURI->GetSpec(spec);
> For debugging you probably want both originalURI and the uri from principal

*shrug* mOriginalURI was generally good enough for me, but I can have both...
Flags: needinfo?(bugs)
(In reply to Nathan Froyd [:froydnj] [:nfroyd] from comment #116)
> Thanks for the review.  ni? for questions below.
> 
> (In reply to Olli Pettay [:smaug] from comment #110)
> > Comment on attachment 8620568 [details] [diff] [review]
> > part 3a - add core DOM use counter functionality
> > 
> > >@@ -4630,16 +4632,18 @@ nsIDocument::SetContainer(nsDocShell* aContainer)
> > >     // check if same type root
> > >     nsCOMPtr<nsIDocShellTreeItem> sameTypeRoot;
> > >     aContainer->GetSameTypeRootTreeItem(getter_AddRefs(sameTypeRoot));
> > >     NS_ASSERTION(sameTypeRoot, "No document shell root tree item from document shell tree item!");
> > > 
> > >     if (sameTypeRoot == aContainer) {
> > >       static_cast<nsDocument*>(this)->SetIsTopLevelContentDocument(true);
> > >     }
> > >+
> > >+    static_cast<nsDocument*>(this)->SetIsContentDocument(true);
> > I would add a helper method which takes docshell from mScopeObject
> > (do_QueryReferent to nsPIDOMWindow and get docshell from there) and then
> > check what the docshell type is. Then call the helper method in 
> > nsDocument::SetScriptHandlingObject and nsDocument::SetScriptGlobalObject.
> > (or make nsDocument::SetScriptGlobalObject to call SetScriptHandlingObject
> > when needed and then add
> > static_cast<nsDocument*>(this)->SetIsContentDocument(true); there)
> > That way also data documents would get the IsContainerDocument flag.
> 
> The basic idea here is to move the setting of content document-ness to
> someplace where data documents can also have that flag set?
Right

> 
> I don't follow this explanation.  Am I extracting the helper method from
> SetContainer, or writing a helper method to check toplevel document-ness? 
> Or something else entirely?
You create a helper method on document which uses document's mScopeObject to get an nsPIDOMWindow
object and from that nsDocShell and then checks the type of the docshell.
(of if the document has a container, the method could just use that to check the type of the docshell).
And if type is content docshell, then set the flag.



> Rather than iterating through the parent chain of the data doc, I want to
> iterate through the parent chain of the extantDoc?
data documents don't have parents, so iterate parents of extantDoc
Flags: needinfo?(bugs)
Comment on attachment 8620578 [details] [diff] [review]
part 3d - record use counter information from the CSS parser

> // and attributes must have a [UseCounter] extended attribute in
>-// the Web IDL file, and CSS properties must be declared with
>-// the CSS_PROPERTY_HAS_USE_COUNTER flag in nsCSSPropList.h.
>+// the Web IDL file.  CSS properties require no special treatment
>+// beyond being listed below.
> //

I'm confused.  You're not removing the flag or the properties (fill,
fill-opacity) in patch 5a (and the older versions of patch 1).

(although it seems like none of the current patches actually #define the
flag anymore.  Is it just that the most recent patch 5a is obsolete?)



>+    static_cast<UseCounter>(USE_COUNTER_FOR_CSS_PROPERTY_##id_),

I'd actually prefer you use method_ rather than id_; I'd like to get
rid of all uses of id_ and replace them with method_, since we don't
actually need the extra param to the macro.  And I think using id_
here simplifies things, since in part 1 you'd be able to drop:
>+            prop = counter['property_name'].replace('-', '_')
although you'd need to list the method (without prefixes) in the use counter
file instead of the CSS property name (but I think that's fine).

You would also need to define and #undef:
>#define CSS_PROP_PUBLIC_OR_PRIVATE(publicname_, privatename_) privatename_
around the inclusions of nsCSSPropList.h

(If it ends up being more work than that, I'm ok skipping the comment,
although it just creates more work for the patch to remove id_.)


(Patch 1's hack of using a macro to overwrite an enum is pretty horrible,
though.  It feels like gen-usecounters.py could at least use a comment
explaining it.)

>+  // XX is this the right define?
>+  #define CSS_PROP_LIST_INCLUDE_LOGICAL

Yes, and please remove the XXX comment here and in patch 1.

>-    bool changed; // outparam for ParseProperty. (ignored)
>+    bool changed = false; // outparam for ParseProperty. (ignored)

Please remove the " (ignored)" since it's no longer true.



In nsCSSDataBlock.h, please explain that aSheetDocument may be null,
and if it's non-null then use counters are incremented.  (I guess this
is a decent place for property use counters, although value use counters
would need to go elsewhere so that we count them only when they win the
cascade.)  Please also be consistent about the naming of aDocument vs.
aSheetDocument (probably the latter).


I'd prefer if gPropertyUseCounter were a private member of nsCSSProps,
with an accessor that had an appropriate range assertion in it.
(like kParserVariantTable / ParserVariant())

I'm not crazy about the huge array for a very small set of uses, but
I guess it's ok.

r=dbaron with that
Attachment #8620578 - Flags: review?(dbaron) → review+
Addressed review feedback, carrying over r+.

The array of use counters is a little heavyweight; to address that, I'll be
changing the size of mozilla::UseCounter to a 16-bit quantity.  32K use
counters should be enough for anybody.
Attachment #8620578 - Attachment is obsolete: true
Attachment #8637283 - Flags: review+
Comment on attachment 8620580 [details] [diff] [review]
part 4 - hook up use counters to WebIDL bindings

See comment 66 for additional details on this.
Attachment #8620580 - Flags: review?(bzbarsky)
Attachment #8620583 - Flags: review?(bzbarsky)
Attachment #8620584 - Flags: review?(bzbarsky)
Attachment #8623749 - Flags: review?(bzbarsky)
Changed this to use the new method for CSS properties dbaron suggested.

I'm not sure what makes good test properties/methods/attributes here.  Ideally,
the ones we choose would be the same as the ones we want to examine in the
wild, but I'm not sure we can do that...
Attachment #8620582 - Attachment is obsolete: true
Attachment #8637286 - Flags: review?(bzbarsky)
Comment on attachment 8620580 [details] [diff] [review]
part 4 - hook up use counters to WebIDL bindings

Is there any way to forward-declare the enum in BindingUtils.h instead (e.g. by using an enum class?) of including this new header?

Putting this stuff into CGPerSignatureCall means that we'll have multiple SetDocumentAndPageUseCounter() calls in overloaded methods, one per overload, though only one of them will be reached on a given call.  I guess that's OK.

It looks like we use the same use counter for both the getter and setter of an attribute.  Is that what we want?  It seems like we may want to count gets and sets separately....

It looks like people will need to manually add to the use counter enum in addition to adding the [UseCounter] annotation in the IDL, right?  Seems like we could avoid that by generating the list of IDL counters into a header that UseCounter.h then includes, right?  Please file a followup on that.

r=me modulo those issues.
Attachment #8620580 - Flags: review?(bzbarsky) → review+
Comment on attachment 8620583 [details] [diff] [review]
part 5b - add nsIDOMWindowUtils::forceUseCounterFlush

>+  MOZ_ASSERT(aNode);

You can't assert that; chrome can mess up and pass in null.  Just NS_ENSURE_ARG_POINTER, please.

>+  if (nsCOMPtr<nsIDOMHTMLImageElement> img = do_QueryInterface(aNode)) {

I'd really prefer QI to nsIContent followed by HTMLImageElement::FromContent (which returns null if not HTMLImageElement).  We would kind of like to kill off the nsIDOM* interfaces if we can.

>+  static_cast<image::Image*>(container.get())->ReportUseCounters();

It's not clear to me why this cast is safe.  Why can't we just put ReportUseCounters on imgIContainer (as notxpcom, noscript, nostdcall, etc)?  Then you wouldn't need the LOCAL_INCLUDES change either.

Similarly, is there a reason to not put that API on nsIDocument instead of nsDocument?

>+   * If this image is src'd from an SVG document, flush the SVG document's

I totally misread this comment at first (thought it was talking about the document that the image lives in)...  How about:

  If this image's src points to an SVG document, ....

>+  mImages.EnumerateEntries(FlushUseCountersForImage, mDocument);

I thought we were trying to use hashtable iterators now?

r=me with the above fixed, but please do fix the casting bits; there shouldn't really be any static_casts here when we're done.
Attachment #8620583 - Flags: review?(bzbarsky) → review+
Comment on attachment 8620584 [details] [diff] [review]
part 5c - add tests for use counters

>+  yield check_use_counter_img("file_use_counter_svg_getElementById.svg",
>+                              "PROPERTY_FILL");
>+  yield check_use_counter_img("file_use_counter_svg_currentScale.svg",
>+                              "PROPERTY_FILL");

Why aren't those SVGSVGELEMENT_GETELEMENTBYID and SVGSVGELEMENT_CURRENTSCALE respectively?  If those are wrong, please double-check the rest of these guys; I'm not sure which SVG use counter should correspond to what here.

r=me with that dealt with, though I have low confidence in my chances of having caught bugs in the various "wait for destruction" bits....
Attachment #8620584 - Flags: review?(bzbarsky) → review+
Comment on attachment 8623749 [details] [diff] [review]
part 6 - add use counters for deprecated operations

r=me
Attachment #8623749 - Flags: review?(bzbarsky) → review+
Comment on attachment 8637286 [details] [diff] [review]
part 5a - add use counters for SVGSVGElement and CSS properties for testing purposes

r=me
Attachment #8637286 - Flags: review?(bzbarsky) → review+
I addressed all the comments except for one: I didn't do the rearranging of
when nsDocument::mIsTopLevelContentDocument is set, because I don't think data:
documents are that interesting for the purposes of these counters.
Attachment #8620568 - Attachment is obsolete: true
Attachment #8641126 - Flags: review?(bugs)
Comment on attachment 8641126 [details] [diff] [review]
part 3a - add core DOM use counter functionality

>+nsIDocument::GetTopLevelContentDocument()
>+{
>+  nsDocument* parent;
>+
>+  if (!mLoadedAsData) {
>+    parent = static_cast<nsDocument*>(this);
>+  } else {
>+    nsCOMPtr<nsPIDOMWindow> window(mWindow);
As I said in an earlier comment, you want to use scope object here.
nsCOMPtr<nsPIDOMWindow> window = do_QueryInterface(GetScopeObject());


>+nsIDocument::PropagateUseCounters(nsIDocument* aParentDocument)
>+{
>+  // What really matters here is that our use counters get propagated as
>+  // high up in the content document hierarchy as possible.  So,
>+  // starting with aParentDocument, we need to find the toplevel content
>+  // document, and propagate our use counters into its
>+  // mChildDocumentUseCounters.
>+  nsIDocument* contentParent = aParentDocument->GetTopLevelContentDocument();
>+
>+  if (!contentParent) {
>+    return;
>+  }

I'd like to see some assertions here to hint the cases this is supposed to be used.
So perhaps MOZ_ASSERT(this != contentParent);

>+nsDocument::ReportUseCounters()
>+{
>+  static const bool sDebugUseCounters = false;
>+  if (mReportedUseCounters) {
>+    return;
>+  }
>+
>+  mReportedUseCounters = true;
>+
>+  if (Telemetry::HistogramUseCounterCount > 0 &&
>+      (IsContentDocument() || IsResourceDoc()) &&
>+      mOriginalURI && !MightBeAboutOrChromeScheme(mOriginalURI)) {
So I still think we should use uri from the principal.
NodePrincipal()->GetURI(...) so that when web pages use about:blank, we report the usage correctly.
Attachment #8641126 - Flags: review?(bugs) → review-
(In reply to Olli Pettay [:smaug] from comment #128)
> Comment on attachment 8641126 [details] [diff] [review]
> part 3a - add core DOM use counter functionality
> 
> >+nsIDocument::GetTopLevelContentDocument()
> >+{
> >+  nsDocument* parent;
> >+
> >+  if (!mLoadedAsData) {
> >+    parent = static_cast<nsDocument*>(this);
> >+  } else {
> >+    nsCOMPtr<nsPIDOMWindow> window(mWindow);
> As I said in an earlier comment, you want to use scope object here.
> nsCOMPtr<nsPIDOMWindow> window = do_QueryInterface(GetScopeObject());

It looks like mWindow is simply a cached copy of this call, according to the code in nsDocument::SetScriptGlobalObject.  Do I still have to QI here?  I guess tests would also help here...

> >+nsIDocument::PropagateUseCounters(nsIDocument* aParentDocument)
> >+{
> >+  // What really matters here is that our use counters get propagated as
> >+  // high up in the content document hierarchy as possible.  So,
> >+  // starting with aParentDocument, we need to find the toplevel content
> >+  // document, and propagate our use counters into its
> >+  // mChildDocumentUseCounters.
> >+  nsIDocument* contentParent = aParentDocument->GetTopLevelContentDocument();
> >+
> >+  if (!contentParent) {
> >+    return;
> >+  }
> 
> I'd like to see some assertions here to hint the cases this is supposed to
> be used.
> So perhaps MOZ_ASSERT(this != contentParent);

OK.

> >+nsDocument::ReportUseCounters()
> >+{
> >+  static const bool sDebugUseCounters = false;
> >+  if (mReportedUseCounters) {
> >+    return;
> >+  }
> >+
> >+  mReportedUseCounters = true;
> >+
> >+  if (Telemetry::HistogramUseCounterCount > 0 &&
> >+      (IsContentDocument() || IsResourceDoc()) &&
> >+      mOriginalURI && !MightBeAboutOrChromeScheme(mOriginalURI)) {
> So I still think we should use uri from the principal.
> NodePrincipal()->GetURI(...) so that when web pages use about:blank, we
> report the usage correctly.

OK.
Flags: needinfo?(bugs)
mWindow is null on data documents, since SetScriptGlobalObject isn't ever called there.
On data documents SetScriptHandlingObject or SetScopeObject is called, and that sets scope object.

(mWindow maps to document.defaultView, and document.defaultView is null on data documents:
 document.implementation.createHTMLDocument().defaultView == null)
Flags: needinfo?(bugs)
(In reply to On vacation August 4-25.  Please mail manually if really needed. from comment #123)
> >+  static_cast<image::Image*>(container.get())->ReportUseCounters();
> 
> It's not clear to me why this cast is safe.  Why can't we just put
> ReportUseCounters on imgIContainer (as notxpcom, noscript, nostdcall, etc)? 
> Then you wouldn't need the LOCAL_INCLUDES change either.

imgIContainer is builtinclass, at least, so we know that casting it to *some* C++ pointer is safe.  I believe |Image| is the concrete superclass of all imgIContainer instances, too.

Seth, which do you prefer?  The current setup, where Image declares |ReportUseCounters|, but actually calling it requires static_cast<Image*>'ing, or the route bz suggests?
Flags: needinfo?(seth)
Attachment #8641126 - Attachment is obsolete: true
Attachment #8644553 - Flags: review?(bugs)
Comment on attachment 8644553 [details] [diff] [review]
part 3a - add core DOM use counter functionality


>+
>+void
>+nsDocument::ReportUseCounters()
>+{
>+  static const bool sDebugUseCounters = false;
>+  if (mReportedUseCounters) {
>+    return;
>+  }
>+
>+  mReportedUseCounters = true;
>+
>+  if (Telemetry::HistogramUseCounterCount > 0 &&
>+      (IsContentDocument() || IsResourceDoc())) {
>+    nsCOMPtr<nsIURI> uri;
>+    NodePrincipal()->GetURI(getter_AddRefs(uri));
>+    if (!uri || MightBeAboutOrChromeScheme(uri)) {
>+      return;
>+    }
Actually, let's just use
if (nsContentUtils::IsSystemPrincipal(NodePrincipal())) {
  return;
}
here and drop MightBeAboutOrChromeScheme.
Attachment #8644553 - Flags: review?(bugs) → review+
So running this through debug and asan builds revealed a problem: in
nsSVGElement.cpp, we had:

    bool changed = false; // outparam for ParseProperty.
    mParser.ParseProperty(propertyID, aMappedAttrValue, mDocURI, mBaseURI,
                          mElement->NodePrincipal(), mDecl, &changed, false, true);
    if (changed) {
      // The normal reporting of use counters by the nsCSSParser won't happen
      // since it doesn't have a sheet.
      MOZ_ASSERT(!nsCSSProps::IsShorthand(propertyID));
      UseCounter useCounter = nsCSSProps::UseCounterFor(propertyID);
      if (useCounter != eUseCounter_UNKNOWN) {
        mElement->OwnerDoc()->SetDocumentAndPageUseCounter(useCounter);
      }
    }

This chunk of code caused two problems:

- the MOZ_ASSERT would fire because we *can* have shorthand properties here;
- in asan builds, letting shorthand properties through to
  nsCSSProps::UseCounterFor would cause out-of-bounds accesses:

  static mozilla::UseCounter UseCounterFor(nsCSSProperty aProperty) {
    MOZ_ASSERT(0 <= aProperty && aProperty < eCSSProperty_COUNT_no_shorthands,
               "out of range");
    return gPropertyUseCounter[aProperty];
  }

The problem seems to be addressed with this simple patch, but I'm wondering if
this is the best way to fix it, or if it's symptomatic of problems someplace
else.  The equivalent check in nsCSSExpandedDataBlock::DoTransferFromBlock has
no checks for shorthand properties (or aliases, for that matter)...
Attachment #8645958 - Flags: review?(cam)
Comment on attachment 8645958 [details] [diff] [review]
fix bogus assert in nsSVGElement.cpp

(In reply to Nathan Froyd [:froydnj][:nfroyd] from comment #134)
> - the MOZ_ASSERT would fire because we *can* have shorthand properties here;

I think I was assuming that SVG's presentation attributes are all for longhand properties, but that might not be the case for properties that used to be longhands but became shorthands later (but which still have the same presentation attribute).  font-variant is one example.

> - in asan builds, letting shorthand properties through to
>   nsCSSProps::UseCounterFor would cause out-of-bounds accesses:
> 
>   static mozilla::UseCounter UseCounterFor(nsCSSProperty aProperty) {
>     MOZ_ASSERT(0 <= aProperty && aProperty <
> eCSSProperty_COUNT_no_shorthands,
>                "out of range");
>     return gPropertyUseCounter[aProperty];
>   }
> 
> The problem seems to be addressed with this simple patch, but I'm wondering
> if
> this is the best way to fix it, or if it's symptomatic of problems someplace
> else.  The equivalent check in nsCSSExpandedDataBlock::DoTransferFromBlock
> has
> no checks for shorthand properties (or aliases, for that matter)...

nsCSSExpandedDataBlock only stores longhand properties, which is why there's no check there.

I don't recall: do we not support use counters for shorthand properties?  If not -- and I guess from the gPropertyUseCounter array length we don't -- we should instead just count all of the corresponding longhands that get set.  You could either iterate over the properties in mDecl or use CSSPROPS_FOR_SHORTHAND_SUBPROPERTIES to iterate over the longhand components of propertyID, and then set the use counters for each of them.
Attachment #8645958 - Flags: review?(cam) → review-
This patch implements your second suggestion, using
CSSPROPS_FOR_SHORTHAND_SUBPROPERTIES.  WDYT?
Attachment #8645958 - Attachment is obsolete: true
Attachment #8647572 - Flags: review?(cam)
Comment on attachment 8647572 [details] [diff] [review]
fix bogus assert in nsSVGElement.cpp

Review of attachment 8647572 [details] [diff] [review]:
-----------------------------------------------------------------

Looks good, thanks.
Attachment #8647572 - Flags: review?(cam) → review+
Blocks: 1195959
Is there Telemetry-user documentation for this somewhere? How about a new MDN wiki page linked from https://developer.mozilla.org/en-US/docs/Mozilla/Performance/Adding_a_new_Telemetry_probe
Flags: needinfo?(nfroyd)
(In reply to Vladan Djeric (:vladan) -- please needinfo! from comment #140)
> Is there Telemetry-user documentation for this somewhere? How about a new
> MDN wiki page linked from
> https://developer.mozilla.org/en-US/docs/Mozilla/Performance/
> Adding_a_new_Telemetry_probe

There is not.  I should write one and also experiment with writing a custom dashboard so that the %usage for each one is obvious.
Flags: needinfo?(nfroyd)
Blocks: 1204994
Flags: needinfo?(seth)
Has this been implemented?
If so, how is the data accessed?
Flags: needinfo?(nfroyd)
This bug really needs to be resolved FIXED. There is definitely some followup documentation and other work to do, but it's continuing to confuse people that this bug is open but the mechanism landed.

I feel pretty strongly that as we start to use more of these, we should do the more efficient counter collection scheme that I described at https://groups.google.com/forum/#!searchin/mozilla.dev.platform/use$20counters$20smedberg/mozilla.dev.platform/ktoEer_cdJA/Ni58spbvKgAJ (so that if somebody never uses a feature, we don't bloat the payload). Nathan can you file this bug or ping me if you disagree?
Flags: needinfo?(nfroyd)
Blocks: 1274601
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #145)
> This bug really needs to be resolved FIXED. There is definitely some
> followup documentation and other work to do, but it's continuing to confuse
> people that this bug is open but the mechanism landed.

Indeed, I didn't realize that this didn't get marked as RESOLVED FIXED.

> I feel pretty strongly that as we start to use more of these, we should do
> the more efficient counter collection scheme that I described at
> https://groups.google.com/forum/#!searchin/mozilla.dev.platform/
> use$20counters$20smedberg/mozilla.dev.platform/ktoEer_cdJA/Ni58spbvKgAJ (so
> that if somebody never uses a feature, we don't bloat the payload). Nathan
> can you file this bug or ping me if you disagree?

Bug 1274601.
No longer blocks: 1274601
Status: ASSIGNED → RESOLVED
Closed: 8 years ago
Flags: needinfo?(nfroyd)
Keywords: leave-open
Resolution: --- → FIXED
Target Milestone: --- → mozilla43
Blocks: 1274601
Depends on: 1338606
See Also: → 1845779
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: