Closed Bug 653268 Opened 13 years ago Closed 12 years ago

[Snippet Service] Track snippet load numbers

Categories

(Snippets :: Service, defect)

Type: defect
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED DUPLICATE of bug 719090

People

(Reporter: lforrest, Assigned: aphadke)

References

Details

Now that snippets are internally hosted, we'd like to get a better idea of how often each snippet is loaded.

This will:
- help us calculate a rough clickthrough rate for individual snippets so we can track and optimize over time
- help us understand who is viewing snippets so we can refine targeting by geo, version, etc.
- let us know how often the default snippet content is served (which equates to the service being down)

There may be a couple of different ways to implement this, with the final objective being some sort of reporting interface that can be exported/viewed on an ongoing basis.
Assignee: build → nobody
QA Contact: coop → webdev
Is there a reason we need absolute numbers here?  As long as we track how frequently snippets are shown (10%, 20%, etc.), we'll have enough information to optimize their display.
(In reply to comment #1)
> Is there a reason we need absolute numbers here?  As long as we track how
> frequently snippets are shown (10%, 20%, etc.), we'll have enough information
> to optimize their display.

Snippets rotate in and out and change often over time. We'd like to be able to compare without content having to be live side-by-side.
Assignee: nobody → malexis
This is going to be difficult (I'm guessing near impossible) because of the caching.

If you request /foo/bar and get snippet A, you are going to get that from cache until the cache expires.  Once the cache expires, the next hit may deliver snippet B from the backend, which would then be cached.  The caching server has no idea there is a difference between the two, and the log hits are all going to show up as /foo/bar.

Maybe Les or Daniel would have an idea as to how to get around this, but the whole point of caching is to avoid the load of "individual hits", which in turn is what you are asking to quantify here (in detail).
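To make the caching issue concrete, here is a minimal TypeScript sketch with entirely made-up names (not how the actual servers work): a URL-keyed cache logs the same path for every hit, no matter which snippet body was actually delivered, so per-snippet counts are lost at this layer.

// Hypothetical sketch: a URL-keyed cache in front of a rotating snippet backend.
type CacheEntry = { body: string; expiresAt: number };

const cache = new Map<string, CacheEntry>();
const snippets = ["<p>snippet A</p>", "<p>snippet B</p>"];
let rotation = 0;

function backendFetch(path: string): string {
  // The backend rotates which snippet it serves for the same path.
  return snippets[rotation++ % snippets.length];
}

function cachedFetch(path: string, now: number, ttlMs: number): string {
  // The access log only ever sees `path` (e.g. "/foo/bar"), regardless of
  // which snippet body is actually delivered.
  console.log(`GET ${path}`);
  const hit = cache.get(path);
  if (hit && hit.expiresAt > now) return hit.body;
  const body = backendFetch(path);
  cache.set(path, { body, expiresAt: now + ttlMs });
  return body;
}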
(In reply to comment #3)

> Maybe Les or Daniel would have an idea as to how to get around this, but the
> whole point of caching is to avoid the load of "individual hits" which in turn
> is what you are asking to quantify here (in detail)

Yeah, off the top of my head, we'd fire off an HTTP request to a metrics service somewhere when a snippet is revealed, whether via image or ajax request.

I'm not totally familiar with what we use for metrics these days, so I can't speak to how that would work exactly or if it could withstand traffic from about:home.

Another thing might be to track metrics on the client side, and then send off a collected report somewhere every time Fx contacts the snippet service. That would reduce the traffic, but I'm not sure if what we use for metrics would support that off-the-shelf. This starts to sound like something akin to crash reporter / socorro.
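A rough sketch of the first idea, where the metrics endpoint and parameter names are placeholders rather than an existing service: the snippet reports its own impression with an image beacon when it is shown, which sidesteps the cache in front of the snippet payload.

// Hypothetical impression ping, fired from the snippet itself when it is revealed.
function reportImpression(snippetId: string): void {
  const beacon = new Image();
  // metrics.example.com is a placeholder for whatever metrics service
  // would actually receive these hits.
  beacon.src =
    "https://metrics.example.com/impression?snippet=" +
    encodeURIComponent(snippetId) +
    "&t=" + Date.now(); // cache-buster so the hit always reaches the logs
}

// e.g. called when the snippet markup is injected into about:home:
reportImpression("HTML5-demo-3-snippet");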
(In reply to comment #4)
 
> Another thing might be to track metrics on the client side, and then send off a
> collected report somewhere every time Fx contacts the snippet service.

Oh, also: This scheme would require some changes to Fx itself, I think, if only to trigger a client-side metric report transmission on snippet fetch.
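And a rough sketch of the client-side variant, again with a placeholder endpoint (the actual transport and storage inside Fx would look different): impressions are counted locally and the accumulated batch is sent whenever the client next contacts the snippet service.

// Hypothetical client-side aggregation: count impressions locally, then send
// the batch when the snippet service is contacted for fresh content.
const counts: Record<string, number> = {};

function recordImpression(snippetId: string): void {
  counts[snippetId] = (counts[snippetId] ?? 0) + 1;
}

async function flushOnSnippetFetch(): Promise<void> {
  const report = { counts: { ...counts }, sentAt: new Date().toISOString() };
  // metrics.example.com is a placeholder endpoint.
  await fetch("https://metrics.example.com/report", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(report),
  });
  // Reset the local counters once the report has been accepted.
  for (const key of Object.keys(counts)) delete counts[key];
}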
(In reply to comment #4)
> Yeah, off the top of my head, we'd fire off an HTTP request to a metrics
> service somewhere when a snippet is revealed, whether via image or ajax
> request.
> 
> I'm not totally familiar with what we use for metrics these days, so I can't
> speak to how that would work exactly or if it could withstand traffic from
> about:home
> 
> Another thing might be to track metrics on the client side, and then send off a
> collected report somewhere every time Fx contacts the snippet service. That
> would reduce the traffic, but I'm not sure if what we use for metrics would
> support that off-the-shelf. This starts to sound like something akin to crash
> reporter / socorro

But even still, the whole reason we cache is to deal with the intense load of connections from every client out there.  The request is to track every connection, and these ideas still require taking a connection from every client somewhere.  In that case we may as well ditch caching and take every connection straight to the cluster (which is not a good idea: we would be in no position to handle such load, and doing so would be a huge cost for just a metrics gain).
(In reply to comment #6)

> But even still, the whole reason we cache is to deal with the intense load of
> connections from every client out there.  The request is to track every
> connection and these ideas still require taking a connection from every client
> somewhere.  In that case we may as well ditch caching and take every connection
> straight to the cluster  (which is not a good idea, we would be in no position
> to handle such load, and to do so would be a huge cost for just a metric gain)

Yup, all true. I don't want to trivialize the effort - I don't have any numbers, but it could be similar in magnitude to collecting Fx crash reports. (i.e. it might not be worth it.)
If I recall correctly, when we designed about:home we purposely used a URL scheme that was equivalent to what we used for the blocklist ping so that it would be (reasonably) straightforward to roll up reporting in the same way.  Given how the system works we're not going to be able to get load rates for specific snippets, but at least we can get the volume overall.
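One way that roll-up could look, assuming an Apache-style access-log format and a guessed snippet URL prefix (neither reflects the actual reporting pipeline): count snippet-fetch lines per day to get overall volume, without any per-snippet breakdown.

// Hypothetical roll-up: daily snippet-fetch volume from access logs.
// The log format and the "/1/Firefox/" prefix are assumptions for illustration.
function dailySnippetFetchCounts(logLines: string[]): Map<string, number> {
  const perDay = new Map<string, number>();
  for (const line of logLines) {
    if (!line.includes("GET /1/Firefox/")) continue; // assumed snippet URL prefix
    const dateMatch = line.match(/\[(\d{2}\/\w{3}\/\d{4})/); // e.g. [20/Apr/2011
    if (!dateMatch) continue;
    const day = dateMatch[1];
    perDay.set(day, (perDay.get(day) ?? 0) + 1);
  }
  return perDay;
}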
(In reply to comment #8)
> If I recall correctly, when we designed about:home we purposely used a URL
> scheme that was equivalent to what we used for the blocklist ping so that it
> would be (reasonably) straightforward to roll up reporting in the same way. 
> Given how the system works we're not going to be able to get load rates for
> specific snippets, but at least we can get the volume overall.

This is true, too. But I'm not sure that number will be very useful. It'll roughly be Firefox ADUs minus whoever didn't load about:home that day (e.g. they never reloaded it, or they set a different home page).

But definitely no per-snippet counts in that number.
(In reply to comment #8)
> If I recall correctly, when we designed about:home we purposely used a URL
> scheme that was equivalent to what we used for the blocklist ping so that it
> would be (reasonably) straightforward to roll up reporting in the same way. 
> Given how the system works we're not going to be able to get load rates for
> specific snippets, but at least we can get the volume overall.

We can easily track user hits; that's no problem at all.  We just can't easily tell who got what snippet.
It won't give us per-snippet counts directly, but we could infer at least a rough idea by making some assumptions.

My thinking was that if we know there were 1M user hits loading the snippet set, there are 4 snippets in random, equally weighted rotation, and we conservatively assume each user has at most 1 impression of the page between refreshes of the snippet set, then we can calculate that each snippet would have something in the range of 250K impressions.

It doesn't need to be perfect. Having a baseline that accounts for growth and is consistent would allow us to measure how conversion rates change as we work to tune and optimize the channel.  As it stands, we only see the click-through count, and that number will likely grow as the user base grows, so it doesn't tell us much about the channel's effectiveness.
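A small worked version of that estimate, with made-up snippet names and weights: each snippet's share of the total hits is its weight divided by the sum of all weights.

// Rough per-snippet impression estimate under the assumptions above:
// N total hits, k snippets in weighted rotation, at most one impression
// per user between refreshes of the snippet set.
function estimateImpressions(
  totalHits: number,
  weights: Record<string, number>,
): Record<string, number> {
  const totalWeight = Object.values(weights).reduce((a, b) => a + b, 0);
  const estimates: Record<string, number> = {};
  for (const [name, weight] of Object.entries(weights)) {
    estimates[name] = Math.round(totalHits * (weight / totalWeight));
  }
  return estimates;
}

// 1M hits, 4 equally weighted snippets -> roughly 250K estimated impressions each.
console.log(estimateImpressions(1_000_000, { a: 1, b: 1, c: 1, d: 1 }));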
We'll be able to handle this request once we get the Metrics Data Collection Module project underway.  Basically, the project is very close to what was mentioned at the bottom of comment #4: a system that collects metrics client-side and sends them in periodically.

Before that service is online, it is unlikely we'll be able to do anything with the actual per-snippet counts.

We could look at implementing some of the metrics that Chris outlined in comment #11, but we need a lot more detail, such as examples of what the data would look like.  We also have to prioritize that work against several other projects we have in the queue for this quarter, including developing the DCM project I mentioned above.
Couldn't we just use Webtrends JavaScript to track click-through rates on the snippets if you consider each snippet an ad?

For example:

WT.ad=HTML5-demo-3-snippet -- for impression
WT.ac=HTML5-demo-3-snippet -- for click

WT.ac / WT.ad = click-through rate.

Regardless of whether the snippet is cached or not, a Webtrends call will be made, and thus impressions, click-through rates, and other valuable geo-location metrics will be stored for analysis.
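Independent of the Webtrends tag itself, the per-snippet arithmetic is just clicks over impressions; a tiny sketch with made-up numbers:

// Per-snippet click-through rate: clicks (WT.ac hits) over impressions (WT.ad hits).
function clickThroughRate(impressions: number, clicks: number): number {
  return impressions > 0 ? clicks / impressions : 0;
}

// e.g. 500 clicks on 120,000 impressions of "HTML5-demo-3-snippet" -> ~0.0042 (0.42%)
console.log(clickThroughRate(120_000, 500));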
Just found out the monthly pageviews for FF home (3.6), and that volume would overwhelm Webtrends for sure unless it is sampled down or we come up with another data collection method. Ignore my previous comment...
(In reply to comment #2)
> (In reply to comment #1)
> > Is there a reason we need absolute numbers here?  As long as we track how
> > frequently snippets are shown (10%, 20%, etc.), we'll have enough information
> > to optimize their display.
> 
> Snippets rotate in and out and change often over time. We'd like to be able
> to compare without content having to be live side-by-side.

As long as we track the start and end date of each snippet, it will be relatively simple to build a very strong proxy for each snippet's click-through rate.  Unless I'm missing something, doing so should capture most of the benefit at a fraction of the cost.
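One possible shape for that proxy, assuming we have daily click totals per snippet and daily about:home volume (all inputs and field names below are hypothetical): clicks recorded while the snippet was live, divided by the estimated impressions over the same window.

// Proxy click-through rate for a snippet: clicks during its live window over
// about:home volume in that window, scaled by the snippet's rotation share.
interface SnippetRun {
  name: string;
  start: string;          // ISO date the snippet went live, e.g. "2011-04-01"
  end: string;            // ISO date it was retired
  rotationShare: number;  // e.g. 0.25 if one of four equally weighted snippets
}

function proxyClickThroughRate(
  run: SnippetRun,
  dailyClicks: Record<string, number>,  // day -> clicks on this snippet's link
  dailyVolume: Record<string, number>,  // day -> total about:home snippet fetches
): number {
  let clicks = 0;
  let impressions = 0;
  for (const day of Object.keys(dailyVolume)) {
    if (day < run.start || day > run.end) continue; // ISO dates compare lexically
    clicks += dailyClicks[day] ?? 0;
    impressions += (dailyVolume[day] ?? 0) * run.rotationShare;
  }
  return impressions > 0 ? clicks / impressions : 0;
}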
(In reply to comment #15)

> As long as we track the start and end date of each snippet, it will be
> relatively simple to build a very strong proxy for each snippet's click
> through rate.  Unless I'm missing something, doing so should capture most of
> the benefit at a fraction of the cost.

I think the issue is that this bug is about snippet *impressions*, not click-throughs. We already measure click-throughs on snippet links, for the most part.
Assignee: malexis → aphadke
Depends on: 690881
Component: Webdev → Service
Product: mozilla.org → Snippets
Version: other → unspecified
The snippet tracking efforts from almost a year ago have since given us many of the metrics requested here. Marking as a duplicate of that project; if we need more stats we'll file a new bug.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → DUPLICATE