Last Comment Bug 684091 - Telemetry for image decode count
: Telemetry for image decode count
Status: RESOLVED FIXED
:
Product: Core
Classification: Components
Component: ImageLib (show other bugs)
: unspecified
: x86 Mac OS X
: -- normal (vote)
: mozilla9
Assigned To: Jeff Muizelaar [:jrmuizel]
:
: Milan Sreckovic [:milan]
Mentors:
Depends on: 673292
Blocks: image-suck 683286
  Show dependency treegraph
 
Reported: 2011-09-01 16:29 PDT by Jeff Muizelaar [:jrmuizel]
Modified: 2011-10-23 16:08 PDT (History)
4 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
Add the ability to subtract from histograms (1.78 KB, patch)
2011-09-01 16:40 PDT, Jeff Muizelaar [:jrmuizel]
taras.mozilla: review+
Details | Diff | Splinter Review
Keep a histogram of how often we've decoded a particular image (6.31 KB, patch)
2011-09-01 16:42 PDT, Jeff Muizelaar [:jrmuizel]
justin.lebar+bug: review-
Details | Diff | Splinter Review
Keep a histogram of how often we've decoded images v2 (5.81 KB, patch)
2011-09-06 08:03 PDT, Jeff Muizelaar [:jrmuizel]
justin.lebar+bug: review+
Details | Diff | Splinter Review

Description Jeff Muizelaar [:jrmuizel] 2011-09-01 16:29:51 PDT

    
Comment 1 Jeff Muizelaar [:jrmuizel] 2011-09-01 16:40:04 PDT
Created attachment 557693 [details] [diff] [review]
Add the ability to subtract from histograms
Comment 2 Jeff Muizelaar [:jrmuizel] 2011-09-01 16:42:19 PDT
Created attachment 557696 [details] [diff] [review]
Keep a histogram of how often we've decoded a particular image
Comment 3 Justin Lebar (not reading bugmail) 2011-09-01 17:32:42 PDT
Sid, is the policy to do a privacy review on every new histogram?  This one seems obviously fine to me, but marking privacy-review-needed just in case.
Comment 4 Justin Lebar (not reading bugmail) 2011-09-02 14:06:47 PDT
Comment on attachment 557696 [details] [diff] [review]
Keep a histogram of how often we've decoded a particular image

>@@ -186,16 +187,17 @@ RasterImage::RasterImage(imgStatusTracke
>   mFrameDecodeFlags(DECODE_FLAGS_DEFAULT),
>   mAnim(nsnull),
>   mLoopCount(-1),
>   mObserver(nsnull),
>   mLockCount(0),
>   mDecoder(nsnull),
>   mWorker(nsnull),
>   mBytesDecoded(0),
>+  mDecodeCount(0),
> #ifdef DEBUG
>   mFramesNotified(0),
> #endif
>   mHasSize(PR_FALSE),
>   mDecodeOnDraw(PR_FALSE),
>   mMultipart(PR_FALSE),
>   mDiscardable(PR_FALSE),
>   mHasSourceData(PR_FALSE),
>@@ -203,16 +205,17 @@ RasterImage::RasterImage(imgStatusTracke
>   mHasBeenDecoded(PR_FALSE),
>   mWorkerPending(PR_FALSE),
>   mInDecoder(PR_FALSE),
>   mAnimationFinished(PR_FALSE)
> {
>   // Set up the discard tracker node.
>   mDiscardTrackerNode.curr = this;
>   mDiscardTrackerNode.prev = mDiscardTrackerNode.next = nsnull;
>+  Telemetry::GetHistogramById(Telemetry::IMAGE_DECODE_COUNT)->Add(mDecodeCount);

This was kind of confusing to me.  Maybe Add(0) would be more clear?

> //******************************************************************************
> RasterImage::~RasterImage()
> {
>@@ -240,16 +243,17 @@ RasterImage::~RasterImage()
>   DiscardTracker::Remove(&mDiscardTrackerNode);
> 
>   // If we have a decoder open, shut it down
>   if (mDecoder) {
>     nsresult rv = ShutdownDecoder(eShutdownIntent_Interrupted);
>     if (NS_FAILED(rv))
>       NS_WARNING("Failed to shut down decoder in destructor!");
>   }
>+  Telemetry::GetHistogramById(Telemetry::IMAGE_DECODE_COUNT)->Subtract(mDecodeCount);

Hm, is this right?  If we never decode the image, we'll Add(0) in the constructor, but then we shouldn't ever Subtract(0).  Or am I misunderstanding?
Comment 5 Sid Stamm [:geekboy or :sstamm] 2011-09-02 14:16:57 PDT
(In reply to Justin Lebar [:jlebar] from comment #3)
> Sid, is the policy to do a privacy review on every new histogram?  This one
> seems obviously fine to me, but marking privacy-review-needed just in case.

Thanks... we should just do a quick once-over on new histograms, nothing too in depth for most.  Can you explain what's being measured in more detail?  Are you measuring time it takes to decode *any* image, or specific images?  Are we collecting information about specific images in a way that will tell us what specific image files users have loaded?
Comment 6 Justin Lebar (not reading bugmail) 2011-09-02 14:33:51 PDT
When a page contains a compressed image (e.g. a jpeg), we have to decode the jpeg before we can display it onscreen.  But those decoded images take up a lot more memory than the compressed images do, so we throw the decoded images away after the user switches away from the tab containing the images.  When the user switches back to the tab, we re-decode the images.

This histogram measures the distribution over "how many times did we decode this image?".  So histogram values of 0 -> 100, 1 -> 200, 2 -> 50 would indicate that we never decoded 100 images, that we decoded 200 images once, and that we decoded 50 images twice.
Comment 7 Sid Stamm [:geekboy or :sstamm] 2011-09-02 14:48:24 PDT
Okay, so if there's nothing being transmitted about individual images (hashes of the URL fingerprints or something), there doesn't seem to be anything privacy-risky here.  If I understand correctly you're, tallying how many images were decoded in each of the decoding passes.
Comment 8 Justin Lebar (not reading bugmail) 2011-09-02 14:52:37 PDT
> Okay, so if there's nothing being transmitted about individual images (hashes of the URL 
> fingerprints or something)

Correct, none of that stuff.

> If I understand correctly you're, tallying how many images were decoded in each of the decoding 
> passes.

We're tallying how many times a compressed image was thrown away and then re-decoded. (Plus whether it was ever decoded in the first place.)
Comment 9 Sid Stamm [:geekboy or :sstamm] 2011-09-02 15:25:37 PDT
(a bit of "teach noob sid via IRC" happened since the last comment). 

I now understand what we're measuring: the distribution of "reincarnations" of images.  

If there's one image that's incredibly common on one type of site and it gets re-decoded far more frequently than all others (like a hypothetical tracking beacon present on on most porn sites) we could discern from your histogram whether or not you've encountered this "most interesting" image and thus, whether or not you're spending time on that segment of the web.  

Not sure if that's a *giant* privacy risk (it is hypothetical), but we're not intending to collect that information, but are nevertheless learning it via this histogram.

What are we hoping to learn/fix by measuring this?
Comment 10 Jeff Muizelaar [:jrmuizel] 2011-09-06 07:55:37 PDT
(In reply to Sid Stamm [:geekboy] from comment #9)
> (a bit of "teach noob sid via IRC" happened since the last comment). 
> 
> I now understand what we're measuring: the distribution of "reincarnations"
> of images. 

It depends what you mean by "reincarnations". The decision to discard or decode is largely not controllable by content. i.e. there's no reason we need to redecode when loading a new a page.

> If there's one image that's incredibly common on one type of site and it
> gets re-decoded far more frequently than all others (like a hypothetical
> tracking beacon present on on most porn sites) we could discern from your
> histogram whether or not you've encountered this "most interesting" image
> and thus, whether or not you're spending time on that segment of the web.  

I think this would be very difficult to pull off. First, to pull this specific image out of the noise of other images stored in the histogram would be tricky. Second, there's no direct correlation between how frequent an image is and how often we redecode it. 

> Not sure if that's a *giant* privacy risk (it is hypothetical), but we're
> not intending to collect that information, but are nevertheless learning it
> via this histogram.
> 
> What are we hoping to learn/fix by measuring this?

We want to track if were decoding images the right amount. i.e we can tune things so we redecode too much or not enough. This also helps show if there are a lot of images that don't need to be decoded at all.
Comment 11 Jeff Muizelaar [:jrmuizel] 2011-09-06 07:58:24 PDT
(In reply to Justin Lebar [:jlebar] from comment #4)
> >   mDiscardTrackerNode.curr = this;
> >   mDiscardTrackerNode.prev = mDiscardTrackerNode.next = nsnull;
> >+  Telemetry::GetHistogramById(Telemetry::IMAGE_DECODE_COUNT)->Add(mDecodeCount);
> 
> This was kind of confusing to me.  Maybe Add(0) would be more clear?

I can switch to Add(0).


> >+  Telemetry::GetHistogramById(Telemetry::IMAGE_DECODE_COUNT)->Subtract(mDecodeCount);
> 
> Hm, is this right?  If we never decode the image, we'll Add(0) in the
> constructor, but then we shouldn't ever Subtract(0).  Or am I
> misunderstanding?


Yeah, this hunk should just go. It was from a previous version of this statistic that only tracked live images. For now, I'd like to keep the information from dead images around.
Comment 12 Jeff Muizelaar [:jrmuizel] 2011-09-06 08:03:55 PDT
Created attachment 558477 [details] [diff] [review]
Keep a histogram of how often we've decoded images v2
Comment 13 Justin Lebar (not reading bugmail) 2011-09-06 08:05:28 PDT
> We want to track if were decoding images the right amount. i.e we can tune things so we redecode too 
> much or not enough.

Sid suggested that a histogram showing the amount of time between a tab being unfocused and refocused might be more useful for tuning this.

> This also helps show if there are a lot of images that don't need to be decoded at all.

Well, it shows us if there are a lot of images that we *don't* decode, which isn't the same thing as showing us that there are a lot of images that we *could choose* not to decode.  IOW, it's good for seeing whether changes we made to become lazier about decoding images had an impact, but it's not good for showing us whether we need to become lazier.
Comment 14 Justin Lebar (not reading bugmail) 2011-09-06 08:07:27 PDT
Comment on attachment 558477 [details] [diff] [review]
Keep a histogram of how often we've decoded images v2

r=me; the code looks right, and there's at least a plausible use-case for this data.  But I think we should look into tab focused/unfocused timings, as those might be more useful for tuning the discard timer.
Comment 15 Sid Stamm [:geekboy or :sstamm] 2011-09-06 14:18:07 PDT
I agree with what :jlebar said in comment 13 and comment 14.  I there's no serious privacy risk here (removing privacy-review-needed keyword), but have a hunch there might be other, more helpful measurements that can solve our problems.
Comment 16 (dormant account) 2011-09-07 16:49:04 PDT
Comment on attachment 557693 [details] [diff] [review]
Add the ability to subtract from histograms

># HG changeset patch
># Parent 3801b68949daee91c886c2e3e0a4f1d4a8a82b96
>Bug 684091. Add the ability to subtract values from Histograms
>
>diff --git a/ipc/chromium/src/base/histogram.cc b/ipc/chromium/src/base/histogram.cc
>--- a/ipc/chromium/src/base/histogram.cc
>+++ b/ipc/chromium/src/base/histogram.cc
>@@ -125,16 +125,27 @@ void Histogram::Add(int value) {
>   if (value < 0)
>     value = 0;
>   size_t index = BucketIndex(value);
>   DCHECK_GE(value, ranges(index));
>   DCHECK_LT(value, ranges(index + 1));
>   Accumulate(value, 1, index);
> }
> 
>+void Histogram::Subtract(int value) {
>+  if (value > kSampleType_MAX - 1)
>+    value = kSampleType_MAX - 1;
>+  if (value < 0)
>+    value = 0;
>+  size_t index = BucketIndex(value);
>+  DCHECK_GE(value, ranges(index));
>+  DCHECK_LT(value, ranges(index + 1));
>+  Accumulate(value, -1, index);
>+}
>+

I would prefer if instead of copying code here, you stuck the common logic into a int Normalize(int value) utility function

However it's a matter of taste, so r+ either way. 

Sorry for the slow review.

Note You need to log in before you can comment on or make changes to this bug.