Firefox reliably exceeds memory when origin telemetry test mode is active

RESOLVED FIXED in Firefox 68

Status

()

defect
RESOLVED FIXED
2 months ago
2 months ago

People

(Reporter: englehardt, Assigned: xeonchen)

Tracking

unspecified
mozilla68
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox68 fixed)

Details

Attachments

(2 attachments)

I'm on Nightly 68 build 20190513215004 on Linux. When I enable origin telemetry test mode, I can reliably cause Firefox to exceed system memory (thus locking the machine). I have 32GB of memory, so it's definitely not a low resource issue.

STR:

  1. On a fresh profile, enable test mode: telemetry.origin_telemetry_test_mode.enabled = true.
  2. Go to https://senglehardt.com/test/prio/sample_trackers_remote.html. (There's a small chance of NSFW favicons on this page)
  3. Close the browser.
  4. Using htop, or your favorite resource monitor, watch the FF process quickly fill memory.

Note that I'm able to trigger the OOM issues if I continually refresh the page. Once I get to around ~160 origins blocked the browser exceeds memory. I suspect the issue is somewhere in the building of the Prio pings.

Gary or Chris, are you able to reproduce?

Flags: needinfo?(xeonchen)
Flags: needinfo?(chutten)

Comment on attachment 9064705 [details]
Bug 1551410 - Part 1: increment loop index;

(In reply to Steven Englehardt [:englehardt] from comment #1)

Gary or Chris, are you able to reproduce?

Yes, I can reproduce this, and have found potential root cause.

Chris, would you help to check if this modification is reasonable? I don't really understand this part...

Flags: needinfo?(xeonchen)
Attachment #9064705 - Flags: feedback?(chutten)

Cripes, I knew I should've dropped more comments in there.

The idea is to, for a given metric, transform the {origin: count, ...} bag into the bit vector shards prio needs. If count were only ever absent/0/1, this would be easy, but since count can be any number we may need any number of bit vector "generations" (sets of shards) to encode the number.

Like if the bag was {fb.com: 2} and the size of the origin list is 2.5k, we need 2 shards to encode a full list of falses for the other 2.5k - 1 origins and the one true at "fb.com"'s index. And then we need another 2 shards to encode those 2.5k - 1 falses and 1 true. Each of those sets of 2 shards are "generations" in this block of code.

And yes I missed the increment. The tests miss the case where we ask for the encoded snapshot when there's more than one accumulation for that metric of a single origin. RecordOriginTwiceMixed could be augmented to test this.

Flags: needinfo?(chutten)
Attachment #9064705 - Flags: feedback?(chutten)
Attachment #9064705 - Attachment description: Bug 1551410 - increment loop index; → Bug 1551410 - Part 1: increment loop index;

I think we should land part 1 first for QA to continue their test. Let's leave this open until the test code is done.

Assignee: nobody → xeonchen
Keywords: leave-open
Pushed by xeonchen@gmail.com:
https://hg.mozilla.org/integration/autoland/rev/50cab7d5d57a
Part 1: increment loop index; r=chutten
Keywords: leave-open
Status: NEW → RESOLVED
Closed: 2 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
You need to log in before you can comment on or make changes to this bug.