Closed Bug 1007723 Opened 10 years ago Closed 10 years ago

Telemetry (?) is causing a high memory consumption and lagging on browser startup

Categories

(Toolkit :: Telemetry, defect)

x86_64
Windows 7
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla33

People

(Reporter: quicksaver, Assigned: rvitillo)

Details

(Whiteboard: [mentor=rvitillo][lang=javascript])

Attachments

(1 file, 1 obsolete file)

Something strange has been happening for the past few months that, until now, I had no clue why it was happening.

Every time I opened Nightly, firefox.exe would hog the processor to the maximum a few moments after startup, lagging everything. Memory consumption increased with every startup, after a week or two it goes way beyond 1Gb, locking the window until whatever it's doing finishes.

At first I thought this was an add-on, but disabling all of them didn't help. Then I thought it could be some sort of profile corruption, so I deleted everything in the profile folder (except for a few things, this is the developer profile I use to code my add-ons). For a while this worked, no processor hogging, no high memory consumption. But after a few days it started happening again, and with each startup it got worse.

So I created a new profile, started over from scratch. Again, everything was fine at first, but after a few days it started happening again, worse with each startup just like before.

It's obvious that's it's something built over time, so I tried comparing my normal personal profile (which I use with firefox release, and doesn't suffer from this issue) with my dev profile. The biggest difference was the saved-telemetry-pings folder. My normal profile's folder only has one file of about 66KB, while my dev's profile has 246 files inside, totaling more than 600MB!

Btw, I started a Nightly session in my dev profile when I started writing this bug, it only stopped hogging the processor now (task manager said it was at 50% this whole time, I have a dual core), and it's currently consuming 1,2GB of memory. This memory is not released until I close the browser.

Here comes the important part, as soon as it stopped hogging CPU, the saved-telemetry-pings folder is emptied! And now, after restarting Nightly, it no longer hogs CPU on startup and it has a normal memory consumption.

Also, after closing it and every time I close/restart the browser, it creates a new file in saved-telemetry-pings. I'm guessing this is a sort of queue of things telemetry has to do at a predetermined point, so that it doesn't do them on every session?

The problem is, I restart this profile a lot (it's my dev profile, I restart to test any changes I make). So, I'm guessing that, if it comes to the time to process these files and there are a lot of them, it will be extremely heavy for the browser to do it, and it will lag considerably. Worse of all, if I restart the browser before this finishes, the files don't seem to be deleted, so they will remain for the following session, lagging that too.

Is there any way I can prevent this? Or is this something that shouldn't be happening in the first place? The only way I found out to keep the files in saved-telemetry-pings folder from being created, was to disable telemetry through the toolkit.telemetry.enabled preference, which I'm guessing is better for a dev profile, as its info is probably not as relevant, but that's a bit beyond the point.

(Is there a telemetry tracking bug I should block with this?)
I'm pretty sure 600MB files is not something that should be happening. Needinfoing people with more of a clue than just that.
Flags: needinfo?(mconley)
Flags: needinfo?(aklotz)
This sounds like it's in the area I worked on in bug 863872, which dealt with sending backlogged Telemetry pings (we send any pings older than a week on start-up, but younger than two weeks - anything older than 2 weeks is chucked).

Quicksaver - if you grab a copy of Fx 26[1] (which didn't have bug 863872 landed on it), do you still see the same effect?

Cc'ing some other folks who I know are familiar with this area of Telemetry.

[1] http://ftp.mozilla.org/pub/mozilla.org/firefox/releases/26.0/
Flags: needinfo?(mconley)
How do I get it to trigger the CPU hog? I can build up the number of files by just restarting a bunch of times sure, but so far they don't seem to get processed automatically. Is this a time-based thing? Or when it reaches a certain number of files it will do it by itself?

I realize this is a bit of a weird question coming from the bugee, but like I said, all the files were cleared last time when I let firefox do its thing while it was hogging CPU, and after that the new files remain in the profile folder no matter how long I keep my session open, and firefox never seemed to hog CPU since. If it's working purely on a schedule, it might take a few days for me to post a reply to this. :)
I believe many, many restarts (which need to occur quickly enough so that Telemetry does not have a chance to send any pings) will pack that folder with queued pings.

That's one way of doing it, anyway.
Flags: needinfo?(aklotz)
So while I'm abusing my Ctrl+Alt+R keys in Nightly so I have a packed folder again to try to reproduce the issue (thank god for the restartless restart add-on!), please explain something to me. What do you consider the bug here is? Is it the fact that it's creating a bunch of queued pings, or that it hogs CPU and memory while processing them?

Because just now I realized I'm not exactly sure what it is that you want me to test with Firefox 26.

Sorry if this is a dumb question, but I'm not very familiar with anything-Telemetry, so I don't know what I should be looking at/for.
(In reply to Luís Miguel [:Quicksaver] from comment #5)
> So while I'm abusing my Ctrl+Alt+R keys in Nightly so I have a packed folder
> again to try to reproduce the issue (thank god for the restartless restart
> add-on!), please explain something to me. What do you consider the bug here
> is? Is it the fact that it's creating a bunch of queued pings, or that it
> hogs CPU and memory while processing them?
> 
> Because just now I realized I'm not exactly sure what it is that you want me
> to test with Firefox 26.
> 
> Sorry if this is a dumb question, but I'm not very familiar with
> anything-Telemetry, so I don't know what I should be looking at/for.

The huge collection of pings is not great, but is somewhat expected if you're doing lots of restarts.

I'm curious to know if Firefox 26 is similar to 29 in that it consumes a lot of memory / processor on start-up with many items in the telemetry ping queue.
Telemetry pings are not saved by default on dev builds (MOZILLA_OFFICIAL and MOZ_TELEMETRY_REPORTING flags are required). So I wonder, did you change those flags manually or are pings being saved without them being enabled?
Flags: needinfo?(quicksaver)
Assignee: nobody → rvitillo
I don't even know how to do that... So I'm guessing I didn't do it. How do I see if they're enabled?
Flags: needinfo?(quicksaver)
When you said Nightly you probably meant the official Nightly build, not a local one. In that case telemetry is enabled.

What we could do is to set a threshold for the number of saved pings, and once we hit that threshold, forcefully trigger a send and deleting them.
Yes I meant the official Nightly build, I'm sorry if I caused confusion.
Whiteboard: [good first bug][lang=javascript]
The key bug for this problem is bug 833545; we should not be loading all the telemetry pings into memory. This is what consumes time and memory at startup. Bug 1002703 is related; we should ensure that whatever startup-time work we do to check and send saved pings doesn't delay startup.
(In reply to Mike Conley (:mconley) from comment #6)
> I'm curious to know if Firefox 26 is similar to 29 in that it consumes a lot
> of memory / processor on start-up with many items in the telemetry ping
> queue.

160 files in the ping queue ~= 373MB; I didn't go up to 246 files like before, because Nightly already hogs CPU for a while like this, and memory consumption already surpasses 1GB.

Tried opening firefox 26 in that profile, it worked just fine, normal memory consumption, no CPU hogging. Kept a session going for around 10 minutes twice, it never seemed to even touch the files in the saved-telemetry-pings folder.

(I kept a copy of that folder btw, in case you'd like me to run some more similar tests like this.)
Whiteboard: [good first bug][lang=javascript] → [good first bug][mentor=rvitillo][lang=javascript]
I would like to work on it, but I need a mentor to help me.
We can hook you up. :)

Needinfo'ing rvitillo, who is designated as mentor.
Flags: needinfo?(rvitillo)
Felipe, feel free to go ahead and start working on it :)  You can ask questions in the bug or on irc in #perf, my handle there is rvitillo.
Flags: needinfo?(rvitillo)
I installed limeChat to start with irc, how can i find you on irc?
(In reply to felipejpa15 from comment #16)
> I installed limeChat to start with irc, how can i find you on irc?

You need to connect to irc.mozilla.org.

Once you're connected, use this command:

/join #perf

To join the #perf channel. You should find rvitillo in there. Then say:

rvitillo: ping

And he'll respond to you when he's back at his keyboard.
Vladan suggested also that we could keep only the last N most recent entries and delete older ones.
Component: Untriaged → Telemetry
Product: Firefox → Toolkit
Dropping the good first bug tag, because seriously no.
Whiteboard: [good first bug][mentor=rvitillo][lang=javascript] → [mentor=rvitillo][lang=javascript]
Felipe, are you still working on it? I didn't hear back from you for a while.
Flags: needinfo?(felipejpa15)
I'm sorry I don't have time for now. :'(
Flags: needinfo?(felipejpa15)
Comment on attachment 8438509 [details] [diff] [review]
Telemetry (?) is causing a high memory consumption and lagging on browser startup, v1

Review of attachment 8438509 [details] [diff] [review]:
-----------------------------------------------------------------

::: toolkit/components/telemetry/TelemetryFile.jsm
@@ +60,5 @@
>    get OVERDUE_PING_FILE_AGE() {
>      return OVERDUE_PING_FILE_AGE;
>    },
>  
> +  get MAX_LRU_PINGS() {

A little doc?

@@ +147,5 @@
>        let iter = new OS.File.DirectoryIterator(directory);
>        let exists = yield iter.exists();
>  
>        if (exists) {
>          let entries = yield iter.nextBatch();

Ok, but the following change deserves some doc.

@@ +175,5 @@
> +            continue;
> +          }
> +
> +          result.push(this.loadHistograms(entry.path));
> +        }

I would go for:

for (let i = 0; i < MAX_LRU_PINGS; ++i) {
  result.push(...)
}
for (let i = MAX_LRU_PINGS; i < sortedEntries.length; ++i) {
  OS.File.remove(...) // You probably don't even need to `yield` here.
}

::: toolkit/components/telemetry/tests/unit/test_TelemetrySendOldPings.js
@@ +36,5 @@
>  const PING_TIMEOUT_LENGTH = 5000;
>  const EXPIRED_PINGS = 5;
>  const OVERDUE_PINGS = 6;
>  const RECENT_PINGS = 4;
> +const LRU_PINGS = TelemetryFile.MAX_LRU_PINGS;

If this is exported only for the sake of tests, you can access directy MAX_LRU_PINGS in the BackstagePass object of TelemetryFile.jsm, e.g.

Cu.impor("resource://gre/modules/TelemetryFile.jsm", {}).MAX_LRU_PINGS;

@@ +221,5 @@
>  
>  /**
> + * Test that only the most recent LRU_PINGS pings are kept at startup.
> + */
> +add_task(function* () {

Can you add a name to the test?

@@ +235,5 @@
> +      do_check_neq(p.slug, item.slug);
> +    }
> +  }
> +
> +  assertNotSaved(pings);

Shouldn't this be just `tail`?

::: toolkit/components/telemetry/tests/unit/xpcshell.ini
@@ +13,5 @@
>  [test_TelemetryStopwatch.js]
>  [test_TelemetryPingBuildID.js]
>  [test_ThirdPartyCookieProbe.js]
>  [test_TelemetrySendOldPings.js]
> +#skip-if = debug == true || os == "android" # Disabled due to intermittent orange on Android

What did you change here?
Attachment #8438509 - Flags: review?(dteller) → review+
(In reply to David Rajchenbach Teller [:Yoric] from comment #23)
> Comment on attachment 8438509 [details] [diff] [review]
> Telemetry (?) is causing a high memory consumption and lagging on browser
> startup, v1
> 
> Review of attachment 8438509 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> ::: toolkit/components/telemetry/TelemetryFile.jsm
> @@ +60,5 @@
> >    get OVERDUE_PING_FILE_AGE() {
> >      return OVERDUE_PING_FILE_AGE;
> >    },
> >  
> > +  get MAX_LRU_PINGS() {
> 
> A little doc?

There is some documentation above the declaration of const MAX_LRU_PINGS.

> ::: toolkit/components/telemetry/tests/unit/test_TelemetrySendOldPings.js
> @@ +36,5 @@
> >  const PING_TIMEOUT_LENGTH = 5000;
> >  const EXPIRED_PINGS = 5;
> >  const OVERDUE_PINGS = 6;
> >  const RECENT_PINGS = 4;
> > +const LRU_PINGS = TelemetryFile.MAX_LRU_PINGS;
> 
> If this is exported only for the sake of tests, you can access directy
> MAX_LRU_PINGS in the BackstagePass object of TelemetryFile.jsm, e.g.
> 
> Cu.impor("resource://gre/modules/TelemetryFile.jsm", {}).MAX_LRU_PINGS;

Right, I kept it this way to be consistent with MAX_PING_FILE_AGE & OVERDUE_PING_FILE_AGE.

> 
> ::: toolkit/components/telemetry/tests/unit/xpcshell.ini
> @@ +13,5 @@
> >  [test_TelemetryStopwatch.js]
> >  [test_TelemetryPingBuildID.js]
> >  [test_ThirdPartyCookieProbe.js]
> >  [test_TelemetrySendOldPings.js]
> > +#skip-if = debug == true || os == "android" # Disabled due to intermittent orange on Android
> 
> What did you change here?

I forgot to disable the test again before refreshing the patch.
https://hg.mozilla.org/mozilla-central/rev/409f474cc986
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Target Milestone: --- → mozilla33
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: