Set MOZ_APP_RESTART later

RESOLVED FIXED in Firefox 24

Status

()

defect
RESOLVED FIXED
7 years ago
6 years ago

People

(Reporter: glandium, Assigned: gsvelto)

Tracking

unspecified
mozilla25
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox24 fixed, firefox25 fixed)

Details

(Whiteboard: [qa-])

Attachments

(2 attachments, 2 obsolete attachments)

MOZ_APP_RESTART, which is used as timestamp for process creation on a restart, is set more or less at the beginning of shutdown, which, as we know, can take some time, which means process creation time ends up being completely wrong.
(In reply to Mike Hommey [:glandium] from comment #0)
> MOZ_APP_RESTART, which is used as timestamp for process creation on a
> restart, is set more or less at the beginning of shutdown, which, as we
> know, can take some time, which means process creation time ends up being
> completely wrong.

I think that's correct. That shutdown is in the way of the next startup.
(In reply to Taras Glek (:taras) from comment #1)
> (In reply to Mike Hommey [:glandium] from comment #0)
> > MOZ_APP_RESTART, which is used as timestamp for process creation on a
> > restart, is set more or less at the beginning of shutdown, which, as we
> > know, can take some time, which means process creation time ends up being
> > completely wrong.
> 
> I think that's correct. That shutdown is in the way of the next startup.

If you want that shutdown to be accounted, it would be better if it was separated. Currently, there is no way to know whether the long start time comes from shutdown or startup.
Inspired by :mak's idea in bug 896025 comment 6 I've cooked up a patch that uses the MOZ_APP_RESTART environment variable only to detect that Firefox was restarted, not to store the time at which it was shut down. When restarting instead of reading the new "fake" process startup time from this environment variable the code now re-compute the process creation timestamp using the already present first-available-timestamp. This has a number of advantages:

- It should fix this bug as the measurements after a restart are now in line with the first startup (I've verified this in the telemetry)

- It does not require a machine-specific counter to be serialized into the variable allowing the relevant code to be factorized

- It should also fix issues with a stray MOZ_APP_RESTART environment variable being left behind such as bug 896025

I have given only light testing to this patch (Linux only essentially) but as soon as I'll have verified it on more platforms I'll submit it for review.
Assignee: nobody → gsvelto
Status: NEW → ASSIGNED
CC'ing :mak because this might also be the right fix for bug 896025
it looks good and it's also a good solution to bug 896025
Blocks: 896025
This is a slightly revised patch. The functionality is exactly the same as described in comment 3 but I've moved the implementation of the generic methods out of the header file and into an OS-independent .cpp file. The rationale behind this choice is that our build time is long enough already and I don't want to add more code to a header that's included in a lot of different places. Additionally this makes for a nice place to put the OS-independent parts of this class (I've moved the definition of the class' static fields there too).

BTW if I understand our build-files policy correctly this patch shouldn't need to be reviewed by a build-peer because it's only adding an item in an existing list variable in the moz.build file.
Attachment #782625 - Attachment is obsolete: true
Attachment #783155 - Flags: review?(nfroyd)
(In reply to Gabriele Svelto [:gsvelto] from comment #6)
> BTW if I understand our build-files policy correctly this patch shouldn't
> need to be reviewed by a build-peer because it's only adding an item in an
> existing list variable in the moz.build file.

that's correct.
Comment on attachment 783155 [details] [diff] [review]
[PATCH] When restarting recompute the process startup timestamp instead of using the timestamp stored it in MOZ_APP_RESTART during the shutdown procedure

Review of attachment 783155 [details] [diff] [review]:
-----------------------------------------------------------------

r=me.  Hooray for unifying the ProcessCreation implementations!

::: xpcom/ds/TimeStamp.cpp
@@ +24,5 @@
> +  if (sProcessCreation.IsNull()) {
> +    char *mozAppRestart = PR_GetEnv("MOZ_APP_RESTART");
> +    TimeStamp ts;
> +
> +    if (mozAppRestart && strcmp(mozAppRestart, "")) {

I think explicit comparisons to 0 are better with strcmp.

I was a little confused with this check and the PR_SetEnv call below; it might be worthwhile explaining that PR_SetEnv may just set MOZ_APP_RESTART to the null string and that's why we can't just check for the presence of MOZ_APP_RESTART.

@@ +30,5 @@
> +      // process startup time and unset MOZ_APP_RESTART.
> +      ts = sFirstTimeStamp;
> +      PR_SetEnv("MOZ_APP_RESTART=");
> +    } else {
> +      TimeStamp now = TimeStamp::Now();

Nit: No TimeStamp:: prefix needed here.

::: xpcom/ds/TimeStamp.h
@@ +349,5 @@
> +   *
> +   * @returns The number of microseconds since the calling process was started
> +   *          or 0 if an error was encountered while computing the uptime
> +   */
> +  static uint64_t ComputeProcessUptime();

This should be private, not public.

::: xpcom/ds/TimeStamp_posix.cpp
@@ +305,5 @@
>  // Computes and returns the process uptime in ns on various BSD flavors.
>  // Returns 0 if an error was encountered.
>  
>  static uint64_t
> +TimeStamp::ComputeProcessUptime()

Remove the static qualifier here.

@@ +346,5 @@
>  
>  #else
>  
>  static uint64_t
> +TimeStamp::ComputeProcessUptime()

...and here.
Attachment #783155 - Flags: review?(nfroyd) → review+
Thanks for the review Nathan. I've refreshed the patch addressing the issues you raised in comment 9 except this one:

(In reply to Nathan Froyd (:froydnj) from comment #9)
> ::: xpcom/ds/TimeStamp.h
> @@ +349,5 @@
> > +   *
> > +   * @returns The number of microseconds since the calling process was started
> > +   *          or 0 if an error was encountered while computing the uptime
> > +   */
> > +  static uint64_t ComputeProcessUptime();
> 
> This should be private, not public.

I've double-checked and the method is already under the private part of the class. I've added documentation for it to make it more comprehensible just like I did to other private fields that I added in the past.
Attachment #783155 - Attachment is obsolete: true
(In reply to Gabriele Svelto [:gsvelto] from comment #10)
> > > +  static uint64_t ComputeProcessUptime();
> > 
> > This should be private, not public.
> 
> I've double-checked and the method is already under the private part of the
> class. I've added documentation for it to make it more comprehensible just
> like I did to other private fields that I added in the past.

Doh, you're right.  My mistake!
I've finally managed to get a working Windows build and the telemetry is looking good there too. I'm also running the final patch through the try servers: 

https://tbpl.mozilla.org/?tree=Try&rev=b6e81c7b8be2

Once it's done I'll flag this for check-in and we can also verify that it fixes 896025. Afterwards I'd also like to fix bug 858927 so that we can get coherent timestamps even outside of Gecko.
Try run is green except for a couple of intermittent timeouts and an error on Windows 7 which looks JIT-related but which I've never seen before. It looks like a parsing error or something along the lines which I don't think is related to my patch.
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/612a421df3ae
Status: ASSIGNED → RESOLVED
Closed: 6 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla25
We should probably uplift this, at least to Aurora, to avoid wrong startup measurements and bug 896025
Attachment 783740 [details] [diff] doesn't apply cleanly to aurora so here's a back-ported patch; the changes are minimal and only regard the moz.build file.

[Approval Request Comment]
Bug caused by (feature/regressing bug #): The way process creation time was measured after a restart has been in place since bug 628363 from what I could gather
User impact if declined: Minimal, under certain conditions the user can see a prompt telling her that Firefox startup was slow (a very slow shutdown or a crash such as in bug 896025)
Testing completed (on m-c, etc.): Testing was done on m-c, the try run is here https://tbpl.mozilla.org/?tree=Try&rev=b6e81c7b8be2 . Additionally I've manually built aurora on my box and checked that the telemetry results are correct after restarts
Risk to taking this patch (and alternatives if risky): Minimal, the affected code is identical in aurora and m-c
String or IDL/UUID changes made by this patch: none
Attachment #784335 - Flags: approval-mozilla-aurora?
(In reply to Gabriele Svelto [:gsvelto] from comment #17)
> Created attachment 784335 [details] [diff] [review]
> [PATCH aurora] When restarting recompute the process startup timestamp
> instead of using the timestamp stored it in MOZ_APP_RESTART during the
> shutdown procedure
> 
> Attachment 783740 [details] [diff] doesn't apply cleanly to aurora so here's
> a back-ported patch; the changes are minimal and only regard the moz.build
> file.
> 
> [Approval Request Comment]
> Bug caused by (feature/regressing bug #): The way process creation time was
> measured after a restart has been in place since bug 628363 from what I
> could gather
> User impact if declined: Minimal, under certain conditions the user can see
> a prompt telling her that Firefox startup was slow (a very slow shutdown or
> a crash such as in bug 896025)
> Testing completed (on m-c, etc.): Testing was done on m-c, the try run is
> here https://tbpl.mozilla.org/?tree=Try&rev=b6e81c7b8be2 . Additionally I've
> manually built aurora on my box and checked that the telemetry results are
> correct after restarts
> Risk to taking this patch (and alternatives if risky): Minimal, the affected
> code is identical in aurora and m-c
> String or IDL/UUID changes made by this patch: none

Any chance  could set-up test profiles to ensure 896025 was resolved and the slow-start up algorithm is working as expected in addition to the m-c testing ?
Flags: needinfo?(gsvelto)
Attachment #784335 - Flags: approval-mozilla-aurora? → approval-mozilla-beta?
(In reply to bhavana bajaj [:bajaj] from comment #18)
> Any chance  could set-up test profiles to ensure 896025 was resolved and the
> slow-start up algorithm is working as expected in addition to the m-c
> testing ?

Marco did some testing over in bug 896205 comment 16 that confirms this fix works as expected on m-c.
Attachment #784335 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
(In reply to Nathan Froyd (:froydnj) from comment #19)
> Marco did some testing over in bug 896205 comment 16 that confirms this fix
> works as expected on m-c.

I think you meant bug 896025 comment 16
Assuming no QA needed here. Please remove [qa-] from the whiteboard and add the verifyme keyword if this needs QA.
Whiteboard: [qa-]
You need to log in before you can comment on or make changes to this bug.