The default bug view has changed. See this FAQ.

Intermittent TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js | process created before main is run, and other errors

RESOLVED FIXED in Firefox 10

Status

()

Firefox
Session Restore
RESOLVED FIXED
6 years ago
4 years ago

People

(Reporter: mbrubeck, Assigned: gsvelto)

Tracking

({intermittent-failure})

Trunk
Firefox 10
x86_64
Linux
intermittent-failure
Points:
---
Dependency tree / graph
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Details

Attachments

(2 attachments, 4 obsolete attachments)

(Reporter)

Description

6 years ago
https://tbpl.mozilla.org/php/getParsedLog.php?id=6554847&tree=Firefox
Rev3 Fedora 12x64 mozilla-central opt test mochitest-other on 2011-09-26 11:13:18 PDT for push 44ef245b8706

builder: mozilla-central_fedora64_test-mochitest-other
slave: talos-r3-fed64-036
starttime: 1317060798.19
results: warnings (1)
buildid: 20110926073617
builduid: f8ea518cda884d3981afe49eb8c18f0b
revision: 44ef245b8706

INFO TEST-END | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_514751.js | finished in 173ms
TEST-INFO | checking window state
TEST-INFO | unknown test url | before wait for focus -- loaded: complete active window: ([object ChromeWindow]) chrome://browser/content/browser.xul focused window: ([object Window]) about:blank desired window: ([object ChromeWindow]) chrome://browser/content/browser.xul child window: ([object Window]) about:blank docshell visible: true
TEST-INFO | unknown test url | already focused
TEST-INFO | unknown test url | maybe run tests <load:true, focus:true> -- loaded: complete active window: ([object ChromeWindow]) chrome://browser/content/browser.xul focused window: ([object Window]) about:blank desired window: ([object ChromeWindow]) chrome://browser/content/browser.xul child window: ([object Window]) about:blank docshell visible: true
TEST-START | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js
TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js | process created before main is run ({process:(new Date(1317061148373)), createTopLevelWindow:(new Date(1317061149410)), firstPaint:(new Date(1317061152628)), sessionRestored:(new Date(1317061470046))})
Stack trace:
    JS frame :: chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js :: test :: line 4
    JS frame :: chrome://mochikit/content/browser-test.js :: Tester_execTest :: line 277
    JS frame :: chrome://mochikit/content/browser-test.js :: <TOP_LEVEL> :: line 227
    native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0

TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js | main ran before first paint ({process:(new Date(1317061148373)), createTopLevelWindow:(new Date(1317061149410)), firstPaint:(new Date(1317061152628)), sessionRestored:(new Date(1317061470046))})
Stack trace:
    JS frame :: chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js :: test :: line 8
    JS frame :: chrome://mochikit/content/browser-test.js :: Tester_execTest :: line 277
    JS frame :: chrome://mochikit/content/browser-test.js :: <TOP_LEVEL> :: line 227
    native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0

TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js | Session restored after main ({process:(new Date(1317061148373)), createTopLevelWindow:(new Date(1317061149410)), firstPaint:(new Date(1317061152628)), sessionRestored:(new Date(1317061470046))})
Stack trace:
    JS frame :: chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js :: test :: line 10
    JS frame :: chrome://mochikit/content/browser-test.js :: Tester_execTest :: line 277
    JS frame :: chrome://mochikit/content/browser-test.js :: <TOP_LEVEL> :: line 227
    native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0

INFO TEST-END | chrome://mochitests/content/browser/browser/components/sessionstore/test/browser/browser_522375.js | finished in 28ms

Skipping 18902 lines...

INFO TEST-END | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug618502.js | finished in 585ms
TEST-INFO | checking window state
TEST-INFO | unknown test url | before wait for focus -- loaded: complete active window: ([object ChromeWindow]) chrome://browser/content/browser.xul focused window: ([object Window]) about:blank desired window: ([object ChromeWindow]) chrome://browser/content/browser.xul child window: ([object Window]) about:blank docshell visible: true
TEST-INFO | unknown test url | already focused
TEST-INFO | unknown test url | maybe run tests <load:true, focus:true> -- loaded: complete active window: ([object ChromeWindow]) chrome://browser/content/browser.xul focused window: ([object Window]) about:blank desired window: ([object ChromeWindow]) chrome://browser/content/browser.xul child window: ([object Window]) about:blank docshell visible: true
TEST-START | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js
*** LOG addons.repository: Requesting http://127.0.0.1:8888/extensions-dummy/metadata?appOS=Linux&appVersion=9.0a1&tMain=%TIME_MAIN%&tFirstPaint=4255&tSessionRestored=354498
TEST-INFO | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | appOS=Linux&appVersion=9.0a1&tMain=%TIME_MAIN%&tFirstPaint=4255&tSessionRestored=354498
TEST-PASS | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | OS should be correct
TEST-PASS | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | Version should be correct
TEST-UNEXPECTED-FAIL | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | Should be a sensible tMain
Stack trace:
    JS frame :: chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js :: observe :: line 45
    JS frame :: resource://gre/modules/AddonRepository.jsm :: <TOP_LEVEL> :: line 1164
    JS frame :: resource://gre/modules/AddonRepository.jsm :: <TOP_LEVEL> :: line 734
    JS frame :: chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js :: test :: line 63
    JS frame :: chrome://mochikit/content/browser-test.js :: Tester_execTest :: line 277
    JS frame :: chrome://mochikit/content/browser-test.js :: <TOP_LEVEL> :: line 227
    native frame :: <unknown filename> :: <TOP_LEVEL> :: line 0

TEST-PASS | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | Should be a sensible tFirstPaint
TEST-PASS | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | Should be a sensible tSessionRestored
TEST-INFO | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | Console message: [JavaScript Error: "formatURL: Couldn't find value for key: TIME_MAIN" {file: "resource:///components/nsURLFormatter.js" line: 131}]
TEST-INFO | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | Console message: LOG addons.repository: Requesting http://127.0.0.1:8888/extensions-dummy/metadata?appOS=Linux&appVersion=9.0a1&tMain=%TIME_MAIN%&tFirstPaint=4255&tSessionRestored=354498
TEST-PASS | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | Should have seen metadata request
INFO TEST-END | chrome://mochitests/content/browser/toolkit/mozapps/extensions/test/browser/browser_bug623950.js | finished in 22ms
Comment hidden (Treeherder Robot)
Taras, this seems pretty odd to have fail. It actually seems like TIME_MAIN is just never set (seeing as the AOM test fails too).
Blocks: 522375

Comment 3

6 years ago
(In reply to Paul O'Shannessy [:zpao] from comment #2)
> Taras, this seems pretty odd to have fail. It actually seems like TIME_MAIN
> is just never set (seeing as the AOM test fails too).

What's AOM?

This is bizarre, we are looking into it. Are these failures all 64bit linux? Not clear from later entries.

Comment 4

6 years ago
This could be a regression from stricter checking introduced by bug 670008 http://hg.mozilla.org/mozilla-central/rev/caf2dc068c23

This combined with clock-skew and/or some issue with using jiffies could cause main > process-creation

Comment 5

6 years ago
We can confirm above by adding a check for STARTUP_MEASUREMENT_ERRORS telemetry histogram to the test.

Comment 6

6 years ago
Created attachment 565385 [details] [diff] [review]
relying on time sucks
Attachment #565385 - Flags: review?(mh+mozilla)

Comment 7

6 years ago
Instead of checking main value and discarding it if doesn't make sense, the patch leaves main in. This is OK because .main and .firstPaint assignment should always result in correct values(ie PR_Now doesn't fail on unix).
 This will not fix the orange, but will help us diagnose if the error is actually to due to the process timestamp or something funky is going on with the main timestamp.
Attachment #565385 - Flags: review?(mh+mozilla) → review+

Updated

6 years ago
Keywords: checkin-needed

Updated

6 years ago
Assignee: nobody → tglek
In my queue with a few other bits that are being sent to try first and then onto inbound :-)

https://tbpl.mozilla.org/?tree=Try&rev=c1528769b893
Status: NEW → ASSIGNED
Keywords: checkin-needed
https://hg.mozilla.org/integration/mozilla-inbound/rev/be7e806933a4
Target Milestone: --- → Firefox 10
(Reporter)

Comment 10

6 years ago
https://hg.mozilla.org/mozilla-central/rev/be7e806933a4
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
Comment hidden (Treeherder Robot)
Keywords: intermittent-failure
Whiteboard: [orange]
(In reply to Taras Glek from comment #7)
>  This will not fix the orange, but will help us diagnose

Oops.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
The recent surge in this was caused by the landing of bug 841651.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 100

4 years ago
I'm having some trouble reproducing this on my machine. I'm using a 64-bit debug build (/w optimization turned on) on a Fedora 18 machine.

I can get browser_addonrepository_performance.js to fail but it will bail out with a "Should be a sensible tFirstPaint" error message instead of "Should be a sensible tMain". Printing out tMain and tFirstPaint yields me a >0 value for tMain (which in the logs linked from this bug must have been set to something else for the test to fail) and the '%TIME_FIRST_PAINT%' string for tFirstPaint.

On the other hand I was unable to reproduce the issue in browser_addonrepository_performance.js.

Since these bugs seem to be timing-related is there any known way to trigger them in an easier way? I'll try loading my machine a lot in order to slow down threads with lower priority but I'm not sure if this is going to be enough to reproduce them consistently.
Comment hidden (Treeherder Robot)
(In reply to Gabriele Svelto [:gsvelto] from comment #100)
> I'm having some trouble reproducing this on my machine. I'm using a 64-bit
> debug build (/w optimization turned on) on a Fedora 18 machine.

Note that this is happening almost exclusively on 32bit builds. Might help in catching what's going on.
(Assignee)

Comment 103

4 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #102)
> Note that this is happening almost exclusively on 32bit builds. Might help
> in catching what's going on.

OK, I'll try to make 32-bit build but I think I've figured out what's wrong here: the process creation and main timings are obtained by accessing the mozilla::StartupTimeline. The 'main' value is recorded by invoking:

StartupTimeline::Record(StartupTimeline::MAIN);

Within XREMain::XRE_mainInit(), this will use PR_Now() to record the time of the call. The 'process start' event on the other hand is computed in nsAppStartup.cpp by the ThreadedCalculateProcessCreationTimestamp() call. This call will look at the /proc/*/stat file of the current process and derive the time when the kernel created the process by looking at the |starttime| field.

The source for this field is the number of clock ticks as measured by the kernel since startup, the value returned from PR_Now() on the other hand is obtained via gettimeofday(). Since the two time sources are different it doesn't surprise me that they might be off-sync and comparing their results directly is probably a bad idea. A way to mitigate this issue would be to modify nsAppStartup::GetStartupInfo() to clamp obviously wrong times for process creation. We already have a check for that so I could easily add another one to fix the issue.

The second issue is also caused by this: tMain is being computed as startupInfo.main - startupInfo.process in toolkit/mozapps/extensions/AddonRepository.jsm thus yielding a negative value.
Comment hidden (Treeherder Robot)
(Assignee)

Comment 105

4 years ago
Created attachment 720738 [details] [diff] [review]
[PATCH] Ensure that the startup MAIN timestamp is never older than the PROCESS_CREATION one

This patch compares the timestamps for the MAIN and PROCESS_CREATION events and if MAIN appears to have occurred earlier than PROCESS_CREATION sets both to the value of MAIN.

This will fix the issue we're currently seeing but will obviously skew the measurements when the two timestamps are inconsistent. Alternatively we could set PROCESS_CREATION to -1 if it's value is inconsistent with MAIN and bump Telemetry::STARTUP_MEASUREMENT_ERRORS like we do already. In this case we'd need to patch both tests to check for STARTUP_MEASUREMENT_ERRORS and skip the tests if those errors are encountered.

Both approaches work for me so if we think this patch is not the right way to go I'll prepare another one against the tests ASAP.
Attachment #720738 - Flags: review?(mh+mozilla)
Comment on attachment 720738 [details] [diff] [review]
[PATCH] Ensure that the startup MAIN timestamp is never older than the PROCESS_CREATION one

Review of attachment 720738 [details] [diff] [review]:
-----------------------------------------------------------------

I think it would be better to mark the measures as errors.
Attachment #720738 - Flags: review?(mh+mozilla) → review-
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 109

4 years ago
Created attachment 720793 [details] [diff] [review]
[PATCH v2] Record invalid PROCESS_CREATION timestamps and adjust tests to cope with them

Revised patch, this uses telemetry to record invalid PROCESS_CREATION timestamps and adjusts the tests so that the ones for which we don't have reliable data are skipped.
Attachment #720738 - Attachment is obsolete: true
Attachment #720793 - Flags: review?(mh+mozilla)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment on attachment 720793 [details] [diff] [review]
[PATCH v2] Record invalid PROCESS_CREATION timestamps and adjust tests to cope with them

Review of attachment 720793 [details] [diff] [review]:
-----------------------------------------------------------------

I'm ok with the C++ change, but I'm not sure silently skipping the test is the right thing to do. Is it possible to turn them into known-failures?
Attachment #720793 - Flags: review?(mh+mozilla) → review+
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 126

4 years ago
Created attachment 721148 [details] [diff] [review]
[PATCH v3] Record invalid PROCESS_CREATION timestamps and adjust tests to respond to errors with known failures

This is a revised patch which uses todo() to record known failures instead of skipping the tests altogether. Unfortunately I'm not familiar enough with mochitest to figure out if this is the right approach so before obsoleting the previous patch I'll try to find a mochitest expert to have a look at my changes.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 132

4 years ago
Created attachment 721297 [details] [diff] [review]
[PATCH v4] Record invalid PROCESS_CREATION timestamps and adjust tests to respond to errors with known failures

I've investigated a little further how to handle the mochitests and apparently the common idiom we use for this situation is to record a known failure via a |todo(false, "explain why this is a known failure")| call so I've revised my patch accordingly.

Mike, should I ask an additional review from a unit-test peer for this patch? If it's not needed I'll just obsolete the second version and request for checking-in this one if the try run is green.

The try run (slimmed down to touch only the affected tests) is here:

https://tbpl.mozilla.org/?tree=Try&rev=cc05a863544e
Attachment #721148 - Attachment is obsolete: true
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Updated

4 years ago
Attachment #720793 - Attachment is obsolete: true
(Assignee)

Comment 147

4 years ago
The try-run was green, the patch C++ bits are unmodified compared to the one that was r+'d by Mike and the tests now issue known failures when the problem is detected like other mochitests do. Since no other changes were added if nobody has anything against it I think the patch should be good to checkin with r=mh+mozilla, feel free to clear the checkin-needed keyword if you feel otherwise.

To avoid confusion the patch that needs to be checked-in is attachment 721297 [details] [diff] [review].
Keywords: checkin-needed
https://hg.mozilla.org/integration/mozilla-inbound/rev/f856bcb5097f

*fingers crossed* - Thanks for working on this, Gabriele!
Keywords: checkin-needed
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Backed out for xpcshell failures.
https://hg.mozilla.org/integration/mozilla-inbound/rev/bc5920039204

https://tbpl.mozilla.org/php/getParsedLog.php?id=20386661&tree=Mozilla-Inbound
Comment hidden (Treeherder Robot)
(Assignee)

Comment 154

4 years ago
(In reply to Ryan VanderMeulen [:RyanVM] from comment #152)
> Backed out for xpcshell failures.
> https://hg.mozilla.org/integration/mozilla-inbound/rev/bc5920039204

OK, I've traced back the errors to the session recorder, which requires the process startup time to be set to a valid value:

https://mxr.mozilla.org/mozilla-central/source/services/datareporting/sessions.jsm#219

In addition I've found other consumers relying on the process creation timestamp field being present and set to a valid value:

https://mxr.mozilla.org/mozilla-central/source/toolkit/components/telemetry/TelemetryPing.js#118
https://mxr.mozilla.org/mozilla-central/source/toolkit/mozapps/extensions/AddonRepository.jsm#744
https://mxr.mozilla.org/mozilla-central/source/browser/metro/base/content/browser-ui.js#185
https://mxr.mozilla.org/mozilla-central/source/mobile/xul/chrome/content/browser-ui.js#499
https://mxr.mozilla.org/mozilla-central/source/browser/components/nsBrowserGlue.js#445

So we've got two options here:

- Going back to my first patch where we would fake the process creation timestamp by setting it to be equal to the main timestamp when we detect the values are inverted. To mitigate the hackiness of this we might still record a telemetry error to indicate that something went wrong when the measurement happened.

- Changing all consumers to use the main timestamp instead of the process creation timestamp as the startup time. This is more intrusive but would have the advantage of actually having all consumers use timestamps that were taken via the same time source (i.e. PR_Now()) and so are necessarily comparable.

BTW I've spotted another corner case when we would get erroneous values in these fields: since PR_Now() uses gettimeofday() it's susceptible to time-jumps caused when the system clock is manually set and these could be reflected in the recorded timestamps. Considering this is a very unlikely event I don't think we should bother too much with it but it serves to show that having more checks to sanitize these values can't hurt.
Maybe bug 793735 would be part of the solution, although last time i looked, it was tricky...
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 162

4 years ago
(In reply to Mike Hommey [:glandium] from comment #155)
> Maybe bug 793735 would be part of the solution, although last time i looked,
> it was tricky...

Yes, it would definitely fix any possible time-jump issue caused by things like NTP kicking in and adjusting for DST or so but I would still need to find a way to accommodate for the process creation time we compute from /proc/*/stat. After giving it more thought I think the best approach is to set process creation to a valid value (i.e. main) in case the value is inconsistent but still record the issue using telemetry. This way all consumers should be happy but we will still be able to issue known failures in the mochitests.
Agreed.
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
(Assignee)

Comment 174

4 years ago
I've got a revised patch, it's not ready yet because I've been hitting a snag when the timestamps are reset when the browser is restarted. I'll proceed to fixing bug 793735 as soon as I'm done here. I've got a WIP for that one too but I'd like to finish here first in order not to drag the problem along.
Assignee: taras.mozilla → gsvelto
Comment hidden (Treeherder Robot)
(Assignee)

Comment 176

4 years ago
Created attachment 722780 [details] [diff] [review]
[PATCH v5] Record invalid PROCESS_CREATION timestamps and adjust tests to respond to errors with known failures

This is hopefully the last iteration of this patch. The changes with the previous one are a bit subtle so I'm requesting the review again. The changes are essentially the following:

- Whenever an inconsistent process creation timestamp is detected telemetry is used to record that it was erroneous.

- If there is a valid main function timestamp it is used to replace the erroneous process creation timestamp to prevent consumers expecting it to be there from breaking. If no valid main function timestamp is present the erroneous process creation timestamp is set to -1 just like before the patch.

- Tests have been modified to issue known failures depending on the presence of telemetry errors instead of the value of the process creation timestamp.

I've pushed the patch to try with all mochitests & xpcshell tests and it turned out green:

https://tbpl.mozilla.org/?tree=Try&rev=65390f74a834
Attachment #721297 - Attachment is obsolete: true
Attachment #722780 - Flags: review?(mh+mozilla)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Attachment #722780 - Flags: review?(mh+mozilla) → review+
(Assignee)

Comment 199

4 years ago
Thanks for the review! Hopefully this should be fixed for good this time.

Again to avoid any confusion with the older patch the one to be checked in is attachment 722780 [details] [diff] [review].
Keywords: checkin-needed
Comment hidden (Treeherder Robot)
https://hg.mozilla.org/integration/mozilla-inbound/rev/d6f51f2bac4b
Flags: in-testsuite+
Keywords: checkin-needed
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
https://hg.mozilla.org/mozilla-central/rev/d6f51f2bac4b
Status: REOPENED → RESOLVED
Last Resolved: 6 years ago4 years ago
Resolution: --- → FIXED
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
Comment hidden (Treeherder Robot)
You need to log in before you can comment on or make changes to this bug.