sessionrestore seems broken in general

RESOLVED FIXED in Firefox 65

Status

enhancement
RESOLVED FIXED
9 months ago
8 months ago

People

(Reporter: igoldan, Assigned: mconley)

Tracking

Trunk
mozilla65
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(firefox65 fixed)

Details

Attachments

(5 attachments)

We get profiles for ts_paint, tpaint, a11y but fail on sessionrestore- looking at the logs, it appears that sessionrestore is broken in general- maybe there is an issue with sessionrestore and profile gathering, it was recently converted to a web extension.

There's also something suspicious with the test generally though, baseline measurements for sessionrestore_no_auto_restore are higher than vanilla sessionrestore measurements, which makes no sense.
While we're looking closely at sessionrestore, this would be a good time to take care of Florian's suggestion in https://bugzilla.mozilla.org/show_bug.cgi?id=1491997#c11
Started looking at this so I can investigate bug 1503796.

I think I know what's happening here. Patches shortly.
Assignee: nobody → mconley
Essentially, there used to be code that called TalosContentProfiler.initFromURLQueryParams before dumping the profile. That call piped together a bunch of arguments that were passed through the command-line - and, critically, let startup tests (which kicks off profiling before Talos even gets loaded) know where to save start-up test files.

This initFromURLQueryParams stuff isn't happening now - and even when I patch it so it does, it's being ignored because TalosParentProfiler is already "initted" when we're start-up profiling.

So, unsurprisingly, a minor mess on our hands here.

I have some patches baking on try with a potential fix, but I'm only 50% confident that it'll work for all Talos tests that support profiling. I have some alternatives if that falls through.
So my original idea didn't pan out - it didn't work for pageloader tests. This is because TalosParentProfiler would be loaded by pageloader, and then a separate instance would be loaded by some pageloader tests, and they'd go out of sync. Trying to make the TalosParentProfiler's initted boolean be true if the profiler was already running was a very weak attempt on my part in bug 1374333 to try to have agreed state between multiple copies of TalosParentProfiler (the only state that'd kinda be in agreement would be the initted value).

My new approach is to make the TalosParentProfiler a singleton - that way, everybody's talking to the same TalosParentProfiler.

This seems to fix things.

To fix sessionrestore, I needed to then make it call TalosParentProfiler.initFromURLQueryParams with the stuff it's getting from the command line. I also got rid of some dead code.
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/e9d439c05850
Make TalosParentProfiler a singleton module. r=aswan
https://hg.mozilla.org/integration/autoland/rev/c544521fcec8
Make sure sessionrestore tests initialize the TalosParentProfiler. r=aswan
https://hg.mozilla.org/integration/autoland/rev/7bd65d135a65
Remove dead code from sessionrestore Talos test. r=aswan
https://hg.mozilla.org/integration/autoland/rev/443ffec35c5f
Make pageloader tests wait for TalosPowers to become available before starting. r=aswan
Backed out for damp failures on  TalosParentProfiler

backout: https://hg.mozilla.org/integration/autoland/rev/46dbc3229ed09b52e31c2f7adad962878b8c460b


failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=210692931&repo=autoland&lineNumber=767

23:51:19     INFO -  TEST-INFO | started process 5732 (C:\Users\task_1541720192\build\application\firefox\firefox -wait-for-browser -no-deelevate -profile c:\users\task_1541720192\appdata\local\temp\tmpolpmvu\profile)
23:51:20     INFO -  PID 5732 | 1541721080188	addons.webextension.screenshots@mozilla.org	WARN	Loading extension 'screenshots@mozilla.org': Reading manifest: Invalid host permission: resource://pdf.js/
23:51:20     INFO -  PID 5732 | 1541721080188	addons.webextension.screenshots@mozilla.org	WARN	Loading extension 'screenshots@mozilla.org': Reading manifest: Invalid host permission: about:reader*
23:51:22     INFO -  PID 5732 | [Parent 1120, Gecko_IOThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
23:51:22     INFO -  PID 5732 | [Child 9028, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
23:51:22     INFO -  PID 5732 | [Child 9028, Chrome_ChildThread] WARNING: pipe error: 109: file z:/build/build/src/ipc/chromium/src/chrome/common/ipc_channel_win.cc, line 346
23:51:24     INFO -  PID 5732 | Initialize the head file with a reference to this DAMP instance
23:51:24    ERROR -  PID 5732 | TEST-UNEXPECTED-FAIL | damp | TypeError: TalosParentProfiler is undefined; can't access its "resume" property
23:51:24     INFO -  Terminating psutil.Process(pid=5732L, name='firefox.exe', started='23:51:19')
23:51:31     INFO -  rmtree() failed for "('c:\\users\\task_1541720192\\appdata\\local\\temp\\tmpolpmvu',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
23:51:31     INFO -  rmtree() failed for "('c:\\users\\task_1541720192\\appdata\\local\\temp\\tmpolpmvu',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
23:51:31     INFO -  rmtree() failed for "('c:\\users\\task_1541720192\\appdata\\local\\temp\\tmpolpmvu',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
23:51:31     INFO -  rmtree() failed for "('c:\\users\\task_1541720192\\appdata\\local\\temp\\tmpolpmvu',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
23:51:31     INFO -  rmtree() failed for "('c:\\users\\task_1541720192\\appdata\\local\\temp\\tmpolpmvu',)". Reason: The process cannot access the file because it is being used by another process (13). Retrying...
23:51:31     INFO -  Exception while removing profile directory: c:\users\task_1541720192\appdata\local\temp\tmpolpmvu
23:51:31     INFO -  [Error 32] The process cannot access the file because it is being used by another process: 'c:\\users\\task_1541720192\\appdata\\local\\temp\\tmpolpmvu\\profile\\cert9.db'
23:51:31     INFO -  TEST-UNEXPECTED-ERROR | damp | unexpected error
23:51:31    ERROR -  Traceback (most recent call last):
23:51:31     INFO -    File "C:\Users\task_1541720192\build\tests\talos\talos\run_tests.py", line 304, in run_tests
23:51:31     INFO -      talos_results.add(mytest.runTest(browser_config, test))
23:51:31     INFO -    File "C:\Users\task_1541720192\build\tests\talos\talos\ttest.py", line 64, in runTest
23:51:31     INFO -      return self._runTest(browser_config, test_config, setup)
23:51:31     INFO -    File "C:\Users\task_1541720192\build\tests\talos\talos\ttest.py", line 215, in _runTest
23:51:31     INFO -      debugger_args=browser_config['debugger_args']
23:51:31     INFO -    File "C:\Users\task_1541720192\build\tests\talos\talos\talos_process.py", line 156, in run_browser
23:51:31     INFO -      raise TalosError("unexpected error")
23:51:31     INFO -  TalosError: unexpected error
23:51:31     INFO -  TEST-INFO took 13056ms
23:51:31     INFO -  SUITE-END | took 13s
23:54:29     INFO -  WARNING | IO Completion Port failed to signal process shutdown
23:54:29     INFO -  Parent process 5732 exited with children alive:
23:54:29     INFO -  PIDS: 1120, 872, 7500, 7280, 5048
23:54:29     INFO -  Attempting to kill them, but no guarantee of success
23:54:29    ERROR - Return code: 2
Flags: needinfo?(mconley)
Huh. So I think hg absorb ate my change to the damp test. :/ So I guess I'll be a little more skeptical of it now. :/
Flags: needinfo?(mconley)
Pushed by mconley@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/21a96274fe32
Make TalosParentProfiler a singleton module. r=aswan
https://hg.mozilla.org/integration/autoland/rev/ea40fb28a2bf
Make sure sessionrestore tests initialize the TalosParentProfiler. r=aswan
https://hg.mozilla.org/integration/autoland/rev/46dd8d73a77a
Remove dead code from sessionrestore Talos test. r=aswan
https://hg.mozilla.org/integration/autoland/rev/c6589add558c
Make pageloader tests wait for TalosPowers to become available before starting. r=aswan
https://hg.mozilla.org/integration/autoland/rev/64449193e189
Ensure TalosParentProfiler is around before running the DAMP tests. r=aswan
See Also: → 1495069
You need to log in before you can comment on or make changes to this bug.