Closed Bug 1511818 Opened Last year Closed 11 months ago

4.53 - 41.04% tp5o responsiveness / tp5o_webext responsiveness / tps (linux64, linux64-qr, windows10-64, windows10-64-qr, windows7-32) regression on push e54c1b4fa9c653c03d90600d7dd7d04279c1f889 (Thu Nov 29 2018)

Categories

(Firefox :: Protections UI, defect, P2)

65 Branch
defect

Tracking

()

VERIFIED FIXED
Firefox 66
Tracking Status
firefox-esr60 --- unaffected
firefox63 --- unaffected
firefox64 --- unaffected
firefox65 --- fixed
firefox66 --- fixed

People

(Reporter: Bebe, Assigned: ehsan)

References

(Blocks 1 open bug)

Details

(Keywords: perf, regression, talos-regression, Whiteboard: [privacy65])

Attachments

(1 file)

Talos has detected a Firefox performance regression from push:

https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?changeset=e54c1b4fa9c653c03d90600d7dd7d04279c1f889

As author of one of the patches included in that push, we need your help to address this regression.

Regressions:

 41%  tp5o responsiveness windows7-32 opt e10s stylo                0.85 -> 1.20
 39%  tp5o responsiveness linux64 opt e10s stylo                    0.98 -> 1.36
 38%  tp5o responsiveness windows10-64 pgo e10s stylo               0.85 -> 1.16
 37%  tp5o responsiveness windows7-32 pgo e10s stylo                0.83 -> 1.13
 34%  tp5o responsiveness windows10-64-qr opt e10s stylo            0.95 -> 1.28
 34%  tp5o_webext responsiveness linux64 opt e10s stylo             1.25 -> 1.67
 33%  tp5o responsiveness linux64 pgo e10s stylo                    0.84 -> 1.13
 32%  tp5o_webext responsiveness windows10-64-qr opt e10s stylo     1.21 -> 1.61
 31%  tp5o_webext responsiveness windows7-32 opt e10s stylo         1.14 -> 1.50
 31%  tp5o_webext responsiveness windows10-64 opt e10s stylo        1.22 -> 1.59
 30%  tp5o_webext responsiveness windows10-64 pgo e10s stylo        1.12 -> 1.46
 29%  tp5o_webext responsiveness linux64 pgo e10s stylo             1.12 -> 1.44
 27%  tp5o_webext responsiveness linux64-qr opt e10s stylo          1.78 -> 2.26
 26%  tp5o_webext responsiveness windows7-32 pgo e10s stylo         1.10 -> 1.39
  5%  tps windows10-64-qr opt e10s stylo                            8.29 -> 8.67


You can find links to graphs and comparison views for each of the above tests at: https://treeherder.mozilla.org/perf.html#/alerts?id=17927

On the page above you can see an alert for each affected platform as well as a link to a graph showing the history of scores for this test. There is also a link to a treeherder page showing the Talos jobs in a pushlog format.

To learn more about the regressing test(s), please see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Tests

For information on reproducing and debugging the regression, either on try or locally, see: https://wiki.mozilla.org/Performance_sheriffing/Talos/Running

*** Please let us know your plans within 3 business days, or the offending patch(es) will be backed out! ***

Our wiki page outlines the common responses and expectations: https://wiki.mozilla.org/Performance_sheriffing/Talos/RegressionBugsHandling
Component: General → Tracking Protection
Product: Testing → Firefox
Target Milestone: --- → Firefox 65
Version: Version 3 → 65 Branch
We should make sure to look into this.
Flags: needinfo?(ehsan)
Priority: -- → P2
Whiteboard: [privacy65]
Thanks for the bug report, Florin!  Do you have any Talos profile links (before and after the regression) to share please?
Flags: needinfo?(ehsan) → needinfo?(fstrugariu)
I tried running talos tests locally to see which one of these patches has caused the regression but the noise in the tests was too much for this to be a fruitful effort, so ended up pushing a series of try jobs to investigate further.
Flags: needinfo?(ehsan)
I looked at my try pushes today.  Unfortunately I messed up one of my pushes (the one that was testing the backout of part 2) so don't have results to report from that one yet, and I do suspect that one as the main underlying cause at this point.  I made another push to get results there but backing out parts 3-6 doesn't seem to fix the regression.

I also started to look at quite a few of the individual Talos profiles in despair looking for straws to grab, and found this one: https://perfht.ml/2GaFCe6.  There you can see that the URI mutation stuff I added in part 2 is showing up in the profile.  It would be really sad if that is indeed the cause but I pushed a patch to try to replace that with bare string manipulation to see what happens.  Will have more to report later, probably tomorrow.
Another update.  I finally confirmed that the regression is indeed caused by part 2: <https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=8ea04fa2b3728b33dee13e26603e0fae579820a5&newProject=try&newRevision=d28cf868755f5a0202bc4ed12b1f4eaa9270e18e&framework=1>.  Now I'm testing various strategies for a fix...
Even moving to idle dispatch didn't help (it made things much worse!)

If I get really desperate I may resort to just using _domainSet here.  I am about to run out of ideas.
https://treeherder.mozilla.org/perf.html#/compare?originalProject=try&originalRevision=8ea04fa2b3728b33dee13e26603e0fae579820a5&newProject=try&newRevision=0f568c92b942604a8c3de47450ce0704ebdbcd02&framework=1 shows that a strategy based on removing this code completely is necessary to fix this regression: https://searchfox.org/mozilla-central/rev/fd62b95c187a40b328d9e7fd9d848833a6942b57/browser/modules/BrowserUsageTelemetry.jsm#256-259

In other words, we need to slightly change what bug 1509047 part 2 implemented by changing tracking "unique origins" to "unique domains" and not worry about cases like counting hosts with raw IPv4/v6 addresses.  I think that's an acceptable trade-off for fixing this regression, given that I've tried all other avenues and nothing works besides just removing this code.
Flags: needinfo?(ehsan)
This is an attempt to fix a Talos regression caused by the cost of URL
parsing.  The approach I'm taking here is to modify the requirements of
bug 1509047 part 2 slightly so that I can remove the expensive code
which resulted in the Talos regression.

The change in behaviour is that the automatic access grants for the
Storage Access API will use the number of unique domains visited in the
session as opposed to the number of unique origins visited.
Pushed by eakhgari@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/1a679a931855
Keep track of unique domains with a valid hostname for the purposes of the automatic access grant heuristics of the Storage Access API r=mikedeboer
https://hg.mozilla.org/mozilla-central/rev/1a679a931855
Status: NEW → RESOLVED
Closed: 11 months ago
Resolution: --- → FIXED
Target Milestone: Firefox 65 → Firefox 66
Comment on attachment 9030470 [details]
Bug 1511818 - Keep track of unique domains with a valid hostname for the purposes of the automatic access grant heuristics of the Storage Access API

[Beta/Release Uplift Approval Request]

Feature/Bug causing the regression: Bug 1509047

User impact if declined: Responsiveness regression

Is this code covered by automated tests?: Yes

Has the fix been verified in Nightly?: Yes

Needs manual test from QE?: No

If yes, steps to reproduce: 

List of other uplifts needed: None

Risk to taking this patch: Low

Why is the change risky/not risky? (and alternatives if risky): The fix is pretty localized, and it partially reverts the change made in bug 1509047 to the way the code worked before that bug landed.

String changes made/needed: None.
Attachment #9030470 - Flags: approval-mozilla-beta?
Assignee: nobody → ehsan
Comment on attachment 9030470 [details]
Bug 1511818 - Keep track of unique domains with a valid hostname for the purposes of the automatic access grant heuristics of the Storage Access API

[Triage Comment]
Fixes a perf regression with the new Storage Access API. Approved for 65.0b5.
Attachment #9030470 - Flags: approval-mozilla-beta? → approval-mozilla-beta+
Backed out changeset for frequently failing xpc tests in services/sync/tests/unit/. 

Push with failures: https://treeherder.mozilla.org/#/jobs?repo=mozilla-beta&searchStr=windows%2C7%2Cdebug%2Cxpcshell%2Ctests%2Ctest-windows7-32%2Fdebug-xpcshell%2Cx%28x%29&group_state=expanded&fromchange=d5f982689c717bba98621154f5dde808319eaa91&tochange=d2bd2bb0c329208ce96e3c6fcb5e741f88c35227&selectedJob=217105789

Failure logs:
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=217105789&repo=mozilla-beta&lineNumber=13648
https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=217136607&repo=mozilla-beta&lineNumber=12257

Backout link: https://hg.mozilla.org/releases/mozilla-beta/rev/d2bd2bb0c329208ce96e3c6fcb5e741f88c35227

22:55:52  WARNING -  PROCESS-CRASH | services/sync/tests/unit/test_load_modules.js | application crashed [@ mozilla::places::Database::Init()]
22:55:52     INFO -  Crash dump filename: c:\users\task_1544825738\appdata\local\temp\xpc-other-qb7afv\9eb7797b-2ec6-476a-8497-fa61e6928b6d.dmp
22:55:52     INFO -  Operating system: Windows NT
22:55:52     INFO -                    6.1.7601 Service Pack 1
22:55:52     INFO -  CPU: x86
22:55:52     INFO -       GenuineIntel family 6 model 63 stepping 2
22:55:52     INFO -       8 CPUs
22:55:52     INFO -  GPU: UNKNOWN
22:55:52     INFO -  Crash reason:  EXCEPTION_BREAKPOINT
22:55:52     INFO -  Crash address: 0x567e7a21
22:55:52     INFO -  Process uptime: 181 seconds
22:55:52     INFO -  Thread 0 (crashed)
22:55:52     INFO -   0  xul.dll!mozilla::places::Database::Init() [Database.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 491 + 0x0]
22:55:52     INFO -      eip = 0x567e7a21   esp = 0x001af114   ebp = 0x001af140   ebx = 0x001af120
22:55:52     INFO -      esi = 0x0df0b820   edi = 0x001af12c   eax = 0x7278b6a4   ecx = 0x000001eb
22:55:52     INFO -      edx = 0x00000000   efl = 0x00000206
22:55:52     INFO -      Found by: given as instruction pointer in context
22:55:52     INFO -   1  xul.dll!mozilla::places::Database::GetSingleton() [Database.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 356 + 0xf]
22:55:52     INFO -      eip = 0x567e738e   esp = 0x001af148   ebp = 0x001af14c
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -   2  xul.dll!nsNavHistory::Init() [nsNavHistory.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 435 + 0x1d]
22:55:52     INFO -      eip = 0x5682e2e6   esp = 0x001af154   ebp = 0x001af16c
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -   3  xul.dll!nsNavHistory::GetSingleton() [nsNavHistory.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 381 + 0xf]
22:55:52     INFO -      eip = 0x5682e0d2   esp = 0x001af174   ebp = 0x001af17c
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -   4  xul.dll!static nsresult nsNavHistoryConstructor(class nsISupports *, const struct nsID & const, void * *) [nsPlacesModule.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 18 + 0x30]
22:55:52     INFO -      eip = 0x5685a7b0   esp = 0x001af184   ebp = 0x001af19c
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -   5  xul.dll!mozilla::GenericFactory::CreateInstance(nsISupports *,nsID const &,void * *) [GenericFactory.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 16 + 0xc]
22:55:52     INFO -      eip = 0x51fef862   esp = 0x001af1a4   ebp = 0x001af1b0
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -   6  xul.dll!nsComponentManagerImpl::CreateInstanceByContractID(char const *,nsISupports *,nsID const &,void * *) [nsComponentManager.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 1064 + 0xd]
22:55:52     INFO -      eip = 0x51ff86cb   esp = 0x001af1b8   ebp = 0x001af1e4
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -   7  xul.dll!nsComponentManagerImpl::GetServiceByContractID(char const *,nsID const &,void * *) [nsComponentManager.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 1417 + 0x12]
22:55:52     INFO -      eip = 0x51ff536a   esp = 0x001af1ec   ebp = 0x001af248
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -   8  xul.dll!nsGetServiceByContractID::operator()(nsID const &,void * *) [nsComponentManagerUtils.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 243 + 0x19]
22:55:52     INFO -      eip = 0x51ffab2d   esp = 0x001af250   ebp = 0x001af264
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -   9  xul.dll!nsCOMPtr_base::assign_from_gs_contractid(nsGetServiceByContractID,nsID const &) [nsCOMPtr.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 82 + 0x6]
22:55:52     INFO -      eip = 0x51f68be0   esp = 0x001af26c   ebp = 0x001af280
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  10  xul.dll!nsCategoryObserver::nsCategoryObserver(nsTSubstring<char> const &) [nsCategoryCache.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 43 + 0x1a]
22:55:52     INFO -      eip = 0x51ff1963   esp = 0x001af288   ebp = 0x001af364
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  11  xul.dll!nsCategoryCache<mozIStorageVacuumParticipant>::LazyInit() [nsCategoryCache.h:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 98 + 0x8]
22:55:52     INFO -      eip = 0x52c95fb2   esp = 0x001af36c   ebp = 0x001af378
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  12  xul.dll!mozilla::storage::VacuumManager::Observe(nsISupports *,char const *,char16_t const *) [VacuumManager.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 332 + 0x1a]
22:55:52     INFO -      eip = 0x52c8936a   esp = 0x001af380   ebp = 0x001af508
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  13  xul.dll!nsIdleServiceDaily::Observe(nsISupports *,char const *,char16_t const *) [nsIdleService.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 106 + 0xf]
22:55:52     INFO -      eip = 0x5538108b   esp = 0x001af510   ebp = 0x001af538
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  14  xul.dll!nsIdleService::IdleTimerCallback() [nsIdleService.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 732 + 0x10]
22:55:52     INFO -      eip = 0x55382cfa   esp = 0x001af540   ebp = 0x001af660
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  15  xul.dll!nsTimerImpl::Fire(int) [nsTimerImpl.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 559 + 0xf]
22:55:52     INFO -      eip = 0x52031bf3   esp = 0x001af668   ebp = 0x001af74c
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  16  xul.dll!nsTimerEvent::Run() [TimerThread.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 259 + 0x8]
22:55:52     INFO -      eip = 0x520157ba   esp = 0x001af754   ebp = 0x001af794
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  17  xul.dll!nsThread::ProcessNextEvent(bool,bool *) [nsThread.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 1157 + 0x6]
22:55:52     INFO -      eip = 0x5201df53   esp = 0x001af79c   ebp = 0x001af888
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  18  xul.dll!NS_ProcessNextEvent(nsIThread *,bool) [nsThreadUtils.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 468 + 0x11]
22:55:52     INFO -      eip = 0x520211c4   esp = 0x001af890   ebp = 0x001af8ac
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  19  xul.dll!nsresult `anonymous namespace'::ParentImpl::ShutdownObserver::Observe(class nsISupports *, const char *, const char16_t *) [BackgroundImpl.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 1065 + 0x23a]
22:55:52     INFO -      eip = 0x52653018   esp = 0x001af8b4   ebp = 0x001af8e0
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  20  xul.dll!nsObserverList::NotifyObservers(nsISupports *,char const *,char16_t const *) [nsObserverList.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 66 + 0xb]
22:55:52     INFO -      eip = 0x51fa2567   esp = 0x001af8e8   ebp = 0x001af90c
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  21  xul.dll!nsObserverService::NotifyObservers(nsISupports *,char const *,char16_t const *) [nsObserverService.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 286 + 0xe]
22:55:52     INFO -      eip = 0x51fa3a1e   esp = 0x001af914   ebp = 0x001af988
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  22  xul.dll!mozilla::ShutdownXPCOM(nsIServiceManager *) [XPCOMInit.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 870 + 0xf]
22:55:52     INFO -      eip = 0x52055789   esp = 0x001af990   ebp = 0x001af9cc
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  23  xul.dll!XRE_XPCShellMain(int,char * *,char * *,XREShellData const *) [XPCShellImpl.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 1410 + 0x7]
22:55:52     INFO -      eip = 0x52c2c1b9   esp = 0x001af9d4   ebp = 0x001afc38
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  24  xul.dll!mozilla::BootstrapImpl::XRE_XPCShellMain(int,char * *,char * *,XREShellData const *) [Bootstrap.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 48 + 0x11]
22:55:52     INFO -      eip = 0x56a0ea84   esp = 0x001afc40   ebp = 0x001afc50
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  25  xpcshell.exe!NS_internal_main(int,char * *,char * *) [xpcshell.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 65 + 0x12]
22:55:52     INFO -      eip = 0x013a1287   esp = 0x001afc58   ebp = 0x001afc78
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  26  xpcshell.exe!wmain [nsWindowsWMain.cpp:2c85064c240cc9e3046fee8b781182a6291d3ee4 : 129 + 0xf]
22:55:52     INFO -      eip = 0x013a1190   esp = 0x001afc80   ebp = 0x001afca4
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  27  xpcshell.exe!static int __scrt_common_main_seh() [exe_common.inl : 288 + 0x1c]
22:55:52     INFO -      eip = 0x013f6604   esp = 0x001afcac   ebp = 0x001afcec
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  28  kernel32.dll!BaseThreadInitThunk + 0x12
22:55:52     INFO -      eip = 0x75f73c45   esp = 0x001afcf4   ebp = 0x001afcf8
22:55:52     INFO -      Found by: previous frame's frame pointer
22:55:52     INFO -  29  ntdll.dll!CreatePipe + 0x231
22:55:52     INFO -      eip = 0x76fe37f5   esp = 0x001afd00   ebp = 0x001afd38
22:55:52     INFO -      Found by: call frame info
22:55:52     INFO -  30  ntdll.dll!CreatePipe + 0x204
22:55:52     INFO -      eip = 0x76fe37c8   esp = 0x001afd40   ebp = 0x001afd50
22:55:52     INFO -      Found by: call frame info
Flags: needinfo?(ehsan)
This failing only on Windows 7 debug, I guess the other fails are blocked by Bug 1508664.
Looks like these continued to happen post-backout?
Flags: needinfo?(csabou)
Also, this highly-related to the known issue tracked in bug 1505772. I think this should just re-land on Beta.
Flags: needinfo?(ehsan)
Re-landed this as it shouldn't have been backed out. Sorry for the mess.
Flags: needinfo?(csabou)
== Change summary for alert #18183 (as of Thu, 13 Dec 2018 13:38:40 GMT) ==

Improvements:

 32%  tp5o responsiveness opt e10s stylo            1.25 -> 0.85
 31%  tp5o responsiveness opt e10s stylo            1.32 -> 0.91
 30%  tp5o responsiveness pgo e10s stylo            1.20 -> 0.83
 29%  tp5o responsiveness opt e10s stylo            1.28 -> 0.91
 28%  tp5o responsiveness opt e10s stylo            1.35 -> 0.97
 28%  tp5o responsiveness pgo e10s stylo            1.15 -> 0.83
 27%  tp5o responsiveness pgo e10s stylo            1.14 -> 0.83
 26%  tp5o_webext responsiveness opt e10s stylo     1.59 -> 1.17
 25%  tp5o_webext responsiveness opt e10s stylo     1.71 -> 1.27
 25%  tp5o_webext responsiveness pgo e10s stylo     1.51 -> 1.13
 24%  tp5o_webext responsiveness pgo e10s stylo     1.45 -> 1.10
 24%  tp5o_webext responsiveness pgo e10s stylo     1.40 -> 1.07
 23%  tp5o_webext responsiveness opt e10s stylo     2.33 -> 1.80
 22%  tp5o responsiveness opt e10s stylo            1.93 -> 1.50
  4%  tps opt e10s stylo                            8.68 -> 8.31

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=18183
Status: RESOLVED → VERIFIED
You need to log in before you can comment on or make changes to this bug.