Closed Bug 1611660 Opened 4 years ago Closed 4 years ago

window.scrollY always returns 0 when Restore (Maximize) button is used

Categories

(Core :: Panning and Zooming, defect, P3)

72 Branch
defect

Tracking

RESOLVED FIXED
mozilla79
Tracking Status
firefox-esr68 --- wontfix
firefox72 --- wontfix
firefox73 --- wontfix
firefox74 --- wontfix
firefox75 --- wontfix
firefox76 --- wontfix
firefox77 --- wontfix
firefox78 --- wontfix
firefox79 --- fixed

People

(Reporter: d3_max_t, Assigned: botond)

References

(Regression)

Details

(Keywords: perf, perf-alert, regression)

Attachments

(2 files)

User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/79.0.3945.130 Safari/537.36 Edg/79.0.309.71

Steps to reproduce:

I have some very simple JavaScript code running to detect when a user has scrolled away from the top of the page and when they have scrolled back to the top. I am simply checking the value of "window.scrollY". When I switch between maximizing and restoring the browser window, scrollY starts to always return 0, but only in the restored state and only when using the mouse wheel. If I use the vertical scrollbar, everything works as it should. The mouse wheel also works perfectly fine when the browser window is maximized, so I know the "onscroll" events are firing. In fact, I can confirm they are firing in the restored window state as well, but scrollY never changes from 0 until I scroll to about the middle of the page, at which point scrollY suddenly starts returning a value.

This happens in both the current version 72.0.2 and the latest developer edition.

Here is the JS code I have:

function scrollDetect() {
  console.log('Scroll event fired');
  console.log(window.scrollY);
  if (window.scrollY > 0) {
    console.log('Scrolled down');
  } else {
    console.log('At the top');
  }
}

window.onscroll = function () { scrollDetect(); };

Steps to reproduce:

  1. Open the console page in the developer tools to monitor the value of window.scrollY

  2. Maximize the browser window and scroll up and down a few times using the mouse wheel. Then scroll all the way to the top and stay at the top of the page.

  3. Hit the restore button on the browser (so the window is not maximized) and scroll up and down a few times using the mouse wheel. Then scroll all the way to the top and stay at the top of the page.

  4. Repeat step 2. It is important to stay at the top of the page.

  5. Hit the restore button again, and gently scroll down using the mouse wheel by one tick. Make sure you don't go too far down, so you don't hit the middle of the page. I would suggest making the body very long, something like 100rem, so you can clearly observe this bug (see the sketch after this list). You will notice that scrollY returns 0, even when scrolled down a little.

  6. Either manually scroll using the scrollbar or scroll down past the middle of the page using the mouse wheel and you will see scrollY starting to return values again.
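
For reference, here is a minimal sketch of the kind of setup I am describing (illustrative only, not my actual page; the 100rem minimum height is just to make sure there is enough content to scroll). It can be pasted into the devtools console on an otherwise blank page, after which the maximize/restore steps above apply:

// Sketch of a reproducer: make the page tall enough to scroll, then log
// window.scrollY on every scroll event so the "stuck at 0" behaviour is easy to spot.
document.body.style.minHeight = '100rem';
window.onscroll = function () {
  console.log('Scroll event fired, window.scrollY =', window.scrollY);
};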

Actual results:

scrollY seems to get stuck returning 0 when the mouse wheel is used after the window has been maximized and restored a few times and is left in the restored state, as long as scrolling does not go past the middle of the page.

Expected results:

scrollY should return a non-zero value as soon as the mouse wheel is used to scroll down.

I should have mentioned that this works perfectly fine in Chrome and the new Chromium-based Edge.

Component: Untriaged → DOM: Core & HTML
Product: Firefox → Core

Open the console page in the developer tools to monitor the value of window.scrollY

I checked it with the DOM pane of DevTools, but I cannot reproduce this bug with Nightly or Firefox 72.

Emilio, you have touched nsGfxScrollFrame.cpp a lot in the last several months; do you have any ideas?

Flags: needinfo?(emilio)

This is pretty hard to debug without a URL or a reproducer. Is there any chance you could attach a page to this bug (using the "attach new file" button), or share a URL where we could reproduce this?

I suspect what's happening is that the main thread scroll offset is not getting synced properly from the APZ thread, or something related to scroll-anchoring. Does the bug reproduce with layout.css.scroll-anchoring.enabled=false?
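
One way to check that (a sketch, assuming you are comfortable editing your profile; the same pref can also be toggled in about:config) is to add this line to the profile's user.js and restart the browser:

// Hypothetical user.js entry to disable CSS scroll anchoring for the profile.
user_pref("layout.css.scroll-anchoring.enabled", false);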

Flags: needinfo?(emilio) → needinfo?(d3_max_t)

Sure:
https://hummingmindidp.azurewebsites.net

The site is in the early stages of development, but you can observe this happening on that page. It's supposed to change the background color of the navigation bar when scrolled down. Just make sure that when you are in the restored state, the browser window is still big enough to stay within the same media query (let's say over 1200px wide).

Flags: needinfo?(d3_max_t)

I can reproduce the problem on the URL from comment #5 with Nightly 74.0a1 on Windows 10.
The problem seems to happen in both the normal and maximized sizemodes.

STR:

  1. Put the browser in normal sizemode
  2. Resize the browser to 1920x768px (on a 1920x1080 screen)
  3. Maximize the browser
  4. Open the url https://hummingmindidp.azurewebsites.net and wait for the page to load
  5. Restore the browser sizemode to normal
  6. Scroll down a little with the mouse wheel
  7. Observe the background color of the header

(If it does not reproduce:)

  8. Scroll to the top
  9. Maximize the browser
  10. Scroll down a little with the mouse wheel
  11. Observe the background color of the header

Actual Results:
The header background color does not change to black,
and the web console shows 0 when evaluating window.scrollY.

Expected Results:
The header background color should change to black.

Regression window:
https://hg.mozilla.org/integration/autoland/pushloghtml?fromchange=b62d582090e11edcdd01c2c707e0b07b33ad0b12&tochange=ecafdbe899a7b2e574922250e356f8610788ad75

Regressed by: 1423011
Has Regression Range: --- → yes

I can repro on Linux too, with and without webrender... Botond, any idea of what might be going on?

Status: UNCONFIRMED → NEW
Component: DOM: Core & HTML → Panning and Zooming
Ever confirmed: true
Flags: needinfo?(botond)

It looks like APZ's knowledge of the visual and layout viewports is getting out of sync for a brief period of time, with the layout viewport reflecting the older (maximized) size and the visual viewport reflecting the newer (unmaximized) size. Need to dig a bit further to understand how this comes to be.

I haven't verified this, but I suspect this code is at fault. I'd really like to get rid of it, but last time I tried it broke other things.

(In reply to Botond Ballo [:botond] [spotty responses until Feb 19] from comment #9)

I'd really like to get rid of it, but last time I tried it broke other things.

Specifically, these are the things that broke when I tried to get rid of this code.

Flags: needinfo?(botond)

At least the macOS crashtest that used to fail without that code is passing now: https://treeherder.mozilla.org/#/jobs?repo=try&tier=1&revision=a628ac932168c607ee63948c752d08c204a7c7f6

We can try removing that code again and see what happens.

Previously, we would wait until the following frame (for uncertain reasons
that date back to B2G), but this meant the layout and visual viewports would
be out of sync for a frame, causing APZ to misbehave.

Assignee: nobody → botond
Status: NEW → ASSIGNED
Priority: -- → P3
Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/9dc9deb3ba3b
Accept layout viewport updates from the main thread right away. r=tnikkel
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla74
Regressions: 1603237
Regressions: 1572900
No longer regressions: 1603237
Flags: needinfo?(botond)

:botond, will you get to this for 74?

Flags: needinfo?(botond)

I will try, but how soon we can re-land this fix will depend on how tricky it is to track down the intermittent regression it caused (bug 1572900).

Flags: needinfo?(botond)

Wontfix for 74 as we are nearing the end of the beta cycle.

So the intermittent failure tracked in bug 1572900 has had a low background occurrence rate on a variety of platforms since long before these patches landed, but, looking at the failures from the day these patches landed and were backed out, they seem to have caused a spike on "windows7-32-mingwclang" -- which is a Tier 2 platform -- in particular.

I did before-and-after try pushes, and indeed, reftests on windows7-32-mingwclang have a 65% failure rate with the patches applied, compared to 0% without them (out of 20 runs for each task).

Kats, do you think we should block the re-landing of these patches on figuring out what's going on with those mingw failures? On the one hand, they're Tier 2 failures and these patches fix some user-noticeable regressions. On the other hand, the high failure rate with these patches may indicate a potential cause for concern even on a Tier 2 platform.

Flags: needinfo?(kats)

I don't think a high failure rate on only mingw-clang should block landing these patches, no. If we can disable the test on mingw-clang, that would be good to save the sheriffs some headache, but regardless we should land these and continue investigating the failure. If there's a subtle bug in the patch, then maybe we'll get some additional user reports that will help us track down the problem (since debugging the test via try pushes will likely be somewhat painful).

Flags: needinfo?(kats)

I don't think the failure points to a single test we could disable. The harness itself seems to be timing out near the end. Shutdown hang perhaps?

Hm, ok. If you want me to investigate the failure let me know. I'm happy to offload stuff from you so you can focus more on desktop zooming.

Tom, IIRC you've worked on MinGW support. Do you have any theories for why a patch might be causing a reftest harness error, intermittently but very frequently (65% failure rate), only on the "windows7-32-mingwclang" platform?

Do you have any suggestions for how one might investigate such a failure? Is it straightforward to build this configuration locally on a Windows machine?

Flags: needinfo?(tom)

No theories. This build is a cross build; it builds on Linux. It's not trivial to build it locally, but doing so won't get you more than downloading the builds from Try and running them on a Windows machine. FWIW I would confirm the platform: is it only x86 and not x64? Is it both debug and opt or only one of them?

Does the failure reproduce outside the harness? If so, you can just debug with WinDbg. If it doesn't, then one would need to run the test and harness locally, which is not easy for a cross build. I've heard it's been done, but I've never succeeded myself (and filed Bug 1577101 for it). Some people have succeeded in debugging it using an interactive task on try, so that's another option.

This seems to be indicative of a general problem; we just happened to reproduce it reliably in a place that is annoying to reproduce locally. If you want to disable the MinGW test, I'm okay with that (just please file a new bug about investigating it).

Flags: needinfo?(tom)

(In reply to Tom Ritter [:tjr] (ni for response to sec-[approval|rating|advisories|cve]) from comment #27)

FWIW I would confirm the platform: is it only x86 and not x64?

It looks to be only x86 and not x64, yes.

Is it both debug and opt or only one of them?

I've only tested debug. I can't find a way to trigger opt reftests for mingw, not even in mach try fuzzy --full.

Does the failure reproduce outside the harness?

It's not a failure of a specific test, but a failure in the harness itself. I previously speculated it might be a shutdown hang, but looking at the logs more closely, it actually appears to be a startup hang, occurring during the startup of one of the many browser invocations performed in the reftest chunk. It's not even the same test directory every time; a different one fails in each run.

This means there isn't a single test (or even a single directory of tests) that we could disable. We'd have to disable all reftests on mingw, which we presumably don't want to do.

(In reply to Max from comment #5)

Sure:
https://hummingmindidp.azurewebsites.net

Max, this link doesn't resolve any more.

Could you provide another link which does (or, preferably, attach a local testcase to the bug)? Thanks!

Flags: needinfo?(d3_max_t)

I explored writing an alternative fix to this bug with the hope of avoiding the MinGW test failures. Unfortunately, a Try push shows that the alternative fix triggers the same failures as well.

Based on debugging on Try, it looks like for the affected browser invocations, the reftest webextension is not getting installed or initialized, even though at least some other webextensions are. No idea what might be happening there...

We don't even get as far as the point where the marionette client tries to install the reftest extension.

It looks like the browser process never gets as far as initializing the marionette server, and the client hangs when trying to connect to the server.

The browser process does not even get as far as gBrowserInit.onLoad().

It gets as far as creating a document viewer for the top-level chrome document and calling LoadStart() on it, but LoadComplete() (which is what would trigger firing the onload event and calling gBrowserInit.onLoad() as its handler) is never called on that document viewer.

It's not immediately clear to me where additional logging could be added to diagnose how far the loading process gets / why LoadComplete() never gets called. I'm open to suggestions for how to investigate this further.

There are some r+ patches which didn't land and no activity in this bug for 2 weeks.
:botond, could you have a look please?
For more information, please visit auto_nag documentation.

Flags: needinfo?(botond)

The patches are not in a state where they can land, as they cause high-frequency intermittents for which we do not have a fix yet.

Flags: needinfo?(botond)

The high-frequency intermittents (at least the ones in bug 1572900) seem to have gone away in the last day or two, probably due to changes in the workers. It's probably a good time to rebase these patches and do a try push and reland.

Flags: needinfo?(botond)

Huh, weird. But good news for this bug, I guess!

Flags: needinfo?(botond)

Unfortunately, I'm still seeing a high volume of the same failure as before: https://treeherder.mozilla.org/#/jobs?repo=try&revision=956bd07738080bedbb2970079313f4aeed73fa8a

Hm, OK, I'll continue investigating using the Win7 MinGW platform. Hopefully it's the same problem I was chasing down on win10-aarch64.

Thinking about this more - even if I do track down all the different hang locations, it's unlikely that they will be fixed soon since the root cause is not obvious and may be inside mingw.

Tom, can you elaborate on these mingw builds - are they builds that we ship to users, or do we just keep them to ensure a mingw-based build doesn't break? In the latter case, do we need to keep running win 7 reftests on these builds? I feel like an increase in intermittent failures in a tier-2 test suite that we're not using for anything in particular shouldn't prevent an obviously-unrelated patch from landing.

Flags: needinfo?(tom)

I assumed they were used for Tor Browser, but I have nothing to back that up.

(In reply to Kartikaya Gupta (email:kats@mozilla.com) from comment #42)

Tom, can you elaborate on these mingw builds - are they builds that we ship to users,

No.

or do we just keep them to ensure a mingw-based build doesn't break?

Yes.

In the latter case, do we need to keep running win 7 reftests on these builds?

No.

I feel like an increase in intermittent failures in a tier-2 test suite that we're not using for anything in particular shouldn't prevent an obviously-unrelated patch from landing.

You are correct. Feel free to disable the test(s) you need to, and file a bug against Bug 1475994 for re-enabling them, with details and a link back here. You can also make the hang bugs children of the new bug you file.

Flags: needinfo?(tom)

Bug 1642720 is on autoland so this should be good to re-land now.

Pushed by bballo@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/5cab400444e9
Accept layout viewport updates from the main thread right away. r=tnikkel
https://hg.mozilla.org/integration/autoland/rev/5a874f1887ff
Adjust WR test expectations. r=gw
Status: REOPENED → RESOLVED
Closed: 4 years ago4 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla79
Regressions: 1557133

Seeing as this patch has caused a number of other regressions/spikes, should we consider backing it out until we have a better understanding of what's going on?

== Change summary for alert #26127 (as of Thu, 04 Jun 2020 10:45:24 GMT) ==

Regressions:

39% raptor-tp6-google-firefox-cold replayed windows10-64-shippable opt 812.71 -> 497.75
37% raptor-tp6-google-firefox-cold replayed windows7-32-shippable opt 740.00 -> 466.17
28% raptor-tp6-google-firefox-cold replayed linux64-shippable opt 850.29 -> 614.75

Improvements:

64% raptor-tp6-google-firefox-cold loadtime windows7-32-shippable opt 1,101.83 -> 397.08
58% raptor-tp6-google-firefox-cold loadtime windows10-64-shippable opt 1,112.67 -> 464.50
23% raptor-tp6-google-firefox-cold windows7-32-shippable opt 385.83 -> 298.67
21% raptor-tp6-google-firefox-cold windows10-64-shippable opt 384.43 -> 304.04

For up to date results, see: https://treeherder.mozilla.org/perf.html#/alerts?id=26127

Whiteboard: perf, perf-alert

As the above alert is reporting improvements while also regressing the "replayed" metric, should we investigate this further?

Flags: needinfo?(tarek)
Flags: needinfo?(gmierz2)

I think yes, because we are seeing a simultaneous improvement in loadtime and a regression in the replayed value; the improvements are likely false positives.

Flags: needinfo?(gmierz2)
Flags: needinfo?(tarek)
No longer regressions: 1557133
Keywords: perf, perf-alert
Whiteboard: perf, perf-alert
Flags: needinfo?(fstrugariu)
Regressions: 1650398
No longer regressions: 1572900

Regression bug created; removing the ni?.

Flags: needinfo?(fstrugariu)

(Clearing out some old needinfos)

Flags: needinfo?(d3_max_t)