Closed Bug 1392235 Opened 7 years ago Closed 7 years ago

Browser completely freezes when opening our site since updating to version 55

Categories

(Core :: SVG, defect)

55 Branch
Unspecified
All
defect
Not set
normal

Tracking

()

RESOLVED FIXED
mozilla57
Tracking Status
firefox-esr52 --- unaffected
firefox55 --- wontfix
firefox56 --- wontfix
firefox57 --- fixed

People

(Reporter: swav, Assigned: jwatt)

References

Details

(Keywords: hang, regression)

Attachments

(3 files, 1 obsolete file)

User Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10.9; rv:55.0) Gecko/20100101 Firefox/55.0
Build ID: 20170814073321

Steps to reproduce:

Log into website we're developing, after few seconds browser will lock up completely requiring process kill.


Actual results:

Few seconds after logging in process stops responding and has to be killed from task manager. Tested on OSX 10.9.5, 10.12.6, and Windows 10 and 100% reproducible.
Same website worked fine on version 54 just before update. I have tried disabling all extensions as well but no change. There are no console logs or request listed in development tools.


Expected results:

Not crash
can you provide a standalone testcase?

if not, can you use mozregression tool to figure out when the issue started happening?
http://mozilla.github.io/mozregression/
Flags: needinfo?(swav)
Keywords: hang
I have tried to run mozregression tool on My MacBook Pro, but it just won't start. There is issue with some python dependcies - it complains that "ERROR: No module named PyQt4.uic" despite module being installed. When tried on my colleague's Windows 10 machine mozregression itself crashes when browser hangs. I will try to get it working again tomorrow on another machine.

Standalone test case would be a problem, but willing to send credentials for one of our test accounts via private message. To reproduce you usually just need to log in and wait a few seconds.
Flags: needinfo?(swav)
Hi Swav,

To use Mozregression tool on MAC OS, you will need to install and use Mozregression through a virtual environment. Here are the steps you need to follow in order to do that:

1. First, you will need to open a Terminal window and enter the following command "sudo su" (without quotes). You will be prompted to insert the PC user password. After entering them press the "Enter" key.
2. After credentials are confirmed it's time to install the Python 2.7 component by entering the next command "sudo easy_install pip".
3. Next you will need to install the virtual environment using the next command "sudo pip install virtualenv".
4. After the virtual environment was installed, you will need to provide a directory name for the virtual environment using "virtualenv <folder_name>" command (eg: "virtualenv mozregression" will create the environment folder under your current profile).
5. Next you will need to activate the environment from a file that was created in the environment folder using "source /Users/<profile_name>/<folder_name>/bin/activate" command.
6. You will notice that your command line is now changed into "(folder_name) <code>". Now you will need to install the mozregression tool into the virtual environment using "pip install mozregression" command.
7. After the installation is completed, you can check if everything is ok using a simple command like "mozregression -h" to see if the tool help is displayed (if not you did something wrong along the way).


You can deactivate the virtual environment after you complete the regression using ""deactivate" command that will return you to the normal command line. In the future if you need to use the tool again, please keep in mind that the environment must be activated each time.

If you succeed, please provide us the regression window.

Thank you,
Marius
Flags: needinfo?(swav)
Hi Marius

The problem is the order, I have intalled the mozregression tool following guide, but then went on installing virtualenv. The crucial piece is installing it inside virtual environment.

The result of bisection:

20:27.03 INFO: Narrowed inbound regression window from [2347307a, b9415932] (3 builds) to [59efeb9e, b9415932] (2 builds) (~1 steps left)
20:27.03 INFO: No more inbound revisions, bisection finished.
20:27.03 INFO: Last good revision: 59efeb9e45a38e39ae6c30c094f04e8cc62f73fb
20:27.03 INFO: First bad revision: b94159328bef519c43f28c9971975ae3a11eff8a
20:27.03 INFO: Pushlog:
https://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=59efeb9e45a38e39ae6c30c094f04e8cc62f73fb&tochange=b94159328bef519c43f28c9971975ae3a11eff8a
Flags: needinfo?(swav)
Jonathan, can you please take a look at this?
Status: UNCONFIRMED → NEW
Component: Untriaged → SVG
Ever confirmed: true
Flags: needinfo?(jwatt)
Keywords: regression
OS: Unspecified → All
swav, please email me the login details.
Flags: needinfo?(jwatt) → needinfo?(swav)
Hi

We'll prepare a demo server and send credentials when ready.

Best
Swav
Flags: needinfo?(swav)
Swav, were you able to set up a demo for investigation?
Flags: needinfo?(swav)
Hi

I've emailed details directly to Jonathan Watt few days back. Do you need them as well?

Best
Flags: needinfo?(swav)
Nah, just wanted to make sure y'all were in contact! Thanks.
Jonathan, did you get a chance to take a look yet?
Flags: needinfo?(jwatt)
I have not, yet. Thanks for the needinfo.
Flags: needinfo?(jwatt)
swav: the basic problem here is that you have a mask that contains an element that is masked by that mask. In other words, a mask reference loop.
Assignee: nobody → jwatt
Attachment #8905092 - Flags: review?(longsonr)
Attachment #8905092 - Flags: review?(longsonr) → review+
Attachment #8905267 - Flags: review?(longsonr)
Comment on attachment 8905267 [details] [diff] [review]
part 2 - Add a reftest for SVG mask reference loops

you really need to put in more than the checkin comment.
Attachment #8905267 - Flags: review?(longsonr) → review-
Forgot to qref.
Attachment #8905267 - Attachment is obsolete: true
Attachment #8905268 - Flags: review?(longsonr)
Comment on attachment 8905268 [details] [diff] [review]
part 3 - Add a reftest for SVG mask reference loops

we usually just make these crashtests don't we but I guess this is OK too.
Attachment #8905268 - Flags: review?(longsonr) → review+
Pushed by jwatt@jwatt.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/f574c304f748
part 1 - Fix AutoReferenceChainGuard to flag the guarded frame as in-use. r=longsonr
https://hg.mozilla.org/integration/mozilla-inbound/rev/af8cb3153052
part 2 - Add a reftest for SVG mask reference loops. r=longsonr
(In reply to Robert Longson from comment #18)
> we usually just make these crashtests don't we but I guess this is OK too.

If they're not going to hang, they're going to render one way or another, and it seems like we should be consistent about it. For rendering related hangs/crashes I've avoided crashtests for years for that reason.
Backed out for asserting during wpt-reftest /css/css-masking-1/clip-path-svg-content/clip-path-recursion-002.svg on Linux:

https://hg.mozilla.org/integration/mozilla-inbound/rev/6e4988e39120b0a1f7a7eeeb872c4781f9d5d9ee
https://hg.mozilla.org/integration/mozilla-inbound/rev/4fbe6972d2974789acaeb97369a7becc83cba4db

Push which ran failing tests: https://treeherder.mozilla.org/#/jobs?repo=mozilla-inbound&revision=281a5532ea19ff3c90e3899031059b258104d6db&filter-resultStatus=exception&filter-resultStatus=usercancel&filter-resultStatus=runnable&filter-resultStatus=retry&filter-resultStatus=testfailed&filter-resultStatus=busted
Failure log: https://treeherder.mozilla.org/logviewer.html#?job_id=129263920&repo=mozilla-inbound

Assertion failure: *aChainCounter == noChain || (*aChainCounter >= 0 && *aChainCounter < aMaxChainLength), at /builds/worker/workspace/build/src/layout/svg/AutoReferenceChainGuard.h:92
TEST-UNEXPECTED-ERROR | /css/css-masking-1/clip-path-svg-content/clip-path-recursion-002.svg | screenshot@chrome://marionette/content/reftest.js:386:1
Flags: needinfo?(jwatt)
Initially I didn't understand how the new mask loop test could pass when the WPF clipPath would fail. The reason is because in the clipPath code flow we end up with a second active AutoReferenceChainGuard with a different aMaxChainLength. I.e. under a different function (a nsSVGClipPathFrame::IsValid call under a nsSVGClipPathFrame::PaintClipMask call). That results in us breaking the loop based on aFrameInUse, but after we've already set mMaxChainLength in the constructor, which we then fail to reset in the destructor because we broke the loop.
Flags: needinfo?(jwatt)
Attachment #8905268 - Attachment description: part 2 - Add a reftest for SVG mask reference loops → part 3 - Add a reftest for SVG mask reference loops
To be consistent, I've moved all state setting logic to AutoReferenceChainGuard::Reference, all the reset logic to ~AutoReferenceChainGuard, and now we never set/unset logic if we break a reference loop/chain.
Attachment #8906725 - Flags: review?(longsonr)
Attachment #8906725 - Flags: review?(longsonr) → review+
Pushed by jwatt@jwatt.org:
https://hg.mozilla.org/integration/mozilla-inbound/rev/beb2e1710ae4
part 1 - Fix AutoReferenceChainGuard::Reference to flag the guarded frame as in-use. r=longsonr
https://hg.mozilla.org/integration/mozilla-inbound/rev/0cbc588a5934
part 2 - Fix AutoReferenceChainGuard to never set/unset state when it breaks a reference chain. r=longsonr
https://hg.mozilla.org/integration/mozilla-inbound/rev/ea320a011ac9
part 3 - Add a reftest for SVG mask reference loops. r=longsonr
Jonathan: could you tell me which file had a reference loop?
Please request approval for an uplift to beta if you feel the change is not too risky. Next week we ship the RC build.
Flags: needinfo?(jwatt)
Comment on attachment 8905092 [details] [diff] [review]
part 1 - Fix AutoReferenceChainGuard to flag the guarded frame as in-use

Approval Request Comment
[Feature/Bug causing the regression]: bug 1349477
[User impact if declined]: SVG content with reference loops can hang process
[Is this code covered by automated tests?]: yes
[Has the fix been verified in Nightly?]: yes (by CI test and manually on reported site)
[Needs manual test from QE? If yes, steps to reproduce]: no (done by me)
[List of other uplifts needed for the feature/fix]: none
[Is the change risky?]: low risk
[Why is the change risky/not risky?]: code breaks reference loop
[String changes made/needed]: none
Flags: needinfo?(jwatt)
Attachment #8905092 - Flags: approval-mozilla-beta?
Comment on attachment 8905268 [details] [diff] [review]
part 3 - Add a reftest for SVG mask reference loops

Approval Request Comment
See comment 28.
Attachment #8905268 - Flags: approval-mozilla-beta?
Comment on attachment 8906725 [details] [diff] [review]
part 2 - Fix AutoReferenceChainGuard to never set/unset state when it breaks a reference chain

Approval Request Comment
See comment 28.
Attachment #8906725 - Flags: approval-mozilla-beta?
(In reply to swav from comment #25)
> Jonathan: could you tell me which file had a reference loop?

That's not easy to tell. I'd need to create a custom build with some debug code added, which I don't have time to do right now. needinfo'ing myself to do that later.
Flags: needinfo?(jwatt)
This fix hasn't had much time on nightly for us to catch potential problems. 
Is there any way the user could work around the issue on their site, for now, and we can let this fix ride to release on 57?
Isn't this not even a production web site (it's mentioned as being under development)  I think this can wait till 57...
Hi

The system is too large to set up simple test case so we've set up a separate server for testing so that you don't have to worry about impacting anybody. The system has been in production for 6+ years and worked fine until version 55 was released. Due to nature of crash we can't even determine which asset is causing issue. When would version 57 potentially be released? 

Best
swav: It's the 'mask' with ID "eventsMaskOn" that is causing the reference loop.

You have this style rule in the document:

.icon.events rect {
  mask: url(#eventsMaskOff);
}

That rule applies the mask to all 'rect' descendants of any elements with class="icon events". That would be fine, except that the mask is a descendant of a 'div' with class="icon events", and the mask contains 'rect' elements. As a result the 'mask' is applied to its own descendant 'rect's causing the reference loop.
Flags: needinfo?(jwatt)
Liz: The reporter can fix their site using the information in comment 35. The concern would be that other sites out there have similar issues causing Firefox users to experience hangs. I think the patch is low risk, but since we only appear to have this one hang report since Firefox 55 was released I expect the number of sites causing this type of hang is also low. So although I'd like to eliminate the potential for this hang as soon as possible, I don't feel strongly that we need to uplift it to 56 given that nowadays we don't have Aurora.
Flags: needinfo?(lhenry)
(In reply to Jonathan Watt [:jwatt] from comment #35)
> .icon.events rect {
>   mask: url(#eventsMaskOff);
> }

Actually it would be the following rule that causes the problem:

.main-menu-item:hover .icon.events rect {
  mask: url(#eventsMaskOn);
}

It looks like you have a similar problem with eventsMaskOff though.
OK. Let's let this fix ride with 57. This just comes a little late in the beta cycle for uplift.
Flags: needinfo?(lhenry)
Attachment #8905092 - Flags: approval-mozilla-beta?
Attachment #8905268 - Flags: approval-mozilla-beta?
Attachment #8906725 - Flags: approval-mozilla-beta?
Do you still need the test server? I would like to destroy it if not needed.
No, we don't need it, although it would be nice if you confirmed that this was now fixed in Firefox nightly. https://www.mozilla.org/en-GB/firefox/channel/desktop/
You need to log in before you can comment on or make changes to this bug.