Closed Bug 1157784 Opened 5 years ago Closed 5 years ago

Firefox + Thunderbird stops painting during tab switch when switching windows at the same time as WM_SETTEXT

Categories

(Core :: Graphics, defect, major)

33 Branch
Unspecified
Windows 7
defect
Not set
major

Tracking

()

VERIFIED FIXED
mozilla41
Tracking Status
firefox38 + wontfix
firefox38.0.5 --- verified
firefox39 --- verified
firefox40 --- verified
firefox41 --- verified
firefox-esr31 --- unaffected
firefox-esr38 39+ verified

People

(Reporter: jrmuizel, Assigned: jrmuizel)

References

Details

(Keywords: regression)

Attachments

(1 file, 4 obsolete files)

STR:

1. Open the build (firefox.exe -console) with a clean profile.
2. Go to https://commons.wikimedia.org/wiki/Category:Gigapixel_images_from_the_Google_Art_Project.
3. Middle click on the first 6 pictures to open them in new tabs (pictures will open on pages with a much smaller preview, so NOT at full size).
4. Switch to the console window
5. Click one of the undisplayed tabs
6. If the paints don't stop, switch back to the the console window
7. Click one of the other undisplayed tabs
8. Repeat until the browser stops painting
Summary: Firefox stops painting during tab switch when switching windows at the same time on NVIDIA hardware → Firefox stops painting during tab switch when switching windows at the same time on NVIDIA hardware OMTC
Florin, these steps work better for me than yours in bug 1067470. Can you reproduce with these steps and if so can you find a regression window? So far it looks like the window might be between FF30 and FF32 nightlies. Make sure you keep the OMTC pref on. mozregression now supports settings prefs so you should be able to use that.
Flags: needinfo?(florin.mezei)
I can much easily reproduce the issue with your steps Jeff, so I worked today to get you a regression window. The bad news is that the issue was much harder to reproduce when I reached builds from about a year ago, but I tested for a good 10-15 minutes on the last good build and could not reproduce the issue, so I hope the window below is good.

Last good revision: 5b6e82e7bbbf - 2014-04-15
First bad revision: dd50745d7f35 - 2014-04-16
Pushlog: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=5b6e82e7bbbf&tochange=dd50745d7f35 

There are some fixes in this log, but I think one suspect would be bug 974197.

Tested with: mozregression --arg="-console" --bits 32 --good 2014-03-01 --bad 2014-04-28 --pref "layers.offmainthreadcomposition.enabled:true". Checking about:config in good builds confirmed that the omtc pref was set to true as expected.
Flags: needinfo?(florin.mezei)
I tested with mozregression and I was able to reproduce the error with the 2014-04-16 build, but not with earlier builds. So, your regression window might be be correct.

Reproducing the bug was rather hard, so the "last good" revision is probably not quite 100% certain, yet :-(

Windows 7 x64
NVIDIA GeForce GT 630
So I was able to investigate this further. It looks like our Presents are not showing up to the DWM. It calls D3DKMTGetPresentHistory and we call D3DKMTPresent but our present events don't show up for some reason.
Taking out the kernel debugger we can see that DXGCONTEXT::Present is not calling DXGCONTEXT::SubmitPresent when things are in a broken state.

Next task is to figure out why.
It looks SubmitPresent is not being called because bit 2 the second field of DXGPRESENT is set.
This happens when DXGPRESENT::CheckVisRgn gets an empty region instead of one that is the size of the window
The region is RNG5 coming from GetRandomRgn
RNG5 is the DxRegion and it gets changed around quite a bit. I wonder if its mutation is racing with Present
There's a lot of indirection from Firefox to DX HRGN so it's difficult to track down what's changing the region.
It looks like's not actually the DX region. It's the visible region without the window offset applied.
This looks to be caused by the code here:
https://hg.mozilla.org/mozilla-central/file/42db79f3cd6b/widget/windows/nsWindow.cpp#l4689
If Present happens to run at the same time that we set the window to not visible the bad things happen.
Tracking as Jeff has a lead and we may be able to take this as a ride along in a 38 point release.
Blocks: 1067470
This should fix the problem. It's a bit ugly and I believe has the problem of it being able to postpone WM_SETTEXT indefinitely, but it should work. I've put a build of FF36 that has this patch at:
http://people.mozilla.org/~jmuizelaar/firefox-36.0-tabswitch-fixed.en-US.win32.zip

I'd encourage people to try it out.
Attachment #8605055 - Flags: feedback?(jmathies)
Comment on attachment 8605055 [details] [diff] [review]
Avoid setting the window to invisible while we're presenting

We already have sync calls to the compositor and we presumeably haven't run into any deadlock situations from that so a simple mutex should be sufficient instead of this complicated patch.
Attachment #8605055 - Flags: feedback?(jmathies)
Take a mutex during composite rendering to avoid doing our WM_SETTEXT visibility hack at the same time.
Attachment #8605055 - Attachment is obsolete: true
Attachment #8605355 - Flags: review?(bas)
Blocking the main thread on composition? Won't this introduce a lot of random jank, and worst case basically defeat the purpose for OMTC?
(In reply to Jim Mathies [:jimm] from comment #20)
> Blocking the main thread on composition? Won't this introduce a lot of
> random jank, and worst case basically defeat the purpose for OMTC?

The purpose of OMTC is to avoid blocking the compositor thread on the main thread not the opposite. This only blocks the compositor for the very short time of calling the handler for WM_SETTEXT. The main thread can only be blocked for the duration of a composite. This is something we already have with other synchronous calls to the compositor.
Comment on attachment 8605355 [details] [diff] [review]
The simpler solution

Review of attachment 8605355 [details] [diff] [review]:
-----------------------------------------------------------------

Looking over PCompositor.ipdl I see we do do sync calls here, so I guess blocking the UI thread is considered ok.

::: widget/windows/nsWindow.h
@@ +585,5 @@
>  
>    static bool sNeedsToInitMouseWheelSettings;
>    static void InitMouseWheelScrollData();
> +
> +  CRITICAL_SECTION mPresentLock;

We have a nice mozilla named Mutex object you can use here that works with the RAII helper MutexAutoLock.
Attachment #8605355 - Flags: feedback+
Summary: Firefox stops painting during tab switch when switching windows at the same time on NVIDIA hardware OMTC → Firefox stops painting during tab switch when switching windows at the same time as WM_SETTEXT
Does it need a DeleteCriticalSection?
(In reply to David Major [:dmajor] from comment #24)
> Does it need a DeleteCriticalSection?

And that's why Mutex is better :)
I landed a fix.
Comment on attachment 8605355 [details] [diff] [review]
The simpler solution

Review of attachment 8605355 [details] [diff] [review]:
-----------------------------------------------------------------

::: widget/windows/nsWindow.cpp
@@ +411,5 @@
>    sInstanceCount++;
>  }
>  
>  nsWindow::~nsWindow()
>  {

::DeleteCriticalSection
Attachment #8605355 - Flags: review?(bas) → review+
Jeff, you are truly a god among men. For me, both test builds appear to do the trick. Incredible work on a problem those of us who've been experiencing it thought was swept under the rug for good.

How quickly can we get this out to the release channel?
(In reply to blackwind from comment #29)
> Jeff, you are truly a god among men. For me, both test builds appear to do
> the trick. Incredible work on a problem those of us who've been experiencing
> it thought was swept under the rug for good.
> 
> How quickly can we get this out to the release channel?

This was too late for 38.0.1. If we do a 38.0.2 it will likely be in it.
Comment on attachment 8605355 [details] [diff] [review]
The simpler solution

Approval Request Comment
[Feature/regressing bug #]: OMTC
[User impact if declined]: painting stops/painting artifacts
[Describe test coverage new/current, TreeHerder]: Limited
[Risks and why]: Possible, but unlikely deadlocks. Perhaps some additional jank
Attachment #8605355 - Flags: approval-mozilla-beta?
Attachment #8605355 - Flags: approval-mozilla-aurora?
https://hg.mozilla.org/mozilla-central/rev/82ddfd3ddce0
https://hg.mozilla.org/mozilla-central/rev/83152f78784a
Assignee: nobody → jmuizelaar
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla41
I was able to easily reproduce the issue with an older Firefox Nightly from 2015-02-19, on Windows 7 x64. I could no longer reproduce it on the latest Firefox 41 Nightly (BuildID=20150515030202) on the same machine. I opened Firefox from the console several time then repeatedly clicked on tabs after having switched to the console.
Status: RESOLVED → VERIFIED
Flags: needinfo?(mayankleoboy1)
Attachment #8606353 - Attachment is obsolete: true
Attachment #8606636 - Attachment is obsolete: true
Attachment #8606639 - Attachment is obsolete: true
Is it possible this bug caused bug 1165686  ?
Comment on attachment 8605355 [details] [diff] [review]
The simpler solution

Taking it now in 39 & 40 to get as much coverage as possible.
We should take it in 38.0.5.
Attachment #8605355 - Flags: approval-mozilla-release+
Attachment #8605355 - Flags: approval-mozilla-beta?
Attachment #8605355 - Flags: approval-mozilla-beta+
Attachment #8605355 - Flags: approval-mozilla-aurora?
Attachment #8605355 - Flags: approval-mozilla-aurora+
Reproduced the initial issue using a old Nightly build (2015-02-25), verified that the issue is fixed using Firefox 38.0.5 beta 3 on Windows 7 64-bit.
Will this automatically get into esr?
Flags: needinfo?(ryanvm)
esr38 has already branched. You'll need to request approval at this point.
Flags: needinfo?(ryanvm)
Comment on attachment 8605355 [details] [diff] [review]
The simpler solution

[Approval Request Comment]
If this is not a sec:{high,crit} bug, please state case for ESR consideration:
User impact if declined: Firefox will stop painting sometimes
Fix Landed on Version: 38
Risk to taking this patch (and alternatives if risky): Should be pretty low, however we haven't shipped it widely yet.
Attachment #8605355 - Flags: approval-mozilla-esr38?
Comment on attachment 8605355 [details] [diff] [review]
The simpler solution

Taking it to esr 38 too.
Attachment #8605355 - Flags: approval-mozilla-esr38? → approval-mozilla-esr38+
Duplicate of this bug: 1172005
This fix isn't in 38.0.5 yet, if it is it isn't working because tabs/the whole window still aren't repainted from time to time.
(In reply to someone from comment #57)
> This fix isn't in 38.0.5 yet, if it is it isn't working because tabs/the
> whole window still aren't repainted from time to time.

This is in 38.0.5, if you're still experiencing an issue please file a new bug.
Someone in Bug 1067470 (start from comment 295) mentioned that in Thunderbird 38.0.1 he/she encounters very similar painting problem. It sounds like the exactly same cause. Does this fix need to port there?

Thanks.
(In reply to Benjamin Peng from comment #59)
> Someone in Bug 1067470 (start from comment 295) mentioned that in
> Thunderbird 38.0.1 he/she encounters very similar painting problem. It
> sounds like the exactly same cause. Does this fix need to port there?
> 
> Thanks.

The patch landed on esr38, so it automatically got picked up by Thunderbird 38.1.0.
Confirmed as fixed according to bug 1067470 comment 303.


This wasn't set to block bug 974197.  Was it determined after comment 2/comment 3 that bug 974197 is not the cause?
Flags: needinfo?(jmuizelaar)
Summary: Firefox stops painting during tab switch when switching windows at the same time as WM_SETTEXT → Firefox + Thunderbird stops painting during tab switch when switching windows at the same time as WM_SETTEXT
(In reply to Wayne Mery (:wsmwk, use Needinfo for questions) from comment #60)
> 
> This wasn't set to block bug 974197.  Was it determined after comment
> 2/comment 3 that bug 974197 is not the cause?

Correct. Bug 974197 is not the cause.
Flags: needinfo?(jmuizelaar)

Spotted this again on Linux Mint..

Firefox 70.0 (64-bit)

Reproduces after a few hours of activity.

Desktop manager: i3wm

You need to log in before you can comment on or make changes to this bug.