Closed Bug 1285333 Opened 8 years ago Closed 2 years ago

Crash in atidxx32.dll | atiuxpag.dll | CContext::EmptyOutAllDDIBindPoints

Categories

(Core :: Audio/Video: Playback, defect, P3)

Tracking

RESOLVED WORKSFORME
Tracking Status
firefox47 --- wontfix
firefox48 --- wontfix
firefox49 - wontfix
firefox-esr45 --- wontfix
firefox50 --- fix-optional
firefox51 --- fix-optional
firefox52 --- wontfix

People

(Reporter: jrmuizel, Unassigned)

References

Details

(Keywords: crash, stale-bug)

Crash Data

Attachments

(1 file)

This is one of the more common AMD crashes. It looks to happen during DXGITextureHostD3D11 shutdown.
Crash Signature: [@ atidxx32.dll | atiuxpag.dll | CContext::EmptyOutAllDDIBindPoints ]
started investigation
Crash volume for signature 'atidxx32.dll | atiuxpag.dll | CContext::EmptyOutAllDDIBindPoints':
 - nightly (version 50): 5 crashes from 2016-06-06.
 - aurora  (version 49): 51 crashes from 2016-06-07.
 - beta    (version 48): 553 crashes from 2016-06-06.
 - release (version 47): 3464 crashes from 2016-05-31.
 - esr     (version 45): 32 crashes from 2016-04-07.

Crash volume on the last weeks:
            W. N-1  W. N-2  W. N-3  W. N-4  W. N-5  W. N-6  W. N-7
 - nightly       1       0       0       3       1       0       0
 - aurora       10       4       6       9      12      10       0
 - beta         70      92      81     106     129      73       0
 - release     479     529     548     657     748     487       0
 - esr           9       5       8       6       3       1       0

Affected platform: Windows
Crash volume for signature 'atidxx32.dll | atiuxpag.dll | CContext::EmptyOutAllDDIBindPoints':
 - nightly (version 51): 2 crashes from 2016-08-01.
 - aurora  (version 50): 8 crashes from 2016-08-01.
 - beta    (version 49): 78 crashes from 2016-08-02.
 - release (version 48): 410 crashes from 2016-07-25.
 - esr     (version 45): 51 crashes from 2016-05-02.

Crash volume on the last weeks (Week N is from 08-22 to 08-28):
            W. N-1  W. N-2  W. N-3
 - nightly       0       1       0
 - aurora        3       3       1
 - beta         19      26      17
 - release     101     132      99
 - esr           5       8       4

Affected platform: Windows

Crash rank on the last 7 days:
           Browser     Content   Plugin
 - nightly #603
 - aurora  #301
 - beta    #640
 - release #186
 - esr     #988
Can we blacklist this crash away?
Flags: needinfo?(gsquelart)
(In reply to Chris Pearce (:cpearce) from comment #4)
> Can we blacklist this crash away?

Looking at the aggregation by driver versions, crashes seem to be distributed across many versions (e.g., the top two versions only account for ~6% each), so I'm afraid blacklisting might not really help in this case.
Flags: needinfo?(gsquelart)
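As a rough illustration of the reasoning in the reply above, the sketch below computes what share of crashes each driver version accounts for and how much a blocklist of the top N versions would cover. The version strings and counts are made-up placeholder data, not figures from crash-stats, and `version_shares` and `blocklist_coverage` are hypothetical helpers.

```python
from collections import Counter


def version_shares(driver_versions):
    """Return (version, share) pairs sorted by descending share."""
    counts = Counter(driver_versions)
    total = sum(counts.values())
    return [(version, count / total) for version, count in counts.most_common()]


def blocklist_coverage(driver_versions, top_n):
    """Fraction of crashes that blocking the top_n versions would cover."""
    return sum(share for _, share in version_shares(driver_versions)[:top_n])


# Placeholder data: 100 crash reports spread over many driver versions,
# with the two most common versions at 6% each.
reports = ["v1"] * 6 + ["v2"] * 6 + [f"other-{i}" for i in range(88)]

coverage = blocklist_coverage(reports, top_n=2)
print(f"Top 2 versions cover {coverage:.0%} of crashes")
```

With a distribution this flat, blocking the two most common versions would leave the large majority of crashes untouched, which is the concern raised above.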
Do you have any meaningful full crash dumps? They would help root-cause the problem, as ad-hoc attempts to reproduce it have not been successful.
Assignee: nobody → cpearce
This also doesn't appear to be vendor specific - except that this signature is, of course.  Bug 1292311 records the Nvidia hangs/crashes and a crash like https://crash-stats.mozilla.com/report/index/6b94f054-1d37-4478-ac44-3bf702160926 is the Intel one.

Both this one and the Nvidia one started on June 22nd, which I think means: https://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=d224fc999cb6&tochange=2e3390571fdb which was a busy day with graphics, media, sandboxing, and compositor process changes, so there are a lot of places to blame. That assumes the June 22nd part is correct.

Anthony, can you confirm that the graphs in this bug and bug 1292311 have cliffs on June 22nd, and that it isn't just a graphing artifact of some sort?
Flags: needinfo?(anthony.s.hughes)
David, anything in the compositor process side that could have caused this hang?
Flags: needinfo?(dvander)
(In reply to Milan Sreckovic [:milan] from comment #7)
> Anthony, can you confirm that the graphs in this bug and bug 1292311 have
> cliffs on June 22nd, and that it isn't just a graphing artifact of some sort?

Both appear to start on June 22nd based on the data.
Flags: needinfo?(anthony.s.hughes)
[Tracking Requested - why for this release]: Bug 1292311 is being tracked, and it appears to be the Nvidia version of this one, so it seems to make sense to track this one as well.  We may end up duplicating comments between these two bugs, so I suggest people read both.
This looks like a shutdown hang? It seems unlikely anything compositor process related could cause this. (And it's yet another failure mode the compositor process would address.) The bug in comment #7 is shutdown related though and D3D11 can be weird about shutdown order (i.e. releasing devices after releasing textures).

For example, the original stack in bug 1292311 was fixed by bug 1296749. But it looks like newer reports in that bug match the crash report in comment #7.
Flags: needinfo?(dvander)
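The shutdown-order concern in the comment above can be sketched as a toy model. This is not D3D11 code: `Resource` is a hypothetical stand-in for a GPU object, and the sketch assumes the safe order is to release textures before the device that created them. It uses Python's `contextlib.ExitStack`, which runs registered callbacks in reverse order of registration.

```python
from contextlib import ExitStack

released = []  # records the order in which resources are torn down


class Resource:
    """Hypothetical stand-in for a GPU object; not a real D3D11 binding."""

    def __init__(self, name):
        self.name = name

    def release(self):
        released.append(self.name)


with ExitStack() as stack:
    # Register the device first so that it is released last.
    device = Resource("device")
    stack.callback(device.release)
    # Textures are registered after the device and are therefore released before it.
    for i in range(2):
        texture = Resource(f"texture-{i}")
        stack.callback(texture.release)

print(released)
```

Tying teardown to acquisition order in this way avoids relying on each call site to remember which object must outlive which.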
Thanks - I requested an uplift of bug 1296749, let's see what that does.
Milan, it looks like bug 1285152 is another crash introduced in the June 22nd regression window cited in comment 7.
Flags: needinfo?(milan)
Keywords: crash
Crash volume for signature 'atidxx32.dll | atiuxpag.dll | CContext::EmptyOutAllDDIBindPoints':
 - nightly (version 52): 1 crash from 2016-09-19.
 - aurora  (version 51): 0 crashes from 2016-09-19.
 - beta    (version 50): 13 crashes from 2016-09-20.
 - release (version 49): 194 crashes from 2016-09-05.
 - esr     (version 45): 80 crashes from 2016-06-01.

Crash volume on the last weeks (Week N is from 10-03 to 10-09):
            W. N-1  W. N-2
 - nightly       1       0
 - aurora        0       0
 - beta         11       2
 - release     163      31
 - esr           7      11

Affected platform: Windows

Crash rank on the last 7 days:
           Browser     Content   Plugin
 - nightly #800
 - aurora
 - beta    #1176
 - release #314
 - esr     #1020
(In reply to Anthony Hughes (:ashughes) [GFX][QA][Mentor] from comment #9)
> (In reply to Milan Sreckovic [:milan] from comment #7)
> > Anthony, can you confirm that the graphs in this bug and bug 1292311 have
> > cliffs on June 22nd, and that it isn't just a graphing artifact of some sort?
> 
> Both appear to start on June 22nd based on the data.

Further to this, the first reports were in Aurora so we're looking for something that was uplifted around that time.

If we take the 2016-06-22 build as an upper bound and assume enough luck that we can discount the 5 days before it, we get a much more manageable regression range: http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=e50976a962c4&tochange=c0ead5950258
This changeset from bug 1276931 looks pretty suspect, although it was in Aurora for almost a week before this started: http://hg.mozilla.org/releases/mozilla-aurora/rev/dca560586c39
(In reply to Edwin Flores [:eflores] [:edwin] from comment #16)
> This changeset from bug 1276931 looks pretty suspect, although it was in
> Aurora for almost a week before this started:
> http://hg.mozilla.org/releases/mozilla-aurora/rev/dca560586c39

Scratch that. Looks pretty benign.
I am getting this crash very often; see my bug 1318307 and its crash reports. My display resolution is 2560x1600.
(In reply to zetka from comment #19)
> I am getting this crash very often; see my bug 1318307 and its crash
> reports. My display resolution is 2560x1600.

Could you please provide a memory status snapshot? Start "Task Manager" (e.g. by typing taskmgr on a command line) and set it to the Performance tab before you run Firefox, then watch the CPU and memory utilization while increasing the number of tabs until the issue occurs. Could you then take a screen snapshot when you get this issue? You seem to run a lot of high-resolution application windows on a 32-bit Windows machine with 4GB RAM (= 3GB effectively).
I wonder if you see the issue because Windows runs into memory exhaustion and heavy page swapping to the hard disk. That would be a different type of issue, though the symptoms may look like what you show. Effectively, the Windows compositor would run out of RAM on that machine.
Flags: needinfo?(zetka)
Maybe some additional information for you. I tried different configurations: standard 32-bit W7, then PAE-patched W7 with 4GB (which added about 0.5GB of memory), and PAE-patched W7 with 6GB RAM. It happens in all cases, including with increased RAM; I think it happens when the same number of windows is open. I use the DriveGleam RAM monitor, and at the time of the crash it shows about 70% of RAM in use with 4GB and below 50% with 6GB.
So I would look for the issue in video together with high resolution.
(In reply to zetka from comment #21)
> Maybe some additional information for you. I tried different configurations:
> standard 32-bit W7, then PAE-patched W7 with 4GB (which added about 0.5GB of
> memory), and PAE-patched W7 with 6GB RAM. It happens in all cases, including
> with increased RAM; I think it happens when the same number of windows is
> open. I use the DriveGleam RAM monitor, and at the time of the crash it shows
> about 70% of RAM in use with 4GB and below 50% with 6GB.
> So I would look for the issue in video together with high resolution.

Not sure how the "PAE patch" you applied works on Windows 7 (32-bit Windows 7 does not use PAE to address more than 4GB of RAM by itself). Using PAE only increases the total RAM accessible to the whole system (one can run more applications in parallel in RAM), but the memory for a single process is still limited to 4GB at most, and often not even that (32-bit applications, unless they use a special link option, can only use 2GB of virtual memory at most).
So all the memory Firefox needs, including the contents of all the high-resolution windows it creates, has to fit into that smaller space.
I suspect that Firefox is running out of application memory on your platform, which would produce the corruption symptoms you refer to in the other ticket, and using PAE would not help with that.
To confirm, it would be helpful to see whether your system approaches that limit and then exceeds it. That would identify the root cause (which is likely not the issue that originated this bug ticket).
Flags: needinfo?(zetka)
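To put the limit described above into numbers, here is a back-of-the-envelope sketch. It assumes the default 2GB user-mode address space of a 32-bit process and one uncompressed full-screen surface at 4 bytes per pixel; the pixel format and the one-surface-per-window model are illustrative assumptions, not measurements of Firefox.

```python
GIB = 1024 ** 3
MIB = 1024 ** 2

# Default user-mode address space of a 32-bit Windows process
# (without the special link option mentioned above).
USER_ADDRESS_SPACE = 2 * GIB

# Assumption: one full-screen 2560x1600 surface at 4 bytes per pixel.
WIDTH, HEIGHT, BYTES_PER_PIXEL = 2560, 1600, 4


def surface_bytes(width, height, bytes_per_pixel):
    """Size in bytes of one uncompressed surface."""
    return width * height * bytes_per_pixel


one_surface = surface_bytes(WIDTH, HEIGHT, BYTES_PER_PIXEL)
max_surfaces = USER_ADDRESS_SPACE // one_surface

print(f"One surface: {one_surface / MIB:.1f} MiB")
print(f"Upper bound on surfaces in 2 GiB: {max_surfaces}")
```

The real ceiling is much lower than this upper bound, because code, heaps, and address-space fragmentation share the same 2GB, and a single window may need several buffers.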
It is not a problem to restart into the original unmodified W7 without PAE, if I understand you well. Still, eating even 2GB of memory for 20 open windows does not compare with Chrome in the same situation...
(In reply to zetka from comment #24)
> It is not a problem to restart into the original unmodified W7 without PAE,
> if I understand you well. Still, eating even 2GB of memory for 20 open
> windows does not compare with Chrome in the same situation...

Yes, Chrome has a different internal SW design that in this scenario is not affected by this limit but has other limits instead. On the other hand, the x86-64 instruction set (also supported by other vendors, e.g. Intel) was created and publicly released by AMD over 16 years ago. There is typically no good reason to use 32-bit Windows anymore; the problem you report would likely not happen with a 64-bit version of Firefox.
Crash Signature: [@ atidxx32.dll | atiuxpag.dll | CContext::EmptyOutAllDDIBindPoints ] → [@ atidxx32.dll | atiuxpag.dll | CContext::EmptyOutAllDDIBindPoints ] [@ igd10iumd64.dll | CContext::EmptyOutAllDDIBindPoints ]
Flags: needinfo?(milan)
(In reply to Paul Blinzer from comment #25)
> (In reply to zetka from comment #24)
> > It is not a problem to restart into the original unmodified W7 without PAE,
> > if I understand you well. Still, eating even 2GB of memory for 20 open
> > windows does not compare with Chrome in the same situation...
> 
> Yes, Chrome has a different internal SW design that in this scenario is not
> affected by this limit but has other limits instead. On the other hand, the
> x86-64 instruction set (also supported by other vendors, e.g. Intel) was
> created and publicly released by AMD over 16 years ago. There is typically
> no good reason to use 32-bit Windows anymore; the problem you report would
> likely not happen with a 64-bit version of Firefox.

My use of 32-bit Windows is intentional, because I use some 16- and 32-bit applications that do not work on a 64-bit OS. What should be worked on is that, even after closing the opened windows, Firefox's memory is not correctly released. Firefox must be restarted with the same window layout, and it then stays alive for a certain time before the resources are exhausted again. This is my own opinion.
(In reply to zetka from comment #27)
> (In reply to Paul Blinzer from comment #25)
> > (In reply to zetka from comment #24)
> > > It is not a problem to restart into the original unmodified W7 without PAE,
> > > if I understand you well. Still, eating even 2GB of memory for 20 open
> > > windows does not compare with Chrome in the same situation...
> > 
> > Yes, Chrome has a different internal SW design that in this scenario is not
> > affected by this limit but has other limits instead. On the other hand, the
> > x86-64 instruction set (also supported by other vendors, e.g. Intel) was
> > created and publicly released by AMD over 16 years ago. There is typically
> > no good reason to use 32-bit Windows anymore; the problem you report would
> > likely not happen with a 64-bit version of Firefox.
> 
> My use of 32-bit Windows is intentional, because I use some 16- and 32-bit
> applications that do not work on a 64-bit OS. What should be worked on is
> that, even after closing the opened windows, Firefox's memory is not
> correctly released. Firefox must be restarted with the same window layout,
> and it then stays alive for a certain time before the resources are
> exhausted again. This is my own opinion.

Thanks for the info. The fact of the matter is that if virtual addresses get exhausted, which is more likely on a true 32-bit OS, issues like this will eventually occur. This is not a driver issue (though the driver may be implicated) but a matter of the available resources being used up. Reducing the memory footprint of the application would be the way forward.

I wonder, though, whether the problem still occurs with the new Firefox versions that use Electrolysis (multi-process operation), e.g. Firefox 52. I suspect it no longer reproduces, because the per-process footprint is smaller.
Therefore the bug should be closed if it cannot be reproduced on newer browsers.
Flags: needinfo?(zetka)
I will test it next week and share my results here.
Flags: needinfo?(zetka)
Mass wontfix for bugs affecting firefox 52.
I have seen the problem again on version 52, in a similar way, when opening many windows in FF. In my opinion it is still affected.
See Also: → 1351349
Mass change P1->P2 to align with new Mozilla triage process
Priority: P1 → P2
For signature [@ atidxx32.dll | atiuxpag.dll | CContext::EmptyOutAllDDIBindPoints ],
no new crashes have been reported for over a month on 57 and later.

For signature [@ igd10iumd64.dll | CContext::EmptyOutAllDDIBindPoints ],
the crash rate has been quite low for the past month.

Therefore, I am going to set this bug as P3.
A side note: this bug is probably more related to gfx, since the crash backtrace suggests it crashes in PaintThread when handling shader/surface rendering.
Priority: P2 → P3
Assignee: cpearce → nobody

Closing because no crashes reported for 12 weeks.

Status: NEW → RESOLVED
Closed: 2 years ago
Resolution: --- → WORKSFORME