Closed Bug 700288 Opened 13 years ago Closed 5 years ago

crash in mozilla::gfx::BaseRect::UnionEdges with AMD Radeon HD 6xxx series

Categories

(Core :: Layout, defect)

8 Branch
x86
Windows 7
defect
Not set
critical

RESOLVED WONTFIX
Tracking Status
firefox8 - ---
firefox10 + ---
firefox13 - ---
firefox14 - ---

People

(Reporter: kairo, Unassigned)

References

Details

(Keywords: crash, regression)

Crash Data

This bug was filed from the Socorro interface and is 
report bp-6fdb7e4b-1e50-45af-b3ce-f443b2111107 .
============================================================= 

Stack:

0 	xul.dll 	mozilla::gfx::BaseRect<int,nsRect,nsPoint,nsSize,nsMargin>::UnionEdges 	obj-firefox/dist/include/mozilla/gfx/BaseRect.h:180
1 	xul.dll 	nsFrame::ConsiderChildOverflow 	layout/generic/nsFrame.cpp:6330
2 	xul.dll 	nsAbsoluteContainingBlock::Reflow 	
3 	xul.dll 	nsBlockFrame::Reflow 	layout/generic/nsBlockFrame.cpp:1220

More crash reports can be found at https://crash-stats.mozilla.com/report/list?signature=mozilla%3A%3Agfx%3A%3ABaseRect%3Cint%2C%20nsRect%2C%20nsPoint%2C%20nsSize%2C%20nsMargin%3E%3A%3AUnionEdges%28nsRect%20const%26%29 - some of the crash reports with this signature only have frame #0 at all, e.g. bp-4208dfbe-c228-4bf7-8c76-bfdfc2111106

This seems to have no correlation to special add-ons at least, and though there seem to be a few reports with other versions, this suddenly spiked in 8.0b6 over the weekend as that last beta has been adopted, ranking as #39 overall on 8 in yesterday's data with 65 crashes per million ADU (or 98 crashes overall).
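The two figures quoted here are tied together by a simple ratio. As a rough sketch (the function names are illustrative and not part of Socorro, and the quoted rate is presumably rounded), the ADU implied by 98 crashes at 65 crashes per million can be back-computed like this:

```python
def crashes_per_million_adu(crashes: int, adu: float) -> float:
    """Normalize a raw crash count by active daily users (ADU)."""
    return crashes / adu * 1_000_000


def implied_adu(crashes: int, rate_per_million: float) -> float:
    """Invert the rate to estimate the ADU behind a reported figure."""
    return crashes / rate_per_million * 1_000_000


# Figures quoted in the comment: 98 crashes, 65 crashes per million ADU.
adu = implied_adu(98, 65)
print(f"implied ADU: {adu:,.0f}")
print(f"rate check: {crashes_per_million_adu(98, adu):.1f} per million")
```

With these inputs the implied user base for that day is roughly 1.5 million ADU on 8.0 beta.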

The vast majority of those crashes are on Win7, though I've seen one on XP as well.
Summary: crash in mozilla::gfx::BaseRect with Firefxo 8.0b6 → crash in mozilla::gfx::BaseRect with Firefox 8.0b6
I *think* this is a layout issue. Move if not the case :)
Component: Graphics → Layout
QA Contact: thebes → layout
Not sure this is high volume, but it looks new for 8b6.
Keywords: regression
This has 600+ crashes and it's a clear regression. Need to find an owner.
CCing some of the usual suspects.
Windows only, which is very strange for layout.

Are there any URLs in common among the crash reports?
(In reply to Robert O'Callahan (:roc) (Mozilla Corporation) from comment #5)
> Are there any URLs in common among the crash reports?

Setting the needURLs keyword so that people looking for that will find it and add a URL report to the bug.
Keywords: needURLs
I tried a few URL queries for the last 3 days, and even though I escaped the regex, for some reason the search is not showing me any URLs (but I do see URLs in individual reports in crash stats). I will try again.
I tried again today with the same results. Even though there are 239 crashes for the 2011110200 build, the report returns an empty result when I run it.
grooveshark.com is one site that I see in the trunk reports.
These are the URLs I see in reports from yesterday:

1 http://xnxx.com/
1 http://www.youtube.com/watch?v=miq5X193Djo&feature=related
1 http://www.nordea.fi/
1 http://www.liveleak.com/c/iraq#item_page=6
1 http://www.facebook.com/dialog/oauth?...
1 http://www.facebook.com/ajax/pagelet/generic.php/PhotoViewerPagelet?...
1 http://www.facebook.com/ajax/pagelet/generic.php/PhotoViewerPagelet?...
1 http://www.facebook.com/ajax/pagelet/generic.php/PhotoViewerPagelet?...
1 http://www.facebook.com/ajax/pagelet/generic.php/PhotoViewerPagelet?...
1 http://www.facebook.com/ajax/pagelet/generic.php/PhotoViewerPagelet?...
1 http://www.facebook.com/ajax/pagelet/generic.php/MoreStoriesPagelet?...
1 http://www.facebook.com/ai.php?...
1 http://www.facebook.com/ai.php?...
1 http://www.facebook.com/ai.php?...
1 http://www.ebay.de/sch/i.html?...
1 http://www.damnlol.com/thats-one-good-looking-dog-3831.html
1 http://www.1001franquias.com.br/todas/8
1 https://zyngapv.hs.llnwd.net/e6/mwfb/graphics/spacer.gif
1 https://www.facebook.com/ajax/profile/navigation.php?...
1 https://s-static.ak.fbcdn.net/connect/xd_proxy.php?...&origin=http%3A%2F%2Ffacebook.mafiawars.zynga.com%2F...
1 https://plus.google.com/u/0/_/notifications/frame?...
1 http://serverfarm.ad-sponsor.com/srv/www/delivery/afr.php?refresh=400&campaignid=12&target=_blank&loc=http%3A%2F%2Fwww.skyload.net%2FFile%2F28cd56c200ccff60095f3d06a10407e4.flv
1 http://sbb.ch/
1 http://maps.google.com/
1 http://kleinanzeigen.ebay.de/anzeigen/stadt/oldenburg/
1 http://grooveshark.com/dfpAds.html?p=explore_popular&w=...
1 http://grooveshark.com/dfpAds.html?p=explore_popular&w=...
1 http://grooveshark.com/dfpAds.html?p=explore_popular&w=...
1 http://grooveshark.com/dfpAds.html?p=explore_popular&w=...
1 http://forum.portaldovt.com.br/forum/index.php?showtopic=137576
1 http://edge.jeetyetmedia.com/728x90_facebook_photo.htm
1 http://bungaissky.blogspot.com/?guestAuth=...
1 \N

Where you see "...", I have cut content that looks like it could contain privacy-relevant data.
Keywords: needURLs
Summary: crash in mozilla::gfx::BaseRect with Firefox 8.0b6 → crash in mozilla::gfx::BaseRect::UnionEdges with Firefox 8.0b6
None of the non-... URLs crash for me, needless to say :-(

The correlations page says "none available" ... does that mean there's no correlation with add-ons, or no-one's done the analysis?
With 232 crashes, probably not enough volume to be able to get correlation data. We can manually look up data, but https://crash-analysis.mozilla.com/crash_analysis/20111121/ does not have the trunk as a version currently.
Seems weird that we would have a bug in code as hot as UnionEdges that only has 232 crashes. Most of these URLs have Flash content and I suspect memory corruption from plugins (see bug 678538.) Can we get Flash Player version info here? We may have a known bad Flash Plugin affecting us here.
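For context on why a crash inside UnionEdges looks suspicious: the operation is just the bounding box of two rectangles. The following is a Python model of that arithmetic, not the Gecko source; the `Rect` class and method names are stand-ins chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Axis-aligned rectangle stored as origin plus size, like nsRect."""
    x: int
    y: int
    width: int
    height: int

    def x_most(self) -> int:
        return self.x + self.width

    def y_most(self) -> int:
        return self.y + self.height

    def union_edges(self, other: "Rect") -> "Rect":
        """Bounding box of both rectangles' edges (empty rects included)."""
        x = min(self.x, other.x)
        y = min(self.y, other.y)
        return Rect(
            x,
            y,
            max(self.x_most(), other.x_most()) - x,
            max(self.y_most(), other.y_most()) - y,
        )


a = Rect(0, 0, 10, 10)
b = Rect(5, 5, 20, 2)
print(a.union_edges(b))
```

Since the computation only reads two rectangles and takes integer minima and maxima, an access violation at this point would more plausibly come from a bad object or argument pointer than from the arithmetic itself.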
Jet, the vast majority of those crashes are in chrome processes and we only get Flash versions reported for plugin processes. In analyzing all crashes that so far have happened in December, I found two hangs and one crash in plugin processes with that ::UnionEdges signature, and they had Flash 11.1.102.55, which AFAIK is pretty much the newest version, but also one hang with 10.2.159.1 so I'd say this is not specific to a certain Flash version.
It's currently #3 top crasher in 10.0b6.
It also happened at a high volume in 10.0b1.
My guess is that it is caused by beta testers who want to check whether their known issue is fixed in the RC.

It occurs mainly with AMD GPUs (checked manually because there's no correlation file for that) although I saw one with an NVIDIA GPU on Win XP: bp-6d314ea8-a0a1-4737-8537-6a3d22120126.
Device IDs are mainly 9802 (AMD Radeon HD 6300), 9803 (AMD Radeon HD 6300), 9804 (AMD Radeon HD 6310), 9806 (AMD Radeon HD 6320), 9807 (AMD Radeon HD 6290) (source: http://developer.amd.com/download/pc_vendor_id/pages/default.aspx).
It crashes with driver versions up to 8.92 (see bp-dbf7d428-20b9-46a8-8b9c-9c9a12120128).
The latest AMD display driver version is 8.93 released on Jan 25th, 2012 (see http://support.amd.com/us/kbarticles/Pages/AMDCatalystSoftwareSuiteVersion121.aspx).
Blocks: 605780
Keywords: topcrash
Summary: crash in mozilla::gfx::BaseRect::UnionEdges with Firefox 8.0b6 → crash in mozilla::gfx::BaseRect::UnionEdges with AMD Radeon HD 6xxx series
It first appeared in 7.0a2/20110712 and in 8.0a1/20110729.
Version: Trunk → 8 Branch
A lot of these crashes only have a single frame at the top, which is odd. Stack smashed?
B6 data seems pretty different from B1 in volume - B6 has 764 crashes so far and B1 had 231 crashes in 4 weeks. But our beta user base has grown as well since B1, so that may account somewhat for the difference.
Depends on: 722538
We now have a test machine that has the specs to test this bug. Juan and I have spent some time testing but we have not been able to generate a crash in this stack. Juan crashed twice, but both times in a different stack for which the bug had already been fixed.

There were very few crashes in 10.0 but it looks like there are now about 266 crashes in 10.0.1. The next step IMO to try to figure this out would be to get some new URLs and see if we can reproduce.
Keywords: needURLs
Sent email to try to get more URLs to help repro.
Many of the URLs seem to be videos. These are the ones from yesterday that have more than 1 crash report:

[rkaiser@cm-fs01 ~]$ gunzip --stdout /data/security_group/crash_urls/20120226-crashdata.csv.gz | awk -W compat -F\t '$1 ~ /mozilla::gfx::BaseRect.*::UnionEdges/ {print $1 $2}' | sort | uniq -c | sort -nr
      3 http://www.youtube.com/watch?v=K86kJXt2dHk
      3 http://www.tumblr.com/dashboard
      2 http://www.youtube.com/watch?v=SiXCZ-Ew0b0&feature=related
      2 http://www.youtube.com/watch?v=OoXkiIt3cng&feature=player_embedded
      2 http://www.youtube.com/watch?NR=1&feature=endscreen&v=IMvrJJrpX9Q
      2 http://www.google.com.jm/
      2 http://samakomlao.blogspot.com/
      2 http://qtarawaci.com:2082/frontend/x3/filemanager/index.html?dirselect=webroot&domainselect=qtarawaci.com&dir=%2Fhome%2Fqtarawac%2Fpublic_html&showhidden=1
      2 about:blank
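For anyone without shell access to that machine, the same tally can be sketched in Python. This assumes the dump is gzip-compressed and tab-separated with the signature in the first field and the URL in the second; the exact column layout is an assumption, and the sample rows below are made up.

```python
import csv
import gzip
import re
from collections import Counter

SIGNATURE_RE = re.compile(r"mozilla::gfx::BaseRect.*::UnionEdges")


def count_urls(lines):
    """Count URLs for rows whose first tab-separated field matches the signature."""
    counts = Counter()
    for row in csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) >= 2 and SIGNATURE_RE.search(row[0]):
            counts[row[1]] += 1
    return counts


def count_urls_in_file(path):
    """Read a gzip-compressed, tab-separated crash dump and count matching URLs."""
    with gzip.open(path, mode="rt", encoding="utf-8", errors="replace", newline="") as handle:
        return count_urls(handle)


# Tiny in-memory sample standing in for the real file (made-up rows).
sample = [
    "mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::UnionEdges(nsRect const&)\thttp://example.com/a",
    "mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::UnionEdges(nsRect const&)\thttp://example.com/a",
    "some::other::Signature\thttp://example.com/b",
]
for url, n in count_urls(sample).most_common():
    print(f"{n:7d} {url}")
```

`Counter.most_common()` takes the place of the `sort | uniq -c | sort -nr` stage of the pipeline.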

The ones with 1 crash are a large number of additional youtube hits and a couple where the URL sounds like it would be some porn stuff. There's also some google search, facebook and netflix among a few others with a single crash per URL.
Removing needURLs and adding qawanted. Marcia - can you take another stab at this?
Keywords: needURLs → qawanted
Slight spike in the 11 explosive report: https://crash-analysis.mozilla.com/rkaiser/2012-03-11/2012-03-11.firefox.11.explosiveness.html. I will try some of the URLs in Comment 21 as well as some of the new ones to see if I can reproduce on the one machine I have that has Radeon.
It's #22 top browser crasher in 11.0.

Here are correlations in 11.0 on March 22:
  mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::UnionEdges(nsRect const&)|EXCEPTION_ACCESS_VIOLATION_READ (311 crashes)
     95% (297/311) vs.   4% (4480/110886) atidxx32.dll
         20% (63/311) vs.   0% (288/110886) 8.17.10.318
          7% (23/311) vs.   0% (152/110886) 8.17.10.325
         10% (31/311) vs.   0% (189/110886) 8.17.10.331
          6% (20/311) vs.   0% (210/110886) 8.17.10.337
          3% (8/311) vs.   0% (165/110886) 8.17.10.342
          1% (3/311) vs.   0% (262/110886) 8.17.10.355
         37% (116/311) vs.   1% (637/110886) 8.17.10.362
          1% (3/311) vs.   0% (36/110886) 8.17.10.370
          5% (17/311) vs.   0% (167/110886) 8.17.10.378
          0% (1/311) vs.   0% (104/110886) 8.17.10.385
          1% (2/311) vs.   0% (119/110886) 8.17.10.395
          0% (1/311) vs.   0% (112/110886) 8.17.10.401
          0% (1/311) vs.   0% (224/110886) 8.17.10.405
          1% (2/311) vs.   0% (477/110886) 8.17.10.414
          0% (1/311) vs.   0% (11/110886) 8.17.10.418
          2% (5/311) vs.   0% (535/110886) 8.17.10.425

  mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::UnionEdges(nsRect const&)|EXCEPTION_ACCESS_VIOLATION_WRITE (99 crashes)
     96% (95/99) vs.   4% (4480/110886) atidxx32.dll
         27% (27/99) vs.   0% (288/110886) 8.17.10.318
          3% (3/99) vs.   0% (152/110886) 8.17.10.325
          7% (7/99) vs.   0% (189/110886) 8.17.10.331
          5% (5/99) vs.   0% (210/110886) 8.17.10.337
          4% (4/99) vs.   0% (165/110886) 8.17.10.342
         38% (38/99) vs.   1% (637/110886) 8.17.10.362
          2% (2/99) vs.   0% (36/110886) 8.17.10.370
          4% (4/99) vs.   0% (167/110886) 8.17.10.378
          1% (1/99) vs.   0% (119/110886) 8.17.10.395
          1% (1/99) vs.   0% (477/110886) 8.17.10.414
          3% (3/99) vs.   0% (535/110886) 8.17.10.425

  mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::Union(nsRect const&)|EXCEPTION_ACCESS_VIOLATION_READ (12 crashes)
     92% (11/12) vs.   4% (4480/110886) atidxx32.dll
         33% (4/12) vs.   0% (288/110886) 8.17.10.318
          8% (1/12) vs.   0% (152/110886) 8.17.10.325
          8% (1/12) vs.   0% (189/110886) 8.17.10.331
         25% (3/12) vs.   1% (637/110886) 8.17.10.362
          8% (1/12) vs.   0% (167/110886) 8.17.10.378
          8% (1/12) vs.   0% (104/110886) 8.17.10.385
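To make the strength of these correlations easier to compare, each line can be reduced to a ratio. The sketch below assumes the usual reading of the report format (share of crashes with this signature that have the module loaded, versus the share among all crashes in the sample); the `parse_correlation` and `lift` helpers are hypothetical names, not part of any Socorro tooling.

```python
import re

LINE_RE = re.compile(
    r"\((?P<hit>\d+)/(?P<crashes>\d+)\)\s+vs\.\s+\d+%\s+"
    r"\((?P<base_hit>\d+)/(?P<base_total>\d+)\)\s+(?P<name>\S+)"
)


def parse_correlation(line):
    """Return (name, share among these crashes, share among all crashes)."""
    match = LINE_RE.search(line)
    if match is None:
        raise ValueError(f"not a correlation line: {line!r}")
    hit, crashes = int(match["hit"]), int(match["crashes"])
    base_hit, base_total = int(match["base_hit"]), int(match["base_total"])
    return match["name"], hit / crashes, base_hit / base_total


def lift(line):
    """How many times more common the module is in this signature than overall."""
    _, in_signature, overall = parse_correlation(line)
    return in_signature / overall


row = "95% (297/311) vs.   4% (4480/110886) atidxx32.dll"
name, in_signature, overall = parse_correlation(row)
print(f"{name}: {in_signature:.1%} vs {overall:.1%}, lift {lift(row):.1f}x")
```

For the atidxx32.dll line, the module is more than twenty times as common among these crashes as in the overall sample, which is what ties the signature to the AMD driver.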
There's a spike in crashes from 14.0a1/20120414, making it the #1 top crasher on trunk over the last day even though the D2D blocklist is effective. It now also affects AMD GPUs that are not D2D-blocklisted.

The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=10622eaff4fc&tochange=364f0a5a1d2d

I think it's related to bug 745054.
Again, there's a spike in crashes from 14.0a1/20120418. The regression range for the spike is:
http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=c61e7c3a232a&tochange=0c7e2911be75
The spike is gone in the next build.
There is also a spike in crashes from 13.0a2/20120417 that ended after 13.0a2/20120418 making it #8 top browser crasher in 13.0a2.
The regression range for the spike is:
http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=e158b91dd28a&tochange=01ae9ced59c6
The working range for the spike is:
http://hg.mozilla.org/releases/mozilla-aurora/pushloghtml?fromchange=360971f3fcc3&tochange=d44c15d3ecd0

Based on the previous comments, there's likely a static variable that is not initialized somewhere.
It's back in 14.0a2/20120429.
I'm crossing my fingers that 13.0 isn't a bad build.
From bp-62a17cae-1fbf-48e4-a2ea-e9bb92120507:

AdapterVendorID: 1002, AdapterDeviceID: 9802, AdapterSubsysID: 05201025, AdapterDriverVersion: 8.792.0.0

D3D10 Layers? D3D10 Layers-

D3D9 Layers? D3D9 Layers+

This shows that the block of D2D is working, but we're still crashing anyways.
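Checking many reports by hand for this pattern is tedious, so here is a small sketch that pulls the adapter fields and layers markers out of an app-notes string. It assumes `?` marks an attempt, `+` a success and `-` a failure, which matches how the notes are read above; the helper names are hypothetical, and the device ID set is the list given earlier in this bug.

```python
import re

# Device IDs listed earlier in this bug for the affected Radeon HD 6xxx parts.
AFFECTED_AMD_DEVICES = {"9802", "9803", "9804", "9806", "9807"}
AMD_VENDOR_ID = "1002"


def parse_adapter(notes):
    """Extract 'Adapter...' key/value pairs from an app-notes string."""
    return dict(re.findall(r"(Adapter\w+):\s*([0-9A-Za-z.]+)", notes))


def layer_status(notes):
    """Map each layers backend to its last marker: '?', '+' or '-'."""
    status = {}
    for backend, marker in re.findall(r"(D3D\d+) Layers([?+-])", notes):
        status[backend] = marker
    return status


notes = (
    "AdapterVendorID: 1002, AdapterDeviceID: 9802, AdapterSubsysID: 05201025, "
    "AdapterDriverVersion: 8.792.0.0 D3D10 Layers? D3D10 Layers- "
    "D3D9 Layers? D3D9 Layers+"
)
adapter = parse_adapter(notes)
affected = (
    adapter.get("AdapterVendorID") == AMD_VENDOR_ID
    and adapter.get("AdapterDeviceID") in AFFECTED_AMD_DEVICES
)
print(adapter["AdapterDriverVersion"], layer_status(notes), affected)
```

Because `layer_status` keeps only the last marker per backend, a report that logs both a `+` and a later `-` for the same backend is recorded as a failure.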

Should we try blocking D3D9 too? That'd stop us from using the DirectX DLLs altogether.
It was crashing in one build out of two in 14.0a2 over the last week: https://crash-stats.mozilla.com/report/list?version=Firefox%3A14.0a2&query_search=signature&query_type=contains&reason_type=contains&range_value=4&range_unit=weeks&hang_type=any&process_type=any&signature=mozilla%3A%3Agfx%3A%3ABaseRect%3Cint%2C%20nsRect%2C%20nsPoint%2C%20nsSize%2C%20nsMargin%3E%3A%3AUnionEdges%28nsRect%20const%26%29

(In reply to Joe Drew (:JOEDREW!) from comment #31)
> Should we try blocking D3D9 too? That'd stop us from using the DirectX DLLs
> altogether.
I think the underlying problem should be understood: why it works in some builds and fails in others without any change to the Gfx code.
QAWANTED still on this bug -- is there something QA can help with here? Keep in mind that we pretty much exhausted the testing we could do in Firefox 10b*.
(In reply to Anthony Hughes, Mozilla QA (irc: ashughes) from comment #33)
> QAWANTED still on this bug -- is there something QA can help with here? Keep
> in mind that we pretty much exhausted the testing we could do in Firefox
> 10b*.

We've got an email out to JP and Bas asking what more we can do to test and help them figure out what's going wrong with the few AMD Radeon HD 6xxx crashers that are floating around.
We have been contacted by AMD too. I'd like it if they could try Purify or some other valgrind-alike to see if they're scribbling on our process space.

There's not much that QA can do right now. I'd still like to see us block D3D9 as well on these devices and see if crash rate changes. At least then we'd have something to tell AMD.
Given comment 35 and recent meeting discussions, I'm removing qawanted. Please re-add if there is something specific QA can do to help in the future.
Keywords: qawanted
Depends on: 755974
Unfortunately, 13.0b4 is a bad build.
If bug 755974 lands soon, we should be able to see if it makes those crashes drop off.
No longer blocks: 605780
Crash Signature: [@ mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::UnionEdges(nsRect const&)] → [@ mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::UnionEdges(nsRect const&)] [@ @0x0 | mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::Union(nsRect const&)]
No longer depends on: 755974, 722538
Blocks: 605780
Depends on: 755974, 722538
I'm very concerned that we still don't understand the underlying cause here :s
There are still crashes after the D3D9 blocklisting. See for instance the App Notes of bp-6f14b810-1b1a-4d86-8db5-73fa52120525:
D3D10 Layers? D3D10 Layers- D3D9 Layers? D3D9 Layers+ D3D9 Layers-
There are no crashes in 13.0b5.
In 14.0a2, they were appearing one build out of two during one week (see comment 32), in 10.0 Beta, they were in 10.0b1 and 10.0b6, and in 11.0 Beta, they were in 11.0b2, 11.0b5, 11.0b7.
So there's a risk that they appear again in 13.0b6.
This no longer appears to be a top crasher.
Keywords: topcrash
Crash Signature: [@ mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::UnionEdges(nsRect const&)] [@ @0x0 | mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::Union(nsRect const&)] → [@ mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::UnionEdges(nsRect const&)] [@ @0x0 | mozilla::gfx::BaseRect<int, nsRect, nsPoint, nsSize, nsMargin>::Union(nsRect const&)] [@ mozilla::gfx::BaseRect<T>::UnionEdges] [@ @0x0 | mozilla::g…
Closing because no crashes reported for 12 weeks.
Status: NEW → RESOLVED
Closed: 5 years ago
Resolution: --- → WONTFIX