Closed Bug 1308471 Opened 8 years ago Closed 3 years ago

Crash in CopyRow_AVX

Categories

(Core :: WebRTC, defect, P3)

x86
Windows XP
defect

Tracking

()

RESOLVED INACTIVE
Tracking Status
firefox50 --- affected
firefox51 --- affected
firefox52 --- wontfix

People

(Reporter: bmaris, Assigned: sotaro)

References

Details

(Keywords: crash, stale-bug)

Crash Data

Attachments

(1 file, 1 obsolete file)

This bug was filed from the Socorro interface and is 
report bp-9864bf0c-c881-4689-a2c0-0cd032161006.
=============================================================

Other reports:
bp-f668f5de-0635-4682-850c-90da82161007
bp-d19a9602-5d2e-46fa-9878-10dfd2161006 
bp-5832a9d6-770d-4fa8-a0d4-7ce7e2161006

[Affected versions]:
- Firefox 50 beta 5
- latest Nightly 52.0a1

[Affected platforms]:
- Windows XP

[Steps to reproduce]:
- I don't have exact steps to reproduce since this reproduces intermittently, but it occurred during WebRTC calls while exiting rooms, joining rooms, unplugging video/audio, and refreshing the website during a call.

I reproduced on https://opentokrtc.com/, https://meet.jit.si/

[Expected result]:
- No crashes are recorded.

[Actual result]:
- Firefox crashes

[Regression range]:
- Not sure if this is a recent regression. A range is hard to find since I ran into the crash only intermittently, but I'll give it a try ASAP.
Sotaro - this might be fallout from your libyuv landing (Bug 1284803)
Bogdan - you could do a targeted regression check for that bug's landing, though if it's not easy to repro that's probably not worth the time.
Rank: 15
Flags: needinfo?(sotaro.ikeda.g)
Flags: needinfo?(bogdan.maris)
Priority: -- → P1
Assignee: nobody → sotaro.ikeda.g
Flags: needinfo?(sotaro.ikeda.g)
AVX needs OS support, and I don't believe Windows XP supports that. This might be the reason.
Oh wait, it crashes on AMD... Does AMD support AVX? Looks like all crash reports are on "AuthenticAMD family 21 model 2 stepping 0 | 8". Probably this CPU incorrectly sets the bit for AVX?
(In reply to Xidorn Quan [:xidorn] (UTC+10) from comment #3)
> reports are on "AuthenticAMD family 21 model 2 stepping 0 | 8". Probably
> this CPU incorrectly sets the bit for AVX?

That seems possible; both gecko and libyuv check for AVX capability.
But Windows XP doesn't support AVX, which indicates that the detection code is imperfect anyway. Windows supports AVX only from Windows 7 SP1 onward.
Thanks, we could add a Windows version check to mozilla::supports_ssse3().
It looks like we have had the correct OS support check:
https://dxr.mozilla.org/mozilla-central/rev/7ae377917236b7e6111146aa9fb4c073c0efc7f4/mozglue/build/SSE.cpp#168-181
which looks effectively the same as what FFmpeg is currently using:
https://github.com/FFmpeg/FFmpeg/blob/9eb3da2/libavutil/x86/cpu.c#L131-L142

So I'm not sure how that could happen.
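For reference, the usual shape of the OS support check discussed here can be sketched as follows. This is a simplified illustration, not the code from mozglue/build/SSE.cpp or from libyuv's cpu_id.cc: the names `HasUsableAvx` and `CheckLoggedStyleValue` are hypothetical, and the CPUID and XCR0 values are passed in as plain integers instead of being read from the hardware, so the logic can be followed in isolation.

```cpp
#include <cassert>
#include <cstdint>

// Bits in ECX as returned by CPUID leaf 1.
constexpr uint32_t kCpuidOsxsave = 1u << 27;  // OS has enabled XSAVE/XGETBV
constexpr uint32_t kCpuidAvx = 1u << 28;      // CPU implements AVX

// Bits in XCR0 (the register read by XGETBV with index 0).
constexpr uint64_t kXcr0Sse = 1u << 1;  // OS saves/restores XMM state
constexpr uint64_t kXcr0Avx = 1u << 2;  // OS saves/restores upper YMM state

// Hypothetical helper: AVX is usable only if the CPU reports AVX, the OS has
// enabled XGETBV, and XCR0 shows that both XMM and YMM state are managed.
constexpr bool HasUsableAvx(uint32_t cpuid1_ecx, uint64_t xcr0) {
  return (cpuid1_ecx & kCpuidAvx) != 0 &&
         (cpuid1_ecx & kCpuidOsxsave) != 0 &&
         (xcr0 & (kXcr0Sse | kXcr0Avx)) == (kXcr0Sse | kXcr0Avx);
}

// An XCR0 value of 7 has the x87, SSE and AVX bits set, so the check passes
// when the CPU bits are present as well.
inline void CheckLoggedStyleValue() {
  assert(HasUsableAvx(kCpuidAvx | kCpuidOsxsave, 7));
}
```

With a check of this shape, an OS that never enables the YMM state bit in XCR0 would be expected to report AVX as unavailable, even on a CPU that advertises it.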
Sorry, the mozilla::supports_ssse3() check is used only for the video playback use case. The crashing code path is a WebRTC use case, and it has only the libyuv check.
(In reply to Sotaro Ikeda [:sotaro] from comment #8)
> Sorry, mozilla::supports_ssse3() check was used only for video playback use
> case. The crash code path is a use case of webrtc and it have only libyuv
> check.

libyuv checks it in InitCpuFlags()
   https://dxr.mozilla.org/mozilla-central/source/media/libyuv/source/cpu_id.cc#212
The mozilla::supports_ssse3() check and the libyuv check actually seem to be the same.
Blocks: 1284803
Before the fix for bug 1284803, libyuv::CopyPlane() did not support AVX.
(In reply to Sotaro Ikeda [:sotaro] from comment #10)
> mozilla::supports_ssse3() and libyuv check seems same actually.

Sorry, supports_ssse3() is not related to this bug. The bug is about AVX.
(In reply to Sotaro Ikeda [:sotaro] from comment #9)
> (In reply to Sotaro Ikeda [:sotaro] from comment #8)
> The crash code path is a use case of webrtc and it have only libyuv check.
> 
> libyuv checks it in InitCpuFlags()
>   
> https://dxr.mozilla.org/mozilla-central/source/media/libyuv/source/cpu_id.

AVX check seems actually same between gecko and libyuv.
Attachment #8800155 - Attachment is obsolete: true
(In reply to Sotaro Ikeda [:sotaro] from comment #13)
> > libyuv checks it in InitCpuFlags()
> >   
> > https://dxr.mozilla.org/mozilla-central/source/media/libyuv/source/cpu_id.
> 
> AVX check seems actually same between gecko and libyuv.

   https://dxr.mozilla.org/mozilla-central/source/media/libyuv/source/cpu_id.cc#212
(In reply to Randell Jesup [:jesup] from comment #1)
> Sotaro - this might be fallout from your libyuv landing (Bug 1284803)
> Bogdan - you could do a targeted regression check for that bug's landing,
> though if it's not easy to repro that's probably not worth the time.

Looking at bug 1284803, I see that it was fixed in Nightly on 2016-07-26, but I could not reproduce the crash on the build from the day before (2016-07-25) or on the builds from the days after the fix (2016-07-26 and 2016-07-27). Since this took a lot of time and got me nowhere, I did not go any further.
Flags: needinfo?(bogdan.maris)
:bogdan_maris, can you capture the log that the firefox.exe build from comment 18 writes to stderr when the crash happens?
Flags: needinfo?(bogdan.maris)
(In reply to Sotaro Ikeda [:sotaro] from comment #19)
> :bogdan_maris, can you get the log out to std error with built firefox.exe
> of comment 18 when the crash happen?

Sorry, I'm actually not sure what you are asking here. Can you please give me a step-by-step guide or something?
Sorry, my explanation was not clear.

The steps are as follows.

[1] Download the try build of the firefox binary from the following location.
  https://archive.mozilla.org/pub/firefox/try-builds/sikeda@mozilla.com-f52d8bfef2f73e53c7c65cad8ea9bc3f99493f33/try-win32/

[2] Run the downloaded firefox.exe with a command like the following. The log output from attachment 8800203 has been added to this firefox.exe.
  > ./firefox.exe > log.txt 2>&1

[3] Run through the STR for the crash and check whether log.txt contains the log output added by attachment 8800203.
Sotaro: I see only 3 reports in crash-stats above (I presume all from QA).  How large an issue do you think this is for fx50?  Is it better to ship with this bug, land something late in beta to fix it (time is about out), or back the regressing patch out of beta (which also has some risk)?
Flags: needinfo?(sotaro.ikeda.g)
It seems to happen only on WinXP with "AuthenticAMD family 21 model 2 stepping 0 | 8", and the crashes seem to come only from QA. Given that, it does not seem to be a problem for fx50.

:milan, what do you think about it?
Flags: needinfo?(sotaro.ikeda.g) → needinfo?(milan)
Flags: needinfo?(bogdan.maris)
:bogdan_maris, can you get the log as described in comment 21?
Flags: needinfo?(bogdan.maris)
(In reply to Sotaro Ikeda [:sotaro] from comment #24)
> :bogdan_maris, can you get the log by comment 21?

Sorry, I was a bit busy with other stuff. I tested yesterday and today using the instructions from comment 21 and my steps to reproduce, but was not able to get the same crash signature after many tries.

I did receive 4 crashes though:

- today's crash:
bp-53d9ac1a-0887-4f3a-b1b2-e52422161019
- yesterday's crashes:
bp-10bf36e7-6d7e-4844-83ea-af26a2161019
bp-2ab32890-edaf-4aed-8f1e-bd5b22161018
bp-de65c6cb-d214-4c53-bfe4-190f42161018

Here is the output of log.txt file:

- Yesterday's crashes:
GetXCR0(1) xcr0 7
InitCpuFlags() support AVX
GetXCR0(1) xcr0 7

- Today's crash:
InitCpuFlags() not support AVX

Please let me know if I can help with this further on.
Flags: needinfo?(bogdan.maris)
Let's get some more data and see if we have a fix and how intrusive it would be.  While the number of crashes is low, we do have more XP users in release than in beta, so we want to make sure we anticipate a larger number of problems when we release.  If the fix is simple enough, I'm happy to support uplifting it.
Flags: needinfo?(milan)
(In reply to Bogdan Maris, QA [:bogdan_maris] from comment #25)
> (In reply to Sotaro Ikeda [:sotaro] from comment #24)
> > :bogdan_maris, can you get the log by comment 21?
> 
> Sorry, I was a bit busy with other stuff. I did test yesterday and today
> using the indications from comment 21 and my steps to reproduce but was not
> able to get the same crash signature after many many tries.
> 
> I did receive 4 crashes though:
> 
> - today crash: 
> bp-53d9ac1a-0887-4f3a-b1b2-e52422161019
> - yesterday crashes: 
> bp-10bf36e7-6d7e-4844-83ea-af26a2161019
> bp-2ab32890-edaf-4aed-8f1e-bd5b22161018
> bp-de65c6cb-d214-4c53-bfe4-190f42161018
> 

These crashes do not seem to be related to this bug's crash.

> Here is the output of log.txt file:
> 
> - Yesterday crashes:
> GetXCR0(1) xcr0 7
> InitCpuFlags() support AVX
> GetXCR0(1) xcr0 7
> InitCpuFlags() not support AVX

This shows that libyuv's AVX support check worked correctly. If the check works correctly, CopyRow_AVX() should not be called.
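To illustrate that expectation: code of this kind typically picks a row function at run time from the detected CPU flags, so the AVX variant is reachable only when the check reports AVX. The following is a minimal sketch of that pattern, not libyuv's actual code. All names (`kFlagAvx`, `CopyRowPortable`, `CopyRowAvxStandIn`, `SelectCopyRow`) are hypothetical, and the "AVX" function is an ordinary byte copy standing in for the SIMD routine.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Signature shared by all row-copy variants in this sketch.
using CopyRowFn = void (*)(const uint8_t* src, uint8_t* dst, int width);

// Hypothetical flag bit meaning "AVX is usable on this CPU and OS".
constexpr int kFlagAvx = 1 << 0;

// Portable fallback that works on every CPU.
void CopyRowPortable(const uint8_t* src, uint8_t* dst, int width) {
  assert(width >= 0);
  std::memcpy(dst, src, static_cast<size_t>(width));
}

// Stand-in for an AVX routine. A real one would use 256-bit loads and stores
// and would fault if AVX were not actually usable.
void CopyRowAvxStandIn(const uint8_t* src, uint8_t* dst, int width) {
  assert(width >= 0);
  std::memcpy(dst, src, static_cast<size_t>(width));
}

// Choose the implementation from the detected flags. With flags == 0 the AVX
// variant is never returned, so it can never be called.
CopyRowFn SelectCopyRow(int cpu_flags) {
  return (cpu_flags & kFlagAvx) ? CopyRowAvxStandIn : CopyRowPortable;
}
```

Under this pattern, a crash inside the AVX variant would suggest either that the flags said AVX was usable or that the function was reached through a path that bypasses the selection.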
Mass wontfix for bugs affecting firefox 52.
Mass change P1->P2 to align with new Mozilla triage process
Priority: P1 → P2
Moving to p3 because no activity for at least 1 year(s).
See https://github.com/mozilla/bug-handling/blob/master/policy/triage-bugzilla.md#how-do-you-triage for more information
Priority: P2 → P3

Hi Bogdan!
I'm trying to reproduce this issue on latest Nightly but without success.
It seems to happen only on WinXP, doesn't it? Can you confirm whether this is still an issue?

Flags: needinfo?(bogdan.maris)

(In reply to Marcela from comment #32)
> Hi Bogdan!
> I'm trying to reproduce this issue on latest Nightly but without success.
> It seems to happen only on WinXP, doesn't it? Can you confirm whether this is still an issue?

Unfortunately this happened (for me) only on Windows XP, and I don't have that machine anymore. I tried using an XP VM, but meet.jit.si does not work on XP for some reason, even in the last supported Firefox version, 52.9.0esr. Based on the above, and given that this is a very old bug and no more crashes are recorded with this particular signature, I'm going to close this bug as Inactive.

Status: NEW → RESOLVED
Closed: 3 years ago
Flags: needinfo?(bogdan.maris)
Resolution: --- → INACTIVE