973192 - Firefox hangs on start when using Nouveau driver

Reporter

Description

11 years ago

User Agent: Mozilla/5.0 (X11; Linux x86_64; rv:30.0) Gecko/20100101 Firefox/30.0 (Beta/Release)
Build ID: 20140215125843

Steps to reproduce:

Since February 14th, I cannot launch Firefox: a firefox task eats all the CPU. I have to open a terminal, run top, and kill that firefox task to get Firefox to run. I'm using Arch Linux, 64-bit, with gcc 4.8.2. This seems to be a very recent regression. Removing add-ons doesn't change anything.

Actual results:

Just build Firefox and try to launch it. Wait at least 10 seconds. Nothing happens. Running top in a terminal shows that firefox is using 100% of the CPU.

Expected results:

Firefox should simply launch.

Frederic Bezies

Reporter

Updated

11 years ago

Component: Untriaged → General

Frederic Bezies

Reporter

Comment 1

11 years ago

My last working, quick-starting build was based on revision http://hg.mozilla.org/mozilla-central/rev/6687d299c464

The strange part is that Firefox starts fine once I kill the CPU-hogging firefox task in top.

Frederic Bezies

Reporter

Comment 2

11 years ago

Busted build: https://hg.mozilla.org/mozilla-central/rev/b80f7eece913
Last official working build: https://hg.mozilla.org/mozilla-central/rev/d275eebfae04

Frederic Bezies

Reporter

Comment 3

11 years ago

Attached file: Debug log while trying to launch with last firefox debug build.

Frederic Bezies

Reporter

Comment 4

11 years ago

Could this bug be related? https://bugzilla.mozilla.org/show_bug.cgi?id=959787

Frederic Bezies

Reporter

Updated

11 years ago

Summary: Firefox very slow to start → Firefox hangs on start

Boris Zbarsky [:bzbarsky]

Comment 5

11 years ago

Bug 959787 seems a bit unlikely to cause this sort of problem.... Can you try tinderbox builds to narrow down the regression range further, perhaps?

Flags: needinfo?(fredbezies)

Frederic Bezies

Reporter

Comment 6

11 years ago

I'll try as soon as possible and hope to find a smaller regression range.

Flags: needinfo?(fredbezies)

Frederic Bezies

Reporter

Comment 7

11 years ago

(In reply to Frederic Bezies from comment #6)
> I'll try asap hoping I will find a smaller regression range.

Looks like I got one.

Last working tinderbuild (no hang on start): https://hg.mozilla.org/mozilla-central/rev/23f7a629a217
First busted tinderbuild (hang and having to kill process): https://hg.mozilla.org/mozilla-central/rev/5d7caa093f4f

Looks weird anyway...

Boris Zbarsky [:bzbarsky]

Comment 8

11 years ago

Can you try inbound tinderbuilds?

Frederic Bezies

Reporter

Comment 9

11 years ago

(In reply to Boris Zbarsky [:bz] (reviews will be slow; ask someone else) from comment #8)
> Can you try inbound tinderbuilds?

I'll try them and report back as soon as possible.

Frederic Bezies

Reporter

Comment 10

11 years ago

On mozilla-inbound:

Last working tinderbuild: https://hg.mozilla.org/integration/mozilla-inbound/rev/c7802c9d6eec
First broken tinderbuild: https://hg.mozilla.org/integration/mozilla-inbound/rev/9e7cf0b1d80c

Could it be bug #972397?

Boris Zbarsky [:bzbarsky]

Comment 11

11 years ago

So
http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=c7802c9d6eec&tochange=9e7cf0b1d80c

Bug 972397 sure seems like the most likely culprit in there.

Blocks: 972397

tracking-firefox30: --- → ?

Component: General → Graphics: Layers

Flags: needinfo?(nical.bugzilla)

Keywords: regression

Product: Firefox → Core
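
The pushlog link in comment 11 follows a regular pattern: a repository path plus `fromchange` and `tochange` query parameters. A minimal sketch of building such a link for a regression range, assuming only the URL shape visible in that comment; the helper name `pushlog_url` and its parameters are hypothetical, not part of any Mozilla tooling.

```python
from urllib.parse import urlencode


def pushlog_url(repo_path, from_change, to_change, host="http://hg.mozilla.org"):
    """Build a pushlog URL covering the pushes between two changesets.

    The URL layout is copied from the link in comment 11; nothing else
    about the server's behaviour is assumed here.
    """
    query = urlencode({"fromchange": from_change, "tochange": to_change})
    return f"{host}/{repo_path}/pushloghtml?{query}"


if __name__ == "__main__":
    # Last working and first broken inbound changesets from comment 10.
    print(pushlog_url("integration/mozilla-inbound", "c7802c9d6eec", "9e7cf0b1d80c"))
```

Passing the last working changeset as `from_change` and the first broken one as `to_change` reproduces the link used in comment 11.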

Boris Zbarsky [:bzbarsky]

Comment 12

11 years ago

One more question. When Firefox hangs, can you possibly get stacks for all its threads?

Frederic Bezies

Reporter

Comment 13

11 years ago

Will try. But looking at the range again, I found another possible culprit: bug #972700. What about it?

Boris Zbarsky [:bzbarsky]

Comment 14

11 years ago

Is that really in your range from comment 10? In any case, it's just a string change; shouldn't cause hangs!

Frederic Bezies

Reporter

Comment 15

11 years ago

Just asking for your opinion. Please tell me how to capture the stacks while Firefox is hanging, and I will report the data as soon as possible.

Boris Zbarsky [:bzbarsky]

Comment 16

11 years ago

Something like this should work, if "firefox-pid" is the process id and you're using an sh-compatible shell:

  gdb > /tmp/log.txt 2>&1

and then at the gdb prompt type:

  attach firefox-pid

and then:

  thread apply all bt

and then:

  quit
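
The interactive session described in comment 16 can also be run without typing at a prompt. The following is a minimal sketch, assuming `gdb` is installed and on the PATH and that the caller has permission to attach to the process; the helper names `gdb_backtrace_command` and `dump_stacks` are hypothetical. It relies only on gdb's `-p`, `-batch`, and `-ex` command-line options.

```python
import subprocess


def gdb_backtrace_command(pid):
    """Build a gdb command line that attaches to `pid`, prints the
    backtrace of every thread, and exits without prompting."""
    return [
        "gdb",
        "-p", str(pid),                # attach to the already-running process
        "-batch",                      # run the -ex commands, then exit
        "-ex", "thread apply all bt",  # same command as typed in comment 16
    ]


def dump_stacks(pid, log_path="/tmp/log.txt"):
    """Run gdb and send stdout and stderr to `log_path`, mirroring the
    `> /tmp/log.txt 2>&1` redirection from comment 16.

    Returns gdb's exit status.
    """
    with open(log_path, "w") as log:
        completed = subprocess.run(
            gdb_backtrace_command(pid),
            stdout=log,
            stderr=subprocess.STDOUT,
            check=False,
        )
    return completed.returncode
```

For the hang in this report, `dump_stacks(3445)` would target the process whose LWP appears in the attached log; any other pid works the same way.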

Frederic Bezies

Reporter

Comment 17

11 years ago

Attached file: Logs asked for in comment #16

Frederic Bezies

Reporter

Comment 18

11 years ago

Added the gdb log; I hope it helps fix this bug.

Boris Zbarsky [:bzbarsky]

Comment 19

11 years ago

Thanks! Here's what the thread 1 (main thread) stack looks like:

  Thread 1 (Thread 0x7f2d032c2740 (LWP 3445)):
  #0 0x00007f2d02ebfa8d in read () from /usr/lib/libpthread.so.0
  #1 0x00007f2cfd97264a in mozilla::widget::GfxInfo::GetData() () from /home/fred/Téléchargements/firefox/libxul.so
  #2 0x00007f2cfd96ea2d in nsBaseWidget::ComputeShouldAccelerate(bool) () from /home/fred/Téléchargements/firefox/libxul.so
  #3 0x00007f2cfd96fbf8 in nsBaseWidget::GetLayerManager(mozilla::layers::PLayerTransactionChild*, mozilla::layers::LayersBackend, nsIWidget::LayerManagerPersistence, bool*) () from /home/fred/Téléchargements/firefox/libxul.so
  #4 0x00007f2cfdeb761e in PresShell::Paint(nsView*, nsRegion const&, unsigned int) () from /home/fred/Téléchargements/firefox/libxul.so
  #5 0x00007f2cfdbadcb4 in nsViewManager::ProcessPendingUpdatesForView(nsView*, bool) () from /home/fred/Téléchargements/firefox/libxul.so
  #6 0x00007f2cfdebe827 in nsRefreshDriver::Tick(long, mozilla::TimeStamp) () from /home/fred/Téléchargements/firefox/libxul.so

That shows us reading from the pipe to the glxtest process, and presumably blocking on that read...

Flags: needinfo?(bjacob)
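
The stack in comment 19 ends in `read()` on a pipe. A blocking read on a pipe returns only when data arrives or every write end has been closed, so a reader can wait indefinitely if the process at the other end neither writes nor exits. The sketch below illustrates that behaviour with a plain pipe and a timeout; it is an illustration of the general mechanism, not Firefox's code and not a proposed fix, and the names `read_with_timeout` and `demo` are hypothetical.

```python
import os
import select


def read_with_timeout(fd, timeout_s, max_bytes=4096):
    """Return bytes read from `fd`, b"" at end-of-file, or None if nothing
    became readable within `timeout_s` seconds."""
    readable, _, _ = select.select([fd], [], [], timeout_s)
    if not readable:
        # A plain os.read(fd, ...) issued here would still be blocked.
        return None
    return os.read(fd, max_bytes)


def demo():
    read_end, write_end = os.pipe()
    try:
        # Nothing has been written and the write end is still open,
        # so an unconditional read would block.
        first = read_with_timeout(read_end, 0.1)
        os.write(write_end, b"ok")
        # Now data is available and the read returns immediately.
        second = read_with_timeout(read_end, 0.1)
        return first, second
    finally:
        os.close(read_end)
        os.close(write_end)


if __name__ == "__main__":
    print(demo())
```

The first call times out because the writer is silent but still holds the pipe open, which is the situation a blocked reader is in.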

Nicolas Silva [:nical]

Comment 20

11 years ago

(In reply to Boris Zbarsky [:bz] (reviews will be slow; ask someone else) from comment #11)
> So
> http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=c7802c9d6eec&tochange=9e7cf0b1d80c
>
> Bug 972397 sure seems like the most likely culprit in there.

I would be very surprised. This should just affect assertions in debug builds. The other comments in this thread also look unrelated to bug 972397.

Flags: needinfo?(nical.bugzilla)

Boris Zbarsky [:bzbarsky]

Comment 21

11 years ago

Hmm. Actually, if we're using CPU, not just blocked... It might be worth getting a profile to see what's using the CPU. All the threads showing in attachment 8378338 seem to be blocked, not running.
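
One rough way to tell a spinning thread from a blocked one on Linux, short of a full profile, is to compare per-thread CPU counters between two snapshots of `/proc/<pid>/task/<tid>/stat`. The sketch below parses such lines, assuming the field layout documented in proc(5) (state is field 3, utime and stime are fields 14 and 15); the helper names and the sample lines in the usage section are invented for illustration and do not come from the attached logs.

```python
def parse_stat_line(line):
    """Parse one /proc/<pid>/task/<tid>/stat line.

    Returns (comm, state, cpu_ticks) where cpu_ticks = utime + stime.
    """
    open_paren = line.index("(")
    close_paren = line.rindex(")")   # comm may itself contain spaces or ')'
    comm = line[open_paren + 1:close_paren]
    fields = line[close_paren + 1:].split()
    state = fields[0]                # field 3: R running, S sleeping, D disk wait
    utime = int(fields[11])          # field 14: user-mode clock ticks
    stime = int(fields[12])          # field 15: kernel-mode clock ticks
    return comm, state, utime + stime


def busy_threads(before, after):
    """Given two {tid: stat_line} snapshots taken some time apart,
    return the sorted tids whose CPU tick count increased."""
    busy = []
    for tid, line in after.items():
        if tid in before:
            if parse_stat_line(line)[2] > parse_stat_line(before[tid])[2]:
                busy.append(tid)
    return sorted(busy)


if __name__ == "__main__":
    # Invented sample lines: thread 1 accrues ticks, thread 2 does not.
    snap1 = {1: "1 (a) R 0 1 1 0 -1 0 0 0 0 0 10 5 0 0",
             2: "2 (b) S 0 1 1 0 -1 0 0 0 0 0 7 3 0 0"}
    snap2 = {1: "1 (a) R 0 1 1 0 -1 0 0 0 0 0 90 5 0 0",
             2: "2 (b) S 0 1 1 0 -1 0 0 0 0 0 7 3 0 0"}
    print(busy_threads(snap1, snap2))
```

A thread whose tick count does not move between snapshots is idle or blocked; one whose count grows steadily is the place to point a profiler.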

Frederic Bezies

Reporter

Comment 22

11 years ago

Firefox is blocked. I have to kill its process using top. The log was obtained while Firefox was blocked.

Attachments

Debug log while trying to launch with last firefox debug build; 11 years ago; Frederic Bezies; 8.30 KB, text/x-log
Logs asked for in comment #16; 11 years ago; Frederic Bezies; 22.34 KB, text/plain
Log asked for in comment #25; 11 years ago; Frederic Bezies; 42.74 KB, text/plain
Try running with this patch and paste here the stderr output; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 6.86 KB, patch
Second patch to apply on top of the first one for detailed logging; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 2.63 KB, patch
Third patch to apply, to allow attaching GDB to the glxtest process; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 1.08 KB, patch
15 seconds of perf analysis; 11 years ago; Frederic Bezies; 7.73 KB, text/plain
Revert bug 972397; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 860 bytes, patch
Tentative workaround #1: XSync-instead-of-XCloseDisplay; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 1.03 KB, patch
Tentative workaround #2: dont-call-XCloseDisplay; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 1.01 KB, patch
Revert Bug 969759; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 1.32 KB, patch
Workaround: call XSync instead of XCloseDisplay unless NS_FREE_PERMANENT_DATA is defined; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 2.01 KB, patch; karlt: review+
Workaround: do nothing in MOZ_gdk_display_close; 11 years ago; Benoit Jacob [:bjacob] (mostly away); 2.97 KB, patch; karlt: review+