Closed
Bug 670175
Opened 13 years ago
Closed 13 years ago
[10.7] Cannot start nightly trunk build after updating from 20110707 build
Categories
(Core :: Memory Allocator, defect)
RESOLVED
WORKSFORME
People
(Reporter: marcia, Unassigned)
References
(Blocks 1 open bug)
Details
(Keywords: hang, regression)
Crash Data
Attachments
(3 files)
I updated from 20110707 to 20110708 and now I cannot launch the trunk build. Either it hangs at the Profile Manager and has to be force quit, or it does launch and then hangs instantly when I try to load any site, and I have to force quit. I tried with several new profiles and the same thing happens.
I updated on my 10.6 machine and had no issues there.
Reporter
Comment 1•13 years ago
Possibly related crash report on IRC: https://crash-stats.mozilla.com/report/index/298c0a59-a3ce-4e37-9213-496092110708
I am going to try the other 10.7 machine in the lab as well to see what happens there. Both machines are running 11A511.
Comment 2•13 years ago
I just downloaded today's mozilla-central nightly (firefox-2011-07-08-03-08-00-mozilla-central) separately, and had no trouble running it on the 10.7 GM (build 11A511).
Later I'll try explicitly updating from yesterday's nightly.
Does it make any difference to use a clean profile?
Comment 3•13 years ago
Oops, I spoke too soon.
I tried today's nightly with a fresh profile, and it loaded. But then it hung as soon as I tried to visit http://www.apple.com/.
Reporter
Comment 4•13 years ago
I tried using a clean profile but I could not get it to launch once I had already updated to today's build.
The 10.7 machine in the lab is exhibiting the same behavior after updating. The build hangs and I have to force quit. When I try to create a new profile it hangs at the profile manager and does not let me get beyond that step.
Summary: [10.7] Cannot start nightly trunk build → [10.7] Cannot start nightly trunk build after updating from 20110707 build
Reporter
Comment 5•13 years ago
Possible regression range (pushlog): http://hg.mozilla.org/mozilla-central/pushloghtml?fromchange=b2622d5c857a&tochange=6e461b1419b8
Keywords: hang, regression
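For anyone narrowing a regression range the same way, here is a minimal sketch of how a pushlog URL like the one above can be assembled from two changeset IDs. The base URL and the `fromchange`/`tochange` parameters are taken from the link in this comment; the helper name `pushlog_url` is hypothetical and only for illustration.

```python
from urllib.parse import urlencode

# Base URL as it appears in the link posted in this bug.
PUSHLOG_BASE = "http://hg.mozilla.org/mozilla-central/pushloghtml"


def pushlog_url(from_change: str, to_change: str) -> str:
    """Return a pushlog URL covering the pushes between two changesets."""
    query = urlencode({"fromchange": from_change, "tochange": to_change})
    return f"{PUSHLOG_BASE}?{query}"


# The two changeset IDs quoted in this comment.
print(pushlog_url("b2622d5c857a", "6e461b1419b8"))
```

The resulting page lists every push in the range, which is how the jemalloc changeset was spotted later in this bug.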
Comment 6•13 years ago
The way I create a clean profile is to delete or rename ~/Library/Application Support/Firefox -- which is (probably) why I don't see your Profile Manager hang.
I'm building an opt build with debug symbols (from current trunk code), to see what I can find out.
Comment 7•13 years ago
I just tested today's aurora nightly, and it doesn't have this problem.
Big sigh of relief!
Reporter
Comment 8•13 years ago
http://tinyurl.com/6goc6mr shows up in crash stats today as a new signature for 10.7 only - users other than the one in Comment 1 are hitting it.
Adding Paul since I see libjemalloc.dylib in the module list up at the top.
Comment 9•13 years ago
I can no longer reproduce this bug as we've been describing it ... for reasons I can't fathom.
But now I'm seeing something else just as bad: I start today's mozilla-central nightly, wait about 30 seconds, and move the mouse -- then I either crash or hang.
jemalloc sounds like a plausible reason for these problems. I'll try disabling it (in source code) and see if that makes any difference.
Comment 10•13 years ago
Here's a stack I got crashing in gdb (using my opt build with debug symbols).
And here are two crash stacks from using today's nightly:
bp-dbd2afff-b240-443e-aa49-4c8d42110708
bp-4dca78b0-9389-435d-96ec-874f82110708
Interesting that both of the latter are in font code ... but I suspect that's not relevant.
Comment 11•13 years ago
Turning off jemalloc on the Mac made my problems go away.
I'm doing a tryserver build, which should be available tomorrow morning. Marcia, you can test with it once it's available.
Comment 12•13 years ago
The timing seems to match the enabling of jemalloc on Mac in bug 414946:
1ad1fd67e97a 2011-07-07 14:38 -0700 Paul Biggar - Bug 414946 (part 2): Enable jemalloc on Mac (r=pavlov)
2b2f584dc5fd 2011-05-21 20:27 -0700 Paul Biggar - Bug 414946 (part 1): Fix jemalloc on Mac, but leave disabled (r=pavlov)
Comment 13•13 years ago
Steven, can you try the original crashing builds with the NO_MAC_JEMALLOC environment variable set? Also, when you say "turning off jemalloc" in comment 11, can you describe what you did (I'm guessing reverting "part 2" (1ad1fd67e97a))?
Comment 14•13 years ago
(In reply to comment #10)
> Created attachment 544890 [details]
> Gdb crash stack
>
> Here's a stack I got crashing in gdb (using my opt build with debug symbols).
Thanks for this. This points very strongly to jemalloc being the culprit. I'm investigating another jemalloc-related bug now (talos tp5 regression on 10.5), and will get back to you shortly.
Comment 15•13 years ago
> Steven, can you try the original crashing builds with the
> NO_MAC_JEMALLOC environment variable set?
I'll do that (I didn't realize it was possible to turn off jemalloc
without altering the code).
> Also, when you say "turning off jemalloc" in comment 11, can you
> describe what you did (I'm guessing reverting "part 2"
> (1ad1fd67e97a))?
Reverting "part 2" is exactly what I did.
Comment 16•13 years ago
(Following up comment #9 and comment #15)
> But now I'm seeing something else just as bad: I start today's
> mozilla-central nightly, wait about 30 seconds, and move the mouse
> -- then I either crash or hang.
Here are more precise STR:
1) Start FF, then after the main window comes up wait for 30 seconds
without doing anything.
2) Move the mouse to a location where the cursor would normally change
shape -- for example over the location bar, where it normally
changes from an arrow to an I-beam.
At this point I normally hang or crash.
I don't hang or crash if I run firefox-bin from a Terminal prompt, and
prior to that enter the following at the command line:
export NO_MAC_JEMALLOC=YES
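For repeated testing, the same workaround can be scripted instead of typed at the prompt. The following is only a sketch of one way to do that: the variable name and the value `YES` come from this comment, while the helper names (`launch_env`, `launch`) and the idea of passing the path to firefox-bin as an argument are assumptions made for illustration. Whether the value matters, or only the presence of the variable, is not stated in this bug.

```python
import os
import subprocess


def launch_env(disable_jemalloc, base_env=None):
    """Return a copy of the environment with NO_MAC_JEMALLOC set or cleared."""
    env = dict(os.environ if base_env is None else base_env)
    if disable_jemalloc:
        # Same setting as `export NO_MAC_JEMALLOC=YES` in the comment above.
        env["NO_MAC_JEMALLOC"] = "YES"
    else:
        env.pop("NO_MAC_JEMALLOC", None)
    return env


def launch(firefox_bin, disable_jemalloc=True):
    """Start firefox-bin (path supplied by the caller) with the chosen environment."""
    return subprocess.Popen([firefox_bin], env=launch_env(disable_jemalloc))
```

Running the same build once with `disable_jemalloc=True` and once with `False` gives a direct A/B comparison of the STR without touching the shell's own environment.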
Comment 17•13 years ago
> Here are more precise STR:
>
> 1) Start FF, then after the main window comes up wait for 30 seconds
> without doing anything.
>
> 2) Move the mouse to a location where the cursor would normally
> change shape -- for example over the location bar, where it
> normally changes from an arrow to an I-beam.
>
> At this point I normally hang or crash.
This *still* isn't quite right. Here's better:
1) Start FF, then after the main window appears (the default
about:home page), wait until the cursor I-beam in the Google search
box momentarily stops flashing.
2) Quickly (before the I-beam cursor starts flashing again) move the
mouse over "about:home" in the location bar.
At this point you'll normally hang.
For some reason it's quite difficult to reproduce these STR in gdb.
It helps if you first 'set args -foreground' in gdb (to make FF run in
the foreground).
The "hang" doesn't necessarily last forever.
Comment 18•13 years ago
Comment 19•13 years ago
Reporter
Updated•13 years ago
Crash Signature: [@ libsystem_c.dylib@0x6ac31 ]
Comment 20•13 years ago
I don't think this happens on all 10.7 machines, as I tried it on jruderman's machine (prerelease 10.7) and didn't have a problem.
Can I get access to a machine on which this happens? Do we have one in our test lab?
Comment 21•13 years ago
You need to find a machine with the GM DP (build 11A511). That's what both Marcia and I have been testing with.
Reporter
Comment 22•13 years ago
I have a machine in the QA lab that has the Gold Master installed.
Comment 23•13 years ago
I tried to duplicate this in the test lab, but couldn't. A good way to move forward is to disable jemalloc on 10.7, and keep it enabled on 10.6. Here's a nightly build with that change made:
https://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/pbiggar@mozilla.com-933061c83fd2/
Can you confirm that you can no longer replicate this bug?
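The policy described here (off on 10.7, on for 10.6) can be modelled as a small check on the Darwin kernel release, where Darwin 10.x corresponds to Mac OS X 10.6 and Darwin 11.x to 10.7. This is a sketch of the decision only, not the actual patch in the try build; the function names are hypothetical, and treating every release other than Darwin 11 as "enabled" is an assumption based on the wording above.

```python
import platform


def darwin_major(release: str) -> int:
    """Extract the major number from a Darwin release string such as '11.0.0'."""
    return int(release.split(".", 1)[0])


def jemalloc_enabled(release: str) -> bool:
    """Model the policy in this comment: off on 10.7 (Darwin 11), on otherwise."""
    # Assumption: only Darwin 11.x (Mac OS X 10.7) is excluded.
    return darwin_major(release) != 11


if platform.system() == "Darwin":
    # platform.release() reports the Darwin kernel release on Mac OS X.
    print("jemalloc enabled:", jemalloc_enabled(platform.release()))
```

A check like this is also handy on the QA side for confirming which lab machines fall into the affected group before running the STR.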
Comment 24•13 years ago
> Can you confirm that you can no longer replicate this bug?
I can't (testing on the 10.7 GM, build 11A511).
Comment 25•13 years ago
Um, slight ambiguity there. I think you can no longer replicate the bug?
Comment 26•13 years ago
Oops, sorry.
I can no longer reproduce the bug with your tryserver build.
Though I still can reproduce it with the 2011-07-08 nightly (the first nightly with jemalloc), using my STR from comment #17.
Updated•13 years ago
Blocks: lion-compatibility
Comment 27•13 years ago
Since we backed out jemalloc on 10.7, this is worksforme. We're going to track enabling jemalloc on 10.7 in bug 694896.
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → WORKSFORME