Bug 762449 - (jemalloc4-by-default) Enable jemalloc 4 by default
Status: REOPENED
Product: Core
Classification: Components
Component: Memory Allocator
Version: unspecified
Platform: x86_64 Linux
Importance: -- normal with 16 votes
Target Milestone: mozilla39
Assigned To: Mike Hommey [:glandium]
Mentors:
Depends on: 801536 1107677 1110484 1110505 1141761 1142403 1142414 1204148 741720 762445 762446 762448 763920 799090 799093 815071 899126 1014300 1014308 1107694 1108045 1110514 1120798 1120937 1121269 1121314 1134123 1138705 1138999 1139036 1141079 1141660 1142412 1201453 1201738 1254850
Blocks: 762451 811483 1005844 1201802 1201345
Reported: 2012-06-07 05:24 PDT by Mike Hommey [:glandium]
Modified: 2016-08-18 23:10 PDT
CC List: 69 users
See Also:
Crash Signature:
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---
disabled
fixed


Attachments
Enable jemalloc 3 by default, but don't make it ride the trains yet (1.23 KB, patch)
2015-01-12 16:01 PST, Mike Hommey [:glandium]
no flags
Enable jemalloc 3 by default, but don't make it ride the trains yet (1.20 KB, patch)
2015-01-12 16:04 PST, Mike Hommey [:glandium]
n.nethercote: review+
jemalloc.org (1.12 KB, text/plain)
2015-01-14 08:16 PST, Bob Clary [:bc:]
no flags
gecko-alloc.zip (64.94 KB, application/zip)
2015-01-20 15:10 PST, Bob Clary [:bc:]
no flags
Make jemalloc's opt.lg_dirty_mult work as documented (4.38 KB, patch)
2015-02-26 22:14 PST, Mike Hommey [:glandium]
n.nethercote: review+
Fix "result of 32-bit shift implicitly converted to 64 bits" on win64 (1.20 KB, patch)
2015-02-27 02:35 PST, Mike Hommey [:glandium]
n.nethercote: review+

Description Mike Hommey [:glandium] 2012-06-07 05:24:36 PDT

    
Comment 1 Justin Lebar (not reading bugmail) 2012-06-07 06:46:42 PDT
Although I'm no longer hopeful this will improve our fragmentation situation, it's still relevant to [MemShrink].
Comment 2 Nicholas Nethercote [:njn] 2012-06-12 16:43:48 PDT
At this point it seems like memory consumption won't improve, but it would be nice to be on the upstream version of jemalloc.
Comment 3 Rafael Ávila de Espíndola (:espindola) (not reading bugmail) 2012-07-10 18:58:00 PDT
I spent some time looking at http://dromaeo.com/?dom-modify for another bug and noticed that it was spending a lot of time spinning on a lock. Today Ehsan, Jeff and I looked at it, and it seems that the new jemalloc should improve the situation by using TLS for the data structures used in at least some allocations.

Is there an easy way to enable it on a build in OS X to check that?
Comment 4 Justin Lebar (not reading bugmail) 2012-07-10 22:00:08 PDT
Note that when we initially turn on jemalloc3, we may disable the TLS cache, since it (unsurprisingly) appears to cause a memory-usage regression.

> Is there an easy way to enable it on a build in OS X to check that?

According to bug 580408 comment 60, you need to build with MOZ_JEMALLOC to get the new jemalloc.  But I'm not sure it works (or has even been tested on) OSX.
Comment 5 Mike Hommey [:glandium] 2012-07-10 23:03:53 PDT
export MOZ_JEMALLOC=1 in your mozconfig.

(In reply to Justin Lebar [:jlebar] from comment #4)
> According to bug 580408 comment 60, you need to build with MOZ_JEMALLOC to
> get the new jemalloc.  But I'm not sure it works (or has even been tested
> on) OSX.

It was tested on all platforms. Only b2g is broken because of a toolchain problem.
Comment 6 Eric Rahm [:erahm] 2014-12-11 15:39:05 PST
Completed triage of the last 3 years of commits [1], added 3 more blockers that can be resolved on mozilla's side. 3 changesets still need to be triaged by other folks.

[1] https://docs.google.com/document/d/1YkJaXVlO4uDHKE47Iel5hT5uRDHao2dku8BCSwGbtRs/edit?usp=sharing
Comment 7 Mike Hommey [:glandium] 2014-12-30 02:00:31 PST
So here is what I think we should do, considering the holidays and the timing wrt next uplift:
- Land bug 1107694 when there is a proper fix for it. I found what's wrong there, I just don't know what the right value for the fix is.
- Switch to jemalloc 3 by default on Jan 12 or 13, after the uplift.
- Resolve the other blockers to this bug: those that can be fixed before Jan 12 can be fixed before then, but at that point I don't think it's worth technically blocking on them, as long as we fix them in the following 6 weeks, and if we don't, we can still make jemalloc 3 not ride the train. FTR, I have a bunch of WIP patches applied on my git clone for the bugs I'm assigned to; I just won't have them ready before next year because of holidays :)
Comment 8 Gabriele Svelto [:gsvelto] 2014-12-31 05:15:30 PST
FYI I'm keeping an eye on this from the FxOS side mostly to ensure we don't hit memory usage regressions because - as usual - we're dealing with devices with a very tight memory budget.
Comment 9 Mike Hommey [:glandium] 2015-01-12 16:01:59 PST
Created attachment 8547838 [details] [diff] [review]
Enable jemalloc 3 by default, but don't make it ride the trains yet
Comment 10 Mike Hommey [:glandium] 2015-01-12 16:04:44 PST
Created attachment 8547840 [details] [diff] [review]
Enable jemalloc 3 by default, but don't make it ride the trains yet

In fact, we rely on the shell variable being set too, so set it.
Comment 12 Mike Hommey [:glandium] 2015-01-12 18:39:22 PST
Had to disable on b2g:
https://hg.mozilla.org/integration/mozilla-inbound/rev/9229135ca287
Comment 13 Mike Hommey [:glandium] 2015-01-12 19:10:47 PST
And backed out:
https://hg.mozilla.org/integration/mozilla-inbound/rev/ffafa737cb7c

DMD test failures, presumably because of size classes changes:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5376229&repo=mozilla-inbound

More critically, there's an infinite loop involving a0alloc in both mac and windows:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5374694&repo=mozilla-inbound
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5376065&repo=mozilla-inbound

And there's the b2g emulator issue, but it might be related to the above, I haven't attached a debugger yet.
Comment 14 Mike Hommey [:glandium] 2015-01-12 19:18:32 PST
(In reply to Mike Hommey [:glandium] from comment #13)
> And there's the b2g emulator issue, but it might be related to the above, I
> haven't attached a debugger yet.

So with a debugger attached, it looks like it's stuck in the libc, but I don't have symbols, and downloading/building a b2g emulator build is going to take a very long time. Eric, would you mind looking at this? I've reproduced with the emulator build I got from automation with "LD_PRELOAD=/system/b2g/libmozglue.so cat"
Comment 15 Eric Rahm [:erahm] 2015-01-12 19:22:00 PST
I'll take a look at this in the morning.
Comment 16 Ryan VanderMeulen [:RyanVM] 2015-01-12 19:36:10 PST
This started on your push too. AFAICT, it was linux64 opt only.
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5375538&repo=mozilla-inbound
Comment 17 Mike Hommey [:glandium] 2015-01-12 23:54:47 PST
(In reply to Mike Hommey [:glandium] from comment #13)
> More critically, there's an infinite loop involving a0alloc in both mac and
> windows:
> https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5374694&repo=mozilla-inbound
> https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5376065&repo=mozilla-inbound

Investigated and filed https://github.com/jemalloc/jemalloc/issues/184
Comment 18 Mike Hommey [:glandium] 2015-01-13 04:00:43 PST
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #16)
> This started on your push too. AFAICT, it was linux64 opt only.
> https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5375538&repo=mozilla-inbound

I identified this one. It's either a bug in the pkcs11 loader or the pkcs11testmodule, or both. We've been really lucky it didn't happen before. I will file a bug about it tomorrow.
Comment 19 Mark Finkle (:mfinkle) (use needinfo?) 2015-01-13 05:05:25 PST
Looks like this patch regressed startup by ~200ms on Android too. Here's a quick screenshot:
http://cl.ly/image/0G1Y3s0L422c

Link to the tests:
http://phonedash.mozilla.org/#/org.mozilla.fennec/totalthrobber/local-blank/norejected/2015-01-12/2015-01-13/notcached/noerrorbars/standarderror
Comment 20 Mike Hommey [:glandium] 2015-01-13 05:19:26 PST
(In reply to Mark Finkle (:mfinkle) from comment #19)
> Looks like this patch regressed startup by ~200ms on Android too. Here's a
> quick screenshot:
> http://cl.ly/image/0G1Y3s0L422c
> 
> Link to the tests:
> http://phonedash.mozilla.org/#/org.mozilla.fennec/totalthrobber/local-blank/norejected/2015-01-12/2015-01-13/notcached/noerrorbars/standarderror

How does one test that on try?
Comment 21 Mike Hommey [:glandium] 2015-01-13 05:21:35 PST
(In reply to Mike Hommey [:glandium] from comment #14)
> (In reply to Mike Hommey [:glandium] from comment #13)
> > And there's the b2g emulator issue, but it might be related to the above, I
> > haven't attached a debugger yet.
> 
> So with a debugger attached, it looks like it's stuck in the libc, but I
> don't have symbols, and downloading/building a b2g emulator build is going
> to take a very long time. Eric, would you mind looking at this? I've
> reproduced with the emulator build I got from automation with
> "LD_PRELOAD=/system/b2g/libmozglue.so cat"

This may well be related to https://github.com/jemalloc/jemalloc/issues/184 because we don't have native tls on android and gonk, so we're effectively in the same kind of infinite-loopy setup as windows and mac.
Comment 22 Mark Finkle (:mfinkle) (use needinfo?) 2015-01-13 05:26:52 PST
(In reply to Mike Hommey [:glandium] from comment #20)
> (In reply to Mark Finkle (:mfinkle) from comment #19)
> > Looks like this patch regressed startup by ~200ms on Android too. Here's a
> > quick screenshot:
> > http://cl.ly/image/0G1Y3s0L422c
> > 
> > Link to the tests:
> > http://phonedash.mozilla.org/#/org.mozilla.fennec/totalthrobber/local-blank/norejected/2015-01-12/2015-01-13/notcached/noerrorbars/standarderror
> 
> How does one test that on try?

One does not. Bob Clary is working on getting Try working for PhoneDash, but it's not completed. In the meantime, we do one of two things:
1. Send test patches to Bob and he runs them in the PhoneDash framework
2. We try to use a local script that launches Fennec via ADB and watches for Throbber Start and Throbber Stop messages.

I forget where the script for #2 lives. Bob might have other alternatives.
Comment 23 Bob Clary [:bc:] 2015-01-13 07:41:32 PST
glandium, I can walk you through the set up of autophone if you have a slow, rooted android phone available or I can test your patches for you if you like.
Comment 24 Eric Rahm [:erahm] 2015-01-13 15:39:51 PST
(In reply to Mike Hommey [:glandium] from comment #14)
> (In reply to Mike Hommey [:glandium] from comment #13)
> > And there's the b2g emulator issue, but it might be related to the above, I
> > haven't attached a debugger yet.
> 
> So with a debugger attached, it looks like it's stuck in the libc, but I
> don't have symbols, and downloading/building a b2g emulator build is going
> to take a very long time. Eric, would you mind looking at this? I've
> reproduced with the emulator build I got from automation with
> "LD_PRELOAD=/system/b2g/libmozglue.so cat"

It appears |getprop| is deadlocked when initializing jemalloc. Our |__wrap_pthread_key_create| [1] function uses a std::map [2] which then tries to allocate memory resulting in a deadlock.

[1] https://hg.mozilla.org/mozilla-central/annotate/67257a3edeb5/mozglue/build/Nuwa.cpp#l724
[2] https://hg.mozilla.org/mozilla-central/annotate/67257a3edeb5/mozglue/build/Nuwa.cpp#l730

Full stack:
> #0  __futex_syscall3 () at bionic/libc/arch-arm/bionic/atomics_arm.S:183
> #1  0x40087264 in _normal_lock (mutex=<optimized out>) at bionic/libc/bionic/pthread.c:951
> #2  pthread_mutex_lock (mutex=0x4006f5d4) at bionic/libc/bionic/pthread.c:1041
> #3  0x400395b4 in malloc_init_hard () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/include/jemalloc/internal/mutex.h:77
> #4  0x4003a098 in je_malloc () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/src/jemalloc.c:249
> #5  0x4002a992 in std::priv::_Rb_tree<int, std::less<int>, std::pair<int const, void (*)(void*)>, std::priv::_Select1st<std::pair<int const, void (*)(void*)> >, std::priv::_MapTraitsT<std::pair<int const, void (*)(void*)> >, std::allocator<std::pair<int const, void (*)(void*)> > >::_M_create_node ()
>    at /home/erahm/dev/mozilla-central/build/stlport/stlport/stl/_new.h:134
> #6  0x4002c60a in std::priv::_Rb_tree<int, std::less<int>, std::pair<int const, void (*)(void*)>, std::priv::_Select1st<std::pair<int const, void (*)(void*)> >, std::priv::_MapTraitsT<std::pair<int const, void (*)(void*)> >, std::allocator<std::pair<int const, void (*)(void*)> > >::_M_insert(std::priv::_Rb_tree_node_base*, std::pair<int const, void (*)(void*)> const&, std::priv::_Rb_tree_node_base*, std::priv::_Rb_tree_node_base*) ()
>    at /home/erahm/dev/mozilla-central/build/stlport/stlport/stl/_tree.c:359
> #7  0x4002c6d0 in __wrap_pthread_key_create () at /home/erahm/dev/mozilla-central/build/stlport/stlport/stl/_tree.c:422
> #8  0x40043204 in je_malloc_tsd_boot0 () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/include/jemalloc/internal/tsd.h:605
> #9  0x40039600 in malloc_init_hard () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/src/jemalloc.c:1123
> #10 0x4003f0d8 in jemalloc_constructor () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/src/jemalloc.c:249
> #11 0xb0001156 in call_array (ctor=0x4006d9c8, count=0, reverse=<optimized out>) at bionic/linker/linker.c:1589
> #12 0xb0001dc6 in call_constructors (si=<optimized out>) at bionic/linker/linker.c:1619
> #13 __dl_$t () at bionic/linker/linker.c:2013
> #14 0xb00028a4 in init_library (si=<optimized out>) at bionic/linker/linker.c:1169
> #15 find_library (name=<optimized out>) at bionic/linker/linker.c:1212
> #16 0xb0001b90 in __dl_$t () at bionic/linker/linker.c:1917
> #17 0xb0002108 in __linker_init (elfdata=<optimized out>) at bionic/linker/linker.c:2200
> #18 0xb000100c in __dl__start () at bionic/linker/arch/arm/begin.S:37
> #19 0xb000100c in __dl__start () at bionic/linker/arch/arm/begin.S:37
> Backtrace stopped: previous frame identical to this frame (corrupt stack?)
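
The re-entrancy in the stack above can be modelled in a few lines. This is only a toy sketch in Python, not the jemalloc or Nuwa code: ToyAllocator, allocating_hook, non_allocating_hook and first_malloc_succeeds are hypothetical names, and a non-blocking lock acquire is used so that the sketch reports the re-entry instead of hanging the way a real pthread mutex would.

```python
import threading


class ToyAllocator:
    """Toy allocator whose first-use initialisation holds a non-reentrant lock."""

    def __init__(self, key_create_hook):
        self._init_lock = threading.Lock()
        self._initialized = False
        # Stands in for the thread-specific-data setup done during init.
        self._key_create_hook = key_create_hook

    def malloc(self, size):
        if not self._initialized:
            self._init_hard()
        return bytearray(size)

    def _init_hard(self):
        # A real mutex would block forever on re-entry from the same thread;
        # the non-blocking acquire lets this sketch raise instead.
        if not self._init_lock.acquire(blocking=False):
            raise RuntimeError("allocator init re-entered while init lock is held")
        try:
            self._key_create_hook(self)
            self._initialized = True
        finally:
            self._init_lock.release()


def allocating_hook(allocator):
    # Stands in for a key-creation wrapper that inserts into a map,
    # and therefore allocates before the allocator is initialised.
    allocator.malloc(16)


def non_allocating_hook(allocator):
    # A wrapper that needs no heap memory during allocator init.
    pass


def first_malloc_succeeds(hook):
    """Return True if the very first malloc completes with the given hook."""
    allocator = ToyAllocator(hook)
    try:
        allocator.malloc(8)
        return True
    except RuntimeError:
        return False
```

With allocating_hook the first malloc fails (the analogue of the hang), while with non_allocating_hook it completes, which suggests the fix has to avoid allocating inside the key-creation path during allocator start-up.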
Comment 25 Mike Hommey [:glandium] 2015-01-13 23:42:10 PST
Bob, would you mind checking how startup goes for the builds from this try? Thanks.
https://treeherder.mozilla.org/#/jobs?repo=try&author=mh%40glandium.org
Comment 26 Bob Clary [:bc:] 2015-01-14 06:31:42 PST
Ok, I'm testing api-9 and api-11 from http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-471936506ea6/ compared to the latest mozilla-inbound. It will take a while to download the builds, but I'll let you know as soon as I get the results.
Comment 27 Bob Clary [:bc:] 2015-01-14 08:16:13 PST
Created attachment 8548957 [details]
jemalloc.org

comparison of http://hg.mozilla.org/try/rev/471936506ea6 to http://hg.mozilla.org/integration/mozilla-inbound/rev/ad2042b4c668

twitter and blank start times are comparable between the two builds but the webappstartup test start time regressed. Strangely the galaxy s3 Android 4.0 regressed more than the nexus one Android 2.3 on the webappstartup start time.

twitter, blank and webappstartup stop times all regressed.
Comment 28 Mike Hommey [:glandium] 2015-01-15 02:45:54 PST
So, I got autophone working with Bob's help, and got somewhat plausible results despite the huge stddev. Then, since it was all slow, I factory reset the phone and cleaned up its SD card. The phone ended up much faster (for instance, I don't need autophone config adjustments because of slow reboot anymore), but now the results are completely unusable: the stddev is still big, and the jemalloc3 builds end up with better results than the mozjemalloc builds... which makes it hard to investigate what's wrong.
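
One rough way to decide whether two sets of start-up timings can be told apart at all is to compare the difference of the means against the combined standard error. The sketch below is an assumption about how one might check this, not what autophone or phonedash compute; the function name and the factor k=2.0 are hypothetical.

```python
import math
import statistics


def difference_is_resolvable(samples_a, samples_b, k=2.0):
    """Return True if the means differ by more than k combined standard errors."""
    mean_diff = abs(statistics.mean(samples_a) - statistics.mean(samples_b))
    # Standard error of each mean: sample stddev divided by sqrt(n).
    se_a = statistics.stdev(samples_a) / math.sqrt(len(samples_a))
    se_b = statistics.stdev(samples_b) / math.sqrt(len(samples_b))
    combined_se = math.sqrt(se_a ** 2 + se_b ** 2)
    return mean_diff > k * combined_se
```

When the stddev is large relative to the difference between builds, this returns False, and more runs (or a quieter device) are needed before drawing conclusions.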

That being said, I have a theory, so I'd appreciate a test of those two try builds:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-36dd4fb91b48/try-android-api-11/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-c00ce6c720e1/try-android-api-11/
Comment 30 Bob Clary [:bc:] 2015-01-15 08:37:59 PST
glandium, it would help to get both api-9 and api-11 builds so I can test using both my nexus one and my gs3.

I posted these to phonedash-dev so you can see the graphs and ran your two try builds (the first didn't have any builds) along with the latest mozilla-inbound for comparison.

http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/local-blank/norejected/2015-01-15/2015-01-15/notcached/errorbars/standarderror

http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/local-twitter/norejected/2015-01-15/2015-01-15/notcached/errorbars/standarderror

http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/webappstartup/norejected/2015-01-15/2015-01-15/notcached/errorbars/standarderror

As you can see, throbber stop regressed with both try builds on s1s2 blank and twitter and both throbber start and stop regressed on webappstartup.
Comment 31 Hannes Verschore [:h4writer] 2015-01-15 12:40:45 PST
FYI I was looking at the browsermark 2.1 knockout benchmark score. The difference between v8 and spidermonkey is the time we spend in the first loop of "compareSmallArrayToBigArray", which is currently mostly MinorGC ("freeHugeSlots"), which only calls js_free.

I created a js shell benchmark out of it in bug 1118938. The numbers should somewhat relate to the full browsermark, but the scores reported here are from the shell benchmark.

On trunk we have scores of 2900ms, with the first loop taking 1000ms. (On linux the loop takes 500ms, due to faster js_free).
I was told by ehoogeveen to test jemalloc3, since it could potentially improve scores. Bad luck here:
the total score becomes 3398ms, while the loop now takes 1422ms!
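
To put those numbers in proportion, here is a small arithmetic sketch using only the four timings quoted above; the helper name percent_change is hypothetical.

```python
def percent_change(before_ms, after_ms):
    """Relative change from before_ms to after_ms, in percent."""
    return (after_ms - before_ms) / before_ms * 100.0


# Timings quoted in this comment (mozjemalloc on trunk vs jemalloc3).
total_before, total_after = 2900, 3398
loop_before, loop_after = 1000, 1422

total_regression = percent_change(total_before, total_after)
loop_regression = percent_change(loop_before, loop_after)

# Share of the total slowdown that comes from the first loop alone.
loop_share = (loop_after - loop_before) / (total_after - total_before) * 100.0

print(f"total: +{total_regression:.1f}%")
print(f"first loop: +{loop_regression:.1f}%")
print(f"share of slowdown in first loop: {loop_share:.1f}%")
```

The total regresses by about 17% and the first loop by about 42%, so roughly 85% of the slowdown (422 of 498 ms) is in the loop dominated by js_free.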
Comment 32 Mike Hommey [:glandium] 2015-01-15 20:11:41 PST
Bob, can you get numbers for these builds?
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-aef35ebef737/
Comment 33 Bob Clary [:bc:] 2015-01-15 21:22:54 PST
You can see the various graphs at http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/local-blank/norejected/2015-01-13/2015-01-15/notcached/errorbars/standarderror

You can select the different tests local blank, local twitter, webappstartup and look at the start and stop times. The regression pattern remains pretty much the same.
Comment 34 Mike Hommey [:glandium] 2015-01-20 01:37:49 PST
Bob, could you test these two trys:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-31a67cc812d5/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-592a30a8a738/

This time, though, I'm interested in the logcat.
Comment 35 Bob Clary [:bc:] 2015-01-20 09:01:57 PST
(In reply to Mike Hommey [:glandium] from comment #34)
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-31a67cc812d5/

This one is failing to get any measurements of the Throbber times and is generating literally millions of E/GeckoAlloc( 3342): overflow messages in logcat. It is taking quite a while to complete. It doesn't look like the logcat contains anything else of use.
Comment 36 Bob Clary [:bc:] 2015-01-20 09:25:32 PST
(In reply to Mike Hommey [:glandium] from comment #34)

> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-592a30a8a738/

This one also has the E/GeckoAlloc( 1566): overflow issue. It is still early in the run, but it doesn't look like this will get any measurements either. I'll let it continue though just in case.
Comment 37 Bob Clary [:bc:] 2015-01-20 15:10:50 PST
Created attachment 8552061 [details]
gecko-alloc.zip

GeckoAlloc non-overflow messages only
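
For anyone wanting to reproduce this kind of filtering, a minimal sketch follows. It assumes logcat's "brief" line format (priority/tag( pid): message), as in the E/GeckoAlloc( 3342): overflow lines quoted earlier; the function name and the way the input lines are obtained are hypothetical, and this is not the script used to produce the attachment.

```python
import re

# Matches logcat "brief" lines such as: E/GeckoAlloc( 3342): overflow
LOGCAT_BRIEF = re.compile(r"^([VDIWEF])/([^(]+)\(\s*(\d+)\):\s?(.*)$")


def non_overflow_messages(lines, tag="GeckoAlloc"):
    """Keep lines logged under `tag` whose message is not just 'overflow'."""
    kept = []
    for raw in lines:
        line = raw.rstrip("\n")
        match = LOGCAT_BRIEF.match(line)
        if match is None:
            continue  # not a brief-format logcat line
        _priority, line_tag, _pid, message = match.groups()
        if line_tag.strip() == tag and message.strip() != "overflow":
            kept.append(line)
    return kept
```

Usage would be along the lines of non_overflow_messages(open(path)) for a saved logcat file, where path is whatever file the log was captured to.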
Comment 38 Mike Hommey [:glandium] 2015-01-20 17:24:35 PST
When they're up, please test those builds:
 http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-d135ff58d0e7
 http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-dee0ae7edb6b

I'm only interested in the GeckoAlloc logs this time as well.
Comment 39 Mike Hommey [:glandium] 2015-01-20 19:45:21 PST
(In reply to Mike Hommey [:glandium] from comment #38)
> When they're up, please test those builds:
>  http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-d135ff58d0e7
>  http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-dee0ae7edb6b
> 
> I'm only interested in the GeckoAlloc logs this time as well.

If d135ff58d0e7 doesn't get throbber times, can you also try this one:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-3b57b77accb7/
Comment 40 Bob Clary [:bc:] 2015-01-21 07:07:55 PST
provided info via irc
Comment 41 Mike Hommey [:glandium] 2015-01-21 13:44:19 PST
I think I finally got builds that can get the info I want. Please test those:
http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/mh@glandium.org-042225505427/
http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/mh@glandium.org-952452fa15ad/

For those, I'd be interested in their score and their GeckoAlloc output. Thanks.
Comment 42 Bob Clary [:bc:] 2015-01-21 15:35:32 PST
done
Comment 43 Mike Hommey [:glandium] 2015-01-22 00:52:17 PST
I did two more try builds, with the autophone trigger, but it didn't trigger anything for the nexus one and the gs3 :(
Could you test these?
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-64566756bf92
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-9d46e7b57308
Comment 44 Bob Clary [:bc:] 2015-01-22 06:50:29 PST
The nexus-one-3 and samsung-gs3-3 devices are my local devices and won't have picked up your try builds unless I had set them up for it at the time and had my local instance of autophone running when you submitted them.

Several of your try builds have completed testing and are available at http://phonedash.mozilla.org/#/org.mozilla.fennec/throbberstop/local-twitter/norejected/2015-01-21/2015-01-21/cached/noerrorbars/standarderror/try

You still have outstanding jobs for nexus-s-3, nexus-5-kot49h-1, nexus-s-4, nexus-5-kot49h-3 and nexus-s-5 for

http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-4b002c559b19/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-64566756bf92/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-9d46e7b57308/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-c789abcd50e7/

To see only your builds, click on the legend items on the left for the non try builds to hide their series. 

To get the logs you will need to visit the staging treeherder instance for now, click on the relevant test and look in the job details panel for links to the logcat. Note the tombstones signifying crashes of some type. See

https://treeherder.allizom.org/ui/#/jobs?repo=try&revision=64566756bf92
https://treeherder.allizom.org/ui/#/jobs?repo=try&revision=9d46e7b57308
https://treeherder.allizom.org/ui/#/jobs?repo=try&revision=4b002c559b19
https://treeherder.allizom.org/ui/#/jobs?repo=try&revision=c789abcd50e7

The workers are a bit behind at the moment. Several disconnected and I will need to reboot the system.
Comment 45 Bob Clary [:bc:] 2015-01-23 08:23:09 PST
Autophone is now available in
http://trychooser.pub.build.mozilla.org/ to help with your try
syntax. Please limit your tests to the ones you need and don't
use the mochitests unless you really need them. Also remember we
don't have many devices and they are usually pretty busy so
please don't DOS them. I'll work on the documentation in the next
few days but this should work for most of your cases:

try: -b o -p android-api-9,android-api-11 -u autophone-s1s2 -t none

Until Autophone begins reporting to production Treeherder, you
can find logs etc on the staging instance
https://treeherder.allizom.org.

Let me know if you have issues.
Comment 46 Mike Hommey [:glandium] 2015-01-26 20:55:40 PST
(In reply to Bob Clary [:bc:] from comment #45)
> Autophone is now available in
> http://trychooser.pub.build.mozilla.org/ to help with your try
> syntax. Please limit your tests to the ones you need and don't
> use the mochitests unless you really need them. Also remember we
> don't have many devices and they are usually pretty busy so
> please don't DOS them. I'll work on the documentation in the next
> few days but this should work for most of your cases:
> 
> try: -b o -p android-api-9,android-api-11 -u autophone-s1s2 -t none
> 
> Until Autophone begins reporting to production Treeherder, you
> can find logs etc on the staging instance
> https://treeherder.allizom.org.
> 
> Let me know if you have issues.

Unfortunately, the results of my recent try builds don't show up properly on treeherder.allizom.org, so I can't get to their logcat :(
https://treeherder.allizom.org/#/jobs?repo=try&revision=790f61d4a2d8
https://treeherder.allizom.org/#/jobs?repo=try&revision=ceea827908dd
Comment 47 Bob Clary [:bc:] 2015-01-27 07:44:03 PST
glandium: Sorry about the problems. Your first build appears on treeherder.allizom.org but the second doesn't.

Staging treeherder had an issue with netflows when they enabled the new db setup. I've gotten permission to submit to treeherder production with the jobs automatically hidden. I'd held off until I could find a clean switch over time but with the recent load, I don't think I'll ever find a perfect time. So, I've switched to reporting to treeherder production.

All new builds will report to treeherder.mozilla.org and any existing builds in the job queue will begin reporting there as well. We should have a much improved system availability and stability on production going forward.
Comment 48 Mike Hommey [:glandium] 2015-02-26 22:14:15 PST
Created attachment 8570335 [details] [diff] [review]
Make jemalloc's opt.lg_dirty_mult work as documented

This applies https://github.com/jemalloc/jemalloc/pull/192/ to our tree.
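
For context, the jemalloc manual describes opt.lg_dirty_mult as the base-2 logarithm of the per-arena minimum ratio of active to dirty pages, with -1 disabling purging. The sketch below only illustrates that documented rule and is not taken from the patch; the function names are hypothetical, and the manual's allowance for up to one chunk worth of dirty pages is ignored.

```python
def max_dirty_pages(active_pages, lg_dirty_mult):
    """Dirty pages tolerated for a given number of active pages.

    Returns None when purging is disabled (negative lg_dirty_mult).
    """
    if lg_dirty_mult < 0:
        return None
    # A minimum active:dirty ratio of 2**lg_dirty_mult : 1 means dirty
    # pages may not exceed active_pages / 2**lg_dirty_mult.
    return active_pages >> lg_dirty_mult


def should_purge(active_pages, dirty_pages, lg_dirty_mult):
    """True when dirty pages exceed what the documented ratio allows."""
    limit = max_dirty_pages(active_pages, lg_dirty_mult)
    return limit is not None and dirty_pages > limit
```

For example, with lg_dirty_mult = 3 (an 8:1 ratio) and 1024 active pages, up to 128 dirty pages would be tolerated under this reading.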
Comment 49 Nicholas Nethercote [:njn] 2015-02-26 22:24:54 PST
Comment on attachment 8570335 [details] [diff] [review]
Make jemalloc's opt.lg_dirty_mult work as documented

Review of attachment 8570335 [details] [diff] [review]:
-----------------------------------------------------------------

rs=me
Comment 50 Mike Hommey [:glandium] 2015-02-27 02:35:29 PST
Created attachment 8570394 [details] [diff] [review]
Fix "result of 32-bit shift implicitly converted to 64 bits" on win64

memory/build has fatal warnings, and this warning hits win64 only. I don't know how I didn't get this error before...
Comment 53 Mike Hommey [:glandium] 2015-03-11 23:45:56 PDT
Actually, let's keep this bug open to track definitely enabling jemalloc3 (as in, let it ride the trains)
Comment 54 Mike Hommey [:glandium] 2015-03-18 23:55:52 PDT
Disabled jemalloc 3 for now.
https://hg.mozilla.org/integration/mozilla-inbound/rev/0a11b73c77b7
Comment 55 Joel Maher ( :jmaher) 2015-03-25 06:21:19 PDT
this has a lot of improvements and 2 private byte regressions on linux (23 + 64):
http://alertmanager.allizom.org:8080/alerts.html?rev=0a11b73c77b7&showAll=1&testIndex=0&platIndex=0
Comment 56 Joel Maher ( :jmaher) 2015-03-25 06:23:59 PDT
as a note, these correspond to the improvements when this originally landed:
http://alertmanager.allizom.org:8080/alerts.html?rev=a1a89ff4ee31&showAll=1&testIndex=0&platIndex=0
Comment 57 Florian Bender 2015-08-10 12:57:34 PDT
Mike, any estimate when this can be re-enabled? Bug 1005844 is waiting on it.
Comment 59 Carsten Book [:Tomcat] - PTO-back Sept 4th 2015-09-04 07:09:50 PDT
https://hg.mozilla.org/mozilla-central/rev/8b380feae2ae
Comment 60 Morris Tseng [:mtseng] [:Morris] 2015-09-17 00:36:00 PDT
On my local machines (Ubuntu 15.04 and Ubuntu 14.10), I got a start-up crash with my custom build. The call stack is as follows:

#0  0x0000000000439002 in run_quantize (size=0) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:96
#1  0x00000000004396ef in run_quantize_first (size=4096) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:155
#2  0x0000000000446857 in arena_run_first_best_fit (arena=0x7ffff6a00180, size=4096) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1073
#3  0x0000000000446fbb in arena_run_alloc_small_helper (arena=0x7ffff6a00180, size=4096, binind=20) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1129
#4  0x0000000000447101 in arena_run_alloc_small (arena=0x7ffff6a00180, size=4096, binind=20) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1148
#5  0x0000000000454b2e in arena_bin_nonfull_run_get (arena=0x7ffff6a00180, bin=0x7ffff6a01750) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1898
#6  0x0000000000454c58 in arena_bin_malloc_hard (arena=0x7ffff6a00180, bin=0x7ffff6a01750) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1940
#7  0x00000000004558c4 in je_arena_malloc_small (arena=0x7ffff6a00180, size=1024, zero=true) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:2153
#8  0x00000000004f8234 in je_arena_malloc (tcache=0x0, zero=true, size=1024, arena=0x7ffff6a00180, tsd=0x7ffff7fe66b0)
    at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/include/jemalloc/internal/arena.h:1145
#9  je_iallocztm (arena=0x0, is_metadata=false, tcache=0x0, zero=true, size=1024, tsd=0x7ffff7fe66b0) at src/include/jemalloc/internal/jemalloc_internal.h:887
#10 je_icalloc (size=1024, tsd=0x7ffff7fe66b0) at src/include/jemalloc/internal/jemalloc_internal.h:920
#11 je_calloc (num=1, size=1024) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/jemalloc.c:1663
#12 0x0000000000423b4b in calloc (num=1, size=1024) at /home/morris/mozilla/gecko-dev/memory/build/replace_malloc.c:181
#13 0x00007ffff65c8890 in PR_Calloc (nelem=1, elsize=1024) at /home/morris/mozilla/gecko-dev/nsprpub/pr/src/malloc/prmem.c:443
#14 0x00007ffff65c69ce in PR_SetThreadPrivate (index=2, priv=0x7ffff6937f98) at /home/morris/mozilla/gecko-dev/nsprpub/pr/src/threads/prtpd.c:161
#15 0x00007fffe4008c2d in mozilla::BlockingResourceBase::ResourceChainAppend (this=0x7ffff6937f98, aPrev=0x0) at ../../dist/include/mozilla/BlockingResourceBase.h:181
#16 0x00007fffe400381d in mozilla::BlockingResourceBase::Acquire (this=0x7ffff6937f98) at /home/morris/mozilla/gecko-dev/xpcom/glue/BlockingResourceBase.cpp:322
#17 0x00007fffe40039f8 in mozilla::OffTheBooksMutex::Lock (this=0x7ffff6937f98) at /home/morris/mozilla/gecko-dev/xpcom/glue/BlockingResourceBase.cpp:383
#18 0x00007fffe3e92f02 in mozilla::Monitor::Lock (this=0x7ffff6937f98) at ../../dist/include/mozilla/Monitor.h:35
#19 0x00007fffe3e92f62 in mozilla::MonitorAutoLock::MonitorAutoLock (this=0x7ffff7fe5e40, aMonitor=...) at ../../dist/include/mozilla/Monitor.h:78
#20 0x00007fffe4080b59 in mozilla::net::ClosingService::ThreadFunc (this=0x7ffff6937f80) at /home/morris/mozilla/gecko-dev/netwerk/base/ClosingService.cpp:206
#21 0x00007fffe4097f2e in mozilla::net::ClosingService::ThreadFunc (aClosure=0x7ffff6937f80) at /home/morris/mozilla/gecko-dev/netwerk/base/ClosingService.h:52
#22 0x00007ffff65e557e in _pt_root (arg=0x7ffff6855fc0) at /home/morris/mozilla/gecko-dev/nsprpub/pr/src/pthreads/ptthread.c:212
#23 0x00007ffff7bc26aa in start_thread (arg=0x7ffff7fe6700) at pthread_create.c:333
#24 0x00007ffff6ec6eed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109


Reverting to jemalloc3 makes everything fine. But on my Ubuntu VM (host OS is Mac), jemalloc4 works without problems.
Comment 61 Mike Hommey [:glandium] 2015-09-17 00:42:59 PDT
This likely is bug 1205016
Comment 62 Jan Beich 2015-12-03 09:55:49 PST
(In reply to Carsten Book [:Tomcat] from comment #59)
> https://hg.mozilla.org/mozilla-central/rev/8b380feae2ae

Why is this bug FIXED when the change was backed out in bug 1205249 and never came back?
