Closed
Bug 762449
(jemalloc4-by-default)
Opened 12 years ago
Closed 7 years ago
Enable jemalloc 4 by default
Categories
(Core :: Memory Allocator, defect)
Tracking
()
RESOLVED
WONTFIX
mozilla39
People
(Reporter: glandium, Unassigned)
References
(Blocks 2 open bugs)
Details
Attachments
(5 files, 1 obsolete file)
1.20 KB, patch | n.nethercote: review+
1.12 KB, text/plain
64.94 KB, application/zip
4.38 KB, patch | n.nethercote: review+
1.20 KB, patch | n.nethercote: review+
No description provided.
Comment 1•12 years ago
|
||
Although I'm no longer hopeful this will improve our fragmentation situation, it's still relevant to [MemShrink].
Whiteboard: [MemShrink]
Comment 2•12 years ago
|
||
At this point it seems like memory consumption won't improve, but it would be nice to be on the upstream version of jemalloc.
Whiteboard: [MemShrink]
I spent some time looking at http://dromaeo.com/?dom-modify for another bug and noticed that it was spending a lot of time spinning on a lock. Today Ehsan, Jeff and I looked at it, and it seems that the new jemalloc should improve the situation by using TLS for the data structures used in at least some allocations. Is there an easy way to enable it on a build in OS X to check that?
Comment 4•12 years ago
|
||
Note that when we initially turn on jemalloc3, we may disable the TLS cache, since it (unsurprisingly) appears to cause a memory-usage regression.
> Is there an easy way to enable it on a build in OS X to check that?
According to bug 580408 comment 60, you need to build with MOZ_JEMALLOC to get the new jemalloc. But I'm not sure it works (or has even been tested on) OSX.
Reporter | ||
Comment 5•12 years ago
|
||
export MOZ_JEMALLOC=1 in your mozconfig.
(In reply to Justin Lebar [:jlebar] from comment #4)
> According to bug 580408 comment 60, you need to build with MOZ_JEMALLOC to get the new jemalloc. But I'm not sure it works (or has even been tested on) OSX.
It was tested on all platforms. Only b2g is broken because of a toolchain problem.
Updated•12 years ago
|
Blocks: replace-malloc
Reporter | ||
Updated•11 years ago
|
No longer blocks: replace-malloc
Reporter | ||
Updated•10 years ago
|
Alias: jemalloc3-default
Reporter | ||
Updated•10 years ago
|
Alias: jemalloc3-default → jemalloc3-by-default
Comment 6•9 years ago
|
||
Completed triage of the last 3 years of commits [1], added 3 more blockers that can be resolved on mozilla's side. 3 changesets still need to be triaged by other folks. [1] https://docs.google.com/document/d/1YkJaXVlO4uDHKE47Iel5hT5uRDHao2dku8BCSwGbtRs/edit?usp=sharing
Reporter | ||
Comment 7•9 years ago
|
||
So here is what I think we should do, considering the holidays and the timing wrt next uplift:
- Land bug 1107694 when there is a proper fix for it. I found what's wrong there, I just don't know what the right value for the fix is.
- Switch to jemalloc 3 by default on Jan 12 or 13, after the uplift.
- Resolve the other blockers to this bug: those that can be fixed before Jan 12 can be fixed before then, but at that point I don't think it's worth technically blocking on them, as long as we fix them in the following 6 weeks, and if we don't, we can still make jemalloc 3 not ride the train.
FTR, I have a bunch of WIP patches applied on my git clone for the bugs I'm assigned to; I just won't have them ready before next year because of holidays :)
Comment 8•9 years ago
|
||
FYI I'm keeping an eye on this from the FxOS side mostly to ensure we don't hit memory usage regressions because - as usual - we're dealing with devices with a very tight memory budget.
Reporter | ||
Comment 9•9 years ago
|
||
Assignee: nobody → mh+mozilla
Attachment #8547838 -
Flags: review?(n.nethercote)
Reporter | ||
Comment 10•9 years ago
|
||
In fact, we rely on the shell variable being set too, so set it.
Attachment #8547838 -
Attachment is obsolete: true
Attachment #8547838 -
Flags: review?(n.nethercote)
Attachment #8547840 -
Flags: review?(n.nethercote)
Updated•9 years ago
|
Attachment #8547840 -
Flags: review?(n.nethercote) → review+
Reporter | ||
Comment 11•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/cf4eb744f2e1
Reporter | ||
Comment 12•9 years ago
|
||
Had to disable on b2g: https://hg.mozilla.org/integration/mozilla-inbound/rev/9229135ca287
Reporter | ||
Comment 13•9 years ago
|
||
And backed out: https://hg.mozilla.org/integration/mozilla-inbound/rev/ffafa737cb7c
DMD test failures, presumably because of size classes changes: https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5376229&repo=mozilla-inbound
More critically, there's an infinite loop involving a0alloc in both mac and windows:
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5374694&repo=mozilla-inbound
https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5376065&repo=mozilla-inbound
And there's the b2g emulator issue, but it might be related to the above, I haven't attached a debugger yet.
Reporter | ||
Comment 14•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #13)
> And there's the b2g emulator issue, but it might be related to the above, I haven't attached a debugger yet.
So with a debugger attached, it looks like it's stuck in the libc, but I don't have symbols, and downloading/building a b2g emulator build is going to take a very long time. Eric, would you mind looking at this? I've reproduced with the emulator build I got from automation with "LD_PRELOAD=/system/b2g/libmozglue.so cat"
Comment 16•9 years ago
|
||
This started on your push too. AFAICT, it was linux64 opt only. https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5375538&repo=mozilla-inbound
Reporter | ||
Comment 17•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #13)
> More critically, there's an infinite loop involving a0alloc in both mac and windows:
> https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5374694&repo=mozilla-inbound
> https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5376065&repo=mozilla-inbound
Investigated and filed https://github.com/jemalloc/jemalloc/issues/184
Reporter | ||
Comment 18•9 years ago
|
||
(In reply to Ryan VanderMeulen [:RyanVM UTC-5] from comment #16)
> This started on your push too. AFAICT, it was linux64 opt only.
> https://treeherder.mozilla.org/ui/logviewer.html#?job_id=5375538&repo=mozilla-inbound
I identified this one. It's either a bug in the pkcs11 loader or the pkcs11testmodule, or both. We've been really lucky it didn't happen before. I will file a bug about it tomorrow.
Comment 19•9 years ago
|
||
Looks like this patch regressed startup by ~200ms on Android too. Here's a quick screenshot: http://cl.ly/image/0G1Y3s0L422c Link to the tests: http://phonedash.mozilla.org/#/org.mozilla.fennec/totalthrobber/local-blank/norejected/2015-01-12/2015-01-13/notcached/noerrorbars/standarderror
Reporter | ||
Comment 20•9 years ago
|
||
(In reply to Mark Finkle (:mfinkle) from comment #19)
> Looks like this patch regressed startup by ~200ms on Android too. Here's a quick screenshot:
> http://cl.ly/image/0G1Y3s0L422c
>
> Link to the tests:
> http://phonedash.mozilla.org/#/org.mozilla.fennec/totalthrobber/local-blank/norejected/2015-01-12/2015-01-13/notcached/noerrorbars/standarderror
How does one test that on try?
Flags: needinfo?(mark.finkle)
Reporter | ||
Comment 21•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #14)
> (In reply to Mike Hommey [:glandium] from comment #13)
> > And there's the b2g emulator issue, but it might be related to the above, I haven't attached a debugger yet.
>
> So with a debugger attached, it looks like it's stuck in the libc, but I don't have symbols, and downloading/building a b2g emulator build is going to take a very long time. Eric, would you mind looking at this? I've reproduced with the emulator build I got from automation with "LD_PRELOAD=/system/b2g/libmozglue.so cat"
This may well be related to https://github.com/jemalloc/jemalloc/issues/184 because we don't have native tls on android and gonk, so we're effectively in the same kind of infinite-loopy setup as windows and mac.
Comment 22•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #20)
> (In reply to Mark Finkle (:mfinkle) from comment #19)
> > Looks like this patch regressed startup by ~200ms on Android too. Here's a quick screenshot:
> > http://cl.ly/image/0G1Y3s0L422c
> >
> > Link to the tests:
> > http://phonedash.mozilla.org/#/org.mozilla.fennec/totalthrobber/local-blank/norejected/2015-01-12/2015-01-13/notcached/noerrorbars/standarderror
>
> How does one test that on try?
One does not. Bob Clary is working on getting Try working for PhoneDash, but it's not completed. In the meantime, we do one of two things:
1. Send test patches to Bob and he runs them in the PhoneDash framework
2. We try to use a local script that launches Fennec via ADB and watches for Throbber Start and Throbber Stop messages.
I forget where the script for #2 lives. Bob might have other alternatives.
Flags: needinfo?(mark.finkle) → needinfo?(bob)
Comment 23•9 years ago
|
||
glandium, I can walk you through the set up of autophone if you have a slow, rooted android phone available or I can test your patches for you if you like.
Flags: needinfo?(bob)
Comment 24•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #14)
> (In reply to Mike Hommey [:glandium] from comment #13)
> > And there's the b2g emulator issue, but it might be related to the above, I haven't attached a debugger yet.
>
> So with a debugger attached, it looks like it's stuck in the libc, but I don't have symbols, and downloading/building a b2g emulator build is going to take a very long time. Eric, would you mind looking at this? I've reproduced with the emulator build I got from automation with "LD_PRELOAD=/system/b2g/libmozglue.so cat"
It appears |getprop| is deadlocked when initializing jemalloc. Our |__wrap_pthread_key_create| [1] function uses a std::map [2], which then tries to allocate memory while jemalloc is still initializing, resulting in a deadlock.
[1] https://hg.mozilla.org/mozilla-central/annotate/67257a3edeb5/mozglue/build/Nuwa.cpp#l724
[2] https://hg.mozilla.org/mozilla-central/annotate/67257a3edeb5/mozglue/build/Nuwa.cpp#l730
Full stack:
#0 __futex_syscall3 () at bionic/libc/arch-arm/bionic/atomics_arm.S:183
#1 0x40087264 in _normal_lock (mutex=<optimized out>) at bionic/libc/bionic/pthread.c:951
#2 pthread_mutex_lock (mutex=0x4006f5d4) at bionic/libc/bionic/pthread.c:1041
#3 0x400395b4 in malloc_init_hard () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/include/jemalloc/internal/mutex.h:77
#4 0x4003a098 in je_malloc () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/src/jemalloc.c:249
#5 0x4002a992 in std::priv::_Rb_tree<int, std::less<int>, std::pair<int const, void (*)(void*)>, std::priv::_Select1st<std::pair<int const, void (*)(void*)> >, std::priv::_MapTraitsT<std::pair<int const, void (*)(void*)> >, std::allocator<std::pair<int const, void (*)(void*)> > >::_M_create_node () at /home/erahm/dev/mozilla-central/build/stlport/stlport/stl/_new.h:134
#6 0x4002c60a in std::priv::_Rb_tree<int, std::less<int>, std::pair<int const, void (*)(void*)>, std::priv::_Select1st<std::pair<int const, void (*)(void*)> >, std::priv::_MapTraitsT<std::pair<int const, void (*)(void*)> >, std::allocator<std::pair<int const, void (*)(void*)> > >::_M_insert(std::priv::_Rb_tree_node_base*, std::pair<int const, void (*)(void*)> const&, std::priv::_Rb_tree_node_base*, std::priv::_Rb_tree_node_base*) () at /home/erahm/dev/mozilla-central/build/stlport/stlport/stl/_tree.c:359
#7 0x4002c6d0 in __wrap_pthread_key_create () at /home/erahm/dev/mozilla-central/build/stlport/stlport/stl/_tree.c:422
#8 0x40043204 in je_malloc_tsd_boot0 () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/include/jemalloc/internal/tsd.h:605
#9 0x40039600 in malloc_init_hard () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/src/jemalloc.c:1123
#10 0x4003f0d8 in jemalloc_constructor () at /home/erahm/dev/mozilla-central/memory/jemalloc/src/src/jemalloc.c:249
#11 0xb0001156 in call_array (ctor=0x4006d9c8, count=0, reverse=<optimized out>) at bionic/linker/linker.c:1589
#12 0xb0001dc6 in call_constructors (si=<optimized out>) at bionic/linker/linker.c:1619
#13 __dl_$t () at bionic/linker/linker.c:2013
#14 0xb00028a4 in init_library (si=<optimized out>) at bionic/linker/linker.c:1169
#15 find_library (name=<optimized out>) at bionic/linker/linker.c:1212
#16 0xb0001b90 in __dl_$t () at bionic/linker/linker.c:1917
#17 0xb0002108 in __linker_init (elfdata=<optimized out>) at bionic/linker/linker.c:2200
#18 0xb000100c in __dl__start () at bionic/linker/arch/arm/begin.S:37
#19 0xb000100c in __dl__start () at bionic/linker/arch/arm/begin.S:37
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
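The re-entrancy in this backtrace (allocator init calls a key-creation wrapper, whose bookkeeping calls back into the allocator) can be modelled in a few lines. This is only a toy sketch, not jemalloc or Nuwa code: toy_malloc, init_hard and key_create_hook are hypothetical names, and where the real allocator would block forever on its non-recursive init lock, the toy merely counts the re-entrant call and carries on.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Toy model only: every name below is hypothetical, none is jemalloc's.
namespace toy {

enum class InitState { Uninitialized, InProgress, Done };

static InitState g_state = InitState::Uninitialized;
static int g_reentrant_calls = 0;  // times init re-entered the allocator

void* toy_malloc(std::size_t size);

// Stand-in for a pthread_key_create wrapper whose bookkeeping allocates
// (the std::map node creation in the backtrace).
void key_create_hook() {
  void* node = toy_malloc(32);
  std::free(node);
}

void init_hard() {
  assert(g_state == InitState::Uninitialized);
  // A real allocator would hold a non-recursive init lock from here on.
  g_state = InitState::InProgress;
  key_create_hook();
  g_state = InitState::Done;
}

void* toy_malloc(std::size_t size) {
  if (g_state == InitState::InProgress) {
    // The real code would try to take the init lock again on the same
    // thread and never return. The toy records the event instead.
    ++g_reentrant_calls;
  } else if (g_state == InitState::Uninitialized) {
    init_hard();
  }
  return std::malloc(size);
}

}  // namespace toy
```

After the first toy_malloc call, g_reentrant_calls is 1, which marks the point where the non-recursive lock would have been taken twice by the same thread. The usual ways out are to keep allocating code off the init path or to give it a separate bootstrap allocator.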
Flags: needinfo?(erahm)
Reporter | ||
Comment 25•9 years ago
|
||
Bob, would you mind checking how startup goes for the builds from this try? Thanks. https://treeherder.mozilla.org/#/jobs?repo=try&author=mh%40glandium.org
Flags: needinfo?(bob)
Comment 26•9 years ago
|
||
Ok, I'm testing api-9 and api-11 from http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-471936506ea6/ compared to the latest mozilla-inbound. It will take a while to download the builds, but I'll let you know as soon as I get the results.
Comment 27•9 years ago
|
||
comparison of http://hg.mozilla.org/try/rev/471936506ea6 to http://hg.mozilla.org/integration/mozilla-inbound/rev/ad2042b4c668 twitter and blank start times are comparable between the two builds but the webappstartup test start time regressed. Strangely the galaxy s3 Android 4.0 regressed more than the nexus one Android 2.3 on the webappstartup start time. twitter, blank and webappstartup stop times all regressed.
Flags: needinfo?(bob)
Reporter | ||
Comment 28•9 years ago
|
||
So, I got autophone working with Bob's help, and got somewhat plausible results despite the huge stddev. Then, since it was all slow, I factory reset the phone and cleaned up its sd card. The phone ended up much faster (for instance, I don't need autophone config adjustments because of slow reboot anymore), but now the results are completely unusable: stddev is still big, and the jemalloc3 builds end up with better results than the mozjemalloc builds... which makes it hard to investigate what's wrong. That being said, I have a theory, so I'd appreciate a test of those two try builds:
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-36dd4fb91b48/try-android-api-11/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-c00ce6c720e1/try-android-api-11/
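One way to judge whether two sets of startup timings differ by more than their noise is to compare the gap between the means against the combined standard error. A minimal sketch, assuming independent samples; the Summary type, the function names and the factor of 2 are illustrative choices, not what autophone or phonedash compute.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Mean and standard error of the mean for one build's timings.
struct Summary {
  double mean;
  double std_error;
};

Summary summarize(const std::vector<double>& samples) {
  assert(samples.size() >= 2);  // sample stddev needs at least two points
  const double n = static_cast<double>(samples.size());
  double sum = 0.0;
  for (double s : samples) sum += s;
  const double mean = sum / n;
  double squares = 0.0;
  for (double s : samples) squares += (s - mean) * (s - mean);
  const double stddev = std::sqrt(squares / (n - 1.0));  // sample stddev
  return {mean, stddev / std::sqrt(n)};
}

// True when the gap between the means exceeds k combined standard errors.
bool differs_beyond_noise(const Summary& a, const Summary& b, double k = 2.0) {
  const double combined =
      std::sqrt(a.std_error * a.std_error + b.std_error * b.std_error);
  return std::fabs(a.mean - b.mean) > k * combined;
}
```

With a large stddev and few runs, the combined standard error grows and small regressions stop being distinguishable, which matches the situation described here.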
Flags: needinfo?(bob)
Reporter | ||
Comment 29•9 years ago
|
||
And another one: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-9424e8fdfd16/try-android-api-11/
Comment 30•9 years ago
|
||
glandium, it would help to get both api-9 and api-11 builds so I can test using both my nexus one and my gs3. I posted these to phonedash-dev so you can see the graphs and ran your two try builds (the first didn't have any builds) along with the latest mozilla-inbound for comparison. http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/local-blank/norejected/2015-01-15/2015-01-15/notcached/errorbars/standarderror http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/local-twitter/norejected/2015-01-15/2015-01-15/notcached/errorbars/standarderror http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/webappstartup/norejected/2015-01-15/2015-01-15/notcached/errorbars/standarderror http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/webappstartup/norejected/2015-01-15/2015-01-15/notcached/errorbars/standarderror As you can see, throbber stop regressed with both try builds on s1s2 blank and twitter and both throbber start and stop regressed on webappstartup.
Flags: needinfo?(bob)
Comment 31•9 years ago
|
||
FYI I was looking at the browsermark 2.1 knockout benchmark score. The difference between v8 and spidermonkey is the time we spend in the first loop of "compareSmallArrayToBigArray", which is currently mostly MinorGC "freeHugeSlots", which does only js_free. I created a js shell benchmark out of it in bug 1118938. The numbers should somewhat relate to the full browsermark, but the scores reported here are from the shell benchmark. On trunk we have a score of 2900ms, with the first loop taking 1000ms. (On linux the loop takes 500ms, due to faster js_free.) I was told by ehoogeveen to test jemalloc3, since it could potentially improve scores. Bad luck here: the total score became 3398ms, while the loop now takes 1422ms!
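To separate the cost of freeing from the rest of such a loop, the free phase can be timed on its own. A rough sketch, assuming plain malloc/free stand in for js_free; time_free_phase and its parameters are hypothetical, and the absolute numbers depend entirely on which allocator is linked in.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdlib>
#include <vector>

struct FreeTiming {
  std::size_t freed;  // number of blocks released
  double free_ms;     // wall time spent in the free loop only
};

FreeTiming time_free_phase(std::size_t count, std::size_t bytes) {
  assert(bytes > 0);
  std::vector<void*> blocks;
  blocks.reserve(count);
  for (std::size_t i = 0; i < count; ++i) {
    void* p = std::malloc(bytes);
    if (p == nullptr) break;
    static_cast<char*>(p)[0] = 1;  // touch the block so it is really backed
    blocks.push_back(p);
  }
  // Only the release loop is inside the timed region.
  const auto start = std::chrono::steady_clock::now();
  for (void* p : blocks) std::free(p);
  const auto stop = std::chrono::steady_clock::now();
  const std::chrono::duration<double, std::milli> elapsed = stop - start;
  return {blocks.size(), elapsed.count()};
}
```

Running the same call against builds linked with different allocators gives a like-for-like comparison of the free path.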
Reporter | ||
Comment 32•9 years ago
|
||
Bob, can you get numbers for these builds? http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-aef35ebef737/
Flags: needinfo?(bob)
Comment 33•9 years ago
|
||
You can see the various graphs at http://phonedash-dev.allizom.org/#/org.mozilla.fennec/throbberstop/local-blank/norejected/2015-01-13/2015-01-15/notcached/errorbars/standarderror You can select the different tests local blank, local twitter, webappstartup and look at the start and stop times. The regression pattern remains pretty much the same.
Flags: needinfo?(bob)
Reporter | ||
Comment 34•9 years ago
|
||
Bob, could you test these two trys: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-31a67cc812d5/ http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-592a30a8a738/ This time, though, I'm interested in the logcat.
Flags: needinfo?(bob)
Comment 35•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #34)
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-31a67cc812d5/
This one is failing to get any measurements of the Throbber times and is generating literally millions of E/GeckoAlloc( 3342): overflow messages in logcat. It is taking quite a while to complete. It doesn't look like the logcat contains anything else of use.
Flags: needinfo?(bob)
Comment 36•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #34)
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-592a30a8a738/
This one also has the E/GeckoAlloc( 1566): overflow issue. It is still early in the run, but it doesn't look like this will get any measurements either. I'll let it continue though, just in case.
Comment 37•9 years ago
|
||
GeckoAlloc non-overflow messages only
Reporter | ||
Comment 38•9 years ago
|
||
When they're up, please test those builds: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-d135ff58d0e7 http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-dee0ae7edb6b I'm only interested in the GeckoAlloc logs this time as well.
Flags: needinfo?(bob)
Reporter | ||
Comment 39•9 years ago
|
||
(In reply to Mike Hommey [:glandium] from comment #38)
> When they're up, please test those builds:
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-d135ff58d0e7
> http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-dee0ae7edb6b
>
> I'm only interested in the GeckoAlloc logs this time as well.
If d135ff58d0e7 doesn't get throbber times, can you also try this one: http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-3b57b77accb7/
Reporter | ||
Comment 41•9 years ago
|
||
I think I finally got builds that can get the info I want. Please test those: http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/mh@glandium.org-042225505427/ http://ftp.mozilla.org/pub/mozilla.org/mobile/try-builds/mh@glandium.org-952452fa15ad/ For those, I'd be interested in their score and their GeckoAlloc output. Thanks.
Flags: needinfo?(bob)
Reporter | ||
Comment 43•9 years ago
|
||
I did two more try builds, with the autophone trigger, but it didn't trigger anything for the nexus one and the gs3 :( Could you test these? http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-64566756bf92 http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-9d46e7b57308
Flags: needinfo?(bob)
Comment 44•9 years ago
|
||
The nexus-one-3 and samsung-gs3-3 devices are my local devices and won't have picked up your try builds unless I had set them up for it at the time and had my local instance of autophone running when you submitted them.
Several of your try builds have completed testing and are available at http://phonedash.mozilla.org/#/org.mozilla.fennec/throbberstop/local-twitter/norejected/2015-01-21/2015-01-21/cached/noerrorbars/standarderror/try
You still have outstanding jobs for nexus-s-3, nexus-5-kot49h-1, nexus-s-4, nexus-5-kot49h-3 and nexus-s-5 for
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-4b002c559b19/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-64566756bf92/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-9d46e7b57308/
http://ftp.mozilla.org/pub/mozilla.org/firefox/try-builds/mh@glandium.org-c789abcd50e7/
To see only your builds, click on the legend items on the left for the non try builds to hide their series. To get the logs you will need to visit the staging treeherder instance for now, click on the relevant test and look in the job details panel for links to the logcat. Note the tombstones signifying crashes of some type. See
https://treeherder.allizom.org/ui/#/jobs?repo=try&revision=64566756bf92
https://treeherder.allizom.org/ui/#/jobs?repo=try&revision=9d46e7b57308
https://treeherder.allizom.org/ui/#/jobs?repo=try&revision=4b002c559b19
https://treeherder.allizom.org/ui/#/jobs?repo=try&revision=c789abcd50e7
The workers are a bit behind at the moment. Several disconnected and I will need to reboot the system.
Flags: needinfo?(bob)
Comment 45•9 years ago
|
||
Autophone is now available in http://trychooser.pub.build.mozilla.org/ to help with your try syntax. Please limit your tests to the ones you need and don't use the mochitests unless you really need them. Also remember we don't have many devices and they are usually pretty busy so please don't DOS them. I'll work on the documentation in the next few days but this should work for most of your cases:
try: -b o -p android-api-9,android-api-11 -u autophone-s1s2 -t none
Until Autophone begins reporting to production Treeherder, you can find logs etc on the staging instance https://treeherder.allizom.org.
Let me know if you have issues.
Reporter | ||
Comment 46•9 years ago
|
||
(In reply to Bob Clary [:bc:] from comment #45)
> Autophone is now available in http://trychooser.pub.build.mozilla.org/ to help with your try syntax. Please limit your tests to the ones you need and don't use the mochitests unless you really need them. Also remember we don't have many devices and they are usually pretty busy so please don't DOS them. I'll work on the documentation in the next few days but this should work for most of your cases:
>
> try: -b o -p android-api-9,android-api-11 -u autophone-s1s2 -t none
>
> Until Autophone begins reporting to production Treeherder, you can find logs etc on the staging instance https://treeherder.allizom.org.
>
> Let me know if you have issues.
Unfortunately, the results of my recent try builds don't show up properly on treeherder.allizom.org, so I can't get to their logcat :(
https://treeherder.allizom.org/#/jobs?repo=try&revision=790f61d4a2d8
https://treeherder.allizom.org/#/jobs?repo=try&revision=ceea827908dd
Reporter | ||
Updated•9 years ago
|
Flags: needinfo?(bob)
Comment 47•9 years ago
|
||
glandium: Sorry about the problems. Your first build appears on treeherder.allizom.org but the second doesn't. Staging treeherder had an issue with netflows when they enabled the new db setup. I've gotten permission to submit to treeherder production with the jobs automatically hidden. I'd held off until I could find a clean switch over time but with the recent load, I don't think I'll ever find a perfect time. So, I've switched to reporting to treeherder production. All new builds will report to treeherder.mozilla.org and any existing builds in the job queue will begin reporting there as well. We should have a much improved system availability and stability on production going forward.
Flags: needinfo?(bob)
Reporter | ||
Comment 48•9 years ago
|
||
This applies https://github.com/jemalloc/jemalloc/pull/192/ to our tree.
Attachment #8570335 -
Flags: review?(n.nethercote)
Comment 49•9 years ago
|
||
Comment on attachment 8570335
Make jemalloc's opt.lg_dirty_mult work as documented
Review of attachment 8570335:
-----------------------------------------------------------------
rs=me
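For context, the jemalloc manual describes opt.lg_dirty_mult as the per-arena minimum ratio (log base 2) of active to dirty pages, with dirty pages allowed to accumulate up to that ratio or one chunk's worth, whichever is greater, and with -1 disabling purging. The sketch below only restates that documented rule as arithmetic; it is not the patch under review, and dirty_threshold, PurgeDecision and chunk_pages are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

struct PurgeDecision {
  bool purging_enabled;
  std::size_t dirty_threshold;  // dirty pages tolerated before purging
};

// Documented rule as read from the manual (an assumption, not the patch):
// allow active_pages / 2^lg_dirty_mult dirty pages, but never less than
// one chunk's worth; a negative option value turns purging off.
PurgeDecision dirty_threshold(std::size_t active_pages, int lg_dirty_mult,
                              std::size_t chunk_pages) {
  assert(lg_dirty_mult < static_cast<int>(sizeof(std::size_t) * 8));
  if (lg_dirty_mult < 0) return {false, 0};
  const std::size_t by_ratio = active_pages >> lg_dirty_mult;
  return {true, std::max(by_ratio, chunk_pages)};
}
```

For example, with the documented default of 3 (an 8:1 ratio), 8192 active pages and a 256-page chunk, the threshold is 1024 dirty pages; with only 800 active pages the one-chunk floor of 256 applies instead.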
Attachment #8570335 -
Flags: review?(n.nethercote) → review+
Reporter | ||
Comment 50•9 years ago
|
||
memory/build has fatal warnings, and this warning hits win64 only. I don't know how I didn't get this error before...
Attachment #8570394 -
Flags: review?(n.nethercote)
Updated•9 years ago
|
Attachment #8570394 -
Flags: review?(n.nethercote) → review+
Reporter | ||
Comment 51•9 years ago
|
||
https://hg.mozilla.org/integration/mozilla-inbound/rev/0ad2bf058230 https://hg.mozilla.org/integration/mozilla-inbound/rev/4252e03b5156 https://hg.mozilla.org/integration/mozilla-inbound/rev/a1a89ff4ee31
https://hg.mozilla.org/mozilla-central/rev/0ad2bf058230 https://hg.mozilla.org/mozilla-central/rev/4252e03b5156 https://hg.mozilla.org/mozilla-central/rev/a1a89ff4ee31
Status: NEW → RESOLVED
Closed: 9 years ago
status-firefox39:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla39
Reporter | ||
Comment 53•9 years ago
|
||
Actually, let's keep this bug open to track definitely enabling jemalloc3 (as in, let it ride the trains)
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Reporter | ||
Comment 54•9 years ago
|
||
Disabled jemalloc 3 for now. https://hg.mozilla.org/integration/mozilla-inbound/rev/0a11b73c77b7
Comment 55•9 years ago
|
||
this has a lot of improvements and 2 private byte regressions on linux (23 + 64): http://alertmanager.allizom.org:8080/alerts.html?rev=0a11b73c77b7&showAll=1&testIndex=0&platIndex=0
Comment 56•9 years ago
|
||
as a note, these correspond to the improvements when this originally landed: http://alertmanager.allizom.org:8080/alerts.html?rev=a1a89ff4ee31&showAll=1&testIndex=0&platIndex=0
Comment 57•9 years ago
|
||
Mike, any estimate when this can be re-enabled? Bug 1005844 is waiting on it.
Flags: needinfo?(mh+mozilla)
Reporter | ||
Updated•9 years ago
|
Flags: needinfo?(mh+mozilla)
Reporter | ||
Updated•9 years ago
|
Comment 59•9 years ago
|
||
https://hg.mozilla.org/mozilla-central/rev/8b380feae2ae
Status: REOPENED → RESOLVED
Closed: 9 years ago → 9 years ago
status-firefox43:
--- → fixed
Resolution: --- → FIXED
Updated•9 years ago
|
Alias: jemalloc3-by-default → jemalloc4-by-default
Summary: Enable jemalloc 3 by default → Enable jemalloc 4 by default
Comment 60•9 years ago
|
||
On my local machines (Ubuntu 15.04 and Ubuntu 14.10), I get a start-up crash with my custom build. The call stack is as follows:
#0 0x0000000000439002 in run_quantize (size=0) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:96
#1 0x00000000004396ef in run_quantize_first (size=4096) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:155
#2 0x0000000000446857 in arena_run_first_best_fit (arena=0x7ffff6a00180, size=4096) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1073
#3 0x0000000000446fbb in arena_run_alloc_small_helper (arena=0x7ffff6a00180, size=4096, binind=20) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1129
#4 0x0000000000447101 in arena_run_alloc_small (arena=0x7ffff6a00180, size=4096, binind=20) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1148
#5 0x0000000000454b2e in arena_bin_nonfull_run_get (arena=0x7ffff6a00180, bin=0x7ffff6a01750) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1898
#6 0x0000000000454c58 in arena_bin_malloc_hard (arena=0x7ffff6a00180, bin=0x7ffff6a01750) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:1940
#7 0x00000000004558c4 in je_arena_malloc_small (arena=0x7ffff6a00180, size=1024, zero=true) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/arena.c:2153
#8 0x00000000004f8234 in je_arena_malloc (tcache=0x0, zero=true, size=1024, arena=0x7ffff6a00180, tsd=0x7ffff7fe66b0) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/include/jemalloc/internal/arena.h:1145
#9 je_iallocztm (arena=0x0, is_metadata=false, tcache=0x0, zero=true, size=1024, tsd=0x7ffff7fe66b0) at src/include/jemalloc/internal/jemalloc_internal.h:887
#10 je_icalloc (size=1024, tsd=0x7ffff7fe66b0) at src/include/jemalloc/internal/jemalloc_internal.h:920
#11 je_calloc (num=1, size=1024) at /home/morris/mozilla/gecko-dev/memory/jemalloc/src/src/jemalloc.c:1663
#12 0x0000000000423b4b in calloc (num=1, size=1024) at /home/morris/mozilla/gecko-dev/memory/build/replace_malloc.c:181
#13 0x00007ffff65c8890 in PR_Calloc (nelem=1, elsize=1024) at /home/morris/mozilla/gecko-dev/nsprpub/pr/src/malloc/prmem.c:443
#14 0x00007ffff65c69ce in PR_SetThreadPrivate (index=2, priv=0x7ffff6937f98) at /home/morris/mozilla/gecko-dev/nsprpub/pr/src/threads/prtpd.c:161
#15 0x00007fffe4008c2d in mozilla::BlockingResourceBase::ResourceChainAppend (this=0x7ffff6937f98, aPrev=0x0) at ../../dist/include/mozilla/BlockingResourceBase.h:181
#16 0x00007fffe400381d in mozilla::BlockingResourceBase::Acquire (this=0x7ffff6937f98) at /home/morris/mozilla/gecko-dev/xpcom/glue/BlockingResourceBase.cpp:322
#17 0x00007fffe40039f8 in mozilla::OffTheBooksMutex::Lock (this=0x7ffff6937f98) at /home/morris/mozilla/gecko-dev/xpcom/glue/BlockingResourceBase.cpp:383
#18 0x00007fffe3e92f02 in mozilla::Monitor::Lock (this=0x7ffff6937f98) at ../../dist/include/mozilla/Monitor.h:35
#19 0x00007fffe3e92f62 in mozilla::MonitorAutoLock::MonitorAutoLock (this=0x7ffff7fe5e40, aMonitor=...) at ../../dist/include/mozilla/Monitor.h:78
#20 0x00007fffe4080b59 in mozilla::net::ClosingService::ThreadFunc (this=0x7ffff6937f80) at /home/morris/mozilla/gecko-dev/netwerk/base/ClosingService.cpp:206
#21 0x00007fffe4097f2e in mozilla::net::ClosingService::ThreadFunc (aClosure=0x7ffff6937f80) at /home/morris/mozilla/gecko-dev/netwerk/base/ClosingService.h:52
#22 0x00007ffff65e557e in _pt_root (arg=0x7ffff6855fc0) at /home/morris/mozilla/gecko-dev/nsprpub/pr/src/pthreads/ptthread.c:212
#23 0x00007ffff7bc26aa in start_thread (arg=0x7ffff7fe6700) at pthread_create.c:333
#24 0x00007ffff6ec6eed in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Reverting to jemalloc3 makes everything work again. However, on my Ubuntu VM (host OS is Mac), jemalloc4 works without problems.
Reporter | ||
Comment 61•9 years ago
|
||
This likely is bug 1205016
Comment 62•8 years ago
|
||
(In reply to Carsten Book [:Tomcat] from comment #59)
> https://hg.mozilla.org/mozilla-central/rev/8b380feae2ae
Why is this bug FIXED when the change was backed out in bug 1205249 and never came back?
Flags: needinfo?(mh+mozilla)
Reporter | ||
Updated•8 years ago
|
Status: RESOLVED → REOPENED
Flags: needinfo?(mh+mozilla)
Resolution: FIXED → ---
Comment 63•8 years ago
|
||
Mike, what's needed to move this forward? Maybe I can find someone to help on this.
Flags: needinfo?(mh+mozilla)
Reporter | ||
Comment 64•8 years ago
|
||
(In reply to Kan-Ru Chen [:kanru] (UTC+8) from comment #63)
> Mike, what's needed to move this forward? Maybe I can find someone to help on this.
First would be bug 1277704, if it doesn't break things. Then there are at least two things:
- Figure out why bug 1219914 happened (iow, why bug 1203840 caused it, which could very well be attributed to how AWSY is measuring things)
- Test the reality of the talos regressions (cf. how the msvc 2015 regressions were found not to have an actual impact contrary to what talos said)
Flags: needinfo?(mh+mozilla)
Reporter | ||
Comment 65•7 years ago
|
||
Per bug 1363992, jemalloc 4 related bugs are now irrelevant.
Assignee: mh+mozilla → nobody
Status: REOPENED → RESOLVED
Closed: 9 years ago → 7 years ago
Resolution: --- → WONTFIX