stylo: Lots of time spent in bzero when doubling vecs

RESOLVED FIXED in Firefox 56

Status

Product: Core
Component: CSS Parsing and Computation
Priority: P1
Severity: normal
Resolution: RESOLVED FIXED
Reported: 3 months ago
Last modified: 15 days ago

People

(Reporter: bholley, Assigned: glandium)

Tracking

Blocks: 3 bugs
Version: unspecified
Target Milestone: mozilla56
Points: ---
Firefox Tracking Flags: firefox56 fixed

Attachments

(7 attachments)

During parsing and stylist flushing, we build a lot of vecs. Those vecs use realloc to double their buffer during appends.

When loading the 100x myspace testcase, we spend 50ms in bzero, all of it in callstacks emanating from double().

As far as I know, realloc is not supposed to zero, and the code path in mozjemalloc does seem to honor that. I haven't debugged it, but I suspect it may have something to do with this line:

http://searchfox.org/mozilla-central/rev/784ec1af1451fd479641b40efd7720a2f3f5781a/memory/mozjemalloc/jemalloc.c#2808

I don't understand why a flag set on the chunk we're recycling should require us to zero the memory before returning it to the caller. The caller can do whatever it wants with that memory, so the allocator shouldn't be relying on it.
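For context, the growth pattern in question looks roughly like this (an illustrative sketch in C, not stylo's actual Rust code; the Vec type and names here are hypothetical): each append past capacity reallocs to double the buffer, so any zeroing the allocator does during that realloc is wasted work, because the caller treats the new tail as uninitialized anyway.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical minimal growable vector; names are illustrative. */
typedef struct {
    int *data;
    size_t len;
    size_t cap;
} Vec;

void vec_push(Vec *v, int value) {
    if (v->len == v->cap) {
        size_t new_cap = v->cap ? v->cap * 2 : 4;
        /* realloc is NOT required to zero the new tail; if the
         * allocator zeroes anyway (e.g. via bzero on recycled
         * chunks), that work is pure overhead here. */
        v->data = realloc(v->data, new_cap * sizeof(int));
        assert(v->data);
        v->cap = new_cap;
    }
    v->data[v->len++] = value;
}
```

A sequence of appends therefore hits realloc O(log n) times, and each doubling touches a fresh tail that the vec will only initialize on demand.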
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(jseward)
Created attachment 8863110 [details] [diff] [review]
Don't force zeroing when allocating just because the chunk is zeroed. v1

Gosh, this sure is a tempting patch.

MozReview-Commit-ID: JNqTWjPXZEM
Hm, that doesn't quite do it. I think the issue is this: http://searchfox.org/mozilla-central/rev/784ec1af1451fd479641b40efd7720a2f3f5781a/memory/mozjemalloc/jemalloc.c#3655

Changing that to pass "zero" causes crashes on mac opt (but not on linux debug). As far as I can tell there's nothing that's going to assume that the data here is zero, but I'm guessing we're expecting some of the header fields to be zeroed.

I don't fully understand the storage scheme here. Can we get away with zeroing less than we are?

Changing NI to the two memory peers who aren't on PTO.
Flags: needinfo?(n.nethercote)
Flags: needinfo?(mh+mozilla)
Flags: needinfo?(jseward)
Flags: needinfo?(erahm)
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #2)
> Hm, that doesn't quite do it. I think the issue is this:
> http://searchfox.org/mozilla-central/rev/
> 784ec1af1451fd479641b40efd7720a2f3f5781a/memory/mozjemalloc/jemalloc.c#3655
> 
> Changing that to pass "zero" causes crashes on mac opt (but not on linux
> debug). As far as I can tell there's nothing that's going to assume that the
> data here is zero, but I'm guessing we're expecting some of the header
> fields to be zeroed.
> 
> I don't fully understand the storage scheme here. Can we get away with
> zeroing less than we are?
> 
> Changing NI to the two memory peers who aren't on PTO.

This stuff was backported from jemalloc 3 [1]. It's not clear to me why they're relying on the node's previous zero status at all; I think what they really want is to zero the buffer if *zero is true, regardless of the past zero value (that might be why you're seeing crashes with your patch). We've also seen stuff where mac internals expect allocated memory to be zeroed (we junk fill it by default) and things go awry.

For |realloc| we really just want to zero the trailing portion *if* the previous allocation was zeroed so there's a possible optimization there as well.

I can take a further look at this on Monday.

[1] https://github.com/jemalloc/jemalloc/commit/14a2c6a698a207ac3f3825443cf3441c8842e990
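The tail-only optimization mentioned here can be sketched as a small helper (hypothetical name and signature, assuming the allocator tracks whether the old region was handed out zeroed):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical: when growing an allocation in place, zero only the
 * newly exposed tail, and only if the old region was zeroed; a plain
 * realloc caller expects nothing, so this is an allocator-internal
 * courtesy, not a C requirement. */
void zero_grown_tail(void *ptr, size_t old_size, size_t new_size,
                     bool old_was_zeroed) {
    if (old_was_zeroed && new_size > old_size)
        memset((char *)ptr + old_size, 0, new_size - old_size);
}
```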
(In reply to Eric Rahm [:erahm] from comment #3)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #2)
> > Hm, that doesn't quite do it. I think the issue is this:
> > http://searchfox.org/mozilla-central/rev/
> > 784ec1af1451fd479641b40efd7720a2f3f5781a/memory/mozjemalloc/jemalloc.c#3655
> > 
> > Changing that to pass "zero" causes crashes on mac opt (but not on linux
> > debug). As far as I can tell there's nothing that's going to assume that the
> > data here is zero, but I'm guessing we're expecting some of the header
> > fields to be zeroed.
> > 
> > I don't fully understand the storage scheme here. Can we get away with
> > zeroing less than we are?
> > 
> > Changing NI to the two memory peers who aren't on PTO.
> 
> This stuff was backported from jemalloc 3 [1]. It's not clear to me why
> they're relying on the node's previous zero status at all,

Well, if the caller wants zeroed memory, it makes sense to avoid zeroing the buffer if the node is already known to be zeroed, right?

> I think what they
> really want is to zero the buffer if *zero is true (regardless of the past
> zero value, that might be why you're seeing crashes with your patch).

So to be clear, the patch attached to this bug seems to work fine, but doesn't fix the performance problem. I believe the performance problem is coming from this:

http://searchfox.org/mozilla-central/rev/784ec1af1451fd479641b40efd7720a2f3f5781a/memory/mozjemalloc/jemalloc.c#3655

Where we pass |true| instead of |zero| in arena_run_alloc. But changing that to pass zero causes crashes.

> We've
> also seen stuff where mac internals expect allocated memory to be zeroed (we
> junk fill it by default) and things go awry.
> 
> For |realloc| we really just want to zero the trailing portion *if* the
> previous allocation was zeroed so there's a possible optimization there as
> well.

I don't follow this. My understanding is that realloc does not guarantee that the new extra memory is zeroed. Realloc currently passes |false| for |zero|, and the only reason it's getting zeroed is because of that business linked to above.

> 
> I can take a further look at this on Monday.

Awesome, thank you!

> 
> [1]
> https://github.com/jemalloc/jemalloc/commit/
> 14a2c6a698a207ac3f3825443cf3441c8842e990
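As a reminder of the standard C semantics under discussion (nothing mozjemalloc-specific): only calloc promises zeroed memory, and realloc leaves the grown tail indeterminate. A small sketch:

```c
#include <assert.h>
#include <stdlib.h>

/* Returns 0 on success; exercises the guarantees discussed above. */
int calloc_vs_realloc_demo(void) {
    unsigned char *p = calloc(16, 1);
    if (!p) return 1;
    for (int i = 0; i < 16; i++)
        assert(p[i] == 0);          /* calloc guarantees zero */
    p = realloc(p, 32);
    if (!p) return 1;
    for (int i = 0; i < 16; i++)
        assert(p[i] == 0);          /* old contents preserved */
    /* bytes 16..31 are indeterminate: deliberately not checked;
     * any zeroing there is an allocator implementation detail. */
    free(p);
    return 0;
}
```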
Assignee: nobody → erahm
Flags: needinfo?(n.nethercote)
Priority: -- → P1
I am happy to defer to erahm on this one.
Okay, tracing through: node->zeroed is set depending on whether or not our madvise call zeroes the pages. So this is true on Linux, false on Mac. It's possible that instead of memsetting we could recommit the range we want, which I think should do page-level zeroing for us.
Flags: needinfo?(erahm)
Created attachment 8863548 [details] [diff] [review]
Commit recycled pages to zero them on mac

This is a WIP patch, I still need to run it against the myspace test.

Rather than memsetting we can re-commit pages that have been recycled in order
to zero them. If re-committing fails we can just fall back to memset.
Interesting! Though I still don't understand why we need to pass zero=true instead of zero=zero at [1]. Can you shed some light on that?

[1] http://searchfox.org/mozilla-central/rev/784ec1af1451fd479641b40efd7720a2f3f5781a/memory/mozjemalloc/jemalloc.c#3655
Flags: needinfo?(erahm)
Fixing this is probably good.  I would however speculatively offer the
observation that what you're effectively doing is moving write misses from
|realloc| to the point where the new memory is written for the first
time.  But you're not getting rid of them.  So it might be the case that 
the speedup you get is not as big as you may have hoped.
(In reply to Julian Seward [:jseward] from comment #9)
> Fixing this is probably good.

By "this" you mean the issue in comment 8? If so I totally agree, but the problem is that I get crashes when I do, as noted in comment 2.

Should I file a separate bug for that, or should we handle both in this bug?

> I would however speculatively offer the
> observation that what you're effectively doing is moving write misses from
> |realloc| to the point where the new memory is written for the first
> time.  But you're not getting rid of them.  So it might be the case that 
> the speedup you get is not as big as you may have hoped.

Yes, but this mostly occurs when doubling vecs, and we're not guaranteed to access the entirety of the buffer soon (or ever). So while I agree there are some cases where the speedup wouldn't be as high as we hoped, it would certainly be a win.
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #10)
> (In reply to Julian Seward [:jseward] from comment #9)
> > Fixing this is probably good.
> 
> By "this" you mean the issue in comment 8? If so I totally agree, but the
> problem is that I get crashes when I do, as noted in comment 2.

I think it was re: comment 7.
 
> Should I file a separate bug for that, or should we handle both in this bug?

We can do both here. I started to do some code archeology to find out what's going on here, but that code doesn't exist upstream anymore. I'll look a bit further today. I would guess the answer is: "OSX crashes if we don't do this"

> > I would however speculatively offer the
> > observation that what you're effectively doing is moving write misses from
> > |realloc| to the point where the new memory is written for the first
> > time.  But you're not getting rid of them.  So it might be the case that 
> > the speedup you get is not as big as you may have hoped.
> 
> Yes, but this mostly occurs when doubling vecs, and we're not guaranteed to
> access the entirety of the buffer soon (or ever). So while I agree there are
> some cases where the speedup wouldn't be as high as we hoped, it would
> certainly be a win.

Also the hope is that whatever osx does to zero pages is more efficient, but I guess I wouldn't be surprised if that's not the case.
Flags: needinfo?(erahm)
(In reply to Eric Rahm [:erahm] from comment #11)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #10)
> > (In reply to Julian Seward [:jseward] from comment #9)
> > > Fixing this is probably good.
> > 
> > By "this" you mean the issue in comment 8? If so I totally agree, but the
> > problem is that I get crashes when I do, as noted in comment 2.
> 
> I think it was re: comment 7.
>  
> > Should I file a separate bug for that, or should we handle both in this bug?
> 
> We can do both here, I started to do some code archeology to find out what's
> going on here, but that code doesn't exist upstream anymore. I'll look a bit
> further today. I would guess the answer is: "OSX crashes if we don't do this"

Oh I see. So the point is that osx system libraries may invoke mozjemalloc's realloc, and may expect the additional memory to be zeroed? That would suck. I wonder if we could at least make the memsetting conditional on osx to limit the damage.

> 
> > > I would however speculatively offer the
> > > observation that what you're effectively doing is moving write misses from
> > > |realloc| to the point where the new memory is written for the first
> > > time.  But you're not getting rid of them.  So it might be the case that 
> > > the speedup you get is not as big as you may have hoped.
> > 
> > Yes, but this mostly occurs when doubling vecs, and we're not guaranteed to
> > access the entirety of the buffer soon (or ever). So while I agree there are
> > some cases where the speedup wouldn't be as high as we hoped, it would
> > certainly be a win.
> 
> Also the hope is that whatever osx does to zero pages is more efficient, but
> I guess I wouldn't be surprised if that's not the case.

i.e. the patch in comment 7? Worth a shot for sure.
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #12)
> Oh I see. So the point is that osx system libraries may invoke mozjemalloc's
> realloc, and may expect the additional memory to be zeroed? That would suck.
> I wonder if we could at least make the memsetting conditional on osx to
> limit the damage.

Effectively it always is (at least for the chunk recycling logic), node->zeroed is always true on Linux due to madvise(free) zeroing the pages. The windows logic is harder to follow, I'll take a look.

> i.e. the patch in comment 7? Worth a shot for sure.

Alright I'll pursue some perf tests on OSX with that patch.
(In reply to Eric Rahm [:erahm] (PTO May 4 - 8) from comment #13)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #12)
> > Oh I see. So the point is that osx system libraries may invoke mozjemalloc's
> > realloc, and may expect the additional memory to be zeroed? That would suck.
> > I wonder if we could at least make the memsetting conditional on osx to
> > limit the damage.
> 
> Effectively it always is (at least for the chunk recycling logic),
> node->zeroed is always true on Linux due to madvise(free) zeroing the pages.
> The windows logic is harder to follow, I'll take a look.
> 
> > i.e. the patch in comment 7? Worth a shot for sure.
> 
> Alright I'll pursue some perf tests on OSX with that patch.

I just applied the patch and had a look. It almost entirely eliminates the bzero time during Servo_Stylesheet_FromUTF8Bytes on the 100x myspace testcase (~50ms->1ms), which is awesome! Performance in general does seem to be improved, so I don't think we're giving it all up in cache misses.

However, it doesn't seem to move the needle on the bzero overhead in Servo_StyleSet_FlushStyleSheets. There's still a significant amount of it, so it's worth investigating that I think.
Blocks: 1360881
Eric, any updates here?
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #15)
> Eric, any updates here?

Not really, I don't think I'll have time to look into it before I'm off next week. Given this is OSX only I think it should be pretty low priority. We should be focused on Win64 at this point.

mstange determined that bzero showing up on mac is most likely due to its overlap with memset at the assembly level, so it's possible the remaining bzero overhead you're seeing is really our junk fill or free poisoning, which is unavoidable. The best option there is to reuse whatever memory you've already allocated, if possible.
Comment on attachment 8863548 [details] [diff] [review]
Commit recycled pages to zero them on mac

Mike, aside from being rather bitrotted what do you think of this approach?
Attachment #8863548 - Flags: feedback?(mh+mozilla)
(In reply to Eric Rahm (Out 5/27 - 6/4) [:erahm] from comment #16)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #15)
> > Eric, any updates here?
> 
> Not really, I don't think I'll have time to look into it before I'm off next
> week. Given this is OSX only I think it should be pretty low priority. We
> should be focused on Win64 at this point.

Fair, but I am profiling (and demoing) on mac, so it's pretty important as a performance baseline. Given the improvement in comment 14 I think it's worth getting into the tree.

More to the point, if the argument is that we should focus on Win64, then we need to answer the question from comment 13. If windows is zeroing chunks during realloc, that sounds like a pretty important optimization target. 

> 
> mstange determined that bzero showing up on mac is most likely due to its
> overlap with memset at the assembly level, so it's possible the remaining
> bzero overhead you're seeing is really our junk fill or free poisoning which
> is unavoidable.

I don't follow this (AFAICT jemalloc uses memset to zero, which gets converted in the system libraries to bzero), but I'm skeptical - I have pretty clear stacks of this showing up during realloc, when we shouldn't be doing any memset or bzero at all.

Anyway, the most important thing to do here is to figure out whether windows zeros pages during chunk recycling or not. Second most important would be to land the patch that produced the results in comment 14.
Flags: needinfo?(erahm)
(Assignee)

Comment 19

2 months ago
Comment on attachment 8863548 [details] [diff] [review]
Commit recycled pages to zero them on mac

Review of attachment 8863548 [details] [diff] [review]:
-----------------------------------------------------------------

::: memory/mozjemalloc/jemalloc.c
@@ +2028,5 @@
> +{
> +#  ifdef MOZ_MEMORY_WINDOWS
> +  pages_commit(addr, size);
> +#  else
> +	if (mmap(addr, size, PROT_READ | PROT_WRITE, MAP_FIXED | MAP_PRIVATE |

It feels like this only moves the cost of the memset to the kernel, which will involve a page fault, so this might actually be worse than plain memset.

@@ +2869,5 @@
>  	pages_commit(ret, size);
>  #endif
>  	if (*zero) {
>  		if (zeroed == false)
> +      pages_zero(ret, size);

note that on Windows + DECOMMIT, we're already doing pages_commit, so we could skip doing stuff here entirely.
Attachment #8863548 - Flags: feedback?(mh+mozilla)
(Assignee)

Comment 20

2 months ago
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #18)
> I don't follow this (AFAICT jemalloc uses memset to zero, which gets
> converted in the system libraries to bzero), but I'm skeptical - I have
> pretty clear stacks of this showing up during realloc, when we shouldn't be
> doing any memset or bzero at all.

realloc can call memset for poisoning when shrinking allocations. That's not memset(ptr, 0, size), but memset(ptr, 0xe5, size), and mstange's finding is that that still ends up in the bzero function at the assembly level, because bzero and memset are actually two different entry points to the same function in libSystem or something along those lines.
(In reply to Mike Hommey [:glandium] from comment #20)
> (In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #18)
> > I don't follow this (AFAICT jemalloc uses memset to zero, which gets
> > converted in the system libraries to bzero), but I'm skeptical - I have
> > pretty clear stacks of this showing up during realloc, when we shouldn't be
> > doing any memset or bzero at all.
> 
> realloc can call memset for poisoning when shrinking allocations.

Ok, but we're growing allocations, not shrinking them.

In any case, the code clearly does invoke memset on realloc if zeroed is not already true. It sounds like we _do_ need it to be zeroed on mac, and linux should always have it zeroed anyways. But windows is unclear, so we still need Eric to answer the question from comment 13.
This is the testcase: https://www.dropbox.com/s/h51fspacftf1lcq/myspace.tar.bz2?dl=0

profile of parsing and stylist updating: http://bit.ly/2u35JJy
Flags: needinfo?(mh+mozilla)
Mike said he'd take a look.
Assignee: erahm → mh+mozilla
Flags: needinfo?(erahm)
(Assignee)

Comment 24

22 days ago
Comment on attachment 8863110 [details] [diff] [review]
Don't force zeroing when allocating just because the chunk is zeroed. v1

Review of attachment 8863110 [details] [diff] [review]:
-----------------------------------------------------------------

So interestingly, this patch doesn't actually change anything. But it's still worth taking because we never use the outvalue from chunk_recycle. So I'd r+ the same patch, but with a different summary.
Attachment #8863110 - Flags: feedback+
(Assignee)

Comment 25

22 days ago
(In reply to Mike Hommey [:glandium] from comment #24)
> Comment on attachment 8863110 [details] [diff] [review]
> Don't force zeroing when allocating just because the chunk is zeroed. v1
> 
> Review of attachment 8863110 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> So interestingly, this patch doesn't actually change anything. But it's
> still worth taking because we never use the outvalue from chunk_recycle. So
> I'd r+ the same patch, but with a different summary.

... except to fix this issue properly, there are chances I'll end up using that outvalue...
(Assignee)

Updated

19 days ago
Flags: needinfo?(mh+mozilla)
(Assignee)

Updated

19 days ago
Depends on: 1356701
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
(Assignee)

Comment 31

19 days ago
Bobby, can you check what the patch queue I attached does for you? It does reduce actual bzero calls (as in memset(ptr, 0, size)). However, if I run the testcase several times and get a profile for that, I can see that some bzero calls remain, but I confirmed those actually are memset calls for the poisoning.
Flags: needinfo?(bobbyholley)
(Assignee)

Comment 32

19 days ago
(In reply to Bobby Holley (:bholley) (busy with Stylo) from comment #21)
> Ok, but we're growing allocations, not shrinking them.

BTW, since I've noticed this while experimenting: if the realloc is not in-place, and doesn't involve huge allocations, we're effectively doing:
  void *new_ptr = malloc(new_size);
  memcpy(new_ptr, old_ptr, old_size);
  free(old_ptr);

The latter (obviously) *does* poison.

Comment 33

18 days ago
mozreview-review
Comment on attachment 8884189 [details]
Bug 1360772 - Make pages_purge return whether the pages were purged.

https://reviewboard.mozilla.org/r/155118/#review160402

::: memory/mozjemalloc/mozjemalloc.cpp:1926
(Diff revision 1)
>  
>          MOZ_ASSERT(ret);
>          return (ret);
>  }
>  
> -bool
> +static bool

Please add a brief comment here explaining the return value, given that it's not a typical success/fail indicator.
Attachment #8884189 - Flags: review?(n.nethercote) → review+

Comment 34

18 days ago
mozreview-review
Comment on attachment 8884190 [details]
Bug 1360772 - Store the chunk type in chunk recycling queue.

https://reviewboard.mozilla.org/r/155120/#review160404

It's unclear to me if this patch is changing functionality at all?

::: memory/mozjemalloc/mozjemalloc.cpp:533
(Diff revision 1)
>  
>  	/* Total region size. */
>  	size_t	size;
>  
> -	/* True if zero-filled; used by chunk recycling code. */
> -	bool	zeroed;
> +	/* What type of chunk is there; used by chunk recycling code. */
> +	enum ChunkType type;

Is the |enum| keyword necessary here?

Also, |type| is a generic name; |chunk_type| is probably better.

::: memory/mozjemalloc/mozjemalloc.cpp:2161
(Diff revision 1)
>  		 * remove/insert from/into chunks_szad.
>  		 */
>  		extent_tree_szad_remove(chunks_szad, node);
>  		node->addr = chunk;
>  		node->size += size;
> -		node->zeroed = node->zeroed && zeroed;
> +		node->type = (node->type == type) ? type : RECYCLED_CHUNK;

This might be clearer:

> if (node->type != type) {
>   node->type = RECYCLED_CHUNK;
> }

::: memory/mozjemalloc/mozjemalloc.cpp:2198
(Diff revision 1)
>  		extent_tree_ad_remove(chunks_ad, prev);
>  
>  		extent_tree_szad_remove(chunks_szad, node);
>  		node->addr = prev->addr;
>  		node->size += prev->size;
> -		node->zeroed = (node->zeroed && prev->zeroed);
> +		node->type = (node->type == prev->type) ? node->type : RECYCLED_CHUNK;

Similarly, an |if| might be clearer here.
Attachment #8884190 - Flags: review?(n.nethercote) → review+

Comment 35

18 days ago
mozreview-review
Comment on attachment 8884191 [details]
Bug 1360772 - Make chunk_alloc callers handle zeroing when they need it.

https://reviewboard.mozilla.org/r/155122/#review160408

This doesn't change any zeroing behaviour, right?

::: memory/mozjemalloc/mozjemalloc.cpp:2069
(Diff revision 1)
>  #else
>  #define CAN_RECYCLE(size) true
>  #endif
>  
>  static void *
> -chunk_alloc(size_t size, size_t alignment, bool base, bool zero)
> +chunk_alloc(size_t size, size_t alignment, bool base, bool *zeroed)

A comment explaining how |zeroed| works would be nice here.
Attachment #8884191 - Flags: review?(n.nethercote) → review+

Comment 36

18 days ago
mozreview-review
Comment on attachment 8884192 [details]
Bug 1360772 - Indicate to chunk_recycle callers that pages_commit has zeroed the chunk.

https://reviewboard.mozilla.org/r/155124/#review160410

::: memory/mozjemalloc/mozjemalloc.cpp:2052
(Diff revision 1)
>  
>  	if (node)
>  		base_node_dealloc(node);
>  #ifdef MALLOC_DECOMMIT
>  	pages_commit(ret, size);
> +	if (zeroed) {

Please add a comment here that pages_commit() is guaranteed to zero the chunk.
Attachment #8884192 - Flags: review?(n.nethercote) → review+

Comment 37

18 days ago
mozreview-review
Comment on attachment 8884193 [details]
Bug 1360772 - Allow to initialize non-zeroed arena chunks.

https://reviewboard.mozilla.org/r/155126/#review160416

Oof, this one is tricky. I feel like some of the explanation in the commit message would also be valuable as comments in the code, probably above the computation of |flags| in arena_chunk_init().

::: memory/mozjemalloc/mozjemalloc.cpp:2140
(Diff revision 1)
>  	}
>  
>  	/* Only record zeroed chunks, arena chunks or previously recycled
>  	 * chunks. */
> +	/* /!\ Adjust arena_run_alloc if other types are allowed to be
> +	 * recorded. */

This warning could be made clearer. I suggest:

"WARNING: if you add other chunk types here, please adjust arena_run_alloc accordingly."

It might be worth moving the comment between the |switch| line and the first |case| line, too?
Attachment #8884193 - Flags: review?(n.nethercote) → review+
(In reply to Nicholas Nethercote [:njn] from comment #33)
> Comment on attachment 8884189 [details]
> Bug 1360772 - Make pages_purge return whether the pages were purged.

The second paragraph in the commit message ("The code further zeroing...") is hard to understand, and would be improved by rewording.
The squashed diff from reviewboard doesn't seem to apply locally for me.

Trying to use cinnabar (with the eventual intention of cherry-picking), I get:

bholley@slice /files/mozilla/mc/w (slice_w) $ git mozreview fetch 83a41b5f47b30be435d6190df90568f29e2a12d6
Reading 7 changesets
Reading 7 manifests
Reading and importing 12 files
Importing 7 manifests
Importing 7 changesets
fatal: bad object 0000000000000000000000000000000000000000
error: hg::https://reviewboard-hg.mozilla.org/gecko did not send all necessary objects

Failed to fetch the specified revision; aborting


Am I doing it wrong?
Flags: needinfo?(bobbyholley)
Flags: needinfo?(mh+mozilla)
(Assignee)

Comment 40

15 days ago
(ni? addressed on irc)
Flags: needinfo?(mh+mozilla)
(Assignee)

Comment 41

15 days ago
mozreview-review-reply
Comment on attachment 8884190 [details]
Bug 1360772 - Store the chunk type in chunk recycling queue.

https://reviewboard.mozilla.org/r/155120/#review160404

FWIW, it mostly doesn't on its own, but, after the full queue is applied:
- on OSX, there's a slight chance that it allows skipping the zeroing in huge_palloc when doing huge callocs, if zeroed chunks are not coalesced with other recycled chunks in chunk_record. That's something that could be further improved in followups
- on other OSes, where pages_purge guarantees the pages are always purged, it's not a slight chance; it always happens. So huge callocs should be faster, if we ever do any.
(Assignee)

Comment 42

15 days ago
mozreview-review-reply
Comment on attachment 8884191 [details]
Bug 1360772 - Make chunk_alloc callers handle zeroing when they need it.

https://reviewboard.mozilla.org/r/155122/#review160408

It doesn't.
(Assignee)

Comment 43

15 days ago
(In reply to Nicholas Nethercote [:njn] from comment #34)
> >  	/* Total region size. */
> >  	size_t	size;
> >  
> > -	/* True if zero-filled; used by chunk recycling code. */
> > -	bool	zeroed;
> > +	/* What type of chunk is there; used by chunk recycling code. */
> > +	enum ChunkType type;
> 
> Is the |enum| keyword necessary here?

It's not. I introduced the enum in bug 1378658, and there I intentionally left the keyword in e.g. function declarations in case the patch needed to be backported. For some reason, I didn't switch mindset when writing the code for this bug. I'm removing the keyword in the rebased patches that will land.
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)
Comment hidden (mozreview-request)

Comment 49

15 days ago
Pushed by mh@glandium.org:
https://hg.mozilla.org/integration/autoland/rev/3edd431c888d
Make pages_purge return whether the pages were purged. r=njn
https://hg.mozilla.org/integration/autoland/rev/bd8104b68956
Store the chunk type in chunk recycling queue. r=njn
https://hg.mozilla.org/integration/autoland/rev/9c9309460219
Make chunk_alloc callers handle zeroing when they need it. r=njn
https://hg.mozilla.org/integration/autoland/rev/9f73e5f2dc3c
Indicate to chunk_recycle callers that pages_commit has zeroed the chunk. r=njn
https://hg.mozilla.org/integration/autoland/rev/094ae2ae92c5
Allow to initialize non-zeroed arena chunks. r=njn

Comment 50

15 days ago
bugherder
https://hg.mozilla.org/mozilla-central/rev/3edd431c888d
https://hg.mozilla.org/mozilla-central/rev/bd8104b68956
https://hg.mozilla.org/mozilla-central/rev/9c9309460219
https://hg.mozilla.org/mozilla-central/rev/9f73e5f2dc3c
https://hg.mozilla.org/mozilla-central/rev/094ae2ae92c5
Status: NEW → RESOLVED
Last Resolved: 15 days ago
status-firefox56: --- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla56