Avoid fragmenting cache files (use fallocate)

RESOLVED FIXED in mozilla21

Status

()

Core
Networking: Cache
RESOLVED FIXED
5 years ago
4 years ago

People

(Reporter: (dormant account), Assigned: michal)

Tracking

unspecified
mozilla21
Points:
---
Dependency tree / graph

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [Snappy:P1])

Attachments

(1 attachment, 1 obsolete attachment)

(Reporter)

Description

5 years ago
We had a fix to avoid fragmenting cache files a few years ago. It landed, but got disabled (http://dxr.mozilla.org/mozilla-central/netwerk/cache/nsDiskCacheBlockFile.cpp.html#l390). 
This is bad, because our cache block files can get fragmented into hundreds of fragments resulting in severe performance deterioration(ie tendency for disk cache to be much slower than network). Fragmentation can result in cache IO times of in seconds per entry. This is especially bad on mac os and android.
(Reporter)

Comment 1

5 years ago
Note on platforms like windows, linux it's still bad and results in 10x or so slower IO. Macos fragments much worse. Cheap flash memory on mobile amplifies effects of fragmentation(esp on writes).
OS: Windows 7 → All
Hardware: x86_64 → All

Updated

5 years ago
Assignee: nobody → michal.novotny
(Assignee)

Comment 2

5 years ago
Created attachment 687065 [details] [diff] [review]
fix

Jason disabled fallocate() for block files and separate files in bug #617123 (comment #23). My patch that fixed the regression re-enabled fallocate() only for separate files. I can't see a reason why I did it, maybe I just missed the block file... AFAICS the regression occurred only in case of separate files, so there should be no risk to simply uncomment fallocate() in nsDiskCacheBlockFile::Write().
Attachment #687065 - Flags: review?(jduell.mcbugs)
Attachment #687065 - Flags: review?(jduell.mcbugs) → review+
(Assignee)

Comment 3

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/944c97dabc33
Backed out on the suspicion of regressing Tp5 by 50%!!!

https://hg.mozilla.org/integration/mozilla-inbound/rev/6d1462fbb539

Please reland once you make sure that this regression was not caused by this patch.



Regression  Tp5 No Network Row Major Shutdown MozAfterPaint increase 45.1% on MacOSX 10.8 Mozilla-Inbound
-----------------------------------------------------------------------------------------------------------
    Previous: avg 398.667 stddev 17.175 of 30 runs up to revision ce1e1e13511b
    New     : avg 578.600 stddev 27.043 of 5 runs since revision 944c97dabc33
    Change  : +179.933 (45.1% / z=10.476)
    Graph   : http://mzl.la/QTeoJd

Changeset range: http://hg.mozilla.org/integration/mozilla-inbound/pushloghtml?fromchange=ce1e1e13511b&tochange=944c97dabc33

Changesets:
  * http://hg.mozilla.org/integration/mozilla-inbound/rev/5ae27ec3d9ec
    : Gregor Wagner <anygregor@gmail.com> - Bug 815974 - [System] Remember geolocation permission for an app is broken. r=sicking
    : http://bugzilla.mozilla.org/show_bug.cgi?id=815974

  * http://hg.mozilla.org/integration/mozilla-inbound/rev/b88467155adc
    : Axel Hecht <axel@pike.org> - bug 796051, add chrome-% target to package b2g localized files, r=fabrice
    : http://bugzilla.mozilla.org/show_bug.cgi?id=796051

  * http://hg.mozilla.org/integration/mozilla-inbound/rev/e47b059493cf
    : Benoit Girard <b56girard@gmail.com> - Bug 799640 - Part 1: Refactor JSObjectBuilder. r=ehsan
    : http://bugzilla.mozilla.org/show_bug.cgi?id=799640

  * http://hg.mozilla.org/integration/mozilla-inbound/rev/b130bb991d84
    : Benoit Girard <b56girard@gmail.com> - Bug 799640 - Part 2: Save profiles on shutdown using custom JSON encoder. r=ehsan
    : http://bugzilla.mozilla.org/show_bug.cgi?id=799640

  * http://hg.mozilla.org/integration/mozilla-inbound/rev/133c704dbcc6
    : Benoit Girard <b56girard@gmail.com> - Bug 799640 - Part 3: Add shutdown labels. r=espindola
    : http://bugzilla.mozilla.org/show_bug.cgi?id=799640

  * http://hg.mozilla.org/integration/mozilla-inbound/rev/944c97dabc33
    : Michal Novotny <michal.novotny@gmail.com> - Bug 816642 - Avoid fragmenting cache files, r=jduell
    : http://bugzilla.mozilla.org/show_bug.cgi?id=816642

Bugs:
  * http://bugzilla.mozilla.org/show_bug.cgi?id=799640 - Need a shutdown-profiling mode
  * http://bugzilla.mozilla.org/show_bug.cgi?id=816642 - Avoid fragmenting cache files
  * http://bugzilla.mozilla.org/show_bug.cgi?id=796051 - Generate make target to package needed l10n files in gecko
  * http://bugzilla.mozilla.org/show_bug.cgi?id=815974 - [System] Remember geolocation permission for an app is broken
_______________________________________________
dev-tree-management mailing list
dev-tree-management@lists.mozilla.org
https://lists.mozilla.org/listinfo/dev-tree-management
(Assignee)

Comment 5

5 years ago
I'm not sure what is the purpose of Tp5 test and what exactly it measures. Anyway, if the browser starts without a profile, then it includes cache creation. Without this patch we write about 22kB (size of _CACHE_00[1,2,3]_ and _CACHE_MAP_) and with this patch we write about 12MB. Impact of more I/O operations will be probably much smaller after fixing bug #763555, but even then the I/O will affect the startup time.
I pushed my patch from the suspected range to tp5. This should rule out whether it's this patch or mine:
https://tbpl.mozilla.org/?tree=Try&rev=432f3b63d2f7
(In reply to comment #5)
> I'm not sure what is the purpose of Tp5 test and what exactly it measures.
> Anyway, if the browser starts without a profile, then it includes cache
> creation. Without this patch we write about 22kB (size of _CACHE_00[1,2,3]_ and
> _CACHE_MAP_) and with this patch we write about 12MB. Impact of more I/O
> operations will be probably much smaller after fixing bug #763555, but even
> then the I/O will affect the startup time.

Tp5 measures page load time.  You can find more info here: <https://wiki.mozilla.org/Buildbot/Talos#tp5>.

Updated

5 years ago
Summary: Avoid fragmenting cache files → Avoid fragmenting cache files (use fallocate)
(Assignee)

Comment 8

5 years ago
Created attachment 697997 [details] [diff] [review]
patch v2 - don't call fallocate() on MacOSX

Comment 4 refers only to MacOSX, so it seems that we could call fallocate() on all other platforms. Is it possible to see that Tp5 test on try server? I'd like to test it before landing. BTW I can't even navigate to the problematic graph on http://graphs.mozilla.org using "Custom Chart". There is no "Tp5 No Network Row Major Shutdown MozAfterPaint" test when I choose Mozilla-Inbound.
Attachment #687065 - Attachment is obsolete: true
(In reply to Michal Novotny (:michal) from comment #8)
> Is it possible to see that Tp5 test on try server?

If this is addressed at me, I'm not sure.  Ask on #developers?

Comment 10

5 years ago
One way to find out if this one line patch impacts TP5 is to land it on mozilla-central and monitor the dashboard, and if indeed impacting TP5 back it out.
> on all other platforms. Is it possible to see that Tp5 test on try server?

land your repo without any patches onto try to get a control run
try: -b o -p macosx64 -u none -t tpn

and then add your patch and land it again. Now you've got apples to apples to compare something to.

in tbpl click on tp and trigger 3 or 4 runs of the tp test for both builds.

when you click on tp I believe you want to look at tp5n_paint for comparisons. There will be some variation between runs but ime this is generally enough to pretty clearly see if your patch is causing a problem.
(Assignee)

Comment 12

5 years ago
Thanks Patrick. The patch doesn't seem to affect Linux and Windows:

with patch - https://tbpl.mozilla.org/?tree=Try&rev=f38cbcf500c7   
w/o patch  - https://tbpl.mozilla.org/?tree=Try&rev=46cde1b200bc
(Assignee)

Comment 13

5 years ago
https://hg.mozilla.org/integration/mozilla-inbound/rev/af7d77467148
https://hg.mozilla.org/mozilla-central/rev/af7d77467148
Status: NEW → RESOLVED
Last Resolved: 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla21
(Assignee)

Updated

5 years ago
Depends on: 835099
Depends on: 854863
You need to log in before you can comment on or make changes to this bug.