Closed Bug 1540777 Opened 8 months ago Closed 6 months ago

Fresh profile page load regressions from enabling LSNG on Mar 8th, 2019

Categories

(Core :: Storage: localStorage & sessionStorage, defect, P1)

defect

Tracking

()

RESOLVED FIXED
mozilla68
Tracking Status
firefox68 --- fixed

People

(Reporter: denispal, Assigned: janv, NeedInfo)

References

(Blocks 2 open bugs, Regression)

Details

(Whiteboard: [qf:p1:pageload])

Attachments

(1 file)

These were measured on the Acer Aspire E 15 running Windows 10 using April 27th Nightly:

pageLoadTime:                                             LSNG off                  LSNG on
----------------------------------------------------------------------------------------------------------
www.amazon.com                                    |    648.33 (+-  32)      |  1747.20 (+- 276) -169.49% |
www.cbc.ca                                        |   1678.71 (+-  56)      |  4229.00 (+- 177) -151.92% |
www.cbsnews.com                                   |    509.00 (+-  36)      |   515.00 (+-  36)   -1.18% |
www.cnbc.com                                      |   5877.11 (+- 216)      | 11622.40 (+- 980)  -97.76% |
www.dw.com                                        |   1903.67 (+-  51)      |  2735.80 (+- 170)  -43.71% |
www.expedia.com                                   |   2835.56 (+-  86)      |  4183.83 (+- 365)  -47.55% |
www.foxnews.com                                   |   5223.50 (+- 262)      |  4402.67 (+- 368)   15.71% |
www.huffingtonpost.com                            |   3177.00 (+-1555)      |  3595.17 (+- 321)  -13.16% |
www.imgur.com                                     |   5179.20 (+- 405)      |  5048.50 (+- 356)    2.52% |
www.indiatimes.com                                |   2851.22 (+-  59)      |  5696.00 (+- 540)  -99.77% |
www.instagram.com-justintimberlake-               |   1833.29 (+- 112)      |  2534.83 (+- 227)  -38.27% |
www.news.com.au                                   |   3325.11 (+- 302)      | 11250.20 (+- 947) -238.34% |
www.nytimes.com                                   |   6446.22 (+- 128)      | 12250.00 (+- 747)  -90.03% |
www.reddit.com                                    |   4005.11 (+- 162)      |  5453.44 (+- 444)  -36.16% |
www.shutterstock.com                              |   2761.33 (+-  93)      |  3822.80 (+- 509)  -38.44% |
www.thedailybeast.com                             |   3049.11 (+- 265)      |  5347.33 (+- 346)  -75.37% |
www.thehindu.com                                  |   1770.44 (+-  77)      |  2668.50 (+-  71)  -50.72% |
www.tripadvisor.com                               |   1803.11 (+- 103)      |  1786.44 (+- 101)    0.92% |
www.twitch.tv                                     |   2071.67 (+-  54)      |  3412.29 (+- 338)  -64.71% |
www.usnews.com                                    |   3454.67 (+- 251)      |  5153.80 (+-1841)  -49.18% |
www.youtube.com                                   |   2780.22 (+- 219)      |  3790.75 (+- 373)  -36.35% |


SpeedIndex:
----------------------------------------------------------------------------------------------------------
www.amazon.com                                    |   3010.11 (+-  81)      |  3876.80 (+- 446)  -28.79% |
www.cbc.ca                                        |   3330.75 (+- 280)      |  4963.83 (+- 327)  -49.03% |
www.cbsnews.com                                   |   1737.11 (+-  49)      |  2755.38 (+- 275)  -58.62% |
www.cnbc.com                                      |   1555.33 (+- 120)      |  3657.20 (+- 288) -135.14% |
www.dw.com                                        |   1789.33 (+-  70)      |  2655.40 (+- 168)  -48.40% |
www.expedia.com                                   |   2942.33 (+- 148)      |  4457.71 (+- 423)  -51.50% |
www.foxnews.com                                   |   1367.71 (+-  61)      |  2946.33 (+- 284) -115.42% |
www.huffingtonpost.com                            |   1404.14 (+- 112)      |  2673.71 (+- 229)  -90.42% |
www.imgur.com                                     |   2316.00 (+- 202)      |  3130.86 (+- 287)  -35.18% |
www.indiatimes.com                                |   1446.11 (+-  88)      |  2566.60 (+- 231)  -77.48% |
www.instagram.com-justintimberlake-               |   1370.44 (+-  61)      |  2327.50 (+- 145)  -69.84% |
www.news.com.au                                   |   1062.44 (+-  68)      |  3019.80 (+- 246) -184.23% |
www.nytimes.com                                   |   3118.78 (+- 114)      |  5622.00 (+- 471)  -80.26% |
www.reddit.com                                    |   1191.00 (+-  61)      |  1390.38 (+-  99)  -16.74% |
www.shutterstock.com                              |   2874.67 (+-  99)      |  3990.40 (+- 501)  -38.81% |
www.thedailybeast.com                             |   1381.33 (+- 107)      |  1587.60 (+- 132)  -14.93% |
www.thehindu.com                                  |   1394.33 (+-  35)      |  1851.00 (+- 125)  -32.75% |
www.tripadvisor.com                               |   4921.67 (+- 132)      |  6610.44 (+- 644)  -34.31% |
www.twitch.tv                                     |   2816.78 (+- 179)      |  4334.38 (+- 331)  -53.88% |
www.usnews.com                                    |    538.62 (+-  45)      |   991.40 (+-  90)  -84.06% |
www.youtube.com                                   |   1923.78 (+-  53)      |  2872.14 (+- 144)  -49.30% |


FirstVisualChange:
----------------------------------------------------------------------------------------------------------
www.amazon.com                                    |    570.33 (+-  36)      |  1660.00 (+- 272) -191.06% |
www.cbc.ca                                        |   1272.22 (+-  55)      |  2236.17 (+- 169)  -75.77% |
www.cbsnews.com                                   |    385.11 (+-  25)      |   390.67 (+-  28)   -1.44% |
www.cnbc.com                                      |    853.78 (+-  65)      |  2366.60 (+- 337) -177.19% |
www.dw.com                                        |   1527.78 (+-  77)      |  2400.20 (+- 162)  -57.10% |
www.expedia.com                                   |    590.67 (+-  32)      |  2096.60 (+- 190) -254.95% |
www.foxnews.com                                   |    892.67 (+-  30)      |  2563.40 (+- 169) -187.16% |
www.huffingtonpost.com                            |    990.43 (+-  94)      |  2161.33 (+- 134) -118.22% |
www.imgur.com                                     |    989.17 (+-  89)      |  1924.83 (+- 138)  -94.59% |
www.indiatimes.com                                |   1102.00 (+-  41)      |  2190.00 (+- 229)  -98.73% |
www.instagram.com-justintimberlake-               |    193.20 (+-   9)      |   185.71 (+-  15)    3.87% |
www.news.com.au                                   |    835.11 (+-  53)      |  2523.40 (+- 193) -202.16% |
www.nytimes.com                                   |    881.33 (+-  68)      |  2206.86 (+- 215) -150.40% |
www.reddit.com                                    |    540.67 (+-  38)      |   538.83 (+-  49)    0.34% |
www.shutterstock.com                              |   2488.67 (+- 103)      |  3596.80 (+- 481)  -44.53% |
www.thedailybeast.com                             |    454.12 (+-  41)      |  1397.00 (+-  52) -207.62% |
www.thehindu.com                                  |    713.00 (+-  28)      |   720.22 (+-  35)   -1.01% |
www.tripadvisor.com                               |   1177.89 (+- 115)      |  1253.67 (+- 111)   -6.43% |
www.twitch.tv                                     |    263.00 (+-  22)      |  1800.00 (+- 188) -584.41% |
www.usnews.com                                    |    390.89 (+-  19)      |   391.88 (+-  20)   -0.25% |
www.youtube.com                                   |   1200.00 (+-  46)      |  2164.14 (+- 139)  -80.35% |


VisualComplete85:
----------------------------------------------------------------------------------------------------------
www.amazon.com                                    |  10231.67 (+- 314)      | 11824.88 (+- 952)  -15.57% |
www.cbc.ca                                        |   5249.71 (+- 452)      |  7833.50 (+- 538)  -49.22% |
www.cbsnews.com                                   |   5068.44 (+- 220)      |  8645.88 (+- 762)  -70.58% |
www.cnbc.com                                      |   1472.50 (+- 141)      |  3303.40 (+- 321) -124.34% |
www.dw.com                                        |   2313.00 (+-  74)      |  3253.00 (+- 301)  -40.64% |
www.expedia.com                                   |   4609.33 (+- 260)      |  6318.62 (+- 625)  -37.08% |
www.foxnews.com                                   |   1228.43 (+-  69)      |  3000.17 (+- 260) -144.23% |
www.huffingtonpost.com                            |   1233.12 (+-  87)      |  2383.43 (+- 184)  -93.28% |
www.imgur.com                                     |   4427.25 (+- 391)      |  5235.00 (+- 381)  -18.24% |
www.indiatimes.com                                |   1719.00 (+- 169)      |  2795.14 (+- 231)  -62.60% |
www.instagram.com-justintimberlake-               |   2050.29 (+- 155)      |  3022.17 (+- 219)  -47.40% |
www.news.com.au                                   |   1287.00 (+- 114)      |  2816.60 (+- 234) -118.85% |
www.nytimes.com                                   |   8275.78 (+- 351)      | 13593.38 (+-1216)  -64.25% |
www.reddit.com                                    |   2897.88 (+- 158)      |  4321.29 (+- 370)  -49.12% |
www.shutterstock.com                              |   3064.56 (+- 108)      |  4143.60 (+- 504)  -35.21% |
www.thedailybeast.com                             |   1733.14 (+- 108)      |  2273.40 (+- 555)  -31.17% |
www.thehindu.com                                  |   1792.67 (+-  43)      |  2733.17 (+-  65)  -52.46% |
www.tripadvisor.com                               |   6818.56 (+- 128)      |  8804.12 (+- 672)  -29.12% |
www.twitch.tv                                     |   4918.67 (+- 190)      |  6235.11 (+- 568)  -26.76% |
www.usnews.com                                    |    893.60 (+-  19)      |  2250.20 (+- 154) -151.81% |
www.youtube.com                                   |   2020.22 (+-  57)      |  2975.86 (+- 141)  -47.30% |

denispal, can you grab some profiles from 58, Jan 1 and Nightly?

Flags: needinfo?(dpalmeiro)
Flags: needinfo?(dpalmeiro)

Denis: Would you be game to do a mozregression to identify a regressing changeset?

Flags: needinfo?(dpalmeiro)
Whiteboard: [qf] → [qf:p1:pageload]
See Also: → 1541552
See Also: → 1541640

Bisecting points to bug 1517090. Seems like a rather large regression on cnbc.

dom.storage.next_gen = true
[2019-04-05 18:14:52] INFO: https://www.cnbc.com 0, backEndTime: 149ms (±6.15ms), firstPaint: 2.14s (±81.76ms), firstVisualChange: 2.23s (±78.53ms), DOMContentLoaded: 4.35s (±207.97ms), Load: 10.74s (±428.77ms), speedIndex: 3644 (±285.34), visualComplete85: 11.25s (±1.29s), lastVisualChange: 13.92s (±260.10ms), rumSpeedIndex: 1076 (±100.26) (10 runs)

dom.storage.next_gen = false
[2019-04-05 18:34:13] INFO: https://www.cnbc.com 0, backEndTime: 242ms (±42.97ms), firstPaint: 1.17s (±63.40ms), firstVisualChange: 1.42s (±91.11ms), DOMContentLoaded: 3.84s (±216.81ms), Load: 6.80s (±169.23ms), speedIndex: 2778 (±240.59), visualComplete85: 8.88s (±740.97ms), lastVisualChange: 10.11s (±185.67ms), rumSpeedIndex: 1253 (±78.49) (10 runs)

Jan, any ideas on how to improve performance here? Thanks.

Component: Performance → DOM: Web Storage
Flags: needinfo?(dpalmeiro) → needinfo?(jvarga)
Regressed by: 1517090

Can you describe/link to the testing methodology as it relates to:
a) flipping the prefs in comment 4
b) how profiles are used for the test

Specifically, there is an upgrade step for the profile when it is first run with the pref enabled, and then a per-origin upgrade step when the origin is first loaded with the pref enabled. So it's quite possible that if the Firefox run is using a profile that had last run with dom.storage.next_gen=false, then we would expect 2 sets of hits from this which could explain added latency.

Also, in comment 4, I'm assuming that's with the March 29th nightly?

Flags: needinfo?(dpalmeiro)
Blocks: 1540402
Flags: needinfo?(jvarga)

(In reply to Andrew Sutherland [:asuth] (he/him) from comment #5)

Can you describe/link to the testing methodology as it relates to:
a) flipping the prefs in comment 4
b) how profiles are used for the test

Specifically, there is an upgrade step for the profile when it is first run with the pref enabled, and then a per-origin upgrade step when the origin is first loaded with the pref enabled. So it's quite possible that if the Firefox run is using a profile that had last run with dom.storage.next_gen=false, then we would expect 2 sets of hits from this which could explain added latency.

Also, in comment 4, I'm assuming that's with the March 29th nightly?

Hi Andrew,

Comment 4 is using April 4th Nightly. I also tested it on 67 as well with the same results.

I'm using browsertime & selenium to drive these tests. Each run (so 20 in total above) uses a completely new profile. So the first set of 10 runs would each use a completely new profile directory that had no changes at all. The second set of 10 runs would also use completely new profile directories, but the pref would be flipped in each one of those.

The exact steps are something like this:

  1. Create a new profile directory, and set any extra prefs
  2. Launch browser and navigate to http://www.example.com
  3. Sit idle for 30 seconds.
  4. Navigate to http://www.cnbc.com and measure page load.
  5. Close browser and repeat 1-4.

If I understand correctly, step 4 may be impacted by an upgrade step that is done only one time?

Flags: needinfo?(dpalmeiro)

If the upgrade is not a problem, then I would say that LS database is bigger than 16KB, so it can't operate in super fast mode from the beginning. Bug 1513915 could help here.
But I would have to debug/profile this to be sure.

Also, what about other major sites ?
Did you also measure performance of google.com, amazon.com, facebook.com ?

Also note that LSNG must write to 2 databases to not lose data when the pref is flipped back to false.
That can be disabled with dom.storage.shadow_writes set to false. Can you re-test cnbc with that ?

You can also try to increase dom.storage.snapshot_prefill. It's currently set to 16384 (bytes).

(In reply to Jan Varga [:janv] from comment #8)

Also, what about other major sites ?
Did you also measure performance of google.com, amazon.com, facebook.com ?

Not yet. Only youtube, twitch and cnbc. I noticed this by accident and just started bisecting. I can do a more comprehensive test later on.

I think the summary should be changed because the regression is probably caused by enabling LSNG on Mar 8th.

Duplicate of this bug: 1541552
Summary: Page load and SpeedIndex regression on cnbc since Jan 1st, 2019 → Multiple page load regressions from enabling LSNG on Mar 8th, 2019
Duplicate of this bug: 1541640

I think we should try to reuse the profile between tests.

Reusing the profile between tests is definitely worthwhile to look at, but it's also a different test. By our terminology, the tests Denis was doing were "Fresh Profile" tests, similar to what a newly-installed user would see. You're suggesting testing one of "Cold load" (running browser, data for site not cached), "warm load" (running browser, data for site cached - such as Reload), or "Fresh Start" (freshly started browser and a cold load of the site).

In this case, there are complications around first time the pref changes or first run with the new code. I'm not concerned about the one-time perf hits for switching (though it's always good to know what they are before landing; I presume you knew that). This is not the case in this bug (which is a fresh profile; no upgrade needed).

Note that "Fresh Profile" may also imply some performance differences to "fresh Start" or plain "cold load", since various files may have a first-time-used-creation-cost.

I understand, but if this doesn't happen with reused profiles, I wouldn't worry so much.
A typical user doesn't create a new profile for each browser session.
But as you pointed out, we should investigate why is this happening.

I implemented an optimization that could help here, bug 1513915.
There are links for try builds in the bug.

Anyway, I still wonder why talos tp6 and raptor tp6 didn't catch this.
www.youtube.com is definitely covered by tp6.
tp6 probably reuses profiles.

I ran tp6 tests on try when I was implementing/finishing LSNG, and there were no perf regressions.

I can think of one possible problem that can slightly affect performance after creating new profile.
Quota manager must create storage/ls-archive.sqlite if it doesn't exist.
webappstore.sqlite must be created too.

One more suggestion, there are 10 tests currently and each one creates a new profile and visits given URL once.
What about vising given URL 10 times for each new profile ?

I tried a few different combinations, and it seems like this regression does not show up in any type of warm load which explains why tp6 did not catch this. It also does not show up if I cold load the page using the same profile in each load. However, if I use a new profile in each session, then this seems to consistently show up when I visit the target page for the first time, even if I load a bunch of random other websites first (I tried navigating to amazon, reddit, cnn before finally goiing to cnbc).

Summary: Multiple page load regressions from enabling LSNG on Mar 8th, 2019 → Fresh profile page load regressions from enabling LSNG on Mar 8th, 2019

Ok, that's what I suspected. It shouldn't affect normal daily browsing sessions.
Anyway, this probably means that nothing triggers storage initialization when the browser is launched which is a bit surprising. I would expect that there is some chrome IndexedDB or something that triggers it.
We can easily fix this by creating a timer immediately after firefox launches (maybe just 1 second timeout) and once it fires we trigger storage initialization, so when a page needs local storage there would be less (I/O) work to do.
The timer is needed to not slow down initial launch of chrome by touching disk too much.

Actually I was wrong a bit. When I create a new profile and launch the browser, the storage gets initialized, and these directories/files exist:
<profile>/storage/
<profile>/storage/ls-archive.sqlite

However these directories/files don't exist:
<profile>/storage/default/https+++www.cnbc.com/
<profile>/storage/default/https+++www.cnbc.com/.metadata
<profile>/storage/default/https+++www.cnbc.com/.metadata-v2
<profile>/storage/default/https+++www.cnbc.com/ls/
<profile>/storage/default/https+++www.cnbc.com/ls/data.sqlite
<profile>/storage/default/https+++www.cnbc.com/ls/usage

So when I go to www.cnbc.com all these have to be created when local storage is accessed for first time.

We implemented preloading which means we trigger such initialization ASAP, but if the page tries to access local storage very quickly there's a chance that the preload hasn't finished yet and there's a delay in the end.

This can be optimized in future by collecting inforamation about origins that have data for local storage during storage initialization done by quota manager. So when we get a preload request and we know there's nothing stored we can respond quickly and create dirs/files for given origin asynchronously. The problem is that archived data (data already stored by old implementation) are currently lazily imported during first access which makes this even more complicated.

Does the storage get initialized for each new website visited, or is it domain based? Just trying to gauge on whether users will see a performance hit every time they visit a website they have never visited before which I believe is a very common case.

It's per origin. Anyway, we need to create those dirs and files asynchronously.

P2, given this requires quite risky changes and we are now trying to stabilize for initial release/enabling LSNG in FF 68.

Priority: -- → P2

Originally, quota in LSNG should have been tracked as sqlite file size, but later we changed that to tracking logical size of the database. So for trackig sqlite file size, the database had to be synchronously created before allowing access to local storage from a content process. Now we can avoid that, because we don't track sqlite file size anymore.
However, as I mentioned above, this is quite risky change.

Depends on: 1546752

I'm going to investigate this again, how much time would be needed to fix this and also the risk factor.

Assignee: nobody → jvarga
Priority: P2 → P1

I have a real concern here: our testing (raptor, talos, local testing on development laptops/desktops) is largely/completely based on SSDs. Rotating disks, especially on low-end windows laptops, are slow as molasses at creating files (even at opening files). We've seen this in startup profiles on reference laptops when just opening the safebrowsing databases - that's taking around 6 seconds.

A design where we create 2 directories and 4 files for each site that uses local storage concerns me (also, visiting those sites on a rotating-disk machine like a reference laptop may be slow as well).

Has any performance measurements on create time or open time been done, and have they been done on rotating disk vs SSD or on a reference laptop? Do we have telemetry on this in the LSNG code as landed? If so, can we filter the LSNG telemetry based on cores/rotating disk/etc? (I suspect the answer is "no" to all of these, but I need to ask).

Perhaps we're good on this except for the specific case flagged here; I doubt that though.

NI to janv for answers; NI to mconley for his info since he's looked at reference laptop startup file access times. NI to florian since he's looked at file access issue also

Flags: needinfo?(mconley)
Flags: needinfo?(jvarga)
Flags: needinfo?(florian)

No need to worry that much anymore, I have a patch that moves creation of empty databases to the connection thread, which means they are created asynchronously after 5 seconds.

Flags: needinfo?(jvarga)

If the database already exists, opening of such database is not a big issue with LSNG because:

  1. We asynchronously start a local storage preload operation when a page is going to be loaded
    This happens much sooner with LSNG because it's triggered in ContentParent::AboutToLoadHttpFtpDocumentForChild (old implementation triggers it from a content process). So there's a better chance that we have everything in memory already when a content process actually asks for data.

  2. Data is loaded from small per origin databases
    LSNG uses small per origin databases rather than a large common database which gets fragmented after some time (It's normal to see 200MB webappsstore.sqlite with old implementation). One thing is fragmentation, other thing is that we can easily remove data for an origin. A common database would have to be vacuumed to really remove data.

  3. Not all data has to be transferred to content process
    Old implementation transfers everything to a content process (max 5 MB).
    New implementation has a concept of snapshots which are highly optimized to transfer less data over IPC.
    It's also highly optimized to use a minimum number of sync IPC calls.

  4. Data is stored in UTF8 (as opposed to UTF16) and also compressed if possible
    This makes it use less memory which should help with eliminating OOM crashes especially on 32 bit platforms (this hasn't landed yet, patches need to be reviewed this week).

So all in all, we start preloading sooner and we load less data from disk.

However, I agree that creating of a new sqlite database which needs to use fsync in the end, can be a problem on very slow disks.
But as I said, I have a patch for that.

(In reply to Jan Varga [:janv] from comment #28)

If the database already exists, opening of such database is not a big issue with LSNG because:

  1. We asynchronously start a local storage preload operation when a page is going to be loaded
    This happens much sooner with LSNG because it's triggered in ContentParent::AboutToLoadHttpFtpDocumentForChild (old implementation triggers it from a content process). So there's a better chance that we have everything in memory already when a content process actually asks for data.

Does this mean we're doing a file-exists/fstat operation on <profile>/storage/default/https+++<origin> on every navigation, or do we know from some other thing if there is expected to be a file of localstorage for that navigation?

  1. Data is loaded from small per origin databases
    LSNG uses small per origin databases rather than a large common database which gets fragmented after some time (It's normal to see 200MB webappsstore.sqlite with old implementation). One thing is fragmentation, other thing is that we can easily remove data for an origin. A common database would have to be vacuumed to really remove data.

Sure; I'm certain the previous design had problems as you mention. That doesn't mean there might still be (different) problems with the replacement. My main concern with spinning disks (especially on low-end HW) is not so much file size (spinning disks can often transfer at a reasonable rate), but with time-to-open (which often requires multiple seeks if not cached).

So all in all, we start preloading sooner and we load less data from disk.

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #31)

(In reply to Jan Varga [:janv] from comment #28)

If the database already exists, opening of such database is not a big issue with LSNG because:

  1. We asynchronously start a local storage preload operation when a page is going to be loaded
    This happens much sooner with LSNG because it's triggered in ContentParent::AboutToLoadHttpFtpDocumentForChild (old implementation triggers it from a content process). So there's a better chance that we have everything in memory already when a content process actually asks for data.

Does this mean we're doing a file-exists/fstat operation on <profile>/storage/default/https+++<origin> on every navigation, or do we know from some other thing if there is expected to be a file of localstorage for that navigation?

Yeah, we currently do, but it can be easily optimized by checking a hashtable first, I'll file a bug and fix it.

(In reply to Randell Jesup [:jesup] (needinfo me) from comment #31)

  1. Data is loaded from small per origin databases
    LSNG uses small per origin databases rather than a large common database which gets fragmented after some time (It's normal to see 200MB webappsstore.sqlite with old implementation). One thing is fragmentation, other thing is that we can easily remove data for an origin. A common database would have to be vacuumed to really remove data.

Sure; I'm certain the previous design had problems as you mention. That doesn't mean there might still be (different) problems with the replacement. My main concern with spinning disks (especially on low-end HW) is not so much file size (spinning disks can often transfer at a reasonable rate), but with time-to-open (which often requires multiple seeks if not cached).

File size can be a problem if (relatively small) data that you want to load is spread across entire big file, in other words, when the file/database is heavily fragmented.
Also, one big sqlite database can't be easily used by multiple threads at the same time. LSNG still uses one thread, but can be improved in future, given per origin databases.

Filed bug 1547688.

I edited the description with the comprehensive run on the 2017 reference machine which uses a HDD.

And, can I run these tests myself ?

So with the attached patch, number of synchronously created dirs/files is reduced from:
<profile>/storage/default/<origin>/
<profile>/storage/default/<origin>/.metadata
<profile>/storage/default/<origin>/.metadata-v2
<profile>/storage/default/<origin>/ls/
<profile>/storage/default/<origin>/ls/data.sqlite
<profile>/storage/default/<origin>/ls/usage

to:
<profile>/storage/default/<origin>/
<profile>/storage/default/<origin>/.metadata
<profile>/storage/default/<origin>/.metadata-v2

.metadata won't be created anymore after bug 1546752 is fixed (a patch is already attached)

The preload operation won't fstat <profile>/storage/default/<origin>/ after bug 1547688 is fixed (a patch is already attached)

I'm also investigating how to avoid creating these two:
<profile>/storage/default/<origin>/
<profile>/storage/default/<origin>/.metadata-v2
I'll file a bug once I know more.

(In reply to Jan Varga [:janv] from comment #36)

Can you do that using a build from this try push ?
https://treeherder.mozilla.org/#/jobs?repo=try&revision=25ec95d33d9a468a3e10b58117abd2d12f3b55dc

I only ran pageLoad numbers this time, but the results look a bit better with this build. The regression went from 0.65 geomean to 0.84.

pageLoadTime:                                              LSNG off                 LSNG on
----------------------------------------------------------------------------------------------------------
www.amazon.com                                    |    596.00 (+-  20)      |   840.00 (+-  56)  -40.94% |
www.cbc.ca                                        |   2276.00 (+- 638)      |  3167.00 (+-1053)  -39.15% |
www.cbsnews.com                                   |    458.00 (+-  11)      |   449.00 (+-  19)    1.97% |
www.cnbc.com                                      |   5671.00 (+- 162)      |  7642.00 (+-1422)  -34.76% |
www.dw.com                                        |   1864.00 (+-  63)      |  2136.00 (+- 217)  -14.59% |
www.expedia.com                                   |   2759.00 (+- 260)      |  3707.00 (+- 589)  -34.36% |
www.foxnews.com                                   |   4167.00 (+- 901)      |  5413.00 (+-1347)  -29.90% |
www.imgur.com                                     |   3995.00 (+- 525)      |  5083.00 (+- 974)  -27.23% |
www.indiatimes.com                                |   2746.00 (+- 102)      |  3515.00 (+- 526)  -28.00% |
www.instagram.com-justintimberlake-               |   1792.00 (+-  90)      |  2102.00 (+- 614)  -17.30% |
www.news.com.au                                   |   3130.00 (+- 226)      |  4449.00 (+- 717)  -42.14% |
www.nytimes.com                                   |   6334.00 (+- 106)      |  7709.00 (+- 806)  -21.71% |
www.reddit.com                                    |   3815.00 (+- 132)      |  4486.00 (+- 471)  -17.59% |
www.shutterstock.com                              |   2540.00 (+-  50)      |  2782.00 (+- 148)   -9.53% |
www.stackoverflow.com                             |   1226.00 (+-  66)      |  1329.00 (+- 124)   -8.40% |
www.thedailybeast.com                             |   2957.00 (+- 190)      |  3723.00 (+- 490)  -25.90% |
www.thehindu.com                                  |   1640.00 (+-  55)      |  1855.00 (+- 160)  -13.11% |
www.tripadvisor.com                               |   1681.00 (+- 126)      |  1663.00 (+- 178)    1.07% |
www.twitch.tv                                     |   1944.00 (+- 165)      |  2515.00 (+- 662)  -29.37% |
www.usnews.com                                    |   3876.00 (+-1373)      |  3723.00 (+-1181)    3.95% |
www.youtube.com                                   |   2132.00 (+- 229)      |  3759.00 (+- 437)  -76.31% |
----------------------------------------------------------------------------------------------------------
Geomean: 0.8375

Yeah, it looks much better now, thanks for providing these numbers. We have to get rid of synchronous creation of .metadata and .metadata-v2 which are fsync-ed after creation. Once that's done, we should see no difference between old and new implementation in regards of cold page load.

I'm not sure what's being asked of me here. jesup, what information can I provide that you're looking for?

Flags: needinfo?(mconley) → needinfo?(rjesup)

mconley: mostly just for awareness, and any commentary on file open/create/fstat access on rotating disks/reference laptops

Flags: needinfo?(rjesup)
Depends on: 1548221
No longer depends on: 1548221

I have an additional patch that avoids synchronous I/O completely when a new empty localStorage is requested (cold load).
I need to do more testing, but it looks promising.

Does this do async IO on every cold load?

It does async I/O only if a page mutated localStorage.

Here's a preliminary try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=db8e5ddde67b69af99bb1cf62335956e5b65929b

I still need to cleanup the last patch, but it should be good enough for performance testing.

Depends on: 1548748

FYI, I triggered some linux64 and win64 raptor jobs on these (including some cold-load runs). Warning: because a big fix to mitmproxy landed ~4 days ago, 1 week and longer comparisons in perfherder are very misleading - you must look at the graphs. and 2-day comparisons have only a few samples to compare against (~4) so noise numbers are especially suspect.

Those jobs use SSDs, don't they ?

Anyway, I plan to trigger more complex testing on try this weekend.
Usually, I do two try pushes, one without my patches and one with them.
Comparing talos and raptor between try and m-c can be inaccurate since it can be affected by other improvements or regressions that happened during the week.

(In reply to Jan Varga [:janv] from comment #48)

Those jobs use SSDs, don't they ?

Yes. "ux" raptor jobs use the (limited in number! be kind; try to run off-hours if there is such a thing) "reference" laptops (slow rotating disks, 2 cores). Ask in #perf on IRC for more info

Anyway, I plan to trigger more complex testing on try this weekend.
Usually, I do two try pushes, one without my patches and one with them.
Comparing talos and raptor between try and m-c can be inaccurate since it can be affected by other improvements or regressions that happened during the week.

Agreed!

(In reply to Jan Varga [:janv] from comment #46)

Here's a preliminary try push:
https://treeherder.mozilla.org/#/jobs?repo=try&revision=db8e5ddde67b69af99bb1cf62335956e5b65929b

I still need to cleanup the last patch, but it should be good enough for performance testing.

Thanks, Jan! Using this try build, I no longer see any significant performance impact of flipping the pref on and off. There may be some minor ones (e.g. youtube, reddit on desktop), but it's hard to tell here because I'm testing live sites and the noise is a bit higher than usual. I'd need to do more exhaustive testing to confirm.

On MOTO G5 - GVE:

pageLoadTime:                                             LSNG OFF                   LSNG ON
------------------------------------------------------------------------------------------------------------
www.reddit.com                                    |    3197.75 (+- 917)      |   2933.75 (+- 461)    8.26% |
blogger.com-about                                 |    1609.60 (+-  54)      |   1637.56 (+-  81)   -1.74% |
bbc.co.uk                                         |    3845.00 (+- 113)      |   3877.00 (+- 230)   -0.83% |
www.espn.com-nba-story-_-page-allstarweekend257...|    5413.75 (+- 275)      |   5644.60 (+- 295)   -4.26% |
fashionbeans.com                                  |    5885.86 (+- 501)      |   5864.30 (+- 528)    0.37% |
www.cnbc.com                                      |   11702.80 (+- 241)      |  11529.00 (+- 141)    1.49% |
booking.com                                       |    2032.38 (+-  99)      |   2116.57 (+- 136)   -4.14% |
booking.com-searchresults.en-gb.html?city=20088325|    4211.10 (+- 126)      |   4207.11 (+- 120)    0.09% |
m.ebay-kleinanzeigen.de-s-anzeigen-auf-zeit-wg-...|    4691.50 (+-  50)      |   4897.50 (+- 175)   -4.39% |
m.facebook.com-Cristiano                          |    2689.80 (+-  58)      |   2762.33 (+- 139)   -2.70% |
www.youtube.com-watch?v=COU5T-Wafa4               |    1924.00 (+-  44)      |   1808.25 (+-  79)    6.02% |
discordapp.com-                                   |    4192.78 (+-  79)      |   4296.20 (+-  95)   -2.47% |
m.ebay-kleinanzeigen.de                           |    4212.20 (+- 376)      |   4430.43 (+- 421)   -5.18% |
aframe.io-examples-showcase-animation-            |    4813.20 (+-  74)      |   4963.56 (+- 353)   -3.12% |
------------------------------------------------------------------------------------------------------------
Geomean: 0.9918   

On Acer Aspire E 15:

pageLoadTime:                                            LSNG OFF                   LSNG ON
----------------------------------------------------------------------------------------------------------
www.amazon.com                                    |    821.00 (+- 128)      |   851.67 (+-  71)   -3.74% |
www.cnbc.com                                      |   6194.44 (+-1458)      |  6032.89 (+-1063)    2.61% |
www.expedia.com                                   |   2671.40 (+- 753)      |  2494.00 (+- 674)    6.64% |
www.indiatimes.com                                |   4859.89 (+- 888)      |  4267.89 (+- 342)   12.18% |
www.instagram.com-justintimberlake-               |   1597.00 (+- 196)      |  1829.22 (+- 479)  -14.54% |
www.nytimes.com                                   |   5518.78 (+- 607)      |  5767.00 (+- 528)   -4.50% |
www.reddit.com                                    |   4566.67 (+- 292)      |  4710.56 (+- 283)   -3.15% |
www.tripadvisor.com                               |   2762.56 (+- 189)      |  2840.00 (+- 255)   -2.80% |
www.twitch.tv                                     |   1759.89 (+- 217)      |  1682.78 (+-  46)    4.38% |
www.youtube.com                                   |   5197.78 (+-1221)      |  5656.78 (+-1464)   -8.83% |
------------------------------------------------------------------------------------------------------------
Geomean: 0.9911    


PerceptualSpeedIndex:                                    LSNG OFF                   LSNG ON
----------------------------------------------------------------------------------------------------------
www.amazon.com                                    |    834.25 (+- 122)      |   858.11 (+-  80)   -2.86% |
www.cnbc.com                                      |   3996.75 (+- 988)      |  3999.88 (+- 960)   -0.08% |
www.expedia.com                                   |   2901.11 (+- 466)      |  2683.33 (+- 296)    7.51% |
www.indiatimes.com                                |   2102.33 (+- 171)      |  2070.00 (+- 195)    1.54% |
www.instagram.com-justintimberlake-               |    980.25 (+-  81)      |  1016.44 (+- 132)   -3.69% |
www.nytimes.com                                   |   3670.88 (+- 907)      |  3737.57 (+- 668)   -1.82% |
www.reddit.com                                    |   2187.44 (+- 241)      |  2363.56 (+- 280)   -8.05% |
www.tripadvisor.com                               |   5554.44 (+- 221)      |  5721.78 (+- 325)   -3.01% |
www.twitch.tv                                     |   2457.44 (+- 210)      |  2449.56 (+- 248)    0.32% |
www.youtube.com                                   |   4360.78 (+- 964)      |  4627.56 (+-1115)   -6.12% |
------------------------------------------------------------------------------------------------------------
Geomean: 0.9848


SpeedIndex:                                              LSNG OFF                   LSNG ON
----------------------------------------------------------------------------------------------------------
www.amazon.com                                    |    746.00 (+- 119)      |   753.00 (+-  70)   -0.94% |
www.cnbc.com                                      |   1539.00 (+- 229)      |  1705.56 (+- 376)  -10.82% |
www.expedia.com                                   |   2981.67 (+- 454)      |  2754.22 (+- 354)    7.63% |
www.indiatimes.com                                |   2069.56 (+- 143)      |  2057.22 (+- 193)    0.60% |
www.instagram.com-justintimberlake-               |   1775.25 (+- 140)      |  1872.00 (+- 378)   -5.45% |
www.nytimes.com                                   |   2557.67 (+- 510)      |  1999.83 (+- 554)   21.81% |
www.reddit.com                                    |   2197.11 (+- 392)      |  2330.22 (+- 307)   -6.06% |
www.tripadvisor.com                               |   6312.11 (+- 220)      |  6514.00 (+- 363)   -3.20% |
www.twitch.tv                                     |   2270.22 (+- 179)      |  2328.56 (+- 176)   -2.57% |
www.youtube.com                                   |   4320.11 (+- 942)      |  4581.89 (+-1112)   -6.06% |
------------------------------------------------------------------------------------------------------------
Geomean: 0.9992

Nice! Thanks, :janv.

What to say, thanks everyone who helped with this!

Thank you for landing the fixes so quickly :janv!
Thanks :denispal for finding the issue and verifying the fixes.
Glad we can continue to ride the train!

Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/b5d8d57838a3
LSNG: Move creation of empty databases to the connection thread; r=asuth
Status: NEW → RESOLVED
Closed: 6 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Target Milestone: mozilla68 → ---
Pushed by jvarga@mozilla.com:
https://hg.mozilla.org/integration/mozilla-inbound/rev/7658320df0a9
LSNG: Move creation of empty databases to the connection thread; r=asuth
Status: REOPENED → RESOLVED
Closed: 6 months ago6 months ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla68
Blocks: 1513881
You need to log in before you can comment on or make changes to this bug.