Closed Bug 459197 Opened 11 years ago Closed 11 years ago

[meta] Ts regression by enabling Places Fsync stuff

Categories

(Firefox :: Bookmarks & History, defect, P1, major)

defect

Tracking

()

VERIFIED FIXED
Firefox 3.1b2

People

(Reporter: mak, Unassigned)

References

Details

(Keywords: perf, verified1.9.1)

Attachments

(2 files)

the latest push of fsync stuff caused a Ts and a Tp regression, and was backed out to investigate on possible ways of fixing it.

We should investigate on possible solutions to solve both issues, actual ideas could be
- move some of the visit insert trigger work to an async query (update visit_count later?)
- reduce nsPlacesDBFlush init impact (add a delayed start for stuff not needed at startup)
- move sync from places background to an async query and discard PlacesBackground (so we save time to create the thread)
- do cleanup of init in all places services to reduce their impact on startup

Shawn has more precise numbers on the regressions
Let's try to keep these two regressions separate since the issues are impacted by different bits of code.  I've filed bug 459235 for the Tp regression.
Summary: [meta] Ts and Tp regressions by enabling Places Fsync stuff → [meta] Ts regression by enabling Places Fsync stuff
Linux Ts regression (http://tinyurl.com/4txf2j) is ~3%

Mac OS X
10.4 (http://tinyurl.com/3vuv42) might show a regression, but the numbers are pretty noisy, so it's hard to tell.
10.5 (http://tinyurl.com/54c99u) doesn't look like there is a regression either.

Windows
Vista (http://tinyurl.com/435ete) looks like there is a ~3.4% regression
XP (http://tinyurl.com/3q7hpn) shows a ~4% Ts regression
Neither windows boxes seemed to have dropped back down to pre-regression levels after the backout (yay!)
i suppose the ts hit could also be caused by us creating the temp tables and their indexes, those are new tasks we have to do at every browser start.
Depends on: 459491
Depends on: 455474
exacly on every run we are creating:

- 2 temp tables
- 8 indexes on temp tables
- 2 views
- 6 triggers

all of these are required for correct work of sync and were not present before, we could at last discard some index since the temp tables should be always small sized
How much time do we actually spend generating those?  That will give us a good idea on that being the cause or not.
Attached file shark profile
This data seems mostly on par with what Marco is seeing on windows, so I figure I'll attach it.
on Windows we are suffering a lot in InitDBFile due to NSLocalFile::load, and we suffer a bit from replaceSubstring (from createAutocomplete).

Since on both platforms most time is spent in initializing statements that will be used later we could reconsider asking for a delayed initialization in mozStorage (statements initialized on first use). 

But first steps will be getting rid of stuff not needed at init.
This shows even more interesting data :)
Attachment #342988 - Attachment is patch: false
Depends on: 459773
Depends on: 459776
Depends on: 459781
Depends on: 459786
Depends on: 459787
Depends on: 459788
Depends on: 459789
ts appear good on latest tryserver builds, i think we can finish reviewing things and land.
There's a few more issues that I filed last night that I didn't put patches up for.  I want to do those too to make sure we hit this hard.  Maybe we'll get a Ts win? :)
Depends on: 459934
I'm pretty sure we got this with all of our fixes.
New numbers - sadly I'm still seeing regressions :(

Linux http://tinyurl.com/5pqcop (1-2%)
OS X 10.4 http://tinyurl.com/5p9w4a (none?)
OS X 10.5 http://tinyurl.com/6nvf73 (none?)
Windows XP http://tinyurl.com/5saqwg (1-2%)
Windows Vista http://tinyurl.com/5gzcld (2.5%)
Depends on: 460822
Fsync has landed again. Per the discussion with drivers et al, the Ts regression will block the 3.1 release, marking blocking+.
Flags: blocking-firefox3.1+
New numbers!  We landed between 4:00PM and 5:00PM on these graphs

Linux http://tinyurl.com/686ylj
Of these two boxes, it looks like one shows a 0.6% regression, and the other one shows a 3.6% increase.  There's one other machine, but it jumps all over the place so I don't think it's trustable.  I should also note that it looks like we had a Ts regression on Linux before we landed this yesterday.  Awesome.
The range in ms is 8 - 51 ms.  I should note that the numbers are a bit weird here.  The first changeset that built us didn't really show a regression, then the following one showed a bigger jump, but then we drop back down.  These are a bit noisy, so I'm not sure how reliable they are.

OS X 10.4 http://tinyurl.com/6jfwf6
no regression

OS X 10.5 http://tinyurl.com/6jfwf6
really really noisy, but no regression

Windows XP http://tinyurl.com/6jfwf6
1.1% - 1.7% regression (10 - 20 ms)

Windows Vista http://tinyurl.com/6jfwf6
0.6% - 2% regression (4 - 12 ms)
Depends on: 462050
Depends on: 462077
Shawn, all links are the same...

i see a sort of Ts regression BEFORE our push, between 11:30am-1:00pm, is that real?
Sorry - synergy was failing for me with copy and paste yesterday.  I apparently didn't catch it in time for this bug.

I mention that there might be a Ts regression before our push for linux, but I didn't see it in any other platform, so I don't know.  Someone should investigate that probably.
i see that also on windows... one regression on 27 and one on 28 before our push (if times on graphs are the same as for pushlog). My fear is that people will think those regressions are due to this, while it is not.
You actually have to look at when tinderbox started building our changeset.  For linux and mac, this is closer to 4:00 PM, but for windows it was closer to 5:00PM
ops, i meant there is a Ts regression on 27 before the push and one on 28 fairly after the push (and much worst than fsync one). The 27 regression is before 4:27pm, while i don't see any results from our check-in before 5:14pm, so that's not us.
Tracemonkey merged after we did as well.  Can you link to graphs that you are seeing this for reference purposes please?
it's quite difficult associate correct changesets for every Ts number (i hope there's a better way than comparing waterfall with graphs), so actually the regression i was seeing on 27 was really us (i was looking a bit too late), while previously there is some noise.
The 28th regression is there instead: http://tinyurl.com/6xyu4k
Sorry for misunderstoodment.
I landed bug 462077 earlier today

On linux, the numbers are pretty noisy.  However, looking at the chart from when we landed, and now, I think we are even (unclear however):
http://tinyurl.com/5azeth

Windows machines have been down all day, so I'm not sure of the results there yet.
is there a bug to track the 28th Ts regression on Windows? i can't find one, was not Mossop looking at a range for that?
Keywords: perf
Priority: -- → P1
Target Milestone: --- → Firefox 3.1b2
Looking at numbers for bug 462077 for windows now that we have some numbers:

Windows XP http://tinyurl.com/69yymm
0 - 6ms win

Windows Vista http://tinyurl.com/64sc38
1ms loss to 4ms win

Sadly, we didn't have perf numbers for a large range of checkins here.

Looking at Linux again, I was able to get a better idea was to what is going on.
http://tinyurl.com/5qsf9a
One machine is quite sporadic and has a very long time between data points, so I"m not going to count it here.  Another machine has the same long time between data points, but it's fairly consistent, so I'm counting it here.  Overall, we have a 8 - 23ms win.
For what it is worth, I think we've gotten our linux Ts regression down to zero, and are really close with windows.
shawn relanded bug 462050, here are the results:

linux: 0 - 98ms win
xp:    7 - 8ms win
vista: 5 - 6ms win

graphs here: https://bugzilla.mozilla.org/show_bug.cgi?id=462050#c5 (note: i threw away aberrant numbers when it was 4-to-1 in favor of a given direction)

original regression:

linux: 8 - 51 ms
xp:   10 - 20ms
vista: 4 - 12ms

the wins from bug 452050 and bug 462077 combined should be enough to offset the fsync Ts hit. sayrer, beltzner: can you take a look, and comment as to whether these improvements are sufficient?
Looks great to me. Thanks for the hard work on this, guys.
No longer depends on: 460822, 462050, 462077
Priority: P1 → --
Target Milestone: Firefox 3.1b2 → ---
(In reply to comment #29)
> Looks great to me. Thanks for the hard work on this, guys.

yeah, looks good
Severity: normal → major
Status: NEW → RESOLVED
Closed: 11 years ago
Priority: -- → P1
Resolution: --- → FIXED
Target Milestone: --- → Firefox 3.1b2
verified per comment 28
Status: RESOLVED → VERIFIED
Keywords: verified1.9.1
Keywords: fixed1.9.1
Bug 451915 - move Firefox/Places bugs to Firefox/Bookmarks and History. Remove all bugspam from this move by filtering for the string "places-to-b-and-h".

In Thunderbird 3.0b, you do that as follows:
Tools | Message Filters
Make sure the correct account is selected. Click "New"
Conditions: Body   contains   places-to-b-and-h
Change the action to "Delete Message".
Select "Manually Run" from the dropdown at the top.
Click OK.

Select the filter in the list, make sure "Inbox" is selected at the bottom, and click "Run Now". This should delete all the bugspam. You can then delete the filter.

Gerv
Component: Places → Bookmarks & History
QA Contact: places → bookmarks
You need to log in before you can comment on or make changes to this bug.