Test sentry with new Bouncer database

RESOLVED FIXED

Status

mozilla.org Graveyard
Server Operations
RESOLVED FIXED
9 years ago
3 years ago

People

(Reporter: wenzel, Assigned: wenzel)

Tracking

Details

(Assignee)

Description

9 years ago
Thanks for setting up the tuxedo staging instance over in bug 543452. Now we need to check out sentry with the new, locale-aware data.

Justdave or Jeremy, please give sentry* (and its multi-process incarnation) a shot. Do you have time to do that next week?

What would be important to know is whether there are any bugs left that need to be fixed, and whether it performs well enough considering the number of files we check now.

*) you should find it in the tuxedo/sentry directory.

Comment 1

Can we get an ETA for when someone might have time for this?

Comment 2

9 years ago
Dave is on PTO until the 28th. I can run it, but I'm not really aware of what the current problems are and what we are trying to solve. 

Does this need to happen before he is back from PTO?

Comment 3

Fred - would it suffice to get a new dump of the db and test it on khan? Also, another option would be to just run it and have Fred analyze the results. I don't think we're looking for much as far as IT is concerned aside from just running it. What do they need to do, Fred?

Comment 4

9 years ago
I should be able to get you a dump if that's the easiest. I haven't checked, but you might already have one in the dbdump directory on khan.
(Assignee)

Comment 5

9 years ago
Yes, I think I can run this on khan to see how it behaves / how long it takes. If it's fine on khan already, we'll be good in production. If it's borderline on khan, we'll have to see.

Jeremy, can you get me a current Bouncer db dump and drop it on khan?
(Assignee)

Comment 6

9 years ago
(In reply to comment #5)
> Jeremy, can you get me a current Bouncer db dump and drop it on khan?

nvm, I have a pretty current one that should allow me to get an overview. I'm doing a full run of sentry-multi.pl. It forked about 135 parallel processes, but it's I/O-bound, so that shouldn't kill the box. Full-blown Perl processes are kind of expensive though, so I might use threads instead of new processes and see how that performs.
(Assignee)

Comment 7

9 years ago
Okay. I changed sentry to spawn 16 children, each taking on one of our mirrors, and every time one exits a new one is spawned to work down the mirror list. Each mirror is checked for each of approx. 95,000 files.
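
For readers who want to picture the scheduling described above, here is a minimal sketch in Python. It is not the sentry code: sentry is Perl and forks child processes, while this sketch swaps in a thread pool for brevity. The names `check_mirror`, `run_sentry`, and `fake_probe` are hypothetical, and `fake_probe` stands in for whatever HTTP request the real checker makes.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
import time


def check_mirror(mirror, paths, probe):
    """Check every path on one mirror; return (mirror, ok_count, elapsed_seconds)."""
    start = time.monotonic()
    ok_count = sum(1 for path in paths if probe(mirror, path))
    return mirror, ok_count, time.monotonic() - start


def run_sentry(mirrors, paths, probe, max_workers=16):
    """Keep at most max_workers mirror checks in flight until the list is done.

    Each worker handles one mirror at a time; when it finishes, the pool
    hands it the next mirror from the queue.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(check_mirror, m, paths, probe) for m in mirrors]
        for future in as_completed(futures):
            mirror, ok_count, elapsed = future.result()
            results[mirror] = (ok_count, elapsed)
    return results


def fake_probe(mirror, path):
    # Hypothetical stand-in for an HTTP check: pretend one mirror lacks .mar files.
    return not (mirror == "mirror-b" and path.endswith(".mar"))


if __name__ == "__main__":
    report = run_sentry(["mirror-a", "mirror-b"], ["a.exe", "b.mar"], fake_probe)
    for name, (ok_count, elapsed) in sorted(report.items()):
        print(name, ok_count, round(elapsed, 3))
```

Because the work is I/O-bound, raising `max_workers` mostly trades memory for wall-clock time, which is the knob the later test with 64 children turns.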

A full run took right about 6:30 hours. I killed the last remaining child because it was checking one very slow mirror, but all others were done.

I'll run a new test tomorrow with the following plan:
- add log statements to get exact timestamps for each mirror as well as the overall time. This way we can get a 95% quantile or so instead of judging everything by the slowest mirror.
- Increase the child count: 64 parallel children would probably not hurt.
- Finally, I might write a little script to fix bug 547711, and see how far that cuts down on the 95,000 files. Fewer 404s to check for => faster sentry.
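
The 95% quantile idea from the first item of the plan can be sketched like this, assuming per-mirror durations have already been collected from the log timestamps. The function name `nearest_rank_percentile` and the sample durations are made up for illustration, and nearest-rank is only one of several common percentile definitions.

```python
import math


def nearest_rank_percentile(durations, percent=95):
    """Return the smallest duration that at least `percent`% of values do not exceed."""
    if not durations:
        raise ValueError("need at least one duration")
    ordered = sorted(durations)
    # Nearest-rank method: 1-based rank, rounded up.
    rank = max(1, math.ceil(percent * len(ordered) / 100))
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical per-mirror run times in minutes: 19 ordinary mirrors and one straggler.
    minutes = list(range(1, 20)) + [390]
    print("slowest mirror:", max(minutes))
    print("95th percentile:", nearest_rank_percentile(minutes))
```

With one very slow mirror among twenty, the maximum is dominated by the straggler while the 95th percentile reflects the rest of the pool.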
Assignee: nobody → fwenzel
Depends on: 547711
(Assignee)

Comment 8

9 years ago
Ran another test:

- I was wrong yesterday, not all 90k were checked, but merely around 4k locations whose products are marked as "checknow". If I understand it right, this is to not run a full mirror check every time, but instead check the most important products frequently, and all others less frequently.
- With 64 children, all but one active mirror were done in 2:30 hours.
- Fixing bug 547711 did not help a whole lot: it reduced the number of locations by only 700.

I'll file another bug to run tests for the (PHP) bounce script itself, followed by a request to push this live.
Status: NEW → RESOLVED
Last Resolved: 9 years ago
Resolution: --- → FIXED
(Assignee)

Updated

9 years ago
Blocks: 547992
Product: mozilla.org → mozilla.org Graveyard