Closed
Bug 383744
Opened 18 years ago
Closed 18 years ago
Load testing for crash reporting system
Categories
(mozilla.org Graveyard :: Server Operations, task)
Tracking
(Not tracked)
RESOLVED
FIXED
People
(Reporter: morgamic, Assigned: morgamic)
Details
Attachments
(2 files)
We need to start load testing the crash reporting setup. The two pieces we'd like to test are the collector and the reporter (web apps).
Page to discuss:
http://wiki.mozilla.org/Breakpad/Design/Loadtesting
We are not sure what our expected req/s is, so the first step would likely be finding the limits of the applications and deciding from there. I discussed this on the wiki page.
Aravind -- is there a good time we can set up to do some preliminary load testing? We'd need someone to watch and log app and db load during tests.
Updated•18 years ago
Assignee: server-ops → aravind
Comment 1•18 years ago
The collector can be tested largely independently, because it doesn't have any DB load. I expect a load of around 60,000 reports per day, but Jay says that it could peak much higher than that, so we should be prepared for peak loading of 180,000 reports/day.
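For a rough sense of scale, the daily figures above can be converted to request rates. This is only a back-of-the-envelope sketch that assumes reports arrive evenly over 24 hours, which real traffic will not do, so short bursts could be well above these averages. The function name is made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400


def reports_per_second(reports_per_day):
    """Average request rate, assuming reports are spread evenly over a day."""
    return reports_per_day / SECONDS_PER_DAY


expected = reports_per_second(60_000)   # expected daily load from the comment
peak = reports_per_second(180_000)      # peak daily load from the comment
print(f"expected: {expected:.2f} req/s, peak: {peak:.2f} req/s")
```

That works out to roughly 0.7 req/s on average and about 2 req/s at the peak daily volume.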
The .dump and .json files that aravind has saved from the processor should be useful for this, and I wrote a python script that can be used to mimic clients sending minidumps:
http://socorro.googlecode.com/svn/trunk/scripts/collector-loadtesting.py
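As an illustration of what mimicking a client could look like, here is a minimal sketch that builds a multipart/form-data POST carrying a minidump. It is not the linked script. The field name `upload_file_minidump`, the `ProductName` and `Version` fields, the file name, and the localhost URL are all assumptions for the example; check them against what the collector actually expects.

```python
import io
import urllib.request
import uuid


def build_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields plus one file as multipart/form-data.

    Returns (body_bytes, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    buf = io.BytesIO()
    for name, value in fields.items():
        buf.write(f"--{boundary}\r\n".encode())
        buf.write(f'Content-Disposition: form-data; name="{name}"\r\n\r\n'.encode())
        buf.write(f"{value}\r\n".encode())
    # The file part carries the raw minidump bytes.
    buf.write(f"--{boundary}\r\n".encode())
    buf.write(
        f'Content-Disposition: form-data; name="{file_field}"; '
        f'filename="{filename}"\r\n'.encode()
    )
    buf.write(b"Content-Type: application/octet-stream\r\n\r\n")
    buf.write(file_bytes)
    buf.write(f"\r\n--{boundary}--\r\n".encode())
    return buf.getvalue(), f"multipart/form-data; boundary={boundary}"


def make_request(url, fields, dump_bytes):
    """Build (but do not send) a POST request for one crash report."""
    # "upload_file_minidump" and "crash.dump" are assumed names.
    body, content_type = build_multipart(
        fields, "upload_file_minidump", "crash.dump", dump_bytes
    )
    return urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )


# Hypothetical URL and metadata; the payload here is placeholder bytes.
req = make_request(
    "http://localhost:8080/submit",
    {"ProductName": "Firefox", "Version": "0.0"},
    b"MDMP" + bytes(32),
)
```

Sending would then be `urllib.request.urlopen(req)`, ideally against a test instance rather than production. In a real run the bytes would come from one of the saved .dump files.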
Comment 2•18 years ago
So is this on hold until we get the actual hardware?
In the meantime it might be useful to profile all three pieces of the system to make sure we aren't missing any time wasters. I would suggest we focus on that while the hardware is spec'd/purchased/set up. Thoughts?
Comment 3•18 years ago
The collector is on the production cluster, so that can be tested now. There isn't much point in load-testing the processor/reporter/DB combination until it's on production hardware.
I think it would be useful to profile the app code... though I'm not sure how to do it. I will be looking at the pathological case of the infinite-recursion crash in particular.
Comment 4•18 years ago
I was looking at using this:
http://docs.python.org/lib/module-profile.html
...to call paster serve and the standalone processor, in order to get a handle on how resources are allocated and where time is spent. I'm not sure how to profile the collector (the trick will be calling it from the command line).
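A minimal sketch of the profiling idea, using `cProfile` (the C implementation of the same interface as the `profile` module linked above) together with `pstats`. `process_report` is a made-up stand-in for whatever per-report work the processor does, and `profile_call` is a hypothetical helper, not part of any existing code.

```python
import cProfile
import io
import pstats


def process_report(n):
    """Hypothetical stand-in for the work done per crash report."""
    return sum(i * i for i in range(n))


def profile_call(func, *args, top=10):
    """Run func(*args) under the profiler; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args)
    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return result, out.getvalue()


result, report = profile_call(process_report, 100_000)
print(report)
```

For anything that can be started as a script, `python -m cProfile -o out.prof script.py` writes the same data to a file that `pstats.Stats("out.prof")` can load afterwards.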
Comment 5•18 years ago
We will be getting some new servers next week, so I can set this up on one of them.
Comment 6•18 years ago
Okay, sorry this took a while. We now have a brand new server processing crash reports and serving the reporter app.
https://crash-reports.mozilla.com/submit lives on the production web farm and accepts crash dumps.
These are processed on the new breakpad server. The db backend serving this is also on a production database.
The reports are now available at http://crash-stats.mozilla.com. This is fronted by the netscaler and goes to the backend breakpad production server.
I think we are in good shape to start load testing the app now.
The db server is already monitored in nagios. I will get the other URLs into nagios as well.
Updated•18 years ago
Assignee: aravind → morgamic
Comment 7•18 years ago
Picking this up. We hope to deploy an update to breakpad that eliminates the memory leak this week, then use raw minidumps as POST data in combination with grinder to mimic high load. This should give us a good indication of what the collector's upper limits are.
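To make the shape of such a run concrete, here is a small Python sketch of a concurrent load driver. It is not Grinder and not a substitute for it; it only illustrates replaying payloads through a pool of workers and measuring throughput. `run_load` and `fake_send` are hypothetical names, and `fake_send` stands in for a real HTTP POST so the example runs without touching any server.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_load(send, payloads, workers=8):
    """Call send(payload) for every payload using a thread pool.

    Returns (successes, failures, elapsed_seconds).
    """
    def attempt(payload):
        try:
            send(payload)
            return True
        except Exception:
            return False

    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(attempt, payloads))
    elapsed = time.perf_counter() - start
    ok = sum(results)
    return ok, len(results) - ok, elapsed


def fake_send(payload):
    """Hypothetical stand-in for an HTTP POST to the collector."""
    if not payload:
        raise ValueError("empty minidump")
    time.sleep(0.001)  # pretend network latency


# 50 placeholder dumps plus 2 empty ones that the stand-in rejects.
payloads = [b"MDMP" + bytes(16)] * 50 + [b""] * 2
ok, failed, elapsed = run_load(fake_send, payloads)
print(f"{ok} ok, {failed} failed, {ok / elapsed:.0f} req/s")
```

Raising `workers` until the success rate drops or latency climbs is one simple way to look for an upper limit, though a dedicated tool gives better reporting.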
Right now our goal by Friday should be to reliably update production, then set a time for load testing next Monday or Tuesday evening during off-peak hours. Does that sound reasonable?
Status: NEW → ASSIGNED
Comment 8•18 years ago
Comment 9•18 years ago
Comment 10•18 years ago
morgamic: Can we close this bug if we don't have any actionable items in the near future?
Comment 11•18 years ago
Please re-open the bug once we (IT) have any actionable items.
Status: ASSIGNED → RESOLVED
Closed: 18 years ago
Resolution: --- → FIXED
Updated•11 years ago
Product: mozilla.org → mozilla.org Graveyard