Closed Bug 1139424 Opened 11 years ago Closed 11 years ago

Performance test Postgres in Heroku, compare to PHX1

Categories

(Socorro :: Database, task)

Platform: x86_64 Linux
Type: task
Priority: Not set
Severity: normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: selenamarie, Assigned: selenamarie)

References

Details

Attachments

(7 files)

Will share BenchmarkCrashStore data here from Heroku and PHX1 Postgres databases.
Blocks: 1118468
Log of the second AWS instance in a 20k-crash run: 2 concurrent systems, 8 threads per node, on a t2.medium instance.
Log from the first EC2 instance (t2.medium): 20k crashes total (10k per instance), 8 threads per processor.
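The concurrent-insert pattern used in these runs can be sketched as follows. This is a hedged illustration, not the actual harness: `store_crash` is a hypothetical stand-in for the real BenchmarkCrashStore insert, and the crash/thread counts are placeholders.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def store_crash(crash_id):
    """Hypothetical stand-in for the Postgres insert performed by
    BenchmarkCrashStore; returns the per-crash duration in seconds."""
    start = time.monotonic()
    # ... the real benchmark would INSERT the crash into Postgres here ...
    return time.monotonic() - start

def run_benchmark(n_crashes, n_threads):
    """Insert n_crashes concurrently across n_threads workers and
    return the list of per-crash durations for later summarizing."""
    with ThreadPoolExecutor(max_workers=n_threads) as pool:
        return list(pool.map(store_crash, range(n_crashes)))

# Illustrative sizes only; the runs above used 10k crashes per
# instance with 8 threads.
durations = run_benchmark(n_crashes=100, n_threads=8)
print(len(durations))  # one timing per crash
```

The per-crash durations collected this way are what the `describe()` summaries in the later comments characterize.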
Test from the aws1 system with a pgbouncer and stunnel connection from AWS into Heroku Postgres. MUCH FASTER:

  vars    n mean   sd median trimmed  mad  min  max range skew kurtosis se
1    1 1979 0.50 0.03   0.49    0.49 0.03 0.42 0.67  0.25 0.51     0.71  0

(mean is 0.49 seconds, std dev 0.03 -- compared to a mean of ~1.4 seconds without pgbouncer)
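For reference, a minimal sketch of the kind of pgbouncer + stunnel pairing used here: the app connects to a local pgbouncer, which forwards through an stunnel TLS client into Heroku Postgres. All hostnames, ports, database names, and pool sizes below are placeholder assumptions, not the values used in this test.

```ini
; pgbouncer.ini -- local pooler; the app connects to 127.0.0.1:6432
[databases]
breakpad = host=127.0.0.1 port=5433 dbname=breakpad

[pgbouncer]
listen_addr = 127.0.0.1
listen_port = 6432
auth_type = md5
auth_file = /etc/pgbouncer/userlist.txt
pool_mode = transaction
default_pool_size = 20

; stunnel.conf -- TLS client tunnel from local port 5433 out to Heroku
[heroku-pg]
client = yes
accept = 127.0.0.1:5433
connect = example.compute-1.amazonaws.com:5432
```

Pooling amortizes connection setup (including the TLS handshake) across many inserts, which is consistent with the mean dropping from ~1.4s to ~0.49s per crash above.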
Assignee: nobody → sdeckelmann
Test of 1000 crashes using a c4.xlarge instance:

         vars    n mean sd median trimmed mad  min  max range skew kurtosis se
duration    2 1000 0.01  0   0.01    0.01   0 0.01 0.03  0.02  5.1    47.24  0
Results of a 1000-crash run using a t2.medium instance. Plenty fast:

         vars   n mean   sd median trimmed mad  min  max range  skew kurtosis se
duration    2 999 0.01 0.01   0.01    0.01   0 0.01 0.23  0.22 17.37   427.74  0
Running backfill_matviews() on 210k crashes, which results in about 66k crashes in reports_clean and not very many rows in our various reporting tables, takes about 4 minutes. This is not a realistic test of our full matview runs, but it was close enough on the reports_clean process (the normalization step) to feel confident that the database can keep up. A production test (piping in 300k production crashes and running a backfill while continuing to push crashes in) is still worth doing, to ensure we don't have any glaring issues or excessive locking problems with reporting.
Raw data from inserting 210k crashes.
Attached file prod-parsed-logs.tgz
Production data:

> describe(mydata$duration)
  vars     n mean   sd median trimmed  mad  min   max range  skew kurtosis   se
1    1 56466 0.65 1.43   0.30    0.40 0.22 0.03 61.55 61.52 12.12   266.12 0.01

I'm seeing some performance degradation at times -- I'm guessing due to interactive work or maintenance taking write locks that block crash inserts, or some very large objects causing problems. Another variable that would be helpful to track here is the *size* of each crash.
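The summary above comes from R's psych::describe(); the same core statistics can be reproduced with a short Python sketch. The field names mirror the R output; the sample durations below are made up for illustration and are not the production data.

```python
import statistics

def describe(xs):
    """Summarize a list of durations the way psych::describe() does:
    n, mean, sd, median, min, max, range, and standard error."""
    n = len(xs)
    sd = statistics.stdev(xs)
    return {
        "n": n,
        "mean": statistics.mean(xs),
        "sd": sd,
        "median": statistics.median(xs),
        "min": min(xs),
        "max": max(xs),
        "range": max(xs) - min(xs),
        "se": sd / n ** 0.5,  # standard error of the mean
    }

# Made-up sample durations in seconds; one outlier (2.1) mimics the
# long-tail behavior seen in the production run (max 61.55s).
sample = [0.3, 0.25, 0.4, 0.35, 2.1, 0.28, 0.33, 0.31]
summary = describe(sample)
print(summary["n"], round(summary["mean"], 2))
```

A single large outlier dominates the mean while barely moving the median, which is exactly the pattern in the production numbers (mean 0.65s vs median 0.30s, max 61.55s).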
Closing this as I've done as much testing as I can with a fake crash stream.
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED