in load test, too many requests take > 60s to complete and time out
(Eliot :: Symbolication, defect, P2)
(Reporter: willkg, Assigned: willkg)
I’m load testing prod and dialing in the load test for a “normal load”. One of the things I’m seeing is a lot of requests taking more than 60s to complete, so they time out and return an HTTP 502.
This bug covers figuring out what’s going on and fixing it.
Related Jira issue: https://mozilla-hub.atlassian.net/browse/DSRE-1288
Comment 1•2 years ago
I have some theories:
- cross-cloud downloads (AWS -> GCP) take longer
- the total amount being downloaded causes Eliot GCP to get throttled by something in AWS or GCP
- Eliot GCP is underpowered CPU-wise and it takes longer to parse sym files
- some or all of the above
- something I haven't thought of
Note that Eliot AWS is set up with an ELB idle timeout of 300s: Tecken symbol upload requests take forever, and the ELB is configured the same way for Tecken and Eliot. Eliot GCP, by contrast, is set up with a GCLB idle timeout of 60s.
I spent today looking into request handling timings and breaking them down into smaller steps using the information we have available. Jira doesn't support Markdown (or if it does, I can't figure out how to get it working), so I'm dumping the results in this bug.
Generally, the xul modules are the largest sym files. When we added inline function data in September 2022, those ballooned in size and now clock in over 500mb. They take longer to download and longer to parse.
I grabbed stacks from crash reports processed recently on Crash Stats and ran them through symbolication in the Eliot AWS stage and Eliot GCP stage environments. Then I did a bunch of work to get download timings. Then I noticed the download timings are nuts. Then I looked at the code and the download timings include both downloading and parsing the sym file.
I'm going to fix that in Eliot and Tecken. I'm pretty sure it's not a hard fix. Then I'll get a new set of download timings.
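The problem described above is one timer wrapped around two different kinds of work. A minimal sketch of timing each phase separately is below. The `download`, `parse`, and `save` callables and the phase keys are hypothetical stand-ins for illustration, not Eliot's or Tecken's actual code.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(timings, phase):
    """Add the wall-clock seconds spent in the with-block to timings[phase]."""
    start = time.perf_counter()
    try:
        yield
    finally:
        elapsed = time.perf_counter() - start
        timings[phase] = timings.get(phase, 0.0) + elapsed


def load_symbols(download, parse, save):
    """Run the three phases and time each one on its own.

    download, parse, and save are hypothetical callables standing in for
    the real steps (network fetch, sym file parsing, disk cache write).
    """
    timings = {}
    with timed(timings, "download"):
        data = download()
    with timed(timings, "parse_sym"):
        symcache = parse(data)
    with timed(timings, "save_symcache"):
        save(symcache)
    return symcache, timings
```

With the phases timed separately, a slow network fetch and a slow CPU-bound parse show up as different numbers instead of one inflated "download" figure.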
Comment 3•2 years ago
This fixes it for Tecken so we have something to compare with.
willkg merged PR #2716: "bug 1828542: fix debug timings for symbolication" in 5495486.
Comment 7•2 years ago
I fixed the debug stats and broke up "download" into "download", "parse sym", and "save symcache". Then I redid all the symbolication to get new timings from Eliot AWS stage and Eliot GCP stage.
| module | size | aws download | gcp download | aws parse | gcp parse | aws save | gcp save |
|---|---|---|---|---|---|---|---|
| libxul.so/928F7EFAF5B17BE3C6C64FBCA39795DE0 | 712.22 mb | 8.62 s | 4.10 s | 23.81 s | 28.49 s | 0.36 s | 0.46 s |
| XUL/596BFCAE074A3843944A3671A53D603C0 | 698.66 mb | 5.64 s | 4.05 s, 3.95 s | 24.18 s | 28.39 s, 28.67 s | 0.33 s | 0.38 s, 0.38 s |
| XUL/ED2116CD112E35539F9EAED9F6EFEFD50 | 698.20 mb | 6.42 s | 3.87 s | 24.72 s | 28.62 s | 0.35 s | 0.39 s |
| XUL/67E6AC1D44793553B1536917215EE0BD0 | 697.39 mb | 6.05 s | 3.81 s, 3.82 s | 25.34 s | 28.70 s, 28.43 s | 0.25 s | 0.43 s, 0.38 s |
| XUL/F259B2E3EF1A3360BD2199A9191752A20 | 696.50 mb | 6.22 s | 3.89 s | 24.65 s | 28.66 s | 0.36 s | 0.39 s |
| XUL/F66529C1E677373D977181BFD383D5640 | 696.12 mb | 5.49 s | 3.47 s | 25.31 s | 28.64 s | 0.37 s | 0.38 s |
| XUL/17729273AA9033B588C29947BB5828960 | 695.58 mb | 5.98 s | 3.94 s | 24.44 s | 29.31 s | 0.35 s | 0.46 s |
| XUL/6B543FDF74883A769CDF92D53253E42C0 | 694.65 mb | 5.65 s | 3.68 s | 24.26 s | 27.25 s | 0.25 s | 0.46 s |
| XUL/64EA7076D2A13E22B8FA39C338C23E240 | 694.59 mb | 6.03 s | 4.05 s | 23.17 s | 27.77 s | 0.33 s | 0.39 s |
| XUL/CAEE89CEFB1E3CDBB71CBF2F05511DE50 | 694.44 mb | 6.76 s | 3.97 s | 23.55 s | 27.46 s | 0.27 s | 0.38 s |
| XUL/0CF2C904860A32508BCABA9F993207E30 | 694.19 mb | 5.94 s | 4.50 s | 24.28 s | 28.64 s | 0.34 s | 0.47 s |
| XUL/1D2C349742183916AFCD407237EF313F0 | 693.99 mb | 6.07 s | 3.86 s | 24.26 s | 29.09 s | 0.34 s | 0.42 s |
| XUL/494084676ADC3E1FBFB7BD9B46571E3B0 | 693.80 mb | 6.15 s, 6.41 s | 3.76 s, 3.71 s | 23.85 s, 24.92 s | 28.06 s, 28.71 s | 0.29 s, 0.37 s | 0.37 s, 0.44 s |
| XUL/C3AE19039A0038169F7AA7C7B11808650 | 693.72 mb | 6.80 s | 4.03 s, 3.86 s, 3.75 s | 23.97 s | 28.23 s, 28.52 s, 28.69 s | 0.25 s | 0.40 s, 0.43 s, 0.43 s |
| XUL/3556F28FC6A1384A94EB355257BA2B160 | 692.69 mb | 7.14 s | 3.94 s | 23.65 s | 27.89 s | 0.34 s | 0.38 s |
| XUL/E59E2024A147335DAA34E294C74D0CA00 | 692.49 mb | 6.56 s | 3.79 s | 23.33 s | 28.10 s | 0.34 s | 0.46 s |
| XUL/1FBAEF0839403357B102D6217CA47CCA0 | 692.41 mb | 6.29 s | 3.84 s | 23.85 s | 28.47 s | 0.24 s | 0.46 s |
| XUL/EBE31C4CB78C389E96E5CFB65D10FB240 | 691.08 mb | 6.50 s, 6.25 s | 3.77 s, 3.71 s | 24.59 s, 23.61 s | 27.36 s, 26.92 s | 0.31 s, 0.39 s | 0.42 s, 0.39 s |
| XUL/4FD8E54B2EDB3C048BAF8FFAC78B2D250 | 690.95 mb | 6.80 s | 4.04 s | 23.89 s | 27.27 s | 0.38 s | 0.39 s |
| XUL/4D8E6C42A19C397E996965FB09B96A600 | 690.83 mb | 5.65 s | 3.55 s | 24.04 s | 28.50 s | 0.29 s | 0.40 s |
| XUL/D3202F1B57783CA8A51DB8DADFB6C5140 | 690.81 mb | 5.49 s | 3.59 s | 23.15 s | 27.02 s | 0.35 s | 0.43 s |
| XUL/0E727D6625453AE6ADB3FA10BCAAE0580 | 690.78 mb | 6.05 s | 3.79 s | 23.43 s | 27.49 s | 0.37 s | 0.40 s |
| XUL/C8B4A908062C3C14809A60FC7F90DE930 | 690.70 mb | 5.70 s | 3.73 s, 3.79 s | 24.41 s | 27.68 s, 27.08 s | 0.31 s | 0.45 s, 0.38 s |
| XUL/FF6BE1795602334481D74FCBB86BE1F20 | 690.62 mb | 6.09 s | 3.93 s | 25.32 s | 28.53 s | 0.27 s | 0.44 s |
| XUL/1446CBEA42D23E7989525AD449E0C42A0 | 689.80 mb | 5.43 s | 3.74 s, 3.50 s | 23.81 s | 27.86 s, 28.38 s | 0.35 s | 0.39 s, 0.40 s |
| libxul.so/765DF8A455756914097F7500A44F930C0 | 685.76 mb | 5.89 s | 3.61 s | 21.99 s | 25.38 s | 0.36 s | 0.38 s |
| libxul.so/0B138F556A1C1A39EBD2DB3371999E980 | 683.63 mb | 6.01 s | 3.58 s | 21.46 s | 25.32 s | 0.31 s | 0.43 s |
| libxul.so/0B4B80A7EA725963C739C019C5A7A8E30 | 682.92 mb | 6.02 s | 3.62 s | 21.72 s | 25.23 s | 0.33 s | 0.40 s |
| libxul.so/85DF1EBB5F7B0E2EA438839A59F6A69E0 | 682.80 mb | 5.98 s | 3.92 s, 3.57 s | 22.04 s | 25.50 s, 25.74 s | 0.31 s | 0.38 s, 0.38 s |
| libxul.so/DCF1D3C74528BBA99A1C5A8AD7AF547E0 | 682.38 mb | 5.90 s | 3.67 s | 22.23 s | 25.66 s | 0.36 s | 0.44 s |
| libxul.so/542F47F4184593D522CE475B564C3D810 | 682.30 mb | 6.49 s | 3.52 s, 3.61 s | 21.80 s | 25.46 s, 25.13 s | 0.24 s | 0.38 s, 0.38 s |
| libxul.so/645ECEACFEE6F1D530174726790451B30 | 681.29 mb | 6.09 s | 3.80 s | 21.58 s | 25.88 s | 0.32 s | 0.43 s |
| libxul.so/B0F7F3DCA470C5FEC807CBBC69E977F70 | 680.54 mb | 6.33 s | 4.04 s | 21.87 s | 26.15 s | 0.33 s | 0.43 s |
| libxul.so/F273F5C997904C506CE14C5790268D490 | 680.20 mb | 5.80 s | 3.53 s | 21.75 s | 25.32 s | 0.34 s | 0.38 s |
| libxul.so/3B025FC9039C3E493A8B2CC2052A061A0 | 680.20 mb | 6.07 s | 3.97 s | 21.49 s | 25.22 s | 0.32 s | 0.44 s |
| libxul.so/9FD48936AEE131D44AE8271731D378A50 | 679.65 mb | 6.56 s | 3.54 s | 21.38 s | 25.79 s | 0.33 s | 0.39 s |
| libxul.so/3C62499505EC76405A25C8ECB99282310 | 679.07 mb | 5.99 s | 3.55 s | 22.61 s | 25.52 s | 0.29 s | 0.44 s |
| libxul.so/08C6EC1A8E0FD7E78397D2620F79731C0 | 678.12 mb | 6.42 s | 3.57 s | 21.75 s | 24.99 s | 0.33 s | 0.37 s |
| libxul.so/7B0AE61B639226C8DB5948D3A980BDAF0 | 676.59 mb | 5.70 s | 3.60 s, 3.57 s, 3.56 s | 22.53 s | 24.98 s, 25.10 s, 25.10 s | 0.26 s | 0.36 s, 0.41 s, 0.43 s |
| libxul.so/9C8EA39DED75D9014E04AF8A94552F6B0 | 676.35 mb | 5.93 s | 3.59 s, 3.63 s, 3.68 s | 22.38 s | 25.37 s, 24.82 s, 25.21 s | 0.33 s | 0.37 s, 0.41 s, 0.44 s |
| libxul.so/8369F1AEC5776DCA611ABEDC182128E10 | 676.21 mb | 5.68 s | 3.58 s, 3.64 s | 21.55 s | 25.04 s, 25.17 s | 0.34 s | 0.40 s, 0.37 s |
| libxul.so/E7EC5FD688D6517BABBDB97D3CECCFA70 | 676.14 mb | 6.06 s | 3.82 s, 3.61 s | 22.00 s | 25.06 s, 25.32 s | 0.36 s | 0.39 s, 0.44 s |
| libxul.so/AF7B2656DF97FD3606B1BCC70B78F94F0 | 676.05 mb | 5.51 s | 3.64 s | 21.26 s | 25.27 s | 0.30 s | 0.37 s |
| libxul.so/DBBB86E144CB0BF954223E0C9F6297540 | 676.03 mb | 5.59 s | 3.53 s | 21.57 s | 25.63 s | 0.25 s | 0.41 s |
| libxul.so/348D23C3D8588D5882071122CD68CE960 | 676.01 mb | 5.90 s | 3.75 s, 3.57 s, 3.78 s | 22.77 s | 25.01 s, 24.60 s, 25.22 s | 0.25 s | 0.44 s, 0.36 s, 0.43 s |
| libxul.so/923394C87DD48E0F9650CF549D4A1EA70 | 670.23 mb | 5.87 s | 3.82 s | 21.58 s | 25.04 s | 0.36 s | 0.37 s |
| libxul.so/55B63C8E60335CF3980F820C7C62750B0 | 665.43 mb | 5.83 s | 3.57 s | 21.00 s | 24.63 s | 0.34 s | 0.36 s |
| libxul.so/7FF31C430F97B8DCBB59E5AD816EB7870 | 665.05 mb | 8.03 s | 3.58 s | 21.59 s | 24.79 s | 0.35 s | 0.38 s |
| libxul.so/75DB6EDD80593B667884C376300A196E0 | 656.49 mb | 5.62 s | 3.43 s | 19.93 s | 23.67 s | 0.36 s | 0.44 s |
| libxul.so/1D612D16E100A55E7340D2032E42CB8D0 | 656.49 mb | 9.32 s | 3.70 s | 20.47 s | 23.34 s | 0.35 s | 0.42 s |
| xul.pdb/7AB6462CEE9621714C4C44205044422E1 | 630.27 mb | 5.40 s | 3.13 s | 18.53 s | 22.05 s | 0.28 s | 0.33 s |
| xul.pdb/12F3F52BE9645DC94C4C44205044422E1 | 629.51 mb | 5.11 s | 3.34 s | 18.49 s | 21.73 s | 0.29 s | 0.37 s |
| xul.pdb/5BDA36D8CCAFC6114C4C44205044422E1 | 629.47 mb | 5.03 s | 3.32 s | 18.53 s | 21.51 s | 0.30 s | 0.35 s |
| xul.pdb/9B52BF0645C4621B4C4C44205044422E1 | 628.69 mb | 5.85 s | 3.52 s | 18.39 s | 21.80 s | 0.28 s | 0.29 s |
| xul.pdb/5953938BB52628AF4C4C44205044422E1 | 626.79 mb | 5.50 s | 3.11 s | 18.24 s | 21.35 s | 0.27 s | 0.33 s |
| xul.pdb/A0FAF5BCE4BE85A74C4C44205044422E1 | 626.45 mb | 5.00 s | 3.18 s, 3.19 s, 3.24 s | 18.18 s | 21.58 s, 21.54 s, 21.54 s | 0.21 s | 0.29 s, 0.34 s, 0.35 s |
| xul.pdb/A03E0CF0A43C95244C4C44205044422E1 | 625.74 mb | 5.01 s | 3.12 s | 18.55 s | 22.32 s | 0.30 s | 0.38 s |
| xul.pdb/4BF3B5A0968EDB0F4C4C44205044422E1 | 625.57 mb | 5.17 s | 3.16 s | 18.42 s | 22.06 s | 0.29 s | 0.31 s |
| xul.pdb/75830C7888DCCDF94C4C44205044422E1 | 624.63 mb | 5.18 s | 3.20 s, 3.17 s | 19.18 s | 21.95 s, 21.31 s | 0.27 s | 0.32 s, 0.32 s |
| xul.pdb/4753FA9A86ABC7294C4C44205044422E1 | 624.45 mb | 5.04 s | 3.19 s | 18.24 s | 21.32 s | 0.22 s | 0.33 s |
| xul.pdb/0C33E384E77DA75A4C4C44205044422E1 | 624.43 mb | 5.85 s | 3.34 s, 3.39 s, 3.19 s | 19.47 s | 21.55 s, 21.53 s, 21.77 s | 0.22 s | 0.29 s, 0.34 s, 0.33 s |
| xul.pdb/614E6871CBF27E6E4C4C44205044422E1 | 624.41 mb | 5.06 s | 3.61 s, 3.16 s, 3.19 s | 18.32 s | 22.04 s, 21.47 s, 21.47 s | 0.30 s | 0.30 s, 0.31 s, 0.31 s |
| xul.pdb/1DB1C16017CC54884C4C44205044422E1 | 624.28 mb | 5.09 s | 3.21 s | 18.19 s | 21.76 s | 0.22 s | 0.30 s |
| xul.pdb/7E8CE16205FE6A6D4C4C44205044422E1 | 624.15 mb | 5.17 s | 3.22 s, 3.11 s, 3.12 s | 18.28 s | 21.54 s, 21.43 s, 22.42 s | 0.19 s | 0.31 s, 0.32 s, 0.35 s |
| xul.pdb/BDF935DEA8D64AF44C4C44205044422E1 | 615.33 mb | 4.92 s | 3.16 s | 19.37 s | 21.23 s | 0.22 s | 0.35 s |
| xul.pdb/063D0CB9687B5CC04C4C44205044422E1 | 523.22 mb | 4.76 s, 4.79 s | 2.91 s | 19.66 s, 18.47 s | 21.70 s | 0.22 s, 0.34 s | 0.33 s |
| xul.pdb/6331F4B199AA9D5D4C4C44205044422E1 | 523.19 mb | 4.46 s | 3.03 s, 2.83 s | 19.83 s | 21.63 s, 22.31 s | 0.23 s | 0.31 s, 0.39 s |
| xul.pdb/D503A396C0E8562D4C4C44205044422E1 | 522.76 mb | 4.65 s | 2.89 s, 2.87 s | 18.37 s | 22.39 s, 21.61 s | 0.24 s | 0.32 s, 0.36 s |
| xul.pdb/BFF1732D9DE9A4624C4C44205044422E1 | 522.72 mb | 4.46 s | 2.83 s | 18.43 s | 21.71 s | 0.30 s | 0.38 s |
| xul.pdb/622B10953131BE5B4C4C44205044422E1 | 522.46 mb | 4.72 s | 2.82 s, 3.10 s | 18.29 s | 21.89 s, 21.64 s | 0.22 s | 0.32 s, 0.30 s |
| xul.pdb/BE781044367FCD034C4C44205044422E1 | 522.06 mb | 4.84 s | 2.81 s | 18.35 s | 21.73 s | 0.28 s | 0.31 s |
| xul.pdb/B1E386995B6797984C4C44205044422E1 | 522.05 mb | 5.17 s | 2.95 s | 18.29 s | 21.68 s | 0.31 s | 0.32 s |
| xul.pdb/152ABE6E873298524C4C44205044422E1 | 521.95 mb | 4.65 s | 2.91 s | 19.60 s | 22.02 s | 0.24 s | 0.31 s |
| xul.pdb/3948996C0283B5E94C4C44205044422E1 | 521.83 mb | 4.67 s | 2.95 s | 18.34 s | 21.53 s | 0.19 s | 0.40 s |
| xul.pdb/E010FFF4C7CCE3E24C4C44205044422E1 | 521.63 mb | 4.76 s | 3.18 s, 2.91 s | 18.62 s | 21.67 s, 21.68 s | 0.31 s | 0.32 s, 0.31 s |
| xul.pdb/F2C45586502DF3434C4C44205044422E1 | 520.68 mb | 5.06 s | 2.86 s | 18.46 s | 21.75 s | 0.33 s | 0.30 s |
| xul.pdb/C1C01AFFCF4842A34C4C44205044422E1 | 520.37 mb | 4.90 s | 3.00 s | 18.24 s | 21.63 s | 0.21 s | 0.35 s |
| xul.pdb/0DAAAC064D5D4BED4C4C44205044422E1 | 519.58 mb | 4.93 s, 5.11 s | 3.01 s, 2.79 s | 18.09 s, 18.11 s | 21.10 s, 21.39 s | 0.21 s, 0.27 s | 0.29 s, 0.34 s |
| xul.pdb/1B7E072433B357CA4C4C44205044422E1 | 519.50 mb | 5.25 s | 2.84 s, 3.30 s, 3.03 s | 19.29 s | 21.27 s, 21.44 s, 21.68 s | 0.24 s | 0.33 s, 0.35 s, 0.31 s |
| xul.pdb/A36497CFA0EEA6314C4C44205044422E1 | 519.49 mb | 4.56 s | 2.77 s, 2.88 s | 18.54 s | 21.31 s, 21.18 s | 0.22 s | 0.31 s, 0.37 s |
| xul.pdb/999A8958519CC4DB4C4C44205044422E1 | 519.47 mb | 7.80 s | 3.04 s, 3.07 s, 2.76 s | 19.32 s | 21.32 s, 21.29 s, 21.62 s | 0.22 s | 0.36 s, 0.31 s, 0.30 s |
| xul.pdb/B2920C693A00DBF34C4C44205044422E1 | 519.42 mb | 5.67 s | 2.94 s | 18.73 s | 21.63 s | 0.33 s | 0.32 s |
| xul.pdb/67C34107722EAFA44C4C44205044422E1 | 519.39 mb | 4.59 s | 3.15 s | 18.03 s | 21.71 s | 0.22 s | 0.34 s |
| xul.pdb/5B45ABB8C1A220684C4C44205044422E1 | 519.29 mb | 4.37 s | 2.85 s | 18.19 s | 21.42 s | 0.27 s | 0.31 s |
| xul.pdb/ACEA1B318AB7CE4D4C4C44205044422E1 | 519.20 mb | 4.74 s | 2.85 s | 18.38 s | 21.23 s | 0.29 s | 0.34 s |
| xul.pdb/65694F4C92CA530E4C4C44205044422E1 | 519.11 mb | 4.48 s | 2.97 s | 18.10 s | 21.42 s | 0.28 s | 0.34 s |
| xul.pdb/50CB4664B174DF884C4C44205044422E1 | 519.09 mb | 4.55 s | 2.77 s, 2.99 s | 18.10 s | 21.55 s, 21.46 s | 0.22 s | 0.34 s, 0.31 s |
| xul.pdb/203985E16CE26DE14C4C44205044422E1 | 519.05 mb | 5.34 s | 2.94 s, 2.95 s, 3.16 s | 19.22 s | 22.04 s, 21.76 s, 21.35 s | 0.23 s | 0.30 s, 0.29 s, 0.29 s |
| xul.pdb/E1EA6C840974A6544C4C44205044422E1 | 518.94 mb | 4.86 s | 2.88 s | 18.23 s | 21.34 s | 0.22 s | 0.30 s |
| xul.pdb/677C3C0EDE9DE96E4C4C44205044422E1 | 518.54 mb | 4.84 s | 3.04 s, 2.83 s | 19.19 s | 21.46 s, 21.61 s | 0.22 s | 0.34 s, 0.37 s |
| xul.pdb/96D6FD5E518E30284C4C44205044422E1 | 518.43 mb | 4.54 s | 2.90 s, 2.79 s, 2.78 s | 18.02 s | 21.23 s, 21.35 s, 21.79 s | 0.23 s | 0.32 s, 0.29 s, 0.34 s |
| xul.pdb/2DD70C3BF4F359B74C4C44205044422E1 | 518.08 mb | 4.74 s | 2.78 s | 18.04 s | 21.19 s | 0.28 s | 0.33 s |
| xul.pdb/C0153E3CF0B27B7C4C4C44205044422E1 | 518.00 mb | 4.71 s | 2.89 s | 18.50 s | 21.18 s | 0.34 s | 0.33 s |
| xul.pdb/1F570BFB605FAA964C4C44205044422E1 | 517.79 mb | 4.66 s | 2.90 s | 18.34 s | 21.61 s | 0.29 s | 0.31 s |
| xul.pdb/46ADA0D7AB398DBE4C4C44205044422E1 | 517.06 mb | 6.59 s | 3.14 s, 2.96 s, 2.89 s | 17.99 s | 21.48 s, 21.51 s, 21.12 s | 0.23 s | 0.32 s, 0.29 s, 0.33 s |
| xul.pdb/B196A9B2F3FF7EE74C4C44205044422E1 | 509.67 mb | 4.69 s | 2.84 s | 18.36 s | 21.12 s | 0.32 s | 0.33 s |
| xul.pdb/EF6F08BB7E0932464C4C44205044422E1 | 508.89 mb | 5.02 s | 2.89 s | 17.82 s | 21.21 s | 0.28 s | 0.34 s |
| xul.pdb/E8214F2333468C674C4C44205044422E1 | 507.66 mb | 4.58 s | 2.77 s | 17.76 s | 20.49 s | 0.30 s | 0.30 s |
| xul.pdb/9DEB9BA1FE8AF73B4C4C44205044422E1 | 507.45 mb | 4.79 s | 2.82 s | 17.92 s | 20.75 s | 0.21 s | 0.37 s |
| xul.pdb/063C819FBDFCEB264C4C44205044422E1 | 506.84 mb | 4.75 s | 2.86 s | 17.90 s | 20.98 s | 0.21 s | 0.36 s |
| xul.pdb/C98AFF5FFB50657E4C4C44205044422E1 | 505.83 mb | 4.72 s | 2.97 s | 17.51 s | 20.29 s | 0.29 s | 0.29 s |
| xul.pdb/AA34E0C61FB4CB884C4C44205044422E1 | 504.32 mb | 4.55 s | 2.96 s, 3.27 s, 2.81 s | 17.97 s | 20.67 s, 20.34 s, 20.51 s | 0.29 s | 0.31 s, 0.33 s, 0.34 s |
There are a few interesting things in these results:
- Downloading is faster in GCP than in AWS. For most files in the table, the GCP download finishes roughly 2 seconds sooner. Yay!
- Parsing the sym files is significantly slower in GCP than in AWS, by roughly 3 to 5 seconds per file. I suspect this is CPU bound. A faster CPU should produce faster parse times.
- Saving the symcache file to disk is generally slower in GCP than in AWS, though the difference is usually around a tenth of a second. This is a problem because it means our disk-based LRU cache is less effective.
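To see how the download and parse differences net out, here is a small sketch that subtracts the AWS timings from the GCP timings for four rows copied from the table above. The row selection and the helper name `net_gcp_delta` are just for illustration.

```python
# Each row: (module, aws_download, gcp_download, aws_parse, gcp_parse).
# All timings are in seconds and copied from the table above.
ROWS = [
    ("libxul.so/928F7EFAF5B17BE3C6C64FBCA39795DE0", 8.62, 4.10, 23.81, 28.49),
    ("XUL/ED2116CD112E35539F9EAED9F6EFEFD50", 6.42, 3.87, 24.72, 28.62),
    ("xul.pdb/7AB6462CEE9621714C4C44205044422E1", 5.40, 3.13, 18.53, 22.05),
    ("xul.pdb/C98AFF5FFB50657E4C4C44205044422E1", 4.72, 2.97, 17.51, 20.29),
]


def net_gcp_delta(row):
    """Return (download_delta, parse_delta, net) as GCP minus AWS, in seconds."""
    _, aws_dl, gcp_dl, aws_parse, gcp_parse = row
    dl = gcp_dl - aws_dl
    parse = gcp_parse - aws_parse
    return round(dl, 2), round(parse, 2), round(dl + parse, 2)


for row in ROWS:
    dl, parse, net = net_gcp_delta(row)
    print(f"{row[0]}: download {dl:+.2f} s, parse {parse:+.2f} s, net {net:+.2f} s")
```

For these four rows the net is positive: the time saved on the download is smaller than the time lost on parsing, so the request as a whole is no faster in GCP.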
These timings were taken by posting symbolication requests serially on environments that weren't getting any other traffic. It's entirely possible that when under a normal load, the timings change.
We've got panels in the app monitoring dashboard, so we'll see changes as load comes and goes. I think I don't want to do anything here right now. There's a "make eliot faster" bug (bug #1801212), so I'll add some notes to that bug.
The one thing I think we should change is to bump the GCLB idle timeout to 300s to match what we have with Eliot AWS. That's an absurd number, but it means that requests will complete. We don't currently have an SLO for timings, so we can go with this for now and if some group needs better performance, hopefully they will tell us and we can re-evaluate then.
I'll make the GCLB idle timeout change next.
Comment 8•2 years ago
I did a PR to increase the idle timeout. Waiting on that and then we can close this out.
Comment 9•2 years ago
The GCLB idle timeout change landed.
I did another normal load test and there were a couple of HTTP 502 timeouts, but most of the failures were HTTP 504 at the 60s mark, suggesting there's still a timeout somewhere.
I listed the timeout values for each layer:
- Eliot app has no timeouts.
- Gunicorn is set to time out after 300s (5m).
- nginx proxy_read_timeout isn’t set, so it uses the default which is 60s.
- GCLB idle timeout is 300s (5m).
I think we’re hitting the nginx proxy_read_timeout now.
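The reasoning above can be sketched as finding the smallest timeout in the chain. This is a simplified model that treats every timeout as a cap on total request time, which is only an approximation: `proxy_read_timeout`, for example, applies between successive reads from the upstream rather than to the whole response. The dictionary and the function name are illustrative.

```python
# Timeouts in seconds for each layer, taken from the list above.
# None means the layer has no timeout of its own.
LAYER_TIMEOUTS = {
    "GCLB idle timeout": 300,
    "nginx proxy_read_timeout": 60,  # not set, so nginx uses its 60s default
    "Gunicorn timeout": 300,
    "Eliot app": None,
}


def tightest_layer(timeouts):
    """Return (layer, seconds) for the smallest finite timeout, or None."""
    finite = {name: secs for name, secs in timeouts.items() if secs is not None}
    if not finite:
        return None
    name = min(finite, key=finite.get)
    return name, finite[name]


print(tightest_layer(LAYER_TIMEOUTS))
```

Under this model, raising the GCLB idle timeout alone cannot help requests that run past 60s, because another layer still cuts them off at that mark.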
I looked at the AWS nginx configuration and while Tecken has a proxy_read_timeout of 300s (symbol uploads take forever), Eliot doesn’t appear to have a proxy_read_timeout set at all. That's curious.
I did a PR to increase the proxy_read_timeout to 300s. Then I'll continue load testing.
Comment 10•2 years ago
I did the PR, Harold approved it, I deployed the change to stage and prod and did some more load testing. Eliot is timing out a lot less now, so I think we're good here.