
chrome.storage.sync: performance test of production stack for chrome.storage.sync

NEW
Unassigned

Status

Toolkit
WebExtensions: General
P3
normal
5 months ago
18 days ago

People

(Reporter: glasserc, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [storage]triaged)

(Reporter)

Description

5 months ago
We need to make sure that we have enough capacity in production for when this feature hits beta.
(Reporter)

Updated

5 months ago
Blocks: 1311710

Updated

5 months ago
Component: WebExtensions: Untriaged → WebExtensions: General
Priority: -- → P3
Whiteboard: [storage]triaged

Updated

4 months ago
Blocks: 1220494
No longer blocks: 1311710

Comment 1

2 months ago
Would this be you Remy, or someone else?
Flags: needinfo?(rhubscher)
Hi Andy,

We have some load tests for Kinto ready here: https://github.com/mozilla-services/ailoads-kinto
Usually QA runs them, but I don't know who the QA is for the service side of the WebExtensions stack.

Stuart, do you know if we have a QA who could run some load tests on the WebExtensions stack?
Flags: needinfo?(rhubscher) → needinfo?(sphilp)

Comment 3

2 months ago
Karl can take it, cc'ing him. Do we need this for a certain date?
Flags: needinfo?(sphilp)
QA Contact: kthiessen

Comment 4

2 months ago
(In reply to Stuart Philp :sphilp from comment #3)
> Karl can take it, cc'ing him. Do we need this for a certain date?

This would block the feature from landing in beta. So, sometime before that would be great. Release trains are at https://wiki.mozilla.org/RapidRelease/Calendar
I'll note that we're aiming for Firefox 53 here, which means the relevant merge date is currently 2017-03-06.

I can agree to that timeframe.
Who in Services Ops is going to be in charge of this production deployment?  Can we get them cc'ed on this bug, please, or get a pointer to another bug to use for communication with Ops?
Flags: needinfo?(eglassercamp)
More questions:

* Do we have defined desired capacities in terms of, for example, number of queries per second we want the service to stand up under?

* Do we need to co-ordinate with Ops to determine what the optimum size of the production cluster will be, or have they already made that decision?

* Who is our Ops contact for deployment verification?  Is there a stage instance for the AMO-specific cluster, or are we just using the existing https://webextensions-settings.stage.mozaws.net?

* My team are standing up the load testing apparatus today and tomorrow; we should have the first successful tests late this week or early next, and I'm hoping to have a go/no-go call by the end of next week.  Does that work with everyone's timetable?

Comment 8

2 months ago
(In reply to Karl Thiessen [:kthiessen] from comment #6)
> Who in Services Ops is going to be in charge of this production deployment? 
> Can we get them cc'ed on this bug, please, or get a pointer to another bug
> to use for communication with Ops?

I am the primary Ops on Kinto/Storage today, bobm is secondary.

Comment 9

2 months ago
(In reply to Karl Thiessen [:kthiessen] from comment #7)
> More questions:
> * Do we need to co-ordinate with Ops to determine what the optimum size of
> the production cluster will be, or have they already made that decision?
> 

As of right now production is up but with minimal resources, 3 web instances c4.large and RDS m4.large [1]. We can adjust as needed based on performance testing and how much traffic we expect to receive. Production endpoint is https://webextensions.settings.services.mozilla.com/v1/


> * Who is our Ops contact for deployment verification?  Is there a stage
> instance for the AMO-specific cluster, or are we just using the existing
> https://webextensions-settings.stage.mozaws.net?

I am the Ops contact, reach out to me with any questions. We should use https://webextensions-settings.stage.mozaws.net for testing.


[1] https://github.com/mozilla-services/cloudops-deployment/blob/master/projects/kintowe/ansible/envs/prod.yml#L15-L20
Brilliant!  Thanks, Jason.

The only outstanding question is:

* Do we have defined desired capacities in terms of, for example, number of queries per second we want the service to stand up under?

Ethan, have you got an answer for that, or can you point us in the direction of someone who does?

Comment 11

2 months ago
(In reply to Karl Thiessen [:kthiessen] from comment #10)
> Brilliant!  Thanks, Jason.
> 
> The only outstanding question is:
> 
> * Do we have defined desired capacities in terms of, for example, number of
> queries per second we want the service to stand up under?

AdBlock Plus uses this. If you took the number of AdBlock Plus users (~20 million) and multiplied it by the number of times it queries in a day, you'd get the idea. But AdBlock Plus isn't moving over to this for a few releases.
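
As a rough illustration of that arithmetic, here is a minimal sketch that turns a user count and a per-user daily query count into queries per second. Only the ~20 million user figure comes from this bug. The per-user query counts, the peak factor, and the function names are hypothetical placeholders, not measured values.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def average_qps(users: int, queries_per_user_per_day: float) -> float:
    """Average queries per second, assuming load is spread evenly over a day."""
    return users * queries_per_user_per_day / SECONDS_PER_DAY


def peak_qps(users: int, queries_per_user_per_day: float, peak_factor: float) -> float:
    """Scale the daily average by an assumed peak-to-average ratio."""
    return average_qps(users, queries_per_user_per_day) * peak_factor


if __name__ == "__main__":
    users = 20_000_000  # approximate AdBlock Plus user count quoted in this bug
    # Hypothetical per-user daily query counts; the real number is unknown here.
    for per_day in (1, 10, 100):
        avg = average_qps(users, per_day)
        # A peak factor of 3 is an assumption for illustration only.
        peak = peak_qps(users, per_day, peak_factor=3.0)
        print(f"{per_day:>3} queries/user/day: avg {avg:,.0f} qps, peak {peak:,.0f} qps")
```

Real traffic is unlikely to be evenly spread, so the average is a lower bound on what the service would need to handle at busy times.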

Overall approx. 15% of all add-ons on the Chrome store use this API [1]. We currently have 89 add-ons using it [2].

We've explicitly stated that this API endpoint has no SLA around usage or performance; developers get what they get and they don't get upset.

I really don't want us to end up throwing too many resources at this, and would like to suggest we ramp up performance as usage increases. I expect very little usage until it peaks when something like AdBlock Plus hits release (expected November).

It's worth noting that chrome.storage.sync only works if you are signed in through Firefox Sync. So we can probably say that a simple metric is to take the amount of traffic that syncing through Firefox Sync does and then dividing that by.

How many queries per second that translates into, I don't know. But I would be interested in the number of GETs, POSTs and PUTs on sync right now from other services, and then suggesting that by Nov 57, the load on this service would be a fraction of that (amount of sync traffic / amount of add-ons using).

What numbers do other sync services handle?

What numbers can Kinto put up right now?

[1] https://github.com/andymckay/arewewebextensionsyet.com/blob/master/usage.csv#L16
[2] https://gist.github.com/andymckay/10c3a4c64ce8990b589f0ac740f65955#file-firefox-permissions-L131
Flags: needinfo?(eglassercamp)
Thank you, Andy!  That's very useful information.  I'll check with the Sync metrics team and see if I can get some related data.
I'm not sure what the policy is for putting traffic numbers in public bugs, but I have the sync numbers that Andy asked for above, and will bring them to the meeting tomorrow.
Do we want load test results in this bug, or somewhere more private (since they're likely to include performance thresholds)?
Flags: needinfo?(jthomas)
I think we should keep performance thresholds private. Sharing via Google Docs works for me, but if you want to include Datadog graphs it might be worth looking at https://app.datadoghq.com/notebook/.
Flags: needinfo?(jthomas)
https://app.datadoghq.com/notebook/list is better and has a notebook created by :miles for another project.
QA Contact: kthiessen → chartjes
We have a scenario document being used for load testing here:
   https://docs.google.com/document/d/1na-4DtECFRf0zEgJzaeK4G6MJAINfqY_UO_8rx5p_ME/edit

Please get the scenarios you want tested into that document, so that Chris can do the testing required to make sure this product is ready for release.