Closed Bug 487323 Opened 15 years ago Closed 15 years ago

create read-only account with sql access to db behind new graph-server

Categories

(mozilla.org Graveyard :: Server Operations, task)

task
Not set
normal

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: joduinn, Assigned: justdave)

References

Details

(Whiteboard: waiting on source host/IP)

johnauth's dev dashboard currently assembles data from the old graph-server using a lot of post/get commands. It would be more efficient if he could access the same underlying data directly from the database, without adding load to the graph-server.

As we are switching from the old graph-server to the new graph-server, let's just make the jump to read-only access of the new sql db... do it all in one step.
Where will he be connecting to the mysql db from?
duh... "johnauth" should be "johnath". sorry about that typo.
Assignee: server-ops → justdave
Whiteboard: waiting on source host/IP
Ping?  Where's this dev dashboard running?  Looking for src address.
Had a conversation with Chris yesterday at the office about this, since he has to update his regression-watching scripts as well.

It's still not clear to me whether direct read only access will be easier or faster for me than just pulling via the existing JSON APIs, but if it turns out to be, the dashboard is hosted on my people account right now, you can see it here:

http://people.mozilla.org/~johnath/pdb2/?tree=moz192
(In reply to comment #4)
> Had a conversation with Chris yesterday at the office about this, since he has
> to update his regression-watching scripts as well.
> 
> It's still not clear to me whether direct read only access will be easier or
> faster for me than just pulling via the existing JSON APIs, but if it turns out
> to be, the dashboard is hosted on my people account right now, you can see it
> here:


I'm going to say JSON is the way to go.

I can't allow access from people without allowing access for everyone on people, and I don't really want to do that.

If it's not JSON we could make a VM that has more limited access and grant access from there.  

I'll close this for now and let you guys figure this out - reopen if you need to.
Status: NEW → RESOLVED
Closed: 15 years ago
Resolution: --- → WONTFIX
It's a performance thing.  The JSON interface takes the server to its knees with the volume of data he needs to pull.  It needs direct DB access.  But yes, my intention was to get the script in source control somewhere and run the thing on a trusted server (i.e. actually deploy his app instead of running it from his account).  It's damn useful for the stuff they need to track, so it ought to be in a production environment anyway.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
(In reply to comment #6)
> It's a performance thing.  The JSON interface takes the server to its knees
> with the volume of data he needs to pull.

Huh? My understanding is that he's currently using the old graphs server (which isn't JSON). The new graphs server supports JSON, so it may work better for what he needs.
hmm, okay.  If that's the case maybe there's no issue.

If we do end up doing direct database access, though, I'm definitely not poking a hole in the firewall for dm-peep01 to get to it.  It'll need to be hosted somewhere else if they're doing that.
The current code calls a CGI on the old graph server which returns a JSON(ish) chunk of data.  My understanding, though, was that in moving to the new graph server, and in particular the new DB schema, we'd made things a lot better for data fetching.  If that's not true, then maybe we are back to SQL pulls.

FWIW, the code that runs it is already in source control, in my user repo space - http://hg.mozilla.org/users/jnightingale_mozilla.com/dashboard/ It could certainly be migrated and maintained as "more important than a hobby project" which might obviate the issue here, indeed.

Another thing that was discussed, though, was just having someone write a script which can farm the data I need from the DB into a static JS file that gets updated every 5 minutes or so.  That could run on the DB server itself, or somewhere otherwise-trusted - and the dashboard itself could live on people, or anywhere else convenient. Maybe that's the script Dave was talking about in comment 6 - that one doesn't exist yet, so it's certainly not in source control.  :)
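For illustration only, the "farm the data into a static JS file" idea from the comment above might look roughly like the sketch below. Nothing here comes from the real graph-server: the `results` table, its columns, the `dashboardData` variable name, and the output path are all hypothetical, and `sqlite3` stands in for whatever read-only MySQL connection would actually be used. The script would be run from cron every 5 minutes or so.

```python
import json
import os
import sqlite3
import tempfile


def export_static_js(conn, out_path, var_name="dashboardData"):
    """Dump the (hypothetical) `results` table into a static JS file.

    `conn` is any DB-API connection whose execute() returns a cursor;
    sqlite3 is used here purely as a stand-in for the real database.
    """
    rows = conn.execute(
        "SELECT dataset, timestamp, value FROM results "
        "ORDER BY dataset, timestamp"
    ).fetchall()

    # Group points by data set: {"dataset": [[timestamp, value], ...]}
    data = {}
    for dataset, timestamp, value in rows:
        data.setdefault(dataset, []).append([timestamp, value])

    payload = "var %s = %s;\n" % (var_name, json.dumps(data, sort_keys=True))

    # Write to a temp file in the same directory, then rename over the
    # target, so the dashboard never reads a half-written file.
    out_dir = os.path.dirname(os.path.abspath(out_path))
    fd, tmp_path = tempfile.mkstemp(dir=out_dir, suffix=".tmp")
    with os.fdopen(fd, "w") as fh:
        fh.write(payload)
    os.chmod(tmp_path, 0o644)  # mkstemp creates the file owner-only
    os.replace(tmp_path, out_path)
    return data
```

With something like this, the dashboard would load one static file per refresh instead of issuing one request per data set, and only the export script would need database credentials.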
I'm curious why the new JSON api calls are bringing the server to its knees. Is it the amount of requests, SQL queries or amount of data transmitted over http? How would direct db access make it faster or have less overhead?

AFAIK, the issue was how often/fast the dashboard code hit the API. Couldn't we just rate limit it?
(In reply to comment #10)
> I'm curious why the new JSON api calls are bringing the server to its knees. Is
> it the amount of requests, SQL queries or amount of data transmitted over http?
> How would direct db access make it faster or have less overhead?
> 
> AFAIK, the issue was how often/fast the dashboard code hit the API. Couldn't we
> just rate limit it?

To be clear, the dashboard code doesn't yet use the new APIs; it uses the old getdata.cgi (iirc), so if the new ones are more efficient, that is good news. As for rate limiting, I'm fine with doing so; indeed, I have already added sleeps before each fetch.

The dashboard is sort of bursty.  If it's not rate-limited, it has a cron job which, once every 5 minutes, spends about 7 seconds refreshing data, during which time it will send about 80 getdata.cgi calls, one for each data set.  If the expectation is that the new graph server should not suffer under that load, then it may well be that we don't need anything more complicated.
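As a rough sketch of the "sleeps before each fetch" approach: about 80 calls in about 7 seconds is roughly 11 requests per second, whereas spacing the same 80 calls evenly over the 5-minute window would be one request every 300 / 80 = 3.75 seconds. The helper below is hypothetical (it is not the dashboard's actual code); `fetch` is whatever function performs a single request, and `min_interval` is the minimum number of seconds between request starts.

```python
import time


def fetch_spaced(dataset_ids, fetch, min_interval,
                 sleep=time.sleep, clock=time.monotonic):
    """Call fetch(ds) for each data set, keeping at least `min_interval`
    seconds between the starts of consecutive calls.

    `sleep` and `clock` are injectable so the pacing can be tested
    without actually waiting.
    """
    results = {}
    last_start = None
    for ds in dataset_ids:
        if last_start is not None:
            # Only sleep for whatever part of the interval is still left.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results[ds] = fetch(ds)
    return results
```

Calling `fetch_spaced(ids, fetch_one, 3.75)` with 80 ids would spread one refresh across nearly the whole 5-minute cron interval; a smaller interval trades a shorter refresh for a sharper burst.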
I think this is WFM or INVALID or WONTFIX or something.
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → WONTFIX
(In reply to comment #12)
> I think this is WFM or INVALID or WONTFIX or something.

Sorry, nope we do need this. 

The comments below about limiting bursts of JSON requests are a short-term workaround. We'll need this read-only access to the db in place before we can change the dashboard from using JSON to reading SQL... and before we can turn off the tinderbox pages.

Oh, and totally do not want this on people either. If I need to work out another VM for safe reliable access, let me know.
Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---
(In reply to comment #13)
> (In reply to comment #12)
> > I think this is WFM or INVALID or WONTFIX or something.
> 
> Sorry, nope we do need this. 

Actually, we don't.

The correct approach is to update the dashboard to use the new graph server's JSON API, which should not substantially increase the load on the SQL server.

If this turns out not to be the case, the next step should be to have the web heads use a read-only mirror of the database.
Status: REOPENED → RESOLVED
Closed: 15 years ago
Resolution: --- → WONTFIX
Product: mozilla.org → mozilla.org Graveyard