Closed Bug 384099 Opened 13 years ago Closed 12 years ago

get current Tp/etc. data onto new graph server

Categories

(Webtools Graveyard :: Graph Server, defect)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: vlad, Assigned: anodelman)

Attachments

(2 files, 2 obsolete files)

The new graph server used to run a sync process as a cron job that would wget -c the data files from the old graph server and then import them.  This hasn't been running for a while, I believe ever since it was switched to a new machine.  Can we get this running again?

rhelmer, we talked a while back about just making the tinderboxes push to both the new graph server and the old -- is that easier to do now?  Doing a one-time import and then auto pushes would probably be the best solution..
Looks like the crontab itself is under your user.

vladimir@bm-buildgraph01[1014]$ crontab -l
SHELL=/bin/sh

#0 */2 * * * /home/vladimir/graphs/utils/pull >/dev/null 2>/dev/null

Maybe try uncommenting it yourself and see if it works? :)
Hardware: PC → All
(In reply to comment #0)
> rhelmer, we talked a while back about just making the tinderboxes push to both
> the new graph server and the old -- is that easier to do now?  Doing a one-time
> import and then auto pushes would probably be the best solution..

No easier than before, but certainly doable :)

How is this better than pulling the data files over and importing them, though? 
It's under my user because IT didn't give me or alice root access to that box, and at one point, didn't want to create an account for her -- so she was logging in as me to do her work.  I don't know if that's changed, but she would know better what's going on there; I'm not sure if that script even still works, with her changes.

rob, I thought we wanted to wait until you had an easy way to update all the tinderboxes without having to pull new code manually or something.  Doing the crontab update is possible, it's just pretty clunky and wasteful.  But we can keep that going if that's easiest.
(In reply to comment #3)
> rob, I thought we wanted to wait until you had an easy way to update all the
> tinderboxes without having to do pull new code manually or something.  Doing
> the crontab update is possible, it's just pretty clunky and wasteful.  But we
> can keep that going if that's easiest.

Oh, yeah everything is on auto-update now, I guess it was quite a while ago that we discussed that. I don't think it'd be very hard to make a change to tinderbox that makes it deploy to two servers.. 

But I wonder if it'd be better just to make a change to build-graphs, and make it replicate changes (e.g. replay the HTTP request it received to the new graphs server)? Then the change would be in one place, wouldn't depend on configuring a bunch of standalone tinderbox machines, etc.
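A minimal sketch of the replay idea, assuming the tinderboxes report with plain HTTP GET requests and that the old server can forward the path and query string it received. The host name and the `replay` and `build_replay_urls` helpers are hypothetical; the thread does not show the real endpoints or the build-graphs code.

```python
import urllib.request

# Hypothetical mirror list, for illustration only.
MIRRORS = ["http://new-graphs.example.org"]


def build_replay_urls(path_and_query, mirrors=MIRRORS):
    """Return the URL to request on each mirror for one incoming request."""
    return [m.rstrip("/") + "/" + path_and_query.lstrip("/") for m in mirrors]


def replay(path_and_query, mirrors=MIRRORS, opener=urllib.request.urlopen):
    """Re-send one received GET request to every mirror.

    Failures are collected rather than raised, so a mirror that is down
    does not break the primary server's own handling of the request.
    """
    failures = []
    for url in build_replay_urls(path_and_query, mirrors):
        try:
            opener(url, timeout=10).close()
        except OSError as exc:  # urllib's URLError is a subclass of OSError
            failures.append((url, exc))
    return failures
```

The `opener` parameter exists only so the function can be exercised without network access; in this sketch the change would live in one place on the old server rather than on each tinderbox.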

As it stands, I don't believe that the pull-script would currently work because of changes I made to the database schema.  I don't believe that it would be very hard to get it back up and running if we want to stick with a pull-system instead of having the old graph server push (replay) data to the new.
We'll need the script anyway to import historical data.

I'd much rather have the tinderboxes push, because otherwise we end up depending on the old graph server stuff -- having the tinderboxes do it means that we have the option of turning it off at some point.  It also means that we can change the tinderbox links to link to the new server when looking at graph data.
I would really really like this to happen. The way current graphs work, if a large number appears on the graph it makes it totally unreadable, since there's no zooming functionality. This makes it extremely hard to view long-term data trends.

Alice, do you think you could take a stab at this, even if it's just updating the database schema and getting the old pull script working again?
I'm going to work on getting the pull scripts up to date with the new schema.
That should be the fastest route to getting the data onto the new graph server.
Status: NEW → ASSIGNED
Assignee: nobody → anodelman
Status: ASSIGNED → NEW
The pull scripts are working again.  What's missing is a list of tinderboxes to pull data from (names, machine types (win/linux/etc), branch (1.9, 1.8, etc)).  The list that was initially put together by vlad is now out of date.

I'm waiting on advice from vlad/joduinn.
I sent alice an up-to-date list.
Here are the fixed scripts for pulling tinderbox data to the new graph server.  The tinderboxes/tests being synced are listed in pull.sh; after much double checking, I think these actually reflect the active tinderboxes.
Attachment #270383 - Flags: review?(vladimir)
Comment on attachment 270383
updated pull scripts for the new graph server

Looks good, thanks alice!

At some point we should probably pull out the db manipulation python into its own module that can be shared amongst the scripts, so we don't have to keep duplicating the schema, etc., but it's not crucial.
Attachment #270383 - Flags: review?(vladimir) → review+
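One way the shared-module suggestion above could look, as a sketch only. It assumes an SQLite store accessed through Python's `sqlite3` module; the table names, columns, and the `connect` and `add_value` helpers are all hypothetical, since the real schema is not shown in this thread.

```python
import sqlite3

# Hypothetical schema, defined once so every script imports it from here
# instead of carrying its own copy.
SCHEMA = """
CREATE TABLE IF NOT EXISTS dataset_info (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    machine TEXT, test TEXT, branch TEXT
);
CREATE TABLE IF NOT EXISTS dataset_values (
    dataset_id INTEGER, time INTEGER, value REAL
);
"""


def connect(path):
    """Open the database and make sure the tables exist."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def add_value(db, machine, test, branch, time, value):
    """Store one data point, creating the dataset row on first use."""
    key = (machine, test, branch)
    row = db.execute(
        "SELECT id FROM dataset_info WHERE machine=? AND test=? AND branch=?",
        key,
    ).fetchone()
    if row is None:
        cur = db.execute(
            "INSERT INTO dataset_info (machine, test, branch) VALUES (?, ?, ?)",
            key,
        )
        dataset_id = cur.lastrowid
    else:
        dataset_id = row[0]
    db.execute(
        "INSERT INTO dataset_values VALUES (?, ?, ?)", (dataset_id, time, value)
    )
    db.commit()
    return dataset_id
```

With a module like this, an import script and a pull script would both call `connect()` and `add_value()`, so a schema change would only need to be made in one file.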
Checking in utils/import.py;
/cvsroot/mozilla/webtools/new-graph/utils/import.py,v  <--  import.py
new revision: 1.9; previous revision: 1.8
done
RCS file: /cvsroot/mozilla/webtools/new-graph/utils/pull.sh,v
done
Checking in utils/pull.sh;
/cvsroot/mozilla/webtools/new-graph/utils/pull.sh,v  <--  pull.sh
initial revision: 1.1
done
Status: NEW → RESOLVED
Closed: 13 years ago
Resolution: --- → FIXED
Reed - when will the data actually appear on graphs.mozilla.org? 
(In reply to comment #14)
> Reed - when will the data actually appear on graphs.mozilla.org? 

It's live on graphs-stage.mozilla.org currently. Once Alice determines that it is fine, I can add the script to production (graphs.mozilla.org).

<alice> okay.  we'll watch it for a day or so and then flip it on for production
Ok - reopening.  Please mark Fixed when it is on graphs.mozilla.org
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
I wasn't careful enough in the earlier patch with the data that the tinderboxes were giving me. I now skip over missing data (i.e., val = nan) and strip newlines from the raw data, which can break JSON string passing.
Attachment #270383 - Attachment is obsolete: true
Attachment #270811 - Flags: review?(vladimir)
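A minimal sketch of the two fixes described in the patch comment above: dropping rows whose value is NaN and removing line breaks from the raw text. The `(timestamp, value_text, raw_text)` row layout and the `clean_rows` name are assumptions for illustration and are not taken from the attached patch. `math.isnan` is used here, which is available in Python 2.6 and later.

```python
import json
import math


def clean_rows(rows):
    """Yield (timestamp, value, raw) tuples that are safe to import.

    `rows` is an iterable of (timestamp, value_text, raw_text) string
    tuples; this layout is assumed for the example.
    """
    for timestamp, value_text, raw_text in rows:
        try:
            value = float(value_text)
        except ValueError:
            continue  # not a number at all
        if math.isnan(value):
            continue  # the tinderbox reported no data for this run
        # Replace embedded line breaks so the raw text fits on one line.
        raw = " ".join(raw_text.splitlines())
        yield int(timestamp), value, raw


rows = [("1183000000", "312.5", "run 1\nrun 2\n"), ("1183000600", "nan", "")]
cleaned = list(clean_rows(rows))
print(json.dumps(cleaned))  # → [[1183000000, 312.5, "run 1 run 2"]]
```

Stripping the line breaks matters mostly when JSON text is assembled by hand, because a raw newline inside a JSON string literal is not valid JSON.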
Comment on attachment 270811
fix for pulling data from the tinderboxes to the new graph server

Can you make this a diff against the (now existing) files instead of adding pull.sh again, etc.?
Attached patch: reduced patch to fix first patch (obsolete)
This fix can be applied after the original patch.
Attachment #270811 - Attachment is obsolete: true
Attachment #270838 - Flags: review?(vladimir)
Attachment #270811 - Flags: review?(vladimir)
Attachment #270383 - Attachment is obsolete: false
Comment on attachment 270838
reduced patch to fix first patch

should work, though I think you want to use math.isnan or some function similar to that instead of checking for == 'nan'
Attachment #270838 - Flags: review?(vladimir) → review+
Actually, I'm not aware that Python has an isnan function, oddly.  val != val will do the trick for detecting NaN.
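The self-inequality check mentioned above can be sketched as follows. It relies on the IEEE 754 rule that NaN compares unequal to everything, including itself. The `is_nan` name and the sample strings are illustrative only; `math.isnan` was added in Python 2.6 and gives the same answer for floats.

```python
import math


def is_nan(val):
    """True only for NaN, the one float value not equal to itself."""
    return val != val


samples = ["nan", "NaN", " nan\n", "1.5"]

# Comparing the text against 'nan' misses other spellings of the same value.
print([text == "nan" for text in samples])        # → [True, False, False, False]

# Parsing first and testing the number catches all of them.
print([is_nan(float(text)) for text in samples])  # → [True, True, True, False]
```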
Smarter checking for NaN values, as recommended in previous comments.  I tested this and it seems to work fine; I don't find my database cluttered with NaNs.
Attachment #270838 - Attachment is obsolete: true
Attachment #271089 - Flags: review?(vladimir)
Attachment #271089 - Flags: review?(vladimir) → review+
Checking in import.py;
/cvsroot/mozilla/webtools/new-graph/utils/import.py,v  <--  import.py
new revision: 1.10; previous revision: 1.9
done
I'm happy with how things look on graphs-stage (ie, no more 'nan' in results, everything graphs okay).  I think that we can safely push to production.
(In reply to comment #24)
> I'm happy with how things look on graphs-stage (ie, no more 'nan' in results,
> everything graphs okay).  I think that we can safely push to production.

This was done a while ago. Marking fixed.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 13 years ago
Resolution: --- → FIXED
So - the data doesn't look complete or right to me:

http://graphs.mozilla.org/#spst=range&spstart=1155027000&spend=1181795460&bpst=cursor&bpsc=1183658373.7714286&bpstart=1155027000&bpend=1181795460&m1tid=5252&m1bl=0&m1avg=0

Note that there's a tiny bit of data from 2006 and a tiny bit recently and nothing in the middle.  Did we verify the data?

Status: RESOLVED → REOPENED
Resolution: FIXED → ---
Looks like a match with build-graphs:

http://build-graphs.mozilla.org/graph.cgi?tbox=bl-bldxp01&testname=dhtml&autoscale=0&days=0

Maybe we just aren't interested in that particular data set?  My rationale for choosing tinderboxes to pull data from was to pick those that were actively reporting - in this case it appears to be a box that used to report and is now reporting again.
FWIW, I filed bug 388831 for getting the new Linux reference box and Mac OS X leak box numbers added.
You are comparing bl-bldx01 and bl-bldx01_head - that would be the difference.  If we want to switch to gathering info for bl-bldx01_head we can do that, but I still don't think that there is a problem in the pull script.
This has been up and running for some time.  Going to close this bug - if there is another issue we can re-open.
Status: REOPENED → RESOLVED
Closed: 13 years ago → 12 years ago
Resolution: --- → FIXED
Product: Webtools → Webtools Graveyard