1033288 - Create documentation on Mana about our Memcache clusters

Reporter

Description

•

10 years ago

As part of the releng nagios audit, we've come up with a few open monitoring and documentation questions/actions for webops.

Questions:
1) why isn't upload2.dmz.scl3.mozilla.com in hostgroup seamicro-nodes like other seamicro nodes?
2) we seem to monitor product-delivery-ftp-vip but not product-delivery-ftp. Should we be monitoring both?


Actions:
1) please add documentation for memcached to mana
2) the tbpl documentation doesn't list any database dependencies, but there are two tbpl databases. Can you please update the documentation and/or document what the tbpl databases are for?

Thanks!

:kanban

Updated

•

10 years ago

Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/479]

Amy Rich [:arr] [:arich]

Reporter

Updated

•

10 years ago

Blocks: 993044

C. Liang [:cyliang]

Comment 1

•

9 years ago

I suspect that the requests in this bug have been rendered moot during the intervening time.  (>_<)  I wanted to check if any of these were currently relevant and, if so, get them done.

RE: upload2.dmz.scl3.mozilla.com
This server went virtual in bug 1061825, so it should no longer need to be grouped with the seamicro nodes.


RE: monitoring of product-delivery-ftp-vip versus product-delivery-ftp
I'm not quite sure I grok this.  (This request may have been fulfilled by someone else.)  It looks like these are two host groups in SCL3, with the  ftp-vip hostgroup monitoring the ZLB-related properties and the ftp hostgroup monitoring things on a server level.  What were you looking to have monitored?


RE: memcached docs 
Are you looking for "generic" memcached documentation?  (I didn't find any in mana.)  There are docs for responding to Nagios memcache alerts (https://mana.mozilla.org/wiki/display/NAGIOS/memcached).


RE: documenting TBPL databases
It looks like both the mana page for tbpl.mozilla.org (https://mana.mozilla.org/wiki/display/websites/tbpl.mozilla.org#tbpl.mozilla.org-Database) and the wiki architecture docs (https://wiki.mozilla.org/Sheriffing/TBPL/DeveloperDocs#Architecture) have the DBs listed.  Is this sufficient or are you looking to capture what would be in the table descriptions?

:kanban

Updated

•

9 years ago

Whiteboard: [kanban:https://kanbanize.com/ctrl_board/4/479] → [kanban:https://webops.kanbanize.com/ctrl_board/2/236]

:Atoll

Assignee

Comment 2

•

9 years ago

(In reply to C. Liang [:cyliang] from comment #1)
> RE: monitoring of product-delivery-ftp-vip versus product-delivery-ftp
> I'm not quite sure I grok this.  (This request may have been fulfilled by
> someone else.)  It looks like these are two host groups in SCL3, with the 
> ftp-vip hostgroup monitoring the ZLB-related properties and the ftp
> hostgroup monitoring things on a server level.  What were you looking to
> have monitored?

> RE: memcached docs 
> Are you looking for "generic" memcached documentation?  (I didn't find any
> in mana.)  There are docs for responding to Nagios memcache alerts
> (https://mana.mozilla.org/wiki/display/NAGIOS/memcached).
> 
> 
> RE: documenting TBPL databases
> It looks like both the mana page for tbpl.mozilla.org
> (https://mana.mozilla.org/wiki/display/websites/tbpl.mozilla.org#tbpl.mozilla.org-Database)
> and the wiki architecture docs
> (https://wiki.mozilla.org/Sheriffing/TBPL/DeveloperDocs#Architecture) have
> the DBs listed.  Is this sufficient or are you looking to capture what would
> be in the table descriptions?

Setting needinfo? :arr for these three questions.

Flags: needinfo?(arich)

Amy Rich [:arr] [:arich]

Reporter

Comment 3

•

9 years ago

Since product delivery is changing, I think the first is a no-op.

It looks like the database info is there for TBPL now, cool.

As for memcached, I'm looking for documentation on how it's architected/configured at Mozilla.

Flags: needinfo?(arich)

:Atoll

Assignee

Comment 4

•

9 years ago

(In reply to Amy Rich [:arich] [:arr] from comment #3)
> As for memcached, I'm looking for documentation on how it's
> architected/configured at Mozilla.

I looked through all of our mentions of memcached in mana, and it appears that memcached is generally documented as "1-X servers" on each individual app's page. It's not immediately apparent to me how each app chooses to make use of its memcached server pool, but I have a suspicion (based on various mentions of Couchbase at certain points) that it's basically up to each app to make use of the memcached pool however it sees fit.

Specifically, which apps are of interest to releng as part of this Nagios check review bug? I can pin down their sharding mechanisms if needed, or if the general answer "each app uses the configured pool of memcached as designed" without a more precise definition of sharding is okay, cool. (Or if there are other questions about memcached I'm not addressing in this reply, I can try to answer those as well.)

Flags: needinfo?(arich)

Amy Rich [:arr] [:arich]

Reporter

Comment 5

•

9 years ago

The perspective I was looking at this from was that memcached is its own service, so I would expect to see some sort of documentation in either SysAdminWiki or IT Wiki (possibly using the ServiceTemplate) that documents the service itself. Since we have multiple different memcached "clusters," maybe that documentation just talks about the defaults (what specs the VMs are created with, do we do kernel tuning, what's our default for sharding, cachesize, etc). Maybe a reference to the puppet module and node defs where these things can be configured. The service documentation page would also have links to the nagios checks, etc. Does that make sense?

Flags: needinfo?(arich)

:kanban

Updated

•

9 years ago

Assignee: server-ops-webops → rsoderberg

Shyam Mani [:fox2mike]

Updated

•

8 years ago

QA Contact: nmaul → smani

Summary: releng nagios audit: open questions/actions for webops → Create documentation on Mana about our Memcache clusters

:Atoll

Assignee

Comment 6

•

8 years ago

(In reply to Amy Rich [:arr] [:arich] from comment #5)
> The perspective I was looking at this from was that memcached is its own
> service, so I would expect to see some sort of documentation in either
> SysAdminWiki or IT Wiki (possibly using the ServiceTemplate) that documents
> the service itself. Since we have multiple different memcached "clusters,"
> maybe that documentation just talks about the defaults (what specs the VMs
> are created with, do we do kernel tuning, what's our default for sharding,
> cachesize, etc). Maybe a reference to the puppet module and node defs where
> these things can be configured. The service documentation page would also
> have links to the nagios checks, etc. Does that make sense?

This does make sense, but we don't have *any* of these things defined. They are all inherited from legacy snowflake deployments, and there are no best practices evaluated, tested, defined, and so on. This is obviously wildly suboptimal, but at least we can accurately define how it functions today.

I'd like to say that we're going to do this someday, but we clearly aren't going to get around to it for a very long time to come. RESO INCO until we can get further traction on this.

Status: NEW → RESOLVED

Closed: 8 years ago

Resolution: --- → INCOMPLETE

BMO Automation

Updated

•

5 years ago

Product: Infrastructure & Operations → Infrastructure & Operations Graveyard


Categories

(Infrastructure & Operations Graveyard :: WebOps: Other, task)

Tracking

(Not tracked)

People

(Reporter: arich, Assigned: Atoll)

References

Details

(Whiteboard: [kanban:https://webops.kanbanize.com/ctrl_board/2/236] )

Crash Data

Security

(public)
