The default bug view has changed. See this FAQ.

improve instrumentation of bugzilla's internals

RESOLVED FIXED

Status

()

bugzilla.mozilla.org
General
P1
normal
RESOLVED FIXED
3 years ago
3 years ago

People

(Reporter: glob, Assigned: glob)

Tracking

({bmo-goal})

Production
bmo-goal
Dependency tree / graph

Details

Attachments

(3 attachments, 3 obsolete attachments)

(Assignee)

Description

3 years ago
we need to improve the instrumentation of bugzilla's internals with a mind to new-relic integration once their perl sdk is available.

bugzilla instrumentation

- determine instrumentation points
- provide an abstraction class so we can swap different reporting systems in/out

- start_web_transaction and end_web_transaction
  - call at the start and end of the script
  - simplest option is in mod_perl.pl
  - for mod_cgi support BEGIN and END blocks in Bugzilla.pm

- name_web_transaction
  - default to cgi's filename (eg. show_bug)
  - break down each api call (by transport and method) from webservice base class
  ? identify bzapi calls (from ua)

- start_datastore_statement and end_datastore_statement
  - wrap each dbi call which hits the database
  - simplest option is to add wrapper methods to Bugzilla::DBI for
    - do
    - selectrow_hashref
    - selectall_arrayref
    - ...
  - prepare + execute requires more work
    - subclass DBI::st
    - prepare --> grab sql
    - execute

Updated

3 years ago
Priority: -- → P1
(Assignee)

Comment 1

3 years ago
Created attachment 8362847 [details]
sample wip data

here's a sample of the data i've gathered after hooking up templates and database calls (show_bug for id 35).

Updated

3 years ago
Status: NEW → ASSIGNED
(Assignee)

Comment 2

3 years ago
Created attachment 8370776 [details] [diff] [review]
956230_1.patch

here's an "all dressed up with nowhere to go" patch.

this removes the current instrumentation class, replacing it with collectors for template-toolkit and mysql.  the data collection can be considered complete at this point.

i have two mock-up reporters, one which dumps to STDERR and another which dumps into a data/metrics.json file.  once we've figured out where to send this data so it can be reported upon, i'll add a reporter for that.
Attachment #8370776 - Flags: feedback?(dkl)
(Assignee)

Comment 3

3 years ago
Created attachment 8370778 [details]
sample json data
Attachment #8362847 - Attachment is obsolete: true
(Assignee)

Comment 4

3 years ago
Created attachment 8371300 [details] [diff] [review]
956230_2.patch

looks like elasticsearch is a go for receiving this data, with some modifications to the json structure.

here's an updated patch which includes the start of an ES reporter, which flattens the json structure after building a path array for each node.
Attachment #8370776 - Attachment is obsolete: true
Attachment #8370776 - Flags: feedback?(dkl)
Attachment #8371300 - Flags: feedback?(dkl)
(Assignee)

Comment 5

3 years ago
Created attachment 8371301 [details]
sample flat json data
(Assignee)

Updated

3 years ago
Attachment #8371301 - Attachment description: sample flag json data → sample flat json data
(Assignee)

Comment 6

3 years ago
Created attachment 8377551 [details] [diff] [review]
956230_3.patch

new since the last revision:
- configurable via parameters
- reports on memcached hit rates
- reporting restricted to show_bug.cgi and list of specified users
- log stacktrace instead of raw sql for database queries
- add this_duration to elasticsearch json
- removes confidential/personal data
- actually sends to elasticsearch
Attachment #8371300 - Attachment is obsolete: true
Attachment #8371300 - Flags: feedback?(dkl)
Attachment #8377551 - Flags: review?(dkl)
(Assignee)

Updated

3 years ago
Depends on: 974812
Comment on attachment 8377551 [details] [diff] [review]
956230_3.patch

Review of attachment 8377551 [details] [diff] [review]:
-----------------------------------------------------------------

All very nice. Code looks good and also works properly when using the STDERR reporter. I did not test the ElasticSearch submission pieces but look good from inspection. Few non-critical comments included.

r=dkl

::: Bugzilla.pm
@@ +131,5 @@
>  
>      my $script = basename($0);
>  
> +    # BMO - init metrics collection if required
> +    if (i_am_cgi() && $script eq 'show_bug.cgi') {

Should we also be looking for process_bug.cgi, page.cgi, xmlrpc.cgi, jsonrpc.cgi, and rest.cgi as well as you have altered those to include metrics data?

Maybe add a METRICS_SCRIPT_WHITELIST constant to keep the list of scripts to enable for. I realize you want to just do show_bug.cgi I would personally like to see metrics on the API services.

::: Bugzilla/Metrics/Reporter/STDERR.pm
@@ +61,5 @@
> +        $timer->{start_time} = $timer->{start_time} - $start_time;
> +        delete $timer->{end_time};
> +
> +        # show times in ms instead of fractional seconds
> +        foreach my $field (qw( start_time duration duration_this )) {

Use of uninitialized value in multiplication (*) at /home/bugzilla/devel/htdocs/956230/Bugzilla/Metrics/Reporter/STDERR.pm line 66

This is happening when backgrounding Reporter/STDERR.pm when accessing some timing data that does not have a duration_this key. From looking at the code duration_this is not set 
for memcache timing which I have enabled.

::: metrics.pl
@@ +9,5 @@
> +
> +#
> +# report errors to sentry
> +# expects a filename with a Data::Dumper serialised parameters
> +# called by Bugzilla::Sentry

remove this section

::: template/en/default/admin/params/advanced.html.tmpl
@@ +90,5 @@
> +  metrics_enabled =>
> +    "Collect metrics for reporting to ElasticSearch",
> +  metrics_user_ids =>
> +    "Comma separated list of user_id's which trigger data collection and reporting."
> +    _ " eg <code>3881,5038,5898,13647,20209,251051,373476,409787</code>.",

nit: <kbd></kbd> and others as it is not actually code.
Attachment #8377551 - Flags: review?(dkl) → review+
(Assignee)

Comment 8

3 years ago
(In reply to David Lawrence [:dkl] from comment #7)
> Should we also be looking for process_bug.cgi, page.cgi, xmlrpc.cgi,
> jsonrpc.cgi, and rest.cgi as well as you have altered those to include
> metrics data?

right now we're only interested in show_bug for specific users.

the other files were altered to make it easier to add those endpoints at a later date (this filtering code should be the only place we need to touch).

> Maybe add a METRICS_SCRIPT_WHITELIST constant to keep the list of scripts to
> enable for. I realize you want to just do show_bug.cgi I would personally
> like to see metrics on the API services.

as do i!  we can easily generate a volume of data that will swamp elasticsearch, so we'll probably require more complicated filtering rules than just the script name when we enable that.
(Assignee)

Comment 9

3 years ago
Committing to: bzr+ssh://bjones%40mozilla.com@bzr.mozilla.org/bmo/4.2/
modified Bugzilla.pm
added metrics.pl
modified page.cgi
modified process_bug.cgi
modified sentry.pl
modified show_bug.cgi
modified Bugzilla/DB.pm
missing Bugzilla/Instrument.pm
deleted Bugzilla/Instrument.pm
added Bugzilla/Metrics
modified Bugzilla/Template.pm
modified Bugzilla/Config/Advanced.pm
modified Bugzilla/Install/Filesystem.pm
modified Bugzilla/Install/Requirements.pm
added Bugzilla/Metrics/Collector.pm
added Bugzilla/Metrics/Memcached.pm
added Bugzilla/Metrics/Mysql.pm
added Bugzilla/Metrics/Reporter
added Bugzilla/Metrics/Reporter.pm
added Bugzilla/Metrics/Template
added Bugzilla/Metrics/Template.pm
added Bugzilla/Metrics/Reporter/ElasticSearch.pm
added Bugzilla/Metrics/Reporter/STDERR.pm
added Bugzilla/Metrics/Template/Context.pm
modified Bugzilla/WebService/Server.pm
modified template/en/default/admin/params/advanced.html.tmpl
Committed revision 9263.
Status: ASSIGNED → RESOLVED
Last Resolved: 3 years ago
Resolution: --- → FIXED
(Assignee)

Updated

3 years ago
Depends on: 980188
(Assignee)

Comment 10

3 years ago
unfortunately i overlooked the new module requirement for elasticsearch (bug 980188).
as there isn't a standard rpm for this module, it wasn't able to get installed in time for today's push.

this instrumentation code has been pushed to production, however i'll leave it disabled until the module has been installed and tested on our staging environment.
You need to log in before you can comment on or make changes to this bug.