Open Bug 1032080 Opened 10 years ago Updated 5 years ago

Use sereal for database schema

Categories

(bugzilla.mozilla.org :: General, task)

Version: Production
Type: task
Priority: Not set
Severity: normal

People

(Reporter: glob, Assigned: dylanAtHome)

Details

(Keywords: perf)

Attachments

(1 file)

we should change bz_schema from using perl/safe to json.

currently bz_schema is a 325k chunk of perl which is evaluated with perl's Safe module on every page load.  it would be much faster to use the JSON::XS module to serialise and deserialise the abstract schema.

the json version of the same variable is only 53k, and is more than 10 times faster to deserialise with JSON::XS than the current safe code (about 1176/s vs 100/s in the benchmark below).
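
The two load paths being compared can be sketched roughly as follows. This is only an illustration, not Bugzilla's actual code: the small `$schema` hash is made up, and core JSON::PP stands in for JSON::XS here (both export the same `encode_json`/`decode_json` functions), so the sketch runs without extra modules.

```perl
use strict;
use warnings;
use Safe;
use Data::Dumper;
use JSON::PP qw(encode_json decode_json);   # JSON::XS exports the same two functions

# Hypothetical miniature schema; the real abstract schema is far larger.
my $schema = {
    bugs => {
        FIELDS => [
            bug_id            => { TYPE => 'MEDIUMSERIAL', NOTNULL => 1 },
            status_whiteboard => { TYPE => 'MEDIUMTEXT',   NOTNULL => 1 },
        ],
    },
};

# Current approach: store Perl source ("$VAR1 = {...};") and evaluate it
# inside a Safe compartment. reval returns the value of the last statement.
my $perl_text = Dumper($schema);
my $from_safe = Safe->new->reval($perl_text);
die "reval failed: $@" if $@;

# Proposed approach: store JSON text and parse it.
my $json_text = encode_json($schema);
my $from_json = decode_json($json_text);

printf "perl source: %d bytes, json: %d bytes\n",
    length $perl_text, length $json_text;
```

Both paths yield an equivalent nested hashref; only the stored format and the decoding cost differ.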

benchmarking (1000 iterations):

      safe: 10 wallclock secs ( 9.62 usr +  0.42 sys = 10.04 CPU) @   99.60/s
safe_terse:  9 wallclock secs ( 8.65 usr +  0.00 sys =  8.65 CPU) @  115.61/s
   json_xs:  1 wallclock secs ( 0.85 usr +  0.00 sys =  0.85 CPU) @ 1176.47/s
   json_pp: 57 wallclock secs (55.03 usr +  0.00 sys = 55.03 CPU) @   18.17/s
      eval:  5 wallclock secs ( 5.69 usr +  0.00 sys =  5.69 CPU) @  175.75/s
eval_terse:  5 wallclock secs ( 5.18 usr +  0.00 sys =  5.18 CPU) @  193.05/s
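
Output in this format comes from Perl's core Benchmark module. A harness along these lines could be used to reproduce the comparison; it is a sketch under assumptions, not the script that produced the numbers above. The `%schema` data is a made-up stand-in, a new Safe compartment is created per iteration (the original may have done otherwise), and JSON::PP is used so the example needs only core modules. A `json_xs` case would call `JSON::XS::decode_json` the same way.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);
use Safe;
use Data::Dumper;
use JSON::PP qw(encode_json decode_json);   # core stand-in for JSON::XS

# Hypothetical stand-in data; the real input is the full bz_schema value.
my %schema;
$schema{"table_$_"} = { FIELDS => [ id => { TYPE => 'INT3', NOTNULL => 1 } ] }
    for 1 .. 50;

# Three stored forms of the same structure.
my $dumped = Dumper(\%schema);                                          # indented Perl source
my $terse  = do { local $Data::Dumper::Indent = 0; Dumper(\%schema) };  # minimised Perl source
my $json   = encode_json(\%schema);

# Time only the decoding step for each stored form.
my $results = timethese(200, {
    'safe'       => sub { Safe->new->reval($dumped) },
    'safe_terse' => sub { Safe->new->reval($terse) },
    'json_pp'    => sub { decode_json($json) },
    'eval'       => sub { my $VAR1; eval $dumped },
    'eval_terse' => sub { my $VAR1; eval $terse },
});
```

With so few iterations Benchmark may warn that the counts are unreliable; raise the count for real measurements.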

"safe" is the current code
"json_xs" is what i think we should be using
"eval" is a straight "eval" of bz_schema (this isn't used by bugzilla)
"eval_terse" and "safe_terse" are identical to their non-terse counterparts, however they use a minimised Data::Dumper string (61k, generated by setting $Data::Dumper::Indent = 0)
"json_pp" is the pure-perl json code (used if JSON::XS isn't installed)

i also propose moving JSON::XS from optional to mandatory.  JSON::XS is an established module and is widely available.  if requiring JSON::XS isn't agreeable, then we could use json only if JSON::XS is available.
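
The fallback option ("use json only if JSON::XS is available") might look something like the sketch below. The `freeze_schema` and `thaw_schema` helpers and the `json:`/`perl:` prefixes are hypothetical names invented for this illustration; they are not part of Bugzilla.

```perl
use strict;
use warnings;
use Safe;
use Data::Dumper;

# True when the optional JSON::XS module can be loaded.
my $have_json_xs = eval { require JSON::XS; 1 } ? 1 : 0;

# Hypothetical helper: serialise with JSON::XS when available, otherwise
# fall back to minimised Data::Dumper output. The prefix records the format.
sub freeze_schema {
    my ($schema) = @_;
    return 'json:' . JSON::XS::encode_json($schema) if $have_json_xs;
    local $Data::Dumper::Indent = 0;
    return 'perl:' . Dumper($schema);
}

# Hypothetical helper: decode according to the recorded format.
sub thaw_schema {
    my ($stored) = @_;
    my ($format, $body) = split /:/, $stored, 2;
    if ($format eq 'json') {
        die "stored schema is JSON but JSON::XS is not installed\n"
            unless $have_json_xs;
        return JSON::XS::decode_json($body);
    }
    my $schema = Safe->new->reval($body);
    die "could not evaluate stored schema: $@" if $@;
    return $schema;
}
```

Tagging the stored string is one way to cope with an installation that gains or loses JSON::XS after the schema was written; the cost is an extra code path to maintain, which is part of the argument for making the module mandatory.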

I agree that JSON::XS is much better and faster than Data::Dumper + Safe, and the resulting code is much easier to write and maintain. Note that the Sereal module is even faster and produces a smaller string than JSON::XS:

  http://search.cpan.org/~yves/Sereal/lib/Sereal.pm


My testing shows the following timing to stringify + perlify ABSTRACT_SCHEMA:

Results for Data::Dumper:
  Length of stored stringified data: 38748
  Encoding + decoding time (ms): 11.6109848022461

Results for JSON::XS:
  Length of stored stringified data: 34129
  Encoding + decoding time (ms): 1.09410285949707

Results for Sereal with Snappy compression:
  Length of stored stringified data: 7050
  Encoding + decoding time (ms): 0.854015350341797

Results for Sereal without compression:
  Length of stored stringified data: 19863
  Encoding + decoding time (ms): 0.761032104492188

In all cases, I called is_deeply() from Test::More to make sure that the decoded string generates the same hashref as the original ABSTRACT_SCHEMA.
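
A round-trip check of this kind could be written as below. This is a minimal sketch rather than the script behind the numbers above: `$schema` is a small made-up stand-in for ABSTRACT_SCHEMA, and it assumes the Sereal distribution (Sereal::Encoder and Sereal::Decoder) is installed.

```perl
use strict;
use warnings;
use Test::More;
use Sereal::Encoder;
use Sereal::Decoder;

# Hypothetical stand-in for ABSTRACT_SCHEMA.
my $schema = {
    bugs => { FIELDS => [ bug_id => { TYPE => 'MEDIUMSERIAL', NOTNULL => 1 } ] },
    cc   => { FIELDS => [ who    => { TYPE => 'INT3',         NOTNULL => 1 } ] },
};

# One encoder per configuration: uncompressed and Snappy-compressed.
my %encoders = (
    plain  => Sereal::Encoder->new,
    snappy => Sereal::Encoder->new({ compress => Sereal::Encoder::SRL_SNAPPY() }),
);
my $decoder = Sereal::Decoder->new;   # handles both forms

my %size;
for my $name (sort keys %encoders) {
    my $blob = $encoders{$name}->encode($schema);   # stringify
    my $copy = $decoder->decode($blob);             # perlify
    $size{$name} = length $blob;
    is_deeply($copy, $schema, "$name round trip matches the original");
}
note "bytes: plain=$size{plain}, snappy=$size{snappy}";
done_testing;
```

On a structure this small the compressed form is not necessarily shorter; the size advantage reported above was measured on the full schema.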
Severity: normal → enhancement
Assignee: glob → database
Assignee: database → dylan
Keywords: perf
Summary: change bz_schema from using perl/safe to json → change bz_schema from using perl/safe to Sereal
Attached patch 1032080_1.patch
Work in progress. This causes a strange error currently:

Updating column status_whiteboard in table bugs ...
Old: mediumtext NOT NULL
New: mediumtext DEFAULT '' NOT NULL
DBD::mysql::db do failed: BLOB, TEXT, GEOMETRY or JSON column 'status_whiteboard' can't have a default value [for Statement "ALTER TABLE bugs ALTER COLUMN status_whiteboard
                               SET DEFAULT ''"] at Bugzilla/DB.pm line 724.
What's the purpose of Bugzilla/Sereal.pm? Why not put the relevant code in Bugzilla/DB/Schema.pm directly?
Sharing the objects is useful for performance. Memcached will use this, and we can also implement dclone in terms of Sereal here. (Will file a bug for that later.)
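
The idea of shared objects plus a Sereal-based dclone might be sketched as follows. The patch's actual Bugzilla/Sereal.pm is not shown in this bug, so every name here (`shared_encoder`, `shared_decoder`, `sereal_dclone`) is hypothetical, and the sketch assumes Sereal is installed.

```perl
use strict;
use warnings;
use Sereal::Encoder;
use Sereal::Decoder;

# Hypothetical module-level cache: build each object once and reuse it,
# instead of constructing a new encoder/decoder at every call site.
my ($encoder, $decoder);
sub shared_encoder { return $encoder ||= Sereal::Encoder->new }
sub shared_decoder { return $decoder ||= Sereal::Decoder->new }

# Deep copy by encoding and immediately decoding, in the spirit of
# Storable::dclone.
sub sereal_dclone {
    my ($data) = @_;
    return shared_decoder()->decode(shared_encoder()->encode($data));
}

my $original = { columns => [ 'bug_id', { TYPE => 'INT3' } ] };
my $copy     = sereal_dclone($original);
$copy->{columns}[1]{TYPE} = 'INT4';   # changes the copy only
```

Any other caller, such as a memcached layer, could go through the same two accessors and so share the same objects.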
Assignee: dylan → dylan
Component: Database → General
Product: Bugzilla → bugzilla.mozilla.org
QA Contact: default-qa
Summary: change bz_schema from using perl/safe to Sereal → Use sereal for database schema
Version: unspecified → Production
Type: enhancement → task