Closed Bug 1052724 Opened 10 years ago Closed 10 years ago

Use JSON::XS instead of Data::Dumper to store parameters into data/params

Categories

(Bugzilla :: Administration, task)

4.5.5
task
Not set
normal

Tracking

()

RESOLVED FIXED
Bugzilla 5.0

People

(Reporter: Wurblzap, Assigned: LpSolit)

References

Details

Attachments

(1 file, 1 obsolete file)

The data/params file is not necessarily encoded in UTF-8 even if Bugzilla's utf8 switch is turned on. This causes problems especially for parameters which are group names, if these group names contain non-ASCII characters.

This can be reproduced with a script along the lines of
----------------------------------------------------------------------
#!/usr/bin/perl -wT

use 5.10.1;
use strict;
use lib qw(. lib);

use Bugzilla;

my $chart_group_name    = Bugzilla->params->{chartgroup};
my $chart_group         = Bugzilla::Group->new({name => $chart_group_name});
my $chart_group_members = $chart_group->members_direct();
----------------------------------------------------------------------

If the chartgroup parameter contains a group with an ASCII-only name, there is no error. If the group does contain a non-ASCII-only name, there will be an error.

This causes problems for all group-specifying parameters.
I cannot reproduce. Which group name did you use to trigger the error?
I can reproduce with a group named "Frédéric" ;)

editparams.cgi writes "Fr\x{e9}d\x{e9}ric" into data/params for this. This appears to be escaped.
If I manually modify data/params so that the chartgroup parameter contains a utf8-ly correct "Frédéric", the testing script above works without error. (But then editparams.cgi cannot read the parameter correctly.)

Even more weird, if I have "Fr\x{e9}d\x{e9}ric" in data/params, and then put some non-ASCII in announcehtml, chartgroup gets modified into plain UTF-8 "Frédéric".

Fwiw, I'm on Windows, with Perl 5.16.3.
That's fun. Your script fails with a group named "Frédéric", but works fine with a group named "Дрееву".
The problem comes from Bugzilla::Config::write_params() which calls Data::Dumper->Dump() to write parameters into data/params. From what I found in the documentation, Data::Dumper *always* UTF8-escapes strings, which explains why you see "Fr\x{e9}d\x{e9}ric" instead of "Frédéric" in data/params. To read data/params, we use Safe->rdo() which reads "Fr\x{e9}d\x{e9}ric" and is unaware that we are talking about a UTF8-escaped string. So unless you first call utf8::upgrade() on this parameter value before using it, Bugzilla will "correctly" complain that there is no group named "Fr\x{e9}d\x{e9}ric".

So the easiest fix is to loop over each parameter value in Bugzilla::Config::read_param_file() and call utf8::upgrade(). Another solution would be to stop using Data::Dumper to save parameters into data/params so that we can specify "binmode $fh, ':utf8'" and write data ourselves. But special care would be required to escape/format data correctly.
Or convert data/params from Data::Dumper to JSON::XS which handles unicode both directions for us.
(In reply to (Away Aug 16-22) David Lawrence [:dkl] from comment #5)
> Or convert data/params from Data::Dumper to JSON::XS which handles unicode
> both directions for us.

I agree we should do that. See also bug 1032080.
Attached patch patch, v1 (obsolete) — Splinter Review
Attachment #8473779 - Flags: review?(dkl)
Attachment #8473779 - Flags: review?(bugzilla.1.wurblzap)
Summary: Even with utf8 switched on, parameters are being (may be?) written in other character set encoding → Use JSON::XS instead of Data::Dumper to store parameters into data/params
Target Milestone: --- → Bugzilla 5.0
Comment on attachment 8473779 [details] [diff] [review]
patch, v1

Very nice, r=Wurblzap :)

I really think checksetup.pl should do a conversion (switching data/params to data/params.js, perhaps?), so that we don't have to carry around this eval in live code, but this can be done in another bug.
Attachment #8473779 - Flags: review?(bugzilla.1.wurblzap) → review+
(In reply to Marc Schumann [:Wurblzap] from comment #8)
> I really think checksetup.pl should do a conversion

I thought about that too, but there are two problems:
1) checksetup.pl also calls this code. It doesn't run a separate code. We could certainly split the code into two pieces: moving Safe->rdo() into checksetup.pl itself and leaving JSON::XS->decode in Bugzilla::Config. But see 2) below.
2) when data/params is badly formatted, e.g. because you manually edited it and omitted a double quote, then JSON::XS throws an error and displays the line which causes problem. And depending on which line is displayed, it could leak sensitive data, such as smtp_password or RADIUS_secret. That's why you have to catch the error with eval {} to display a more generic error.

Data leakage in the error message is the reason why I use eval {}, and so it was easier to leave Safe->rdo() where it is now. But if you still think it's a better idea to move the Safe->rdo() code elsewhere, this is doable, but I don't see how to not use eval {} without leaking sensitive data.

Also, I think that if we want the new JS hash into data/params.js, then this should be done as part of this bug, not a separate one, else you would have to guess how the current data/params file is formatted (Data::Dumper output or JSON one), which would require additional code. It's much easier to assume that data/params = Data::Dumper, and data/params.js = JSON.
Attached patch patch, v2Splinter Review
Here is a 2nd patch in which I implement Marc's suggestion to have a distinct params.js file for the JSON format. I also made the comment clearer about why you have to eval()'uate JSON::XS->decode in all cases. See http://search.cpan.org/~mlehmann/JSON-XS/XS.pm#SECURITY_CONSIDERATIONS:

"Also keep in mind that JSON::XS might leak contents of your Perl data structures in its error messages, so when you serialise sensitive information you might want to make sure that exceptions thrown by JSON::XS will not end up in front of untrusted eyes."


About my patch, in case you wonder why I shifted tests in Bugzilla.pm, the reason is that if data/params.js is badly formatted, then |require Bugzilla| at line 74 of checksetup.pl triggers this code and as Bugzilla->params isn't accessible, an error is thrown. With this change, Bugzilla->params isn't triggered (yet) as checksetup.pl is whitelisted and the script can go on a bit further. It will fail later at line 114 when calling Bugzilla::DB::bz_check_requirements(). This gives checksetup.pl enough time to call update_localconfig(), which I consider an improvement, even if checksetup.pl didn't manage to complete successfully.
Assignee: administration → LpSolit
Status: NEW → ASSIGNED
Attachment #8474115 - Flags: review?(dkl)
Attachment #8474115 - Flags: review?(bugzilla.1.wurblzap)
Comment on attachment 8474115 [details] [diff] [review]
patch, v2

(In reply to Frédéric Buclin from comment #9)
> (In reply to Marc Schumann [:Wurblzap] from comment #8)
> > I really think checksetup.pl should do a conversion
> 
> I thought about that too, but there are two problems:

Ok, I understand why we need eval{}.
I'm not happy with having that $old_file code in Config.pm. This should run during checksetup.pl imho.

> Also, I think that if we want the new JS hash into data/params.js, then this
> should be done as part of this bug, not a separate one, else you would have
> to guess how the current data/params file is formatted (Data::Dumper output
> or JSON one), which would require additional code. It's much easier to
> assume that data/params = Data::Dumper, and data/params.js = JSON.

Yes, that's right; r+ anyway because it's there and can be moved in a separate bug.

(In reply to Frédéric Buclin from comment #10)
> About my patch, in case you wonder why I shifted tests in Bugzilla.pm, the

Thanks for the clarification, I was indeed wondering :)
Attachment #8474115 - Flags: review?(bugzilla.1.wurblzap) → review+
(In reply to Marc Schumann [:Wurblzap] from comment #11)
> I'm not happy with having that $old_file code in Config.pm. This should run
> during checksetup.pl imho.

That's exactly what I did. Bugzilla::Config::update_params() is only called by checkesetup.pl. It's the code which removes old parameters and generates the new ones, and which converts renamed ones. When an admin edits parameters from the UI (editparams.cgi), this code is not called at all. write_params() is called instead. Probably the name of the methods are misleading. :)
Oh, great, Plesse excuse me! So, r++, then :)
Attachment #8473779 - Attachment is obsolete: true
Attachment #8473779 - Flags: review?(dkl)
Comment on attachment 8474115 [details] [diff] [review]
patch, v2

Review of attachment 8474115 [details] [diff] [review]:
-----------------------------------------------------------------

r=dkl
Attachment #8474115 - Flags: review?(dkl) → review+
Flags: approval?
Flags: approval? → approval+
Keywords: relnote
To ssh://gitolite3@git.mozilla.org/bugzilla/bugzilla.git
   5802599..e1603d0  master -> master
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
The usual extension for a JSON file these days is .json, not .js. That makes it really clear this is a data file and not executable code in the normal sense. Is it too late to make that improvement? 

Gerv
(I don't manually renaming mine in the copy I just git pull-ed :-)

Gerv
(In reply to Gervase Markham [:gerv] from comment #16)
> The usual extension for a JSON file these days is .json, not .js. That makes
> it really clear this is a data file and not executable code in the normal
> sense. Is it too late to make that improvement? 
> 
> Gerv

Good point. As this is 5.0 only I don't see a reason why we can't go ahead rename the file in a new bug.

dkl
Blocks: 1066184
Added to relnotes for 5.0rc1.
Keywords: relnote
You need to log in before you can comment on or make changes to this bug.

Attachment

General

Created:
Updated:
Size: