Closed
Bug 405404
Opened 17 years ago
Closed 17 years ago
HTML::Scrubber throws "undef error - Wide character in subroutine entry" when a field filtered with html_light contains UTF-8 characters
Categories
(Bugzilla :: Bugzilla-General, defect)
Tracking
(Status: RESOLVED FIXED, Target Milestone: Bugzilla 3.2)
People
(Reporter: himorin, Assigned: LpSolit)
Details
(Keywords: regression)
Attachments
(1 file)
1.12 KB, patch (mkanat: review+)
Bugzilla returns a wide character error on show_bug.cgi:
'Wide character in subroutine entry at /usr/lib/perl5/site_perl/5.8.5/HTML/Scrubber.pm line 322.'
The Bugzilla configuration is as follows:
# grep utf8 data/params
'utf8' => 1,
The data was copied from the current bugzilla.mozilla.gr.jp installation (3.0, modified for -jp),
so you can see nearly the same data at http://bugzilla.mozilla.gr.jp/
* the error seems to occur in TT (Template Toolkit) when outputting UTF-8-flagged strings from MySQL
* I had the same error on enter_bug.cgi
* it could not be fixed by passing binmode => ':utf8':
my $output;
$template->process("$format->{'template'}", $vars, \$output, binmode => ':utf8')
|| ThrowTemplateError($template->error());
print $output;
* it works with utf8 = 0 in data/params
(e-mail is garbled in that case, but this might be a separate problem)
* the database was converted from a 3.0(-ja) one (with checkconfig.pl)
* I didn't try inserting '[% RAWPERL %]use utf8;[% END %]'
I have no idea how to fix this at the moment, sorry.
Comment 2 (Assignee) • 17 years ago
OK, I can reproduce the bug. All you have to do is inject UTF-8 characters (e.g. …) into a product/component/group description. As they are filtered using FILTER html_light, HTML::Scrubber is called and fails with:
undef error - Wide character in subroutine entry at /usr/lib/perl5/vendor_perl/5.8.8/HTML/Scrubber.pm line 322.
Marc, as you reviewed the patch from bug 363153, could you investigate too?
Severity: normal → major
Status: UNCONFIRMED → NEW
Depends on: bz-utf8
Ever confirmed: true
Flags: blocking3.1.3? → blocking3.1.3+
Summary: bugzilla (cvs @2007112608) doesn't work with utf8 = 1 in params → HTML::Scrubber throws "undef error - Wide character in subroutine entry" when a field filtered with html_light contains UTF-8 characters
Target Milestone: --- → Bugzilla 3.2
Updated (Assignee) • 17 years ago
Keywords: regression
Comment 3 • 17 years ago
Hmm, my example of “Insídeṛ” is from a product description, which worked well... Strange... I'm investigating.
Comment 4 • 17 years ago
I believe I must have used a product name after all, despite me thinking I checked a product description, too.
This starts working again if I force the code to take the branch guarded by this line:
if ($@ || $HTML::Parser::VERSION < 3.40) { # Package(s) not installed.
Does this mean it's a problem with certain versions of HTML::Parser?
Comment 5 (Assignee) • 17 years ago
No, this means we don't use HTML::Parser and HTML::Scrubber at all if you enter this part of the code. That's why you don't get the error anymore.
Comment 6 (Assignee) • 17 years ago
Looks like HTML::Parser->utf8_mode(1) is no longer required. Note that I didn't test this patch on an installation with utf8 = 0.
man HTML::Parser:
$p->utf8_mode( $bool )
Enable this option when parsing raw undecoded UTF-8. This tells the parser that the entities expanded for strings reported by "attr", @attr and "dtext" should be expanded as decoded UTF-8 so they end up compatible with the surrounding text.
If "utf8_mode" is enabled then it is an error to pass strings containing characters with code above 255 to the parse() method, and the parse() method will croak if you try.
So if I understand the last sentence correctly, we shouldn't use utf8_mode anymore as we can now have wide characters.
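The behaviour described in that man page excerpt can be checked in isolation with a minimal sketch like the one below. It is not the Bugzilla code or the attached patch: the helper name parses_ok and the sample string are made up for illustration, and it assumes HTML::Parser 3.40 or later (the first release with utf8_mode) is installed.

```perl
use strict;
use warnings;
use Encode qw(encode);
use HTML::Parser;

# Hypothetical helper (not part of Bugzilla): returns 1 if parse()
# succeeds with the given utf8_mode setting, 0 if it croaks.
sub parses_ok {
    my ($utf8_mode, $text) = @_;
    my $p = HTML::Parser->new(api_version => 3);
    $p->utf8_mode($utf8_mode);
    return eval { $p->parse($text); $p->eof; 1 } ? 1 : 0;
}

# A decoded (character) string containing a code point above 255.
my $wide = "Ins\x{ED}de\x{1E5B}";

# The same text as raw, undecoded UTF-8 bytes.
my $bytes = encode('UTF-8', $wide);

for my $case (
    [ 'utf8_mode(1), decoded string', 1, $wide  ],
    [ 'utf8_mode(1), raw UTF-8 bytes', 1, $bytes ],
    [ 'utf8_mode(0), decoded string', 0, $wide  ],
) {
    my ($label, $mode, $text) = @$case;
    print "$label: ", (parses_ok($mode, $text) ? 'parsed' : 'croaked'), "\n";
}
```

If the man page is accurate, only the first case should croak, which would match the situation here: with utf8 = 1 the strings reaching the filter are already decoded, so utf8_mode(1) is the wrong setting for them.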
Comment 7 (Reporter) • 17 years ago
It works well with patch v1.
Comment 8 • 17 years ago
Comment on attachment 290215 (patch, v1)
This looks OK. Do we still need to have HTML::Parser in the OPTIONAL_MODULES for checksetup?
Attachment #290215 - Flags: review?(wurblzap) → review+
Comment 9 (Assignee) • 17 years ago
Yes, as this module is still required to parse HTML tags in descriptions correctly.
Flags: approval?
Comment 10 • 17 years ago
(In reply to comment #9)
> Yes, as this module is still required to parse HTML tags in descriptions
> correctly.
Okay. Is a specific *version* of it still required?
Comment 11 (Assignee) • 17 years ago
(In reply to comment #10)
> Okay. Is a specific *version* of it still required?
I think we should still require 3.40 or better, as we are sure that version knows what to do with UTF-8 data.
Comment 12 • 17 years ago
(In reply to comment #11)
> I think we should still require 3.40 or better as we are sure it knows what to
> do with UTF8 stuff.
Okay, yes. The Changes for 3.39_90 support that decision.
Flags: approval? → approval+
Comment 13 (Assignee) • 17 years ago
Checking in Bugzilla/Util.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Util.pm,v <-- Util.pm
new revision: 1.64; previous revision: 1.63
done
Status: ASSIGNED → RESOLVED
Closed: 17 years ago
Resolution: --- → FIXED
Comment 14 (Assignee) • 17 years ago
The testcase should upgrade from a non-UTF-8 installation and check how fields using |FILTER html_light| are displayed. Make sure that utf8 = 0 in data/params (we already know the patch works fine when the param is turned on).
Flags: testcase?
Updated (Assignee) • 12 years ago
Flags: testcase?