Closed Bug 405404 Opened 18 years ago Closed 18 years ago

HTML::Scrubber throws "undef error - Wide character in subroutine entry" when a field filtered with html_light contains UTF-8 characters

Status:

RESOLVED FIXED

Milestone:

Bugzilla 3.2

People

(Reporter: himorin, Assigned: LpSolit)

Keywords:

regression

Bug Flags:

mkanat

approval

LpSolit

blocking3.1.3

Attachments

(1 file)

patch, v1 (18 years ago, Frédéric Buclin, 1.12 KB, patch; mkanat: review+)

A. Shimono [:himorin]

Reporter

Description

•

18 years ago

Bugzilla returns a wide character error when showing a bug (show_bug):

'Wide character in subroutine entry at /usr/lib/perl5/site_perl/5.8.5/HTML/Scrubber.pm line 322.'

The Bugzilla configuration is as follows:

# grep utf8 data/params
'utf8' => 1,

The data was copied from the current bugzilla.mozilla.gr.jp installation (3.0 plus modifications for -jp), so you can see nearly the same data at http://bugzilla.mozilla.gr.jp/

* The error seems to occur in TT (Template Toolkit) when outputting MySQL strings flagged as UTF-8.
* I had the same error on enter_bug.cgi.
* It could not be fixed with binmode => ':utf8':
  my $output; $template->process("$format->{'template'}", $vars, \$output, binmode => ':utf8') || ThrowTemplateError($template->error()); print $output;
* It works with utf8 = 0 in data/params. E-mail is garbled in that case, but this might be another problem.
* The database was converted from the 3.0(-ja) one (with checkconfig.pl).
* I didn't try inserting '[% RAWPERL %]use utf8;[% END %]'.

I have no idea how to fix this currently, sorry.

Frédéric Buclin

Assignee

Comment 1

•

18 years ago

Looks similar to bug 405362. Max?

Flags: blocking3.1.3?

Frédéric Buclin

Assignee

Comment 2

•

18 years ago

OK, I can reproduce the bug. All you have to do is inject UTF-8 characters (e.g. …) into a product/component/group description. As these fields are filtered using FILTER html_light, HTML::Scrubber is called and fails with:

undef error - Wide character in subroutine entry at /usr/lib/perl5/vendor_perl/5.8.8/HTML/Scrubber.pm line 322.

Marc, as you reviewed the patch from bug 363153, could you investigate too?
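For readers unfamiliar with the error message, here is a minimal sketch of what "wide character" means in Perl. It uses only core modules. The helper max_code_point is hypothetical, and utf8::downgrade() is used purely as a stand-in for any routine that insists on byte strings; the sketch does not claim that this is the exact call made inside HTML::Scrubber.

```perl
use strict;
use warnings;
use Encode qw(decode);
use List::Util qw(max);

# Hypothetical helper for illustration: largest code point in a string.
sub max_code_point {
    my ($string) = @_;
    return max(map { ord($_) } split //, $string);
}

my $bytes   = "\xE2\x80\xA6";            # U+2026 as three raw UTF-8 bytes
my $decoded = decode('UTF-8', $bytes);   # the same text as one decoded character

printf "bytes:   length %d, max code point %d\n",
    length($bytes), max_code_point($bytes);
printf "decoded: length %d, max code point %d\n",
    length($decoded), max_code_point($decoded);

# Stand-in for a byte-only routine: downgrading a string that contains
# a character above 255 dies instead of silently mangling the data.
my $copy  = $decoded;
my $ok    = eval { utf8::downgrade($copy); 1 };
my $error = $ok ? '' : $@;
print "downgrade failed: $error" unless $ok;
```

The raw byte string never contains a value above 255, while the decoded string does, which is the situation the reproduction steps above create once the utf8 parameter is turned on.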

Severity: normal → major

Status: UNCONFIRMED → NEW

Depends on: bz-utf8

Ever confirmed: true

Flags: blocking3.1.3? → blocking3.1.3+

Summary: bugzilla (cvs @2007112608) doesn't work with utf8 = 1 in params → HTML::Scrubber throws "undef error - Wide character in subroutine entry" when a field filtered with html_light contains UTF-8 characters

Target Milestone: --- → Bugzilla 3.2

Frédéric Buclin

Assignee

Updated

•

18 years ago

Keywords: regression

Marc Schumann [:Wurblzap]

Comment 3

•

18 years ago

Hmm, my example of “Insídeṛ” is from a product description, which worked well... Strange... I'm investigating.

Marc Schumann [:Wurblzap]

Comment 4

•

18 years ago

I believe I must have used a product name after all, despite me thinking I checked a product description, too. This starts working again if I force the code to use the part bracketed out by the if ($@ || $HTML::Parser::VERSION < 3.40) { # Package(s) not installed. line. Does this mean it's a problem of certain versions of HTML::Parser?

Frédéric Buclin

Assignee

Comment 5

•

18 years ago

No, this means we don't use HTML::Parser and HTML::Scrubber at all if you enter this part of the code. That's why you don't get the error anymore.
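The fallback pattern discussed in comments 4 and 5 can be sketched roughly as below. This is not the code from Bugzilla/Util.pm: the names scrubber_available and light_filter, the allowed tag list, and the escaping in the fallback branch are all assumptions made for illustration. Only the 3.40 version threshold comes from the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: true only if both optional modules load and
# HTML::Parser is at least 3.40 (the threshold quoted in comment 4).
sub scrubber_available {
    my $loaded = eval { require HTML::Scrubber; require HTML::Parser; 1 };
    return 0 unless $loaded;
    return HTML::Parser->VERSION >= 3.40 ? 1 : 0;
}

# Hypothetical filter: use HTML::Scrubber when it is usable, otherwise
# fall back to escaping everything, so neither module is touched.
sub light_filter {
    my ($text) = @_;

    if (scrubber_available()) {
        # The allowed tags are an assumption made for this sketch.
        my $scrubber = HTML::Scrubber->new(allow => [qw(b i u)]);
        return $scrubber->scrub($text);
    }

    # Fallback branch: plain escaping, no HTML parsing at all.
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print light_filter("plain text"), "\n";
```

Forcing the fallback branch skips the parser entirely, which is why the error disappears there without telling us anything about specific HTML::Parser versions.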

Frédéric Buclin

Assignee

Comment 6

•

18 years ago

Attached patch: patch, v1

Looks like HTML::Parser->utf8_mode(1) is no longer required. Note that I didn't test this patch on an installation with utf8 = 0.

man HTML::Parser:

$p->utf8_mode( $bool )
    Enable this option when parsing raw undecoded UTF-8. This tells the parser that the entities expanded for strings reported by "attr", @attr and "dtext" should be expanded as decoded UTF-8 so they end up compatible with the surrounding text.

    If "utf8_mode" is enabled then it is an error to pass strings containing characters with code above 255 to the parse() method, and the parse() method will croak if you try.

So if I understand the last sentence correctly, we shouldn't use utf8_mode anymore as we can now have wide characters.
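The behaviour described in the man page excerpt can be checked with a small script along these lines. It assumes HTML::Parser is installed with Unicode support. The helper parse_error is hypothetical, and the sample string is the "Insídeṛ" example from comment 3 written as already-decoded characters.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: parse $text with utf8_mode on or off and return
# the error message, or an empty string when parsing succeeds.
sub parse_error {
    my ($text, $utf8_mode) = @_;
    my $parser = HTML::Parser->new(api_version => 3);
    $parser->utf8_mode($utf8_mode);
    my $ok = eval { $parser->parse($text); $parser->eof; 1 };
    return $ok ? '' : $@;
}

# Decoded characters, one of them (U+1E5B) above code point 255.
my $decoded = "Ins\x{ED}de\x{1E5B}";

print "utf8_mode off: ", (parse_error($decoded, 0) || "ok\n");
print "utf8_mode on:  ", (parse_error($decoded, 1) || "ok\n");
```

With utf8_mode switched off, the decoded string is accepted. With it switched on, parse() croaks, which matches the last sentence of the excerpt.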

Assignee: general → LpSolit

Status: NEW → ASSIGNED

Attachment #290215 - Flags: review?(wurblzap)

A. Shimono [:himorin]

Reporter

Comment 7

•

18 years ago

worked well with patch v1.

Max Kanat-Alexander

Comment 8

•

18 years ago

Comment on attachment 290215 (patch, v1)

This looks OK. Do we still need to have HTML::Parser in the OPTIONAL_MODULES for checksetup?

Attachment #290215 - Flags: review?(wurblzap) → review+

Frédéric Buclin

Assignee

Comment 9

•

18 years ago

Yes, as this module is still required to parse HTML tags in descriptions correctly.

Flags: approval?

Max Kanat-Alexander

Comment 10

•

18 years ago

(In reply to comment #9)
> Yes, as this module is still required to parse HTML tags in descriptions
> correctly.

Okay. Is a specific *version* of it still required?

Frédéric Buclin

Assignee

Comment 11

•

18 years ago

(In reply to comment #10)
> Okay. Is a specific *version* of it still required?

I think we should still require 3.40 or better as we are sure it knows what to do with UTF8 stuff.

Max Kanat-Alexander

Comment 12

•

18 years ago

(In reply to comment #11)
> I think we should still require 3.40 or better as we are sure it knows what to
> do with UTF8 stuff.

Okay, yes. The Changes for 3.39_90 support that decision.

Flags: approval? → approval+

Frédéric Buclin

Assignee

Comment 13

•

18 years ago

Checking in Bugzilla/Util.pm;
/cvsroot/mozilla/webtools/bugzilla/Bugzilla/Util.pm,v  <--  Util.pm
new revision: 1.64; previous revision: 1.63
done

Status: ASSIGNED → RESOLVED

Closed: 18 years ago

Resolution: --- → FIXED

Frédéric Buclin

Assignee

Comment 14

•

18 years ago

In the testcase, we should upgrade from a non-utf8 installation and see how fields using |FILTER html_light| are displayed. Be sure that utf8 = 0 in data/params (we already know the patch works fine when the param is turned on).

Flags: testcase?

Frédéric Buclin

Assignee

Updated

•

13 years ago

Flags: testcase?
