UTF-8 characters are incorrectly displayed in New Charts and graphical reports

RESOLVED FIXED in Bugzilla 5.0

Status

defect
RESOLVED FIXED
14 years ago
3 years ago

People

(Reporter: roman, Assigned: LpSolit)

Tracking

({intl})

unspecified
Bugzilla 5.0
Bug Flags:
approval +

Attachments

(1 attachment, 2 obsolete attachments)

(Reporter)

Description

14 years ago
User-Agent:       Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.7b) Gecko/20040421 MultiZilla/1.6.3.1d
Build Identifier: 

At http://tinyurl.com/4jyql please look at Spider product - UTF-8 extended
characters are broken. 

Reproducible: Always

Steps to Reproduce:
1. generate any graphic report which contains UTF-8 chars (on the picture)
Actual Results:  
UTF-8 chars are broken.

Expected Results:  
UTF-8 chars should be displayed correctly.

According to http://www.boutell.com/gd/manual2.0.33.html - 'The string may
contain UTF-8 sequences like: "À"', but I don't know if it's useful in
this case.
(Reporter)

Comment 1

14 years ago
If you are going to switch to UTF-8 in 2.20 (bug 126266), it would be nice to
display all UTF-8 characters correctly
Flags: blocking2.20?
bug 126266 which is a prerequisite for this has already missed the boat for
2.20, pushing this back accordingly.

Confirming, I can reproduce this on landfill even with all the UTF8 stuff applied. 
Status: UNCONFIRMED → NEW
Ever confirmed: true
Flags: blocking2.20? → blocking2.20-
Target Milestone: --- → Bugzilla 2.22
Posted patch: Patch (obsolete)
This patch goes the way suggested in comment 0. We need at least gdlib 2.0.26 for this, but I don't know how to enforce this. The patch doesn't work for me, btw -- either I have a too-low version of gdlib (how can I check?), or there is some other error left.
Assignee: gerv → wurblzap
Status: NEW → ASSIGNED
Attachment #205847 - Flags: review?
Note to self: the entity mangling should be conditional on the utf8 parameter. Fix for next patch or checkin.
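
As a rough illustration of the note above, the sketch below converts non-ASCII characters to numeric character references only when a flag is set. It is written in Python rather than Bugzilla's Perl, and `mangle_entities` and `utf8_enabled` are hypothetical names that do not come from the patch.

```python
def mangle_entities(text: str, utf8_enabled: bool) -> str:
    """Replace non-ASCII characters with decimal references such as &#192;.

    Hypothetical helper: `utf8_enabled` stands in for Bugzilla's utf8
    parameter; when it is off, the text is returned unchanged.
    """
    if not utf8_enabled:
        return text
    # "xmlcharrefreplace" turns each non-ASCII character into &#NNN;
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")


print(mangle_entities("\u00c0 la carte", utf8_enabled=True))
```

Whether the rendering library actually interprets such references is a separate question, which the following comments go on to test.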
Comment on attachment 205847 [details] [diff] [review]
Patch

This doesn't fix the problem with gd 2.0.28 (version can be shown with gdlib-config --all) and GD module v2.30. Looks like the values in the resulting data hash are not quoted.

>Index: report.cgi
>===================================================================
>+foreach (@col_names) {
>+    $_ = css_class_quote($_);
>+}

Using css_class_quote seems misleading as we are not quoting CSS classes. Or are we? Maybe filters should be added to the template(s) instead?

> $vars->{'col_names'} = \@col_names;
> $vars->{'row_names'} = \@row_names;
> $vars->{'tbl_names'} = \@tbl_names;

Shouldn't row_names and possibly others be UTF encoded too?
Attachment #205847 - Flags: review? → review-
I can't seem to make it work. By now, I suspect that the sentence about entity sequences refers to gdlib's gdImageStringFT function only.

Bailing.
Assignee: wurblzap → gerv
Status: ASSIGNED → NEW
http://tinyurl.com/4jyql shows a double conversion: some process is assuming that the data is in either ISO-8859-2 or ISO-8859-4 and converting it from that to UTF-8.
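
The double conversion described above can be reproduced in a few lines. This is only a sketch of the mechanism in Python, not a diagnosis of which component in Bugzilla performs the conversion; `misread_utf8` is a hypothetical helper name.

```python
def misread_utf8(text: str, wrong_charset: str) -> str:
    """Decode the UTF-8 bytes of `text` as if they were `wrong_charset`."""
    return text.encode("utf-8").decode(wrong_charset)


original = "\u00e9"  # e with acute accent, two bytes (C3 A9) in UTF-8
for charset in ("iso-8859-2", "iso-8859-4"):
    garbled = misread_utf8(original, charset)
    # One character turns into two; reversing the steps recovers the input.
    restored = garbled.encode(charset).decode("utf-8")
    print(charset, len(garbled), restored == original)
```

Each non-ASCII character comes out as two unrelated characters, which is the typical symptom of this kind of charset mix-up.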
Duplicate of this bug: 388520

Comment 9

12 years ago
So there is still no solution?
I think you should specify a font that contains glyphs for the data,
like

  graph.set_x_label_font(Param('graphfontname'), 9);
  graph.set_x_axis_font(Param('graphfontname'), 9);
  graph.set_title_font(Param('graphfontname'), 9);
  graph.set_legend_font(Param('graphfontname'), 9);

in template/*/default/reports/report-*.png.tmpl.
I think you can then use UTF-8 data without converting it to &#xx; entities.

# But I heard that i18n is not a current target of Bugzilla?

Updated

11 years ago
Duplicate of this bug: 364505

Comment 12

11 years ago
Confirmed for 3.0+ per bug 364505, kindly add 'intl' keyword.

IMHO correct solution is offered by himorin.  But there are different ideas about font value:

1. Multiple font specification is supported by set_xxx_font.  If we manage to come up with a reasonable cross-platform default (just like <font face="Verdana,Arial"> in HTML and CSS), we can put it into the default templates.  See http://search.cpan.org/dist/GDGraph/Graph.pm#FONTS

2. If correct fonts are not installed at all on some platforms and/or locales, and there are good public domain fonts around -- we can document it right after  GD installation instructions, and still hardcode the reference into default templates.

3. We can use a parameter (also a backport from Bugzilla-ja).

4. If we distinguish between server administrators and Bugzilla instance maintainers (Bug 364505 comment #0):
> Why localconfig and not data/params:

> - Font file paths are OS dependent
> - Local files may be not accessible by maintainer at all
(Assignee)

Comment 13

11 years ago
The 2.22 branch is restricted to security bugs -> 3.2 (unless you can attach a non-invasive patch for 3.0 before it becomes restricted to security bugs too, in which case I will retarget the bug to 3.0).
Target Milestone: Bugzilla 2.22 → Bugzilla 3.2
Duplicate of this bug: 417639
(Assignee)

Updated

11 years ago
Assignee: gerv → charting
Depends on: 427961
(Assignee)

Updated

10 years ago
Duplicate of this bug: 480089

Comment 16

10 years ago
I found a way to address the problem in GD::Graph

* define FONT_PATH in apache config: (for example, on my Ubuntu machine)

| FONT_PATH /usr/share/fonts/truetype/msttcorefonts

* in all the "template/.../default/reports/report-*.png.tmpl" define the
fonts:

|  graph.set_title_font(['verdana', 'arial'], 8);
|  graph.set_x_label_font(['verdana', 'arial'], 8);
|  graph.set_y_label_font(['verdana', 'arial'], 8);
|  graph.set_x_axis_font(['verdana', 'arial'], 8);
|  graph.set_y_axis_font(['verdana', 'arial'], 8);
|  graph.set_y_values_font(['verdana', 'arial'], 8);
|  graph.set_legend_font(['verdana', 'arial'], 8);

GD::Text will then use the TrueType font if available and will work with
UTF8.

P.S. I don't really like the idea of defining the FONT_PATH in the
apache config. Would it be a good idea to add an option in the parameters?
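
The idea of a search path plus an ordered list of font names can be sketched as follows. This is a Python approximation for illustration only; the exact lookup rules of GD::Text may differ, and `find_font` is a hypothetical helper.

```python
import os
from typing import Iterable, Optional


def find_font(names: Iterable[str], font_path: str) -> Optional[str]:
    """Return the first existing <name>.ttf, trying names in order of preference.

    `font_path` is a list of directories joined by os.pathsep, similar in
    spirit to a FONT_PATH setting (an assumption, not GD::Text's exact logic).
    """
    directories = [d for d in font_path.split(os.pathsep) if d]
    for name in names:
        for directory in directories:
            candidate = os.path.join(directory, name + ".ttf")
            if os.path.isfile(candidate):
                return candidate
    return None


# Directory taken from the comment above; the result depends on the machine.
print(find_font(["verdana", "arial"], "/usr/share/fonts/truetype/msttcorefonts"))
```

Keeping the names generic and the directory configurable is what lets the templates stay independent of the operating system.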
(In reply to comment #16)
> I found a way to address the problem in GD::Graph
> 
> * define FONT_PATH in apache config: (for example, on my Ubuntu machine)

You can specify fonts with the full path in templates.

I think we have another bug with patches for testing. Should this be marked as a duplicate of it, or did we split the problem?

Comment 18

10 years ago
(In reply to comment #17)
> > * define FONT_PATH in apache config: (for example, on my Ubuntu machine)
> You can specify fonts with the full path in templates.

Again, separating font names from paths is good and sometimes necessary -- when server administrators and Bugzilla maintainers are not the same people (Bug 364505 comment #0):
| Why localconfig and not data/params:

| - Font file paths are OS dependent
| - Local files may be not accessible by maintainer at all

Generic font names would work for many instances, keeping templates distribution-agnostic.

Not sure whether httpd.conf is more convenient than any other place.

Comment 19

10 years ago
> (In reply to comment #17)
> I think we have another bug with testing patches. dupme? or did we divided the
> problem?

Bug 427961 you mean?  See also bug 287684.
(Assignee)

Comment 20

10 years ago
Bugzilla 3.2 is restricted to security bugs only. Mass-retargeting to 3.6.
Target Milestone: Bugzilla 3.2 → Bugzilla 3.6

Updated

9 years ago
Duplicate of this bug: 564629
(In reply to comment #21)
> *** Bug 564629 has been marked as a duplicate of this bug. ***

Wow! This bug has been waiting five years to be resolved!

Updated

9 years ago
Flags: blocking4.0?
(Assignee)

Comment 23

9 years ago
Not a blocker. This bug has existed for a long time.
Flags: blocking4.0? → blocking4.0-
(Assignee)

Comment 24

8 years ago
Bugzilla 3.6 is now restricted to security fixes only, and this bug got no traction for several months. We will retarget this bug once it has a patch ready for checkin.
Target Milestone: Bugzilla 3.6 → ---
(Assignee)

Updated

6 years ago
No longer depends on: 427961
Duplicate of this bug: 427961
(Assignee)

Comment 26

6 years ago
Unifont is the most complete and free font I know. We should point to it by default.
(Assignee)

Comment 27

6 years ago
http://unifoundry.com/unifont.html just released unifont-6.3.20131006.ttf, which can be installed on all machines. It has 100% coverage of the Unicode 6.3 Basic Multilingual Plane. That's all we need to fix this bug. This file is pretty big (14 MB), so it cannot be included in the Bugzilla tarball. But it's not unreasonable to ask admins to install this file on their system. Then we can let Bugzilla look for it (/usr/share/fonts/TTF/ on Linux, C:\Windows\fonts on Windows).
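
A per-platform lookup along these lines might look like the sketch below. It is written in Python for illustration; only the two directories come from the comment above, while `DEFAULT_FONT_DIRS`, `default_font_dir`, and `locate_font` are hypothetical names.

```python
import os
import sys
from typing import Optional

# Directories named in the comment above; other systems would need their own.
DEFAULT_FONT_DIRS = {
    "linux": "/usr/share/fonts/TTF/",
    "win32": "C:\\Windows\\fonts",
}


def default_font_dir(platform: str = sys.platform) -> Optional[str]:
    """Return the assumed font directory for `platform`, or None if unknown."""
    for prefix, directory in DEFAULT_FONT_DIRS.items():
        if platform.startswith(prefix):
            return directory
    return None


def locate_font(filename: str, platform: str = sys.platform) -> Optional[str]:
    """Return the full path to `filename` if it exists in the default directory."""
    directory = default_font_dir(platform)
    if directory is None:
        return None
    candidate = os.path.join(directory, filename)
    return candidate if os.path.isfile(candidate) else None


print(locate_font("unifont-6.3.20131006.ttf"))
```

A hard-coded table like this breaks as soon as a distribution uses a different font directory, which is one argument for making the location configurable.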
(Assignee)

Comment 28

6 years ago
Posted patch: patch, v1 (obsolete)
I finally added a parameter as suggested in bug 427961. This will let admins specify another path to the font (e.g. the bugzilla/ directory if installed locally) or another font if they really want to, such as the proprietary Arial Unicode font included in Microsoft Office.
Assignee: charting → LpSolit
Attachment #205847 - Attachment is obsolete: true
Status: NEW → ASSIGNED
Attachment #816836 - Flags: review?(dkl)
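
A parameter that holds a font path usually needs some validation. The sketch below shows one plausible set of checks in Python; it is not the code from the attached patch, and `validate_font_file` is a hypothetical name.

```python
import os


def validate_font_file(path: str) -> str:
    """Return an error message for a bad font setting, or "" if it is acceptable.

    Hypothetical validator. An empty value is accepted so that admins can
    leave the parameter unset.
    """
    path = path.strip()
    if not path:
        return ""
    if not path.lower().endswith(".ttf"):
        return "The font file must be a TrueType file with a .ttf extension"
    if not os.path.isfile(path):
        return f"The file '{path}' cannot be found"
    return ""


print(validate_font_file("/usr/share/fonts/TTF/unifont-6.3.20131006.ttf") or "OK")
```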
(Assignee)

Comment 29

6 years ago
Posted patch: patch, v1.1
Oops, forgot to reword a sentence in the documentation.
Attachment #816836 - Attachment is obsolete: true
Attachment #816836 - Flags: review?(dkl)
Attachment #816841 - Flags: review?(dkl)
(Assignee)

Comment 30

6 years ago
Comment on attachment 816841 [details] [diff] [review]
patch, v1.1

Marc: maybe you are interested in reviewing this patch, as German is affected by this problem?
Attachment #816841 - Flags: review?(bugzilla.1.wurblzap)
Yup, ok.
I tried the patch, and it didn't help at all, with the parameter set to a downloaded Unifont file. What might I be doing wrong?

Non-ASCII characters are being displayed as two seemingly unrelated characters, before and now. Are you definite that this is not a character set encoding issue?
(Assignee)

Comment 33

6 years ago
(In reply to Marc Schumann [:Wurblzap] from comment #32)
> I tried the patch, and it didn't help at all, with the parameter set to a
> downloaded Unifont file. What might I be doing wrong?

What is font_file set to? Which OS?


> Non-ASCII characters are being displayed as two seemingly unrelated
> characters, before and now. Are you definite that this is not a character
> set encoding issue?

Without the patch, non-ASCII characters are unreadable. With the patch applied and the parameter above set to point to the .ttf file, all characters are displayed correctly (even Cyrillic, Chinese, and accented characters).
In practice, no single font can be truly universal. For example, for the CJK Unified Ideographs, a font can contain either the shapes most appropriate for Chinese or those most appropriate for Japanese, but not both.

It would be better to be able to either specify for each Unicode range which font is preferred, or to be able to list several fonts to use in preferential order.
The second option might well be the simplest: Chinese, Japanese, and Korean users could first list their favorite national font, and then use Unifont as a backup for characters missing from it.
(Assignee)

Comment 35

6 years ago
(In reply to Jean-Marc Desperrier from comment #34)
> In practice, no single font can be truly universal.

As I said, unifont has full support for the BMP in Unicode 6.3. This should be sufficient for most cases as MySQL utf8 encoding is unable to support characters outside BMP anyway.


> It would be better to be able to either specify for each Unicode range which
> font is preferred, or to be able to list several fonts to use in preferential
> order.

You cannot do that. It's not possible to use several fonts with libgd. So if you have characters from a wide range of Unicode code points, you have to select only one font anyway. It doesn't make sense to ask admins to specify fonts per Unicode range.
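
The point about the BMP made earlier in this comment can be checked directly. The Python sketch below tests whether a string stays within the Basic Multilingual Plane, and shows that BMP characters need at most three bytes in UTF-8, which is the limit of MySQL's `utf8` character set. The helper names are hypothetical.

```python
def in_bmp(text: str) -> bool:
    """True if every character is in the Basic Multilingual Plane (U+0000..U+FFFF)."""
    return all(ord(ch) <= 0xFFFF for ch in text)


def max_utf8_bytes(text: str) -> int:
    """Longest UTF-8 encoding of any single character in `text` (0 if empty)."""
    return max((len(ch.encode("utf-8")) for ch in text), default=0)


samples = {"Cyrillic": "\u0416", "CJK": "\u4e2d", "emoji": "\U0001F600"}
for label, ch in samples.items():
    print(label, in_bmp(ch), max_utf8_bytes(ch))
```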
Comment on attachment 816841 [details] [diff] [review]
patch, v1.1

OK, I know why I saw what I saw during review: I checked the Old Charts (reports.cgi). These are still broken. But the patch fixes the issue for me in New Charts (chart.cgi), so r=Wurblzap provided you change the bug title to mention that you fix New Charts only. Is there a corresponding bug for Old Charts? If not, can you please file one?
Attachment #816841 - Flags: review?(bugzilla.1.wurblzap) → review+
(Assignee)

Comment 37

6 years ago
(In reply to Marc Schumann [:Wurblzap] from comment #36)
> Is there a corresponding bug for Old Charts? If not, can you please file one?

I didn't find one for Old Charts. Per http://search.cpan.org/~chartgrp/Chart/Chart.pod#TO_DO, the Chart module doesn't support TrueType fonts, only GD fonts, which do not support UTF-8 characters. So I doubt we can do anything about Old Charts if we still use Chart. And per bug 232113, I doubt a rewrite of the Old Charts code is going to happen. If you want a bug for it, feel free to file it. :)

Thanks for the review!
Flags: blocking4.0-
Flags: blocking2.20-
Flags: approval?
Keywords: relnote
Summary: UTF-8 chars incorrectly displayed on the graphic reports → UTF-8 characters are incorrectly displayed in New Charts and graphical reports
Target Milestone: --- → Bugzilla 5.0
(Assignee)

Updated

6 years ago
Attachment #816841 - Flags: review?(dkl)

Updated

6 years ago
Flags: approval? → approval+
(Assignee)

Comment 38

6 years ago
Committing to: bzr+ssh://lpsolit%40gmail.com@bzr.mozilla.org/bugzilla/trunk/
modified Bugzilla/Config/Common.pm
modified Bugzilla/Config/DependencyGraph.pm
modified docs/en/xml/administration.xml
modified template/en/default/admin/params/dependencygraph.html.tmpl
modified template/en/default/reports/chart.png.tmpl
modified template/en/default/reports/report-bar.png.tmpl
modified template/en/default/reports/report-line.png.tmpl
modified template/en/default/reports/report-pie.png.tmpl
Committed revision 8806.
Status: ASSIGNED → RESOLVED
Last Resolved: 6 years ago
Resolution: --- → FIXED
(Assignee)

Comment 39

4 years ago
Added to relnotes for 5.0rc1.
Keywords: relnote
(Assignee)

Updated

3 years ago
Duplicate of this bug: 863222