Closed
Bug 245384
Opened 21 years ago
Closed 18 years ago
[ps] Unicode printing improvements
Categories
(Core :: Printing: Output, defect)
Tracking
()
RESOLVED
WONTFIX
People
(Reporter: kherron+mozilla, Assigned: kherron+mozilla)
Details
(Keywords: memory-footprint)
Attachments
(2 files)
195.71 KB, patch (tor: review+)
43.82 KB, patch
The PS printing module prints PRUnichar (16-bit) strings by writing the string
to the postscript output and calling a custom procedure called "unicodeshow".
unicodeshow processes the string one byte pair at a time and uses a variety of
rules to find a glyph matching the character code; one of the rules is to
consult a hash table mapping unicode values to glyph names which is embedded
into each print job.
This process needs to be taken out and shot, but until that day comes, it could
be made more efficient. Some issues that can be addressed are:
1) The hash table is hardcoded into mozilla as a big string defining 1051
entries, which is copied verbatim into every print job. This wastes space both
inside mozilla and in each print job. It also precludes doing anything more
intelligent with the contents of the hash.
2) The string is written to the postscript output using a four-character octal
escape for every byte. This is wasteful for latin1 text and makes it harder to
read the postscript source.
3) The unicodeshow procedure performs a lot of processing for each character.
Mozilla should generate more friendly postscript.
Comment 1•21 years ago (Assignee)
This patch introduces three things:
1) An enumerated type for the postscript glyph names built into mozilla
2) A compact data buffer storing the text of these glyph names
3) An array mapping unicode values to glyph names for the unicodeshow fallback
behavior.
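As a rough illustration of how these three pieces could fit together, here is a minimal sketch. It is not the code from the attached patch: the identifiers (GlyphId, kGlyphNames, kGlyphNameOffset, kUnicodeToGlyph, LookupGlyphName) are hypothetical, and only three glyphs are shown where the real table has over a thousand entries.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <iterator>

// 1) Enumerated type for the built-in glyph names (hypothetical subset).
enum GlyphId : std::uint16_t { kGlyph_Alpha, kGlyph_beta, kGlyph_Euro, kGlyphCount };

// 2) Compact buffer: every name stored once, NUL-separated, plus an offset
//    per enum value. Layout is an assumption made for this sketch.
static const char kGlyphNames[] = "Alpha\0beta\0Euro";
static const std::uint16_t kGlyphNameOffset[kGlyphCount] = {0, 6, 11};

// 3) Array mapping unicode values to glyphs, sorted by code for binary search.
struct UnicodeGlyph {
  std::uint16_t code;
  GlyphId glyph;
};
static const UnicodeGlyph kUnicodeToGlyph[] = {
    {0x0391, kGlyph_Alpha},
    {0x03B2, kGlyph_beta},
    {0x20AC, kGlyph_Euro},
};

// Return the glyph name text for an enum value.
const char* GlyphName(GlyphId id) {
  assert(id < kGlyphCount);
  return kGlyphNames + kGlyphNameOffset[id];
}

// Return the glyph name for a unicode value, or nullptr if none is known.
const char* LookupGlyphName(std::uint16_t code) {
  const UnicodeGlyph* end = std::end(kUnicodeToGlyph);
  const UnicodeGlyph* it = std::lower_bound(
      std::begin(kUnicodeToGlyph), end, code,
      [](const UnicodeGlyph& e, std::uint16_t c) { return e.code < c; });
  return (it != end && it->code == code) ? GlyphName(it->glyph) : nullptr;
}
```

Storing enum values rather than string pointers keeps each table entry small, and the same enum can be reused to describe an encoding vector.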
When generating postscript to print a PRUnichar string, mozilla will identify
runs of latin1 characters and print those using an ordinary postscript "show"
operation. In this case, most characters just represent themselves, which
reduces the size of the print job by seven bytes per character.
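The run splitting described here might look roughly like the following sketch. The function names are hypothetical, and the byte order used for the unicodeshow operand (high byte first) is an assumption made for illustration, not something taken from the patch. Under the old scheme every 16-bit character costs two four-character octal escapes, or eight bytes; a printable latin1 character written directly costs one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <string>

// Append one byte as a four-character PostScript octal escape (\ooo).
static void AppendOctal(std::string& out, unsigned byte) {
  assert(byte <= 0xFF);
  char buf[5];
  std::snprintf(buf, sizeof buf, "\\%03o", byte);
  out += buf;
}

// Append a latin1 character, escaping only what a PostScript string requires.
static void AppendLatin1Char(std::string& out, char16_t c) {
  if (c == '(' || c == ')' || c == '\\') {
    out += '\\';
    out += static_cast<char>(c);
  } else if (c >= 0x20 && c < 0x7F) {
    out += static_cast<char>(c);  // printable ASCII represents itself
  } else {
    AppendOctal(out, c);  // control characters and 0x80-0xFF
  }
}

// Split text into latin1 and non-latin1 runs; emit "show" for the former
// and "unicodeshow" for the latter.
std::string EmitText(const std::u16string& text) {
  std::string ps;
  std::size_t i = 0;
  while (i < text.size()) {
    const bool latin1 = text[i] <= 0xFF;
    ps += '(';
    for (; i < text.size() && (text[i] <= 0xFF) == latin1; ++i) {
      if (latin1) {
        AppendLatin1Char(ps, text[i]);
      } else {
        AppendOctal(ps, text[i] >> 8);    // high byte first (assumed order)
        AppendOctal(ps, text[i] & 0xFF);  // then low byte
      }
    }
    ps += latin1 ? ") show\n" : ") unicodeshow\n";
  }
  return ps;
}
```

For example, `EmitText(u"Hi \u20AC")` produces a `(Hi ) show` line followed by a `(\040\254) unicodeshow` line.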
For characters outside the latin1 character set, mozilla will generate a
"unicodeshow" procedure call as usual. It will also track these characters in a
character code map; at the end of the print job, it will use the new data
tables to generate a unicode->glyph hash table customized to the current print
job. For print jobs containing only latin1 text, mozilla leaves out the hash
table and unicodeshow postscript logic entirely.
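A sketch of the per-job bookkeeping, under stated assumptions: the class name JobGlyphMap, the PostScript dictionary name UniDict, and the emitted syntax are all hypothetical and chosen for illustration; the patch's actual output format may differ. DemoLookup stands in for the built-in unicode-to-glyph table.

```cpp
#include <cassert>
#include <cstdio>
#include <set>
#include <string>

// Hypothetical stand-in for the built-in unicode->glyph name table.
static const char* DemoLookup(char16_t c) {
  switch (c) {
    case 0x0391: return "Alpha";
    case 0x20AC: return "Euro";
    default:     return nullptr;
  }
}

// Records which non-latin1 characters a print job actually used.
class JobGlyphMap {
 public:
  void Note(char16_t c) {
    if (c > 0xFF) used_.insert(c);  // latin1 goes through plain "show"
  }

  // Emit a dictionary holding only the characters seen in this job.
  // Returns an empty string for latin1-only jobs, so nothing is written.
  std::string EmitDict(const char* (*lookup)(char16_t)) const {
    assert(lookup != nullptr);
    if (used_.empty()) return std::string();
    std::string ps = "/UniDict " + std::to_string(used_.size()) +
                     " dict def\nUniDict begin\n";
    for (char16_t c : used_) {
      const char* name = lookup(c);
      if (!name) continue;  // no glyph name known for this character
      char key[16];
      std::snprintf(key, sizeof key, "16#%04X", static_cast<unsigned>(c));
      ps += std::string(key) + " /" + name + " def\n";
    }
    ps += "end\n";
    return ps;
  }

 private:
  std::set<char16_t> used_;  // ordered, so the output is deterministic
};
```

Because the set only ever holds characters that were printed, a job containing two distinct non-latin1 characters carries two dictionary entries instead of the full 1051.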
Mozilla also embeds a latin1 encoding vector into every print job. I've
reworked that to be defined using the enum list instead of storing the actual
character strings.
With my linux debug builds, this patch reduces code size by about 9.5k of text
and 1k of data. Printing the mozilla start page, which is all latin1, yields a
print job about 45k smaller. Documents with international text will see less
improvement, and larger documents will see more improvement.
Comment 2•21 years ago (Assignee)
nsPostScriptObj.cpp contains large blocks of lines which were just shifted
right. Here's a diff of that file with -b added to the diff flags.
Updated•21 years ago (Assignee)
Attachment #149858 - Flags: review?(tor)
Comment on attachment 149858
Glyph name rework
I'm a bit uneasy about the large tables produced by an unreleased tool,
as it somewhat limits our options if the mapping needs to be modified
in the future. I'd like to see it polished as needed and checked into
the tree somewhere.
Other than that, r=tor.
Attachment #149858 - Flags: review?(tor) → review+
Comment 4•18 years ago (Assignee)
Closing this. The code in question is obsolete on the trunk, and only major bug fixes are being taken on the branches.
Status: NEW → RESOLVED
Closed: 18 years ago
Resolution: --- → WONTFIX