Closed Bug 75260 Opened 24 years ago Closed 23 years ago

Compress 8Kbyte Font Glyph Maps

Categories

(Core :: Layout, defect)

x86
Windows 95
defect
Not set
normal

Tracking


RESOLVED INVALID
Future

People

(Reporter: rbs, Assigned: shanjian)

Details

(Keywords: fonts)

Attachments

(1 file)

[spin-off of bug 74494 and bug 29358] For each font, the GFX font sub-system uses a map to record which Unicode characters can be represented with that font. The set of maps is kept in a hashtable. Each map is 8K. Since different fonts can share the same map, a map can be re-used. Moreover, maps are style-invariant, so fonts that differ only stylistically have de facto the same map. The problem is that there is a cost in determining whether a map can be shared. This bug is a request to investigate that cost further, with the aim of possibly enabling the sharing.
-> Roy
Assignee: karnaze → yokoyama
On Unix these 8K maps are shared (see: nsFontCharSetInfo) per font registry/encoding type.
I think we could save some run-time memory by making the maps a two-level lookup and sharing the second-level maps where identical. The most likely second-level map to share is an "empty" map (no chars). One way to do this is to make each map an array of 256 pointers, each pointing to a 256-bit (32 byte) sub map. The pointer array would add 4 * 256 bytes = 1K, so if more than 1/8 of the 64K range is empty there would be a net memory saving. As I recollect, Unicode has about 40,000 assigned chars, so even a fully populated map would still save a little memory. For maps such as ISO8859-* the saving would be in the range of 7K per map. There would be a slight run-time cost for the single level of indexing. No "if" statements should be required, as every table would have the full 256 pointers and every pointer would point to a valid (possibly shared) sub map.
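(Editorial sketch, not from the original thread: a minimal C rendering of the two-level layout just described, assuming 32-byte sub maps. TwoLevelMap, gEmptySubMap and HasGlyph are made-up names, not the actual GFX code.)

  /* One shared, all-zeroes sub map: 256 bits = 32 bytes. */
  static const unsigned char gEmptySubMap[32];  /* zero-initialized */

  typedef struct {
    /* 256 pointers, one per 256-char block of 0x0000-0xFFFF */
    const unsigned char* blocks[256];
  } TwoLevelMap;

  /* Every pointer is valid (empty blocks point at gEmptySubMap),
     so no null check is needed, as suggested above. */
  static int HasGlyph(const TwoLevelMap* map, unsigned int aChar)
  {
    const unsigned char* sub = map->blocks[(aChar >> 8) & 0xFF];
    return (sub[(aChar >> 3) & 0x1F] >> (aChar & 7)) & 1;
  }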
So your idea is that it is more likely to see sharing of a sub-part than sharing of the whole. Since many people read bugzilla, let me formalize things a little so that others who might have comments can see exactly what you mean. bstell's idea is this:

1) Picture the Unicode table 0x0000-0xFFFF as a long _bitwise_ vector where the n-th bit is set if the n-th Unicode character exists in the current font under consideration; i.e., there are 'holes' in this vector because the font doesn't cover the entire Unicode set. The size of this long bitwise vector is 64 Kbits, i.e., 8 KBytes. (BTW, this is the 'font map' that is currently used in GFX and is the subject of this bug.)

2) Conceptually divide this long bitwise vector into 'segments' of 256 bits. The number of such segments is therefore 64 Kbits / 256 bits = 256 segments.

3) Consider an auxiliary vector of length 256 such that the i-th element of this vector points to the i-th segment in the bitwise vector if the font has characters in that segment, or is null otherwise.

4) Now, the sharing. (This is where the subtlety lies, and where it differs from the existing scheme.) Consider another new font for which some of the Unicode 'segments' match those of the previous font. For these matches, make the new font's auxiliary pointers point to the existing segments of the previous font. For the segments that don't match (i.e., the 'holes' are not the same), allocate new space and set the remaining auxiliary pointers to point to these newly allocated segments.

5) Now that you've got it, re-read all these steps with 1) replaced by just the initial segments needed for the first font -- not necessarily the _long_ vector corresponding to the entire Unicode set; i.e., trim/collapse all the 'big holes' to start with.

The question is: would there be more of a win with this scheme in terms of space/speed?
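(Another editorial sketch, not the shipped code: step 4's segment sharing, assuming 32-byte (256-bit) segments and a simple linear pool of previously allocated segments. ShareSegment and the pool are hypothetical names.)

  #include <stdlib.h>
  #include <string.h>

  /* Return a pooled segment identical to `segment` if one exists;
     otherwise copy `segment` into the pool.  Callers store the result
     in the new font's auxiliary vector.  (Error handling elided.) */
  static const unsigned char* ShareSegment(const unsigned char segment[32],
                                           const unsigned char* pool[],
                                           int* poolCount)
  {
    int i;
    for (i = 0; i < *poolCount; i++) {
      if (memcmp(pool[i], segment, 32) == 0)
        return pool[i];                 /* the 'holes' match: share it */
    }
    /* no match: allocate fresh space for this segment (step 4) */
    {
      unsigned char* copy = (unsigned char*) malloc(32);
      memcpy(copy, segment, 32);
      pool[(*poolCount)++] = copy;
      return copy;
    }
  }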
One interesting advantage that I can see is that this scheme easily extends to plane 1 characters. If the 'null' entries in the auxiliary vector (i.e., the entries that point to the 'empty' submap) are trimmed, and the vector is turned into a hash keyed on the higher bytes of the code point (the 0xNNNN00 part), then it becomes possible to support plane 0, plane 1, or higher. Unlike the current scheme, this extended scheme wouldn't be hampered by the fixed 64 Kbit array limit.
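(Illustrative only: one shape the hash-keyed variant could take in C, here as a small linear-probing table keyed on aChar >> 8. SegEntry, HasGlyphHashed and the table size are assumptions, not the GFX design.)

  #define SEG_TABLE_SIZE 512            /* power of two, assumed */

  typedef struct {
    unsigned int key;                   /* segment number = aChar >> 8 */
    const unsigned char* bits;          /* 32-byte segment, 0 if slot free */
  } SegEntry;

  /* Works for any plane: aChar may exceed 0xFFFF. */
  static int HasGlyphHashed(const SegEntry table[SEG_TABLE_SIZE],
                            unsigned int aChar)
  {
    unsigned int seg = aChar >> 8;
    unsigned int i = seg & (SEG_TABLE_SIZE - 1);
    while (table[i].bits) {
      if (table[i].key == seg)
        return (table[i].bits[(aChar >> 3) & 0x1F] >> (aChar & 7)) & 1;
      i = (i + 1) & (SEG_TABLE_SIZE - 1);  /* linear probe */
    }
    return 0;                           /* no segment: char not in font */
  }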
My proposal is a regular indexed map. Instead of each map being an 8K byte array, make it an array of 256 pointers to sub maps. Each sub map would have 256 bits = 32 bytes. If we allocate an "empty" sub map (no bits set) and share it, we save 32 bytes every time we share it. For a given map the 256 pointers use 1K of memory, so if we share the empty sub map 32 times (1/8 of the slots) in a map we break even. A CJK font will have the smallest proportion of empty sub maps, at around 50%; even in this worst case we would save around 3K per map. Other encodings will only use a single submap (or two), yielding about a 7K saving per map. By making every pointer valid we would not have to check for null pointers, so the speed impact would be very small.

The alternate suggestion of a hash is quite interesting. An inverted mapping table is attractive because the memory used is proportional to the actual number of sub maps (no pointer-array overhead per map). (Because of this feature, inverted maps are used by CPUs for their memory mapping tables; yes, hashing is done by the MMU.) Speed might be an issue, since instead of a bit map lookup we would be doing a hash lookup per character. However, it would allow us to map > 64K chars without a big memory overhead, which is something we may/will need to do in the future.
You made a motivating remark: (most?) "encodings will only use a single submap (or two)". This means that there are going to be _lots_ of 'empty' entries in the 256-length pointer array (I use 'empty' to mean that they can point to the 'empty map', so as to avoid the null check). In view of this remark, adding a third level can reduce the memory significantly further. Say, keep only the one (or two) real pointers plus the empty one, and use another full 16-length pointer array in which every pointer points to something -- including those leading down the 'empty path'. With this, the memory used by the maps becomes essentially negligible, at the expense of two levels of indirection.
A pointer takes up 8 bytes on some systems. It may be possible to use offsets instead of pointers. I haven't heard of fonts that allow more than 64K glyphs. Even TrueType (OpenType) only allows 64K glyphs, even if the newer specs allow Unicode surrogates. Also, since we are cramming the info into bits, it may be possible to use 16-bit offsets instead of 32- or 64-bit pointers.
Using a 3 level map we could actually map the surrogate range. For example, if the levels were divided 2**6 -> 2**5 -> 2**6 -> 2**3 (2**3 bits == 1 byte) we would have a 2**20 bit range.

For Latin1 the memory would be about 640 bytes:
  2**6: need exactly one array           = 64*4 * 1 = 256 bytes
  2**5: need 2 arrays (empty, non-empty) = 32*4 * 2 = 256 bytes
  2**6: need 2 arrays (empty, non-empty) = 64*2 * 1 = 128 bytes

For CJK the memory would be about 4864 bytes:
  2**6: need exactly one array = 64*4 * 1 = 256 bytes
  2**5: need 4 arrays: one for empty, plus the 4K bytes of sub maps
        would need 64 pointers, or 2 arrays, but let's say 3 arrays
        since we don't expect the arrays to align perfectly with the
        sub maps = 32*4 * 4 = 512 bytes
  2**6: need 2: enough bits for 27K chars, round up to 32K
        = 32K / 8 = 4K bytes

For a *full* surrogate map of 1 million chars it would be about 140K:
  2**6: need exactly one array = 64*4 * 1 = 256 bytes
  2**5: need 64 arrays: the 128K bytes of sub maps would need 2048
        pointers, or 64 arrays (one of them for empty)
        = 32*4 * 64 = 8192 bytes
  2**6: need 2: enough bits for 1024K chars = 1024K / 8 = 128K bytes
please ignore the "need 2" in the sub maps for CJK and full surrogate
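(Editorial cross-check of the Latin1 figure above, keeping the same assumptions: 4 byte pointers and the 2**6 -> 2**5 -> 2**6 -> 2**3 split.)

  #include <stdio.h>

  int main(void)
  {
    int top    = 64 * 4;      /* one 64-entry pointer array      = 256 */
    int middle = 32 * 4 * 2;  /* empty + non-empty 32-ptr arrays = 256 */
    int leaves = 64 * 2;      /* two 64-byte leaf bit arrays     = 128 */
    printf("Latin1 total: %d bytes\n", top + middle + leaves);  /* 640 */
    return 0;
  }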
Regarding erik's comments: yep, offsets sound like the interesting way to go when implementing. As for the 64K limit, the concern is mostly about being able to tell whether a font has a glyph for a Unicode point whose value is > 64K. So the font isn't assumed to have > 64K glyphs.
It may be a good idea to measure both the speed and memory usage of the various implementations discussed here before deciding on a particular one.
assign to bstell
Assignee: yokoyama → bstell
Status: NEW → ASSIGNED
Target Milestone: --- → mozilla0.9.1
change the "Summary:" from "Investigate the sharing of font maps" to "Compress 8Kbyte Font Glyph Maps"
Summary: Investigate the sharing of font maps → Compress 8Kbyte Font Glyph Maps
Compressing the Glyph maps

The previous design assumed 4 byte pointers, but because the map with pointers is only about 9K we can use 2 byte 'pointers' (actually offsets).

MAP SIZE:

The map currently takes 2**16 bits = 2**13 bytes (at 2**3 bits/byte) = 8K bytes. If we define a page to be 32 bytes and add one level of indirection, we get one pointer (4 bytes) per 32 bytes, or 11% overhead; a 64 byte page would have about 5% overhead. Thus the max map would be about 9K, and the overhead is half of that with a 2 byte 'pointer'.

A 64 byte page has 2**9 bits, so the page map needs 2**7 entries:

  2**7 page map entries at 2 bytes per page offset = 256 bytes

We can build the map from a single malloc such that the page map is at the beginning and the pages follow. With a single memory alloc we get a minimum map size of:

  256 bytes for page map
   64 bytes for blank page
   64 bytes for one actual page
  -----------------------------
  384 bytes

(Using a 32 byte page would give a 2**8 entry page map of 512 bytes, and hence a larger 576 byte minimum map.)

Assume the map is an array of PRUint16.

The page map index:
  page_index = aChar>>9

The page base:
  page_base = mMap[page_index]
            = mMap[aChar>>9]

The offset within the page (in PRUint16 units):
  (aChar>>4) & 0x1F

The ushort:
  ushort = mMap[page_base + ((aChar>>4) & 0x1F)]
         = mMap[mMap[aChar>>9] + ((aChar>>4) & 0x1F)]

The bit:
  bit = ushort >> (aChar & 0xF)
      = mMap[mMap[aChar>>9] + ((aChar>>4) & 0x1F)] >> (aChar & 0xF)

Test the bit:
  (bit) & 1
  (mMap[mMap[aChar>>9] + ((aChar>>4) & 0x1F)] >> (aChar & 0xF)) & 1

So the accessor macro becomes:

  #define HAS_A_GLYPH(mMap,aChar) \
      ((mMap[mMap[(aChar)>>9] + (((aChar)>>4) & 0x1F)] \
        >> ((aChar)&0xF)) & 1)

This takes 3 SHIFTs, 3 ANDs, and 2 OFFSETs. (With 2 levels of indirection we could further compress the minimum map to around 160 bytes; however, the macro becomes even more complicated.)

The current macro is:

  #define FONT_HAS_GLYPH(map, char) IS_REPRESENTABLE(map, char)
  #define IS_REPRESENTABLE(info, c) (((info)[(c) >> 5] >> ((c) & 0x1f)) & 1L)

which takes 2 SHIFTs, 2 ANDs, and 1 OFFSET.
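(A self-contained editorial rendering of the scheme above, with PRUint16 replaced by unsigned short; the demo characters are made up. The array totals 128 + 2*32 ushorts = 384 bytes, matching the minimum map size computed in the comment.)

  #include <stdio.h>
  #include <string.h>

  #define PAGE_MAP_ENTRIES 128   /* 2**7 pages of 64 bytes each */
  #define USHORTS_PER_PAGE 32    /* 64 bytes = 32 ushorts = 2**9 bits */

  #define HAS_A_GLYPH(mMap, aChar) \
      ((mMap[mMap[(aChar)>>9] + (((aChar)>>4) & 0x1F)] \
        >> ((aChar) & 0xF)) & 1)

  int main(void)
  {
    /* minimum map: page map + blank page + one actual page */
    unsigned short map[PAGE_MAP_ENTRIES + 2 * USHORTS_PER_PAGE];
    int i;
    memset(map, 0, sizeof(map));        /* the blank page stays zero */

    /* initially every page offset points at the shared blank page */
    for (i = 0; i < PAGE_MAP_ENTRIES; i++)
      map[i] = PAGE_MAP_ENTRIES;

    /* give chars 0x0000-0x01FF a real page, then set the bit for 'A' */
    map[0x41 >> 9] = PAGE_MAP_ENTRIES + USHORTS_PER_PAGE;
    map[map[0x41 >> 9] + ((0x41 >> 4) & 0x1F)] |= 1 << (0x41 & 0xF);

    printf("has 'A': %d\n", (int) HAS_A_GLYPH(map, 0x41));  /* prints 1 */
    printf("has 'B': %d\n", (int) HAS_A_GLYPH(map, 0x42));  /* prints 0 */
    return 0;
  }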
Target Milestone: mozilla0.9.1 → mozilla0.9.2
move to moz0.9.3 since this is not moz0.9.2 critical. we probably should even consider it as future.
Target Milestone: mozilla0.9.2 → mozilla0.9.3
mark it as future
Target Milestone: mozilla0.9.3 → Future
the code is in for Linux, see bug 95518
Keywords: fonts
The GfxWin fix for the compressed charmap was recently checked in via bug 102706. Remaining elements:
- *sharing* of identical maps in the font map hashtable
- support for surrogates (chars with Unicode points beyond the 0x0000-0xFFFF range)
move to shanjian for remaining issues
Assignee: bstell → shanjian
Status: ASSIGNED → NEW
please open a new bug focusing on the work still to be done, not the work that has already been done
Status: NEW → RESOLVED
Closed: 23 years ago
Resolution: --- → INVALID