Closed
Bug 620446
Opened 14 years ago
Closed 14 years ago
Provide support for RTL languages in simple SVG text elements
Categories
(Core :: SVG, defect)
Core
SVG
Tracking
()
RESOLVED
FIXED
Tracking | Status | |
---|---|---|
blocking2.0 | --- | - |
People
(Reporter: smontagu, Assigned: smontagu)
References
Details
(Keywords: rtl)
Attachments
(4 files, 6 obsolete files)
238.89 KB,
image/png
|
Details | |
4.57 KB,
patch
|
Details | Diff | Splinter Review | |
17.69 KB,
patch
|
smontagu
:
review+
roc
:
approval2.0+
|
Details | Diff | Splinter Review |
1.09 KB,
text/html
|
Details |
We won't be fixing bug 311545 before Firefox 4, but we need to provide some basic support for RTL in SVG. It's hurting us that RTL languages are displayed backwards in prominent places in our own sites, for example the labels on the graph at http://input.mozilla.com/ar/ and http://input.mozilla.com/he/
The attached patch is not necessarily final, because I want to clear up some details that are unclear in the spec, especially concerning the direction and text-anchor properties.
Assignee | ||
Updated•14 years ago
|
blocking2.0: --- → ?
Comment 1•14 years ago
|
||
I'm not particularly experienced in C/C++, but
+ do {
+ *(dest++) = *(--src);
+ } while (--length);
looks a bit unsafe. The call to EnsureStringLength doesn't help if length overflows, and then the destination will be incorrect, which leads to writing in incorrect areas.
I don't think we should actually block on this, but hopefully we can take a patch.
blocking2.0: ? → -
Comment 3•14 years ago
|
||
What if there are clusters? Do they need to stay in the right order?
Assignee | ||
Comment 4•14 years ago
|
||
(In reply to comment #3)
> What if there are clusters? Do they need to stay in the right order?
My first reaction to this was "Wow, awesome catch", but when I tested clusters in RTL text they came out fine. Jonathan, does the text run creation code automagically order clusters correctly in RTL script with LTR order?
Assignee | ||
Comment 5•14 years ago
|
||
Attachment #498770 -
Attachment is obsolete: true
Attachment #499813 -
Flags: review?(roc)
Assignee | ||
Comment 6•14 years ago
|
||
Assignee | ||
Comment 7•14 years ago
|
||
Comment on attachment 499813 [details] [diff] [review]
Patch v.2
Strange, the reftests fail on tryserver on Windows only
Attachment #499813 -
Flags: review?(roc)
Comment 8•14 years ago
|
||
Hmmm....The opt reftest images show that the text has been completely scrambled (replaced with apparently-random characters, or maybe byte-swapped or something). The debug reftests crashed during string manipulations under nsBidiPresUtils::CopyLogicalToVisual(), so perhaps the scrambled text is also due to CopyLogicalToVisual() copying data from (or to) some random/incorrect location instead of the proper string buffer.
Assignee | ||
Comment 9•14 years ago
|
||
tryserver likes this version better, but I would be happier if I knew why.
The difference is that I am caching the pointer returned by toBegin.get() in the converter, rather than caching the iterator itself.
Debugging the previous version I saw that it was writing to the wrong location and corrupting the stack.
Attachment #499813 -
Attachment is obsolete: true
Attachment #500011 -
Flags: review?(roc)
nsBidiPresUtils::CopyLogicalToVisual should truncate aDest at first.
I would like to understand how this works with clusters. It seems to me that putting combining marks before the base character rather than after it would break things. Why doesn't it?
Assignee | ||
Comment 11•14 years ago
|
||
I was hoping that Jonathan would respond to comment 4 to explain why it doesn't break things. It doesn't break in reversed-order HTML either, e.g.
data:text/html,<p>‭םֹולָׁש‬</p>
It *does* break with surrogate pairs, but I'm happy to leave that (and also mirrored characters) to the real SVG bidi fix.
Comment 12•14 years ago
|
||
(In reply to comment #11)
> I was hoping that Jonathan would respond to comment 4 to explain why it doesn't
> break things. It doesn't break in reversed-order HTML either, e.g.
> data:text/html,<p>‭םֹולָׁש‬</p>
At least in the case of harfbuzz shaping, this is supposed to work correctly: the shaper knows the inherent directionality of the script, and handles the shaping of reversed (direction-overridden) text. (Internally, it reverses the text to ensure it is in the "natural" direction, then does shaping, then restores the original (overridden) direction.)
> It *does* break with surrogate pairs, but I'm happy to leave that (and also
> mirrored characters) to the real SVG bidi fix.
Hmm. I think it's ok to ignore mirroring for now, but I'd be inclined to fix the surrogate issue, as mangling surrogates will completely trash the text, and means that we'll be passing illegal UTF-16 to lower-level code.
Comment 13•14 years ago
|
||
(If you do want to add mirroring support, maybe you could re-use the utilities that the harfbuzz shaper relies on. There's a gfxUnicodeProperties::GetMirroredChar() method; I think you'd just have to add gfxUnicodeProperties.h to the EXPORTS in Makefile.in, and it should be usable.)
Assignee | ||
Comment 14•14 years ago
|
||
(In reply to comment #12)
> Hmm. I think it's ok to ignore mirroring for now, but I'd be inclined to fix
> the surrogate issue, as mangling surrogates will completely trash the text, and
> means that we'll be passing illegal UTF-16 to lower-level code.
OTOH, mirrored characters are far more likely to come up in real-world examples...
In fact, with the harfbuzz mirroring utility it's not too hard to fix both problems by using the WriteReverse method, currently ifdef'd out in nsBidi.cpp
Assignee | ||
Comment 15•14 years ago
|
||
... like so.
Attachment #500011 -
Attachment is obsolete: true
Attachment #500396 -
Flags: review?(roc)
Attachment #500011 -
Flags: review?(roc)
+ LogicalToVisual converter(toBegin.get(), aBaseDirection, mBidiEngine);
+ copy_string(fromBegin, fromEnd, converter);
I don't think copy_string is what you want here. It seems to be leftover cruft from when strings weren't necessarily contiguous in memory. I think you should just have a function that takes a const PRUnichar* and a length and writes the results to another PRUnichar* (plus the aBaseDirection and aBidiEngine parameters). And it should return a boolean indicating success.
Would it be too much to ask to extract the part of doWriteReverse we're actually going to use and clean up the code? It's amazingly ugly :-)
Assignee | ||
Comment 17•14 years ago
|
||
This is a slightly tweaked version of http://inkscape.org/doc/examples/i18n.svg, with this patch and Jonathan's patch from bug 621918.
The changes are:
modifications to text-anchor/startOffset in the right-to-left examples (all other browsers that I tested also need this change)
addition of diacritics to the Hebrew text
Comment 18•14 years ago
|
||
(In reply to comment #17)
> Created attachment 500398 [details]
> Screenshot of a version of http://inkscape.org/doc/examples/i18n.svg
>
> This is a slightly tweaked version of
> http://inkscape.org/doc/examples/i18n.svg, with this patch and Jonathan's patch
> from bug 621918.
>
> The changes are:
> modifications to text-anchor/startOffset in the right-to-left examples (all
> other browsers that I tested also need this change)
> addition of diacritics to the Hebrew text
Looks MUCH better! :) Except that the period at the end of the Arabic and Hebrew is appearing at the right-hand end instead of the left, suggesting that the text is being treated as primarily LTR rather than RTL. Any chance we can get this fixed?
Comment 19•14 years ago
|
||
Comment on attachment 500396 [details] [diff] [review]
Patch v.4
>+#include "gfxtypes.h"
Needs to be "gfxTypes.h".
Assignee | ||
Comment 20•14 years ago
|
||
(In reply to comment #18)
> Looks MUCH better! :) Except that the period at the end of the Arabic and
> Hebrew is appearing at the right-hand end instead of the left, suggesting that
> the text is being treated as primarily LTR rather than RTL. Any chance we can
> get this fixed?
To fix this we would have either to make another change to the source and specify direction: rtl for the Hebrew and Arabic text; or change the code to make direction depend on the inherent directionality of the text rather than the inherited direction property (with initial value of ltr). This is a possible reading of the rather vague spec, and it's what Batik does, but I don't think it's the right thing to do.
(In reply to comment #19)
> Comment on attachment 500396 [details] [diff] [review]
> >+#include "gfxtypes.h"
>
> Needs to be "gfxTypes.h".
D'oh! Thanks
Assignee | ||
Comment 21•14 years ago
|
||
Attachment #499814 -
Attachment is obsolete: true
Assignee | ||
Comment 22•14 years ago
|
||
Attachment #500396 -
Attachment is obsolete: true
Attachment #500604 -
Flags: review?(roc)
Attachment #500396 -
Flags: review?(roc)
Comment on attachment 500604 [details] [diff] [review]
Patch v.5
+ // no need to use the converter -- just copy the string in reverse order
+ WriteReverse(fromBegin.get(), srcLength, toBegin.get());
I think that WriteReverse returns srcLength in all cases, which you're assuming here, but I'm not 100% sure. If it does, then it shouldn't have a return value; remove it.
Attachment #500604 -
Flags: review?(roc) → review+
Assignee | ||
Comment 24•14 years ago
|
||
Updated to comments; transferring r=roc and requesting approval2.0
Attachment #500604 -
Attachment is obsolete: true
Attachment #500966 -
Flags: review+
Attachment #500966 -
Flags: approval2.0?
Comment on attachment 500966 [details] [diff] [review]
Patch v.6
I'm approving this on the grounds that what we have now is totally broken and this is very unlikely to affect anything else, plus Simon is on the hook and has no blockers currently assigned.
Attachment #500966 -
Flags: approval2.0? → approval2.0+
Assignee | ||
Comment 26•14 years ago
|
||
http://hg.mozilla.org/mozilla-central/rev/5a289c47df1a
http://hg.mozilla.org/mozilla-central/rev/e8c496238089
Status: NEW → RESOLVED
Closed: 14 years ago
Flags: in-testsuite+
Resolution: --- → FIXED
Comment 27•14 years ago
|
||
Nice, Simon! Looking at this test:
http://dev.w3.org/SVG/profiles/1.1F2/test/harness/htmlObjectMiniApproved/text-intro-01-t.html
you can see that there is a combining character that doesn't combine with the aleph (which happens to be the only Hebrew letter I can recognise ;)). Is this a known limitation of your patch (or perhaps a problem with the test)?
Assignee | ||
Comment 28•14 years ago
|
||
That seems to a problem with the Arial Unicode MS font. This testcase shows that the same thing happens in HTML.
More worrying is that the diacritic is misplaced in Ezra SIL and Arial (at least on Mac). It looks as if the harfbuzz shaping feature described in comment 12 doesn't work for these fonts. data:text/html,<p style="font-family: 'Ezra SIL'">‭םֹולָׁש‬</p> also renders with a misplaced diacritic.
Comment 29•14 years ago
|
||
(In reply to comment #28)
> More worrying is that the diacritic is misplaced in Ezra SIL and Arial (at
> least on Mac). It looks as if the harfbuzz shaping feature described in comment
> 12 doesn't work for these fonts. data:text/html,<p style="font-family: 'Ezra
> SIL'">‭םֹולָׁש‬</p>
> also renders with a misplaced diacritic.
We don't currently use harfbuzz by default for Hebrew; it is handled via the platform's shaping library (CoreText on Mac, Uniscribe or DirectWrite on Windows, Pango on Linux).
You can force all scripts to go through harfbuzz for testing purposes by setting gfx.font_rendering.harfbuzz.level to 3.
You need to log in
before you can comment on or make changes to this bug.
Description
•