Closed Bug 670601 Opened 13 years ago Closed 9 years ago

[ro] Romanian language dependent font rendering and text search

Categories

(Core :: Find Backend, defect)

5 Branch
All
Other
defect
Not set
normal

RESOLVED DUPLICATE of bug 374795

People

(Reporter: cristian.adam, Unassigned)

References

Details

(Whiteboard: [ro])

Attachments

(6 files)

User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0
Build ID: 20110615151330

Steps to reproduce:

Search for "Țț Șș" in the following page.

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd"> 
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8"/> 
<style type="text/css">
h1 {
	font-family: Georgia;
	font-size: 36px;
	font-weight: normal;	
	margin: 10px 0 20px 0;
}
</style>
</head>

<h1 lang="ro">&#354;&#355; &#350;&#351; (cedilla)</h1>
<h1 lang="ro">&#538;&#539; &#536;&#537; (comma) </h1>

</html>


Actual results:

The search finds only one match, even though I see the same characters twice, because of the new language-dependent font rendering.



Expected results:

Firefox should have matched both text instances, which look identical to the user.
I can reproduce the issue on:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0a1) Gecko/20110710 Firefox/8.0a1

However, with both Romanian keyboard layouts installed (Legacy and Standard), typing the "ţ" character with the Legacy layout matches the first row, and typing the same character with the Standard layout matches the second row.
I don't think this is actually a Firefox bug; it is rather a question of which Romanian standard the web developer chooses.
Cristian, please change the status to resolved if you see it this way too. Thanks
Firefox should do the same character promotion (cedilla -> comma) on search.

http://www.capisci.ro/ is affected by the locale-dependent font rendering, and I would expect searching for text that I see on the web page to work.

I know this doesn't fall into the strcmp/stricmp case, but I think Firefox should do more in this area.

Google Chrome has better support for collation because it uses ICU, so it can find matches for any of the strings "Țț Șș", "Ţţ Şş" or "Tt Ss".
I meant strstr rather than strcmp/stricmp. I should have mentioned nsString::Find directly (https://developer.mozilla.org/en/nsString#Find), which lacks any locale / collation support.
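
For illustration only, the cedilla -> comma promotion requested above can be sketched as a fold applied to both the page text and the search string before a plain substring search. This is a minimal Python sketch, not Gecko code: the names CEDILLA_TO_COMMA, fold_romanian and find_all are hypothetical, and the fold covers only the four Romanian letter pairs from the test page. It does not make "Tt Ss" match, which would need real collation support such as ICU provides.

```python
# Hypothetical helpers for illustration; not part of nsString or Gecko.
# Map the legacy cedilla letters to the comma-below letters.
CEDILLA_TO_COMMA = str.maketrans({
    "\u0162": "\u021A",  # Ţ (T cedilla) -> Ț (T comma below)
    "\u0163": "\u021B",  # ţ (t cedilla) -> ț (t comma below)
    "\u015E": "\u0218",  # Ş (S cedilla) -> Ș (S comma below)
    "\u015F": "\u0219",  # ş (s cedilla) -> ș (s comma below)
})


def fold_romanian(text):
    """Return text with cedilla letters promoted to comma-below letters."""
    return text.translate(CEDILLA_TO_COMMA)


def find_all(haystack, needle):
    """Return the start offsets of every folded match of needle in haystack.

    The fold is one character to one character, so the offsets are also
    valid in the original, unfolded haystack.
    """
    folded_haystack = fold_romanian(haystack)
    folded_needle = fold_romanian(needle)
    offsets = []
    start = folded_haystack.find(folded_needle)
    while start != -1:
        offsets.append(start)
        start = folded_haystack.find(folded_needle, start + 1)
    return offsets


# The two headings from the test page, one per line.
page = "\u0162\u0163 \u015E\u015F (cedilla)\n\u021A\u021B \u0218\u0219 (comma)"
query = "\u021A\u021B \u0218\u0219"  # "Țț Șș" typed with the Standard layout
```

With this fold, find_all(page, query) reports both headings, whereas a plain page.find(query) only reaches the comma-below one.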
Confirming this on:
Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0a1) Gecko/20110710 Firefox/8.0a1
Status: UNCONFIRMED → NEW
Ever confirmed: true
Duplicate of bug 202251?
Component: General → Find Backend
Product: Firefox → Core
QA Contact: general → find-backend
Fixing bug 202251 would also fix this one. 

But in this case we're searching for text that is actually displayed on the web page, namely "Țț Șș", and not for "Tt Ss", which is the case covered by bug 202251.
Depends on: 202251
Blocks: 632886
Whiteboard: [ro]
Is this bug still present?
Summary: Romanian language dependent font rendering and text search → [ro] Romanian language dependent font rendering and text search
The bug is still present in Mozilla Firefox 22.

I have tested the following browsers on an English Windows 8 installation with the Romanian locale configured:

Browser name           | Same characters rendered | Both characters highlighted
--------------------------------------------------------------------------------
Mozilla Firefox 22     |          yes             |              no
Google Chrome 28       |           no             |              yes
Internet Explorer 10   |           no             |              yes

I've attached screenshots to confirm my findings.
Attached image bug670601_chrome28.png
Attached image bug670601_ff22.png
Attached image bug670601_ie10.png
Blocks: 907793
Any update on this?
Bug 1128330 adds another case in which find is limited, namely characters generated using combining diacritical marks.

This time there is no visual difference between s comma below (&#x0219;) and s + comma below (&#x0073;&#x0326;). The user will surely be confused.

See https://www.assembla.com/code/cristianadam/subversion/node/blob/webpages/diacritice/test_diacritice_combinatorii.html?&rev=88 (source code also at: http://pastebin.com/ir5QWTt1)

Google Chrome doesn't have this problem, it can find everything!
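
For illustration only, the combining-mark case can be handled by normalizing both strings to the same Unicode normalization form before comparing them. This is a minimal Python sketch using the standard unicodedata module; the name nfc_contains is hypothetical and this is not how Firefox or Chrome implement find. Note that normalization alone does not unify cedilla and comma-below letters, because they decompose to different combining marks (U+0327 and U+0326).

```python
import unicodedata


def nfc_contains(haystack, needle):
    """Return True if needle occurs in haystack after NFC normalization.

    Hypothetical helper: both strings are composed to NFC first, so a
    precomposed letter and its base letter + combining mark sequence
    compare equal.
    """
    normalized_haystack = unicodedata.normalize("NFC", haystack)
    normalized_needle = unicodedata.normalize("NFC", needle)
    return normalized_needle in normalized_haystack


precomposed = "\u0219"    # s comma below as a single code point
combining = "s\u0326"     # s followed by COMBINING COMMA BELOW

# A raw substring test treats the two spellings as different text.
raw_match = precomposed in combining
# After normalization the two spellings match in either direction.
normalized_match = nfc_contains(combining, precomposed)
```

Here raw_match is False while normalized_match is True, which is the difference the user runs into on the test page.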
(In reply to Cristian Adam from comment #16)
> 
> This time there is no visual difference between s comma below (&#x0219;) and
> s + comma below (&#x0073;&#x0326;). The user will be for sure confused.
> 

I forgot that this bug is about the user's inability to visually distinguish s and t with comma below from s and t with cedilla because of the "locl" promotion :)

I've uploaded the test code from the description here: https://www.assembla.com/code/cristianadam/subversion/node/blob/webpages/diacritice/test_diacritice_locl.html?rev=92

Now there are two cases in which users will not find words that they see on the web page.
(In reply to Cristian Adam from comment #14)
> Created attachment 783943 [details]

I can't reproduce this with IE11 in Windows 7. It behaves the same as Firefox.

(In reply to Cristian Adam from comment #16)
> Google Chrome doesn't have this problem, it can find everything!

See bug 1147464. Like Google Search, Chrome may very well find something other than what was intended. For example, it doesn't differentiate between S Ș Ş Š Ŝ Ś.
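
For illustration only, the over-matching described above is what happens when a search folds every letter to its base letter. This is a minimal Python sketch of one such fold (decompose to NFD, then drop combining marks); the name strip_marks is hypothetical, and it is an assumption that this approximates the behavior in question, since Chrome's actual matching goes through ICU.

```python
import unicodedata


def strip_marks(text):
    """Return text with all combining marks removed after NFD decomposition.

    Hypothetical helper: this is an aggressive fold that keeps only the
    base letters, so letters that differ only by a diacritic collapse.
    """
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if unicodedata.combining(ch) == 0)


# S, S comma below, S cedilla, S caron, S circumflex, S acute
letters = ["S", "\u0218", "\u015E", "\u0160", "\u015C", "\u015A"]
folded = [strip_marks(letter) for letter in letters]
```

Every entry in folded is a plain "S", so a find built on this fold matches the cedilla and comma-below forms but can no longer tell any of these letters apart.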
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → DUPLICATE
See Also: → 1147464