Closed Bug 648521 Opened 13 years ago Closed 10 years ago

Page search does not normalize and find decomposed Unicode characters

Categories

(Toolkit :: Find Toolbar, defect)

Type: defect
Priority: Not set
Severity: minor

RESOLVED DUPLICATE of bug 640856

People

(Reporter: voss, Unassigned)

Details

User-Agent:       Mozilla/5.0 (X11; U; Linux i686; en-US) AppleWebKit/534.16 (KHTML, like Gecko) Ubuntu/10.04 Chromium/10.0.648.133 Chrome/10.0.648.133 Safari/534.16
Build Identifier: Mozilla/5.0 (X11; Linux i686; rv:2.0) Gecko/20100101 Firefox/4.0

Several Unicode characters can be represented in two ways: as one precomposed code point, or as a base character followed by one or more combining characters. Both forms look the same when displayed, and you can normalize both to one form. But if a page contains both forms, the Firefox page search finds only the form that was typed into the find bar, not the other one.
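
For illustration, here is a minimal Python sketch of the two representations, using the standard `unicodedata` module. It is not Firefox code, and the variable names are made up for this example; it only shows that the two strings differ code point by code point until they are normalized.

```python
import unicodedata

precomposed = "\u00fc"   # "ü" as a single code point, U+00FC
decomposed = "u\u0308"   # "u" followed by U+0308 COMBINING DIAERESIS

# A plain comparison treats the two visually identical strings as different.
raw_equal = precomposed == decomposed

# After normalizing to a common form, they compare equal.
nfc_equal = unicodedata.normalize("NFC", decomposed) == precomposed
nfd_equal = unicodedata.normalize("NFD", precomposed) == decomposed

print(raw_equal, nfc_equal, nfd_equal)
```

A plain substring search behaves like the first comparison, which matches the behavior described in this report.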


Reproducible: Always

Steps to Reproduce:
1. Open a page with non-normalized Unicode. Here is an example:

<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
</head>
<body>
combined ü
<hr>
u with combining dots: u&#x0308;
</body></html>

2. Search for the precomposed character: press Ctrl+F and type "ü".

Actual Results:
Only the precomposed character "ü" is found.

Expected Results:
Both instances should be found, regardless of whether the source document uses the precomposed character or a base character followed by a combining character.

The page content and the search term should both be normalized to Unicode Normalization Form C (NFC, canonical composition) before they are compared.
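
The suggested fix could look roughly like the following Python sketch. This is only an assumption about how a normalizing search might work, not how the Find Toolbar is implemented; `find_normalized` and `page_text` are hypothetical names chosen for this example.

```python
import unicodedata

def find_normalized(text: str, needle: str) -> list[int]:
    """Return the start offsets of needle in the NFC-normalized text."""
    haystack = unicodedata.normalize("NFC", text)
    target = unicodedata.normalize("NFC", needle)
    if not target:
        return []
    offsets = []
    start = haystack.find(target)
    while start != -1:
        offsets.append(start)
        start = haystack.find(target, start + 1)
    return offsets

# Same content as the test page: one precomposed and one decomposed "ü".
page_text = "combined \u00fc / u with combining dots: u\u0308"

raw_count = page_text.count("\u00fc")           # plain search, no normalization
matches = find_normalized(page_text, "\u00fc")  # search after NFC normalization

print(raw_count, len(matches))
```

The plain search finds one occurrence and the normalized search finds two. Note that the offsets refer to the normalized text, so a real implementation would still have to map them back to positions in the original document in order to highlight the matches.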
Confirmed on Mozilla/5.0 (Windows NT 6.1; rv:6.0a1) Gecko/20110413 Firefox/6.0a1

Only the precomposed character can be found.
Version: unspecified → 4.0 Branch
Component: General → Find Toolbar
Product: Firefox → Toolkit
QA Contact: general → fast.find
Version: 4.0 Branch → unspecified
This is a duplicate of bug 640856
Blocks: 565552
OS: Linux → All
Hardware: x86 → All
Flags: firefox-backlog?
Status: UNCONFIRMED → RESOLVED
Closed: 10 years ago
Resolution: --- → DUPLICATE
No longer blocks: 565552
Flags: firefox-backlog?