Find in This Page should match regular quotes to curly quotes

RESOLVED FIXED in mozilla27

Status

Toolkit
Find Toolbar
--
enhancement
RESOLVED FIXED
14 years ago
4 years ago

People

(Reporter: shadytrees, Assigned: mbrubeck)

Tracking

unspecified
mozilla27
Points:
---
Bug Flags:
in-testsuite +

Firefox Tracking Flags

(Not tracked)

Attachments

(1 attachment, 3 obsolete attachments)

(Reporter)

Description

14 years ago
It's harder to type a curly quote using an Alt- combination than it is to type a
regular quote. To illustrate, trying to find the phrase ""Omit needless" in the
A List Apart article will only match the first sentence but not the second one
that uses curly quotes.
Ugh.  Why not search on "omit needless" sans quotes?  Special-casing quotes like
this leads to the question of what else would we get fuzzy on?  I'm not a big
fan of fuzzy matching, period.
(Reporter)

Comment 2

14 years ago
It's cropped up very rarely, I'll admit. A (bad) example that I can think of off
the top of my head is when an author quotes another text and you get "So-and-so
said this and this" (with curly quotes) and I only know the first two or three
words, which also appear frequently throughout the text.

Perhaps a better example would be if a web page is using a possessive or a
contraction that I want to find, but the web page is using right single quotes
for contractions and the word, by coincidence, appears numerous times elsewhere.

It doesn't appear often, but when it does, it's a tiny frustration for me.
This is an automated message, with ID "auto-resolve01".

This bug has had no comments for a long time. Statistically, we have found that
bug reports that have not been confirmed by a second user after three months are
highly unlikely to be the source of a fix to the code.

While your input is very important to us, our resources are limited and so we
are asking for your help in focussing our efforts. If you can still reproduce
this problem in the latest version of the product (see below for how to obtain a
copy) or, for feature requests, if it's not present in the latest version and
you still believe we should implement it, please visit the URL of this bug
(given at the top of this mail) and add a comment to that effect, giving more
reproduction information if you have it.

If it is not a problem any longer, you need take no action. If this bug is not
changed in any way in the next two weeks, it will be automatically resolved.
Thank you for your help in this matter.

The latest beta releases can be obtained from:
Firefox:     http://www.mozilla.org/projects/firefox/
Thunderbird: http://www.mozilla.org/products/thunderbird/releases/1.5beta1.html
Seamonkey:   http://www.mozilla.org/projects/seamonkey/
This bug has been automatically resolved after a period of inactivity (see above
comment). If anyone thinks this is incorrect, they should feel free to reopen it.
Status: UNCONFIRMED → RESOLVED
Last Resolved: 13 years ago
Resolution: --- → EXPIRED
Product: Firefox → Toolkit
Reopening for resolution (WONTFIX?)
Status: RESOLVED → UNCONFIRMED
Resolution: EXPIRED → ---
Duplicate of this bug: 477372
I don't think we should WONTFIX this. As authoring tools become more typographically sophisticated, or whenever people bring text from a source that already has "smart quotes", there are likely to be curly quotes or apostrophes. The example given (of "Omit needless" with quotes) may not seem compelling, but the issue will apply equally to any word with an apostrophe; these are too common to ignore, and users cannot be expected to search for variants with both U+0027 and U+2019 separately.

Updated

9 years ago
Duplicate of this bug: 475306

Updated

9 years ago
Duplicate of this bug: 477373

Comment 10

9 years ago
I’d like to sum up which characters should be treated as equivalent.
If you type the first character in each line, the others should be found as well.

Double quotes, including double prime:
" (U+0022), “ (U+201C), ” (U+201D), „ (U+201E), ″ (U+2033)
Single quotes, apostrophes, including prime:
' (U+0027), ‘ (U+2018), ’ (U+2019), ‚ (U+201A), ′ (U+2032)
Triple dot (horizontal ellipsis):
... , … (U+2026)
Maybe also hyphen, minus and dashes:
- (U+002D), – (U+2013), — (U+2014), (‐ (U+2010), − (U+2212))
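
For illustration, the groups listed above could be written as a folding table. This is a minimal Python sketch, not Firefox code; `FOLD_GROUPS`, `fold`, and `page_contains` are hypothetical names introduced here, and folding both the page and the query is an assumption about how the matching would work.

```python
# Hypothetical folding table built from the groups listed in this comment.
# Each key is the plain form a user would type; the value holds the
# variants that should be treated as equivalent to it.
FOLD_GROUPS = {
    '"': "\u201C\u201D\u201E\u2033",   # curly/low double quotes, double prime
    "'": "\u2018\u2019\u201A\u2032",   # curly/low single quotes, prime
    "...": "\u2026",                   # horizontal ellipsis
    "-": "\u2013\u2014\u2010\u2212",   # en dash, em dash, hyphen, minus
}

# Map every variant character to its plain form.
_TABLE = str.maketrans(
    {variant: plain
     for plain, variants in FOLD_GROUPS.items()
     for variant in variants}
)


def fold(text: str) -> str:
    """Replace every listed variant with its plain equivalent."""
    return text.translate(_TABLE)


def page_contains(page: str, query: str) -> bool:
    """Return True if the query occurs in the page after folding both."""
    return fold(query) in fold(page)
```

Note that folding the ellipsis changes the length of the text, so offsets in the folded string no longer line up with the original; the quote and dash groups are one-to-one and do not have that problem.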

Comment 11

9 years ago
This problem has been discussed on a well-known website that mixes typographically sophisticated articles with ordinary ones (Wikipedia). One of the main problems we pointed out was that the browser doesn't provide a way to search for a curly apostrophe or any similar character just by typing the common single quote character.

One suggestion was to treat any type of single quote (and, respectively, double quote) as a plain single quote when case-sensitive matching is off, and to match a specific quote only as itself when case-sensitive matching is on.
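
A rough Python sketch of that suggestion, under the assumption that quote folding simply rides along with the case-sensitivity switch. The names `QUOTE_FOLD` and `matches` are hypothetical and only illustrate the idea.

```python
# Hypothetical table: curly quotes fold to their straight counterparts.
QUOTE_FOLD = str.maketrans({
    "\u2018": "'", "\u2019": "'",   # left/right single quotation marks
    "\u201C": '"', "\u201D": '"',   # left/right double quotation marks
})


def matches(page: str, query: str, case_sensitive: bool = False) -> bool:
    """Substring search where quote folding follows the case setting."""
    if case_sensitive:
        # Exact mode: every character, quotes included, must match as typed.
        return query in page
    # Loose mode: fold quotes and ignore case on both sides.
    page_key = page.translate(QUOTE_FOLD).casefold()
    query_key = query.translate(QUOTE_FOLD).casefold()
    return query_key in page_key
```

With this design, a user who needs to find only the typographic apostrophe can do so by switching case-sensitive matching on, at the cost of tying two unrelated behaviours to one switch.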
Assignee: bugzilla → nobody
QA Contact: fast.find

Comment 12

9 years ago
I'm surprised this bug is still marked UNCONFIRMED. I can confirm that this problem is particularly annoying with the typographic apostrophe (as described in Bug 477373, which has been marked as a duplicate of this bug), especially in French. The apostrophe is widely used in French, for instance: aujourd'hui (today), ... d'or (golden ...), baie d'Along (Halong Bay). These are only a few words chosen at random, but there is a virtually endless number of others.

The typographic apostrophe is becoming more common on the Web since it is often automatically converted from the typewriter apostrophe by word processing or desktop-publishing software. As tomas D said, the problem also exists on Wikipedia where typographic and typewriter apostrophes are both used. Both apostrophes are often mixed, even in the same page, e.g. visitors posting comments typing a different kind of apostrophe than the one used by the author.

I do agree that Firefox should match typewriter apostrophes to typographic apostrophes by default.
(Assignee)

Updated

5 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
(Assignee)

Comment 13

5 years ago
Created attachment 801891 [details] [diff] [review]
WIP

Here's a simple fix.  I haven't written (or updated) any test code yet.  Before I do that, do you think this approach is acceptable?
Assignee: nobody → mbrubeck
Status: NEW → ASSIGNED
Attachment #801891 - Flags: feedback?(jst)
Comment on attachment 801891 [details] [diff] [review]
WIP

Seems like an acceptable approach to me.
Attachment #801891 - Flags: feedback?(jst) → feedback+
(Assignee)

Comment 15

5 years ago
Created attachment 802418 [details] [diff] [review]
curly

Now with tests.

Note: This patch *only* normalizes curly double quotes to straight double quotes and curly single quotes to straight single quotes.  I did not try any wider or more sophisticated normalization because I wanted to address common cases like you're versus you’re while minimizing the risk of unintended consequences.
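
To make the scope described above concrete, here is a minimal Python sketch of that kind of narrow normalization. It is not the attached patch; `normalize_quote` and `find_first` are hypothetical names, and applying the normalization to both the page and the search string is an assumption.

```python
from typing import Optional


def normalize_quote(ch: str) -> str:
    """Map a single curly quote character to its straight form."""
    if ch in "\u2018\u2019":      # left/right single quotation marks
        return "'"
    if ch in "\u201C\u201D":      # left/right double quotation marks
        return '"'
    return ch                     # everything else is left untouched


def find_first(page: str, query: str) -> Optional[int]:
    """Return the index of the first match in the original page, or None."""
    needle = [normalize_quote(c) for c in query]
    n = len(needle)
    for start in range(len(page) - n + 1):
        # Compare character by character, normalizing the page on the fly.
        if all(normalize_quote(page[start + i]) == needle[i] for i in range(n)):
            return start
    return None
```

Because each curly quote maps to exactly one straight quote, match offsets in the normalized comparison are the same as offsets in the original text. Wider folding, such as treating an ellipsis as three dots, would not have that property.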

https://tbpl.mozilla.org/?tree=Try&rev=c2085b3f4337
Attachment #801891 - Attachment is obsolete: true
Attachment #802418 - Flags: review?(jst)
(Assignee)

Comment 16

5 years ago
Created attachment 803770 [details] [diff] [review]
patch v2

No functional change; just fixed a misspelled identifier in the previous patch.
Attachment #802418 - Attachment is obsolete: true
Attachment #802418 - Flags: review?(jst)
Attachment #803770 - Flags: review?(jst)
(Assignee)

Comment 17

5 years ago
Created attachment 803773 [details] [diff] [review]
patch v3

Forgot to qref.  Sorry.
Attachment #803773 - Flags: review?(jst)
(Assignee)

Updated

5 years ago
Attachment #803770 - Attachment is obsolete: true
Attachment #803770 - Flags: review?(jst)

Comment 18

5 years ago
Could this be preffed off? Thanks muchly
(In reply to Philip Chee from comment #18)
> Could this be preffed off? Thanks muchly

Would you care to elaborate on why you feel the need to have this controlled by a pref?

Comment 20

5 years ago
> Would you care to elaborate on why you feel the need to have this controlled by a pref?
This is admittedly an edge case, but what if you want to search for only typographical apostrophes?

On the other hand adding a pref just means more code to maintain so if you don't think this is a good idea I'll withdraw my request.
Comment on attachment 803773 [details] [diff] [review]
patch v3

r=jst. I'd argue that this is too edge-casey to worry about adding a pref for. And if I'm proven wrong, doing so at a later stage is certainly an option...
Attachment #803773 - Flags: review?(jst) → review+
(Assignee)

Comment 22

5 years ago
https://hg.mozilla.org/integration/fx-team/rev/7a710c502b49
Flags: in-testsuite+
OS: Windows XP → All
Hardware: x86 → All
https://hg.mozilla.org/mozilla-central/rev/7a710c502b49
Status: ASSIGNED → RESOLVED
Last Resolved: 13 years ago → 5 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla27

Updated

4 years ago
Depends on: 998773

Comment 24

4 years ago
This new behaviour needs to be controlled by user preference; if I search for curly quote signs/apostrophes, I don't want straight ones.
(Assignee)

Comment 25

4 years ago
(In reply to Michael Bednarek from comment #24)
> This new behaviour needs to be controlled by user preference; if I search
> for curly quote signs/apostrophes, I don't want straight ones.

Let's use bug 998773 for this request.