Using the "search in site" feature (f3) incorrect for umlauts
Categories
(Toolkit :: Find Toolbar, defect, P3)
Tracking
()
People
(Reporter: sirbitesalot, Unassigned)
Details
User Agent: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:97.0) Gecko/20100101 Firefox/97.0
Steps to reproduce:
Visit a site that includes words with German umlauts.
Search for umlauts/parts/words with umlauts.
Actual results:
Search finds all umlauts AND "normal" equivalents of the character
like:
ö->o
ü->u
ä->a
For example, if you want to find the word "Öl", it will find everything that contains "ol".
Expected results:
Only find the actual umlauts.
Searching for "Öl" should only find "Öl"
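To make the difference between the actual and expected results concrete, here is a minimal sketch of the two matching modes. It is not the Find Toolbar implementation; the helper names `strip_diacritics` and `find_matches` and the word list are hypothetical, and it assumes that diacritic-insensitive search can be approximated by Unicode decomposition plus dropping combining marks.

```python
import unicodedata


def strip_diacritics(text: str) -> str:
    """Decompose to NFD and drop combining marks, e.g. "Ö" -> "O"."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def find_matches(words, query, match_diacritics=False):
    """Return the words that contain query, ignoring case.

    match_diacritics=False folds diacritics on both sides before comparing.
    match_diacritics=True only normalizes to NFC, so "ö" and "o" stay distinct.
    """
    def key(s: str) -> str:
        if match_diacritics:
            s = unicodedata.normalize("NFC", s)
        else:
            s = strip_diacritics(s)
        return s.casefold()

    needle = key(query)
    return [w for w in words if needle in key(w)]


# Hypothetical page content, chosen only for illustration.
words = ["Öl", "Kolben", "Wolle", "Möbel"]

print(find_matches(words, "Öl"))                         # folded comparison
print(find_matches(words, "Öl", match_diacritics=True))  # exact comparison
```

With the folded comparison, "Öl" also hits "Kolben" and "Wolle" because all three contain "ol" once the diacritic is dropped; with the exact comparison only "Öl" is returned.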
Comment 1•4 years ago
The Bugbug bot thinks this bug should belong to the 'Firefox::Address Bar' component, and is moving the bug to that component. Please revert this change in case you think the bot is wrong.
Updated•4 years ago
Updated•4 years ago
Comment 2•4 years ago
S2 is normally reserved for pretty serious bugs that affect lots of users. This, AIUI, can be worked around by just ticking/unticking the "Match diacritics" checkbox in the find bar. Is this really S2?
Comment 3•4 years ago
Hi Gijs,
Thanks for checking in on this and letting us know about the workaround. There is a "Match diacritics" checkbox to enable matching letters with diacritics.
I'm new to the Find bar, and I assumed this was serious and might be affecting all users without a known workaround :).
I'm lowering the severity and priority levels.
Updated•4 years ago
Is the diacritics checkbox new? I used to copy all contents to a text editor to work around the issue (which was really annoying). In German, the checkbox is labeled "Akzente" (accents) - which is kinda off imho. Why would I have to check a box labeled "accents" to match äöü as what they are, i.e. regular characters? So, is this a bug that's related to the new checkbox? I don't remember seeing it pre-97.
Comment 5•3 years ago
Hi Ronald,
I did a quick search, and the diacritics checkbox is not a new feature.
The last work on this checkbox was done in 2019, here: https://hg.mozilla.org/mozilla-central/rev/828a10a79e71ae1fd028bb37ccf128e2434e7f0b
By default, "Match Diacritics" is off. You can turn it on by default by setting "findbar.matchdiacritics" to 1 instead of 0 in about:config.
Comment 6•3 years ago
Closing this bug as RESOLVED -> INVALID because the default for diacritics matching can be changed to true in about:config.
Comment 7•3 years ago
(In reply to Ronald from comment #4)
> Is the diacritics checkbox new? I used to copy all contents to a text editor to work around the issue (which was really annoying). In German, the checkbox is labeled "Akzente" (accents) - which is kinda off imho. Why would I have to check a box labeled "accents" to match äöü as what they are, i.e. regular characters? So, is this a bug that's related to the new checkbox? I don't remember seeing it pre-97.
So to be clear, this was implemented in bug 202251. The diacritics handling there is really not about different letters; the idea is that you should get all the matches you want/expect from the input provided, without having to type the exact diacritics/punctuation. In general, I would argue that when searching for a word, providing more matches is less harmful than providing fewer matches - it's impossible for the user to find the additional ones without significant effort, whereas discounting the ones they don't care about is usually easier.
I understand that for you in German these are not "accents" but separate letters. Unfortunately, it isn't possible to make this distinction universally, not even for the same letters. In Dutch, for instance, ë is not considered a "different" letter, it's just an e with "trema". French does the same thing, as far as I understand. There are also existing closed bugs for Cyrillic (bug 1620826) and Finnish (bug 1647335) that complain about effectively the same thing. It isn't technically possible to fix this - you can tick/untick the "Match diacritics" checkbox in the find bar to govern the behaviour, because the browser cannot know which is the one you want/mean.
If you think the German label is wrong, you are welcome to file a bug against the German localization and have a conversation with the localizers about a more accurate label. My German is adequate but not sufficient for a discussion around the subtleties of various words for the type of thing that is meant here. Diacritics is also not the most technically correct word in English; it's a compromise for brevity and user understanding (see also discussions on the patch in that bug). Technically, we're talking about Unicode combining characters, but that is clearly not what you want to put on a user-facing label, especially not in a location where space is at a premium anyway.
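As a rough illustration of the "combining characters" point, the sketch below decomposes a German ü and a Dutch ë with Python's `unicodedata` module. The helper name `describe` is hypothetical, and this only shows how Unicode encodes the characters, not how the find bar processes them.

```python
import unicodedata


def describe(char: str):
    """List the code points of char after canonical decomposition (NFD)."""
    return [
        (f"U+{ord(c):04X}", unicodedata.name(c))
        for c in unicodedata.normalize("NFD", char)
    ]


german_u_umlaut = "\u00fc"  # ü, a separate letter for German readers
dutch_e_trema = "\u00eb"    # ë, an e with trema for Dutch readers

print(describe(german_u_umlaut))
print(describe(dutch_e_trema))

# Recomposing gives back the single precomposed code point.
roundtrip = unicodedata.normalize(
    "NFC", unicodedata.normalize("NFD", german_u_umlaut)
)
print(roundtrip == german_u_umlaut)
```

Both characters decompose into a base letter followed by the same mark, U+0308 COMBINING DIAERESIS, so at the encoding level nothing tells an umlaut apart from a trema.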
Thanks for the extensive reply!
To me, as an end user, this is still a bug, though. Iff I search for "u" then "u" shall be highlighted. Iff I search for "ü" then "ü" shall be highlighted. Having "ü" highlighted when I type "ue" is neat but I wouldn't expect it ("ue" is an alternative to "ü" in German but it's rarely used).
I don't remember having issues with umlauts in FF in the past.
I don't know the details - especially for other languages - but to make my point short: Chrome/Chromium does it "right". There are no boxes I have to tick to get my expected behaviour. "u" -> "u" and "ü|ue" -> "ü" (and "ue" -> "ue"). I don't know what they have to do - if they use your OS or browser language, the language set in the HTML root, map IPs to countries/regions - but it just feels right. Whereas the extra checkbox in FF feels like a poor man's solution (please don't get me wrong here ;)).