Closed
Bug 697393
Opened 14 years ago
Closed 10 years ago
Generate more predictable (MediaWiki-like) IDs on headers
Categories
(support.mozilla.org :: Knowledge Base Software, task)
Tracking
(Not tracked)
RESOLVED
WONTFIX
People
(Reporter: underpass_bugzilla, Unassigned)
Details
User Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0) Gecko/20100101 Firefox/8.0
Build ID: 20111019081014
Steps to reproduce:
In this revision
https://support.mozilla.com/it/kb/Firefox%20si%20chiude%20inaspettatamente/revision/18195
we have a paragraph with title
= Verificare se la chiusura inaspettata si manifesta in modalità provvisoria =
In the first lines of the document we also have an anchor leading to it
[[Firefox crashes#w_verificare-se-la-chiusura-inaspettata-si-manifesta-in-modalità-provvisoria|verificare se il problema si manifesta anche in modalità provvisoria]]
and this anchor does not work. When I avoided the accented letter 'à' in the paragraph title, the problem did not occur:
https://support.mozilla.com/it/kb/Firefox%20si%20chiude%20inaspettatamente/revision/18212
Expected results:
In Italian we have plenty of words ending with accented letters and we would like to be able to use them also in paragraph titles.
Thanks
Comment 1•14 years ago
In French, we use accents a lot and we have no problem to link to headings, such as in https://support.mozilla.com/fr/kb/R%C3%A9soudre%20des%20probl%C3%A8mes%20li%C3%A9s%20aux%20plugins#w_daeterminer-si-un-plugin-pose-problaeme or https://support.mozilla.com/fr/kb/R%C3%A9soudre%20des%20probl%C3%A8mes%20li%C3%A9s%20aux%20plugins#w_mettre-aa-jour-ou-rae-installer-vos-plugins
Status: UNCONFIRMED → RESOLVED
Closed: 14 years ago
Resolution: --- → INVALID
Scoobidiver: please, can you tell me why I cannot use the accented letter in that title?
Maybe it is not the most correct thing in the world to label a bug as INVALID just because you think it is.
Comment 3•14 years ago
(In reply to Scoobidiver from comment #1)
> In French, we use accents a lot and we have no problem to link to headings
"é" and "è" become "ae", "à" becomes "aa" in your URLs, do you have to do that change by hand?
Honestly I haven't used SUMO in years, but this sounds like a bug to me.
In the meantime this should work
https://support.mozilla.com/it/kb/Firefox%20si%20chiude%20inaspettatamente/revision/18330
Comment 4•14 years ago
Both of the testcases in comment 0 seem to work for me. Simone, could you verify the one that is not supposed to work, as it seems to work for me?
Note: I'm seeing "a'" in the title, not "à". Maybe that's why it works?
Status: RESOLVED → UNCONFIRMED
Resolution: INVALID → ---
Comment 5•14 years ago
It seems that the auto-generated anchor slugs used in the Table of Contents do work fine. Simone, when you created this manual anchor link, did you copy the exact slug used in the TOC? If so, it really should work. In the example you quote in comment 0, it looks like you've tried to add the accent in the #anchor_part of the link, which probably won't work.
Comment 6•14 years ago
(In reply to flod (Francesco Lodolo) from comment #3)
> "é" and "è" become "ae", "à" becomes "aa" in your URLs, do you have to do
> that change by hand?
You just need to hover the heading in the TOC to see what it looks like.
Hi all. What I say is: if I use the 'à' letter in the title and the à letter in the anchor link, the anchor does not work.
So in order to make it work *NOW*, I had to replace the letter in the title with a' and the letter in the anchor link with a plain a.
Every other combination simply fails.
What flod is pointing to is that even in existent articles like
https://support.mozilla.com/fr/kb/Résoudre des problèmes liés aux plugins
an anchor linking to a title containing the è letter has to be worked around with the ae combination instead.
(In reply to Scoobidiver from comment #6)
> You just need to hover the heading in the TOC to see what it looks like.
What about answering to flod, instead?
Comment 8•14 years ago
(In reply to Simone Lando from comment #7)
> What flod is pointing to is that even in existent articles like
>
> https://support.mozilla.com/fr/kb/Résoudre des problèmes liés aux plugins
>
> an anchor linking to a title containing the è letter has to be worked around
> with the ae combination instead.
I'm trying to wrap my head around this, so forgive me if I'm being slow or ignorant here... but this anchor that is the reason for this bug report, isn't that manually typed into the article's source? Assuming that's the case, don't you have to hover over the TOC link to figure out what the anchor URL slug is anyway? If so, does it matter that it's not using the accent and that it's changing ò to oe or something similar?
Again, sorry if I'm misunderstanding here. It's late. :)
Sorry David, I do not understand your comment so please forgive me :)
What I am trying to say is that
1) it is not possible to use accented letters in links to anchors.
2) the workaround (i.e. the use of ae, oe, etc.) is, I guess, undocumented and it is not that easy to apply - especially for new contributors.
So I propose two options:
1) Fix the bug thus allowing the use of accented letters in anchor links or
2) Explain the use of this workaround since it has to be applied manually to anchor links.
Comment 10•14 years ago
(In reply to Simone Lando from comment #9)
> 1) Fix the bug thus allowing the use of accented letters in anchor links
If this solution was chosen:
1. localizers would need to manually check about 250 articles one by one to see if they contain this kind of anchor and then edit them. That's additional work.
2. the current links given in external sources such as forums would go to the top of the article instead of the section.
> 2) Explain the use of this workaround since it has to be applied manually to
> anchor links.
This kind of request is not treated in bugzilla but in the Contributors forum.
Reporter
Comment 11•14 years ago
Scoobidiver: I think it is not necessary to answer always. Try to wait, at least sometimes.
Comment 12•14 years ago
Wait -- I'm still lost here. Why would you have to go through 250 articles to check things manually even if we decided to fix this? Doesn't this problem *only* occur for articles in which you're adding *manual* anchor links to sections in the article? That is pretty rare in our KB. For most of the time, we're not using anchor links at all, except for the auto-generated TOC links, which always work.
So, for these manual anchor links, isn't it true that in order to know the exact anchor link URL/slug, you have to hover over an existing TOC-generated link? For example, in order to link manually to "The crash doesn't happen in Safe Mode", I'd have to hover over the TOC link and I'd see https://support.mozilla.com/en-US/kb/Firefox%20crashes#w_the-crash-doesnt-happen-in-safe-mode. So, in order to manually create an anchor link, I'd then use:
[[Firefox crashes#w_the-crash-doesnt-happen-in-safe-mode|link caption]]
If all of this is true, then I don't see why it's a problem which letters the TOC link generator uses when generating these links? I mean, it also added "w_" to the beginning of the links, and that's not part of the original heading either. So, why is it a problem that an à is converted to aa? It's just an anchor slug, it doesn't change the visual appearance of the heading for users, right?
Reporter
Comment 13•14 years ago
Ok, now I finally understand :)
You say that the solution is to find how the TOC generates the link and then use that link.
You know what? I've *NEVER* thought about that. I manually wrote each and every internal anchor (never checking the TOC).
I think that this is still a minor bug and should be fixed. But obviously the last word is yours (I mean David, not Scoobidiver).
Thanks :-)
Comment 14•14 years ago
I think the issue with "fixing" this at this point is that right now, we generate a certain ID for a given anchor, e.g.:
= Déterminer si un plugin pose problème =
turns into
<h1 id="w_daeterminer-si-un-plugin-pose-problaeme">Déterminer si un plugin pose problème</h1>
So, anything that's already linking to this header (correctly) is linking to "#w_daeterminer-si-un-plugin-pose-problaeme". If we changed it to generate IDs with non-ASCII characters, any existing links to it would break.
I don't see any particular reason we couldn't put something else in the ID attribute for anchors--but if we're going to, and break existing links, I think it's worth considering breaking them more (i.e. getting rid of the "w_" and generally getting closer to how MediaWiki generates them, which is much more predictable).
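For illustration, the slugs quoted in this bug can be reproduced with a short Python sketch. The replacement table and the function name `guess_header_id` are assumptions inferred only from the examples in this thread (é and è become "ae", à becomes "aa", apostrophes are dropped, spaces become hyphens); this is not the parser's actual code, and characters not shown in the thread may be handled differently.

```python
import re

# Hypothetical mapping, inferred only from the slugs quoted in this bug.
OBSERVED_REPLACEMENTS = {"é": "ae", "è": "ae", "à": "aa"}


def guess_header_id(title, prefix="w_"):
    """Guess the header ID for a wiki heading title (illustrative only)."""
    text = title.strip().lower()
    for accented, replacement in OBSERVED_REPLACEMENTS.items():
        text = text.replace(accented, replacement)
    # Drop everything that is not an ASCII letter, digit, whitespace or hyphen
    # (this removes apostrophes, for example).
    text = re.sub(r"[^a-z0-9\s-]", "", text)
    # Collapse runs of whitespace and hyphens into a single hyphen.
    text = re.sub(r"[\s-]+", "-", text).strip("-")
    return prefix + text


print(guess_header_id("Déterminer si un plugin pose problème"))
print(guess_header_id("The crash doesn't happen in Safe Mode"))
```

The point of the sketch is only that the mapping is deterministic but not guessable from the heading text alone, which is why copying the slug from the TOC link is the reliable route today.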
Reporter
Comment 15•14 years ago
Ok. Maybe it could be useful to check how many articles have internal anchors.
Thanks James, as usual.
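As a rough idea of how such a count could be made, here is a minimal Python sketch that scans article source for manual section links of the form `[[Page#anchor|caption]]` and flags those whose anchor contains non-ASCII characters. The regular expression and the function names are hypothetical, and how the article sources are obtained (database export, API, etc.) is left open; this only reports links and does not rewrite anything.

```python
import re

# Matches [[Page title#anchor|caption]] and [[Page title#anchor]].
ANCHOR_LINK = re.compile(r"\[\[([^\]\|#]*)#([^\]\|]+)(?:\|([^\]]*))?\]\]")


def find_anchor_links(source):
    """Return (page, anchor) pairs for every wiki link that targets a section."""
    return [
        (match.group(1).strip(), match.group(2).strip())
        for match in ANCHOR_LINK.finditer(source)
    ]


def non_ascii_anchors(source):
    """Return only the links whose anchor part contains non-ASCII characters."""
    return [
        (page, anchor)
        for page, anchor in find_anchor_links(source)
        if not anchor.isascii()
    ]


sample = (
    "[[Firefox crashes#w_the-crash-doesnt-happen-in-safe-mode|link caption]] "
    "[[Firefox crashes#w_modalità-provvisoria|altro link]] "
    "[[Firefox crashes]]"
)
print(len(find_anchor_links(sample)), "anchor links,",
      len(non_ascii_anchors(sample)), "with non-ASCII anchors")
```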
Comment 16•14 years ago
(In reply to David Tenser [:djst] from comment #12)
> Why would you have to go through 250 articles to check things manually
> even if we decided to fix this?
I didn't mean that anchors are used in every article, but in order to know which articles use those anchors, you need to check every article manually.
Another solution would be to create a script that would do that as it was done during the Kitsune migration.
(In reply to James Socol [:jsocol, :james] from comment #14)
> getting rid of the "w_"
That's bug 616418.
> I think it's worth considering breaking them more
How will you treat differently {for} inside headings? See https://support.mozilla.com/en-US/kb/Firefox%20will%20not%20start#w_firefox-will-not-start-4on-os-x-10-4-or-earlier-or-with-a-powerpc-processorsf5on-windows-98me-or-earliersf
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 17•14 years ago
(In reply to Scoobidiver from comment #16)
> Another solution would be to create a script that would do that as it was
> done during the Kitsune migration.
Let's not put that on the table. It's huge, slow, error-prone, and generally a bad idea. Especially if not a lot of articles actually link this way. The original migration was only possible because it was the original, and even that took months of dev time.
> (In reply to James Socol [:jsocol, :james] from comment #14)
> > getting rid of the "w_"
> That's bug 616418.
I just closed that out--it was an issue with the original migration, filed _after_ the migration when it was too late to fix. We only need one "change how IDs are generated" bug.
> > I think it's worth considering breaking them more
> How will you treat differently {for} inside headings? See
> https://support.mozilla.com/en-US/kb/Firefox%20will%20not%20start#w_firefox-
> will-not-start-4on-os-x-10-4-or-earlier-or-with-a-powerpc-processorsf5on-
> windows-98me-or-earliersf
This is really tangential to the discussion here, so I don't want to get lost in the weeds, but...
Honestly, my answer there is: we won't. Don't use {for} that way. This was done to avoid repeating 4 words: "Firefox will not start". The reasonable way to do that would've been:
> {for win}
> == Firefox will not start on Windows 98 ==
>
> Some text about Windows 98
> {/for}
> {for mac}
> == Firefox will not start on Mac OS 10.4 ==
>
> Some text about 10.4
> {/for}
I look at that paragraph and I see a whole ton of jumping through {for} hoops to avoid repeating very little text. At the very least, just make two separate headings. It's simpler on so many levels. I think this has been brought up on the contributor/kb-article forums before with the question "is this actually easier to localize?"
Comment 18•14 years ago
(In reply to James Socol [:jsocol, :james] from comment #14)
> I think the issue with "fixing" this at this point is that right now, we
> generate a certain ID for a given anchor, e.g.:
>
> = Déterminer si un plugin pose problème =
>
> turns into
>
> <h1 id="w_daeterminer-si-un-plugin-pose-problaeme">Déterminer si un plugin
> pose problème</h1>
>
> So, anything that's already linking to this header (correctly) is linking to
> "#w_daeterminer-si-un-plugin-pose-problaeme". If we changed it to generate
> IDs with non-ASCII characters, any existing links to it would break.
Getting rid of historic TikiWiki ballast would be good. We should indeed figure out how many of those links we have laying around in English and up to date localized articles. This is only about anchored links, right?
> I don't see any particular reason we couldn't put something else in the ID
> attribute for anchors--but if we're going to, and break existing links, I
> think it's worth considering breaking them more (i.e. getting rid of the
> "w_" and generally getting closer to how MediaWiki generates them, which is
> much more predictable).
Can you tell me how this would affect the integration of the new MediaWiki parser?
Status: NEW → UNCONFIRMED
Ever confirmed: false
Updated•14 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 19•14 years ago
(In reply to Kadir Topal [:atopal] from comment #18)
> Getting rid of historic TikiWiki ballast would be good. We should indeed
> figure out how many of those links we have laying around in English and up
> to date localized articles. This is only about anchored links, right?
Yes, it's only about links with anchors.
"w_" isn't from Tiki, it's from our current wiki parser. If you look through bug 616418 you'll see more of the context.
That said, and as I said in that bug, the current IDs we generate are difficult to predict, not nice to humans, and have a weird "w_" in front of them*. We could, if we're OK with breaking all existing anchored links, do closer to what MediaWiki does.
* The "w_" is to namespace the IDs so they never collide with other IDs on the page. It's not a bad idea, I just don't think it's practically necessary.
> Can you tell me how this would affect the integration of the new MediaWiki
> parser?
The new parser is so far away that it really wouldn't, at this point.
If we want to continue to generate "w_" (which I think is not a good idea for the parser by default) we'd have to do something easy like make it possible to specify a callable to generate the IDs. If we don't continue to support "w_", we can just be opinionated, make it not configurable, and produce output closer to MediaWiki's.
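To make the "callable" idea concrete, here is a minimal Python sketch. All names (`mediawiki_like_id`, `namespaced`, `render_heading`) are hypothetical and are not part of any existing parser. The MediaWiki-like function is only a rough approximation of MediaWiki's legacy anchor encoding (spaces become underscores, other bytes are percent-encoded with "%" written as "."), so edge cases will differ from the real thing.

```python
import html
from urllib.parse import quote


def mediawiki_like_id(title):
    """Approximate MediaWiki's legacy anchors: 'Safe Mode' -> 'Safe_Mode'."""
    return quote(title.strip().replace(" ", "_"), safe="").replace("%", ".")


def namespaced(generator, prefix="w_"):
    """Wrap any ID generator so its output keeps a namespace prefix."""
    return lambda title: prefix + generator(title)


def render_heading(title, level=1, make_id=mediawiki_like_id):
    """Hypothetical renderer: the ID scheme is whatever callable is passed in."""
    return '<h{0} id="{1}">{2}</h{0}>'.format(
        level, html.escape(make_id(title), quote=True), html.escape(title)
    )


# Opinionated default, no prefix.
print(render_heading("Safe Mode"))
# Opt back in to the "w_" namespace by passing a different callable.
print(render_heading("Safe Mode", make_id=namespaced(mediawiki_like_id)))
```

With this shape, keeping "w_" is a one-line configuration choice at the call site rather than something baked into the parser.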
Status: NEW → UNCONFIRMED
Ever confirmed: false
Updated•14 years ago
Status: UNCONFIRMED → NEW
Ever confirmed: true
Comment 20•14 years ago
It sounds like we all agree that it's at most a cosmetic bug, then. Nice to fix, but not the highest priority. Hope that's a fair summary and rationale for changing its priority. :)
Severity: normal → minor
Updated•14 years ago
Summary: If a header contains an accented letter, anchors leading to it do not work → Use MediaWiki syntax for anchor IDs
Updated•14 years ago
Summary: Use MediaWiki syntax for anchor IDs → Generate more predictable (MediaWiki-like) IDs on headers
Comment 21•10 years ago
Almost 4 years later, this hasn't hurt us much. Changing it now would be too disruptive.
Converting things like á to aa (for example) is pretty common for URLs, where including accented characters tends to make overly long, ugly URLs. Keeping it this way keeps things consistent with our other URL practices.
Finally, users shouldn't be typing in anchors by hand. If they aren't finding the anchor links already in the document, we should be adding more help for people to find them instead of making users type them out by hand.
Status: NEW → RESOLVED
Closed: 14 years ago → 10 years ago
Resolution: --- → WONTFIX