Open Bug 749835 Opened 12 years ago Updated 4 months ago

Can't make a wiki link to an article with a colon in the title

Categories

(support.mozilla.org :: Knowledge Base Software, defect, P5)

Tracking

(Not tracked)

REOPENED

People

(Reporter: verdi, Unassigned)

References

Details

(Whiteboard: u=user c=wiki p=2)

Attachments

(1 file)

I can name an article "Learn the basics: Get started with Firefox", but links to that article title don't work. Using [[Learn the basics: Get started with Firefox]] should work, but it just produces the unlinked text "[[Learn the basics: Get started with Firefox]]".
Probably a parser bug? Adding to next sprint for now.
Target Milestone: --- → 2012.9
Priority: -- → P2
Whiteboard: u=contributor c=wiki p=
Whiteboard: u=contributor c=wiki p= → u=user c=wiki p=
Whiteboard: u=user c=wiki p= → u=user c=wiki p=2
Tweaking the scope of this:

We're going to look into this for at most a day. If we figure it out, yay! If it turns out to be complex or not possible to fix, we'll talk about it in a stand-up.
I'm going to look into this today and see where the rabbit hole goes.
Assignee: nobody → willkg
Here's the deal. A colon in a title has syntactic meaning to the parser. It's what the parser uses to denote internal namespaces. That's how Template:, Video:, Include:, and all that works. The problem here is that the article is titled "Learn the basics: Get started with Firefox", but when you put that in a link like this:

    [[Learn the basics: Get started with Firefox]]

that gets parsed into two parts:

    namespace: Learn the basics
    title: Get started with Firefox

and there's nothing that handles the "Learn the basics" namespace in our system. Ipso facto, it doesn't work and produces no link.
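
A minimal sketch of the split described above, assuming the link target is simply divided at its first colon. The name `naive_split` is hypothetical and is not py-wikimarkup's actual API:

```python
def naive_split(link_target):
    """Split a wiki link target at its first colon (illustrative only)."""
    if ":" in link_target:
        namespace, _, title = link_target.partition(":")
        return namespace.strip(), title.strip()
    # No colon: the whole target is the title and there is no namespace.
    return None, link_target.strip()


namespace, title = naive_split("Learn the basics: Get started with Firefox")
print(namespace)  # Learn the basics
print(title)      # Get started with Firefox
```

Under this assumption, "Learn the basics" is treated as a namespace, so the article lookup never sees the full title.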

If you tweak the link to something like [[Learn the basics: Get started with Firefox|Learn the basics: Get started with Firefox]], then it does produce a link, but the link goes to the "create a new page!" page and not the page you were trying to link to.

I talked with Erik to confirm my diagnosis and he says it's worth trying to fake out the parser using an encoded colon. So I tried that and wasn't able to get it to work putting the encoded colon (&#58;) in either the article title or the link. It keeps getting converted back to a colon.

At this point, I think it's either not possible to fix this within the constraints of our requirements and how wikimarkup gets parsed, or if it is possible, it'll be a serious project.

I don't think there are other options. I think we should close this out as WONTFIX and make a note near the title field that you shouldn't use colons.

Any thoughts?
Ok. Just as I hit "Save Changes", it occurred to me that I can make this work if we use COLON instead of : in the links because this is specifically a link-parsing problem.

WARNING: If you're not into hacktastic solutions, look no further!

I can tweak the link handling bit of the sumo parser so that it tries the article and if it doesn't exist and has COLON in the link, then it replaces COLON with : and tries again. So you'd be creating article links like this:

    [[Learn the basicsCOLON Get started with Firefox]]

I can make that work.

Is that too hacktastic?
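
A rough sketch of the fallback lookup proposed above, not the actual sumo parser code. `resolve_link` and the `existing_titles` set are hypothetical stand-ins for whatever the real link handler and article lookup are:

```python
def resolve_link(target, existing_titles):
    """Return the matching article title, or None if no article matches."""
    # First try the link target exactly as written.
    if target in existing_titles:
        return target
    # Fallback: swap the COLON placeholder for a real colon and retry.
    if "COLON" in target:
        candidate = target.replace("COLON", ":")
        if candidate in existing_titles:
            return candidate
    return None


titles = {"Learn the basics: Get started with Firefox", "Sorting bookmarks"}
print(resolve_link("Learn the basicsCOLON Get started with Firefox", titles))
# Learn the basics: Get started with Firefox
```

Trying the literal target first means a title that genuinely contains the word COLON would still resolve to itself.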
(In reply to Will Kahn-Greene [:willkg] from comment #5)
> Ok. Just as I hit "Save Changes", it occurred to me that I can make this
> work if we use COLON instead of : in the links because this is specifically
> a link-parsing problem.
> 
> WARNING: If you're not into hacktastic solutions, look no further!
> 
> I can tweak the link handling bit of the sumo parser so that it tries the
> article and if it doesn't exist and has COLON in the link, then it replaces
> COLON with : and tries again. So you'd be creating article links like this:
> 
>     [[Learn the basicsCOLON Get started with Firefox]]
> 
> I can make that work.
> 
> Is that too hacktastic?

I'm leaning towards saying forget about it. What are the localization implications of having COLON in the title? I'm guessing we'd have to explain that COLON = :? Anything else?
(In reply to Verdi from comment #6)
> (In reply to Will Kahn-Greene [:willkg] from comment #5)
> > Ok. Just as I hit "Save Changes", it occurred to me that I can make this
> > work if we use COLON instead of : in the links because this is specifically
> > a link-parsing problem.
> > 
> > WARNING: If you're not into hacktastic solutions, look no further!
> > 
> > I can tweak the link handling bit of the sumo parser so that it tries the
> > article and if it doesn't exist and has COLON in the link, then it replaces
> > COLON with : and tries again. So you'd be creating article links like this:
> > 
> >     [[Learn the basicsCOLON Get started with Firefox]]
> > 
> > I can make that work.
> > 
> > Is that too hacktastic?
> 
> I'm leaning towards saying forget about it. What are the localization
> implications of having COLON in the title? I'm guessing we'd have to explain
> that COLON = :? Anything else?

To clarify, this isn't an issue with the article title--this is just an issue with the link. So the title of the article as seen in the kb/new form and other forms and on the site and everywhere else is "Learn the basics: Get started with Firefox". But to use it in a link, you have to replace : with COLON so it looks like "[[Learn the basicsCOLON Get started with Firefox]]".

Do localized articles also go through and localize all the internal links?
(In reply to Will Kahn-Greene [:willkg] from comment #7)

> 
> Do localized articles also go through and localize all the internal links?

No, not normally. Here's a section of an Italian article:
=Organizzare i segnalibri=
I seguenti articoli spiegano come ordinare e organizzare i propri segnalibri:
* [[Sorting bookmarks]]
* [[Bookmark folders]]
* [[Deleting bookmarks]]

I guess my question is that if you have something like [[CookiesCOLON Chocolate chip, Oatmeal raisin and more]] will that display as "Cookies: al cioccolato, farina d'avena uva passa e più" in an Italian article?

Another issue I thought of is that the new forum article link maker thing would need to know to change : to COLON and we'd have to make sure people knew that when copying the title of an article and making it a link you'll have to manually change : to COLON.
(In reply to Verdi from comment #8)
> we'd have to make sure people
> knew that when copying the title of an article and making it a link you'll
> have to manually change : to COLON.

How critical are colons in titles? Could they be replaced with hyphens?

Step back and read that with less context. That sounds awful.

MediaWiki itself handles colons in titles, cf. http://en.wikipedia.org/wiki/Star_Wars_Episode_VI:_Return_of_the_Jedi

Maybe our current parser (py-wikimarkup) could be pretty-easily extended to see if namespace is a registered namespace and, if not, rejoin it to the rest of the title?

> def split_title(self, page):
>     # Split on the first colon; keep the prefix as a namespace only if registered.
>     if ':' in page:
>         namespace, _, title = page.partition(':')
>     else:
>         namespace, title = None, page
>     if namespace and namespace not in self.registered_namespaces:
>         namespace, title = None, ':'.join((namespace, title))
>     return (namespace, title)

There's probably a less-hacky way to do that, but let's keep the hacky crap in the code, and not in the user experience, if we have to have hacky crap at all.
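
For illustration, a standalone variant of that idea that runs outside the parser class. The contents of `REGISTERED_NAMESPACES` are an assumption based on the namespaces named earlier in this bug (Template, Video, Include); the real parser's registry may differ:

```python
# Assumed namespace registry; the real list lives in the parser configuration.
REGISTERED_NAMESPACES = frozenset({"Template", "Video", "Include"})


def split_title(page, registered_namespaces=REGISTERED_NAMESPACES):
    """Split "Namespace:Title" only when the prefix is a known namespace."""
    namespace, sep, title = page.partition(":")
    if not sep or namespace not in registered_namespaces:
        # No colon, or an unregistered prefix: the whole string is the title.
        return None, page
    return namespace, title


print(split_title("Template:Foo"))
# ('Template', 'Foo')
print(split_title("Learn the basics: Get started with Firefox"))
# (None, 'Learn the basics: Get started with Firefox')
```

Returning the original string untouched in the fallback case avoids having to rejoin the pieces, so whitespace around the colon is preserved exactly.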
(In reply to James Socol [:jsocol, :james] from comment #9)
> (In reply to Verdi from comment #8)
> > we'd have to make sure people
> > knew that when copying the title of an article and making it a link you'll
> > have to manually change : to COLON.
> 
> How critical are colons in titles? Could they be replaced with hyphens?
> 
> Step back and read that with less context. That sounds awful.
> 
> MediaWiki itself handles colons in titles, c.f.
> http://en.wikipedia.org/wiki/Star_Wars_Episode_VI:_Return_of_the_Jedi
> 
> Maybe our current parser (py-wikimarkup) could be pretty-easily extended to
> see if namespace is a registered namespace and, if not, rejoin it to the
> rest of the title?
> 
> > def split_title(self, page):
> >     # Split on the first colon; keep the prefix as a namespace only if registered.
> >     if ':' in page:
> >         namespace, _, title = page.partition(':')
> >     else:
> >         namespace, title = None, page
> >     if namespace and namespace not in self.registered_namespaces:
> >         namespace, title = None, ':'.join((namespace, title))
> >     return (namespace, title)
> 
> There's probably a less-hacky way to do that, but let's keep the hacky crap
> in the code, and not in the user experience, if we have to have hacky crap
> at all.

After what verdi has said, I'm -1 on the hack--it touches too many people.

I'm pretty sure there is no pretty-easily extended way to do this. The docs suggest that:

https://github.com/dcramer/py-wikimarkup#adding-internal-links

The py-wikimarkup project is pretty dead, so we could fork it and maintain our own fork. But this parser doesn't have many tests and we don't have many tests for the parser, either, so fundamental changes to the parser aren't without risk.

I'll look into it more today.
We never used dcramer's fork, always Paul's: https://github.com/pcraciunoiu/py-wikimarkup and, even though he left Mozilla, he's been great about merging pull reqs. Also, I still have commit access there.
We are currently making do with hyphens in titles instead of colons, which is fine. I don't think it's that big of a deal, or worth wasting more time on. I'm ok with not fixing this now and looking into it some other time.
(In reply to Verdi from comment #12)
> We are currently making do with hyphens in titles instead of colons, which
> is fine. I don't think it's that big of a deal, or worth wasting more time
> on. I'm ok with not fixing this now and looking into it some other time.

Sounds good to me.

I'm going to mark it as WONTFIX. If we want to work on this again, we can spin off a new bug.
Status: NEW → RESOLVED
Closed: 12 years ago
Resolution: --- → WONTFIX
I retroactively created a research bug for figuring out the deal because there's some good information in here should someone want to fix this in the future.

So I'm reopening this one, bumping it out of the sprint (to be replaced by the research bug), and unassigning myself.
Assignee: willkg → nobody
Status: RESOLVED → REOPENED
Depends on: 755299
Resolution: WONTFIX → ---
Target Milestone: 2012.9 → ---
This problem came up again - see bug 1084343
See Also: → 1084343
I doubt we'll solve this! Setting to P5 and WONTFIX. If somebody wants to redo :willkg's research in 2017 (and beyond), please reopen, but do the research first before reopening!
Status: REOPENED → RESOLVED
Closed: 12 years ago → 7 years ago
Resolution: --- → WONTFIX

This is still an issue. I edited the metadata today for four articles, to remove the colon from the title. See:
https://support.mozilla.org/en-US/kb/premium-subscriber-faq/discuss/11679
https://support.mozilla.org/en-US/kb/pocket-premium-suggested-tags/discuss/11678
https://support.mozilla.org/en-US/kb/pocket-premium-permanent-library/discuss/11677
https://support.mozilla.org/en-US/kb/pocket-premium-full-text-search/discuss/11676

If this problem can't be solved, can we at least show a warning when we create a new article (or edit the metadata for an existing article) when the proposed title includes a colon?
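
A hedged sketch of what such a check could look like. `title_warnings` is a hypothetical helper, not an existing function in the support.mozilla.org codebase, and the warning text is only a suggestion:

```python
def title_warnings(title):
    """Return a list of warning strings for a proposed article title."""
    warnings = []
    if ":" in title:
        # The wiki link parser treats text before a colon as a namespace.
        warnings.append(
            "This title contains a colon, so [[wiki links]] to it will not "
            "work. Consider using a hyphen instead."
        )
    return warnings


for message in title_warnings("Learn the basics: Get started with Firefox"):
    print(message)
```

The new-article and edit-metadata forms could show these messages without blocking the save, which keeps the check advisory rather than a hard validation error.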

Status: RESOLVED → REOPENED
Resolution: WONTFIX → ---

Another new article was just approved with a colon in the title. See:
https://support.mozilla.org/en-US/kb/firefox-accounts-renamed-mozilla-accounts/discuss/11723

Type: task → defect

Still an issue. This newly approved article includes a colon in the title:
Contributor Policy: Usage of Generative AI (e.g., ChatGPT)
