Closed Bug 849633 Opened 7 years ago Closed 7 years ago

Strings changed without updating entities on central

Categories

(Core :: XPCOM, defect)

22 Branch
defect
Not set

Tracking

()

RESOLVED FIXED
mozilla22
Tracking Status
firefox21 --- unaffected
firefox22 + fixed

People

(Reporter: tchevalier, Assigned: zwol)

References

(Blocks 1 open bug)

Details

Attachments

(2 files)

Bug 847181 landed string changes, but the name of several entities have not been updated. Please, see https://developer.mozilla.org/en-US/docs/Making_String_Changes

For instance, localizers will not be noticed about this change:
-NS_ERROR_NOT_AVAILABLE=Not available
+NS_ERROR_NOT_AVAILABLE=Some printing functionality is not currently available.
And so on.

Even if you can't change the entity name (it happens sometimes, for code reasons, I don't know if it is the case here), but you should warn Axel about it and notify localizers on the l10n mailing list.
What is the bug number for the removal of this ridiculous toolchain limitation?

The entity names cannot be changed in this case, because they are programmatically derived from XPCOM error codes.
Also, does the filing of this bug mean that everyone who needs to be notified has already been notified, or should I still send a note to the l10n list?
Zack, how about you stop ranting and say something constructive instead?

If you're changing the semantics of an error code, you should probably change the error code.
(In reply to Axel Hecht [:Pike] from comment #3)
> Zack, how about you stop ranting and say something constructive instead?

I'll admit to being a little peeved here, but I asked for the bug number because I want to help fix this limitation, not to rant about it.

(I see it as a serious limitation on two grounds: the extra process discourages people from improving the English localization, and more importantly, it means we completely lose the main benefit of keeping the messages separate from the code, i.e. that you *can* improve them without having to touch the code.)

> If you're changing the semantics of an error code, you should probably
> change the error code.

The semantics of the error codes themselves have not changed, and in fact, I'd argue that the "essential meaning" of the associated messages has not changed either; my goal was simply to make them more comprehensible to the end-user.  Despite that, this *is* a change that localizers should see, because (hopefully) the improved English messages can also help make all the other localizations better.
(In reply to Zack Weinberg (:zwol) from comment #2)
> Also, does the filing of this bug mean that everyone who needs to be
> notified has already been notified, or should I still send a note to the
> l10n list?

Considering the amount of strings changed here you just need to replace the IDS so that they don't move on to Aurora like this. I don't see any other way out.

Developers don't like this, I keep hearing their complaints (I'm usually the one filing bugs for this, thanks Théo!) but this is just a rule as others about Mozilla coding. Use two spaces for indentation, don't change strings after they landed on central without a proper ID/key change.
https://developer.mozilla.org/en-US/docs/Making_String_Changes

>I want to help fix this limitation
This is outside the scope of this bug, and I'm pretty sure Axel already thought a lot about this. Just to let you understand why things are not so simple: if you change an en-US string, how and when do you invalidate an existing localization? Locales will have those strings and look complete, but you can't be sure they reflect the new content unless you plan on becoming proficient in 70 something languages.
The changes in bug 847181 were mainly just removing obsolete stuff and improving English localization. The meaning of the entities didn't change.

So the changes follow "If you are changing a string without changing the essential meaning of it..." 
rule in https://developer.mozilla.org/en-US/docs/Making_String_Changes
If developers should not follow that rule, then the rules should be changed.
(In reply to Francesco Lodolo [:flod] from comment #5)
> (In reply to Zack Weinberg (:zwol) from comment #2)
> > Also, does the filing of this bug mean that everyone who needs to be
> > notified has already been notified, or should I still send a note to the
> > l10n list?
> 
> Considering the amount of strings changed here you just need to replace the
> IDS so that they don't move on to Aurora like this. I don't see any other
> way out.

As I said, these are the names of XPCOM error codes.  Some of them are not much used and could be renamed, but others (e.g. the one Théo called out up top) are used in thousands of places throughout the code.  Renaming them is not an option.

Question: From the code side, it seems as if string IDs only have to be unique within one property file.  Is that actually true, or do they have to be globally unique?  If they have to be globally unique, then this problem is worse than it seems: not only are there a bunch of unflagged fuzzy translations, but there are conflicts between the error messages in this file and the ones in xpc.msg.

> >I want to help fix this limitation
> This is outside the scope of this bug

Right, that's why I asked for the bug number.  Surely there is already a bug?

> and I'm pretty sure Axel already
> thought a lot about this. Just to let you understand why things are not so
> simple: if you change an en-US string, how and when do you invalidate an
> existing localization? Locales will have those strings and look complete,
> but you can't be sure they reflect the new content unless you plan on
> becoming proficient in 70 something languages.

It's not hard at all.  The principle is that whenever the English changes, the other localizations should all be flagged as *possibly* out of date and localizers should see both the old and new English strings so they can decide whether they need to change anything.  The implementation is to add a generation number to the database for each file+ID pair (assuming IDs do only have to be unique within a file), which gets bumped whenever the text changes.
(In reply to Zack Weinberg (:zwol) from comment #4)
> (In reply to Axel Hecht [:Pike] from comment #3)
> > Zack, how about you stop ranting and say something constructive instead?
> 
> I'll admit to being a little peeved here, but I asked for the bug number
> because I want to help fix this limitation, not to rant about it.
> 
> (I see it as a serious limitation on two grounds: the extra process
> discourages people from improving the English localization, and more
> importantly, it means we completely lose the main benefit of keeping the
> messages separate from the code, i.e. that you *can* improve them without
> having to touch the code.)

Note, the "id change" process doesn't keep people from improving the English in en-US, as those changes actually shouldn't change the id. All localizations are still good.

OTH, if the messages themselves are bad, this needs to be communicated to localizers, and that work is on the developer.

I think that often times when I hear the complaint about the IDs, it's about making the conscious decision about whether l10n needs to care or not, and much less about writing down the actual code. No idea in this case, though.

> > If you're changing the semantics of an error code, you should probably
> > change the error code.
> 
> The semantics of the error codes themselves have not changed, and in fact,
> I'd argue that the "essential meaning" of the associated messages has not
> changed either; my goal was simply to make them more comprehensible to the
> end-user.  Despite that, this *is* a change that localizers should see,
> because (hopefully) the improved English messages can also help make all the
> other localizations better.

What I suggest to do in this particular case is to have

#define NS_ERROR_TO_LOCALIZED_PRINT_ERROR_MSG2(nserr, locid) case nserr: stringName.AssignLiteral(#locid); break;

next to 

#define NS_ERROR_TO_LOCALIZED_PRINT_ERROR_MSG(nserr) case nserr: stringName.AssignLiteral(#nserr); break;

and use that for the entries that you change.

Also, "Print to File" and not "Print To File" is likely the right caps ;-).
Entity keys are only unique within one scope, which is a single file in .properties files. In DTDs, it's all the files that are ever included together with your file. Usually that's just file scope. brand.dtd being the most prominent exception, with a few extras here and there.

Generally though, file scope and using good variable names for the keys works fine.
(In reply to Axel Hecht [:Pike] from comment #8)
> (In reply to Zack Weinberg (:zwol) from comment #4)
> > (I see it as a serious limitation on two grounds: the extra process
> > discourages people from improving the English localization, and more
> > importantly, it means we completely lose the main benefit of keeping the
> > messages separate from the code, i.e. that you *can* improve them without
> > having to touch the code.)
> 
> Note, the "id change" process doesn't keep people from improving the English
> in en-US, as those changes actually shouldn't change the id. All
> localizations are still good.
> 
> OTH, if the messages themselves are bad, this needs to be communicated to
> localizers, and that work is on the developer.
> 
> I think that often times when I hear the complaint about the IDs, it's about
> making the conscious decision about whether l10n needs to care or not, and
> much less about writing down the actual code. No idea in this case, though.

This situation is IMHO right on the line.  None of these messages have had their *essential meaning* change, but the *English statement* of that meaning has been revised in order to convey it better.  I don't read any other language's technical vocabulary well enough to know whether other localizations already convey the essential meaning well enough that they don't need to change.  I'd rate it as likely that other localizations did a near-literal translation of the old messages, and so revisiting them would be *helpful*, but I don't think it would be wrong to leave all the existing localizations untouched.

I feel that this situation reflects a lack of a way for developers to communicate to localizers "hey, you should maybe have another look at this message set, but you might not need to change anything."  In my ideal world that would be expressed by changing the text of the message and leaving the ID alone.

(Still want to know that bug number.)

> What I suggest to do in this particular case is to have
> 
> #define NS_ERROR_TO_LOCALIZED_PRINT_ERROR_MSG2(nserr, locid) case nserr:
> stringName.AssignLiteral(#locid); break;
> 
> next to 
> 
> #define NS_ERROR_TO_LOCALIZED_PRINT_ERROR_MSG(nserr) case nserr:
> stringName.AssignLiteral(#nserr); break;
> 
> and use that for the entries that you change.

Not workable given future plans for this code (see bug 650960); this function will be deleted in short order, and the property-bundle lookup will be done from JavaScript based on the Components.results label corresponding to the numeric error code.

What I could do is add a prefix or suffix to *all* these message IDs, but I don't want to do that unless there is no alternative whatsoever.

> Also, "Print to File" and not "Print To File" is likely the right caps ;-).

Pre-existing condition.  I'll change it if the conclusion here is that that file needs to change again.
The original changeset is honestly unreadable, at least to me, so I took the time to list all strings changed without a proper renaming (I hope I didn't miss any).

Take a look at this list. Can you still say that "The meaning of the entities didn't change." when you change a string from "We are unable to Print or Print Preview this page." to "Printing XUL documents is not supported."? I don't really think so.

IMO the only two strings who can change without renaming are the last two.

Are there technical limitations that prevent you from changing those entity names? It would be the first time I see such a limit on Mozilla's code, and I think it should be solved (e.g. some layer that matches error codes to strings, but, again, I'm not a developer).
(In reply to Francesco Lodolo [:flod] from comment #11)
> Take a look at this list. Can you still say that "The meaning of the
> entities didn't change." when you change a string from "We are unable to
> Print or Print Preview this page." to "Printing XUL documents is not
> supported."? I don't really think so.

Yes, I can say that, because the meaning of the *entities* have not changed.  In the case you call out, the *text* was so badly worded as to be flat-out inaccurate.  In the other cases, the text was not inaccurate, it was just confusing; the new text is a better reflection of the meaning that the entity has always had.

Again, these are XPCOM error codes with well-defined meanings (especially the ones that don't start NS_ERROR_GFX_PRINTER_) and I feel very strongly that I should be able to make editorial corrections to the expression of that meaning without changing the entity names.

> IMO the only two strings who can change without renaming are the last two.

I'm glad we're on the same page about those two, at least.

> Are there technical limitations that prevent you from changing those entity
> names? It would be the first time I see such a limit on Mozilla's code, and
> I think it should be solved (e.g. some layer that matches error codes to
> strings, but, again, I'm not a developer).

Because these are the symbolic names of error codes, the only practical code-side option is to add a prefix or suffix to all of the NS_ERROR_* messages in this property list.  I do not want to do this unless there is no translation-side alternative whatsoever.

**Please** someone tell me the bug number for the improvements to the translation database that will render this whole issue a relic of the past.  I don't even know what product to look in, much less where to find the code, but I'll happily find the time to fix this properly if someone tells me where to begin.
(In reply to Zack Weinberg (:zwol) from comment #12)
> Yes, I can say that, because the meaning of the *entities* have not changed.

Ok, at this point we can only agree on desagreeing.

> **Please** someone tell me the bug number for the improvements to the
> translation database that will render this whole issue a relic of the past. 

There is no such bug because there's no such thing as a "translation database". Each localizer is free to use the tool of his choice, even a text editor if he prefers to, and there's no central database of strings, just a comparison between en-US and localized strings (compare-locales) to see if there are errors or strings are missing.
> not changed; the English diagnostics, however, were badly written to the
> point of being completely wrong in one case.

From the MDN[1]: "If you are changing a string such that its meaning has changed"

If something was completely wrong and you fixed it, you did, by all means, change its meaning, that *of the string*.  To deny that makes absolutely no sense.  At the very least you should change *that one* entity.

And then the MDN tells you how to do it: ", you must update the entity or property name for the string to match the new meaning."

you can say, "but the name of the entity is just right!"  Well, then you simply have to version it, that is, add "1" or "2" or "3".  It happens all the time!  Why would it not apply to your changeset?

Take a look at Bug 593126[2].  See how big a change that is?  Now take a look at the diff[3].  They did change the entity name by versioning it.  That's version 3 for the same entity.

I could go dig other changes and bugs, on and on, like Bug 412387[4].  They simply wanted to change "..." with "…".  How big a change is that?  So somebody asked (comment #3) if a entity change was needed and the short and obvious answer is in comment #4 "It's something that localizers should be noticed on as well."

Please, be nice to us, to all the toolkits we localizers use.  If people can have up to three versions of the same string you'd probably could version yours for the first time.

It's the way to go, if you want to suggest another way, go ahead, but make sure you present it to the l10n list and people actually agree with you, that it will work.  So far the entity change works like a charm, and I don't see why is it so hard to version them.

Need something more recent, say "Bug 812762 - Use &brandShortName; instead of Firefox"[5].  You can start reading the commit that closed it the first time, start at comment #20.  Bug reopened and changes recommited.  Just to let you know, I hadn't noticed the changes until they changed the entity.

Why knowingly break things?

Eduardo.

[1] https://developer.mozilla.org/en-US/docs/Making_String_Changes
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=593126
[3] http://hg.mozilla.org/mozilla-central/rev/4e868145aa6b
[4] https://bugzilla.mozilla.org/show_bug.cgi?id=412387
[5] https://bugzilla.mozilla.org/show_bug.cgi?id=812762
Discussion on .l10n has clarified the situation for me and I now agree that the entity names need to change.  I'd also like to apologize for my earlier crankiness and absolutist tone; it's been pointed out to me privately that I've been doing that rather more than is justified, in general, lately.

I am not going to have time to write a patch until this weekend or early next week, but I'll take responsibility for making sure the change lands before the next Aurora branch, or failing that, is cherry-picked as soon as possible thereafter.
(In reply to Zack Weinberg (:zwol) from comment #15)
> Discussion on .l10n has clarified the situation for me and I now agree that
> the entity names need to change.

Thanks ;-)

I'm aware that the current system is far from good for developers, but that's all we got right now.
(In reply to Zack Weinberg (:zwol) from comment #15)
> I am not going to have time to write a patch until this weekend or early
> next week, but I'll take responsibility for making sure the change lands
> before the next Aurora branch, or failing that, is cherry-picked as soon as
> possible thereafter.

Prior please, for sake of localizers (who start work at the beginning of the Aurora cycle). Thanks!
Assignee: nobody → zackw
Attached patch patchSplinter Review
This has not been fully tested yet, but I thought I'd better get it into someone's review queue as quickly as possible, what with Aurora looming.
Attachment #728637 - Flags: review?(theo.chevalier11)
Comment on attachment 728637 [details] [diff] [review]
patch

Review of attachment 728637 [details] [diff] [review]:
-----------------------------------------------------------------

From a localizer point of view, it looks good. But I can't say about C++, though.

Thanks for your patch, Zack!
Attachment #728637 - Flags: review?(theo.chevalier11) → review+
Comment on attachment 728637 [details] [diff] [review]
patch

Review of attachment 728637 [details] [diff] [review]:
-----------------------------------------------------------------

Thanks, r=me.
Attachment #728637 - Flags: review+
I'm not going to get a chance to land this until tomorrow morning at the earliest.  Given the schedule crunch, please someone beat me to it.
Keywords: checkin-needed
Green on try, and manually confirmed to show the appropriate error messages when printing fails (failure simulated by code modification).

https://hg.mozilla.org/integration/mozilla-inbound/rev/e4f27cf77851
Keywords: checkin-needed
https://hg.mozilla.org/mozilla-central/rev/e4f27cf77851
Status: NEW → RESOLVED
Closed: 7 years ago
Resolution: --- → FIXED
Target Milestone: --- → mozilla22
Nothing serious, but I believe the localization note in the file is incorrect. The incorrectness should however be obvious to localizers, so you may not need to correct it. I just want to say it since it might be good for developers to know in the future.

>+# Printing error messages.
>+#LOCALIZATION NOTE: Some of these messages come in pairs, one
>+# for printing and one for print previewing.  You can remove that
>+# distinction in your language by removing the entity with the _PP
>+# suffix; then the entity without a suffix will be used for both.

If a localizer removes the _PP string the build process will copy the string from en-US, so the string will be shown in English, and the l10n dashboard will complain so the localizer cannot sign off for the release.

>+# You can also add that distinction to any of the messages that don't
>+# already have it by adding a new entity with a _PP suffix.

That would be possible, but the l10n dashboard would complain about an obsolete string until the localizer removes it.
Blocks: 862470
You need to log in before you can comment on or make changes to this bug.