1388789 - Tests for TextFormatter to validate the usefulness of compare-locales checks

Reporter

Description

8 years ago

In bug 890900, we'd like to review the assumptions on what's bad and what's OK for compare-locales and TextFormatter, aka bundle.format... To do that usefully, I would like to have tests that reproduce the problems. gtest works for that, thanks to Ted for the tip. My C++-fu is weak, in particular when it comes to which kinds of literals we're allowed to use in our code right now, so I'll need feedback on that.

Axel Hecht [:Pike]

Reporter

Comment 3

8 years ago

I've pushed my latest version to try, and I'm puzzled. https://treeherder.mozilla.org/#/jobs?repo=try&revision=41de7b5fa692d8686b87c230424a695421ebec7b&filter-searchStr=opt%20gtest shows that only one platform has these tests passing. Weirdly enough, it's not the platform I'm on: I'm building on mac, and the tests pass locally, but mac on try fails. Also, the test failures differ per platform. Is anyone here able to help with how to phrase this as a reliable test?

Also, one test is hitting a MOZ_ASSERT. I do want to include that pattern in our tests, but I need something that lets me run one variant expecting a crash, and one variant expecting something other than a properly formatted string. Do we have good defines for that? I'd look for MOZ_NOT_GOING_TO_ASSERT_IN_TEST or so ;-)

I'm also interested in general feedback on how bad my C++ is here; I wouldn't be surprised if some of the platform dependencies are due to that. Not that I know why a float is a good pointer to a string on some platforms.

Nathan Froyd [:froydnj]

Comment 4

8 years ago

I don't understand why you're trying to write the death tests. I guess the ones trying to pass off -1 as a char* are not that bad, as -1 is virtually guaranteed to be a bad address. But with trying to pass off 3.5 as a char*, the bits of 3.5 interpreted as a pointer likely point to some reasonable address, and depending on the platform and the contents of memory, you're going to try and read a very large string from that address. Those death tests look like the only problematic ones...so can you explain the rationale behind including those tests?

Flags: needinfo?(l10n)

Axel Hecht [:Pike]

Reporter

Comment 5

8 years ago

Background to this bug is that once upon a time, '%1$S is %1$S' used to trigger fatal errors in nsTextFormatter, and now it does, and we never updated compare-locales to allow that.

Which supports that the number of fatal error cases in nsTextFormatter isn't static, and I'd like to have a full list of those and tests for them, to ensure that once someone fixes an error condition, they fail the test, and thus read the comment in the test to tell me to fix the l10n infrastructure.

Thus I'd like to have automated tests for all cases where something that a localizer might do generates a problem at runtime, whether it's a crash or just an empty string or something else. And I'd like that list to be exhaustive, 'cause, as I put it, localization is fuzz testing.

OTH, I want to make sure that all the things I forbid in the l10n toolchain actually isn't working. So that we don't end up again in situations where en-US uses a code pattern, and l10n can't because compare-locales rips the strings out at build time. As it does for errors in l10n.

On my choice of -1 as integer, that's a vague test, and mostly is there to show the difference between signed and unsigned formatters. The more frequent way to get this wrong is to mistake a %d for a %S, and then the code passes in 3. And then crash. Or bewildering other things.

PS: mail and suite strings in comm-central are the most prominent offenders these days, but I don't want to rely too much on "it's just mail" ;-) . mozilla-central code generally only goes through nsIStringBundle, which only supports unicode strings anyway. %S -> %d is bad, but less bad than the other way around.
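To make the kind of check under discussion concrete, here is a minimal C++ sketch of comparing the format directives of an en-US string against a localized one. It is not how compare-locales is implemented (that tool is a separate code base); the names `CollectDirectives` and `LocalizationIsCompatible` are hypothetical, flags and widths are deliberately unsupported, and letting a localization omit an argument is an assumption of this sketch rather than documented policy.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <map>
#include <optional>
#include <string>

// Maps a 1-based argument position to the conversion character used for it.
using Directives = std::map<int, char>;

// Hypothetical helper: collects directives such as %S, %d or %1$S.
// Returns std::nullopt for malformed input, unsupported conversions, or a
// position that is used with two different conversions.
std::optional<Directives> CollectDirectives(const std::string& fmt) {
  Directives found;
  int next = 1;  // position assigned to directives without an explicit "N$"
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] != '%') continue;
    if (++i == fmt.size()) return std::nullopt;  // trailing '%'
    if (fmt[i] == '%') continue;                 // literal percent sign
    int pos = 0;
    std::size_t j = i;
    while (j < fmt.size() && std::isdigit(static_cast<unsigned char>(fmt[j]))) {
      if (pos > 99) return std::nullopt;  // keep positions small, avoid overflow
      pos = pos * 10 + (fmt[j] - '0');
      ++j;
    }
    if (j < fmt.size() && fmt[j] == '$' && pos > 0) {
      i = j + 1;  // explicit position, conversion character follows
    } else {
      pos = next++;  // sequential argument; a width would be rejected below
    }
    if (i >= fmt.size()) return std::nullopt;
    const char conv = fmt[i];
    if (std::string("dusSc").find(conv) == std::string::npos) return std::nullopt;
    auto [it, inserted] = found.emplace(pos, conv);
    if (!inserted && it->second != conv) return std::nullopt;
  }
  return found;
}

// Hypothetical check: every argument the localization uses must exist in the
// reference string with the same conversion. Repeating an argument is fine.
bool LocalizationIsCompatible(const std::string& reference,
                              const std::string& localized) {
  const auto ref = CollectDirectives(reference);
  const auto l10n = CollectDirectives(localized);
  if (!ref || !l10n) return false;
  for (const auto& [pos, conv] : *l10n) {
    const auto it = ref->find(pos);
    if (it == ref->end() || it->second != conv) return false;
  }
  return true;
}

// Small usage example.
inline void DirectiveExamples() {
  assert(LocalizationIsCompatible("%1$S is %1$S", "%1$S = %1$S"));
  assert(LocalizationIsCompatible("%S has %d", "%2$d: %1$S"));
  assert(!LocalizationIsCompatible("%d items", "%S items"));
}
```

The last example is the %d versus %S mix-up described above: a check of this shape can only flag it at build time, it says nothing about what nsTextFormatter does with such a string at runtime, which is what the gtests are meant to pin down.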

Flags: needinfo?(l10n)

Nathan Froyd [:froydnj]

Comment 6

8 years ago

(In reply to Axel Hecht [:Pike] from comment #5)
> Background to this bug is that once upon a time, '%1$S is %1$S' used to
> trigger fatal errors in nsTextFormatter, and now it does, and we never
> updated compare-locales to allow that.

I'm assuming you meant "...and now it doesn't..." here.

> Which supports that the number of fatal error cases in nsTextFormatter isn't
> static, and I'd like to have a full list of those and tests for them, to
> ensure that once someone fixes an error condition, they fail the test, and
> thus read the comment in the test to tell me to fix the l10n infrastructure.

This is a noble goal.

> Thus I'd like to have automated tests for all cases where something that a
> localizer might do generates a problem at runtime, whether it's a crash or
> just an empty string or something else. And I'd like that list to be
> exhaustive, 'cause, as I put it, localization is fuzz testing.
>
> OTH, I want to make sure that all the things I forbid in the l10n toolchain
> actually isn't working. So that we don't end up again in situations where
> en-US uses a code pattern, and l10n can't because compare-locales rips the
> strings out at build time. As it does for errors in l10n.
>
> On my choice of -1 as integer, that's a vague test, and mostly is there to
> show the difference between signed and unsigned formatters. The more
> frequent way to get this wrong is to mistake a %d for a %S, and then the
> code passes in 3. And then crash. Or bewildering other things.
>
> PS: mail and suite strings in comm-central are the most prominent offenders
> these days, but I don't want to rely too much on "it's just mail" ;-) .
> mozilla-central code generally only goes through nsIStringBundle, which only
> supports unicode strings anyway. %S -> %d is bad, but less bad than the
> other way around.

I don't have a good sense of how all the l10n infrastructure works, which might be causing part of my confusion. But my impression is that all of the l10n stuff is coming from JS, where you can really only pass in strings anyway. So the sort of errors you'd want to catch there are "the format string contains a format directive that's not %s or %S". You're testing for very different sorts of errors with these gtest death tests. You can't even get into the situation you cite above where somebody mistakes %d for %S.

Tests like these are also very weird:

  DisableCrashReporterForTextFormatter();
  NS_NAMED_LITERAL_STRING(fmt, "%d");
  nsString out;
  // just for completeness, this is our format, and works
  out.Adopt(nsTextFormatter::smprintf(fmt.get(), int(-1)));
  EXPECT_STREQ("-1", NS_ConvertUTF16toUTF8(out).get());
  out.Adopt(nsTextFormatter::smprintf(fmt.get(), (uint32_t)-1));
  EXPECT_STRNE("360999", NS_ConvertUTF16toUTF8(out).get()); // ???
  out.Adopt(nsTextFormatter::smprintf(fmt.get(), float(3.5)));
  EXPECT_STRNE("3.5", NS_ConvertUTF16toUTF8(out).get()); // ???

What are these sorts of tests even trying to accomplish in the context of the way l10n uses nsTextFormatter? (I don't even know where this l10n code you're referring to lives; all I see is the call from nsStringBundle, and that is only subject to the %s or %S problem, and maybe some problems around positional arguments.)

Nathan Froyd [:froydnj]

Comment 7

8 years ago

If you want some sort of type-safe API, so bad things don't happen when somebody passes floats for %d or integers for %s, then the right thing to do is to have something that parses the format string and checks that the arguments you're going to pass in actually match up with what the format directives are requesting.
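A minimal sketch of such a checker, assuming C++17 and plain `std::string` formats rather than the char16_t strings nsTextFormatter takes. `ArgKind`, `KindOf`, `ParseDirectives` and `FormatMatchesArgs` are hypothetical names for illustration only, positional `%1$S` arguments are not handled, and nothing here reflects the actual nsTextFormatter code.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical argument categories, not nsTextFormatter's real types.
enum class ArgKind { Int, Double, String, Char };

// Classifies a C++ argument type at compile time.
template <typename T>
constexpr ArgKind KindOf() {
  using U = std::decay_t<T>;
  if constexpr (std::is_same_v<U, char>) {
    return ArgKind::Char;
  } else if constexpr (std::is_integral_v<U>) {
    return ArgKind::Int;
  } else if constexpr (std::is_floating_point_v<U>) {
    return ArgKind::Double;
  } else {
    static_assert(std::is_convertible_v<U, std::string>,
                  "unsupported argument type");
    return ArgKind::String;
  }
}

// Parses sequential directives. Returns false for a trailing '%' or an
// unrecognized conversion character.
bool ParseDirectives(const std::string& fmt, std::vector<ArgKind>& out) {
  static const std::string kModifiers = "-+ #0123456789.lh";
  for (std::size_t i = 0; i < fmt.size(); ++i) {
    if (fmt[i] != '%') continue;
    if (++i == fmt.size()) return false;  // trailing '%'
    if (fmt[i] == '%') continue;          // literal percent sign
    // Skip flags, width, precision and length modifiers.
    while (i < fmt.size() && kModifiers.find(fmt[i]) != std::string::npos) ++i;
    if (i == fmt.size()) return false;
    switch (fmt[i]) {
      case 'd': case 'i': case 'u': case 'x':
        out.push_back(ArgKind::Int); break;
      case 'f': case 'e': case 'g':
        out.push_back(ArgKind::Double); break;
      case 's': case 'S':
        out.push_back(ArgKind::String); break;
      case 'c':
        out.push_back(ArgKind::Char); break;
      default:
        return false;
    }
  }
  return true;
}

// True if the directives in `fmt` match the given arguments in number,
// order and kind. The argument values themselves are never read.
template <typename... Args>
bool FormatMatchesArgs(const std::string& fmt, const Args&...) {
  std::vector<ArgKind> wanted;
  if (!ParseDirectives(fmt, wanted)) return false;
  const std::vector<ArgKind> given = {KindOf<Args>()...};
  return wanted == given;
}

// Small usage example.
inline void CheckerExamples() {
  assert(FormatMatchesArgs("%S is %d", "count", 3));
  assert(!FormatMatchesArgs("%d", 3.5f));  // float passed for %d
  assert(!FormatMatchesArgs("%S", 3));     // integer passed for %S
}
```

A formatter could run a check like this before touching its arguments and fall back to an error string instead of reinterpreting a float or integer as a pointer.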

Axel Hecht [:Pike]

Reporter

Comment 8

8 years ago

Some use cases from comm-central:

https://dxr.mozilla.org/comm-central/source/mailnews/import/outlook/src/nsOutlookImport.cpp#346 using %S and %d
https://dxr.mozilla.org/comm-central/source/mailnews/imap/src/nsImapMailFolder.cpp#1608 uses %c

Using code search for call-sites like this is sadly tedious, because most usage of nsTextFormatter is actually non-l10n, but not all of it.

Nathan Froyd [:froydnj]

Comment 9

8 years ago

OK, yeah, for such usages, we'd want something more like what comment 7 suggests. I'm not super excited about including tests in the style of comment 6.

Axel Hecht [:Pike]

Reporter

Comment 10

8 years ago

We will have a type-safe API for l20n. Right now, I need to deal with the fact that as long as nsTextFormatter is in the code-base, there will be developers feeding strings from localizations to it. And localizers will try to pass in garbage that we need to fix at some point in the build chain, and report early to ask them to not do that. That's why I need reliable checks for the status quo. I can't just decide that I want the world to turn the other way around. If we can only make this reliable for one platform, well, so be it. If we need to use constructed funky values for our params, that's fine, too.

Nathan Froyd [:froydnj]

Comment 11

8 years ago

(In reply to Axel Hecht [:Pike] from comment #10)
> Right now, I need to deal with the fact that as long as nsTextFormatter is
> in the code-base, there will be developers feeding strings from
> localizations to it. And localizers will try to pass in garbage that we need
> to fix at some point in the build chain, and report early to ask them to not
> do that.
>
> That's why I need reliable checks for the status quo. I can't just decide
> that I want the world to turn the other way around.

Adding a checker like comment 7 describes, and then using it in appropriate places in the C++ codebase, doesn't seem at all like turning the world around. You don't even have to change the JS API or ask localizers to do anything different.

Axel Hecht [:Pike]

Reporter

Comment 12

8 years ago

So, in layman terms, rewrite nsTextFormatter to be type safe, and mass-rewrite mozilla-central and comm-central to use the new code instead of the old? Any idea who could lift that, and in which timeframe?

Axel Hecht [:Pike]

Reporter

Comment 13

8 years ago

(In reply to Axel Hecht [:Pike] from comment #12)
> So, in layman terms, rewrite nsTextFormatter to be type safe, and
> mass-rewrite mozilla-central and comm-central to use the new code instead of
> the old?
>
> Any idea who could lift that, and in which timeframe?

Nathan?

Flags: needinfo?(nfroyd)

Nathan Froyd [:froydnj]

Comment 14

8 years ago

I could, tromey probably could too, but I highly doubt it'd be for 57, unless we wanted to try and uplift it while 57 is in beta. Maybe 58, more likely 59, and that's speaking only for myself.

(In reply to Axel Hecht [:Pike] from comment #10)
> We will have a type-safe API for l20n.

When is this coming?

Flags: needinfo?(nfroyd)

Zibi Braniecki [:zbraniecki][:gandalf]

Comment 15

8 years ago

(In reply to Nathan Froyd [:froydnj] from comment #14)
> (In reply to Axel Hecht [:Pike] from comment #10)
> > We will have a type-safe API for l20n.
>
> When is this coming?

Our current plan is to start enabling it around November targeting 59, but I cannot yet predict when we'll be able to complete the migration away from the old formats.

bug 1388789, add tests for nsTextFormatter to reproduce badness that can come from l10n; 8 years ago, Axel Hecht [:Pike], 59 bytes, text/x-review-board-request
Bug 1388789 - change return values of nsTextFormatter::vs{s,v}printf; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - remove prio.h include from nsTextFormatter.h; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - replace hex strings with static arrays; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - handle unrecognized escapes in nsTextFormatter; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - make nsTextFormatter runtime type-safe; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - use nsTextFormatter::ssprintf in more places; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - normalize null string handling in nsTextFormatter; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - clean up \0 emission in nsTextFormatter; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - make va_list nsTextFormatter private; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, froydnj : review+
Bug 1388789 - fix invalid format in appcacheutils.properties; 8 years ago, Tom Tromey :tromey, 59 bytes, text/x-review-board-request, pbro : review+ flod : review+