Last Comment Bug 652050 - Some non-code change on 2011-04-21 caused four SVG reftests to fail on Android, due to "intl.accept_languages" being unset & "systemLanguage" checks always passing
: Some non-code change on 2011-04-21 caused four SVG reftests to fail on Androi...
Status: RESOLVED FIXED
[android][multi-locale][test marked a...
:
Product: Core
Classification: Components
Component: IPC (show other bugs)
: Trunk
: ARM Android
: P3 normal (vote)
: mozilla12
Assigned To: Daniel Holbert [:dholbert]
:
Mentors:
Depends on: 607854 679890
Blocks:
  Show dependency treegraph
 
Reported: 2011-04-21 22:45 PDT by Phil Ringnalda (:philor)
Modified: 2012-01-01 02:50 PST (History)
14 users (show)
See Also:
Crash Signature:
(edit)
QA Whiteboard:
Iteration: ---
Points: ---
Has Regression Range: ---
Has STR: ---


Attachments
reftest failure log (61.22 KB, text/plain)
2011-04-21 23:41 PDT, Daniel Holbert [:dholbert]
no flags Details
temporary trivial diagnostic #2: add printfs (1.10 KB, patch)
2011-04-22 13:36 PDT, Daniel Holbert [:dholbert]
doug.turner: review+
Details | Diff | Splinter Review
snippet of printfs in logcat (2.26 KB, text/plain)
2011-08-24 11:30 PDT, Daniel Holbert [:dholbert]
no flags Details

Description Phil Ringnalda (:philor) 2011-04-21 22:45:34 PDT
Typing that summary made me doubt it yet again, but tbpl seems to agree with me: independent of which push or what it did before, every mobile reftest-2 which started after 14:17 today (the last one before that having started at 11:04) has failed in conditions-01.svg, dynamic-conditions-01.svg, dynamic-switch-01.svg, and switch-01.svg, though I have no clue what failing those four particular tests means.

http://dev.philringnalda.com/tbpl/?tree=Mobile&rev=7cdcae5dee49 had a green R2, then one from the nightly which failed to start (and a third which I retriggered, which comes up later). After that, we had exceptions and timeouts and skips, then the fun range, http://dev.philringnalda.com/tbpl/?tree=Mobile&fromchange=4e01eac2e447&tochange=da6f72c9e672

The first thing in that range had a timeout the first time it ran R2, (at 11:20, because it got a much slower build than the next push), then the next push regressed three *different* SVG tests, without hitting the other four, in runs starting at 10:37 and 11:04, then the last push in that range started at 15:31, and hit both those three for the last time before the cause was backed out, and also these four. In between, because I was getting suspicious about the three, I retriggered a second run on 636763da2e9a, which started at 14:17 and did not hit the three (it was from the push before the cause of them), but did hit the four.

I missed seeing that at the time, so when I started getting suspicious about it being non-code, I went back further to the skipped and timed out one, all of which hit the four on my retriggers, and back to 7cdcae5dee49 which had been green before, but on a retrigger also hit the four, so independent of what code was in the build, any run starting after somewhere between 11:04 and 14:17 will fail those four tests, even if a run on the same build and same test package passed them before.
Comment 1 John Ford [:jhford] CET/CEST Berlin Time 2011-04-21 23:27:39 PDT
There was an android automation that went into production today

http://hg.mozilla.org/build/buildbotcustom/rev/a8404b0bcbe4
Comment 2 Daniel Holbert [:dholbert] 2011-04-21 23:41:37 PDT
Created attachment 527728 [details]
reftest failure log
Comment 3 Robert Longson 2011-04-21 23:42:29 PDT
The failure in the tests indicates that a system language of "foo" is now a match!. Is it possible to check one of the devices and see whether somehow Options->Content->Languages has foo in the list somehow?
Comment 4 Daniel Holbert [:dholbert] 2011-04-21 23:56:11 PDT
(Just to confirm what Robert said -- in all four failing tests, the problem appears to be the same -- a red rect with systemLanguage="foo" is rendering when we don't expect it to.)

As a quick diagnostic hack, we could potentially check in a cset to replace "foo" with "zomgzomg" or some other nonsense-string in one of the tests, and see if that fixes it.   (That would tell us whether we're matching "foo" in particular vs. matching everything somehow.)

Best to check the devices' languages directly as Robert suggested, though, if that's possible.
Comment 5 Daniel Holbert [:dholbert] 2011-04-22 09:41:06 PDT
(In reply to comment #4)
> As a quick diagnostic hack, we could potentially check in a cset to replace
> "foo" with "zomgzomg" or some other nonsense-string in one of the tests

Done -- tweaked the first failing test, conditions-01.svg:
   http://hg.mozilla.org/mozilla-central/rev/ee7fef67c370
We'll see if it passes now...
Comment 6 Daniel Holbert [:dholbert] 2011-04-22 11:26:57 PDT
Nope -- the first reftest run to complete with my tweak still failed conditions-01.svg, with the same broken rendering as in the attached reftest-log:
http://tinderbox.mozilla.org/showlog.cgi?log=Mobile/1303494555.1303496286.14526.gz

So on the test box, we're rendering this element:
> <rect systemLanguage="XXXdholbertThisIsNotALanguage" x="50" y="100" width="50" height="50" fill="red"/>

I don't know how we implement systemLanguage, but maybe the system API that we plug into is just unconditionally returning true?
Comment 7 Aki Sasaki [:aki] 2011-04-22 11:30:04 PDT
Hm, I wonder if the nightly automatically tested the multi or en-US apk.
If the multi, maybe the language selection bit failed back to en-US?

Then Philor's re-run might have been en-US, which failed because it doesn't have the multi language setting.

All just a guess, but maybe we should re-try the rev before the nightly.
Comment 8 Daniel Holbert [:dholbert] 2011-04-22 11:33:10 PDT
Aha! So we don't plug into a system API - we plug into the "intl.accept_languages" pref.  And if that pref is empty, then we accept all languages (including foo and XXXdholbertThisIsNotALanguage).  I can reproduce this locally if I clear that pref.

Reference:
http://mxr.mozilla.org/mozilla-central/source/content/svg/content/src/nsSVGFeatures.cpp?mark=221-222,225-225,228-230#218

So I'm guessing that pref is empty for some reason -- perhaps because these are yet-to-be-localized builds or something?
Comment 9 Daniel Holbert [:dholbert] 2011-04-22 11:36:52 PDT
The pref is defined in chrome://global/locale/intl.properties, btw.  So I think this is likely because it's an un-localized build.

I think we want to be testing localized builds, right? Otherwise, we can't test locale-dependent functionality like the "systemLanguage" attribute.
Comment 10 Aki Sasaki [:aki] 2011-04-22 11:39:28 PDT
I think we're testing en-US across the board.
There are no localized android builds-on-change; there are only multilocale nightlies.

Also, ignore comment 7, as we tested the en-US build in the green run.
Comment 11 Daniel Holbert [:dholbert] 2011-04-22 12:11:50 PDT
Hm, so the log in comment 6 does definitely seem to be using an en-US build (fennec-6.0a1.en-US.eabi-arm.apk).

I actually downloaded that exact build to my phone (based on the URL in the log), and I confirmed that
 - "intl.accept_languages" is set correctly
 - it works correctly on conditions-01.svg

So the fennec-6.0a1.en-US.eabi-arm.apk builds that are purportedly being run (according to the logs) seem to be fine - this appears to be due to other aspect of the configuration.

Also: I backed out the diagnostic from comment 5:
  http://hg.mozilla.org/mozilla-central/rev/f7019527f489
Comment 12 Daniel Holbert [:dholbert] 2011-04-22 12:23:14 PDT
(In reply to comment #1)
> There was an android automation that went into production today
> 
> http://hg.mozilla.org/build/buildbotcustom/rev/a8404b0bcbe4

Bear says that this change just made us start creating PID files.

Is there *anything* else that might have changed, automation-wise? (particularly l10n-related)

This point of philor's in particular seems to indicate that this is a regression in some part of our testing automation system:
> independent of what code
> was in the build, any run starting after somewhere between 11:04 and 14:17 will
> fail those four tests, even if a run on the same build and same test package
> passed them before.
Comment 13 Daniel Holbert [:dholbert] 2011-04-22 13:36:33 PDT
Created attachment 527848 [details] [diff] [review]
temporary trivial diagnostic #2: add printfs

Just to be absolutely sure about Comment 8, I propose we push this temporary diagnostic (adding 2 printf's) and then back it out once we've gotten reftest logs back.

I'm assuming we'll trigger the "intl.accept_languages is EMPTY" printf, but I think it's worth making absolutely sure, since the cause of this is remaining elusive.
Comment 14 Daniel Holbert [:dholbert] 2011-04-22 14:23:22 PDT
(pinged dougt for a quick rubber-stamp on the diagnostic, since I'd like to get this patch in & out today if possible.  Thanks dougt!)
Landed diagnostic #2: http://hg.mozilla.org/mozilla-central/rev/7eb2de708edb
Comment 15 Daniel Holbert [:dholbert] 2011-04-22 16:37:40 PDT
d'oh... apparently on Android, printf doesn't work, and we don't have any other way to print things on Gecko that end up appearing in the reftest logs.

At dougt's suggestion, I'm going to tweak Comment 14 from printf --> printf_stderr, and then I'll try to get the logcat output (where printf_stderr apparently shows up)
Comment 16 Aki Sasaki [:aki] 2011-04-22 18:15:24 PDT
[18:12]	<aki>	i'm going to guess that this is caused by installing a multilocale build and then an en-US build
[18:12]	<dholbert>	ok - that sounds reasonable
[18:12]	<aki>	not the crashing, the lang thing
[18:12]	<dholbert>	right
[18:13]	<aki>	it would be good to know how to clean up, other than reimaging

If that is the cause,
a) reliably cleaning up would be the best fix
b) reimaging/cleaning up one time, and then never ever running multilocale apks on the tegras, is another potential "fix"
Comment 17 Daniel Holbert [:dholbert] 2011-04-22 18:33:23 PDT
(In reply to comment #15)
> At dougt's suggestion, I'm going to tweak Comment 14 from printf -->
> printf_stderr, and then I'll try to get the logcat output (where printf_stderr
> apparently shows up)

Turns out we can't reliably get logcat info right now, because telnetting into a tegra while it's running reftests seems to crash it within seconds, and telneting in afterwards is impossible because it reboots (clearing its logcat buffer) 9 seconds after reftests complete. :(

I backed out the diagnostic printf from Comment 14 -- I'm just going to assume that Comment 8 was correct, since I apparently have no way of testing it :) -- and I marked these tests as failing-on-android for now, to stop the permaorange:
  http://hg.mozilla.org/mozilla-central/rev/3589bf459f85
Comment 18 Robert Longson 2011-04-23 00:30:29 PDT
You could write debug text directly into the reftests using a small lime font positioned within the red failing squares.
Comment 19 Robert Longson 2011-04-23 01:54:37 PDT
Of course you'd have to put something in the reftest with an unlikely to be duplicated id so that the backend could find it and manipulate it to do its easter egg style stuff. Too complicated perhaps?
Comment 20 Joel Maher ( :jmaher) 2011-04-23 05:30:23 PDT
since reftest uses messages manager to load pages, can we just send debug messages to message manager and listen for them in the reftest harness?  On Android, we print all output from the harness to a log file (inside the chrome extension).  It shouldn't be too hard to do that from inside a reftest as well.
Comment 21 Daniel Holbert [:dholbert] 2011-04-23 08:36:35 PDT
Prooobably too complicated / too much work given that we're already pretty sure about comment 8 & comment 16, and given that I have higher priority bugs I'm supposed to be working on anyway. :) (and given that this bug's perma-orange pain has now been silenced with known-fail annotations)

If someone else wants to do something along the lines of comment 19/20, though, feel free!
Comment 22 Mike Taylor [:bear] 2011-04-28 13:52:18 PDT
aki,

is this something that your still working on or should we bounce it over to ateam for their followup?
Comment 23 Chris Cooper [:coop] 2011-05-27 13:33:23 PDT
(In reply to comment #22)
> aki,
> 
> is this something that your still working on or should we bounce it over to
> ateam for their followup?

Aki: any follow-up here?
Comment 24 Aki Sasaki [:aki] 2011-05-27 13:51:58 PDT
(In reply to comment #23)
> (In reply to comment #22)
> > aki,
> > 
> > is this something that your still working on or should we bounce it over to
> > ateam for their followup?
> 
> Aki: any follow-up here?

Sorry about that, I must have missed the original comment.

To directly answer the question, I'm not actively working on this.

However, per comment 16, I think that having ever installed a multilocale apk on a tegra is the culprit here.

Bug 657413 may have fixed this, though someone needs to verify.  Assuming that uninstalling the previous apk removes the profile, and assuming that the previous multilocale apk install is the culprit, this bug should be fixed.

If uninstalling the previous apk doesn't remove the profile, we should probably manually nuke the profile every test run.

If a previous multilocale apk install is not the culprit, we're potentially back at square one here in determining the root cause.
Comment 25 Daniel Holbert [:dholbert] 2011-05-27 14:01:00 PDT
> Bug 657413 may have fixed this, though someone needs to verify.

The tests involved are marked as "fails-on-Android" since comment 17, so verification can consist of watching for orange UNEXPECTED-PASS in Android reftest runs on Mobile tbpl.

(If we don't see that, then this isn't fixed.)
Comment 26 cmtalbert 2011-07-05 10:19:04 PDT
(In reply to comment #25)
> > Bug 657413 may have fixed this, though someone needs to verify.
> 
> The tests involved are marked as "fails-on-Android" since comment 17, so
> verification can consist of watching for orange UNEXPECTED-PASS in Android
> reftest runs on Mobile tbpl.
> 
> (If we don't see that, then this isn't fixed.)

Assuming that hasn't happened, since there's been no traction in this bug.  So what's our state?  This is now a known failure?  The test harnesses themselves should clean up the profile we put on the device in their cleanup step.
Comment 27 Robert Longson 2011-07-05 14:13:06 PDT
We expect the profile to contain at least one language. It does on desktop and used to on mobile but now it doesn't. Options->Content->Languages.
Comment 28 Daniel Holbert [:dholbert] 2011-07-05 14:26:18 PDT
(In reply to comment #27)
> We expect the profile to contain at least one language. It does on desktop
> and used to on mobile but now it doesn't. Options->Content->Languages.

(To clarify -- as noted in comment 11, everything does seem to be fine when *manually* running our builds on a mobile phone.  This is something specific to running under our test-harness.)
Comment 29 Robert Longson 2011-07-05 14:32:27 PDT
What does intl.properties look like on the devices? Can we upload that file from one?
Comment 30 Robert Longson 2011-07-05 14:48:27 PDT
Although that file is probably shipped with each build downloaded to the device.
Comment 31 :Ehsan Akhgari 2011-08-17 13:05:48 PDT
I guess the obvious path to success is for somebody who's familiar with how to debug Fennec to set a breakpoint in nsSVGFeatures::PassesConditionalProcessingTests and see what happens there.
Comment 32 Daniel Holbert [:dholbert] 2011-08-17 13:28:59 PDT
(In reply to Ehsan Akhgari [:ehsan] from comment #31)
> I guess the obvious path to success is for somebody who's familiar with how
> to debug Fennec to set a breakpoint

Note: Per comment 11, I was unable to reproduce the test-failure locally (with the same build that the test box ran & failed on) -- so I'm not sure that local debugging will get anywhere.

Rather, I think we need to debug this _on the test boxes_ with added diagnostic output.  (which I tried, but gave up in comment 17 since at that point there was no reliable way to capture output from device during test-runs tests.  Maybe the situation has improved now?)
Comment 33 :Ehsan Akhgari 2011-08-17 14:14:46 PDT
(In reply to comment #32)
> (In reply to Ehsan Akhgari [:ehsan] from comment #31)
> > I guess the obvious path to success is for somebody who's familiar with how
> > to debug Fennec to set a breakpoint
> 
> Note: Per comment 11, I was unable to reproduce the test-failure locally (with
> the same build that the test box ran & failed on) -- so I'm not sure that local
> debugging will get anywhere.
> 
> Rather, I think we need to debug this _on the test boxes_ with added diagnostic
> output.  (which I tried, but gave up in comment 17 since at that point there
> was no reliable way to capture output from device during test-runs tests. 
> Maybe the situation has improved now?)

Can we get you access to a test slave?
Comment 34 Daniel Holbert [:dholbert] 2011-08-17 14:27:22 PDT
Sounds good to me - I'd be open to taking another look at this, given a dedicated test slave.  Aki, would you be able to loan me one? (or, do you know who I should talk to about that?)
Comment 35 Aki Sasaki [:aki] 2011-08-17 14:52:22 PDT
tegra-030. I'll file a bug for IT to get that to you.
Comment 36 Aki Sasaki [:aki] 2011-08-17 14:55:07 PDT
bug 679890.
Comment 37 Daniel Holbert [:dholbert] 2011-08-19 10:53:32 PDT
Ok, I've got the tegra running a locally-made debug fennec build, and I've figured out how to get it to run reftests.

So far, I've found:
 - I can reproduce (TEST-EXPECTED-FAIL) when running remotereftest.py
 - When I load the testcases directly, in my build on the device, I get correct/expected behavior.

So this is definitely some artifact of the test harness itself.
Comment 38 Daniel Holbert [:dholbert] 2011-08-19 11:30:20 PDT
I applied the diagnostic from Comment 14 locally (with s/printf/printf_stderr/), and I verified in logcat that intl.accept_languages is empty in the reftest process.
Comment 39 Daniel Holbert [:dholbert] 2011-08-19 17:40:13 PDT
One more data point: these properties (all strings-valued, all defined in all.js) all are empty during the reftest process, too:
  network.proxy.no_proxies_on
  intl.charsetmenu.browser.more1
  intl.accept_languages
  intl.menuitems.alwaysappendaccesskeys
  intl.menuitems.insertseparatorbeforeaccesskeys
  intl.charsetmenu.mailedit
  intl.charset.detector
  font.language.group

But these three are nonempty:
  intl.charsetmenu.browser.static
  intl.charset.default
  intl.ellipsis

The first nonempty one (intl.charsetmenu.browser.static) stood out to me in particular, since it's got the same value (a properties pointer) in all.js as intl.accept_languages:
> 1079 pref("intl.accept_languages",               "chrome://global/locale/intl.properties");
[...]
> 1082 pref("intl.charsetmenu.browser.static",     "chrome://global/locale/intl.properties");

HOWEVER: it's actually overridden by a different value in mobile.js:
> pref("intl.charsetmenu.browser.static", "chrome://browser/locale/browser.properties");

And if I tweak that mobile.js line, to match the line from all.js, then that (.static) is reported as being empty, too.

So: It appears that the prefs service is able to get prefs from  chrome://browser/locale/browser.properties, but not from chrome://global/locale/intl.properties, for some reason.  (note the "//browser/" vs "//global/" difference)
Comment 40 Daniel Holbert [:dholbert] 2011-08-19 17:42:02 PDT
The other two nonempty prefs that I pointed out (intl.charset.default &   intl.ellipsis ) both point to "chrome://global-platform/locale/intl.properties" -- so our reftest process has access to that, too.
Comment 41 Daniel Holbert [:dholbert] 2011-08-23 11:44:28 PDT
Sorry, disregard Comment 39/40 -- I was using Preferences::GetLocalizedString() to look up all those prefs, which only makes sense for prefs that are indirectly set via a chrome:// URI.

I've got some more-useful data, though -- it appears that we're failing due to an early return in nsStringBundle::LoadProperties().  Specifically, that function thinks it's failed in the past, so it returns early -- BUT if I let it continue, it works fine!  I commented it out here on TryServer, and it passed the 4 SVG tests in question:
  http://hg.mozilla.org/try/rev/c415cfcdcdfc

(nsStringBundle::LoadProperties is the function that we use to actually read chrome://global/locale/intl.properties, BTW.)


Investigating more...  Getting close now, I think.
Comment 42 Daniel Holbert [:dholbert] 2011-08-23 15:05:09 PDT
OK - so the nsStringBundle::LoadProperties call ultimately ends up relying on a call to nsChromeRegistryContent::GetFlagsFromPackage("global"), which fails the first time we call it, because that object's 'mPackagesHash' is still empty.

A bit later on, we register a bunch of packages with the nsChromeRegistryContent.  However, we've already failed our first nsStringBundle::LoadProperties call, so we insist on making all future calls fail.

I'm guessing the package-registration is supposed to happen before pretty much anything else.  But we're somehow getting an earlier call to nsStringBundle::LoadProperties, which is fails, and that failure is sticky.
Comment 43 Alon Zakai (:azakai) 2011-08-24 11:09:36 PDT
Looking at this with dholbert, it appears the issue is that ContentParent launches the child process, then immediately fires off SendRegisterChrome(..). However the child tries to access the registry *before* that message is received and processed - looks like it loads XPCOM services and during that it looks for the registry.

I am not sure if there is a way for our IPC code to guarantee that something is received and handled before anything else happens in the child process (or if it even makes sense to do that). Perhaps the best solution is not to rely on IPC for this? We can send the information for SendRegisterChrome(..) by putting it in a file or in the environment if it's small enough, both of which would not rely on message timing.
Comment 44 Daniel Holbert [:dholbert] 2011-08-24 11:30:28 PDT
Created attachment 555467 [details]
snippet of printfs in logcat

Here's a snippet of printfs-in-logcat output, fleshing out comment 42 / comment 43 a bit.

It shows the content parent spawning the child process and calling 
SendRegisteredChrome -- but before the child receives that message, it hits nsHttpHandler::Init, which tries to access the intl.accept_languages pref, and that fails (at the line flagged "***BADNESS***") because we haven't received its properties-bundle yet.  (That happens in the final line of the attached log, inside of RecvRegisterChrome)
Comment 45 Daniel Holbert [:dholbert] 2011-08-24 11:33:46 PDT
(In reply to Daniel Holbert [:dholbert] from comment #44)
> and that fails (at the line flagged "***BADNESS***") because we haven't
> received its properties-bundle yet.
(er, by "properties-bundle" I meant "package", I guess; that is, the discrete bundles of prefs stored by nsChromeRegistryContent, registered in nsChromeRegistryContent::RegisterPackage)
Comment 46 Alon Zakai (:azakai) 2011-08-24 11:35:48 PDT
bsmedberg: would you be ok with the file solution in comment 43? Or is there a better way to get information to the child process that is guaranteed to arrive before the child process starts running things?
Comment 47 Alon Zakai (:azakai) 2011-08-24 11:36:44 PDT
(sorry, forgot the cc for comment 46)
Comment 48 Josh Matthews [:jdm] (away until 9/3) 2011-08-24 12:46:45 PDT
Actually, couldn't we just put off accessing the pref? That seems like a much easier solution to me.
Comment 49 Alon Zakai (:azakai) 2011-08-24 14:18:47 PDT
That might or might not be easier, I'm not sure, but it sounds like a workaround. In my opinion changing things to ensure the proper delivery order sounds better. And transferring this data via a file sounds very straightforward. Unless of course this is already the right delivery order, and accessing the pref that early is in violation of how things are expected to work.
Comment 50 John O'Duinn [:joduinn] (please use "needinfo?" flag) 2011-09-07 10:19:49 PDT
No update here in over 2 weeks - any progress on this android_tier_1 bug?
Comment 51 Daniel Holbert [:dholbert] 2011-09-07 10:29:13 PDT
No progress -- I was waiting on an answer to Comment 46 before proceeding.

(To be clear, this bug doesn't affect any "real" users on Android as far as we can tell; it only seems to reproduce when firefox is run in reftest-mode.)
Comment 52 Benjamin Smedberg [:bsmedberg] 2011-09-07 10:34:19 PDT
It's kindof a hack that we can't just send that data when we set things up: I think our good options here are:

* send the data when we create the channel (no IPDL support for this, though)
* don't start anything (XPCOM/etc) until the RegisterChrome message has been received (probably pretty easy)
* don't read the pref until later (also pretty easy)

Transferring the data via file sounds pretty horrid to me, actually, since it introduces yet another communication mechanism.
Comment 53 Daniel Holbert [:dholbert] 2011-09-07 10:45:27 PDT
(In reply to Benjamin Smedberg  [:bsmedberg] from comment #52)
> * don't start anything (XPCOM/etc) until the RegisterChrome message has been
> received (probably pretty easy)

This seems like the most robust & future-proof strategy to me. (since it seems like this could crop up with other prefs, too, at some point)

Is there some master "start xpcom/etc" function that we could move inside of ContentChild::RecvRegisterChrome?
Comment 54 Josh Matthews [:jdm] (away until 9/3) 2011-09-07 10:49:37 PDT
We'll want to make sure that we don't regress first-content-tab loading time with this change. We might be safe because we pre-init the content process on the start screen, IIRC.
Comment 55 Brad Lassey [:blassey] (use needinfo?) 2011-10-05 10:55:29 PDT
dhobert, what's the status of this bug? are you still working on it?
Comment 56 Daniel Holbert [:dholbert] 2011-10-05 11:12:47 PDT
I've been working on other bugs while tentatively waiting on a response to comment 53, to implement bsmedberg's second suggestion. (been intending to bug someone more actively about it)

(If anyone more familiar with the mobile startup / cross-process code is interested in taking this bug, I'm happy to have it stolen from me. :) I originally took it because it was SVG-related, but it looks like the fix isn't SVG-related at all & is in code that I'm less familiar with.  Otherwise, I can follow up on this in the next week or two, though.)
Comment 57 Robert Longson 2011-12-31 03:17:03 PST
pushed https://hg.mozilla.org/integration/mozilla-inbound/rev/0341759ffb4c to mark tests as passing now.
Comment 58 Daniel Holbert [:dholbert] 2011-12-31 04:14:53 PST
(In reply to Robert Longson from comment #57)
> pushed https://hg.mozilla.org/integration/mozilla-inbound/rev/0341759ffb4c
> to mark tests as passing now.

Oh, awesome!  Hmm... So that was from Bug 607854 landing, which (among other things) changed how we check for SVG requiredFeatures & systemLanguage, which is the way that this bug ended up tickling the underlying android issue.

IIUC, it seems like we might still have the underlying problem (potential race condition with pref-reading at startup, in lightweight environments like the reftest harness), and it's just not exposed by these SVG tests anymore.

On the other hand, if it's not causing problems with automated tests anymore (and doesn't affect actual users -- which it doesn't, as of comment 51), then I'm more than happy to tentatively consider this fixed-ish.
Comment 59 Robert Longson 2011-12-31 04:37:44 PST
(In reply to Daniel Holbert [:dholbert] from comment #58)
> IIUC, it seems like we might still have the underlying problem (potential
> race condition with pref-reading at startup, in lightweight environments
> like the reftest harness), and it's just not exposed by these SVG tests
> anymore.

That is indeed still true.
Comment 60 Phil Ringnalda (:philor) 2011-12-31 19:06:04 PST
https://hg.mozilla.org/mozilla-central/rev/0341759ffb4c

For some reason, we still don't have a DONTCAREUNLESSITBITESMEAGAIN resolution, settling for FIXED.
Comment 61 Daniel Holbert [:dholbert] 2012-01-01 02:50:31 PST
(In reply to Phil Ringnalda (:philor) from comment #60)
> For some reason, we still don't have a DONTCAREUNLESSITBITESMEAGAIN
> resolution, settling for FIXED.

Haha. :) Restoring an earlier bug-summary, to be clear about what's actually FIXED here.

Note You need to log in before you can comment on or make changes to this bug.