Closed Bug 1085335 Opened 10 years ago Closed 10 years ago

[zh-CN] Disconnect Error on Windows XP due to missing East Asian Language pack

Categories

(Mozilla QA Graveyard :: Infrastructure, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: andrei, Assigned: teodruta)

Details

(Keywords: reproducible, Whiteboard: [sprint])

Attachments

(3 files)

Attached file xp-zh-cn-log.txt
Branches:  35
Platforms: WinXP

This was a bit harder to track.

Only zh-CN on Windows XP is affected. Appears to be reproducible (rebuilt an affected testrun twice on CI and it failed both times). 

36 is probably affected but we don't run localised Nightly builds.

To see the failures sort by locale or OS version. Affected runs report either 6 or 2 tests passed.
http://mozmill-daily.blargon7.com/#/functional/reports?app=All&branch=35.0&platform=Win&from=2014-10-16&to=2014-10-20

Sample report: http://mozmill-daily.blargon7.com/#/functional/report/447e3274a76247e6567e1c2aa6065b95

Neither the dashboard report nor the console log (see attached) shows anything interesting. I've watched one of the rebuilt runs.

Failures have been on most WinXP machines:
- mm-win-xp-32-1
- mm-win-xp-32-2
- mm-win-xp-32-3
When you rebuilt the job, did you actually watch the test running on the affected node? Can you reproduce outside of CI by running it manually? I would like to get some debug info.
Flags: needinfo?(andrei.eftimie)
(In reply to Henrik Skupin (:whimboo) from comment #1)
> When you rebuilt the job, did you actually watch the test running on the
> affected node? Can you reproduce outside of CI by running it manually? I
> would like to get some debug info.

Yes, I watched the job (the second build). Nothing strange going on.
After the second test Firefox closed and didn't reopen. Timed out after 60 seconds.
Flags: needinfo?(andrei.eftimie)
I assume this is closely related to bug 974971. Can you reproduce it consistently? I would be interested in what a testrun looks like with the current Mozmill 2.1 code base, and in the detailed disconnect failure description.
I'll check.
Flags: needinfo?(andrei.eftimie)
(In reply to Andrei Eftimie from comment #4)
> I'll check.

I am unable to reproduce the issue.
Latest occurrences are twice on 2014-10-20 and once on 2014-10-25.
Flags: needinfo?(andrei.eftimie)
Oh, we really need to investigate this properly.
I see reports also on 34 and 31ESR
Assignee: nobody → teodor.druta
Status: NEW → ASSIGNED
Whiteboard: [sprint]
Firefox 33.0b1 build 1 is affected.
Firefox 33.1 build 3 is not affected.
This also seems to be an issue in zh-TW locale builds, and maybe all zh-* locales are affected:
http://mm-ci-production.qa.scl3.mozilla.com:8080/job/release-mozilla-beta_functional/29149/console

I found these revision changes:
http://hg.mozilla.org/l10n-central/zh-TW/rev/eac97260bfd1 from October 12th, from bug 603549.
http://hg.mozilla.org/l10n-central/zh-TW/rev/cc7924e17ee0 from October 12th, from bug 443588.
These revision changes may be related to our failures.

I think this failure may be related to the font family "MingLiU", which some components use in zh-* locales and which, according to Microsoft's web page, is not supported on Windows XP:
http://www.microsoft.com/typography/fonts/font.aspx?FMID=1600

Also this failure is reproducible only when the testrun is built with jenkins on the staging and production machines.
Depends on: 603549
CC'ing Axel and Francesco who may have further information for us.
The changesets from comment 9 are from 2010, so this is not a regression from them.
Teodor, please run the mozmill tests again, ideally in debug mode. Do we always fail in the same test? It would be good to know which data is getting transferred from Firefox to the Python code.
No longer depends on: 603549
This failure is not reproducible in local testruns nor mozmill runs, it can only be reproduced when the testrun is built with jenkins. 
From my observations there are two testcases when this bug is reproducible with zh-CN and zh-TW builds:
1. after running and succeeding "restartTests/testAddons_changeTheme/test2.js" and it crashes before running "restartTests/testAddons_changeTheme/test3.js" 
2. after running and succeeding "restartTests/testAddons_enableDisableExtension/test3.js" and it crashes before running "restartTests/testAddons_enableDisableExtension/test4.js".
(In reply to Teodor Druta from comment #13)
> This failure is not reproducible in local testruns nor mozmill runs, it can
> only be reproduced when the testrun is built with jenkins.

What about the testrun_* script? The above observation doesn't yet imply that triggering the job via Jenkins is at fault.
I couldn't reproduce it with the testrun_functional script on the same Windows XP staging machine where it fails when built and run with Jenkins, using exactly the same Firefox build.
So check the environment variables as set when the job was run. That might let you reproduce the failure with the plain testrun script.
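A minimal sketch of such a comparison, assuming the environment of the Jenkins job and of the manual run have each been saved as `NAME=value` lines (for example from the output of `set` on Windows). The function names are hypothetical and not part of mozmill-ci:

```python
def parse_env_dump(text):
    """Parse 'NAME=value' lines into a dict, skipping lines without '='."""
    env = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if sep and name.strip():
            # Windows treats variable names case-insensitively,
            # so normalize them before comparing.
            env[name.strip().upper()] = value.strip()
    return env


def diff_env(jenkins_env, manual_env):
    """Return (only in Jenkins, only in manual run, differing values)."""
    only_jenkins = {k: v for k, v in jenkins_env.items() if k not in manual_env}
    only_manual = {k: v for k, v in manual_env.items() if k not in jenkins_env}
    changed = {
        k: (jenkins_env[k], manual_env[k])
        for k in set(jenkins_env) & set(manual_env)
        if jenkins_env[k] != manual_env[k]
    }
    return only_jenkins, only_manual, changed
```

Any variable that shows up in one of the three results is a candidate to set by hand before running the plain testrun script.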
I tried to reproduce this bug on a local Jenkins instance. It crashes there too, but between different tests than on staging and production.
On the staging/production machines it specifically crashes between:
1. "restartTests\testAddons_changeTheme\test2.js" => "restartTests\testAddons_changeTheme\test3.js"
2. "restartTests\testAddons_enableDisableExtension\test3.js" => "restartTests\testAddons_enableDisableExtension\test4.js" (beta)
On my local Jenkins instance, with a Windows XP SP3 slave, it crashes between:
1. "restartTests\testAddons_uninstallTheme\test2.js" => "restartTests\testAddons_uninstallTheme\test3.js"

I think it may have something to do with the persisted object used in those tests.
I managed to fix this issue on my local Windows XP machine by installing the files for East Asian languages.

Steps to install them:
1. Click on "Start" button
2. Click on "Control Panel" in the newly opened start menu
3. Select "Date, Time, Language, and Regional Options" in the control panel
4. Open "Regional and Language Options"
5. Select "Languages" tab
6. Check the "Install files for East Asian Languages" checkbox

It requires some files from the Windows XP install CD.
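To check whether a machine already has these files, a rough sketch could compare the contents of the fonts folder against a list of expected font files. The file names below are an assumption (fonts commonly shipped with the East Asian language files on Windows XP), and the function name is hypothetical:

```python
# Assumption: font files commonly installed with the East Asian
# language files on Windows XP. Adjust the list as needed.
EXPECTED_FONT_FILES = ("simsun.ttc", "mingliu.ttc", "msgothic.ttc", "gulim.ttc")


def missing_east_asian_fonts(installed_names, expected=EXPECTED_FONT_FILES):
    """Return the expected font files that are absent from installed_names."""
    # File names on Windows are case-insensitive, so compare in lower case.
    present = {name.lower() for name in installed_names}
    return [name for name in expected if name.lower() not in present]
```

On a node this could be called as `missing_east_asian_fonts(os.listdir(r"C:\WINDOWS\Fonts"))` after `import os`; an empty result suggests the files are in place.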
Good find! When checking the screenshot I feel that even the other languages for complex scripts should be installed. Are those not present on our staging and production machines?
(In reply to Henrik Skupin (:whimboo) from comment #19)
> Good find! When checking the screenshot I feel that even the other languages
> for complex scripts should be installed. Are those not present on our
> staging and production machines?

They are not installed on the WinXP staging machine.
And from the reports I think we have this issue only on zh-* locales.
Should I try to install those files on the staging machine for now?
For testing purposes please do so. Please make sure to run tests before and after this change. If that fixes our problem, we can also update our production machines and the documentation on Mana. I assume no other Windows version is affected by this then?
Only Windows XP is affected.
I've installed the East Asian Language files on the Windows XP staging machine and ran testrun jobs in Jenkins. I couldn't reproduce the disconnect error anymore.
That sounds good. So from where did you get the file to install? We do not have the original Windows XP installation disk. Is there a trusted resource we could download this file from, e.g. from Microsoft? We should place it on fs1.
(In reply to Henrik Skupin (:whimboo) from comment #23)
> That sounds good. So from where did you get the file to install? We do not
> have the original Windows XP installation disk. Is there a trusted resource
> we could download this file from, e.g. from Microsoft? We should place it on
> fs1.

I took them from a Windows XP install CD. I couldn't find these files on the Microsoft site, only on some third-party internet resources.
Ok, so can you please put those files onto our fs1 host?
(In reply to Henrik Skupin (:whimboo) from comment #25)
> Ok, so can you please put those files onto our fs1 host?

I placed the files under "\\share\tools\East Asian Language Pack for Windows XP"

1. Check the "Install East Asian Language Pack" checkbox. It will prompt for a Windows XP CD.
2. Click browse, choose the folder.
Thanks. I moved them into a new Windows sub directory and also updated the Mana documentation for the location and the instructions. 

Can you please update the Windows XP production nodes once the current release testing has been done? Most likely there will be free time tomorrow morning.
(In reply to Henrik Skupin (:whimboo) from comment #27)
> Thanks. I moved them into a new Windows sub directory and also updated the
> Mana documentation for the location and the instructions. 
> 
> Can you please update the Windows XP production nodes once the current
> release testing has been done? Most likely there will be free time tomorrow
> morning.

Of course, I'll do it tomorrow morning once all the testing on production has finished.
I've installed the East Asian Language Pack on all four Windows XP production machines.
I'll mark this bug as fixed.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Thanks Teodor for investigating and fixing this issue!
Component: Mozmill Tests → Infrastructure
Summary: [zh-CN] Disconnect Error on Windows XP → [zh-CN] Disconnect Error on Windows XP due to missing East Asian Language pack
Attached file jenkins_zhcn_log.txt
This failed again on the latest Aurora 36.0a2 zh-CN. It's possible that the East Asian Language Pack didn't resolve the problem.
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
The log mentioned in comment 31 looks to be a different issue.

This particular failure would end with a green run (with only 6 completed tests).
The above log fails with a jsbridge error in the first 2 tests. This does occasionally happen, but it's a different issue.
Status: REOPENED → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Product: Mozilla QA → Mozilla QA Graveyard