Closed Bug 893139 Opened 11 years ago Closed 10 years ago

Create 'win64-debug' platform, and enable on date and try

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: catlee, Assigned: jhopkins)

References

Details

Attachments

(2 files)

Turns out we don't have any configs for win64 debug builds. We need to create those, and get them enabled on date and try.
Product: mozilla.org → Release Engineering
Assignee: nobody → catlee
Comment on attachment 802195 [details] [diff] [review]
create win64-debug platform, enable on m-c, date and try

LGTM. I also verified that you don't need to add *-debug platforms to puppet/modules/buildmaster/templates/BuildSlaves-tests.py.erb
Attachment #802195 - Flags: review?(rail) → review+
Comment on attachment 802195 [details] [diff] [review]
create win64-debug platform, enable on m-c, date and try

Thanks!
Attachment #802195 - Flags: checked-in+
in production
This is untested, but we have a lot of Windows jobs pending on try right now, yet slave health shows 0 pending there. This is a shot in the dark.
Attachment #802675 - Flags: review?(coop)
Depends on: 914914
Attachment #802675 - Flags: review?(coop) → review+
Comment on attachment 802675 [details] [diff] [review]
[slave health] win64

Review of attachment 802675 [details] [diff] [review]:
-----------------------------------------------------------------

https://hg.mozilla.org/users/coop_mozilla.com/slave_health/rev/09301ab449ba
Attachment #802675 - Flags: checked-in+
(In reply to Chris Cooper [:coop] from comment #6)
> Comment on attachment 802675 [details] [diff] [review]
> [slave health] win64
> 
> Review of attachment 802675 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> https://hg.mozilla.org/users/coop_mozilla.com/slave_health/rev/09301ab449ba

This didn't actually help, so I went and looked at the list of jobs that were actually pending on try.

Turns out our slaveclass lookup was busted for try jobs and they were all ending up in the build job equivalent bucket.

https://hg.mozilla.org/users/coop_mozilla.com/slave_health/rev/fc109d668d4e
The win64-debug jobs are 100% failing in `make buildsymbols`, so I'm just killing them whenever I see them pending or running. That should help with the pending count :)
Ehsan says he ran into this a while ago locally and never figured out why, but suggested that Ted may have some ideas.

Log is here:
https://tbpl.mozilla.org/php/getParsedLog.php?id=27676325&tree=Date&full=1
make.py[0]: Entering directory 'e:\builds\moz2_slave\date-w64-d-0000000000000000000\build\obj-firefox'
e:\builds\moz2_slave\date-w64-d-0000000000000000000\build\obj-firefox\Makefile:148:0$ echo building symbol store
building symbol store
e:\builds\moz2_slave\date-w64-d-0000000000000000000\build\obj-firefox\Makefile:149:0$ rm -f -r ./dist/crashreporter-symbols
e:\builds\moz2_slave\date-w64-d-0000000000000000000\build\obj-firefox\Makefile:150:0$ rm -f "./dist/firefox-26.0a1.en-US.win64-x86_64.crashreporter-symbols.zip"
e:\builds\moz2_slave\date-w64-d-0000000000000000000\build\obj-firefox\Makefile:151:0$ e:/builds/moz2_slave/date-w64-d-0000000000000000000/build/obj-firefox/_virtualenv/Scripts/python.exe e:/builds/moz2_slave/date-w64-d-0000000000000000000/build/config/nsinstall.py -D ./dist/crashreporter-symbols
e:\builds\moz2_slave\date-w64-d-0000000000000000000\build\obj-firefox\Makefile:152:0$ OBJCOPY="" \
e:/builds/moz2_slave/date-w64-d-0000000000000000000/build/obj-firefox/_virtualenv/Scripts/python.exe e:/builds/moz2_slave/date-w64-d-0000000000000000000/build/toolkit/crashreporter/tools/symbolstore.py \
  -c --vcs-info                                          \
  -s e:/builds/moz2_slave/date-w64-d-0000000000000000000/build               \
  e:/builds/moz2_slave/date-w64-d-0000000000000000000/build/toolkit/crashreporter/tools/win32/dump_syms_vc1600.exe                                                \
  ./dist/crashreporter-symbols                                   \
  . | grep -iv test >                        \
  ./dist/crashreporter-symbols/firefox-26.0a1-WINNT-20130910194100-x86_64-symbols.txt
2888: Submitting jobs for files: ('.\\accessible\\public\\ia2\\IA2Marshal.pdb',)
1556: Worker processing files: ('.\\accessible\\public\\ia2\\IA2Marshal.pdb',)
2888: Submitting jobs for files: ('.\\accessible\\public\\msaa\\AccessibleMarshal.pdb',)
2788: Worker processing files: ('.\\accessible\\public\\msaa\\AccessibleMarshal.pdb',)
2888: Submitting jobs for files: ('.\\browser\\app\\firefox.pdb',)
1220: Worker processing files: ('.\\browser\\app\\firefox.pdb',)
<snip>
2788: Worker processing files: ('.\\xpcom\\typelib\\xpt\\tests\\primitivetest.pdb',)
1556: Worker processing files: ('.\\xpcom\\typelib\\xpt\\tests\\simpletypelib.pdb',)
2788: Worker processing files: ('.\\xpcom\\windbgdlg\\windbgdlg.pdb',)

command timed out: 3600 seconds without output, attempting to kill
Flags: needinfo?(ted)
This is bug 669384 (see also bug 685887). It's a bug in the Microsoft library we use to read PDB files. Makoto filed it with Microsoft and they say it's fixed in a newer version of the toolchain (but I have no idea what version): https://connect.microsoft.com/VisualStudio/feedback/details/722366/idiaenumsymbolsbyaddr-next-doesnt-return-huge-pdb
Flags: needinfo?(ted)
ugh :(

We can't really support win64 as tier-1 without debug symbols.

What are our options here?
Depends on: 669384
We can find out what version of the toolchain this is fixed in and build a dump_syms binary using that library. Since we're using pre-built binaries we don't have to use the whole toolchain for our builds, just to build dump_syms. (We'd also need to install the DIA SDK libraries on the build slaves, but that's just a couple of DLL files that need to be installed + registered.)

If that somehow doesn't pan out we can also look into the code I mentioned in bug 669384 comment 20.
If VS2012 is installed on the build slave, we can use VS2012's DIA interface via COM.
Is this fixed in VS2012? The Connect bug report does not make that clear.

We have a VC2012-built dump_syms in the tree: http://mxr.mozilla.org/mozilla-central/source/toolkit/crashreporter/tools/win32/dump_syms_vc1600.exe

You could test that against a PDB file from a Win64 build that's causing problems (presumably xul.pdb).
typo of dump_syms_vc1700.exe?

When I tested this today, it seemed to be fixed by VS2012. The PDB file was generated by VS2010 SP1 with --enable-debug and --disable-optimize, and both VS2010 SP1 and VS2012 Update 3 are installed in my build environment.


$ time ../mozilla-win64/toolkit/crashreporter/tools/win32/dump_syms_vc1700.exe toolkit/library/xul.pdb > 1700.txt

real    2m1.420s
user    0m0.000s
sys     0m0.000s

$ time ../mozilla-win64/toolkit/crashreporter/tools/win32/dump_syms_vc1600.exe toolkit/library/xul.pdb > 1600.txt

(no result after running for 30 minutes)
The result for the VS2010 version is as follows (3 hours!).

$ time ../mozilla-win64/toolkit/crashreporter/tools/win32/dump_syms_vc1600.exe toolkit/library/xul.pdb > 1600.txt

real    177m28.142s
user    0m0.000s
sys     0m0.000s

So I think the easy way is to install VS2012 on the build slave and then use dump_syms_vc1700.exe instead.
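The timing comparison above could be repeated with a small script that gives up after a fixed wall-clock limit instead of waiting hours. This is only a sketch: `time_command` is a hypothetical helper, not something in the tree, and the timeout is a plain wall-clock limit rather than buildbot's "seconds without output" rule.

```python
import subprocess
import time


def time_command(argv, timeout_s, out_path):
    """Run argv with stdout redirected to out_path.

    Returns (finished, elapsed_seconds). If the command runs longer than
    timeout_s, subprocess.run kills it and finished is False.
    """
    start = time.monotonic()
    with open(out_path, "wb") as out:
        try:
            subprocess.run(argv, stdout=out, timeout=timeout_s, check=False)
            finished = True
        except subprocess.TimeoutExpired:
            finished = False
    return finished, time.monotonic() - start


# Example call (paths are placeholders taken from the commands above):
# time_command(
#     ["../mozilla-win64/toolkit/crashreporter/tools/win32/dump_syms_vc1700.exe",
#      "toolkit/library/xul.pdb"],
#     timeout_s=600,
#     out_path="1700.txt",
# )
```

Running it once per dump_syms binary against the same PDB gives a quick finished/timed-out answer for each toolchain version.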
Does dump_syms_vc1700.exe require vs2012 to be installed on the slave? Or is it standalone?
(In reply to Chris AtLee [:catlee] from comment #17)
> Does dump_syms_vc1700.exe require vs2012 to be installed on the slave? Or is
> it standalone?

We need to install VS2012 to use dump_syms_vc1700.
Ah, if msdia100.dll is a redistributable module, we can install it and then use dump_syms_vc1700.
typo s/msdia100.dll/msdia110.dll/
(In reply to Makoto Kato (:m_kato) from comment #15)
> typo of dump_syms_vc1700.exe?

Oops, yes!

> when I test this today, it seems to be fixed by VS2012.   PDB file is
> generated by VS2010 SP1 with --enable-debug and --disable-optimize, and both
> VS2010 SP1 and VS2012 UPDATE3 are installed on my build environment.

Thanks for testing!

(In reply to Chris AtLee [:catlee] from comment #17)
> Does dump_syms_vc1700.exe require vs2012 to be installed on the slave? Or is
> it standalone?

It does not, you can simply copy msdia110.dll somewhere on the build machine and register it (with regsvr32). I tested this on a WinXP VM and it worked fine.
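As a rough illustration of the copy-and-register step, the sketch below copies the DLL into a target directory and builds the `regsvr32` command line for it. The helper names and the destination directory are assumptions made for this example; only `msdia110.dll` and `regsvr32` come from the comment above. The `/s` switch asks `regsvr32` to run without dialog boxes.

```python
import shutil
from pathlib import Path


def regsvr32_command(dll_path, silent=True):
    """Build the regsvr32 argv that registers dll_path as a COM server."""
    argv = ["regsvr32"]
    if silent:
        argv.append("/s")  # suppress the confirmation dialog
    argv.append(str(dll_path))
    return argv


def install_dia_dll(src_dll, dest_dir):
    """Copy src_dll into dest_dir and return the command that would register it.

    The command is returned rather than executed, so the caller decides
    when to run it (registration needs Windows and sufficient privileges).
    """
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / Path(src_dll).name
    shutil.copy2(src_dll, dest)
    return regsvr32_command(dest)
```

On a build machine the returned list could be passed to `subprocess.check_call`. Keeping the copy and the registration separate makes it easy to dry-run the step on a non-Windows host.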

We'll have to tweak the build config to use it. Currently it's set up to use the dump_syms binary that was built with the toolchain you're using:
http://mxr.mozilla.org/mozilla-central/source/Makefile.in#107
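
One way to picture the tweak is a lookup that normally follows the compiler version but accepts an override. This is a hypothetical sketch, not the logic in Makefile.in: the function name and the `override_ver` parameter are invented for illustration, and only the `dump_syms_vc<version>.exe` naming and the tools directory are taken from this bug.

```python
import posixpath

# Directory of the pre-built binaries, as quoted earlier in this bug.
TOOLS_DIR = "toolkit/crashreporter/tools/win32"


def dump_syms_path(topsrcdir, msc_ver, override_ver=None):
    """Return the path of the dump_syms binary to use.

    By default the binary matches the compiler version (msc_ver, e.g. 1600).
    override_ver (hypothetical) forces a different binary, e.g. 1700.
    """
    ver = override_ver if override_ver is not None else msc_ver
    return posixpath.join(topsrcdir, TOOLS_DIR, "dump_syms_vc%d.exe" % ver)
```

With a VS2010 build, `dump_syms_path(src, 1600, override_ver=1700)` would select the VS2012-built binary while the rest of the build keeps using the VS2010 toolchain.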
I will look into this next week.
Assignee: catlee → jhopkins
Work on bug 781277 has taken longer than expected.  This bug is still next on my priority list.
What are the next steps here?  Are we going ahead with VS2012?
Flags: needinfo?(catlee)
Yes, I think we should give VS2012 a try.
Flags: needinfo?(catlee)
Looks like remaining work is tracked in bug 669384.
Status: NEW → RESOLVED
Closed: 10 years ago
Resolution: --- → FIXED
Component: General Automation → General