Closed Bug 873067 Opened 8 years ago Closed 8 years ago

Source tree appears not to be pristine when building at least on try

Categories

(Release Engineering :: General, defect)

x86_64
Linux
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: glandium, Assigned: catlee)

I was hit twice today by this on try:
TEST-UNEXPECTED-FAIL | xpccheck | test test_propertyListsUtils.js is missing from test manifest /builds/slave/try-l64-0000000000000000000000/build/toolkit/content/tests/unit/xpcshell.ini

See https://tbpl.mozilla.org/php/getParsedLog.php?id=23024691&tree=Try for example.

What this means is that there is a /builds/slave/try-l64-0000000000000000000000/build/toolkit/content/tests/unit/test_propertyListsUtils.js file that is not listed in /builds/slave/try-l64-0000000000000000000000/build/toolkit/content/tests/unit/xpcshell.ini.

There is no such toolkit/content/tests/unit/test_propertyListsUtils.js file in the corresponding tree: https://hg.mozilla.org/try/file/ff31e0e5f9cc/toolkit/content/tests/unit

Note this file recently moved (bug 828116)

Retriggering the build on a different slave led to a green build.
Both failures were on the same slave: bld-linux64-ix-039
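
For readers unfamiliar with the xpccheck failure above, here is a minimal sketch of the kind of check it performs: every test_*.js file on disk must have a section in the directory's xpcshell.ini. This is not the real xpccheck code. The function name `unlisted_tests` is hypothetical, and parsing the manifest with Python's `configparser` is an assumption made to keep the example self-contained.

```python
import configparser
import os
import tempfile


def unlisted_tests(test_dir, manifest_name="xpcshell.ini"):
    """Return test_*.js files in test_dir that have no section in the manifest."""
    # Assumption: the manifest is INI-like, with one [test_name.js] section per test.
    parser = configparser.ConfigParser(
        allow_no_value=True, strict=False, interpolation=None
    )
    parser.read(os.path.join(test_dir, manifest_name))
    listed = set(parser.sections())  # [DEFAULT] is not included in sections()
    on_disk = {
        name
        for name in os.listdir(test_dir)
        if name.startswith("test_") and name.endswith(".js")
    }
    return sorted(on_disk - listed)


if __name__ == "__main__":
    # Recreate the situation from this bug: a stale file left behind by a rename.
    with tempfile.TemporaryDirectory() as d:
        with open(os.path.join(d, "xpcshell.ini"), "w") as f:
            f.write("[DEFAULT]\nhead =\ntail =\n\n[test_a.js]\n")
        for name in ("test_a.js", "test_propertyListsUtils.js"):
            open(os.path.join(d, name), "w").close()
        print(unlisted_tests(d))
```

A stale file that survives in the working directory after it was moved in the repository shows up in this list, which is the symptom seen on bld-linux64-ix-039.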
Component: Release Engineering: Automation (Release Automation) → Release Engineering: Automation (General)
QA Contact: bhearsum → catlee
Probably related to bug 851270?
(In reply to Ben Hearsum [:bhearsum] from comment #2)
> Probably related to bug 851270?

Most certainly. Combined with mercurial misbehaving...
Blocks: 851270
I've disabled the slave for investigation.

hg status claims this file is clean:
[cltbld@bld-linux64-ix-039 build]$ hg ident
3d67ad78ccb8

[cltbld@bld-linux64-ix-039 build]$ hg status -A | grep toolkit/content/tests/unit/test_propertyListsUtils.js
I obj-firefox/_tests/xpcshell/toolkit/content/tests/unit/test_propertyListsUtils.js
C toolkit/content/tests/unit/test_propertyListsUtils.js
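
In the `hg status -A` output above, the leading letter is the status code: `I` means ignored and `C` means clean, so Mercurial believes the stale file is tracked and unmodified. A small sketch for decoding such output follows; the helper name `parse_status` is hypothetical, and the code table follows the codes listed in `hg help status`.

```python
# Status codes as documented in `hg help status`.
STATUS_CODES = {
    "M": "modified",
    "A": "added",
    "R": "removed",
    "C": "clean",
    "!": "missing",
    "?": "not tracked",
    "I": "ignored",
}


def parse_status(output):
    """Map each path in `hg status` output to a human-readable state."""
    states = {}
    for line in output.splitlines():
        # Expect "<code> <path>"; skip anything else (e.g. copy-source lines).
        if len(line) < 3 or line[1] != " " or line[0] not in STATUS_CODES:
            continue
        states[line[2:]] = STATUS_CODES[line[0]]
    return states


if __name__ == "__main__":
    sample = (
        "I obj-firefox/_tests/xpcshell/toolkit/content/tests/unit/test_propertyListsUtils.js\n"
        "C toolkit/content/tests/unit/test_propertyListsUtils.js\n"
    )
    for path, state in parse_status(sample).items():
        print(state, path)
```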
The logs mention an unknown parent, which may be related?

command: START
command: hg update -C -r ff31e0e5f9cc3e2113a6b257f0f85dfe178ff7de
command: cwd: /builds/slave/try-l64-0000000000000000000000/build
command: output:
warning: ignoring unknown working parent 0e40d40ee804!
78531 files updated, 0 files merged, 0 files removed, 0 files unresolved
command: END (141.36s elapsed)
And since hg thinks this file should be there, it's not getting purged:

hg --config extensions.purge= purge -a --all /builds/slave/try-l64-0000000000000000000000/build --print | grep test_propertyListsUtils.js
obj-firefox/_tests/xpcshell/toolkit/content/tests/unit/test_propertyListsUtils.js
Neither 'hg out' nor hg log -r 'outgoing("https://hg.mozilla.org/try")' produces any output.
hg verify succeeds
hg manifest -r ff31e0e5f9cc3e2113a6b257f0f85dfe178ff7de | grep test_propertyListsUtils.js
toolkit/modules/tests/xpcshell/test_propertyListsUtils.js

So the file is in the manifest. This contradicts the assertion in the initial comment that this file should not exist.
Oops, paths are different: we're dealing with a rename:

$ hg debugrename toolkit/modules/tests/xpcshell/test_propertyListsUtils.js
toolkit/modules/tests/xpcshell/test_propertyListsUtils.js renamed from toolkit/content/tests/unit/test_propertyListsUtils.js:152c120826cb8845b2be943c5bebb0a038107884
Using Mercurial 2.6, I can't reproduce this with the archive catlee uploaded.

The changeset moving this file is:

changeset:   131921:b880a068345e
user:        Ekanan Ketunuti <ananuti@gmail.com>
date:        Tue May 14 14:37:18 2013 -0700
summary:     Bug 828116 - Move modules in toolkit/content and toolkit/mozapps/shared to toolkit/modules. r=Mossop

$ hg up -C -r 131920
250 files updated, 0 files merged, 42 files removed, 0 files unresolved
gps@gps-mbp:~/tmp/try/builds/hg-shared/try$ find toolkit/ | grep test_property
toolkit//content/tests/unit/test_propertyListsUtils.js

$ hg up -C -r ff31e0e5f9cc3e2113a6b257f0f85dfe178ff7de
154 files updated, 0 files merged, 18 files removed, 0 files unresolved
gps@gps-mbp:~/tmp/try/builds/hg-shared/try$ find toolkit/ | grep test_property
toolkit//modules/tests/xpcshell/test_propertyListsUtils.js

gps@gps-mbp:~/tmp/try/builds/hg-shared/try$ hg up -C -r ff31e0e5f9cc3e2113a6b257f0f85dfe178ff7de
206 files updated, 0 files merged, 86 files removed, 0 files unresolved
gps@gps-mbp:~/tmp/try/builds/hg-shared/try$ find toolkit/ | grep test_property
toolkit//modules/tests/xpcshell/test_propertyListsUtils.js
Output from hg debugindex -c for relevant changesets and their neighbors:

 131919  32242783     141  131919  131919 975667697d0b 9182c3e6a967 26ab72bfa9df
 131920  32242924     200  131919  131920 bf0bcf4ecf28 975667697d0b 000000000000
 131921  32243124     746  131919  131921 b880a068345e bf0bcf4ecf28 000000000000
 131922  32243870     238  131922  131922 d08934cfce04 b880a068345e 000000000000
 131923  32244108     210  131922  131923 fa1dc340708b d08934cfce04 000000000000

 132492  32383214     240  132491  132492 e6c4a33e480e 01072a33f2ed 000000000000
 132493  32383454     198  132493  132493 15ba59a74221 e6c4a33e480e 000000000000
 132494  32383652     256  132493  132494 04da0fd6380b 15ba59a74221 000000000000

 135918  33180494     114  135918  135918 b222eb01dc2f 657881fe8900 000000000000
 135919  33180608     164  135918  135919 ff31e0e5f9cc 15ba59a74221 000000000000
 135920  33180772     232  135918  135920 4dd6f2744682 630974b4fa14 000000000000
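
To read the parent relationships out of the rows above, a sketch like the following may help. It assumes the eight columns are rev, offset, length, base, linkrev, nodeid, p1 and p2 (the header line is not shown in this bug, so that layout is an assumption), and the helper name `parse_debugindex` is hypothetical.

```python
NULL_ID = "000000000000"  # short form of the null revision


def parse_debugindex(text):
    """Return {nodeid: (p1, p2)} from `hg debugindex -c` rows; null parents become None."""
    parents = {}
    for line in text.splitlines():
        fields = line.split()
        # Assumed layout: rev offset length base linkrev nodeid p1 p2
        if len(fields) != 8 or not fields[0].isdigit():
            continue
        nodeid, p1, p2 = fields[5], fields[6], fields[7]
        parents[nodeid] = tuple(p if p != NULL_ID else None for p in (p1, p2))
    return parents


if __name__ == "__main__":
    sample = """
 132493  32383454     198  132493  132493 15ba59a74221 e6c4a33e480e 000000000000
 135919  33180608     164  135918  135919 ff31e0e5f9cc 15ba59a74221 000000000000
"""
    index = parse_debugindex(sample)
    print(index["ff31e0e5f9cc"])
```

Read this way, the try push ff31e0e5f9cc has a single parent, 15ba59a74221, which sits at revision 132493, after the rename in b880a068345e at revision 131921.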
I'm bothered by this:

command: START
command: hg --config extensions.purge= purge -a --all /builds/slave/try-l64-0000000000000000000000/build
command: cwd: /builds/slave/try-l64-0000000000000000000000/build
command: output:
warning: ignoring unknown working parent 0e40d40ee804!
command: END (39.00s elapsed)

command: START
command: hg update -C -r ff31e0e5f9cc3e2113a6b257f0f85dfe178ff7de
command: cwd: /builds/slave/try-l64-0000000000000000000000/build
command: output:
warning: ignoring unknown working parent 0e40d40ee804!
78531 files updated, 0 files merged, 0 files removed, 0 files unresolved
command: END (141.36s elapsed)

Specifically, why is *every* file updated after the purge? I can only assume that the purge is blowing out everything. That shouldn't happen!

That "ignoring unknown working parent" sheds some light. That comes from http://selenic.com/hg/file/0fbcabe523bc/mercurial/localrepo.py#l368.

It appears that if any file under version control has modifications, purge will delete *all* files. This is because purge is unable to locate the manifest for the working directory changeset since that changeset isn't committed.

That still doesn't explain why the old file is lingering around. Purge should have removed it (along with every other file)!
My theory about a dirty file causing purge to purge everything doesn't hold:

http://gps.pastebin.mozilla.org/2410810
My theory is the following:
* We recently purged a handful of try revs from our repo due to corruption.
* This slave had pulled one of those try revs
* This slave updated working directory to said rev
....
* This slave has a working directory with a rev not present in its share?
(In reply to Justin Wood (:Callek) from comment #14)
> my theory is the following:
> * we recently purged out a handful of try revs due to corruption from our
> repo.
> * this slave had pulled one of those try revs
> * This slave updated working directory to said rev
> ....
> * This slave has a working directory with a rev not present in its share?

I'll subscribe to this theory.

IMO, if we roll back and reset the try repo (or any repo), any clone of the old repo should be purged.
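
One way to spot affected clones, sketched under assumptions: scan the build log for the "ignoring unknown working parent" warning seen earlier in this bug and treat any hit as a signal that the working directory should be wiped. The function name `unknown_working_parents` is hypothetical, and nothing here is part of the actual buildbot tooling.

```python
import re

# Matches the warning text that appears in the logs quoted in this bug.
WARNING_RE = re.compile(r"warning: ignoring unknown working parent ([0-9a-f]+)!")


def unknown_working_parents(log_text):
    """Return the sorted, de-duplicated changeset hashes named in the warning."""
    return sorted(set(WARNING_RE.findall(log_text)))


if __name__ == "__main__":
    log = (
        "command: output:\n"
        "warning: ignoring unknown working parent 0e40d40ee804!\n"
        "78531 files updated, 0 files merged, 0 files removed, 0 files unresolved\n"
    )
    hashes = unknown_working_parents(log)
    if hashes:
        # Assumption: the caller reacts by clobbering the build directory.
        print("clone needs a clobber; unknown parents:", hashes)
```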
Anything else we want from this machine, or can I wipe the build directory and throw it back into production?
Works for me.
Cleaned up the slave, back into the pool.
Assignee: nobody → catlee
Status: NEW → RESOLVED
Closed: 8 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering
Component: General Automation → General