Closed Bug 585084 Opened 14 years ago Closed 14 years ago

Try server results are misleading for a widget/test build failure on Windows

Categories

(Release Engineering :: General, defect)

x86
Windows Server 2003
defect
Not set
normal

Tracking

(Not tracked)

RESOLVED INCOMPLETE

People

(Reporter: ehsan.akhgari, Assigned: joduinn)

Details

(Whiteboard: [tryserver])

So, I pushed http://hg.mozilla.org/mozilla-central/rev/b64704446120 today, which caused a build failure across the board on Windows (example log: http://tinderbox.mozilla.org/showlog.cgi?log=Firefox/1281103156.1281105600.5550.gz). But the try server did not catch this problem (example log: http://tinderbox.mozilla.org/showlog.cgi?log=MozillaTry/1281055088.1281068069.20205.gz).

This is very serious: it wasted a couple of hours of everybody's time today.  I'd very much like to see this fixed.  Thanks!
I might be looking at things incorrectly, but what you pushed to try is different from what you pushed to mozilla-central.

http://hg.mozilla.org/try/pushloghtml?changeset=953b4fccabda [1]
http://hg.mozilla.org/mozilla-central/pushloghtml?changeset=b64704446120 [2]

Could you help me understand why you think it is an infrastructure problem?
The patch in question was lower in my queue, and I had pushed several other jobs, which means that the revision was already on try and was not pushed again that time (that's why it doesn't appear in the pushlog). The code that was built and tested did have the change in it, though.

In addition, I had had this patch in my queue for over a month when I pushed it to m-c, and in that time I never saw the try server catch this failure.
Try builds are clobbers, whereas you got a depend build on m-c (added 33 changesets with 79 changes to 76 files). Could your changes be tickling a dependency problem in the build system?
Dependency issues would most likely manifest themselves as linker errors or random broken functionality, not as compilation errors.
The slave for the Firefox log in comment #0 was mw32-ix-slave23, doing a depend build. It previously built mozilla-central win32 debug at 2010-08-05 22:50:32 using revision abc28dec7bb5. That was a depend build too. Even depend builds clobber objdir/dist/include, and no errors were reported doing this for the build that failed.

A difference I noticed between the logs is the m-c build calling
d:/mozilla-build/python25/python2.5.exe -O e:/builds/moz2_slave/mozilla-central-win32-debug/build/build/cl.py cl -FoTestWinTSF.obj ...
which appears to be from
  http://hg.mozilla.org/mozilla-central/rev/6001758d1f47
That landed on m-c about a day and a half before Ehsan's push did. On try, the call is to cl(.exe) directly, as we used to do.

This brings up two points:
* we could have a regression from rev 6001758d1f47
* Ehsan's try push might have been based on an older m-c revision, and some change between that and tip caused the build problem
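
To check the first point against the two logs, something like the following could count how each build invoked the compiler. This is only a rough sketch: the helper name and both regular expressions are my own assumptions, based solely on the single command line quoted above, and real tinderbox logs may need a looser match.

```python
import re

# Hypothetical patterns, inferred from the one command line quoted in this
# comment. "Wrapped" means the compiler was launched through build/cl.py;
# "direct" means the log line starts with a bare cl or cl.exe call.
WRAPPED = re.compile(r"python\S*\s+.*\bcl\.py\s+cl\b")
DIRECT = re.compile(r"^\s*cl(\.exe)?\s")


def classify_compile_lines(log_lines):
    """Count compiler invocations by how they were launched."""
    counts = {"wrapped": 0, "direct": 0}
    for line in log_lines:
        if WRAPPED.search(line):
            counts["wrapped"] += 1
        elif DIRECT.search(line):
            counts["direct"] += 1
    return counts
```

Running this over the lines of each log and comparing the two dictionaries would show whether the try build predates the cl.py wrapper, which in turn hints at how old the base revision of the try push was.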
Assignee: nobody → joduinn
(In reply to comment #4)
> Dependency issues would most likely manifest themselves as linker errors or
> random broken functionality, not as compilation errors.

I think any of these could happen with broken dependencies in Makefiles. 


(In reply to comment #5)
...
> This brings up two points
> * we could have a regression from rev 6001758d1f47
> * Ehsan's try push might have been based on an older m-c revision, and some
> change between that and tip caused the build problem

Ehsan: How far from tip was your patch when you pushed to try? ...and when you pushed to m-c? 


Also, have you seen this happen since? I ask because I'm trying to figure out what, if anything, is for us to do here.
(In reply to comment #6)
> (In reply to comment #5)
> ...
> > This brings up two points
> > * we could have a regression from rev 6001758d1f47
> > * Ehsan's try push might have been based on an older m-c revision, and some
> > change between that and tip caused the build problem
> 
> Ehsan: How far from tip was your patch when you pushed to try? ...and when you
> pushed to m-c? 

A long time has passed since then, and I don't really remember.  Sorry.

> Also, have you seen this happen since? I ask because I'm trying to figure out
> what, if anything, is for us to do here.

Not really.  But I haven't done anything which could break that test since that time either.
(In reply to comment #7)
> (In reply to comment #6)
> > (In reply to comment #5)
> > ...
> > > This brings up two points
> > > * we could have a regression from rev 6001758d1f47
> > > * Ehsan's try push might have been based on an older m-c revision, and some
> > > change between that and tip caused the build problem
> > 
> > Ehsan: How far from tip was your patch when you pushed to try? ...and when you
> > pushed to m-c? 
> 
> A long time has passed since then, and I don't really remember.  Sorry.
No worries; it took me a long time to find this bug during cleanup. My apologies too.


> > Also, have you seen this happen since? I ask because I'm trying to figure out
> > what, if anything, is for us to do here.
> 
> Not really.  But I haven't done anything which could break that test since that
> time either.

As there's nothing to do here, I'm closing this. We didn't do anything, and haven't been able to reproduce it since, so the closest fit is INCOMPLETE. Feel free to reopen if this happens again, ok?
Status: NEW → RESOLVED
Closed: 14 years ago
Resolution: --- → INCOMPLETE
(In reply to comment #8)
> > > Also, have you seen this happen since? I ask because I'm trying to figure out
> > > what, if anything, is for us to do here.
> > 
> > Not really.  But I haven't done anything which could break that test since that
> > time either.
> 
> As there's nothing to do here, I'm closing this. We didn't do anything, and
> haven't been able to reproduce it since, so the closest fit is INCOMPLETE. Feel
> free to reopen if this happens again, ok?

Sad choice, but a wise one.  I'll reopen if I ever see something like this again.
Product: mozilla.org → Release Engineering