The patches in bug 1147673 cause failures on Mulet Reftests

RESOLVED WORKSFORME

defect
4 years ago
4 years ago

People

(Reporter: mstange, Assigned: mstange)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [buildduty][capacity][buildslaves][loaner])

Attachments

(1 attachment)

Assignee

Description

4 years ago
Hi,

I need to debug this test failure: https://treeherder.mozilla.org/logviewer.html#?job_id=18633322&repo=mozilla-inbound

It happens during Mulet reftests, which I don't know how to run locally. I'm assuming that I can just copy/paste commands from the log on the loaner machine to run tests there.
Talked to :garndt on IRC about this. So, for debugging this particular issue, you could either try using the interactive feature offered by taskcluster or running the docker task locally.

1. using the interactive environment:
    - log into taskcluster (login.taskcluster.net) using your LDAP credentials
    - click on the link that you pasted above, then hit the "Inspect Task" button -> "Edit Task"
    - make these changes: set "task.payload.features.interactive = true" and "task.payload.command = ['bash', '-c', 'sleep 600']". "task.payload.maxRunTime" can stay at "3600", or you can set a higher value. Then submit the task, e.g. https://tools.taskcluster.net/task-inspector/#C7caG7hcSUmu1ZlI5psj3A/
    - while the task is running, you'll see a link to the interactive shell (look for private/docker-worker/shell.html)
    - see the attached screenshot for an example
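
For illustration, the payload edits from option 1 can be sketched in Python as a change to a task definition loaded as a dictionary. This is only a sketch: the "original" task below is a hypothetical, heavily trimmed stand-in (its image and command values are placeholders), and the helper name make_interactive is made up for this example. Only the three payload fields named above are touched.

```python
import copy


def make_interactive(task, max_run_time=3600):
    """Return a copy of a task definition with the interactive-debugging edits applied."""
    edited = copy.deepcopy(task)  # leave the caller's task untouched
    payload = edited.setdefault("payload", {})
    # task.payload.features.interactive = true
    payload.setdefault("features", {})["interactive"] = True
    # task.payload.command = ['bash', '-c', 'sleep 600']
    payload["command"] = ["bash", "-c", "sleep 600"]
    # task.payload.maxRunTime: 3600 is fine, or pass a higher value
    payload["maxRunTime"] = max_run_time
    return edited


# Hypothetical, trimmed task definition; the real one has many more fields.
original = {
    "payload": {
        "image": "example/image:latest",  # placeholder, not a real image name
        "command": ["run-the-tests"],     # placeholder for the original command
        "maxRunTime": 3600,
    }
}

edited = make_interactive(original, max_run_time=7200)
```

The "sleep 600" command replaces the test run so that the container stays alive long enough to open the interactive shell.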

2. running the docker task locally:
    - go to the "Inspect Task" page; towards the end of the page you'll see a version of the script that you can run on your local machine
    - you will need to set up docker first
    - :garndt can give you more details on this, so I will cc him on the bug :)
	
As :garndt mentioned on IRC, the idea is to let developers do the same things within this interactive environment as they would on a loaner. It would also be good to understand the common things that you guys do to debug such issues.

Also, if you want to use this bug to track the process of fixing the issue, maybe it would be a good idea to rename it and assign it to yourself.
Just ping me if you have difficulty getting things running and we'll try to help out however we can.

For option 2, if you have docker installed (consult the docker docs to get it running in your environment), you can copy and paste the "run locally" script into a local script file and run it; hopefully that will help.

I am always available on IRC (garndt in #taskcluster) if there are questions and you need some help with understanding docker and using it for these types of things.
Assignee

Comment 3

4 years ago
Thanks for the suggestions!

I'm going to try approach 2: running the docker task locally (or rather, on an Ubuntu machine that I'm ssh'd into). I'll let you know how it goes.
Assignee: nobody → mstange
Blocks: 1147673
Status: NEW → ASSIGNED
Component: Loan Requests → Layout
Product: Release Engineering → Core
QA Contact: coop
Summary: Slave loan request for mstange [Mulet Linux x64 opt Mulet Reftest [TC] Reftest R(R5)] → The patches in bug 1147673 cause failures on Mulet Reftests
Assignee

Comment 4

4 years ago
Woo, it's working!

A few questions:
 - Can I inject my own mulet build somehow? Not sure how much I'm able to control from the outside here vs how much is baked into the container.
 - How can I debug the build? I assume I have to run gdb inside the container and ssh into the container. Is that correct?
 - Can I do something simple to avoid downloading gaia at the start of every run? I suppose once I'm ssh'd into the container and controlling more of what goes on inside it, I can restart just the reftest run inside it instead of always destroying and recreating the whole container.
>  - Can I inject my own mulet build somehow? Not sure how much I'm able to
> control from the outside here vs how much is baked into the container.

Well, I'm not sure how well that would work out, but you can volume mount host folders into a docker container by using the -v flag. It's basically "-v <host location>:<location to mount in container>".

>  - How can I debug the build? I assume I have to run gdb inside the
> container and ssh into the container. Is that correct?

If you're running this locally, you can change the command that you use to start the container to "/bin/bash" and it will start a bash session within the container.  From there you could run a debugger inside.
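
As a rough sketch of how the two suggestions combine, the Python snippet below assembles a "docker run" command line that volume mounts a host folder with -v and starts /bin/bash instead of the normal test command. The image name and both paths are hypothetical placeholders, and the helper docker_debug_command is made up for this example; take the real image name from the "run locally" script of your task.

```python
import shlex


def docker_debug_command(image, mounts, shell="/bin/bash"):
    """Build a `docker run` argv that mounts host folders and starts a shell."""
    argv = ["docker", "run", "-ti"]  # -t allocates a tty, -i keeps stdin open
    for host_path, container_path in mounts:
        # -v <host location>:<location to mount in container>
        argv += ["-v", f"{host_path}:{container_path}"]
    argv += [image, shell]  # the shell replaces the container's usual command
    return argv


argv = docker_debug_command(
    "example/tester-image:latest",  # placeholder image name
    [("/home/me/mulet-build", "/home/worker/mulet-build")],  # placeholder paths
)

# Print a copy-pasteable command line; run it with subprocess.run(argv) if preferred.
print(shlex.join(argv))
```

Once the shell is up, a debugger such as gdb can be started from inside the container, provided it is installed in the image.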

>  - Can I do something simple to avoid downloading gaia at the start of every
> run? I suppose once I'm ssh'd into the container and controlling more of
> what goes on inside it, I can restart just the reftest run inside it instead
> of always destroying and recreating the whole container.

Typically, what I do for troubleshooting is this: if I have a local checkout of some code, I volume mount it into the container and then just tell the appropriate thing where to find that code. In the case of the mozharness script that's run for mulet reftests, I think you pass in --gaia-dir.


Hope this helps.
Noticed that I didn't upload the screenshot mentioned in the second comment, so I did it now. Maybe this bug can help other developers who run into similar issues.
Assignee

Comment 7

4 years ago
I figured out my failure, but mostly by pushing patches with debug information to try. There was one part I was able to debug using gdb in the docker container, so being able to run it locally was helpful after all.

The thing that limited me the most was the fact that the taskrunner docker image wasn't able to run Firefox builds that I made on a different Linux machine, because of what looked like an outdated version of libstdc++. I also wasn't able to run rr ( http://rr-project.org/ ) for that reason. apt-get update failed with a hash sum mismatch error.
If the taskrunner image had used a more up-to-date version of Ubuntu, these things might have worked.
Status: ASSIGNED → RESOLVED
Closed: 4 years ago
Resolution: --- → WORKSFORME