Status

P2
normal
RESOLVED FIXED
9 years ago
5 years ago

People

(Reporter: aki, Assigned: aki)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Whiteboard: [mobile][talos])

Attachments

(4 attachments, 5 obsolete attachments)

(Assignee)

Description

9 years ago
I think backing out xres monitoring may help, but may not get it green/orange.
xres is disabled on production talos, as it currently breaks things.  So yeah, if you have it enabled, disable it.
(Assignee)

Comment 2

9 years ago
Created attachment 443217 [details] [diff] [review]
turn off tp4 xres monitoring on maemo
Attachment #443217 - Flags: review?(anodelman)
Attachment #443217 - Flags: review?(anodelman) → review+
(Assignee)

Comment 3

9 years ago
Comment on attachment 443217 [details] [diff] [review]
turn off tp4 xres monitoring on maemo

revision 1.5
Attachment #443217 - Flags: checked-in+
(Assignee)

Comment 4

9 years ago
Now I'm hitting this:

Traceback (most recent call last):
  File "run_tests.py", line 42, in <module>
    import yaml
  File "/usr/lib/python2.5/site-packages/yaml/__init__.py", line 16, in <module>
    def scan(stream, Loader=Loader):
NameError: name 'Loader' is not defined
(Assignee)

Comment 5

9 years ago
This may be a bad maemo-n810-44. Taking out of production pool.
(Assignee)

Comment 6

9 years ago
Ooh, we just got an orange with a frozen browser, instead of a red :)
Whiteboard: [mobile][talos]
is this still a problem?
(Assignee)

Comment 8

8 years ago
It's not launching, therefore not crashing.
(Assignee)

Comment 9

8 years ago
Ran this in staging for a bit.
Tp4 looks good atm; we should turn it back on.
Assignee: nobody → aki
Summary: maemo tp4 crashing → reenable tp4 on maemo
(Assignee)

Comment 10

8 years ago
Created attachment 470903 [details] [diff] [review]
reenable tp4, bump crashtest known_fail_count to 5
Attachment #470903 - Flags: review?(jhford)
Attachment #470903 - Flags: review?(jhford) → review+
(Assignee)

Comment 11

8 years ago
Comment on attachment 470903 [details] [diff] [review]
reenable tp4, bump crashtest known_fail_count to 5

http://hg.mozilla.org/build/buildbot-configs/rev/f2be68680580
Attachment #470903 - Flags: checked-in+
(Assignee)

Updated

8 years ago
Status: NEW → RESOLVED
Last Resolved: 8 years ago
Resolution: --- → FIXED
(Assignee)

Updated

8 years ago
Status: RESOLVED → REOPENED
Resolution: FIXED → ---
(Assignee)

Comment 12

8 years ago
Created attachment 470976 [details] [diff] [review]
fix tp4 chrome

I noticed, after turning on tp4 and forcing a run, that we got two sets of nochrome results.

You're checking for None in the factory, but in master.cfg you set it to False. I'm going to take a wild guess and say False is not None.

Also making this multi-line to be more readable.
Attachment #470976 - Flags: review?(jhford)
Comment on attachment 470976 [details] [diff] [review]
fix tp4 chrome

Would work with

self.nochrome = '' if not nochrome else '--noChrome'

as well.

r+ with either this patch or ^
Attachment #470976 - Flags: review?(jhford) → review+
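For reference, a minimal sketch of the None-vs-False difference discussed above. The two helper names are made up for illustration; only the two conditional expressions come from this bug.

```python
def nochrome_flag_is_none(nochrome):
    # Check as written in the factory: only None turns the flag off,
    # so a False coming from master.cfg still enables --noChrome.
    return '' if nochrome is None else '--noChrome'


def nochrome_flag_truthy(nochrome):
    # Suggested alternative: any falsy value (None, False, '') turns it off.
    return '' if not nochrome else '--noChrome'


for value in (None, False, True):
    print(value, repr(nochrome_flag_is_none(value)), repr(nochrome_flag_truthy(value)))
```

With False, the first helper still returns '--noChrome', which would explain getting two sets of nochrome results.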
(Assignee)

Comment 15

8 years ago
Gah.

  ln -s /tools/tp4 .
   in dir /builds/talos/page_load_test (timeout 1200 secs)

n900-002:~# ls /tools
ls: /tools: No such file or directory

The reason this is flying by and going green is that it's loading 100 404s.
(Assignee)

Comment 16

8 years ago
Backed out the enable tp4 (but left crashtest at 5).
We have to figure out how to get tp4's pageset on device; I think the best time is at imaging time.
Summary: reenable tp4 on maemo → add tp4 pageset to n900s
(In reply to comment #16)
> Backed out the enable tp4 (but left crashtest at 5).
> We have to figure out how to get tp4's pageset on device; I think the best time
> is at imaging time.

I don't.  The reason is that the raw flash (mtd device) doesn't have enough space for a 400MB pageset.  This would require a bunch of extra manual steps that will confuse an otherwise clean and simple process.

I would prefer to write a script that does something like:

if have_tp4_data('/builds/tp4'):
  exit 0
else:
  fetch_tp4_data('/builds/tp4.tar.bz2')
  unpack('/builds/tp4.tar.bz2')
  exit 1

and run that before the buildbot slave starts up or as one of the steps every automation run.  This could also be called by the initialization script, but I'd prefer that nothing in the imaging process depend on contacting another machine.

As for the exact location of where to save the data, it does need to be on the internal sd card and shouldn't be on the loopback mounted ext3 image.
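A rough Python sketch of the fetch-if-missing idea above, not a tested script. It assumes the tarball unpacks into a single top-level directory with the same name as the target, and it returns True/False rather than exiting 0/1. The function names mirror the pseudocode; the `fetch` parameter is a hypothetical hook so the download can be swapped out.

```python
import os
import tarfile
import urllib.request


def have_tp4_data(tp4_dir):
    """Return True if the pageset directory exists and is not empty."""
    return os.path.isdir(tp4_dir) and bool(os.listdir(tp4_dir))


def ensure_tp4_data(tp4_dir, tarball_path, tarball_url,
                    fetch=urllib.request.urlretrieve):
    """Fetch and unpack the pageset only when it is missing.

    Returns True if the data was already there, False if it had to be fetched.
    """
    if have_tp4_data(tp4_dir):
        return True
    fetch(tarball_url, tarball_path)
    # Assumption: the archive contains one top-level directory named like tp4_dir.
    with tarfile.open(tarball_path, 'r:bz2') as archive:
        archive.extractall(os.path.dirname(tp4_dir))
    return False
```

Running it twice in a row should fetch once and then report the data as present, which is the property wanted for a per-run buildbot step.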
(Assignee)

Comment 18

8 years ago
Comment on attachment 470976 [details] [diff] [review]
fix tp4 chrome

This patch borked all of mobile talos.

Since the cmdln args are in a list, and that list is sent to PerfConfigurator, and '' is one of the args, PerfConfigurator gets an arg of '' that it doesn't know what to do with, and doesn't create local.config.

I suppose I can say that's a PerfConfigurator bug, or that if this nochrome bit was working properly in the first place I wouldn't have had to touch this code, but really, it's my fault.

Backed this out locally on pmm's running config, and am working on a proper fix in smm.
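One way an empty-string argument can derail option parsing, sketched with the standard-library getopt module. Whether PerfConfigurator parses its arguments this way is an assumption, and the option set below is invented for the example.

```python
import getopt


def parse(argv):
    # Hypothetical option set, not PerfConfigurator's real one.
    return getopt.getopt(argv, 'v', ['noChrome', 'output='])


good_opts, good_rest = parse(['-v', '--noChrome', '--output', 'local.config'])
bad_opts, bad_rest = parse(['-v', '', '--output', 'local.config'])

print(good_opts, good_rest)
print(bad_opts, bad_rest)
```

getopt stops at the first non-option argument, and '' counts as one, so in the second call --output is never seen as an option. A parser behaving like this would not learn where to write local.config.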
Instead of
    def addRunSteps(self):
        perfconf_cmd=['python',
                      "PerfConfigurator.py",
                      '-v', '-e',
                      '%s/fennec/fennec' % self.base_dir,
                      '-t', WithProperties("%(slavename)s"),
                      '--branch', self.branch,
                      '--branchName', self.branch,
                      '--activeTests', self.test,
                      '--sampleConfig', self.talos_config_file,
                      '--browserWait', str(self.browser_wait),
                      '--resultsServer', self.results_server,
                      self.nochrome,
                      '--resultsLink', '/server/collect.cgi',
                      '--output', 'local.config']

We could do
    def addRunSteps(self):
        perfconf_cmd=['python',
                      "PerfConfigurator.py",
                      '-v', '-e',
                      '%s/fennec/fennec' % self.base_dir,
                      '-t', WithProperties("%(slavename)s"),
                      '--branch', self.branch,
                      '--branchName', self.branch,
                      '--activeTests', self.test,
                      '--sampleConfig', self.talos_config_file,
                      '--browserWait', str(self.browser_wait),
                      '--resultsServer', self.results_server]
        if self.nochrome:
            perfconf_cmd.append(self.nochrome)
        perfconf_cmd.extend(['--resultsLink', '/server/collect.cgi',
                             '--output', 'local.config'])

Not sure if nochrome needs to be in a specific location in the arg.
(Assignee)

Comment 20

8 years ago
I actually had that in a previous patch.
I changed that to

if not self.nochrome:
    perfconf_cmd.remove("--noChrome")

And am about to test that.
Do you have a preference?
I'd prefer to not include it in the list over removing it.
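A cut-down comparison of the two approaches, with a shortened command list standing in for the real perfconf_cmd. The function names are illustrative only.

```python
def build_by_appending(nochrome):
    # Leave the flag out of the base list and add it only when wanted.
    cmd = ['python', 'PerfConfigurator.py', '-v']
    if nochrome:
        cmd.append('--noChrome')
    cmd.extend(['--output', 'local.config'])
    return cmd


def build_by_removing(nochrome):
    # Start with the flag present and take it out when not wanted.
    cmd = ['python', 'PerfConfigurator.py', '-v', '--noChrome',
           '--output', 'local.config']
    if not nochrome:
        # list.remove raises ValueError if the flag is ever dropped
        # from the literal above.
        cmd.remove('--noChrome')
    return cmd
```

Both produce the same list for either setting, and neither can leak an empty string into the arguments. The append version keeps the flag in one place, which fits the preference stated above.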
(Assignee)

Comment 22

8 years ago
Comment on attachment 470976 [details] [diff] [review]
fix tp4 chrome

backed out: http://hg.mozilla.org/build/buildbotcustom/rev/e248f12bb144
Attachment #470976 - Flags: checked-in+ → checked-in-
(Assignee)

Comment 23

8 years ago
Status: needs tp4 pageset, needs nochrome fix, needs baking in staging.
Assignee: aki → nobody
Summary: add tp4 pageset to n900s → fix maemo5 tp4
(Assignee)

Updated

8 years ago
Assignee: nobody → aki
Priority: P4 → P2
(Assignee)

Comment 24

8 years ago
Created attachment 479247 [details] [diff] [review]
[wip] maemo tp4 configs

Putting tp4 in the 27GB [!] /home/user/MyDocs on n900.
(Assignee)

Comment 25

8 years ago
Created attachment 479248 [details] [diff] [review]
[wip] maemo tp4 custom

Running in staging.
The download-once-in-buildbot is fine, as long as it works.
If it fails midway through the unpack step, we could be in for a lot of tp4 burning.
Attachment #470976 - Attachment is obsolete: true
(Assignee)

Comment 26

8 years ago
Trying /home/user, as MyDocs may be FAT32 or something stupid, causing tar to exit non-0.
(Assignee)

Comment 27

8 years ago
Got an exit 2 on /home/user/MyDocs tar jxvf; exit 0 on the first of the /home/user tar jxvf 's.

/home/user it is.  I'll keep running these on smm.
(Assignee)

Comment 28

8 years ago
1 browser frozen in tp4; 1 green run [!] in tp4_nochrome.
Trying some more runs.
(Assignee)

Comment 29

8 years ago
Ok, bunches of oranges.  And the one green finished in 5 minutes, which is suspicious.
(In reply to comment #27)
> Got an exit 2 on /home/user/MyDocs tar jxvf; exit 0 on the first of the
> /home/user tar jxvf 's.
> 
> /home/user it is.  I'll keep running these on smm.

cool.  MyDocs is fat32/vfat as you suggested in comment 26.

If it needs more space than /home/user has, or this makes it nearly full, we could look into making another filesystem image to mount at runtime.

Regarding comment 25, I wonder if we could add more smarts to the unpack script to delete the tarball + tp4 parent directory on failure?

The wip patches look good :)
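A possible shape for the delete-on-failure idea, as a Python sketch rather than the actual unpack step. The function name and the `tp4_name` default are assumptions; the point is only that a failed unpack leaves nothing behind, so the next run starts from a clean download.

```python
import os
import shutil
import tarfile


def unpack_or_clean(tarball_path, parent_dir, tp4_name='tp4'):
    """Unpack the pageset; on any failure remove the tarball and partial tree.

    Returns True on success, False if the unpack failed and was cleaned up.
    """
    tp4_dir = os.path.join(parent_dir, tp4_name)
    try:
        with tarfile.open(tarball_path, 'r:bz2') as archive:
            archive.extractall(parent_dir)
    except (tarfile.TarError, OSError, EOFError):
        # A corrupt or truncated download, or a full disk, lands here.
        shutil.rmtree(tp4_dir, ignore_errors=True)
        if os.path.exists(tarball_path):
            os.remove(tarball_path)
        return False
    return True
```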
what about something like
==============================================================================
#!/bin/sh -x

if [[ "x$1" == "x" ]] ; then
    echo No URL to Download
    exit 1
fi
if [[ "x$2" == "x" ]] ; then
    echo No mount point
    exit 1
fi

TARBALL='/home/user/MyDocs/tp4.tar.bz2'
PAGESET=$1
TP4_DIR=$2
IMAGE=/home/user/MyDocs/tp4_fs.ext3

if [[ ! -e $TARBALL ]] ; then
    wget $PAGESET -O $TARBALL
fi

if [[ ! -e $IMAGE ]] ; then
    dd if=/dev/zero of=$IMAGE bs=1024 count=$((1024*400)) # 400MB
    echo y | mkfs.ext3 $IMAGE
fi

mkdir -p $TP4_DIR

mount -t ext3 $IMAGE $TP4_DIR

==============================================================================
for the script to set up the tp4 pageset.  I haven't tested this script yet.  We could throw this script on build.m.o, grab a copy from an HGWeb URL, or take the super hacky route of putting it in a giant command string in buildbot (yuck!).
(In reply to comment #31)

well, that script doesn't actually untar the pageset :(

Another option would be to create the ext3 image once and download that as an ext3.gz file that gets uncompressed and installed to /home/user/MyDocs/tp4_fs.ext3
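To make the missing untar step concrete, here is an untested sketch that only plans the command sequence as argument lists (the form buildbot steps take) and runs nothing. The function name is hypothetical, and the `-F` and `-o loop` options are assumptions about what the device's mkfs.ext3 and mount need; the paths and sizes are the ones from the script above.

```python
def plan_tp4_setup(pageset_url, mount_point, tarball_exists, image_exists,
                   tarball='/home/user/MyDocs/tp4.tar.bz2',
                   image='/home/user/MyDocs/tp4_fs.ext3'):
    """Return the commands to run, in order, as lists of arguments."""
    steps = []
    if not tarball_exists:
        steps.append(['wget', pageset_url, '-O', tarball])
    if not image_exists:
        # 400MB image, same size as in the shell script.
        steps.append(['dd', 'if=/dev/zero', 'of=' + image,
                      'bs=1024', 'count=%d' % (1024 * 400)])
        steps.append(['mkfs.ext3', '-F', image])
    steps.append(['mkdir', '-p', mount_point])
    steps.append(['mount', '-t', 'ext3', '-o', 'loop', image, mount_point])
    if not image_exists:
        # The step the shell script lacks: unpack into the fresh filesystem.
        steps.append(['tar', 'jxf', tarball, '-C', mount_point])
    return steps
```

The untar only happens when the image was just created, so later runs reduce to mkdir and mount.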
(Assignee)

Comment 33

8 years ago
Created attachment 479696 [details] [diff] [review]
[wip] maemo tp4 configs

This actually loads pages since tp4 isn't chmod'ed 700 anymore (owned by user, which I assume nginx is not running as)... I was able to actually browse to the tp4 pageset, served from an n900, on my desktop browser, and load a page.

Result: segfaults, hangs, and occasionally goes partway loading tp4.
Looks like we're really running it now.
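If the permissions need fixing again after a fresh unpack, something along these lines could open the tree up for a web server running as a different user. This is a sketch under the assumption that group/other read access is enough for nginx; the function name is made up.

```python
import os
import stat


def make_world_readable(root):
    """Add group/other read (and directory traverse) bits under root."""
    dir_bits = stat.S_IRGRP | stat.S_IXGRP | stat.S_IROTH | stat.S_IXOTH
    file_bits = stat.S_IRGRP | stat.S_IROTH
    for dirpath, dirnames, filenames in os.walk(root):
        mode = stat.S_IMODE(os.stat(dirpath).st_mode)
        os.chmod(dirpath, mode | dir_bits)
        for name in filenames:
            path = os.path.join(dirpath, name)
            mode = stat.S_IMODE(os.stat(path).st_mode)
            os.chmod(path, mode | file_bits)
```

A directory that starts at 700 ends up at 755 and a 600 file at 644, while existing owner bits are left alone.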
Attachment #479247 - Attachment is obsolete: true
(Assignee)

Comment 34

8 years ago
Created attachment 479697 [details] [diff] [review]
[wip] maemo tp4 custom
Attachment #479248 - Attachment is obsolete: true
(Assignee)

Comment 35

8 years ago
We can either run this and hide the perma-orange, or not go further.
(Assignee)

Comment 36

8 years ago
Created attachment 481991 [details] [diff] [review]
tp4 configs

Let's get these checked in & running in staging.
Attachment #479696 - Attachment is obsolete: true
Attachment #481991 - Flags: review?(jhford)
(Assignee)

Comment 37

8 years ago
Created attachment 481992 [details] [diff] [review]
tp4 custom
Attachment #479697 - Attachment is obsolete: true
Attachment #481992 - Flags: review?(jhford)
Comment on attachment 481992 [details] [diff] [review]
tp4 custom

Looks good! one small nit, can we change

>+                     'wget '+self.tp4_tarball+' -O tp4.tar.bz2; fi',

to 

>+                     'wget %s -O tp4.tar.bz2; fi' % self.tp4_tarball,

in

>+    def addTarballTp4Steps(self):


r=me with that change.
Attachment #481992 - Flags: review?(jhford) → review+
Attachment #481991 - Flags: review?(jhford) → review+
(Assignee)

Comment 41

8 years ago
From our end, this is running in staging and explicitly removed from production configs.

DougT was able to replicate the crash+hang; once the code is fixed to run Tp4 all the way through, we can enable in production (remove 2 lines).
Status: REOPENED → RESOLVED
Last Resolved: 8 years ago → 8 years ago
Resolution: --- → FIXED
http://hg.mozilla.org/build/buildbotcustom/rev/1351003e0957

This changeset is broken.  We haven't hit the bustage because buildbotcustom on production mobile master wasn't updated after landing (or since oct5) and we aren't running non-tp4 tests in staging where the patch was landed.

The specifically broken bits are:


    1.19 -        self.nochrome = '' if nochrome is None else '--noChrome'
    1.20 +        self.nochrome = nochrome

    1.87 @@ -266,9 +291,10 @@ class MobileTalosFactory(BuildFactory):
    1.88                        '--sampleConfig', self.talos_config_file,
    1.89                        '--browserWait', str(self.browser_wait),
    1.90                        '--resultsServer', self.results_server,
    1.91 -                      self.nochrome,
    1.92                        '--resultsLink', '/server/collect.cgi',
    1.93                        '--output', 'local.config']
    1.94 +        if self.nochrome:
    1.95 +            perfconf_cmd += ['--noChrome']
    1.96          runtest_cmd=["python", "run_tests.py", "--noisy",
    1.97                       "--debug", "local.config"]
    1.98          self.addStep(ShellCommand(

Fix coming in bug 602120
Status: RESOLVED → REOPENED
Depends on: 602120
Resolution: FIXED → ---
sorry, didn't mean to reopen
Status: REOPENED → RESOLVED
Last Resolved: 8 years ago → 8 years ago
Resolution: --- → FIXED
Product: mozilla.org → Release Engineering