Closed Bug 940690 Opened 11 years ago Closed 10 years ago

Get existing metrofx talos tests running on release/project branches

Categories: Release Engineering :: General (defect)
Hardware: x86, Windows 8
Priority: Not set
Severity: normal

Tracking: Not tracked

RESOLVED WONTFIX

People

(Reporter: jlund, Unassigned)

References

Details

(Whiteboard: [feature] [leave open])

Attachments

(7 files, 6 obsolete files)

13.25 KB, patch; jlund: review-
3.02 KB, patch; jmaher: review+
43.52 KB, patch; jmaher: review+
699 bytes, patch; jmaher: review+
542 bytes, patch; jmaher: review-
1.33 KB, patch; armenzg: review+, jlund: checked-in+
1.33 KB, patch; armenzg: review+, jlund: checked-in+

Currently we have 4 metro talos suites: other-metro, svgr-metro, tp5o-metro, and dromaeojs-metro, which are disabled by default and only enabled on Cedar.

We should enable these by default so they run on release/project branches (eg: m-c, aurora, etc).
Depends on: 940699
before we roll this out, we need to have a unique platform name so our win8 and metro results don't get mixed up on graph server and datazilla.
(In reply to Joel Maher (:jmaher) from comment #1)
> before we roll this out, we need to have a unique platform name so our win8
> and metro results don't get mixed up on graph server and datazilla.

So the bug this depends on was resolved as of today, so I am starting to take a look at this.

The last time there was talk of whether to do a new platform for metro or not, I believe the sheriffs and jimm said to leave it inside win 8 like we do with metro chrome unittests. I guess this is from the buildbot and tbpl point of view. As for graph server and datazilla (I'm not too familiar with these), I am trying to figure out how to specify an additional platform. I am seeing some control flow like this:

http://mxr.mozilla.org/build/source/talos/talos/output.py#580
  def test_machine(self):

http://mxr.mozilla.org/build/source/talos/talos/output.py#436
  machine = self.test_machine()

http://mxr.mozilla.org/build/source/talos/talos/output.py#473
  collection = DatazillaResultsCollection(machine_name=machine['name'],


Which at some point is used here?:
http://mxr.mozilla.org/build/source/talos/talos/compare.py#391
    datazilla, pgodatazilla, xperfdata = getDatazillaCSET(options.revision, branch)

http://mxr.mozilla.org/build/source/talos/talos/compare.py#139
    platform = getDatazillaPlatform(testrun['test_machine']['os'], 

http://mxr.mozilla.org/build/source/talos/talos/compare.py#105
    def getDatazillaPlatform(os, platform, osversion, product):
    ...
    ...
        elif osversion == '6.2.9200':
            platform = 'Win8'


So from the above it seems we use mozinfo to determine platform information:
https://github.com/mozilla/mozbase/blob/master/mozinfo/mozinfo/mozinfo.py

Which itself uses the built-in os, sys, and platform modules. I am not sure at which point to change the platform that is given. I suppose before we init DatazillaResultsCollection()?:
https://github.com/mozilla/datazilla_client/blob/master/dzclient/client.py#L86

But to change it at any point we would have to let mozinfo decide on the platform and then, if osversion is '6.2.9200' or platform is 'Win8', change it to some known metro value. In which case, talos would have to be aware that it is running the metro browser (right now we don't have a --immersive-mode or whatever, but we do know the --binary has metrotestharness.exe in it).

I am also not sure if datazilla has a set of predefined platforms you push results to, or if it creates new ones as it gets posts/pushes?

Joel, would you be able to help me with this or direct me to someone who would be able to assist?
Flags: needinfo?(jmaher)
also, I don't think we ever use cached datazilla responses as I think there may be a typo on this line:
http://mxr.mozilla.org/build/source/talos/talos/compare.py#126
The cached issue is only for an out-of-band tool; I have it fixed in a local patch, although I really should get it checked in.

We had to solve this problem for android and tegras, and we changed the title to have a .n so graph server would work.  This was prior to datazilla:
http://hg.mozilla.org/build/talos/file/0ec5e19808f6/talos/remotePerfConfigurator.py#l44  (this won't work anymore, but it shows we have done it before)

Let's key off the executable name, and if it is metroharness.exe we can adjust the mozinfo block to support win8.m. We will need to update the graph database:
http://hg.mozilla.org/graphs/file/15609490c297/sql/data.sql

and make sure that the values going there and to datazilla are what we want.
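
As a rough sketch of the keying idea (a hypothetical helper, not the actual talos/mozinfo change; names are assumptions):

```python
import os

def metro_platform(browser_path, base_platform):
    """Return a metro-suffixed platform name when the configured binary is the
    metro test harness; otherwise return the base platform unchanged."""
    exe = os.path.basename(browser_path).lower()
    if exe.startswith("metrotestharness"):
        return base_platform + ".m"   # e.g. "win8" -> "win8.m"
    return base_platform

print(metro_platform(
    r"C:\slave\test\build\application\firefox\metrotestharness.exe", "win8"))
# -> win8.m
```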
Flags: needinfo?(jmaher)
First time looking at the graphs repo. I would guess we have this data.sql file to insert more entries as we need them? I'm guessing this because I do not see any t-w864-ix-{0-110} rows in the machines table in this file, nor a windows 6.2 (id 31?) entry in the os_list table. Yet, both must be there if lines like:
http://graphs.mozilla.org/graph.html#tests=[[230,26,31]] are working.

I added the '.m' to the machine names and I'll submit the patch I have for talos next.
Attachment #8348552 - Flags: feedback?(jmaher)
So again, I'm not too familiar with all of this, so bear with me as I spill my logic verbosely so you can find the faults in it :)

So first I am appending '.m' to the title, if metrotestharness is in browser_config['browser_path']. This I think will do the trick once this dict makes its way through post():
http://mxr.mozilla.org/build/source/talos/talos/output.py#170

I am then also changing the value of osversion here:
http://mxr.mozilla.org/build/source/talos/talos/output.py#601

I am changing this because when we use the datazilla client:
http://mxr.mozilla.org/build/source/talos/talos/output.py#473
https://github.com/mozilla/datazilla_client/blob/master/dzclient/client.py#L86

it looks like currently for win8, datazilla uses a combination of the os ('win') and os_version ('6.2.9200') to make up its list of platforms:
https://datazilla.mozilla.org/?project=talos

Finally, I am also adjusting compare.py. I am not 100% on how getDatazillaPlatform() is used, but it seems like it uses test_machine() as well and will need to know about the additional osversion I added?
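
In rough pseudo-form, the two adjustments described above amount to something like this (an approximation of the idea, not the actual diff; names are assumptions):

```python
def adjust_for_metro(title, os_version, browser_path):
    """Suffix the graph-server title and the datazilla os_version with '.m'
    when the run is driven by metrotestharness (approximation of the idea,
    not the actual talos patch)."""
    if "metrotestharness" in browser_path:
        title += ".m"        # e.g. t-w864-ix-001 -> t-w864-ix-001.m
        os_version += ".m"   # e.g. 6.2.9200 -> 6.2.9200.m
    return title, os_version

print(adjust_for_metro(
    "t-w864-ix-001", "6.2.9200",
    r"C:\slave\test\build\application\firefox\metrotestharness.exe"))
# -> ('t-w864-ix-001.m', '6.2.9200.m')
```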
Attachment #8348564 - Flags: feedback?(jmaher)
Comment on attachment 8348552 [details] [diff] [review]
bug_940690_metro_talos_major_releases-graphs.diff

Review of attachment 8348552 [details] [diff] [review]:
-----------------------------------------------------------------

why do we have w64-ix-slave107 vs t-w864-ix-107.m?

I would assume it would be the old name with a .m!  Do machines such as  w64-ix-slave107 even exist?
Attachment #8348552 - Flags: feedback?(jmaher) → feedback+
Comment on attachment 8348564 [details] [diff] [review]
bug_940690_metro_talos_major_releases-talos.diff

Review of attachment 8348564 [details] [diff] [review]:
-----------------------------------------------------------------

::: talos/output.py
@@ +603,5 @@
> +            #    datazilla platform
> +            # DatazillaResultsCollection takes an os and os_version.
> +            # Differentiate between win8 non-metro and win8 metro
> +            # by again appending an '.m' suffix to the os_version
> +            if self.results.title.endswith(".m"):

and not version.endswith('.m')
Attachment #8348564 - Flags: feedback?(jmaher) → feedback+
(In reply to Joel Maher (:jmaher) from comment #7)
> Comment on attachment 8348552 [details] [diff] [review]
> bug_940690_metro_talos_major_releases-graphs.diff
> 
> Review of attachment 8348552 [details] [diff] [review]:
> -----------------------------------------------------------------
> 
> why do we have w64-ix-slave107 vs t-w864-ix-107.m?
> 
> I would assume it would be the old name with a .m!  Do machines such as 
> w64-ix-slave107 even exist?

So according to slave health and slavealloc these machines exist and are rev2 6.1 NT machines. I believe all the win8 machines have the t-w864-ix-{0..} format. Again I am not sure why they are not in this data.sql file but I'd guess they were entered another way? On second look, it seems we have 130 win8 machines. I'll update the patch and get someone from releng to confirm all is well.
Hal, I've seen you've worked on this repo a few times. I am trying to add some metro talos results to graph server and datazilla. The problem is I need to differentiate between win8 metro/non-metro. My understanding is that the title in the talos run is a 1:1 mapping to a machine name in the graph sql DB; for android we did something similar in the past (Bug 708793 - provide way to override platform name in POST).

So I am adding more machines with a '.m' suffix, and then in Talos, when we run the metro browser, I add the same '.m' suffix to the title (see the talos patch I have in this bug for reference).

What I am not sure about is why other win8 machines are not in this data.sql file? I see w64-ix-slave{01..} machines but I think those are 6.1 NT machines. I'd assume the win8 machines were entered another way? I determined that the os_list table id we want is 31. This was based off poking in:
http://hg.mozilla.org/graphs/file/15609490c297/sql/schema.sql#l3
http://hg.mozilla.org/graphs/file/15609490c297/server/pyfomatic/collect.py#l274
and then finding the log line:
    "http://graphs.mozilla.org/graph.html#tests=[[287,94,31]]"
from:
    https://tbpl.mozilla.org/php/getParsedLog.php?id=32050416&full=1&branch=mozilla-central

1) Is this all I need to add the machine names to the database?
2) Will I have to run this file somehow once it's landed? 
3) Most importantly, am I going about this the right way?

Feel free to defer me to someone else if I should be asking someone else :)
Attachment #8348552 - Attachment is obsolete: true
Attachment #8348971 - Flags: review?(hwine)
> > +            if self.results.title.endswith(".m"):
> 
> and not version.endswith('.m')

Thanks, will test this on my dev master once graph patch lands
Comment on attachment 8348971 [details] [diff] [review]
bug_940690_metro_talos_major_releases-graphs-171213.diff

Whoops, this is wrong. I'd basically be achieving the same thing: I'd grab the machine name but it would point to the same os in os_list. I'll need to add another row to the os_list table and point the '.m' win8 machines to that id.

How did we know that the os added (Android 2.2 (Native)) would have id 23 here in this patch?:
https://bug708793.bugzilla.mozilla.org/attachment.cgi?id=582029
Attachment #8348971 - Flags: review?(hwine) → review-
This patch will add a new OS to os_list. It will also add 130 win8 machines with the '.m' suffix for metro. These 130 will be a mirror of our actual 130 machines in our build system.

Note: the WINNT 6.2 and t-w864-ix-{001, 130} entries are not in data.sql as I am only using this file to document/review what I will be manually entering on the staging + production sql instance.
Attachment #8350203 - Flags: review?(jmaher)
Attachment #8350203 - Flags: review?(jmaher) → review+
Hi sheeri,

So as per our convo this morning, I was about to insert my prev patch from this bug into production when I realized that staging is not only missing one row (WINNT 5.1 (ix)), but its highest auto-incremented id is off by more than just that one-row difference. It turns out there is one os_list row with id = 39.

So to clarify:
staging has: 30 rows in os_list with highest auto-inc ID of 36
production has: 31 rows in os_list with highest auto-inc ID of 39
https://pastebin.mozilla.org/3811391    :(

I only realized this after I tested everything out in staging and moved on to production. So now staging has "WINNT 5.1 (ix)" with id = 37. This is what we talked about doing before and it syncs with production. But it also has "WINNT 6.2 x64 (metro)" with id = 38, along with win8 metro machines with an os_list foreign key of 38.

My plan is to just bump the auto-increment on os_list in production. This I believe would set the win8 metro os_list row to 40. Then set all the FKs in machines to match.

I guess this would again bring staging and production further out of sync but it already differs with the id of 39 for mac 10.9 in production. I did not want to assume without querying the DB gods first.

I'll do what you recommend. :)
Attachment #8348971 - Attachment is obsolete: true
Attachment #8350203 - Attachment is obsolete: true
Attachment #8350435 - Flags: review?(scabral)
Hi Jordan,

Your statements in Comment 14 appear to be a little different than expected at this point in time.

In Production I have:

mysql> select * from os_list order by id;
+----+-----------------------------------+
| id | name                              |
+----+-----------------------------------+
|  1 | WINNT 5.1                         |
|  2 | WINNT 6.0                         |
|  3 | MacOSX Darwin 8.8.1               |
|  4 | MacOSX Darwin 9.2.2               |
|  5 | Ubuntu 7.10                       |
|  6 | CentOS release 5 (Final)          |
|  7 | MacOSX 10.5.2                     |
|  8 | WINNT 5.2                         |
|  9 | Nokia n810                        |
| 10 | Ubuntu 9.04 (x64)                 |
| 12 | WINNT 6.1                         |
| 13 | MacOSX 10.5.8                     |
| 14 | Fedora 12 - Constantine           |
| 15 | Fedora 12 x64 - Constantine       |
| 16 | Nokia n900                        |
| 17 | MacOSX 10.6.2 (rev3)              |
| 18 | CentOS (x86_64) release 5 (Final) |
| 19 | WINNT 6.1 x64                     |
| 20 | Android 2.2 (Native)              |
| 21 | MacOSX 10.6 (rev4)                |
| 22 | MacOSX 10.7                       |
| 23 | Android 2.2                       |
| 24 | MacOSX 10.8                       |
| 25 | WINNT 6.1 (ix)                    |
| 27 | Windows Server 2008 R2 (x64)      |
| 29 | Android 4.0.4                     |
| 31 | WINNT 6.2 x64                     |
| 33 | Ubuntu HW 12.04                   |
| 35 | Ubuntu HW 12.04 x64               |
| 37 | WINNT 5.1 (ix)                    |
| 39 | MacOSX 10.9                       |
+----+-----------------------------------+
31 rows in set (0.00 sec)


In Stage I have:

mysql> select * from os_list order by id;
+----+-----------------------------------+
| id | name                              |
+----+-----------------------------------+
|  1 | WINNT 5.1                         |
|  2 | WINNT 6.0                         |
|  3 | MacOSX Darwin 8.8.1               |
|  4 | MacOSX Darwin 9.2.2               |
|  5 | Ubuntu 7.10                       |
|  6 | CentOS release 5 (Final)          |
|  7 | MacOSX 10.5.2                     |
|  8 | WINNT 5.2                         |
|  9 | Nokia n810                        |
| 10 | Ubuntu 9.04 (x64)                 |
| 12 | WINNT 6.1                         |
| 13 | MacOSX 10.5.8                     |
| 14 | Fedora 12 - Constantine           |
| 15 | Fedora 12 x64 - Constantine       |
| 16 | Nokia n900                        |
| 17 | MacOSX 10.6.2 (rev3)              |
| 18 | CentOS (x86_64) release 5 (Final) |
| 19 | WINNT 6.1 x64                     |
| 20 | Android 2.2 (Native)              |
| 21 | MacOSX 10.6 (rev4)                |
| 22 | MacOSX 10.7                       |
| 23 | Android 2.2                       |
| 24 | MacOSX 10.8                       |
| 25 | WINNT 6.1 (ix)                    |
| 27 | Windows Server 2008 R2 (x64)      |
| 29 | Android 4.0.4                     |
| 31 | WINNT 6.2 x64                     |
| 33 | Ubuntu HW 12.04                   |
| 35 | Ubuntu HW 12.04 x64               |
| 36 | MacOSX 10.9                       |
| 37 | WINNT 5.1 (ix)                    |
| 38 | WINNT 6.2 x64 (metro)             |
+----+-----------------------------------+
32 rows in set (0.01 sec)

The differences between them:

- In Prod, MacOSX 10.9 has ID 39. In stage it's 39.
- In Stage, we have an extra value: ID 38 of  WINNT 6.2 x64 (metro).
- Auto Increment values differ between prod/stage.

As for the extra value and auto_increment problems, I've solved them both with the below sql:

insert into os_list values (38, 'WINNT 6.2 x64 (metro)'); -- prod only (this already existed on stage)
ALTER TABLE os_list AUTO_INCREMENT=40; -- run on both servers.

You're now safe to run the rest of your inserts on both prod and stage.

This will leave everything working and put you in a good spot. IDs will line up and the only difference becomes the difference in ID for MacOSX 10.9.

We can fix that if you're concerned about it, but we'd of course have to identify all the dependent areas that lookup that ID and adjust them.
(In reply to Brandon Johnson [:cyborgshadow] from comment #15)

> - In Prod, MacOSX 10.9 has ID 39. In stage it's 39.


Correct that to be: In stage it's 36.
> You're now safe to run the rest of your inserts on both prod and stage.
> 
> This will leave everything working and put you in a good spot. IDs will line
> up and the only difference becomes the difference in ID for MacOSX 10.9.

Thank you very much for taking time away to do this :)
Comment on attachment 8350435 [details] [diff] [review]
bug_940690_metro_talos_major_releases-graphs-191213-3.diff

thanks for help on this! :)
Attachment #8350435 - Flags: review?(scabral) → review-
Depends on: 956384
Blocks: metrobacklog
Whiteboard: [feature] p=0
Were your inserts completed after the update? Just checking in.
(In reply to Brandon Johnson [:cyborgshadow] from comment #19)
> Were your inserts completed after the update? Just checking in.

perfect. Graph server accepted the "new" platform. Thx for checking in.
status update --

Both Graph Server and Datazilla differentiate between metro and non-metro firefox. For datazilla, results were not showing up because talos did not recognize the appInfo ID for metro: {99bceaaa-e3c6-48c1-b981-ef9b46b67d60}

Results for test runs can be seen here:
Datazilla -- https://datazilla.mozilla.org/?start=1388459083&stop=1389063883&product=Firefox&repository=Cedar&os=win&os_version=6.2.9200.m&test=tsvgr_opacity&project=talos

Graph Server -- http://graphs.mozilla.org/graph.html#tests=[[72,26,38]]&sel=1388342640120,1388947440120&displayrange=7&datatype=running

Patches incoming.
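
For context, the fix boils down to teaching talos one more appInfo-ID-to-product entry; a hypothetical illustration (the dict, function, and product names are assumptions, not the actual talos code):

```python
APP_IDS = {
    "{ec8030f7-c20a-464f-9b0e-13a3a9e97384}": "firefox",       # desktop Firefox
    "{99bceaaa-e3c6-48c1-b981-ef9b46b67d60}": "metrofirefox",  # metro Firefox (this bug)
}

def product_for_app_id(app_id):
    # fail loudly on an unknown ID so a missing entry is obvious in the log
    if app_id not in APP_IDS:
        raise ValueError("unrecognized appInfo ID: %s" % app_id)
    return APP_IDS[app_id]

print(product_for_app_id("{99bceaaa-e3c6-48c1-b981-ef9b46b67d60}"))  # metrofirefox
```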
Same patch as before, with the feedback addressed and the extra appInfo ID for firefox.
Attachment #8348564 - Attachment is obsolete: true
Attachment #8356421 - Flags: review?(jmaher)
Comment on attachment 8356421 [details] [diff] [review]
940690_metro_talos_major_releases-talos-060114.diff

Review of attachment 8356421 [details] [diff] [review]:
-----------------------------------------------------------------

thanks for the good comments.  please land this on the talos repo, I have some other changes I am looking to land today/tomorrow.
Attachment #8356421 - Flags: review?(jmaher) → review+
this has been pushed to default: https://hg.mozilla.org/build/talos/rev/8c5f2725fbdd
No longer depends on: 956384
The addon blocker is fixed. Datazilla/graph server patches have landed in talos repo.

I think now all we need to do is update talos.json to at least 8c5f2725fbdd (my talos patches). This would enable graphserver/datazilla results for metro and update metrotestharness.exe with the timeout patch. These are two things we should add to cedar first before releasing on all branches.

Jmaher, is it safe to update talos repo or are we still testing/holding off for results? Is there something I should do here?
Flags: needinfo?(jmaher)
I have updated talos.json as of this morning.   I needed to make sure I had 6 hours of time to be around in case there were issues with the patch (i.e. land and disconnect is not my idea of responsible).

This is on inbound now, we can see it roll through the repositories in a short amount of time.
Flags: needinfo?(jmaher)
> This is on inbound now, we can see it roll through the repositories in a
> short amount of time.

Perfect! I'll keep an eye on cedar for results.
Armen, I think we are good to enable our metro talos tests. This patch will enable the suites across all project/release branches by default. 

I have some concerns here:

1) These suites can be temperamental and not pass every single time. Will that have consequences we do not want? I am not sure if we have branches that expect 100% green? Are there branches this patch will enable that we don't want enabled?

2) Metro talos suites require a very recent talos.json from m-c (http://hg.mozilla.org/mozilla-central/diff/c55b3d5314eb/testing/talos/talos.json). I think we will need to add a patch to the release branches that includes this talos.json before enabling the buildbot patches? Are there other branches that will need a patch for talos.json besides 'mozilla-aurora', 'mozilla-beta', and 'mozilla-release'?

Thanks for patience.
Attachment #8360870 - Flags: feedback?(armenzg)
> Armen, I think we are good to enable our metro talos tests. This patch will
> enable the suites across all project/release branches by default. 

These need to roll out with the browser, which is currently sitting on aurora and will roll out with fx28. Not sure what happens if you enable on beta and release without the browser present in the tree?
we would need to ensure all talos updates are on the branches we turn this on for.  Do we need to update talos on Aurora, or will we stick with m-c based branches and follow the trains?
(In reply to Joel Maher (:jmaher) from comment #31)
> we would need to ensure all talos updates are on the branches we turn this
> on for.  Do we need to update talos on Aurora, or will we stick with m-c
> based branches and follow the trains?

Today mc and aurora, add beta after the feb 4th merge, add release after 3-17.
once these are live on m-c, we should work on getting the talos.json changes into aurora as much as we can.
Attachment #8360870 - Flags: feedback?(armenzg)
We have gecko_version to control on which branches a suite will be enabled for (see config.py).
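
Conceptually, that gating looks something like the sketch below (a hypothetical illustration, not the real buildbot-configs config.py schema; key names are assumptions):

```python
SUITES = {
    # hypothetical entry: only enable on branches at or above gecko 29
    "dromaeojs-metro": {"enable_by_default": True, "gecko_version": 29},
}

def suite_enabled(suite, branch_gecko_version):
    cfg = SUITES[suite]
    if not cfg.get("enable_by_default"):
        return False
    min_gecko = cfg.get("gecko_version")
    return min_gecko is None or branch_gecko_version >= min_gecko

print(suite_enabled("dromaeojs-metro", 29))  # True  (m-c based branches)
print(suite_enabled("dromaeojs-metro", 27))  # False (older release branches)
```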
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #34)
> We have gecko_version to control on which branches a suite will be enabled
> for (see config.py).

Ah OK, thanks Armen. So I see it as we have two options.

1) enable only branches that use gecko_version 29 (m-c)
   patch: https://pastebin.mozilla.org/4030868
   branches that would be affected/enabled: https://pastebin.mozilla.org/4030823

2) enable only on the m-c branch.
   patch: https://pastebin.mozilla.org/4030857
   branches that would be affected/enabled: m-c

Jimm, Jmaher: just to be crystal clear, we can enable everything that uses m-c (gecko 29), but there are a lot. Is this what we want, or do we literally only want m-c (option 2) for now?
Flags: needinfo?(jmathies)
Flags: needinfo?(jmaher)
we can do this on any gecko 29 based branch right now.

I would like to see that baked in for a week before rolling down to Aurora prior to the uplift on the 4th.
Flags: needinfo?(jmaher)
Flags: needinfo?(jmathies)
Thanks jmaher, jimm.

Here is a patch to enable all m-c based branches.

Armen, there were a few ways I thought about doing this but explicitly adding them like I did seemed best?
Attachment #8360870 - Attachment is obsolete: true
Attachment #8361856 - Flags: review?(armenzg)
Comment on attachment 8361856 [details] [diff] [review]
940690_metro_talos_bbotcfgs_geck29_160114-3.diff

Review of attachment 8361856 [details] [diff] [review]:
-----------------------------------------------------------------

This seems right.

Did you use builder_list.py from braindump to compare that it only adds the builders you care about? Or similar approach to check it?
Attachment #8361856 - Flags: review?(armenzg) → review+
> This seems right.
> 
> Did you use builder_list.py from braindump to compare that it only adds the
> builders you care about? Or similar approach to check it?

Cool thanks! I ran builder_list.py off a clean bbot-cfgs and then with my patch. The vimdiff: https://pastebin.mozilla.org/4030823 (see comment 35) shows the branches that would add these metro talos builders. There are a lot.

Note: There were other parts of the vimdiff that I did not include in the pastebin. These showed new totals of builders for a platform. eg: 

- t-w864-ix-046 has 1361 builders; limit is 2036; 66 percent of max
+ t-w864-ix-046 has 1493 builders; limit is 2036; 73 percent of max

I'd assume all these new builders are a result of creating 4 new talos builders for every branch based off gecko 29.
After discussion and review of https://wiki.mozilla.org/Sheriffing/Job_Visibility_Policy, it seems this won't meet the requirements to enable on m-c based branches. We will need <5% intermittent failures. Right now, looking at https://tbpl.mozilla.org/?tree=Cedar (since Jan 9th), we are nowhere close to that. Granted, there are only a few data points.

It seems the majority of failures continue to revolve around timing out after 3600s while trying to launch the metro browser. I recall we made improvements to this while changing display/sleep/screensaver timeouts, but we haven't truly found the core culprit. I don't know if the metrotestharness.exe timeout update affected anything. I am trying to find the differences between the way mochitest and talos launch the metro browser (they both use ProcessHandler), but I will need aid from those who work on the talos, mochi, and mozinfo scripts.

I will coordinate a discussion with ateam/windev and maybe even a brief vidyo session to hammer this out when I am back from PTO this thurs/friday. Jmaher said he would help out after he clears up some work next week.
Are these messages relevant?
ProcessManager NOT managing child processes
ProcessManager UNABLE to use job objects to manage child processes
09:33:23     INFO -  INFO : Initialization of new profile failed

It might be worth triggering jobs on Cedar, seeing which fail, disable the slaves that had a failure from slavealloc and look at the Windows logs to see if anything shows up around the time that a failure happened.

It might also be worth forcing a screenshot before the talos job starts and another after it ends.
We might be able to see Windows error prompts.
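
If we go the screenshot route, something as simple as the sketch below might work on the Windows slaves (assuming Pillow/PIL is available there; the output path is illustrative):

```python
import time
from PIL import ImageGrab  # assumes Pillow is installed on the test slave

def capture(tag):
    # grab the whole screen and stash it next to the other talos artifacts
    path = r"C:\slave\test\talos-screenshot-%s-%d.png" % (tag, int(time.time()))
    ImageGrab.grab().save(path)
    return path

capture("before")
# ... run the talos job ...
capture("after")
```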
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #41)
> Are these messages relevant?
> ProcessManager NOT managing child processes
> ProcessManager UNABLE to use job objects to manage child processes
> 09:33:23     INFO -  INFO : Initialization of new profile failed
> 
> It might be worth triggering jobs on Cedar, seeing which fail, disable the
> slaves that had a failure from slavealloc and look at the Windows logs to
> see if anything shows up around the time that a failure happened.
> 
> It might also be worth forcing a screenshot before the talos job starts and
> another after it ends.
> We might be able to see Windows error prompts.

We could also try to force a crash using crashinject here like we do in mochitests. The mystery here is what's going wrong at the end of these runs -

https://tbpl.mozilla.org/php/getParsedLog.php?id=33337832&tree=Cedar&full=1#error0

Is this the machine? The browser hanging? Test harness problems? How can we get at one of these devices after this occurs?
(In reply to Jim Mathies [:jimm] from comment #42)
> https://tbpl.mozilla.org/php/getParsedLog.
> php?id=33337832&tree=Cedar&full=1#error0
> 
> Is this the machine? The browser hanging? Test harness problems? How can we
> get at one of these devices after this occurs?

gbrown: do you think we need to try the unbuffered mode in here?

I think what you saw affecting Android jobs might be the same thing we're seeing at times on desktop unit test jobs.
Flags: needinfo?(gbrown)
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #43)
> gbrown: do you think we need to try the unbuffered mode in here?

It looks like unbuffered mode is already being used:

> 'c:/mozilla-build/python27/python' '-u' 'scripts/scripts/talos_script.py' ...

But if I am reading that wrong, or there is some other place where python is being called without -u, and that process is generating output that is ultimately being collected for the log, then certainly I would recommend using -u -- it has helped get more accurate logs for Android hangs, and did not introduce any new problems.
Flags: needinfo?(gbrown)
> > Are these messages relevant?
> > ProcessManager NOT managing child processes
> > ProcessManager UNABLE to use job objects to manage child processes
From what I gather, these mozprocess messages are normal and occur in other builds. I've been told they have not been proven to be the cause of any issues (yet)

> > 09:33:23     INFO -  INFO : Initialization of new profile failed
Yes, that is not good. We rarely see these now. From what I can tell, it's not the main issue.
 
> > It might be worth triggering jobs on Cedar, seeing which fail, disable the
> > slaves that had a failure from slavealloc and look at the Windows logs to
> > see if anything shows up around the time that a failure happened.

I can reach timeouts on browser launch on my dev master, so I don't see this as a 'bad machines' problem as much as an issue with one of the scripts.

> > It might also be worth forcing a screenshot before the talos job starts and
> > another after it ends.
> > We might be able to see Windows error prompts.

cool. I am also going to watch via vnc throughout a number of test builds today

> We could also try to force a crash using crashinject here like we do in
> mochitests. This mystery here is what's going wrong at the end of these runs

ahh. ya, looking at mochitests, something like this would be helpful:
http://mxr.mozilla.org/mozilla-central/source/testing/mochitest/runtests.py#699

> 
> https://tbpl.mozilla.org/php/getParsedLog.
> php?id=33337832&tree=Cedar&full=1#error0
> 
> Is this the machine? The browser hanging? Test harness problems? How can we
> get at one of these devices after this occurs?

I am not sure about that specific suite failing. That one seems rare as it happens after the metro browser launches during the test output. I *think* these ones are the predominant failures:
https://tbpl.mozilla.org/php/getParsedLog.php?id=33150302&full=1&branch=cedar
https://tbpl.mozilla.org/php/getParsedLog.php?id=33289400&full=1&branch=cedar
https://tbpl.mozilla.org/php/getParsedLog.php?id=33339900&full=1&branch=cedar
and maybe:
https://tbpl.mozilla.org/php/getParsedLog.php?id=33290265&full=1&branch=cedar

but ya, I am not sure if it's the machine, browser, or harness.  My guess is that since we are doing this successfully with mochitests and we call the talos script from mozharness the same way, I don't think it's the metroharness.exe or the machine. I feel it is talos itself or the way it uses mozinfo.

I am going to hack my testing machine so that it will not reboot after a hang and then try to grab some relevant state. will also look into trying to implement something like mochi's killAndGetStack() crashinject.

Will post findings later today.
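
As a rough illustration of the crashinject idea from above (the command-line usage and path are assumptions based on how mochitest is described as driving it, not verified here):

```python
import subprocess

def force_crash_for_stack(pid, crashinject=r"C:\slave\test\build\tests\bin\crashinject.exe"):
    """Ask crashinject to crash the hung browser so the crash reporter gives
    us a minidump/stack, instead of silently killing the process."""
    status = subprocess.call([crashinject, str(pid)])
    if status != 0:
        print("crashinject failed with status %d; falling back to a plain kill" % status)
    return status
```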
> I can reach timeouts on browser launch on my dev master so i don't see this
> as a 'bad machines' as much as an issue with one of the scripts.
> 

I may have been wrong about my own dev master results

tl;dr:
   1) I am not sure about last week's Cedar builds being the best for evaluating intermittent failures
   2) my own dev master seems to fail rarely (consistent passes for every suite bar talos-other). WRT talos-other, it seems to be a process holding a path permission/lock, where I need expertise in talos/mozinfo



So I'm actually having a really hard time hitting timeouts on my dev master. Going to try some older revisions to see if something was causing regressions in m-c from last week.

On a side note, I am basing my intermittent-failure numbers on results in Cedar. I am not sure if this is the best place for accuracy. Looking at the last few days on cedar (the days I based my metro results on), the builds are riddled with orange and red for many builds/tests. For example, metro mochitest chrome failed 9 times out of 13 runs since Jan 10th. There are many other consistent failures.

Going from my own dev machine, my metro talos builds seem to be passing quite consistently with talos-other ("a11yr", "ts_paint") being the only one failing a few times. So maybe we should be going by that?

These few talos-other failures are demonstrated by this uploaded log.
http://people.mozilla.org/~jlund/metro-talos-other-fail-processLock-220114.txt

I vnc'd and watched such a failure and noticed the browser open/close frequently and in quick succession. That is normal for this suite, but given the way metroharness.exe and the browser behave, I think a process hasn't been killed by the time another process requires the profile path or browser_output.txt.

I believe these are related to findings way back:
https://bugzilla.mozilla.org/show_bug.cgi?id=897420#c16
https://bugzilla.mozilla.org/show_bug.cgi?id=897420#c62

So maybe 1) we use output from metrotestharness like mochi does, or 2) we figure out a way to make sure things like the profile path/browser_output.txt file never get locked (rough sketch at the end of this comment).

Joel, you mentioned a while back, with regard to permission/locks, that this is something you would look into. I know your plate is full, but maybe I could get a hand with this when time allows, or else pass me along to someone else who knows talos/mozinfo well?
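
For option 2 above, a generic best-effort guard against the lock race might look like this (a sketch only; not something talos does today, and Windows sharing modes mean it is only an approximation):

```python
import time

def wait_until_writable(path, timeout=30, poll=0.5):
    """Poll until we can open the file for writing, i.e. until the previous
    browser/harness process has (probably) let go of it."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            with open(path, "a"):
                return True
        except IOError:
            time.sleep(poll)
    return False

# e.g. before reusing the output file for the next cycle:
# wait_until_writable(r"C:\slave\test\build\browser_output.txt")
```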
> These few talos-other failures are demonstrated by this uploaded log.
> http://people.mozilla.org/~jlund/metro-talos-other-fail-processLock-220114.
> txt

This specific failure looks like a combination of things:

1) a browser startup failure -

bad startup:

16:50:38     INFO -  MetroAppShell::Run
16:50:38     INFO -  mozilla::widget::winrt::FrameworkView::ActivateView
16:50:38     INFO -  Resize: 0.000000 0.000000 1600.000000 1200.000000
(nothing here)
16:50:43     INFO -  mozilla::widget::winrt::ProcessManager UNABLE to use job objects to manage child processes
16:50:43     INFO -  DEBUG : Terminating: metrotestharness, plugin-container, crashreporter, dwwim

good startup from higher up in the log:

16:50:38     INFO -  MetroAppShell::Run
16:50:38     INFO -  mozilla::widget::winrt::FrameworkView::ActivateView
16:50:38     INFO -  Resize: 0.000000 0.000000 1600.000000 1200.000000
16:50:38     INFO -  mozilla::widget::winrt::MetroInput::MetroInput
16:50:38     INFO -  mozilla::widget::winrt::FrameworkView::AddEventHandlers hr=0
16:50:38     INFO -  mozilla::widget::winrt::FrameworkView::SetupContracts
16:50:38     INFO -  __start_report707__end_report
16:50:38     INFO -  * delay load started...
16:50:38     INFO -  * delay load complete.
16:50:38     INFO -  __startTimestamp1390438231916__endTimestamp
16:50:38     INFO -  [22F4240] MetroWidget::Destroy mWnd=160234 type=0
16:50:38     INFO -  mozilla::widget::winrt::MetroApp::ShutdownXPCOM: IsMainThread:1 ThreadId:6F4
16:50:38     INFO -  mozilla::widget::winrt::FrameworkView::ShutdownXPCOM

2) a failure by talos to kill the browser process when the browser hangs. It only kills the test harness, which leaves the browser around.

DEBUG : Terminating: metrotestharness, plugin-container, crashreporter, dwwim

from there on out, every launch fails because the browser is still running.
(In reply to Jim Mathies [:jimm] from comment #47)
> DEBUG : Terminating: metrotestharness, plugin-container, crashreporter, dwwim
> 
> from there on out, every launch fails because the browser is still running.

Can we add firefox to this list somehow?
Let's figure out how to kill firefox on metro. What is the main process name we need to kill? Right now we are trying to kill metrotestharness.exe, not firefox.exe.
(In reply to Joel Maher (:jmaher) from comment #49)
> lets figure out how to kill firefox on metro.  What is the main process name
> we need to kill?  Right now we are trying to kill metrotestharness.exe, not
> firefox.exe.

It's still firefox.exe. The test harness also reports this - 

Log(L"METRO_BROWSER_PROCESS=%d", processID);

Although I don't see this in jlund's log. Maybe an out of date harness there? We added this in mochitest bug 905628.
The fix is to kill firefox, not metrotestharness. Here is where we terminate the process:
http://hg.mozilla.org/build/talos/file/tip/talos/ttest.py#l129
which then calls:
 - http://hg.mozilla.org/build/talos/file/tip/talos/ffprocess.py#l55

We reference browser_config['process'] which in this case is 'metrotestharness'. We set browser_config['process'] here:
http://hg.mozilla.org/build/talos/file/tip/talos/run_tests.py#l249
browser_config['process'] = os.path.basename(browser_config['browser_path'])

browser_path is set when we launch talos via the -e parameter (clip from the log above):
browser_path: C:\slave\test\build\application\firefox\metrotestharness

My recommendation is to hack on cleanupProcess (http://hg.mozilla.org/build/talos/file/tip/talos/ffprocess.py#l55) and add:
if process_name == 'metrotestharness':
    self.extra_prog.append('firefox')

alternatively we could edit the extra_prog data when we create ffprocess, but that is inside of ttest.py and we don't do anything there.  Another alternative would be to edit self.extra_prog in ffprocess_win32 (http://hg.mozilla.org/build/talos/file/tip/talos/ffprocess_win32.py#l17).  My top choice is my recommendation in this comment.
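
In context, that recommendation would look roughly like the sketch below (a stripped-down stand-in for ffprocess.py's cleanupProcess, not the literal patch):

```python
class FFProcessSketch(object):
    """Stand-in for talos ffprocess, showing only the shape of the change."""

    def __init__(self):
        # processes talos already terminates alongside the main one
        self.extra_prog = ["plugin-container", "crashreporter", "dwwim"]

    def cleanup_process(self, process_name):
        # metro runs launch firefox.exe as a child of metrotestharness.exe,
        # so the browser itself must be on the kill list too
        if process_name == "metrotestharness" and "firefox" not in self.extra_prog:
            self.extra_prog.append("firefox")
        for name in [process_name] + self.extra_prog:
            self._terminate(name)

    def _terminate(self, name):
        print("would terminate all processes named %r" % name)

FFProcessSketch().cleanup_process("metrotestharness")
```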
Sorry for the delay in reply; I wanted to make sure this patch made the difference.

After numerous tests all day, I was able to get a consistent pass rate: 20 straight greens (unsure of performance change but metro talos is new with few data points to begin with).

This success rate was due to input from both jimm and jmaher. The first problem was that I had the wrong binary after the most recent talos metroharness exe changeset. The second was ensuring that talos closes both the metro harness and the browser.

I was able to see an improvement after applying a different fix for each of these issues and running them separately. This patch represents both fixes. Following an r+ for this, I'll try to get this live ASAP. The buildbot patch to enable this on m-c already has an r+, so it should move quickly.
Attachment #8364871 - Flags: review?(jmaher)
Note that after these fixes land, if we do see some timeout issues on shutdown, they are likely bug 951120, which I've been working on. Mochitests for metro are currently timing out about 15% of the time, and we've made figuring out why a high priority.
Comment on attachment 8364871 [details] [diff] [review]
940690_metro_talos_major_releases-talos-230114.diff

Review of attachment 8364871 [details] [diff] [review]:
-----------------------------------------------------------------

::: metro/metrotestharness_info.txt
@@ +7,5 @@
>      01/01/14
>      http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/mozilla-central-win32/1388536346/firefox-29.0a1.en-US.win32.tests.zip
> +
> +Current:
> +    23/01/14

I think the date formats are not compatible- earlier on you have mm/dd/yy, and this is dd/mm/yy

::: talos/ffprocess.py
@@ +63,5 @@
> +        # We must inform talos about the sub process, the metro browser itself,
> +        # that is spawned from metrotestharness. The metro browser proc is
> +        # given the same name as the non metro equivalent: 'firefox.exe'
> +        if (process_name == "metrotestharness" and
> +                "firefox" not in self.extra_prog):

I would remove the () around the clause, otherwise this is great stuff.
Attachment #8364871 - Flags: review?(jmaher) → review+
talos patch pushed to default: https://hg.mozilla.org/build/talos/rev/82b7680f9eaf

this patch is to enable that talos rev via talos.json
Attachment #8365167 - Flags: review?(jmaher)
Attachment #8365167 - Flags: review?(jmaher) → review+
Whiteboard: [feature] p=0 → [feature] p=0 [leave open]
Whiteboard: [feature] p=0 [leave open] → [feature] [leave open]
I didn't sync the revision in both places.
Attachment #8366079 - Flags: review?(jmaher)
Comment on attachment 8366079 [details] [diff] [review]
940690_metro_talos_m-i_270114.diff

Review of attachment 8366079 [details] [diff] [review]:
-----------------------------------------------------------------

Did you create a new .zip file? Have you run it on try server? I need confirmation of that before r+.
Attachment #8366079 - Flags: review?(jmaher) → review-
==== Status update ====

This should have been in yesterday. Unfortunately, Cedar has been perma-red for windows builds for the last 5 days. I haven't been able to test this new change, which works so well on my dev-master, even once.

I am tempted to just skip the queue and enable it on m-c branches. But with today's long closures, I'll try for tomorrow.
I think I discovered why windows cedar is broken: web-platform-tests -> https://bugzilla.mozilla.org/show_bug.cgi?id=945222#c31
cedar is fixed. merging m-c -> cedar one more time to get more data results.

will make decision tomorrow based on that.
Wahoo!

Looks like we have a dramatic increase in pass rates. I wanted perfection, but something is still not right. Some occasional timeouts that we have seen before are still creeping in. From what I can tell, having caught this by VNCing to a machine when it happens, the metro browser launches and runs tests but the output, and maybe the process itself, is lost to the talos script.

Here is an example of such a timeout (note, this is not new): https://tbpl.mozilla.org/php/getParsedLog.php?id=33795258&tree=Cedar

I think we are at a point where we either make this more visible today and call it 'good enough' by enabling on m-c, or make it perfect first. I'm fine with either.

However, if we go the route of solving first, I would like to request that we pull in someone from ateam to
dedicate themselves to hashing it out. I have tinkered here and there but I know that someone more familiar with the talos scripts would solve this much faster.

Joel, what's your thoughts? Are the improvements sufficient or should we pull in ateam?
Flags: needinfo?(jmaher)
I did a bunch of retriggers- lets see how stable these are in a couple hours.
Flags: needinfo?(jmaher)
looking at the retriggers: https://tbpl.mozilla.org/?tree=Cedar&rev=c08328d9f6ad&jobname=WINNT%206.2, I don't want to move forward on this, although d-m looks pretty good right now.  Thoughts?
The two blue retries are known and greened on the next attempt.

The remaining large majority are 3600 timeout. I think focus should be put on that.

Joel would you have time to look at this or suggest someone I can approach?
I don't know anybody else who would be a good candidate to look at this.  Can we enable some of the tests now and put the timeouts off for a later time?  I have a few projects that need some attention this quarter and have been offset by investigating other oranges for the last 3 weeks.
Thanks Joel, completely understand.

I am going to enable dromaeojs-metro immediately with the thought that enabling 1 suite is better than enabling none.

I can then give devs access to our machines to debug the way the talos repo and metroharness interact, while updating mozharness/buildbot as changes are made.

I don't see this being a huge time sink since we are 90% of the way there. My guess is we need to modify the way that Talos uses ProcessHandler and metroharness.exe.
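
For whoever picks this up, the interaction in question is roughly mozprocess's ProcessHandler driving metrotestharness.exe; a minimal sketch of that launch path (the command line and paths are illustrative, not what talos actually builds):

```python
from mozprocess import ProcessHandler

output_lines = []

def on_line(line):
    # talos parses markers like __start_report/__end_report out of this stream
    output_lines.append(line)

cmd = [r"C:\slave\test\build\application\firefox\metrotestharness.exe",
       "-firefoxpath",
       r"C:\slave\test\build\application\firefox\firefox.exe"]
proc = ProcessHandler(cmd, processOutputLine=[on_line])
proc.run(timeout=3600)   # mirror the 3600s timeout we keep hitting
status = proc.wait()
print("exit status %r, %d lines of output" % (status, len(output_lines)))
```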
Attachment #8361856 - Attachment is obsolete: true
Attachment #8368777 - Flags: review?(armenzg)
Attachment #8368777 - Flags: review?(armenzg) → review+
Comment on attachment 8368777 [details] [diff] [review]
940690_metro_talos_bbotcfgs_geck29_drom_310114.diff

this is checked in: https://hg.mozilla.org/build/buildbot-configs/rev/77f04568b40d
Attachment #8368777 - Flags: checked-in+
Depends on: 967352
Due to bug 967352, it looks like this will need to be backed out :(

I am not sure why it suddenly never wants to go green. It will need to be investigated after the backout.
Attachment #8370142 - Flags: review?(armenzg)
Attachment #8370142 - Flags: review?(armenzg) → review+
Comment on attachment 8370142 [details] [diff] [review]
940690_metro_talos_bbotcfgs_rm_drom-040214.diff

checked in: https://hg.mozilla.org/build/buildbot-configs/rev/f5a02b58d338
Attachment #8370142 - Flags: checked-in+
buildbot-config patch is in production :)
Can you try running metro talos jobs on a PGO build?
PGO builds are a bit different than optimized builds (non-PGO).
(In reply to Armen Zambrano [:armenzg] (Release Engineering) (EDT/UTC-4) from comment #73)
> Can you try running metro talos jobs on a PGO build?
> PGO builds are a bit different than optimized builds (non-PGO).

you can see here that PGO was also not passing: https://bugzil.la/967352 However, I did a couple on my dev-master and also caught the same timeout.

=== Status update ===

Dromaeojs-metro failed too often on m-c/aurora and had to be backed out.

Basically, none of these suites are reliable enough to make it onto every release/project branch. :(

In order to proceed:

1) from a buildbot/mozharness perspective, everything is in place
2) someone needs to actively work on/massage the talos repo/metroharness. Ideally this would be ateam[1] when priorities are high enough or time allows
3) I can loan one of our releng machines and also assist in running this on my dev machine (note, all of metro-talos will stay on Cedar as well for dev testing)
4) patches for putting this on major branches have been processed, so once the issue is solved, I can land immediately.

[1] I will be continuing my work on porting desktop builds from buildbot to mozharness (https://bugzil.la/858797). If ateam continues to have other priorities and the 'higher powers' prefer me to work on metro talos instead, I am sure, given enough time, I can figure out the talos system to solve the problem.
Assignee: jlund → nobody
Status: NEW → ASSIGNED
jimm do you think we should close this bug? Should we also remove suites running on cedar?
Flags: needinfo?(jmathies)
ahh I see https://bugzil.la/982034 for a dedicated metro branch.

I will mirror https://bugzil.la/983269 and keep these suites running on cedar until the new branch is set up.

BTW, these talos suites have been running very well on cedar lately.
I recall reading that we will not be working on metro firefox going forward; quite possibly we could discontinue this effort?
FYI - started: Bug 984476 - move mochitest-metro and talos-metro suites running on cedar over to new dedicated metro proj branch
marking as resolved since we will not be putting metro on major release branches.
Status: ASSIGNED → RESOLVED
Closed: 10 years ago
Resolution: --- → WONTFIX
Flags: needinfo?(jmathies)
Component: General Automation → General