Closed Bug 862595 Opened 11 years ago Closed 9 years ago

Need controlled build and machine platform data in pulse

Categories

(Release Engineering :: General, defect, P3)

x86
macOS
defect

Tracking

(Not tracked)

RESOLVED WONTFIX

People

(Reporter: jeads, Unassigned)

Details

(Whiteboard: [treeherder])

We need controlled platform data written to build properties in the pulse stream. The current property, 'payload.build.properties.platform', written to pulse.mozilla.org contains a mixture of os, platform, architecture, and build options. This information is not always present in the platform property, and when present, the field order is not always conserved. Occasionally, slightly different data is written to a similar property called 'payload.build.properties.stage_platform'.

Some examples of the platform and stage_platform data from the pulse stream are shown below.

['platform', 'ics_armv7a_gecko', 'Builder']
['platform', 'panda_android', 'Builder'], ['stage_platform', 'android', 'Builder']
['platform', 'xp', 'Builder']
['stage_platform', 'win32-pgo', 'Builder'],
['platform', 'android', 'Builder'], ['stage_platform', 'android', 'Builder']

This makes it very difficult for downstream consumers to access platform information without resorting to regular expressions, complicated conditional logic, and whitelist strategies that regularly fail.

Most of the downstream pulse consumers use the properties 'buildername', 'builderName', or the 'routing_key' to get platform information. In treeherder (tbpl 2.0) development we copied the regular expressions from http://mxr.mozilla.org/build/source/buildapi/buildapi/model/util.py#21 and use them to infer the os, platform, architecture, and vm (https://github.com/mozilla/treeherder-service/blob/pulse-consumer/treeherder/pulse_consumer/consumer.py#L23). These expressions are applied to the 'payload.build.properties.buildername' property.

This is a very brittle strategy destined to fail in a variety of ways.
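
To illustrate the brittleness, here is a minimal sketch of inferring an OS from a builder name. The patterns and the guess_os helper are hypothetical, written only for this example; they are not the expressions used in buildapi or treeherder. The two builder names are taken from property samples posted in this bug.

```python
import re

# Hypothetical patterns for illustration only; these are NOT the expressions
# used by buildapi or treeherder.
OS_PATTERNS = [
    ("win", re.compile(r"^WINNT", re.IGNORECASE)),
    ("linux", re.compile(r"linux", re.IGNORECASE)),
    ("android", re.compile(r"android", re.IGNORECASE)),
]


def guess_os(buildername):
    """Return the first OS whose pattern matches, or None if nothing matches."""
    for os_name, pattern in OS_PATTERNS:
        if pattern.search(buildername):
            return os_name
    return None


# Builder names taken from the property samples in this bug.
print(guess_os("WINNT 5.2 b2g-inbound pgo-build"))  # win
print(guess_os("b2g_try_emulator-kk-debug_dep"))    # None: no pattern covers it
```

Every new builder naming scheme needs another pattern in every consumer, which is the maintenance burden described above.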

It would be great if we could keep all regular expressions in the build source code and have pulse consumers get information from explicit build properties in the pulse stream.  This way there would be a single source of platform regular expressions controlled by release engineering.

Would it be possible to add the properties machine_os, machine_platform, machine_arch, build_os, build_platform, build_arch, and vm to 'payload.build.properties' in the pulse stream?

Examples:

[ 'machine_os', 'linux' ]
[ 'machine_platform', 'Fedora 12' ]
[ 'machine_arch', 'x86' ]
[ 'vm', False|True ]

and a similar set for the build:

[ 'build_os', 'linux' ]
[ 'build_platform', 'Fedora 12' ]
[ 'build_arch', 'x86' ]
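
With properties like these, a consumer could read platform data directly instead of parsing builder names. A minimal sketch, assuming 'payload.build.properties' stays a list of [name, value, ...] entries as in the samples above; the platform_info helper and the sample payload are hypothetical, and the field names are the ones proposed here, not existing properties.

```python
# Field names proposed in this bug; none of them exist in the pulse stream today.
PROPOSED_FIELDS = (
    "machine_os", "machine_platform", "machine_arch",
    "build_os", "build_platform", "build_arch", "vm",
)


def platform_info(properties):
    """Pick the proposed fields out of a list of [name, value, ...] entries."""
    by_name = {entry[0]: entry[1] for entry in properties}
    # Fields missing from the stream come back as None instead of raising.
    return {name: by_name.get(name) for name in PROPOSED_FIELDS}


# Hypothetical payload.build.properties content, mixing an existing property
# with some of the proposed ones.
sample = [
    ["platform", "xp", "Builder"],
    ["machine_os", "linux"],
    ["machine_platform", "Fedora 12"],
    ["machine_arch", "x86"],
    ["vm", False],
]
info = platform_info(sample)
print(info["machine_os"], info["machine_arch"], info["build_os"])
```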
Whiteboard: [treeherder]
It would be great if we could have the same info on buildapi. That's currently the only source we have to retrieve pending and running jobs.
Product: mozilla.org → Release Engineering
mdoglio: would you mind listing how you retrieve the information from buildapi?
I think once fixed for pulse it gets fixed for buildapi as the properties are the same.

Some info has been collected in here:
https://etherpad.mozilla.org/releng-buildernames

ScriptFactory takes every job property that is dumped into the "properties" directory on the slave and publishes it.

If we load one of our buildjson data sources [1] we can see that jobs have a variety of properties.
Here are some samples of the properties for a test [2] and a build [3] job.


[1] http://builddata.pub.build.mozilla.org/builddata/buildjson/builds-2014-04-30.js.gz

[2] Build properties
      "properties": {
        "base_bundle_urls": [
          "https://ftp-ssl.mozilla.org/pub/mozilla.org/firefox/bundles"
        ], 
        "base_mirror_urls": null, 
        "basedir": "/builds/slave/b2g_try_emu-kk-d_dep-000000000", 
        "branch": "try", 
        "buildername": "b2g_try_emulator-kk-debug_dep", 
        "buildid": "20140429195314", 
        "buildnumber": 323, 
        "builduid": "b36d6010408b4c868feeb54f29192e21", 
        "compare_locales_revision": "1fc4e9bc8287", 
        "gaia_revision": "725a23802708eb70e3d7e8a2ce7179adbac806e4", 
        "gecko_revision": "d65850223286", 
        "hgurl": "https://hg.mozilla.org/", 
        "log_url": "http://ftp.mozilla.org/pub/mozilla.org/b2g/try-builds/rvitillo@mozilla.com-d65850223286/try-emulator-kk/b2g_try_emulator-kk_dep-bm75-try1-build419.txt.gz",
        "master": "http://buildbot-master83.srv.releng.scl3.mozilla.com:8101/", 
        "mock_target": null, 
        "platform": "emulator-kk-debug", 
        "product": "b2g", 
        "project": "", 
        "repo_path": "try", 
        "repository": "", 
        "request_ids": [
          40760423
        ], 
        "request_times": {
          "40760423": 1398804916
        }, 
        "revision": "d65850223286", 
        "scheduler": "b2g_try-b2g", 
        "script_repo_revision": "14fd3b0b767f", 
        "slavename": "bld-centos6-hp-040", 
        "tooltool_url_list": [
          "http://runtime-binaries.pvt.build.mozilla.org/tooltool"
        ], 
        "upload_ssh_key": "trybld_dsa", 
        "upload_ssh_server": "stage.mozilla.org", 
        "upload_ssh_user": "trybld"

[3] Build properties:
      "properties": {
        "appName": "Firefox", 
        "appVersion": "32.0a1", 
        "basedir": "c:/builds/moz2_slave/b2g-in-w32-pgo-000000000000000", 
        "branch": "b2g-inbound", 
        "builddir": "b2g-in-w32-pgo-000000000000000", 
        "buildername": "WINNT 5.2 b2g-inbound pgo-build", 
        "buildid": "20140429143004", 
        "buildnumber": 284, 
        "builduid": "0b7a198deb904635b9d6e09ed86da5c1", 
        "comments": "", 
        "filepath": null, 
        "forced_clobber": false, 
        "got_revision": "469b786fd5f1", 
        "hashType": "sha512", 
        "installerFilename": "firefox-32.0a1.en-US.win32.installer.exe", 
        "installerHash": "4abb9d99ddaf13ec3d08e858526c1b4052f0d16705e584ee2ddbb48dcc7e2240df85ee9fdc74385f276096b16c46bd755a52d6e81edbaf7fbbc7a7a1d3cea190", 
        "installerSize": "33099664", 
        "jsshellUrl": "http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/b2g-inbound-win32-pgo/1398807004/jsshell-win32.zip", 
        "log_url": "http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/b2g-inbound-win32-pgo/1398807004/b2g-inbound-win32-pgo-bm82-build1-build284.txt.gz", 
        "master": "http://buildbot-master82.srv.releng.scl3.mozilla.com:8001/",
        "packageFilename": "firefox-32.0a1.en-US.win32.zip",
        "packageHash": "5b6d8f0653a11f91488f80a84831713631875498a5823193e3dc02af30dfe46402857e1edd839643f052533fb1f7116a3434e07903c22527c9ffcee22e345e15",
        "packageSize": "41946691",
        "packageUrl": "http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/b2g-inbound-win32-pgo/1398807004/firefox-32.0a1.en-US.win32.zip",
        "periodic_clobber": false,
        "platform": "win32",
        "product": "firefox",
        "project": "",
        "purge_actual": "159.22GB",
        "purge_target": "12GB",
        "purged_clobber": false, 
        "repository": "", 
        "request_ids": [
          40762864
        ], 
        "request_times": {
          "40762864": 1398807005
        }, 
        "revision": "469b786fd5f17474b6584e2d50015216a6fb787e", 
        "scheduler": "b2g-inbound periodic", 
        "slavebuilddir": "b2g-in-w32-pgo-000000000000000", 
        "slavename": "w64-ix-slave16", 
        "sourcestamp": "469b786fd5f1", 
        "stage_platform": "win32-pgo", 
        "symbolsUrl": "http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/b2g-inbound-win32-pgo/1398807004/firefox-32.0a1.en-US.win32.crashreporter-symbols.zip", 
        "testresults": [
          [
            "libxul_link", 
            "libxul_link", 
            2916331520, 
            "2916331520"
          ]
        ], 
        "testsUrl": "http://ftp.mozilla.org/pub/mozilla.org/firefox/tinderbox-builds/b2g-inbound-win32-pgo/1398807004/firefox-32.0a1.en-US.win32.tests.zip", 
        "toolsdir": "c:/builds/moz2_slave/b2g-in-w32-pgo-000000000000000/tools", 
        "vsize": 2916331520
      },
armenzg: we are currently fetching builds-4hr.js.gz, builds-pending.js and builds-running.js.
As for the builds-4hr file, we inspect the properties node of each job and read buildername and other properties from it. We then run several regular expressions against the buildername to obtain the properties I listed in the "Property rich builders" section of that etherpad. You can see the shape of the final JSON structure here [1]. If the properties listed on the etherpad were published in all three files (builds-4hr, builds-pending, and builds-running), we would be able to get rid of all these regular expressions [2].

[1] https://github.com/mozilla/treeherder-service/blob/master/treeherder/etl/buildapi.py#L139
[2] https://github.com/mozilla/treeherder-service/blob/master/treeherder/etl/buildbot.py#L25-L386
Excellent! This should help move this forward.
I don't currently have time but I hope to do so in the next 2 weeks if no one picks it up first.

For the record, if we change mozharness to create a file under the directory "properties" with this format:
property_name1:value1
property_name2:value2
...
property_nameN:valueN

That is all it would take IIUC.
We would just need a new action to dump the properties.
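
A rough sketch of what such a dump step could look like, assuming the name:value format above. The dump_properties function and the "platform_info" file name are made up for this example and are not existing mozharness code.

```python
import os
import tempfile


def dump_properties(properties, work_dir):
    """Write name:value lines to a file under <work_dir>/properties."""
    prop_dir = os.path.join(work_dir, "properties")
    os.makedirs(prop_dir, exist_ok=True)
    path = os.path.join(prop_dir, "platform_info")  # hypothetical file name
    with open(path, "w") as handle:
        # Sort by name so the file content is stable between runs.
        for name, value in sorted(properties.items()):
            handle.write("%s:%s\n" % (name, value))
    return path


# Usage: write two example properties into a scratch directory and read them back.
scratch = tempfile.mkdtemp()
written = dump_properties({"machine_os": "linux", "machine_arch": "x86"}, scratch)
with open(written) as handle:
    content = handle.read()
print(content)
```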
Blocks: 1031238
(In reply to Armen Zambrano - Automation & Tools Engineer (:armenzg) from comment #4)
> Excellent! This should help move this forward.
> I don't currently have time but I hope to do so in the next 2 weeks if no
> one picks it up first.

Hi! Still up for doing this? :-)
Flags: needinfo?(armenzg)
Blocks: 1026109
I'm working on the Mulet reftests.
I'm assigning it to myself; however, I will also be looking for someone else to pick it up.

Ed, what are the timeline and priority for this bug? (It has been filed for a while.)
What does it block? (I want to have a clear summary.)
Assignee: nobody → armenzg
Flags: needinfo?(armenzg)
Priority: -- → P3
jeads, what are the timeline and priority for this bug? What does it still block?

(In reply to Jonathan Eads ( :jeads ) from comment #0)
> Examples:
> 
> [ 'machine_os', 'linux' ]
> [ 'machine_platform', 'Fedora 12' ]
> [ 'machine_arch', 'x86' ]
> [ 'vm', False|True ]
> 
> and a similiar set for the build:
> 
> [ 'build_os', 'linux' ]
> [ 'build_platform', 'Fedora 12' ]
> [ 'build_arch', 'x86' ]

Why not keep both the same?
Do you need any more than that? I also saw these:
* build type [pgo|asan|debug]
* job type [build|unittest|talos|repack]

I think an implementation of this would look like this:
* We create a mozharness mixin that is executed at the end of a script like blobber does [1]
** This is where to hook it up for test jobs [2]
* We have a function there that sets every property we want with set_buildbot_property() [3]

Some of the values can be easily determined. We might want to add a validation system.
>>> import os, platform
>>> os.uname()[0]
'Linux'
>>> os.uname()[-1]
'x86_64'
>>> "%s %s" % (platform.dist()[0], platform.dist()[1])
'Ubuntu 13.10'
>>> platform.machine()
'x86_64'
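
Putting the pieces together, a minimal sketch of the collect, validate, and publish flow. Everything named here is an assumption for illustration: KNOWN_OS is a made-up whitelist, and the set_property callback stands in for whatever the mixin would actually call (for example set_buildbot_property()), whose exact signature is not assumed.

```python
import platform

# Hypothetical set of accepted values; a real list would come from releng.
KNOWN_OS = {"linux", "windows", "darwin"}


def collect_machine_properties():
    """Derive machine_os and machine_arch from the host running this script."""
    return {
        "machine_os": platform.system().lower(),
        "machine_arch": platform.machine(),
    }


def publish(properties, set_property):
    """Validate machine_os, then hand each property to set_property(name, value)."""
    if properties["machine_os"] not in KNOWN_OS:
        raise ValueError("unexpected machine_os: %r" % properties["machine_os"])
    for name, value in sorted(properties.items()):
        set_property(name, value)


# Values detected on the current host (they differ from machine to machine).
print(collect_machine_properties())

# Publish a fixed example into a plain dict that stands in for the
# buildbot property store.
recorded = {}
publish({"machine_os": "linux", "machine_arch": "x86_64"}, recorded.__setitem__)
print(recorded)
```

Validating before publishing means an unexpected value fails the job step loudly instead of leaking a new spelling into the pulse stream.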


[1] http://hg.mozilla.org/build/mozharness/file/ac66baf119f1/mozharness/mozilla/blob_upload.py#l97
[2] http://hg.mozilla.org/build/mozharness/file/default/scripts/desktop_unittest.py#l36
[3] http://hg.mozilla.org/build/mozharness/file/ac66baf119f1/mozharness/mozilla/blob_upload.py#l91
Flags: needinfo?(jeads)
I will unassign until I hear back.
Assignee: armenzg → nobody
Flags: needinfo?(jeads)
Given bug 1031238 comment 2, should this be marked WONTFIX as well?
Status: NEW → RESOLVED
Closed: 9 years ago
Resolution: --- → WONTFIX
Component: General Automation → General