Closed Bug 1046056 Opened 11 years ago Closed 11 years ago

Datazilla shows too many results for flame-319MB

Categories

(Datazilla Graveyard :: User interface, defect, P1)

Tracking

(Not tracked)

RESOLVED FIXED

People

(Reporter: davehunt, Unassigned)

References

Details

(Keywords: perf, Whiteboard: [c=automation p= s= u=])

Attachments

(2 files)

In bug 1036675 we started running tests against a 319MB memory configuration on devices that are also used to submit results for other memory configurations. As expected, the 'flame-319MB' now shows up in datazilla, however there are results shown that predate this device name. In the attached screenshot, the latest three datapoints load successfully and show the correct device name, but prior datapoints will not show any results. I suspect we are somehow combining 'flame' and 'flame-319MB' results.
Blocks: 1046060
Jeads: Do you have any thoughts on this? I also notice that the flame-512MB device hasn't appeared despite us sending results with this value. Is this because the mac addresses are providing data for different device names?
Flags: needinfo?(jeads)
May be related, but looking at results for the flame-319MB, some of the data points do not render the replicates chart. When looking at the Ajax response, instead of seeing the normal 2 objects for each point, the bad points show 3 objects, 2 of which contain test_machines of type flame-512MB. [1] https://cloudup.com/cGOJr19Mnbi
It looks like Eli is correct, these problems are linked. Here are the three json structures returned for the replicate query: https://datazilla.mozilla.org/b2g/refdata/objectstore/json_blob/revisions?branch=master&gaia_revision=c987645e91a87615e2faf3c0a76e114c0a5009b3&gecko_revision=4a390946b5c9&test_id=9&test_type=cold_load_time

All three json objects have test_build.id set to the same value 20140803122727. The query that returns the data for the top performance graph in the database joins directly to both machine.id and test_build.id: https://github.com/mozilla/datazilla/blob/master/datazilla/model/sql/perftest.json#L640

There is one entry associated with all three json structures in b2g_perftest_1.build and three entries in b2g_perftest_1.machine. The device type is included in the WHERE clause with the inclusion of "m.type" but the where filtering occurs after the join in the SQL execution plan, so I think this is the source of the mixed data on the performance chart. The "m.type" needs to be added to the GROUP BY clause explicitly.

The situation with the replicates is a little more complicated.
The three replicate structures in Eli's example are returned due to the test_build.id issue, but there's a guard in the UI that ensures the device type matches here: https://github.com/mozilla/datazilla/blob/master/datazilla/webapp/static/js/b2g_apps/ReplicateGraphComponent.js#L116

When you load this page: http://s4n2.qa.phx1.mozilla.com/b2g/?branch=master&device=flame-319MB&range=7&test=cold_load_time&app_list=email%20FTU&app=email%20FTU&gaia_rev=c987645e91a87615&gecko_rev=4a390946b5c9&plot=avg

the conditional on L116 referenced above never evaluates to false, so every replicate structure is skipped with continue:

    if( (results === undefined) || (device != deviceType ) ){ continue; }

I wrote the values for "device", "deviceType", the test name, and "results" for each json structure to the console when the page loads and this is what we get:

    Array [ "flame-319MB", "flame-512MB", "cold_load_time", undefined ]
    Array [ "flame-319MB", "flame-319MB", "cold_load_time", undefined ]
    Array [ "flame-319MB", "flame-512MB", "cold_load_time", Array[30] ]

So the problem there is that the test selected is "cold_load_time" and the device selected is "flame-319MB", and for that combination there's no match in any of the three json structures. The only json structure that carries results for the cold_load_time test has a device type of "flame-512MB".

If you change the test selection to startup_>_moz-content-interactive the replicates load as expected: http://s4n2.qa.phx1.mozilla.com/b2g/?branch=master&device=flame-319MB&range=7&test=startup_%3E_moz-content-interactive&app_list=email%20FTU&app=email%20FTU&gaia_rev=c987645e91a87615&gecko_rev=9c6c2a98b9e7&plot=avg

I'm going to try adding "m.type" to the GROUP BY clause in the main query in my local env and see how that goes. Will report back when I know more.
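To make the guard's behaviour easier to follow, here is a minimal Python sketch of the same filtering logic applied to the three console entries above. It is an illustration only, not the Datazilla UI code: the function name `renderable_structures`, the dictionary keys, and the placeholder replicate values are all assumptions; only the device names and the shape of the condition come from the comment.

```python
# Hypothetical stand-ins for the three json structures logged to the console.
# "results" is None where the console showed undefined; the 30 replicate
# values are placeholders, not real measurements.
STRUCTURES = [
    {"device_type": "flame-512MB", "results": None},
    {"device_type": "flame-319MB", "results": None},
    {"device_type": "flame-512MB", "results": [0] * 30},
]


def renderable_structures(structures, device):
    """Mirror the UI guard: skip a structure if it has no results for the
    selected test or if its device type differs from the selected device."""
    kept = []
    for structure in structures:
        if structure["results"] is None or structure["device_type"] != device:
            continue
        kept.append(structure)
    return kept
```

With `device="flame-319MB"` nothing survives the guard, which matches the empty replicates chart; with `device="flame-512MB"` the one structure holding 30 replicates is kept.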
Flags: needinfo?(jeads)
Severity: normal → major
Keywords: perf
Priority: -- → P1
Whiteboard: [c=automation p= s= u=]
The answer to davehunt's question "Is this because the mac addresses are providing data for different device names?" is a definitive Yes. The problem is not a SQL query. The issue is that the machine names (mac address) are being reused for different device types. In the datazilla data model a unique machine is defined as the combination of machine_name and operating_system_id (osversion in the json): https://github.com/mozilla/datazilla/blob/master/datazilla/model/sql/template_schema/schema_perftest.sql.tmpl#L139

These are all of the occurrences of the machine_name "24:0a:11:e2:20:34" in b2g json objects in datazilla.

{ "date": "2014-07-29 02:59:35", "osversion": "2.0.0.0-prerelease", "machine_name": "24:0a:11:e2:20:34", "machine_type": "flame-273MB" }
{ "date": "2014-07-31 08:05:10", "osversion": "2.0.0.0-prerelease", "machine_name": "24:0a:11:e2:20:34", "machine_type": "flame-319MB" }
{ "date": "2014-08-01 02:31:53", "osversion": "2.0.0.0-prerelease", "machine_name": "24:0a:11:e2:20:34", "machine_type": "flame" }
{ "date": "2014-08-06 18:35:00", "osversion": "2.0.0.0-prerelease", "machine_name": "24:0a:11:e2:20:34", "machine_type": "flame-512MB" }
{ "date": "2014-07-29 16:58:10", "osversion": "2.1.0.0-prerelease", "machine_name": "24:0a:11:e2:20:34", "machine_type": "flame-273MB" }
{ "date": "2014-08-01 03:52:04", "osversion": "2.1.0.0-prerelease", "machine_name": "24:0a:11:e2:20:34", "machine_type": "flame" }
{ "date": "2014-08-07 12:51:04", "osversion": "2.1.0.0-prerelease", "machine_name": "24:0a:11:e2:20:34", "machine_type": "flame-319MB" }
{ "date": "2014-08-07 14:49:45", "osversion": "2.1.0.0-prerelease", "machine_name": "24:0a:11:e2:20:34", "machine_type": "flame-512MB" }

There are only 2 unique rows in the machine table for all of these json object entries:

mysql> select m.id, m.name, m.type, o.name, o.version, m.date_added from machine as m join operating_system as o on m.operating_system_id = o.id where m.name = '24:0a:11:e2:20:34';
+-----+-------------------+-------------+------------+--------------------+------------+
| id  | name              | type        | name       | version            | date_added |
+-----+-------------------+-------------+------------+--------------------+------------+
| 335 | 24:0a:11:e2:20:34 | flame-512MB | Firefox OS | 2.0.0.0-prerelease | 1405388222 |
| 265 | 24:0a:11:e2:20:34 | flame-319MB | Firefox OS | 2.1.0.0-prerelease | 1403828223 |
+-----+-------------------+-------------+------------+--------------------+------------+

The type is getting updated for each unique combination of machine name and operating_system_id but all of the previous device type associations are lost. We can fix the index because all of the json objects have an associated test_run_id which I can use to update the associated machine id and create the correct device type association from the original data received. The following steps would be required:

1.) Add the device type to the unique key on the machine table.
2.) Iterate through the json objects identifying all of the unique combinations of machine_name, operating_system_id, and device type that are not present and insert them into the machine table.
3.) Use the mapping of test_run_ids to the original json objects to update the index with the correct information.

davehunt does this sound correct? Is it expected that the same machine name (mac address) will be used for multiple device types? If so, I think this is the best way to proceed. If not, can we make the machine names unique?
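The effect of the unique key described above can be sketched with an in-memory SQLite table. This is a rough illustration under several assumptions, not the Datazilla schema or ingestion code: SQLite stands in for MySQL, the `osversion` string stands in for `operating_system_id`, the overwrite of `type` is modelled with `INSERT OR REPLACE`, and the function name `count_machine_rows` is hypothetical. The eight (osversion, machine_type) pairs are taken from the json objects listed in the comment.

```python
import sqlite3

MACHINE_NAME = "24:0a:11:e2:20:34"

# (osversion, machine_type) pairs from the eight json objects in the comment.
JSON_OBJECTS = [
    ("2.0.0.0-prerelease", "flame-273MB"),
    ("2.0.0.0-prerelease", "flame-319MB"),
    ("2.0.0.0-prerelease", "flame"),
    ("2.0.0.0-prerelease", "flame-512MB"),
    ("2.1.0.0-prerelease", "flame-273MB"),
    ("2.1.0.0-prerelease", "flame"),
    ("2.1.0.0-prerelease", "flame-319MB"),
    ("2.1.0.0-prerelease", "flame-512MB"),
]


def count_machine_rows(unique_columns):
    """Load every json object into a simplified machine table whose unique
    key is `unique_columns`, and return how many rows remain."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE machine ("
        "id INTEGER PRIMARY KEY, name TEXT, osversion TEXT, type TEXT, "
        f"UNIQUE ({unique_columns}))"
    )
    for osversion, machine_type in JSON_OBJECTS:
        # A row that collides on the unique key is replaced, which loses the
        # earlier device type association (assumed model of the overwrite).
        conn.execute(
            "INSERT OR REPLACE INTO machine (name, osversion, type) "
            "VALUES (?, ?, ?)",
            (MACHINE_NAME, osversion, machine_type),
        )
    row_count = conn.execute("SELECT COUNT(*) FROM machine").fetchone()[0]
    conn.close()
    return row_count
```

Calling `count_machine_rows("name, osversion")` leaves 2 rows, one per OS version, as in the query result above. Calling `count_machine_rows("name, osversion, type")` keeps all 8 combinations, which is what step 1 of the proposal is meant to achieve.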
Flags: needinfo?(dave.hunt)
(In reply to Jonathan Eads ( :jeads ) from comment #4)
> davehunt does this sound correct? Is it expected that the same machine name
> (mac address) will be used for multiple device types? If so, I think this is
> the best way to proceed. If not, can we make the machine names unique?

We hadn't anticipated the mac address being used for different device types, and in fact there's still only one device (flame) in use. We are however now running a device in a combination of different modes (memory configurations in this case). I think making the device type part of the unique key sounds like a good approach, and I wouldn't expect any negative impact. The only thing I can foresee is that if we decide to change a device type (for example when we changed from device IDs to common names) we might need to perform a database update to merge the two datasets.
Flags: needinfo?(dave.hunt)
Depends on: 1054452
Status: NEW → RESOLVED
Closed: 11 years ago
Resolution: --- → FIXED