Performing ad hoc queries on Treeherder (TH) data lets me identify specific instances of data corruption, monitor the size of the problem, and track its resolution with a query. For example, Treeherder leaves jobs in a running state. ActiveData can now be used to track this problem, count the affected jobs, or calculate the percentage affected, and the same query can be used to confirm the problem is resolved. We cannot do this for Treeherder unless we ingest Treeherder data. https://bugzilla.mozilla.org/show_bug.cgi?id=1296077
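To make the count-and-percent idea concrete, here is a minimal sketch in plain Python over in-memory records. It does not use the ActiveData query API. The field names (`state`, `start_time`), the 24-hour threshold, and the sample data are all assumptions for illustration, not the actual Treeherder schema.

```python
from datetime import datetime, timedelta


def stuck_running_stats(jobs, now, max_age=timedelta(hours=24)):
    """Return (count, percent) of jobs that have been 'running' longer than max_age.

    `jobs` is a list of dicts with hypothetical `state` and `start_time` fields.
    """
    stuck = [
        job for job in jobs
        if job["state"] == "running" and now - job["start_time"] > max_age
    ]
    # Guard against an empty job list to avoid dividing by zero.
    percent = 100.0 * len(stuck) / len(jobs) if jobs else 0.0
    return len(stuck), percent


# Made-up sample data: only job 2 has been running for more than 24 hours.
now = datetime(2016, 9, 1)
jobs = [
    {"id": 1, "state": "completed", "start_time": now - timedelta(days=3)},
    {"id": 2, "state": "running", "start_time": now - timedelta(days=2)},
    {"id": 3, "state": "running", "start_time": now - timedelta(minutes=10)},
    {"id": 4, "state": "completed", "start_time": now - timedelta(hours=5)},
]

count, percent = stuck_running_stats(jobs, now)
print(count, percent)
```

Running the same calculation again after a fix is deployed should show the count dropping to zero, which is how the query doubles as a confirmation that the problem is resolved.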
The current index holds Treeherder job snapshots, not just completed jobs. This was done as an easy way to ensure we have the latest job information. The next step is to add expiry dates to all the snapshots that have been superseded so we can filter them out.
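The expiry step could look something like the sketch below. This is only an illustration of the idea, not the planned implementation: the field names (`job_id`, `timestamp`, `expires`) are hypothetical, and it assumes no two snapshots of the same job share a timestamp. Each superseded snapshot gets an `expires` value equal to the timestamp of the snapshot that replaced it; the latest snapshot of each job keeps `expires` as `None`.

```python
def add_expiry(snapshots):
    """Set `expires` on each snapshot, in place, and return the list.

    A superseded snapshot expires at the timestamp of the next snapshot
    for the same job; the newest snapshot of a job gets None.
    """
    next_newer = {}  # job_id -> timestamp of the next newer snapshot seen so far
    # Walk newest first, so each snapshot sees the one that superseded it.
    for snap in sorted(snapshots, key=lambda s: s["timestamp"], reverse=True):
        snap["expires"] = next_newer.get(snap["job_id"])
        next_newer[snap["job_id"]] = snap["timestamp"]
    return snapshots


def current_only(snapshots):
    """Keep only snapshots that have not been superseded."""
    return [s for s in snapshots if s["expires"] is None]


# Made-up sample data: job "a" has three snapshots, job "b" has one.
snapshots = [
    {"job_id": "a", "timestamp": 1, "state": "pending"},
    {"job_id": "a", "timestamp": 2, "state": "running"},
    {"job_id": "a", "timestamp": 3, "state": "completed"},
    {"job_id": "b", "timestamp": 2, "state": "running"},
]

current = current_only(add_expiry(snapshots))
print([(s["job_id"], s["state"]) for s in current])
```

With expiry recorded this way, a query for the current state of all jobs only needs a filter on missing `expires`, while the older snapshots remain available for looking at job history.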