The build team wants to gather and process data generated by the Mach tool so they can measure and track build times and other relevant developer metrics and pain points. We can start with a minimum viable setup: an HTTP endpoint to which they can POST their JSON data, which we will route to an S3 bucket from which they can retrieve it. The first iteration doesn't require us to parse or crack open the JSON at all. It's okay for the data to go into S3 as a Heka message stream; I (i.e. rmiller) will work with them to make sure they have the tooling they need to convert the Heka message stream back into the original JSON for their own processing.
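For illustration, here is a rough sketch of what the client side of that POST could look like. The endpoint URL and the payload fields are made up for this example (nothing in this bug has settled on them), and it only uses the Python standard library.

```python
import json
import urllib.request

# Hypothetical URL: the real endpoint has not been decided in this bug.
ENDPOINT = "https://ingest.example.invalid/submit/mach"


def build_submission(payload: dict, url: str = ENDPOINT) -> urllib.request.Request:
    """Wrap a JSON-serializable dict in an HTTP POST request.

    The server is expected to store the body untouched, so the client
    is responsible for sending well-formed JSON.
    """
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


if __name__ == "__main__":
    # Hypothetical payload fields, purely for illustration.
    req = build_submission({"command": "build", "duration_seconds": 1234.5})
    print(req.get_method(), req.full_url)
    # Actually sending it would be: urllib.request.urlopen(req)
```

Building the request separately from sending it keeps the sketch testable without a live endpoint.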
Scheduled for the next sprint so it can make Q1.
Due to e10s and ops-related fires, we are unable to address this ticket during this sprint.
Per IRL chat today, we'd like to revive this request.
Initially, we're looking to do some rapid prototyping on the type of data we send. So we're probably looking for a single endpoint that accepts N different message types with initially no schema validation. The data will almost certainly be JSON.
Once we have the client-side bits implemented and have confidence in the data we're sending, we can formalize a schema and "productionize" the ingestion. How that switchover works, I'm not sure. It probably involves 2 endpoints (e.g. "stage" vs "production") or some kind of routing key in the HTTP request. Not sure what options are available.
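To make the two switchover options a bit more concrete, here is a minimal sketch of how a server might pick the pipeline either from the URL path or from a routing key in a request header. The path layout (`/submit/stage`, `/submit/production`) and the header name `X-Ingest-Route` are assumptions invented for this example, not anything the ingestion service is known to support.

```python
from typing import Mapping, Optional

# The two pipelines mentioned above; names are illustrative.
PIPELINES = ("stage", "production")


def pipeline_from_path(path: str) -> Optional[str]:
    """Option 1: separate endpoints, e.g. /submit/stage vs /submit/production."""
    last_segment = path.rstrip("/").rsplit("/", 1)[-1]
    return last_segment if last_segment in PIPELINES else None


def pipeline_from_header(
    headers: Mapping[str, str], name: str = "X-Ingest-Route"
) -> Optional[str]:
    """Option 2: a single endpoint plus a routing key in the request.

    The header name is hypothetical. Header names are compared
    case-insensitively, as HTTP requires.
    """
    for key, value in headers.items():
        if key.lower() == name.lower():
            candidate = value.strip().lower()
            return candidate if candidate in PIPELINES else None
    return None
```

With two endpoints the client switches by changing a URL; with a routing key the URL stays fixed and only a header changes. Either way, unknown values return `None` so the server can reject them rather than guess.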