Investigate spike in schema errors related to rally
Categories
(Data Platform and Tools :: General, task)
Tracking
(Not tracked)
People
(Reporter: ascholtz, Assigned: whd)
Details
(Whiteboard: [dataquality])
Starting 2023-02-10, there is a significant increase in schema errors for multiple paths related to rally. This could be related to the shutdown of rally.
Comment 1 (Assignee) • 2 years ago
This sounds suspiciously like a routing error related to https://mozilla-hub.atlassian.net/browse/DSRE-1140. I will investigate.
Comment 2 (Assignee) • 2 years ago
This seems pretty annoying. AFAICT, sanic dropped support for non-regex based prefix matching in routing in some version between what we were using in April 2022 and now: https://sanic.dev/en/guide/basics/routing.html#adding-a-route
I don't see where the breaking change was documented but it's easy to reproduce:
#!/usr/bin/env python
from sanic import Sanic
from sanic import response as res

app = Sanic(__name__)


@app.route("/rally-<_id:str>")
async def test(req):
    return res.text("I'm a teapot", status=418)


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000, debug=True)
All the docs examples start a path parameter expression right after a /, but I don't see where that's made a requirement. The application doesn't fail when reading this config; it just doesn't seem to route correctly. It seems pretty bizarre to me that they would add this, given the docs say str is equivalent to the regular expression r"[^/]+", which should match. My guess is there is some middleware that they've moved to that pre-parses on slashes. Ordinarily you could just use a regex, but our config is passed as JSON from the environment, and all the examples use Python raw-string literals (r"...") to pass in the regex.
So right now I'm not sure how to get the previous behavior without doing something like updating the route table to include all pioneer namespaces (annoying but at least this list is now static so there won't be any more) or updating the code in some fashion.
Comment 3 (Assignee) • 2 years ago
:relud determined that you can pass regexes in as ordinary strings; a regex is just the "default" interpretation when no other pattern is matched. Therefore we can use a regex to replace the existing config. Once I've pushed that change, I will copy the pioneer pings to the pioneer environment and then run a backfill.
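For illustration, here is a minimal sketch of why a regex survives a JSON-from-environment config: the pattern is just text, and the r"..." prefix is only Python source syntax. The config shape, the route string, and the compile_route and match_path helpers are all hypothetical, and the parsing is a simplified stand-in for a router, not sanic's actual implementation.

```python
import json
import re

# Hypothetical config value, as it might arrive from the environment as JSON.
# The regex is plain text here; r"..." is only Python source syntax.
raw_config = '{"routes": ["/rally-<_id:[^/]+>"]}'


def compile_route(route):
    """Convert "<name:regex>" segments into named groups.

    Simplified stand-in for a router, not sanic's real implementation.
    """
    pattern = re.sub(
        r"<(\w+):([^>]+)>",
        lambda m: "(?P<{}>{})".format(m.group(1), m.group(2)),
        route,
    )
    return re.compile("^" + pattern + "$")


routes = [compile_route(r) for r in json.loads(raw_config)["routes"]]


def match_path(path):
    """Return the captured parameters of the first matching route, or None."""
    for route in routes:
        m = route.match(path)
        if m:
            return m.groupdict()
    return None
```

In this sketch, match_path("/rally-study-one") captures _id as "study-one", while a path with an extra slash after the prefix does not match. One caveat when moving a real pattern into JSON: any backslash in the regex has to be doubled inside the JSON string.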
Comment 4 (Assignee) • 2 years ago
After the route fix finished I imported the data into VPC-SC and backfilled the affected days. The vast majority of pings are affected by UnwantedData exceptions via collect_through_date metadata. I consider this issue resolved.