[tracker] rewrite processor as thread-free service
Categories
(Socorro :: Processor, task, P3)
Tracking
(Not tracked)
People
(Reporter: lonnen, Unassigned)
Comment 2•6 years ago
Comment 3•5 years ago
Unassigning myself since I'm not going to get to this any time soon.
Comment 4•5 years ago
It's been a long time since this bug was created, so it's prudent to update the mission here.
In our current infrastructure, each processor node runs a Docker container with a single processor process in it. Each processor process runs multiple threads. The processor spends the bulk of its time running minidump-stackwalk on crash reports and moving data over the network with S3 and Elasticsearch.
There are a few disadvantages to the current threaded model:
- Threaded architectures are more complex because they have to deal with contention between threads for shared resources. If we rewrite to a non-threaded model, we can remove some of this contention-handling code and the corresponding tests.
- Python threads doing network I/O block the entire process. One way to speed up applications that are heavy on network I/O is to change the model to multiprocess or coroutines, or to switch to asyncio and asyncio-aware libraries. The latter is hard, especially since (at the time of this writing) boto3 isn't asyncio-aware yet. Going multiprocess or coroutines seems helpful. We use multiprocess in some parts of Socorro already and we use coroutines in the collector.
- The processor spends the bulk of its time waiting for minidump-stackwalk to run in a separate process. The LLT team is rewriting breakpad bits in Rust and that will eventually involve us switching to a different minidump-stackwalk. We may not want to run that in a separate process. It's a lot simpler to plan for that if we're not dealing with multiple threads.
- The current processor architecture doesn't work in other contexts. It's not possible to write a CLI that takes a crash id, processes the crash report, and then exits. Having something like that would be really helpful.
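To make the last point concrete, here is a rough sketch of what a one-shot command could look like once processing is a plain function call. Everything in it is hypothetical: `process_crash` is a stand-in that only echoes the crash id, not the real pipeline, and the argument name is made up for illustration.

```python
import argparse


def process_crash(crash_id):
    """Hypothetical stand-in for the real pipeline.

    Assumption: a real version would fetch the raw crash, run the
    stackwalker, and save the processed crash. This one only echoes the id.
    """
    return {"crash_id": crash_id, "status": "processed"}


def main(argv=None):
    """Process a single crash report and return an exit code."""
    parser = argparse.ArgumentParser(
        description="Process one crash report and exit."
    )
    parser.add_argument("crash_id", help="id of the crash report to process")
    args = parser.parse_args(argv)

    result = process_crash(args.crash_id)
    print(result["status"], result["crash_id"])
    return 0
```

Calling `main(["some-crash-id"])` runs one crash through and returns; wiring `main` to a console entry point would give the CLI described above.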
Thus, I want to rewrite the processor as a single-threaded application and then either switch to coroutines or move the concurrency up a level, to a process manager or to the infrastructure.
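As a minimal sketch of the "move the concurrency up a level" option, a single-threaded worker could just loop over dumps and shell out to the stackwalker for each one. The names `run_stackwalker` and `worker_loop` are made up here, and `command` is an assumed argv prefix for whatever stackwalker is deployed; the real minidump-stackwalk arguments are not shown.

```python
import subprocess


def run_stackwalker(command, dump_path, timeout=60):
    """Run an external stackwalker on one dump and return its stdout.

    `command` is a list holding the executable and any leading arguments
    (an assumption for this sketch); the dump path is appended to it.
    """
    completed = subprocess.run(
        [*command, dump_path],
        capture_output=True,
        text=True,
        timeout=timeout,
        check=True,  # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout


def worker_loop(dump_paths, command):
    """Process dumps one at a time in this process: no threads, no locks."""
    results = {}
    for path in dump_paths:
        results[path] = run_stackwalker(command, path)
    return results
```

With a worker shaped like this, throughput would come from running more copies of the process under a process manager or on more instances, rather than from threads inside one process.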
Comment 5•2 years ago
Making this block GCP migration because this will make instance sizing and scaling easier.
Comment 6•9 months ago
Removing this from the GCP migration. We can migrate the processor we've got as is.