Open Bug 1614775 Opened 5 years ago Updated 5 years ago

webapp nodes running out of memory with Reprocessing usage

Categories

(Socorro :: Webapp, defect, P3)

Tracking

(Not tracked)

People

(Reporter: willkg, Unassigned)

Details

I'm reprocessing 660k crash reports. While that runs, the memory usage graphs for the webapp nodes suggest that the webapp is leaking memory and that the leak is directly related to reprocessing API endpoint usage.

https://earthangel-b40313e5.influxcloud.net/d/LysVjx8Zk/socorro-prod-megamaster-remix?orgId=1&from=1581431132948&to=1581450878998&fullscreen&panelId=50

Is it just the reprocessing API? Are other API endpoints affected? Is there some caching going on somewhere? Maybe sessions?

This bug covers looking into this.

Making this a P3. The circumstances don't come up much, but this is fishy and we should look into it.

Type: task → defect
Priority: -- → P3

My gut feeling is that this is related to how the pubsub library tries really hard to be asynchronous, and I worked around that to make publishing synchronous in the webapp reprocessing API code.

If we wanted to look into this further, I think we should try to reproduce it in a local dev environment. Something like this:

  1. run the webapp
  2. sample the memory usage
  3. run 100 reprocessing API requests (or 500? I'm not sure how many you'd need)
  4. sample the memory again and see what happened
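The steps above could be scripted roughly like this. It is a minimal sketch, not something that exists in Socorro: it assumes a Linux dev environment where the webapp process RSS can be read from `/proc/<pid>/status`, and the helper names (`parse_vm_rss_kib`, `read_rss_kib`, `make_post_action`, `measure_growth`) are made up for illustration. The URL, request body, and headers for the reprocessing API are left to the caller because they depend on the local setup.

```python
import re
import urllib.request
from typing import Callable, Dict


def parse_vm_rss_kib(status_text: str) -> int:
    """Pull the VmRSS value (in KiB) out of the text of /proc/<pid>/status."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB", status_text, re.MULTILINE)
    if match is None:
        raise ValueError("VmRSS line not found")
    return int(match.group(1))


def read_rss_kib(pid: int) -> int:
    """Sample the resident set size of a running process (Linux only)."""
    with open(f"/proc/{pid}/status") as fp:
        return parse_vm_rss_kib(fp.read())


def make_post_action(url: str, body: bytes, headers: Dict[str, str]) -> Callable[[], None]:
    """Build a callable that sends one POST request to the given URL.

    The url, body, and headers are assumptions supplied by the caller;
    nothing here knows the real shape of the reprocessing API.
    """
    def action() -> None:
        request = urllib.request.Request(url, data=body, headers=headers, method="POST")
        with urllib.request.urlopen(request) as response:
            response.read()

    return action


def measure_growth(sample: Callable[[], int], action: Callable[[], None], count: int) -> int:
    """Sample memory, run the action `count` times, sample again, return the difference."""
    before = sample()
    for _ in range(count):
        action()
    after = sample()
    return after - before
```

To use it, pass `lambda: read_rss_kib(webapp_pid)` as `sample` and a `make_post_action(...)` callable as `action`, then compare the growth for 100 and 500 requests. If the growth scales with the request count, that points at a per-request leak rather than a one-time warm-up cost.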

I can't remember how the API views work in the webapp. If they keep or cache the implementation instance, then we could try keeping the publisher instance on the PubSubCrashQueue instance rather than creating a new one every time publish is called.
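A rough sketch of that idea, to show the shape of the change rather than the actual Socorro code: `FakePublisher` is a stand-in for the pubsub publisher client, and `CachedPublisherCrashQueue` is a hypothetical, simplified version of the crash queue class. Both names and the `publish(queue, crash_ids)` signature are assumptions made for illustration.

```python
from typing import Callable, Iterable, List, Optional, Tuple


class FakePublisher:
    """Stand-in for the real pubsub publisher client (assumption for this sketch)."""

    instances_created = 0

    def __init__(self) -> None:
        # Count constructions so it is easy to see how often a client is built.
        type(self).instances_created += 1
        self.published: List[Tuple[str, bytes]] = []

    def publish(self, topic: str, data: bytes) -> None:
        self.published.append((topic, data))


class CachedPublisherCrashQueue:
    """Hypothetical crash queue that creates its publisher once and reuses it."""

    def __init__(self, publisher_factory: Callable[[], FakePublisher] = FakePublisher) -> None:
        self._publisher_factory = publisher_factory
        self._publisher: Optional[FakePublisher] = None

    @property
    def publisher(self) -> FakePublisher:
        # Create the publisher lazily on first use, then keep it on the instance.
        if self._publisher is None:
            self._publisher = self._publisher_factory()
        return self._publisher

    def publish(self, queue: str, crash_ids: Iterable[str]) -> None:
        # Every call goes through the same publisher instead of building a new one.
        for crash_id in crash_ids:
            self.publisher.publish(queue, crash_id.encode("utf-8"))
```

This only helps if the queue instance itself outlives a single request. If the API view builds a fresh queue object per request, the publisher would still be recreated each time and the caching would need to move up a level.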
