Memory issues for pulse_actions

RESOLVED FIXED

Status

Testing
General
RESOLVED FIXED
2 years ago
2 years ago

People

(Reporter: armenzg, Unassigned)

Tracking

Firefox Tracking Flags

(Not tracked)

Details

(Reporter)

Description

2 years ago
Filing this bug so that we have a place to discuss this topic.

We deployed an ijson fix to pulse_actions.
Logging to Papertrail has stopped. Tomorrow our quota will start again from scratch.

I tried upgrading the account, but it is hard to figure out how to do it.
I will try again tomorrow.
(Reporter)

Comment 1

2 years ago
It seems that after 15:11 PT we dropped almost 3GB.
After 17:15 it stopped happening.

I assume the system started freeing the memory that Python had previously allocated.

I've also started filtering DEBUG messages until I upgrade Papertrail's account.

Jul 20 15:11:36 pulse-actions heroku/worker.1: Process running mem=4935M(481.9%)
Jul 20 15:11:36 pulse-actions heroku/worker.1: Error R14 (Memory quota exceeded)
Jul 20 15:12:35 pulse-actions heroku/worker.1: Process running mem=1103M(107.8%)
Jul 20 15:12:35 pulse-actions heroku/worker.1: Error R14 (Memory quota exceeded) 

Jul 20 15:13:40 pulse-actions heroku/worker.1: Error R14 (Memory quota exceeded)
Jul 20 15:14:02 pulse-actions heroku/worker.1: Process running mem=1066M(104.1%)
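
For anyone who wants to pull the memory samples out of log lines like the ones above, here is a minimal sketch. It assumes the lines keep the exact "Process running mem=<N>M(<P>%)" shape shown in this comment; the names MEM_RE, parse_mem and over_quota are hypothetical helpers for illustration and are not part of pulse_actions.

```python
import re

# Matches Heroku runtime lines such as
# "... heroku/worker.1: Process running mem=4935M(481.9%)"
# (assumed format, copied from the log excerpt in this bug)
MEM_RE = re.compile(r"Process running mem=(\d+)M\((\d+(?:\.\d+)?)%\)")


def parse_mem(line):
    """Return (megabytes, percent_of_quota), or None if the line has no sample."""
    match = MEM_RE.search(line)
    if match is None:
        return None
    return int(match.group(1)), float(match.group(2))


def over_quota(lines):
    """Return the samples whose usage is above 100% of the memory quota."""
    samples = (parse_mem(line) for line in lines)
    return [s for s in samples if s is not None and s[1] > 100.0]
```

Lines that only carry the R14 error text contain no sample, so parse_mem returns None for them and over_quota skips them.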

On July 20th at 15:11 (Heroku time) I deployed a version that cleared the mozci caches before processing a message. After that, all our errors happened right after loading a builds-{day}.js file (we can see that by comparing the timestamps in [0] and [1]). Whether the error happened or not depended on the size of the builds-{day}.js file.

Yesterday I deployed a version that uses ijson [2]. There were a couple of crashes while I was trying to install a C library on Heroku to enable a yajl backend for ijson. On July 21st 17:15 (Heroku time) the version with ijson and yajl was deployed. Hopefully this works and we will not see new memory errors.

[0] https://papertrailapp.com/systems/pulse-actions/events?q=+Process+running+mem
[1] https://papertrailapp.com/systems/pulse-actions/events?q=About+to+load+%2Fapp%2F.mozilla%2Fmozci%2Fbuilds-2015-
[2] https://pypi.python.org/pypi/ijson
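
To illustrate why ijson helps here, below is a rough sketch of streaming a builds file one entry at a time instead of loading it whole. It assumes the file has a top-level "builds" array, which is not confirmed in this bug; iter_builds is a hypothetical helper, not the actual mozci code. The json fallback exists only so the sketch runs where ijson is not installed, and it gives no memory benefit.

```python
import io
import json

try:
    import ijson  # third-party streaming JSON parser
except ImportError:
    ijson = None


def iter_builds(fileobj):
    """Yield build entries one at a time from a binary file object.

    Assumes the layout {"builds": [ {...}, {...}, ... ]}.
    """
    if ijson is not None:
        # "builds.item" selects each element of the top-level "builds" array,
        # so only one entry needs to be held in memory at a time.
        yield from ijson.items(fileobj, "builds.item")
    else:
        # Fallback: parses the whole document at once (no memory savings).
        yield from json.load(fileobj)["builds"]


# Small in-memory stand-in for a builds-{day}.js file.
sample = io.BytesIO(b'{"builds": [{"id": 1}, {"id": 2}]}')
ids = [build["id"] for build in iter_builds(sample)]
```

With a real file, the same generator could be fed from open(path, "rb"). Per-entry processing keeps peak memory roughly proportional to the largest single entry rather than to the whole file.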
(Reporter)

Comment 3

2 years ago
Alice, thanks for filling in the timelines.
I'm very happy to hear that you also managed to get the yajl backend installed; that is very sweet!

The links [0] and [1] are not working, probably because of the Papertrail changes I'm working on.
I think once we have 24 hours of no issues we can declare victory.

Are there any other steps that you believe are remaining?

Memory usage is stable and below 300MB [0]. I believe we can close this bug as fixed.

[0] https://papertrailapp.com/systems/pulse-actions/events?q=sample%23memory_total&r=560796355627646977-560909022765842438
(Reporter)

Comment 5

2 years ago
and there was joy! \o/

Thanks Alice!
Status: NEW → RESOLVED
Last Resolved: 2 years ago
Resolution: --- → FIXED